S.O.C.C.E.R - A novel framework for measuring sports facts relevance
Document Type
Master Thesis
License
CC-BY-NC-ND
Abstract
Sports stakeholders capture more details about every facet of a sport for economic reasons. They must then select the information that is relevant to sports fans, gamblers, journalists and other stakeholders who can use it. Sports has been a research subject in the Natural Language Generation (NLG) field because of the data it produces and the natural language used to describe it, which is at times standardized. However, NLG systems have struggled to deliver a level of performance comparable to that of journalists or other sports experts, for various reasons. We established that one of these reasons is the lack of a proper definition of relevance for sports content. We set out to address this issue by defining a relevance measurement framework for the sports domain.
We introduce the term sports fact to refer to natural language about sports. We defined seven general types of content that are part of a sports fact and can influence its relevance. We identified twelve properties that are used to measure relevance in other domains. We established a set of measurement guidelines that describe what should be measured, and how, for a given property and content type. Together, these form the S.O.C.C.E.R framework, the main artifact of this research project.
Keywords
sports facts, relevance, natural language processing, content, interestingness, sports statistics, natural language generation