Denmark
In this paper we present an annotated corpus created to analyze the informative behaviour of emoji, an issue of importance for sentiment analysis and natural language processing. The corpus consists of 2475 tweets, each containing at least one emoji, which is annotated with one of three possible classes: Redundant, Non Redundant, and Non Redundant + POS. We explain how the corpus was collected and describe the annotation procedure and the interface developed for the task. We provide an analysis of the corpus that also considers possible predictive features, discuss the problematic aspects of the annotation, and suggest future improvements.