TASS: A Naive-Bayes strategy for sentiment analysis on Spanish tweets

Pablo Gamallo Otero; Marcos García González; Santiago Fernández Lanza

Pablo Gamallo ^[1] ; Marcos Garcia ^[2] ; Santiago Fernández-Lanza ^[1]
1. [1] Universidade de Santiago de Compostela, Santiago de Compostela, Spain
2. [2] CILENIS, S.L.
Published in: XXIX Congreso de la Sociedad Española de Procesamiento de Lenguaje Natural: SEPLN 2013 / coord. by Alberto Díaz Esteban, Iñaki Alegría Loinaz, Julio Villena Román, 2013, ISBN 978-84-695-8349-4, pp. 126-132
Language: English
Parallel titles:
- TASS: una estrategia Naive-Bayes para el análisis del sentimiento en tweets en español
Abstract
- Spanish
  This article describes the strategy underlying the system our group submitted to the sentiment analysis task at TASS 2013. The system is based mainly on a Naive-Bayes classifier aimed at detecting polarity in tweets written in Spanish. The experiments showed that the best results were obtained with binary classifiers that distinguish between only two polarity categories: positive and negative. To identify more levels of subjectivity, we added separation thresholds to the system that distinguish strong, medium, and weak or neutral polarity values. In addition, to detect whether or not a tweet has polarity, the system includes a basic rule that searches for polarity words in the analysed text. The evaluation results show reasonably high values (close to 67% accuracy) when the system is applied to detect four sentiment categories.
- English
  This article describes the strategy underlying the system presented by our team for the sentiment analysis task at TASS 2013. The system is based mainly on a Naive-Bayes classifier that detects the polarity of Spanish tweets. The experiments showed that the best performance is achieved with a binary classifier that distinguishes between just two sharp polarity categories: positive and negative. To identify more polarity levels, the system uses experimentally set thresholds for detecting strong, average, and weak (or neutral) values. In addition, to separate tweets with polarity from tweets without it, the system applies a very basic rule that searches for polarity words in the analysed text. The evaluation results show good performance (about 67% accuracy) when the system is used to detect four sentiment categories.
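
The pipeline outlined in the abstract (a lexicon rule for tweets without polarity, a binary Naive-Bayes classifier, and thresholds on its output) can be illustrated with a minimal sketch. This is not the authors' implementation: the training tweets, the polarity lexicon, the threshold values `STRONG` and `WEAK`, the label names, and all function names are assumptions made up for illustration, and the sketch emits six labels rather than the four categories evaluated in the paper.

```python
import math
from collections import Counter

# Toy training tweets, invented for illustration (not data from the paper).
TRAIN = [
    ("me encanta este gran día", "positive"),
    ("qué buena noticia, genial", "positive"),
    ("odio este horrible tráfico", "negative"),
    ("qué mala noticia, terrible", "negative"),
]

# Hypothetical polarity lexicon for the "does this tweet have polarity?" rule.
POLARITY_WORDS = {"encanta", "gran", "buena", "genial",
                  "odio", "horrible", "mala", "terrible"}

# Hypothetical thresholds on classifier confidence; the paper sets its own
# values experimentally and does not report them in the abstract.
STRONG, WEAK = 0.85, 0.60

PUNCTUATION = ".,;:!?¡¿"


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (tok.strip(PUNCTUATION).lower() for tok in text.split())
    return [tok for tok in tokens if tok]


def train(examples):
    """Collect per-class word counts and document counts."""
    word_counts = {"positive": Counter(), "negative": Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["positive"]) | set(word_counts["negative"])
    return word_counts, doc_counts, vocab


def positive_probability(text, model):
    """P(positive | tweet) under multinomial Naive Bayes with add-one smoothing."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label in ("positive", "negative"):
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1)
                                  / (total_words + len(vocab)))
        log_scores[label] = score
    # Normalise in a numerically stable way.
    top = max(log_scores.values())
    pos = math.exp(log_scores["positive"] - top)
    neg = math.exp(log_scores["negative"] - top)
    return pos / (pos + neg)


def classify(text, model):
    """Lexicon rule first, then the binary classifier plus thresholds."""
    if not any(tok in POLARITY_WORDS for tok in tokenize(text)):
        return "none"  # no polarity word found in the tweet
    p = positive_probability(text, model)
    confidence = max(p, 1.0 - p)
    if confidence < WEAK:
        return "neutral"
    side = "positive" if p >= 0.5 else "negative"
    return f"strong {side}" if confidence >= STRONG else side


MODEL = train(TRAIN)

if __name__ == "__main__":
    for tweet in ("el tren sale a las cinco",
                  "odio este horrible día",
                  "me encanta, genial, qué buena noticia"):
        print(tweet, "->", classify(tweet, MODEL))
```

Keeping the classifier binary and deriving the finer levels from its confidence mirrors the design choice reported in the abstract: only two sharp classes are learned, and the remaining categories come from thresholds and the lexicon rule.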