Performance analysis of Particle Swarm Optimization applied to unsupervised categorization of short texts

Leticia Cagnina; Diego Alejandro Ingaramo; Marcelo Luis Errecalde; Paolo Rosso

Authors: Leticia Cagnina, Diego Alejandro Ingaramo, Marcelo Luis Errecalde, Paolo Rosso
Published in: Procesamiento del lenguaje natural, ISSN 1135-5948, No. 47, 2011, pp. 207-214
Language: English
Abstract
  There is a growing need to access online information such as abstracts, news, opinions, and product evaluations. This information is generally available on the web as short texts. Previous work has demonstrated the effectiveness of a discrete Particle Swarm Optimization algorithm, named CLUDIPSO, for clustering small short-text corpora. This article presents a preliminary study of the performance of CLUDIPSO on larger short-text corpora. The results were compared with those of the most representative state-of-the-art algorithms in the area. The experimental work gives strong evidence of the algorithm's drawbacks when handling larger corpora. Possible reasons for the poor behavior of CLUDIPSO on larger short-text corpora are discussed, and some alternatives for solving the observed problems are considered.
