Machine learning techniques for word sense disambiguation

Gerard Escudero Bakx

Author: Gerard Escudero Bakx
Thesis supervisors: Lluís Márquez i Villodre (supervisor), Germán Rigau Claramunt (supervisor)
Defended: at the Universitat Politècnica de Catalunya (UPC) (Spain) in 2006
Language: Spanish
Thesis committee: Horacio Rodríguez Hontoria (chair), Lluís Padró Cirera (secretary), Eneko Agirre Bengoa (member), Walter Dealemans (member), Mark W. Stevenson (member)
Subjects:
- Mathematics
  - Computer science
    - Artificial intelligence
- Linguistics
  - Applied linguistics
    - Computational linguistics
  - Synchronic linguistics
    - Semantics
Full text not available
Abstract
- In the Natural Language Processing (NLP) community, Word Sense Disambiguation (WSD) has been described as the task of selecting the appropriate meaning (sense) for a given word in a text or discourse, where this meaning is distinguishable from other senses potentially attributable to that word. These senses can be seen as the target labels of a classification problem. Machine Learning (ML) therefore seems to be a possible way to tackle this problem.
  
  This work studies the possible application of the algorithms and techniques of the Machine Learning field in order to handle the WSD task.
  
  The first issue addressed is the adaptation of alternative ML algorithms to deal with word senses as classes. These methods are then compared under the same conditions. The evaluation measures used to compare their performance are the usual precision and recall, together with agreement rates and the kappa statistic.
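
The evaluation measures named above can be illustrated with a minimal Python sketch. It is not taken from the thesis: it assumes the common WSD convention in which precision is computed over the examples a system actually attempts and recall over all examples, and it assumes Cohen's kappa as the kappa statistic. The function names and the toy sense labels are hypothetical.

```python
from collections import Counter


def precision_recall(gold, predicted):
    """Precision over attempted examples, recall over all examples.

    Assumption: `predicted` holds None where the system abstains.
    """
    attempted = [(g, p) for g, p in zip(gold, predicted) if p is not None]
    correct = sum(1 for g, p in attempted if g == p)
    precision = correct / len(attempted) if attempted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


def agreement_rate(labels_a, labels_b):
    """Fraction of examples on which two systems output the same sense."""
    matches = sum(1 for a, b in zip(labels_a, labels_b) if a == b)
    return matches / len(labels_a)


def cohen_kappa(labels_a, labels_b):
    """Agreement between two systems, corrected for chance agreement.

    Assumption: both systems label every example (no abstentions).
    """
    n = len(labels_a)
    observed = agreement_rate(labels_a, labels_b)
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability that both systems pick the same
    # sense if each draws independently from its own label distribution.
    expected = sum(count_a[s] * count_b[s] for s in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both systems always output the same single sense
    return (observed - expected) / (1.0 - expected)


# Toy data with hypothetical sense labels for the word "bank".
gold = ["finance", "river", "finance", "finance"]
system = ["finance", "river", "river", None]
precision, recall = precision_recall(gold, system)

run_a = ["finance", "finance", "river", "river"]
run_b = ["finance", "river", "river", "river"]
kappa = cohen_kappa(run_a, run_b)
```

On this toy data the two runs agree on three of four examples, but half of that agreement is expected by chance given their label distributions, which is why kappa is reported alongside the raw agreement rate.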
  
  The second topic explored is the cross-corpora application of supervised Machine Learning systems for WSD, in order to test their ability to generalise across corpora and domains. The results obtained are very disappointing and seriously call into question the possibility of constructing a sufficiently general training corpus (labelled or unlabelled), and the way its examples should be used to develop a general-purpose Word Sense Tagger.
  
  The use of unlabelled data to train classifiers for Word Sense Disambiguation is a very challenging line of research towards a truly robust, complete and accurate Word Sense Tagger. For this reason, the next topic treated in this work is the application of two bootstrapping approaches to WSD: Transductive Support Vector Machines and the Greedy Agreement bootstrapping algorithm by Steven Abney.
  
  During the development of this research we have been interested in the construction and evaluation of several WSD systems. We have participated in the last two editions of the English Lexical Sample task of Sen
