Dialnet


Named entity recognition and transliteration in Bengali

  • Authors: Asif Ekbal, Sudip Kumar Naskar, S. Bandyopadhyay
  • Published in: Linguisticae investigationes: Revue internationale de linguistique française et de linguistique générale, ISSN 0378-4169, Vol. 30, No. 1, 2007, pp. 95-114
  • Language: English
  • Full text not available
  • Abstract
    • The paper reports on the development of a Named Entity Recognition (NER) system in Bengali using a tagged Bengali news corpus and the subsequent transliteration of the recognized Bengali Named Entities (NEs) into English. Three different models of the NER system have been developed. A semi-supervised learning method has been adopted to develop the first two models, one without linguistic features (Model A) and the other with linguistic features (Model B). The third one (Model C) is based on a statistical Hidden Markov Model. A modified joint source-channel model has been used along with a number of alternatives to generate the English transliterations of Bengali NEs and vice versa. The transliteration models learn the mappings from the bilingual training sets, optionally guided by linguistic knowledge in the form of conjuncts and diphthongs in Bengali and their representations in English. The NER system has demonstrated the highest average Recall, Precision and F-Score values of 89.62%, 78.67% and 83.79% respectively in Model C. Evaluation of the proposed transliteration models demonstrated that the modified joint source-channel model performs best in terms of evaluation metrics for person and location names for both Bengali to English (B2E) transliteration and English to Bengali (E2B) transliteration. The use of linguistic knowledge during training of the transliteration models improves performance.
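
The three Model C figures quoted in the abstract can be cross-checked against one another. The sketch below assumes that the reported F-Score is the balanced F1 measure, that is, the harmonic mean of precision and recall; the abstract does not state which variant was used, and the function name `f_score` is introduced here only for illustration.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta = 1 gives the harmonic mean of precision and recall.

    Assumption: the abstract's "F-Score" is this balanced (beta = 1) variant.
    """
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Values for Model C as quoted in the abstract (percentages).
reported_recall = 89.62
reported_precision = 78.67

f1 = f_score(reported_precision, reported_recall)
print(f"{f1:.2f}")  # prints 83.79
```

Under that assumption, the computed value agrees with the 83.79% reported in the abstract.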

