Bilingual terminology acquisition from unrelated corpora

  • Authors: Rogelio Nazar
  • Source: Proceedings of the XIII EURALEX International Congress (Barcelona, 15-19 July 2008) / coordinated by Janet Ann DeCesaris, Elisenda Bernal, 2008, ISBN 978-84-96742-67-3, pp. 1023-1029
  • Language: English
  • Abstract
    • This paper presents a simple yet effective technique for the extraction of term equivalents in different languages. In general, techniques for bilingual lexicon extraction have relied on the compilation of parallel corpora and have yielded accurate results. However, parallel corpora for different domains and languages are not easy to compile. Because of this, some authors have explored techniques to extract a bilingual lexicon from nonparallel but comparable corpora, which are pairs of texts that are not exactly translations of each other but that roughly "talk about the same things". This paper describes an algorithm that performs bilingual terminology extraction without requiring large amounts of data, that can deal with infrequent units, and that needs neither comparable corpora nor other resources such as an initial bilingual lexicon to use as seed words. Despite its simplicity, the algorithm yields results comparable to those of state-of-the-art techniques. It also improves on them in that it offers a domain- and language-independent method especially suitable for the extraction of specialized terminology, which is the most dynamic part of the lexicon and the most difficult to acquire.

