Bilingual terminology acquisition from unrelated corpora

Rogelio Nazar

Authors: Rogelio Nazar
Published in: Proceedings of the XIII EURALEX International Congress (Barcelona, 15-19 July 2008) / coordinated by Janet Ann DeCesaris, Elisenda Bernal, 2008, ISBN 978-84-96742-67-3, pp. 1023-1029
Language: English
Abstract
- This paper presents a simple yet effective technique for extracting term equivalents in different languages. Techniques for bilingual lexicon extraction have generally relied on parallel corpora and have yielded accurate results. However, parallel corpora covering different domains and languages are not easy to compile. For this reason, some authors have explored techniques for extracting a bilingual lexicon from nonparallel but comparable corpora, which are pairs of texts that are not exact translations of each other but that roughly "talk about the same things". This paper describes an algorithm that performs bilingual terminology extraction without needing large amounts of data, that can deal with infrequent units, and that requires neither comparable corpora nor other resources such as an initial bilingual lexicon to use as seed words. In spite of its simplicity, the algorithm produces results comparable to those of state-of-the-art techniques. It improves on them in that it offers a domain- and language-independent method especially suitable for extracting specialized terminology, which is the most dynamic part of the lexicon and the most difficult to acquire.
