Integrating Verb+Noun collocations into a French-Romanian lexical alignment system for the law domain

Amalia Todirascu [1]; Mirabela Navlea [1]
[1] Université de Strasbourg
Published in: Computerised and Corpus-based Approaches to Phraseology: Monolingual and Multilingual Perspectives / coord. by Gloria Corpas Pastor, María Rosario Bautista Zambrana, Cristina Castillo Rodríguez, Isabel Durán Muñoz, Jorge Jesús Leiva Rojo, Gema María Lobillo Mora, Pablo Salvador Pérez Pérez, Miriam Seghiri, María Cristina Toledo Báez, Míriam Urbano Mendaña, Anna Zaretskaya, 2016, pp. 176-186
Language: English
Abstract
- In this article, we compare two methods for integrating a specific class of multiword expressions (MWEs), Verb+Noun collocations, into a French-Romanian lexical alignment tool. Our experiments use a French-Romanian parallel corpus from the law domain. The corpus is tokenized, tagged, lemmatized and chunked. The first method uses a dictionary-based approach to complete the alignment of Verb+Noun collocations. The second method proposes an alignment algorithm that uses a set of MWE candidates previously extracted from the monolingual parts of the training corpus. These candidates were detected by a hybrid extraction method combining statistical measures and linguistic filters. The best results were obtained with the hybrid method.
