Automatic generation of the Estonian Collocations Dictionary database

    1. [1] Institute of the Estonian Language

      Kesklinna linnaosa, Estonia

    2. [2] Lexical Computing
  • Published in: Electronic lexicography in the 21st century: linking lexical data in the digital age: proceedings of the eLex 2015 conference, 11-13 August 2015, Herstmonceux Castle, United Kingdom / Iztok Kosem (ed.), Miloš Jakubíček (ed.), Jelena Kallas (ed.), Simon Krek (ed.), 2015, ISBN 978-961-93594-3-3, pp. 1-20
  • Language: English
  • Abstract
    • This paper reports on the automatic generation of the Estonian Collocations Dictionary (ECD) database, compiled by the Institute of the Estonian Language in collaboration with Lexical Computing Ltd. The ECD is a monolingual online scholarly dictionary aimed at learners of Estonian as a foreign or second language at the upper-intermediate and advanced levels. It contains about 10,000 headwords, including single- and multi-word lexical items. The collocates of each headword are grouped according to the lexico-grammatical structure of the collocational phrase, and example sentences are provided for the collocations. The database was generated automatically using three functions of the corpus query system Sketch Engine (Kilgarriff et al., 2004): Word List, Word Sketch and Good Dictionary Example (GDEX). The data were extracted in XML format from the 463-million-word Estonian National Corpus and imported into the XML-based EELex dictionary writing system. To make this import possible, the XML structure of the extracted data was matched with the XML structure of the ECD in EELex. The ECD project started in 2014, and the dictionary is scheduled to be published in 2018.

