The Thesaurus of Modern Slovene is the largest open-source digital collection of Slovene synonyms, published in March 2018 by the Centre of Language Resources and Technologies of the University of Ljubljana. The Thesaurus was initially compiled entirely automatically. Users can help improve the resource by suggesting missing synonyms and by evaluating both the synonym candidates from the initial database and the suggestions added by other users. As an automatically generated language resource, however, the initial database of the Thesaurus includes a certain degree of noise. In this paper, we present two crowdsourcing activities aimed at cleaning up the database. The first is a targeted annotation campaign for evaluating multi-word synonym candidates in the Thesaurus, and the second is an analysis of user votes provided directly in the Thesaurus interface. Both scenarios are examples of an effective postprocessing method for an automatically generated language resource and demonstrate that crowdsourcing can play an important role in smart lexicography, especially in the case of less-resourced languages.