Innere Stadt, Austria
This paper reports on some issues encountered when using various ‘external points of reference’ in the development of POS-tagging guidelines for the Vienna-Oxford International Corpus of English (VOICE). VOICE is a corpus of spoken English as a Lingua Franca (ELF) containing naturally occurring, plurilingual data. As in all kinds of natural language use, speakers recorded in VOICE exploit available linguistic resources, often resulting in non-codified language use and language that is difficult to classify unambiguously. However, detailed tagging solutions for such phenomena are rarely reported. We discuss the usefulness and limitations of external points of reference for POS-tagging VOICE and address methodological as well as practical issues, especially the handling of non-codified language use and different types of ambiguity. We suggest that the solutions found, and the theoretical approach adopted, could be relevant for the tagging of other spoken corpora.