This paper describes the five-language corpus «PEST-INTER» and the process of its compilation and incorporation. The aim is to give step-by-step instructions for compiling the corpus. A further purpose is to present practical solutions to the problems that arise at different stages of corpus compilation. Describing the decisions taken and the strategies followed, I discuss corpus planning, going into depth on web crawling, character and corpus encoding, automatic alignment, and editing of the compiled texts.