We discuss the development of a corpus of learner Korean and use it to perform an error analysis of particle usage. Although the corpus was largely developed for the evaluation of natural language processing (NLP) systems, as discussed in Lee et al. (2012), two major design decisions affect the use of the corpus and its annotation for qualitative and quantitative study of learner behavior, and these have not been fully discussed before. The first is the composition of the corpus, specifically what learner data to include. The second is how we define grammaticality, a particularly thorny problem for error annotation of Korean particles, which are, to some extent, optional. After explaining the nuances of Korean particles in general, we turn to these two issues and then provide an error analysis showing the differing error patterns of heritage and non-heritage learners. In particular, particle omission rates differ, illustrating the importance of clearly defining grammaticality for (sometimes) optional elements, both for annotation and for pedagogy.