
Dialnet


Abstract of Semi-lexical features in corpus transcription

Gisle Andersen

  • A particular challenge in corpus compilation is how to orthographically transcribe units that are not part of any standardised vocabulary. The problematic categories include voiced pauses, minimal response signals, interjections, certain discourse markers, phonologically reduced forms, colloquialisms and dialect forms. Such semi-lexical features are usually represented by regular phonemic-graphemic correspondences but are nevertheless often handled inconsistently. This paper reviews a number of existing transcription guidelines and assesses whether the recommendations they provide are sufficient and detailed enough to secure a consistent transcription of these categories. It further assesses to what extent the transcription of semi-lexical features is consistent within and across two spoken corpora. On the basis of a cross-corpus comparison of the Bergen Corpus of London Teenage Language (COLT) and the London English Corpus (LEC), the paper provides specific recommendations for corpus transcription.

