Stadtkreis Heidelberg, Germany
City of Cape Town, South Africa
When modelling linguistic resources as Linked Data, identifying languages by means of language tags and language codes is mandatory. IETF’s BCP 47 defines the standard for tags, and ISO 639 provides the codes. However, these codes are insufficient for identifying diatopic variation within a language or distinct historical stages of a language. This weakness hampers the accurate identification of data, which in turn leads to ambiguity when the data is extended, aggregated and re-used, all of which are key notions of Linked Open Data and the Semantic Web. We show the limitations of language identification with a case study of French linguistic data from both a diachronic and a diatopic perspective. Our example data derives from dictionaries of Old French, Middle French, and Modern French dialects, and from a Modern French linguistic atlas. For each example, we propose a solution that uses the privateuse sub-tag of BCP 47’s language tag and thus stays within the boundaries of existing standards. Using a predefined pattern for the privateuse sub-tag, the solutions enable a dialect or a patois, in combination with a time period, to be defined and identified. This can lead to shared agreement on language tags, which will increase interoperability within the context of Linked Data.
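As a rough illustration of the idea, the following Python sketch assembles a language tag with a privateuse sub-tag. The abstract does not spell out the predefined pattern, so the ordering used here (dialect label, then time period) and the example labels `lorrain` and `1200` are assumptions made purely for illustration, as is the helper name `build_private_use_tag`. What the sketch does rely on from the standards is that BCP 47 introduces private-use subtags with the singleton `x`, that each such subtag is 1 to 8 ASCII alphanumeric characters, and that ISO 639 assigns `fro` to Old French.

```python
import re

# BCP 47 (RFC 5646): each private-use subtag after the singleton "x"
# must be 1 to 8 ASCII letters or digits.
_PRIVATE_SUBTAG = re.compile(r"[A-Za-z0-9]{1,8}")
# Simplification: accept only 2- or 3-letter ISO 639 primary language codes.
_LANGUAGE = re.compile(r"[A-Za-z]{2,3}")


def build_private_use_tag(language, *private_subtags):
    """Return a tag of the form <language>-x-<subtag>-<subtag>...

    Hypothetical helper: the order and meaning of the private-use
    subtags (here: dialect, then period) is an assumed convention,
    not the pattern defined in the paper.
    """
    if not _LANGUAGE.fullmatch(language):
        raise ValueError(f"not a 2- or 3-letter language code: {language!r}")
    for subtag in private_subtags:
        if not _PRIVATE_SUBTAG.fullmatch(subtag):
            raise ValueError(f"invalid private-use subtag: {subtag!r}")
    if not private_subtags:
        # No private-use information: the plain language code is the tag.
        return language.lower()
    parts = [language.lower(), "x"] + [s.lower() for s in private_subtags]
    return "-".join(parts)


# Old French ("fro") with an assumed dialect label and an assumed period label.
tag = build_private_use_tag("fro", "lorrain", "1200")
print(tag)
```

Because private-use subtags carry no meaning outside a private agreement, a tag built this way only becomes interoperable once the parties exchanging data share the same pattern, which is the kind of agreement the abstract argues for.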