
Dialnet


GraWiTas: a Grammar-based Wikipedia Talk Page Parser

  • Authors: Benjamín Cabrera, Laura Steinert, Björn Ross
  • Published in: 15th Conference of the European Chapter of the Association for Computational Linguistics: Proceedings of the Software Demonstrations: April 3-7, 2017, Valencia, Spain / Anselmo Peñas Padilla (ed.), André Martins Brandão (ed.), 2017, ISBN 978-1-945626-36-4, pp. 21-24
  • Language: English
  • Abstract
    • Wikipedia offers researchers unique insights into the collaboration and communication patterns of a large self-regulating community of editors. The main medium of direct communication between editors of an article is the article’s talk page. However, a talk page file is unstructured and therefore difficult to analyse automatically. A few parsers exist that enable its transformation into a structured data format. However, they are rarely open source, support only a limited subset of the talk page syntax – resulting in the loss of content – and usually support only one export format. Together with this article we offer a very fast, lightweight, open source parser with support for various output formats. In a preliminary evaluation it achieved a high accuracy. The parser uses a grammar-based approach – offering a transparent implementation and easy extensibility.

