Three-step coreference-based summarizer for Polish news texts


Mateusz Kopec ^[1]
1. [1] Institute of Computer Science, Polish Academy of Sciences
Published in: Poznan Studies in Contemporary Linguistics, ISSN 1732-0747, e-ISSN 1897-7499, Vol. 55, No. 2, 2019 (Issue dedicated to: Current state of the art in language technology for Polish), pp. 397-443
Language: English
Abstract
- This article addresses the problem of automatic summarization of press articles in Polish. The main novelty of this research lies in the proposal of a three-step summarization algorithm that benefits from coreference information.
  
  In the related work section, all coreference-based approaches to summarization are presented. We then describe in detail all publicly available summarization tools developed for the Polish language. We state the problem of single-document press article summarization for Polish and describe the training and evaluation dataset: the POLISH SUMMARIES CORPUS.
  
  Next, a new coreference-based extractive summarization system, NICOLAS, is introduced. Its algorithm uses advanced third-party preprocessing tools to extract coreference information from the text to be summarized. This information is transformed into a complex set of features related to coreference concepts (mentions and coreference clusters), which are used to train the summarization system on a manually prepared corpus of gold summaries.
  
  The proposed solution is compared with the best publicly available summarization systems for Polish and with two state-of-the-art tools developed for English but adapted to Polish for this article. The NICOLAS summarization system obtains the best scores, outperforming the other systems on selected metrics in a statistically significant way. The evaluation also includes the calculation of upper bounds: human performance and a theoretical upper bound.
