No Need to Get Wasteful: The Way to Train a Lightweight Competitive Spelling Checker Using (Concentrated) Synthetic Datasets

  • Authors: Vladimir Starchenko
  • Published in: Computación y Sistemas (CyS), ISSN 1405-5546, e-ISSN 2007-9737, Vol. 28, No. 4, 2024, pp. 1865-1877
  • Language: English
  • Summary
    • Abstract: This study focuses on spell checking, which remains problematic for modern error correction systems. Based on the T5 architecture, we create a lightweight spell-checking tool that can be used in combination with a large language model (LLM) and significantly improves the overall result of the error correction system. It also performs competitively with other recently developed spell-checking tools despite being considerably smaller. The model's high performance results from introducing two synthetic datasets: one with a high density of spelling errors and one with errors that are more difficult to correct.
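
The abstract does not say how the synthetic datasets are built, so the following is only an illustrative sketch of what "a high density of spelling errors" could mean in practice: a configurable fraction of words in a clean sentence receives one random character-level edit, producing a (noisy, clean) training pair. The function names, the `density` parameter, and the four edit operations are assumptions made for this example, not the author's procedure.

```python
import random
import string


def corrupt_word(word, rng):
    """Apply one random character-level edit that always changes the word.

    Hypothetical helper: the edit types are an assumption, not taken from the paper.
    """
    if len(word) < 2:
        # Too short to swap or delete safely, so append a random letter.
        return word + rng.choice(string.ascii_lowercase)
    i = rng.randrange(len(word) - 1)
    op = rng.choice(["swap", "delete", "insert", "substitute"])
    if op == "swap" and word[i] != word[i + 1]:
        return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    if op == "insert":
        return word[:i] + rng.choice(string.ascii_lowercase) + word[i:]
    if op == "substitute":
        choices = [c for c in string.ascii_lowercase if c != word[i]]
        return word[:i] + rng.choice(choices) + word[i + 1:]
    # Deletion, also used as the fallback when a swap would be a no-op.
    return word[:i] + word[i + 1:]


def make_noisy_pair(sentence, density, seed=0):
    """Return (noisy, clean), corrupting each word with probability `density`."""
    rng = random.Random(seed)
    words = sentence.split()
    noisy = [corrupt_word(w, rng) if rng.random() < density else w for w in words]
    return " ".join(noisy), sentence


if __name__ == "__main__":
    clean = "the quick brown fox jumps over the lazy dog"
    for density in (0.1, 0.5, 1.0):
        noisy, _ = make_noisy_pair(clean, density, seed=42)
        print(f"density={density}: {noisy}")
```

Raising `density` toward 1.0 yields sentences in which nearly every word is misspelled, which is one plausible reading of a "concentrated" dataset; a seq2seq model such as T5 would then be trained to map the noisy string back to the clean one.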

The article metadata was obtained from SciELO México
