Performance of a phonetic encoding scheme for speech recognition using neural networks

Authors: V. Rodellar Biarge, Pedro Gómez Vilda, María Mercedes Pérez Castellanos, María Isabel Garcia Clemente, Fernando J. Naharro Berrocal, Consuelo Gonzalo Martín
Published in: Panel '92: actas, XVIII Conferencia Latinoamericana de Informática, 1992, pp. 1041-1048
Language: Portuguese
Abstract
Through the present work, an Encoding Scheme for the Identification of the Phonetic Features of Speech, introduced and developed in preliminary research [1, 2], is implemented on a Time-Delay Back-Propagation Neural Network (TDBPNN), and its most relevant features are analyzed. The Encoding Scheme is based on an 8-bit Hamming code and can be represented by an 8-dimensional hypercube. A separate Phonetic Subgraph, containing only the relevant nodes and edges, is presented, showing the Minimum Distance Pairs for Spanish. The problem of using a Time-Delay Neural Network to support this Encoding Scheme is addressed, pointing to the methods used to train the Network. To this end, a Fragmentation and Labeling Technique based on the PARCORgram Correlation Matrix of Speech is presented. Results show that, using a 48:9:8 Back-Propagation Time-Delay Network with VCV or CV Speech fragments from the densest subset of sounds in the Phonetic Subgraph, the percentage of failures produced during the recognition process may be substantially reduced. The technique shown may be used in Computer Aided Speech Learning.
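
The idea of placing 8-bit codes on the vertices of an 8-dimensional hypercube and looking for Minimum Distance Pairs can be illustrated with a minimal Python sketch. It is not the authors' implementation: the labels and bit patterns in `CODES` are hypothetical placeholders, since the abstract does not give the actual code table, and the function names `hamming_distance` and `minimum_distance_pairs` are assumptions made only for this illustration.

```python
from itertools import combinations

# Hypothetical 8-bit codes (NOT the paper's table): each label stands for
# one sound class placed on a vertex of the 8-dimensional hypercube.
CODES = {
    "A": 0b00000001,
    "B": 0b00000011,
    "C": 0b00000101,
    "D": 0b00000111,
    "E": 0b11110000,
}


def hamming_distance(x: int, y: int) -> int:
    """Number of bit positions in which two codes differ,
    i.e. the number of hypercube edges on a shortest path between them."""
    return bin(x ^ y).count("1")


def minimum_distance_pairs(codes: dict) -> tuple:
    """Return the smallest pairwise Hamming distance and every pair
    of labels that attains it (labels compared in sorted order)."""
    scored = [
        ((a, b), hamming_distance(codes[a], codes[b]))
        for a, b in combinations(sorted(codes), 2)
    ]
    best = min(distance for _, distance in scored)
    return best, [pair for pair, distance in scored if distance == best]


if __name__ == "__main__":
    distance, pairs = minimum_distance_pairs(CODES)
    print(distance, pairs)
```

Pairs at the minimum distance are the ones a classifier with 8 output units is most likely to confuse, because a single output bit error turns one code into the other.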
