To raise the correlation between the objective evaluation of computer-assisted English speech and the subjective evaluation of experts, an acoustic model based on discriminative training is proposed to improve the confidence score of the objective evaluation. First, the procedure for obtaining a pronunciation-quality score for a speech feature vector through forced alignment is introduced. Hypothesis-testing theory is then used to show that an acoustic model trained with the discriminative minimum phoneme error criterion is more effective than one trained with the traditional maximum likelihood criterion, yielding confidence scores closer to the subjective assessments. By computing the correlation coefficient between the subjective and objective evaluation results, experiments verify that a speech evaluation system using the discriminative acoustic model achieves higher confidence scores. In addition, a data selection method based on dynamic weighting is proposed and applied to the discriminative training of the acoustic model for continuous speech recognition. The method combines posterior probability with phoneme accuracy to select training data. First, a posterior-probability Beam algorithm prunes the word lattice, and weights are then dynamically assigned to candidate words according to the error rate of the candidate paths on which they lie. Second, the confusion degree between phoneme pairs is computed, and easily confused pairs receive dynamically assigned penalty weights when the phoneme accuracy is calculated. Finally, the expected accuracy of each arc is computed from the probability distribution, and a Gaussian function is used to softly weight the expected phoneme accuracies of all competing arcs. Experimental results show that, compared with the standard minimum phoneme error criterion, the dynamic weighting method achieves higher recognition accuracy and effectively reduces training time.
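The following is a minimal sketch, not the paper's implementation, of the two evaluation steps named above: a goodness-of-pronunciation style confidence score derived from forced-alignment log-likelihoods, and the Pearson correlation between machine scores and expert ratings. The function names, the per-frame normalization, and the example numbers are illustrative assumptions.

```python
# Sketch only: GOP-style confidence from forced alignment, and the
# subjective/objective correlation coefficient. All values are illustrative.
import math


def gop_score(forced_loglik, free_loglik, num_frames):
    """Per-frame difference between the log-likelihood of the forced
    (canonical) phone alignment and that of an unconstrained phone loop."""
    return (forced_loglik - free_loglik) / max(num_frames, 1)


def pearson_correlation(x, y):
    """Pearson correlation between objective scores x and expert scores y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


if __name__ == "__main__":
    # Hypothetical machine scores and expert ratings for five utterances.
    objective = [0.72, 0.55, 0.91, 0.40, 0.66]
    subjective = [4.0, 3.0, 4.5, 2.5, 3.5]
    print("correlation:", round(pearson_correlation(objective, subjective), 3))
```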
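The dynamic-weighting step can likewise be sketched under stated assumptions: posterior-probability Beam pruning keeps only lattice arcs whose log posterior lies within a beam of the best arc, and a Gaussian kernel then softly weights the expected phoneme accuracies of the surviving competing arcs instead of a hard best-arc selection. The lattice representation, the beam value, and the Gaussian width sigma below are hypothetical choices, not values from the paper.

```python
# Sketch only: Beam pruning by posterior and Gaussian soft weighting of
# expected phoneme accuracies for competing lattice arcs.
import math


def beam_prune(arcs, beam=8.0):
    """Keep arcs whose log posterior is within `beam` of the best arc."""
    best = max(a["log_post"] for a in arcs)
    return [a for a in arcs if best - a["log_post"] <= beam]


def gaussian_soft_weights(accuracies, sigma=0.5):
    """Weight competing arcs with a Gaussian kernel centred on the best
    expected phoneme accuracy, rather than a hard 0/1 choice."""
    best = max(accuracies)
    weights = [math.exp(-((best - a) ** 2) / (2.0 * sigma ** 2)) for a in accuracies]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical competing arcs for one lattice time slot.
    arcs = [
        {"word": "ship", "log_post": -0.2, "expected_acc": 0.9},
        {"word": "sheep", "log_post": -1.1, "expected_acc": 0.7},
        {"word": "chip", "log_post": -9.5, "expected_acc": 0.3},  # falls outside the beam
    ]
    kept = beam_prune(arcs)
    weights = gaussian_soft_weights([a["expected_acc"] for a in kept])
    for arc, w in zip(kept, weights):
        print(arc["word"], round(w, 3))
```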