The Journal of China Universities of Posts and Telecommunications, 2007, Vol. 14, Issue (4): 126-130. doi: 1005-8885 (2007) 04-0126-05

• Signal processing •

Discriminative tonal feature extraction method in Mandarin speech recognition

HUANG Hao; ZHU Jie

  1. Department of Electronic Engineering, Shanghai Jiao Tong University
  • Received: 2007-04-24 Online: 2007-12-24
  • Corresponding author: HUANG Hao

Abstract:

To exploit the supra-segmental nature of Mandarin tones, this article proposes a feature extraction method for hidden Markov model (HMM) based tone modeling. The method uses linear transforms to project the F0 (fundamental frequency) features of neighboring syllables into compensation terms, which are added to the original F0 features of the current syllable. The transforms are trained discriminatively with an objective function termed “minimum tone error”, a smooth approximation of tone recognition accuracy. Experiments show that the new tonal features improve the tone recognition rate by 3.82% over the baseline, which uses maximum-likelihood-trained HMMs on the normal F0 features. Further experiments show that discriminative HMM training on the new features is 8.78% better than the baseline.

Key words:

discriminative training; tone recognition; feature extraction; Mandarin speech recognition
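
The compensation step described in the abstract (project the F0 features of neighboring syllables with linear transforms and add the result to the current syllable's features) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: one 2-dimensional feature vector per syllable, only the immediately preceding and following syllables as context, the function names `mat_vec` and `compensate`, and the hand-picked matrices `M_PREV` and `M_NEXT`. In the article the transforms are learned with the minimum tone error objective; that training step is not shown.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def compensate(current, previous, following, m_prev, m_next):
    """Add linearly projected neighbor features to the current features.

    Computes current + m_prev * previous + m_next * following.
    """
    from_prev = mat_vec(m_prev, previous)
    from_next = mat_vec(m_next, following)
    return [c + p + n for c, p, n in zip(current, from_prev, from_next)]


# Toy F0 feature vectors, one per syllable (hypothetical values).
prev_syl = [1.0, 0.0]
cur_syl = [2.0, 1.0]
next_syl = [0.0, 2.0]

# Hypothetical transforms chosen by hand; the article trains them
# discriminatively, which this sketch does not attempt.
M_PREV = [[0.5, 0.0], [0.0, 0.5]]
M_NEXT = [[0.0, 0.25], [0.25, 0.0]]

new_features = compensate(cur_syl, prev_syl, next_syl, M_PREV, M_NEXT)
print(new_features)
```

With all-zero transforms the function returns the original features unchanged, so under this formulation the compensated features reduce to the plain F0 features when the transforms contribute nothing.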
