The Journal of China Universities of Posts and Telecommunications ›› 2019, Vol. 26 ›› Issue (3): 98-104. doi: 10.19682/j.cnki.1005-8885.2019.0010

Improved vocal effort modeling by exploiting echo state network and radial basis function network


  • Received:2018-09-18 Revised:2018-12-17 Online:2019-06-30 Published:2019-06-30
  • Contact: Hao CHAO
  • Supported by: Fundamental Research Funds for the Universities of Henan Province; Foundation for University Key Teacher by Henan Province; Foundation for scientific and technological project of Henan Province

Abstract: In vocal effort (VE) recognition, the assumption that frames are independent makes it difficult for frame-based spectral features to capture the intrinsic temporal correlation and dynamic change information in speech. A novel VE detection method based on an echo state network (ESN) is presented. The reservoir of the ESN maps each input sequence to a fixed-dimensional vector in a high-dimensional coding space. Radial basis function (RBF) networks are then used to fit the probability density function (pdf) of each VE mode from the vectors in this coding space. Finally, the minimum-error-rate Bayesian decision rule is used to determine the VE mode. Experiments on an isolated-word test set achieve an average recognition accuracy of 79.5%, and the results show that the proposed method effectively overcomes the limitation imposed by the frame-independence assumption.
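The pipeline summarized in the abstract (reservoir encoding, per-mode density estimation, Bayesian decision) can be illustrated with a minimal sketch. It is not the authors' implementation. The reservoir size, spectral radius, tanh update, and use of the final reservoir state as the sequence code are all assumptions. The trained RBF networks are replaced here by a plain Gaussian-kernel density estimate centred on the training codes, and the function names and mode labels are hypothetical.

```python
import numpy as np


def make_reservoir(n_in, n_res, spectral_radius=0.9, seed=0):
    """Draw fixed random ESN weights (assumed sizes and scaling, not from the paper)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
    w = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
    # Rescale the recurrent weights so their spectral radius equals the target value.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w


def encode(frames, w_in, w):
    """Map a (T, n_in) frame sequence to one fixed-size vector: the final reservoir state."""
    x = np.zeros(w.shape[0])
    for u in frames:
        x = np.tanh(w_in @ u + w @ x)
    return x


def rbf_log_density(z, centers, width):
    """Log of an average of isotropic Gaussian kernels centred on `centers`.

    Stand-in for a trained RBF network: every training code is a centre
    and all kernels share one width.
    """
    d2 = np.sum((centers - z) ** 2, axis=1)
    dim = centers.shape[1]
    log_k = -d2 / (2 * width ** 2) - 0.5 * dim * np.log(2 * np.pi * width ** 2)
    m = log_k.max()  # log-sum-exp shift for numerical stability
    return m + np.log(np.mean(np.exp(log_k - m)))


def classify(z, class_centers, priors, width=1.0):
    """Minimum-error-rate Bayes rule: pick the mode with the largest log posterior score."""
    scores = {
        mode: np.log(priors[mode]) + rbf_log_density(z, ctrs, width)
        for mode, ctrs in class_centers.items()
    }
    return max(scores, key=scores.get)
```

In use, each training utterance would be passed through `encode`, the resulting codes grouped by VE mode to form `class_centers`, and a test utterance encoded the same way before calling `classify`. The kernel width and the class priors would need to be tuned on held-out data.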

Key words: vocal effort, echo state network, reservoir, radial basis function, support vector machine
