The Journal of China Universities of Posts and Telecommunications ›› 2016, Vol. 23 ›› Issue (4): 9-16. doi: 10.1016/S1005-8885(16)60040-7

• Networks •

Novel DTD and VAD assisted voice detection algorithm for VoIP systems

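Meng MING1, Ke WANG1, Hong JI2

  1. Beijing University of Posts and Telecommunications
  2. P.O. Box 199, Beijing University of Posts and Telecommunications
  • Received: 2016-03-14 Revised: 2016-05-24 Online: 2016-08-30 Published: 2016-08-30
  • Corresponding author: Meng MING E-mail: mingmeng2013@126.com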

Abstract: Echo cancellation plays an important role in current Internet protocol (IP) based interactive voice systems, and voice state detection is an essential part of it. Voice state detection mainly comprises two parts: double talk detection (DTD) and voice activity detection (VAD). DTD detects double talk and prevents filter divergence in the presence of near-end speech, while VAD determines near-end voice activity and outputs a silence indicator when the near end is silent. However, running DTD unconditionally may falsely declare double talk when both ends are silent, updating the filter coefficients while the far end is silent may cause the filter to diverge, and current VAD algorithms may mistake residual echo from the near end for far-end speech. Therefore, a voice detection algorithm combining DTD and far-end VAD is proposed. DTD is performed only when the VAD declares far-end speech, filtering and coefficient updates are halted when the VAD declares far-end silence, and the far-end VAD is a multi-feature VAD based on short-time energy and correlation. The new algorithm improves the accuracy of DTD, prevents filter divergence, and excludes the case in which the far-end signal contains only residual echo from the near end. Results of tests on real signals show that the voice state decisions of the new algorithm are accurate and that echo cancellation performance is improved.

Keywords: echo cancellation, double talk detection (DTD), voice activity detection (VAD), adaptive filter
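
The gating described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the energy and correlation features are combined, nor which DTD is used. The sketch assumes that a far-end frame counts as speech when its short-time energy exceeds a threshold and it is not strongly correlated with `near_sent`, a hypothetical, already time-aligned frame of what the near end sent earlier. A simple Geigel-style peak comparison stands in for the DTD. All function names, parameters, and threshold values are illustrative assumptions.

```python
import math
from enum import Enum


class VoiceState(Enum):
    FAR_END_SILENT = "far-end silent"  # halt filtering and coefficient updates
    SINGLE_TALK = "single talk"        # filter and adapt
    DOUBLE_TALK = "double talk"        # filter, but freeze coefficients


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def normalized_correlation(a, b):
    """Absolute normalized cross-correlation at zero lag (0.0 if either is all zero)."""
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    if den == 0.0:
        return 0.0
    return abs(sum(x * y for x, y in zip(a, b))) / den


def far_end_vad(far, near_sent, energy_thresh=1e-4, corr_thresh=0.9):
    """Assumed two-feature rule: enough energy, and not just an echo of near_sent."""
    if short_time_energy(far) < energy_thresh:
        return False  # too quiet to be speech
    # High correlation with what the near end sent suggests residual echo only.
    return normalized_correlation(far, near_sent) < corr_thresh


def geigel_dtd(far, near, ratio=0.5):
    """Geigel-style stand-in DTD: near-end peak exceeds ratio times far-end peak."""
    return max(abs(s) for s in near) > ratio * max(abs(s) for s in far)


def classify_frame(far, near, near_sent):
    """Run the DTD only when the far-end VAD declares speech."""
    if not far_end_vad(far, near_sent):
        return VoiceState.FAR_END_SILENT
    if geigel_dtd(far, near):
        return VoiceState.DOUBLE_TALK
    return VoiceState.SINGLE_TALK
```

With a silent far end, `geigel_dtd` alone would flag double talk for any nonzero near-end noise, because the far-end peak is zero. Under the assumptions above, `classify_frame` returns `VoiceState.FAR_END_SILENT` in that case, so the DTD is never consulted.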

