Cross similarity measurement for speaker adaptive test normalization in text-independent speaker verification

doi:1005-8885 (2008) 02-0130-05

The Journal of China Universities of Posts and Telecommunications, 2008, Vol. 15, Issue 2: 130-134. doi: 1005-8885 (2008) 02-0130-05

• Speech Recognition •


ZHAO Jian, DONG Yuan, ZHAO Xian-yu, YANG Hao, WANG Hai-la

Laboratory of Pattern Recognition and Intelligent System, Beijing University of Posts and Telecommunications, Beijing 100876, China

Received: 2007-07-06 Revised: 1900-01-01 Online: 2008-06-30
Contact: ZHAO Jian

Abstract

Speaker adaptive test normalization (ATnorm) is the most effective of the widely used score normalization approaches in text-independent speaker verification. It selects speaker adaptive impostor cohorts using an extra development corpus in order to enhance recognition performance. This paper presents an improved implementation of ATnorm that offers significant overall advantages over the original. The method adopts a novel cross similarity measurement for speaker adaptive cohort model selection and requires no extra development corpus. It achieves performance comparable to the original ATnorm while moderately reducing computational complexity. With the development corpus thus freed for full use, the overall system performance can be improved significantly. Results on the NIST 2006 Speaker Recognition Evaluation corpora show that the method provides significant improvements in system performance, with overall relative gains of 14.4% in equal error rate (EER) and 14.6% in decision cost function (DCF).
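As a rough illustration of the kind of score normalization the abstract refers to, the following Python sketch shows generic test normalization (Tnorm) with speaker adaptive cohort selection. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the "profile" vectors used to compare models, and the city-block distance are illustrative choices, and the cross similarity measurement proposed in the paper is not specified in the abstract and is not reproduced here.

```python
import numpy as np


def select_cohort(target_profile, cohort_profiles, k):
    """Pick the k cohort models closest to the target speaker model.

    Assumption: each model is summarized by a profile vector (for example,
    its scores on a fixed set of utterances), and closeness is measured by
    city-block (L1) distance. Both are illustrative choices.
    """
    distances = np.abs(cohort_profiles - target_profile).sum(axis=1)
    # Stable sort so that ties are resolved by cohort index.
    return np.argsort(distances, kind="stable")[:k]


def tnorm(raw_score, cohort_scores):
    """Standardize a raw score with the cohort scores of the same test utterance."""
    mu = cohort_scores.mean()
    sigma = cohort_scores.std()
    return (raw_score - mu) / max(sigma, 1e-8)  # guard against zero spread


def adaptive_tnorm(raw_score, test_cohort_scores, target_profile, cohort_profiles, k):
    """Tnorm restricted to the k cohort models selected for this target speaker."""
    idx = select_cohort(target_profile, cohort_profiles, k)
    return tnorm(raw_score, test_cohort_scores[idx])


# Toy data (hypothetical values, for illustration only).
target_profile = np.array([1.0, 2.0, 3.0])
cohort_profiles = np.array([
    [1.0, 2.0, 3.5],   # L1 distance 0.5
    [5.0, 5.0, 5.0],   # L1 distance 9.0
    [0.0, 2.0, 3.0],   # L1 distance 1.0
    [9.0, 9.0, 9.0],   # L1 distance 21.0
])
# Scores of the test utterance against each cohort model.
test_cohort_scores = np.array([1.0, 10.0, 3.0, -4.0])

selected = select_cohort(target_profile, cohort_profiles, k=2)
normalized = adaptive_tnorm(4.0, test_cohort_scores, target_profile, cohort_profiles, k=2)
```

In this toy setup the two nearest cohort models are used to estimate the impostor score mean and spread for the test utterance, so distant cohort models do not influence the normalized score. How the profiles are obtained is exactly where variants differ: the original ATnorm relies on an extra development corpus, whereas the method described in the abstract avoids it.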


Key words:

speaker ATnorm; score normalization; cross similarity measurement; speaker verification; NIST speaker recognition evaluation

CLC number: TN912.34

ZHAO Jian, DONG Yuan, ZHAO Xian-yu, YANG Hao, WANG Hai-la. Cross similarity measurement for speaker adaptive test normalization in text-independent speaker verification[J]. The Journal of China Universities of Posts and Telecommunications, 2008, 15(2): 130-134.

