TY - GEN
T1 - Feature and signal enhancement for robust speaker identification of G.729 decoded speech
AU - Raval, Kalpesh
AU - Ramachandran, Ravi P.
AU - Shetty, Sachin S.
AU - Smolenski, Brett Y.
PY - 2012
Y1 - 2012
AB - For wireless remote access security, there is an emerging need for biometric speaker identification (SID) systems to be robust to speech coding distortion. This paper presents results on a Gaussian mixture model (GMM) based SID system that is trained on clean speech and tested on the decoded speech of the G.729 codec. To mitigate the performance loss due to mismatched training and testing conditions, five robust features, two enhancement approaches and three fusion strategies are used. The first enhancement method is feature compensation based on the affine transform. The second is the McCree signal enhancement approach, which is based on the spectral envelope information in the G.729 bit stream. Ensemble systems using decision-level fusion, score fusion and Borda count are studied. The best performance is obtained by combining signal enhancement, feature compensation and decision-level fusion, which yields an identification success rate (ISR) of 89.8%.
UR - http://www.scopus.com/inward/record.url?scp=84869057700&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84869057700&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-34500-5_41
DO - 10.1007/978-3-642-34500-5_41
M3 - Conference contribution
AN - SCOPUS:84869057700
SN - 9783642344992
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 345
EP - 352
BT - Neural Information Processing - 19th International Conference, ICONIP 2012, Proceedings
T2 - 19th International Conference on Neural Information Processing, ICONIP 2012
Y2 - 12 November 2012 through 15 November 2012
ER -