Rank-based frame classification for usable speech detection in speaker identification systems

James Ethridge, Ravi Ramachandran

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

The performance of a speaker identification (SID) system degrades substantially when there is a mismatch between the training and testing conditions. Discriminating between temporal sections of speech signals which are speech-like (SID usable) and noise-like (SID unusable) while only retaining frames labeled SID usable can augment SID performance substantially. In this paper, a novel labeling system for SID usable and SID unusable frames is presented for a GMM based SID system. This is motivated by a control experiment demonstrating that very high SID accuracies are theoretically achievable by removing frames that contribute more to the scores of competing speakers rather than the true speaker. To blindly identify these SID usable and unusable frames, the Mahalanobis distance and an ensemble of decision tree classifiers (with boosting) were trained on a dataset which was different from the enrollment database for the SID system. The classifier based techniques yielded improvements over the base speaker identification system (all frames used) in all cases when the speech signal was corrupted with additive white or additive pink noise.
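The control experiment described in the abstract can be illustrated with a small sketch. The abstract does not give the exact labeling rule, so everything below is an assumption made for illustration: the function names `label_usable_frames` and `identify` are hypothetical, the per-frame log-likelihoods are toy values rather than GMM outputs, and a frame is treated as SID usable when the true speaker's model ranks first in that frame (`max_rank=1`). Because the labeling needs the true speaker identity, it is an oracle and not the blind classifier-based method the paper proposes.

```python
import numpy as np

def label_usable_frames(frame_loglik, true_speaker, max_rank=1):
    """Oracle labeling (assumed rule): a frame is usable when the true
    speaker's model ranks within `max_rank` among all speaker models.

    frame_loglik: array of shape (n_frames, n_speakers) holding the
    per-frame log-likelihood of each enrolled speaker model.
    """
    true_scores = frame_loglik[:, [true_speaker]]
    # Rank 1 means no competing model scores higher in that frame.
    ranks = 1 + np.sum(frame_loglik > true_scores, axis=1)
    return ranks <= max_rank

def identify(frame_loglik, mask=None):
    """Pick the speaker with the highest summed log-likelihood,
    optionally using only the frames selected by `mask`."""
    if mask is not None:
        frame_loglik = frame_loglik[mask]
    return int(np.argmax(frame_loglik.sum(axis=0)))

# Toy scores for 4 frames and 3 enrolled speakers (illustrative values only).
frame_loglik = np.array([
    [-1.0, -5.0, -2.0],
    [-4.0, -3.5, -0.5],
    [-6.0, -5.0, -0.2],
    [-1.5, -2.0, -3.0],
])
true_speaker = 0

mask = label_usable_frames(frame_loglik, true_speaker)
all_frames_decision = identify(frame_loglik)
usable_frames_decision = identify(frame_loglik, mask)

print("usable frames:", mask.tolist())
print("decision with all frames:", all_frames_decision)
print("decision with usable frames only:", usable_frames_decision)
```

In this toy case the two noisy frames favor a competing speaker strongly enough to flip the utterance-level decision, and discarding them restores the correct identity. A blind system would have to predict `mask` from the frames themselves, which is the role the abstract assigns to the Mahalanobis distance and boosted decision tree classifiers.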

Original language: English (US)
Title of host publication: 2015 IEEE International Conference on Digital Signal Processing, DSP 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 292-296
Number of pages: 5
ISBN (Electronic): 9781479980581
DOI: https://doi.org/10.1109/ICDSP.2015.7251878
State: Published - Sep 9 2015
Event: IEEE International Conference on Digital Signal Processing, DSP 2015 - Singapore, Singapore
Duration: Jul 21 2015 → Jul 24 2015

Publication series

Name: International Conference on Digital Signal Processing, DSP
Volume: 2015-September

Other

Other: IEEE International Conference on Digital Signal Processing, DSP 2015
Country: Singapore
City: Singapore
Period: 7/21/15 → 7/24/15

All Science Journal Classification (ASJC) codes

  • Signal Processing

  • Cite this

    Ethridge, J., & Ramachandran, R. (2015). Rank-based frame classification for usable speech detection in speaker identification systems. In 2015 IEEE International Conference on Digital Signal Processing, DSP 2015 (pp. 292-296). [7251878] (International Conference on Digital Signal Processing, DSP; Vol. 2015-September). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICDSP.2015.7251878