A Comparative Study of Robust Linear Predictive Analysis Methods with Applications to Speaker Identification

Ravi P. Ramachandran, Mihailo S. Zilovic, Richard J. Mammone

Research output: Contribution to journal › Article › peer-review

30 Scopus citations

Abstract

In this paper, various linear predictive (LP) analysis methods are studied and compared from the points of view of robustness to noise and of application to speaker identification. The key to the success of LP techniques is in separating the vocal tract information from the pitch information present in a speech signal even under noisy conditions. In addition to considering the conventional, one-shot weighted least-squares methods, we propose three other approaches with the above point as a motivation. The first is an iterative approach that leads to the weighted least absolute value solution. The second is an extension of the one-shot least-squares approach and achieves an iterative update of the weights. The update is a function of the residual and is based on minimizing a Mahalanobis distance. Third, the weighted total least-squares formulation is considered. A study of the deviations in the LP parameters is done when noise (white Gaussian and impulsive) is added to the speech. It is revealed that the most robust method depends on the type of noise. Closed-set speaker identification experiments with 20 speakers are conducted using a vector quantizer classifier trained on clean speech. The relative performance of the various LP approaches depends on the type of speech material used for testing.
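
As an illustration of the first two ideas in the abstract, the Python sketch below sets up LP analysis as a regression, solves it by one-shot weighted least squares, and approximates the least absolute value solution by iteratively reweighted least squares. It is not the authors' implementation. The function names, the covariance-style framing, the reweighting rule w = 1 / max(|e|, eps), and the fixed iteration count are all assumptions made for illustration. The paper's own weighting scheme, its Mahalanobis-distance weight update, and the weighted total least-squares formulation are not reproduced here.

```python
import numpy as np


def lp_design(signal, order):
    """Build the regression s[n] ~ sum_k a[k] * s[n-k-1] for n >= order."""
    s = np.asarray(signal, dtype=float)
    # Column k holds the samples delayed by k+1 (hypothetical helper layout).
    X = np.column_stack(
        [s[order - k - 1 : len(s) - k - 1] for k in range(order)]
    )
    y = s[order:]
    return X, y


def lp_weighted_ls(X, y, w):
    """One-shot weighted least squares: minimise sum_n w[n] * e[n]**2."""
    sw = np.sqrt(np.asarray(w, dtype=float))
    coeffs, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coeffs


def lp_least_abs(X, y, n_iter=30, eps=1e-8):
    """Approximate the least absolute value solution by reweighted LS.

    Assumption: weights 1/|e| turn the squared-error cost into an
    absolute-error cost at convergence; eps guards against division by zero.
    """
    a = lp_weighted_ls(X, y, np.ones(len(y)))
    for _ in range(n_iter):
        e = y - X @ a                      # prediction residual
        w = 1.0 / np.maximum(np.abs(e), eps)
        a = lp_weighted_ls(X, y, w)
    return a


if __name__ == "__main__":
    # Synthetic second-order autoregressive signal, used only as a stand-in
    # for a speech frame.
    rng = np.random.default_rng(0)
    s = np.zeros(2000)
    for n in range(2, len(s)):
        s[n] = 1.3 * s[n - 1] - 0.6 * s[n - 2] + rng.standard_normal()
    X, y = lp_design(s, order=2)
    print("least squares:       ", lp_weighted_ls(X, y, np.ones(len(y))))
    print("least absolute value:", lp_least_abs(X, y))
```

Passing non-uniform weights to `lp_weighted_ls` gives the one-shot weighted solution. The reweighting loop in `lp_least_abs` shows the general shape of a residual-driven weight update, under the assumptions stated above.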

Original language: English (US)
Pages (from-to): 117-125
Number of pages: 9
Journal: IEEE Transactions on Speech and Audio Processing
Volume: 3
Issue number: 2
State: Published - Mar 1995

All Science Journal Classification (ASJC) codes

  • Software
  • Acoustics and Ultrasonics
  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering
