An analysis of data fusion methods for speaker verification

Kevin R. Farrell, Ravi P. Ramachandran, Richard J. Mammone

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

16 Citations (Scopus)

Abstract

We analyse the diversity of information as provided by several modeling approaches for speaker verification. This information is used to facilitate the fusion of the individual results into an overall result that provides advantages in accuracy over the individual models. The modeling methods that are evaluated consist of the neural tree network (NTN), Gaussian mixture model (GMM), hidden Markov model (HMM), and dynamic time warping (DTW). With the exception of DTW, all methods utilize subword-based approaches. The phrase-level scores for each modeling approach are used for combination. Several data fusion methods are evaluated for combining the model results, including the linear and log opinion pool approaches along with voting. The results of the above analysis have been integrated into a system that has been tested with several databases collected within landline and cellular environments. We have found the linear and log opinion pool methods to consistently reduce the error rate from that obtained when the models are used individually.
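
The three fusion rules named in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, weights, or thresholds, so everything below is an assumption: the textbook definitions are used (linear pool as a weighted arithmetic mean, log pool as an unnormalized weighted geometric mean, voting as a strict majority of per-model accept decisions), the phrase-level scores are assumed to lie in (0, 1], and the function names, example scores, equal weights, and 0.5 threshold are hypothetical.

```python
import math
from typing import Sequence


def _check_lengths(scores: Sequence[float], weights: Sequence[float]) -> None:
    """Reject mismatched inputs instead of silently truncating in zip()."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")


def linear_opinion_pool(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted arithmetic mean of the model scores: sum_i w_i * p_i."""
    _check_lengths(scores, weights)
    return sum(w * p for w, p in zip(weights, scores))


def log_opinion_pool(scores: Sequence[float], weights: Sequence[float],
                     eps: float = 1e-12) -> float:
    """Unnormalized weighted geometric mean: prod_i p_i ** w_i.

    Computed in the log domain; scores are floored at eps (an assumed
    safeguard) so that a zero score does not raise a math domain error.
    """
    _check_lengths(scores, weights)
    return math.exp(sum(w * math.log(max(p, eps)) for w, p in zip(weights, scores)))


def majority_vote(scores: Sequence[float], threshold: float = 0.5) -> bool:
    """Accept the claimed speaker if more than half of the models accept."""
    accepts = sum(1 for p in scores if p >= threshold)
    return 2 * accepts > len(scores)


# Hypothetical phrase-level scores for the NTN, GMM, HMM, and DTW models.
example_scores = [0.9, 0.8, 0.4, 0.7]
equal_weights = [0.25, 0.25, 0.25, 0.25]

linear_score = linear_opinion_pool(example_scores, equal_weights)
log_score = log_opinion_pool(example_scores, equal_weights)
vote_accepts = majority_vote(example_scores)
```

In this sketch the two pooled scores would still be compared against a decision threshold to accept or reject the speaker, whereas voting thresholds each model first and combines the resulting decisions. A single low score pulls the log pool down more strongly than the linear pool, which is one reason the two rules can behave differently on the same inputs.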

Original language: English (US)
Title of host publication: Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 1998
Pages: 1129-1132
Number of pages: 4
DOIs: https://doi.org/10.1109/ICASSP.1998.675468
State: Published - Dec 1 1998
Event: 1998 23rd IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 1998 - Seattle, WA, United States
Duration: May 12 1998 → May 15 1998

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2
ISSN (Print): 1520-6149

Other

Other: 1998 23rd IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 1998
Country: United States
City: Seattle, WA
Period: 5/12/98 → 5/15/98

Fingerprint

Data fusion
Hidden Markov models

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Farrell, K. R., Ramachandran, R. P., & Mammone, R. J. (1998). An analysis of data fusion methods for speaker verification. In Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 1998 (pp. 1129-1132). [675468] (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2). https://doi.org/10.1109/ICASSP.1998.675468
@inproceedings{d1388259fb6f480394f068a189cbe4e9,
title = "An analysis of data fusion methods for speaker verification",
abstract = "We analyse the diversity of information as provided by several modeling approaches for speaker verification. This information is used to facilitate the fusion of the individual results into an overall result that provides advantages in accuracy over the individual models. The modeling methods that are evaluated consist of the neural tree network (NTN), Gaussian mixture model (GMM), hidden Markov model (HMM), and dynamic time warping (DTW). With the exception of DTW, all methods utilize subword-based approaches. The phrase-level scores for each modeling approach are used for combination. Several data fusion methods are evaluated for combining the model results, including the linear and log opinion pool approaches along with voting. The results of the above analysis have been integrated into a system that has been tested with several databases collected within landline and cellular environments. We have found the linear and log opinion pool methods to consistently reduce the error rate from that obtained when the models are used individually.",
author = "Farrell, {Kevin R.} and Ramachandran, {Ravi P.} and Mammone, {Richard J.}",
year = "1998",
month = "12",
day = "1",
doi = "10.1109/ICASSP.1998.675468",
language = "English (US)",
isbn = "0780344286",
series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
pages = "1129--1132",
booktitle = "Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 1998",

}

TY - GEN

T1 - An analysis of data fusion methods for speaker verification

AU - Farrell, Kevin R.

AU - Ramachandran, Ravi P.

AU - Mammone, Richard J.

PY - 1998/12/1

Y1 - 1998/12/1

N2 - We analyse the diversity of information as provided by several modeling approaches for speaker verification. This information is used to facilitate the fusion of the individual results into an overall result that provides advantages in accuracy over the individual models. The modeling methods that are evaluated consist of the neural tree network (NTN), Gaussian mixture model (GMM), hidden Markov model (HMM), and dynamic time warping (DTW). With the exception of DTW, all methods utilize subword-based approaches. The phrase-level scores for each modeling approach are used for combination. Several data fusion methods are evaluated for combining the model results, including the linear and log opinion pool approaches along with voting. The results of the above analysis have been integrated into a system that has been tested with several databases collected within landline and cellular environments. We have found the linear and log opinion pool methods to consistently reduce the error rate from that obtained when the models are used individually.

UR - http://www.scopus.com/inward/record.url?scp=0002525996&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0002525996&partnerID=8YFLogxK

U2 - 10.1109/ICASSP.1998.675468

DO - 10.1109/ICASSP.1998.675468

M3 - Conference contribution

AN - SCOPUS:0002525996

SN - 0780344286

SN - 9780780344280

T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

SP - 1129

EP - 1132

BT - Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 1998

ER -
