TY - GEN
T1 - An analysis of data fusion methods for speaker verification
AU - Farrell, Kevin R.
AU - Ramachandran, Ravi P.
AU - Mammone, Richard J.
PY - 1998
Y1 - 1998
N2 - We analyse the diversity of information as provided by several modeling approaches for speaker verification. This information is used to facilitate the fusion of the individual results into an overall result that provides advantages in accuracy over the individual models. The modeling methods that are evaluated consist of the neural tree network (NTN), Gaussian mixture model (GMM), hidden Markov model (HMM), and dynamic time warping (DTW). With the exception of DTW, all methods utilize subword-based approaches. The phrase-level scores for each modeling approach are used for combination. Several data fusion methods are evaluated for combining the model results, including the linear and log opinion pool approaches along with voting. The results of the above analysis have been integrated into a system that has been tested with several databases collected within landline and cellular environments. We have found the linear and log opinion pool methods to consistently reduce the error rate from that obtained when the models are need individually.
AB - We analyse the diversity of information as provided by several modeling approaches for speaker verification. This information is used to facilitate the fusion of the individual results into an overall result that provides advantages in accuracy over the individual models. The modeling methods that are evaluated consist of the neural tree network (NTN), Gaussian mixture model (GMM), hidden Markov model (HMM), and dynamic time warping (DTW). With the exception of DTW, all methods utilize subword-based approaches. The phrase-level scores for each modeling approach are used for combination. Several data fusion methods are evaluated for combining the model results, including the linear and log opinion pool approaches along with voting. The results of the above analysis have been integrated into a system that has been tested with several databases collected within landline and cellular environments. We have found the linear and log opinion pool methods to consistently reduce the error rate from that obtained when the models are need individually.
UR - http://www.scopus.com/inward/record.url?scp=0002525996&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0002525996&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.1998.675468
DO - 10.1109/ICASSP.1998.675468
M3 - Conference contribution
AN - SCOPUS:0002525996
SN - 0780344286
SN - 9780780344280
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 1129
EP - 1132
BT - Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 1998
T2 - 1998 23rd IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 1998
Y2 - 12 May 1998 through 15 May 1998
ER -