Article Information
Automatic Speaker Recognition Based on Mel-Frequency Cepstral Coefficients and Gaussian Mixture Models

Keywords: Mel Frequency Cepstral Coefficients, Gaussian Mixture Models, Expectation Maximization, Speaker Recognition

Mehran University Research Journal of Engineering & Technology

Volume 32, Issue 4

SHEERAZ MEMON, SANIA BHATTI, FARZANA RAUF ABRO

Abstract

This paper investigates the task of SR (Speaker Recognition) using state-of-the-art techniques. It first presents a technical description of automatic SR, followed by a comparative analysis of several methods available for feature extraction and modeling. Based on this analysis, the NIST 2001, NIST 2002, NIST 2004 and NIST 2006 speaker recognition corpora are used to investigate the state-of-the-art feature extraction and modeling techniques. The state-of-the-art technique for feature extraction is delta MFCC (Mel Frequency Cepstral Coefficients), and for modeling it is GMM (Gaussian Mixture Models) trained with EM (Expectation Maximization). The details of enrollment/training and recognition/testing are also presented. The conventional methods for the different stages of SR systems are summarized.
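
As an illustration of the delta features named in the abstract, the following is a minimal sketch of the commonly used regression formula for delta coefficients, d_t = sum_{n=1..N} n (c_{t+n} - c_{t-n}) / (2 sum_{n=1..N} n^2). The abstract does not give the authors' exact formulation, so the function name `delta_features`, the window half-width `N=2`, and the edge handling (repeating the first and last frames) are assumptions made for this example only.

```python
def delta_features(frames, N=2):
    """Compute delta coefficients for a sequence of cepstral frames.

    frames: list of equal-length lists of floats (one list per frame).
    N:      half-width of the regression window (assumed value, not from the paper).
    Edge frames are handled by repeating the first/last frame (an assumption).
    """
    num_frames = len(frames)
    if num_frames == 0:
        return []
    num_coeffs = len(frames[0])
    # Normalizing term of the regression formula: 2 * sum of n^2.
    denom = 2 * sum(n * n for n in range(1, N + 1))

    deltas = []
    for t in range(num_frames):
        row = []
        for k in range(num_coeffs):
            acc = 0.0
            for n in range(1, N + 1):
                # Clamp indices so that frames beyond the edges repeat the boundary frame.
                nxt = frames[min(t + n, num_frames - 1)][k]
                prv = frames[max(t - n, 0)][k]
                acc += n * (nxt - prv)
            row.append(acc / denom)
        deltas.append(row)
    return deltas


if __name__ == "__main__":
    # Toy "cepstral" track that increases by one unit per frame.
    ramp = [[float(t)] for t in range(5)]
    print(delta_features(ramp))
```

In a typical system the delta vector is appended to the static MFCC vector of each frame before the speaker model (here, a GMM) is trained; whether the paper does exactly this is not stated in the abstract.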