MFCC-GMM Method for Speaker Identification by Voice

Authors

  • N.A. Niyozmatova, Digital Technologies and Artificial Intelligence, “Tashkent Institute of Irrigation and Agricultural Mechanization Engineers” National Research University, 100000 Tashkent, Uzbekistan
  • N.S. Mamatov, Digital Technologies and Artificial Intelligence, “Tashkent Institute of Irrigation and Agricultural Mechanization Engineers” National Research University, 100000 Tashkent, Uzbekistan
  • X.T. Dusonov, Department of Computer Science and Programming, Mirzo Ulugbek National University of Uzbekistan, Jizzak, Uzbekistan
  • B.N. Samijonov, Sejong University, Seoul, South Korea
  • A.N. Samijonov, Tashkent University of Information Technologies named after Muhammad al-Khwarizmi, 100200 Tashkent, Uzbekistan

DOI:

https://doi.org/10.37934/ard.128.1.163171

Keywords:

MFCC, VQ, GMM, Hamming window, Fourier transform, mel filter, code word

Abstract

Voice, as a human characteristic, offers great opportunities for both communication and identification. Voice recognition systems are now widely used in many areas of human activity, yet building highly accurate systems remains an open problem for researchers. Low recognition accuracy is a particular concern when the speech sample is relatively short. This article therefore studies the Mel-frequency cepstral coefficients (MFCC) feature extraction algorithm and the Gaussian mixture model (GMM) as the basis for identification, and reports experiments on recognizing a person by voice. The experiments used speech samples of male and female speakers recorded in different environments, and compared the MFCC-GMM and MFCC-VQ methods on these samples. The MFCC-GMM method was found to be more accurate than the MFCC-VQ method: recognition accuracy ranged from 82.8% to 94.5% in the text-dependent condition and from 79.5% to 87.4% in the text-independent condition.
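
As an illustration of the identification step in an MFCC-GMM pipeline, the following minimal sketch scores a sequence of feature frames against one diagonal-covariance GMM per speaker and selects the speaker with the highest average log-likelihood. It is not the authors' implementation: the function names, the two-dimensional toy "features" and the model parameters are all hypothetical, and in practice the frames would be MFCC vectors and the mixture parameters would be estimated with the EM algorithm, which is omitted here.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def gmm_avg_loglik(frames, gmm):
    """Average per-frame log-likelihood of the frames under one GMM.

    gmm is a list of (weight, mean, variance) tuples, one per component.
    """
    total = 0.0
    for x in frames:
        # Log-sum-exp over components keeps the computation numerically stable.
        terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        peak = max(terms)
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total / len(frames)

def identify(frames, speaker_models):
    """Return the name of the speaker whose GMM best explains the frames."""
    return max(
        speaker_models,
        key=lambda name: gmm_avg_loglik(frames, speaker_models[name]),
    )

# Hypothetical toy models: two speakers, two mixture components each,
# 2-D features standing in for real MFCC vectors.
speaker_models = {
    "speaker_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "speaker_b": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 6.0], [1.0, 1.0])],
}

# Hypothetical test utterance whose frames lie near speaker_a's components.
test_frames = [[0.2, -0.1], [0.9, 1.2], [0.4, 0.6]]
result = identify(test_frames, speaker_models)
print(result)
```

Averaging the log-likelihood over frames makes scores comparable across utterances of different lengths, which matters when short and long speech samples are evaluated together.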

Author Biographies

N.A. Niyozmatova, Digital Technologies and Artificial Intelligence, “Tashkent Institute of Irrigation and Agricultural Mechanization Engineers” National Research University, 100000 Tashkent, Uzbekistan

n_nilufar@mail.ru

N.S. Mamatov, Digital Technologies and Artificial Intelligence, “Tashkent Institute of Irrigation and Agricultural Mechanization Engineers” National Research University, 100000 Tashkent, Uzbekistan

m_narzullo@mail.ru

X.T. Dusonov, Department of Computer Science and Programming, Mirzo Ulugbek National University of Uzbekistan, Jizzak, Uzbekistan

xurshid3868@mail.ru

B.N. Samijonov, Sejong University, Seoul, South Korea

bn_samijonov@mail.ru

A.N. Samijonov, Tashkent University of Information Technologies named after Muhammad al-Khwarizmi, 100200 Tashkent, Uzbekistan

an_samijonov@mail.ru

Published

2025-05-02

How to Cite

Niyozmatova, N., Mamatov, N., Dusonov, X., Samijonov, B., & Samijonov, A. (2025). MFCC-GMM Method for Speaker Identification by Voice. Journal of Advanced Research Design, 128(1), 163–171. https://doi.org/10.37934/ard.128.1.163171

Section

Articles