Mel frequency cepstral coefficients (MFCC) feature extraction enhancement in the application of speech recognition: A comparison study

Sayf A. Majeed, Hafizah Husain, Salina Abdul Samad, Tariq F. Idbeaa

Research output: Contribution to journal › Article

10 Citations (Scopus)

Abstract

Mel Frequency Cepstral Coefficients (MFCCs) are the most widely used features in the majority of speaker and speech recognition applications. Since the 1980s, remarkable efforts have been made to develop these features. Issues such as the use of suitable spectral estimation methods, the design of effective filter banks, and the number of chosen features all play an important role in the performance and robustness of speech recognition systems. This paper provides an overview of MFCC enhancement techniques that are applied in speech recognition systems. Details such as accuracy, types of environments, the nature of the data, and the number of features are investigated and summarized in a table together with the corresponding key references. The benefits and drawbacks of these MFCC enhancement techniques are discussed. This study will hopefully encourage initiatives towards enhancing MFCC in terms of robust features, high accuracy, and lower complexity.

Original language: English
Pages (from-to): 38-56
Number of pages: 19
Journal: Journal of Theoretical and Applied Information Technology
Volume: 79
Issue number: 1
Publication status: Published - 10 Sep 2015

Fingerprint

Speech Recognition
Feature Extraction
Enhancement
Coefficient
Robustness
Spectral Estimation
Speaker Recognition
Filter Banks
Design Method
Table
High Accuracy

Keywords

  • Feature Extraction
  • Mel Frequency Cepstral Coefficients (MFCC)
  • Speech Recognition

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

Mel frequency cepstral coefficients (MFCC) feature extraction enhancement in the application of speech recognition: A comparison study. / Majeed, Sayf A.; Husain, Hafizah; Abdul Samad, Salina; Idbeaa, Tariq F.

In: Journal of Theoretical and Applied Information Technology, Vol. 79, No. 1, 10.09.2015, p. 38-56.


@article{c88b55f86fe64b7c9efb6e5a47b16f38,
title = "Mel frequency cepstral coefficients ({MFCC}) feature extraction enhancement in the application of speech recognition: A comparison study",
abstract = "Mel Frequency Cepstral Coefficients (MFCCs) are the most widely used features in the majority of speaker and speech recognition applications. Since the 1980s, remarkable efforts have been made to develop these features. Issues such as the use of suitable spectral estimation methods, the design of effective filter banks, and the number of chosen features all play an important role in the performance and robustness of speech recognition systems. This paper provides an overview of MFCC enhancement techniques that are applied in speech recognition systems. Details such as accuracy, types of environments, the nature of the data, and the number of features are investigated and summarized in a table together with the corresponding key references. The benefits and drawbacks of these MFCC enhancement techniques are discussed. This study will hopefully encourage initiatives towards enhancing MFCC in terms of robust features, high accuracy, and lower complexity.",
keywords = "Feature Extraction, Mel Frequency Cepstral Coefficients (MFCC), Speech Recognition",
author = "Majeed, {Sayf A.} and Husain, Hafizah and {Abdul Samad}, Salina and Idbeaa, {Tariq F.}",
year = "2015",
month = "9",
day = "10",
language = "English",
volume = "79",
pages = "38--56",
journal = "Journal of Theoretical and Applied Information Technology",
issn = "1992-8645",
publisher = "Asian Research Publishing Network (ARPN)",
number = "1"
}

TY - JOUR

T1 - Mel frequency cepstral coefficients (MFCC) feature extraction enhancement in the application of speech recognition

T2 - A comparison study

AU - Majeed, Sayf A.

AU - Husain, Hafizah

AU - Abdul Samad, Salina

AU - Idbeaa, Tariq F.

PY - 2015/9/10

Y1 - 2015/9/10

N2 - Mel Frequency Cepstral Coefficients (MFCCs) are the most widely used features in the majority of speaker and speech recognition applications. Since the 1980s, remarkable efforts have been made to develop these features. Issues such as the use of suitable spectral estimation methods, the design of effective filter banks, and the number of chosen features all play an important role in the performance and robustness of speech recognition systems. This paper provides an overview of MFCC enhancement techniques that are applied in speech recognition systems. Details such as accuracy, types of environments, the nature of the data, and the number of features are investigated and summarized in a table together with the corresponding key references. The benefits and drawbacks of these MFCC enhancement techniques are discussed. This study will hopefully encourage initiatives towards enhancing MFCC in terms of robust features, high accuracy, and lower complexity.

AB - Mel Frequency Cepstral Coefficients (MFCCs) are the most widely used features in the majority of speaker and speech recognition applications. Since the 1980s, remarkable efforts have been made to develop these features. Issues such as the use of suitable spectral estimation methods, the design of effective filter banks, and the number of chosen features all play an important role in the performance and robustness of speech recognition systems. This paper provides an overview of MFCC enhancement techniques that are applied in speech recognition systems. Details such as accuracy, types of environments, the nature of the data, and the number of features are investigated and summarized in a table together with the corresponding key references. The benefits and drawbacks of these MFCC enhancement techniques are discussed. This study will hopefully encourage initiatives towards enhancing MFCC in terms of robust features, high accuracy, and lower complexity.

KW - Feature Extraction

KW - Mel Frequency Cepstral Coefficients (MFCC)

KW - Speech Recognition

UR - http://www.scopus.com/inward/record.url?scp=84941038947&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84941038947&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:84941038947

VL - 79

SP - 38

EP - 56

JO - Journal of Theoretical and Applied Information Technology

JF - Journal of Theoretical and Applied Information Technology

SN - 1992-8645

IS - 1

ER -