Purging of silence for robust speaker identification in colossal database

P. Rama Koteswara Rao, Sunitha Ravi, Thotakura Haritha

Abstract


The aim of this work is to develop an effective speaker recognition system under noisy environments for large data sets. The important phases involved in typical identification systems are feature extraction, training and testing. During the feature extraction phase, the speaker-specific information is processed based on the characteristics of the voice signal. Effective methods have been proposed for the silence removal in order to achieve accurate recognition under noisy environments in this work. Pitch and Pitch-strength parameters are extracted as distinct features from the input speech spectrum. Multi-linear principle component analysis (MPCA) is is utilized to minimize the complexity of the parameter matrix. Silence removal using zero crossing rate (ZCR) and endpoint detection algorithm (EDA) methods are applied on the source utterance during the feature extraction phase. These features are useful in later classification phase, where the identification is made on the basis of support vector machine (SVM) algorithms. Forward loking schostic (FOLOS) is the efficient large-scale SVM algorithm that has been employed for the effective classification among speakers. The evaluation findings indicate that the methods suggested increase the performance for large amounts of data in noise ecosystems.

Keywords


dimensionality reduction; large scale SVM; multi-linear principle factor analysis ; un-pitched; voiced;

Full Text:

PDF


DOI: http://doi.org/10.11591/ijece.v11i4.pp3084-3092

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578

This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).