Gender detection in children’s speech utterances for human-robot interaction
Abstract
Human speech carries paralinguistic information that is used in many real-time applications. Detecting a child’s gender from speech is considered more challenging than detecting an adult’s. In this study, a system for human-robot interaction (HRI) is proposed to detect gender in children’s speech utterances independently of the spoken text. The robot’s perception comprises three phases. In the feature extraction phase, four formants are measured at each glottal pulse and a median is calculated across these measurements; three features are then computed: formant average (AF), formant dispersion (DF), and formant position (PF). In the feature standardization phase, the measured feature dimensions are standardized using the z-score method. In the semantic understanding phase, the child’s gender is detected using a logistic regression classifier. At the same time, the robot’s action is specified as a speech response generated with the text-to-speech (TTS) technique. Experiments are conducted on the Carnegie Mellon University (CMU) Kids dataset to measure the performance of the proposed system, which achieves an overall accuracy of 98%. The results show an improvement in accuracy of up to 13% over related works that used the CMU Kids dataset.
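The three perception phases can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not define the features, so the definitions here are assumptions taken from common usage: AF is the mean of the four median formants, DF is the mean spacing between adjacent formants, (F4 - F1)/3, and PF is the mean of the formants after each has been z-scored across the sample. Taking the median separately for each formant is also an assumption. The formant values, the 0/1 class labels, and all function names are hypothetical, and the plain gradient-descent trainer stands in for whatever logistic regression solver the paper used. The TTS response is omitted.

```python
import math
from statistics import mean, median, pstdev


def median_formants(per_pulse):
    """Collapse per-glottal-pulse (F1, F2, F3, F4) values in Hz to one median per formant."""
    return [median(track) for track in zip(*per_pulse)]


def zscore_columns(rows):
    """Standardize every column to zero mean and unit (population) standard deviation."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    stds = [pstdev(c) or 1.0 for c in columns]  # guard against a constant column
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def formant_features(formant_rows):
    """Return [AF, DF, PF] per utterance, using the assumed definitions above."""
    z_rows = zscore_columns(formant_rows)
    features = []
    for formants, z in zip(formant_rows, z_rows):
        af = mean(formants)                                      # formant average
        df = (formants[-1] - formants[0]) / (len(formants) - 1)  # formant dispersion
        pf = mean(z)                                             # formant position
        features.append([af, df, pf])
    return features


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def train_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, target in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - target
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return the predicted class (0 or 1) for one standardized feature vector."""
    return int(sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5)


# Made-up median formants (Hz) for six utterances; the labels are placeholder classes.
toy_formants = [
    [850, 2300, 3500, 4400], [870, 2350, 3550, 4450], [860, 2320, 3520, 4420],
    [1000, 2700, 3900, 4900], [1020, 2750, 3950, 4950], [1010, 2720, 3920, 4920],
]
toy_labels = [0, 0, 0, 1, 1, 1]

X = zscore_columns(formant_features(toy_formants))  # feature standardization phase
w, b = train_logistic(X, toy_labels)                # semantic understanding phase
print([predict(w, b, x) for x in X])
```

In a real system the per-pulse formant tracks would come from a speech analysis tool, and the standardization statistics and classifier weights would be estimated on training data only and then reused at inference time.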
Keywords
Children’s gender identification; Formant average feature; Formant dispersion feature; Formant position feature; Logistic regression
DOI: http://doi.org/10.11591/ijece.v12i5.pp5049-5054
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).