Integrating time-frequency features with deep learning for lung sound classification
Abstract
Deep learning has transformed medical diagnostics, especially the analysis of lung sounds for assessing respiratory conditions. Traditional methods such as CT scans and X-rays are impractical in resource-limited settings because of radiation exposure and time consumption, while conventional stethoscopes often lead to misdiagnosis due to subjective interpretation and environmental noise. This study evaluates deep learning models for lung sound classification using the International Conference on Biomedical and Health Informatics (ICBHI) 2017 dataset, comprising 920 annotated samples from 126 subjects. Pre-processing includes downsampling, segmentation, normalization, and audio clipping, with feature extraction techniques such as the spectrogram and Mel-frequency cepstral coefficients (MFCC). The adopted automatic lung sound diagnosis network (ASLD-Net) with triple feature input (time domain, spectrogram, and MFCC) achieved the highest accuracy at 97.25%, followed by the dual-feature model (spectrogram and MFCC) at 95.65%. Single-input models using the spectrogram or MFCC alone also performed well, while the time-domain input alone had the lowest accuracy.
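The pre-processing and time-frequency feature extraction steps described above can be sketched in Python with NumPy and SciPy. This is a minimal illustration, not the authors' exact pipeline: the target sample rate, FFT size, number of mel filters, and number of MFCCs are assumed values, and the paper may use library implementations (e.g. librosa) with different defaults.

```python
import numpy as np
from scipy.signal import spectrogram, resample_poly
from scipy.fftpack import dct

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters equally spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sr).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def preprocess(audio, orig_sr, target_sr=4000):
    # Downsample, then peak-normalize to [-1, 1] (assumed normalization scheme).
    audio = resample_poly(audio, target_sr, orig_sr)
    return audio / (np.abs(audio).max() + 1e-10)

def extract_features(audio, sr=4000, n_fft=256, n_mfcc=13):
    # Spectrogram: time-frequency power representation.
    _, _, Sxx = spectrogram(audio, fs=sr, nperseg=n_fft)
    # MFCC: log mel-filterbank energies followed by a DCT.
    fbank = mel_filterbank(26, n_fft, sr)
    log_mel = np.log(fbank @ Sxx + 1e-10)
    mfcc = dct(log_mel, type=2, axis=0, norm='ortho')[:n_mfcc]
    return Sxx, mfcc
```

The normalized time-domain signal, the spectrogram, and the MFCC matrix would then correspond to the three input branches of a triple-feature model.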
Keywords
Deep learning; Lung sound; Lung sound classification; Medical signal processing; Time-frequency feature
DOI: http://doi.org/10.11591/ijece.v15i4.pp3737-3747
Copyright (c) 2025 Su Yuan Chang, Marni Azira Markom, Zhi Sheng Choong, Arni Munira Markom, Latifah Munirah Kamaruddin, Erdy Sulino Mohd Muslim Tan
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578
This journal is published by the Institute of Advanced Engineering and Science (IAES).