Multi-class pneumonia detection using fine-tuned vision transformer model

Khushboo Trivedi, Chintan B. Thacker

Abstract


Distinguishing bacterial, viral, and fungal pneumonia from one another, and from normal findings, on chest X-rays is a major problem in global health. Conventional approaches to pneumonia identification frequently depend on laborious and error-prone manual interpretation. Current machine learning (ML) models, such as convolutional neural networks (CNNs), have demonstrated some success, but they frequently struggle on tasks that require multi-class classification or generalization. This work explores the potential of vision transformer (ViT) models, fine-tuned to address these limitations. By leveraging the attention mechanism in ViTs, the approach improves the accuracy of classifying chest X-rays into four distinct classes. Fine-tuning on a labeled chest X-ray dataset improves the model's ability to detect subtle differences between pneumonia types. The findings demonstrate the model's effectiveness in multi-class pneumonia diagnosis: it achieves 98% accuracy across the four classes, a significant performance improvement. This work highlights the promise of vision transformers in medical imaging and supports the development of improved and scalable pneumonia classification methods.
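
As an illustration of the two ingredients the abstract names, the attention mechanism and the four-class output, the following is a minimal dependency-free sketch. It is not the authors' implementation: the function names (`attention_weights`, `attend`, `classify`), the toy patch embeddings, and the logits are all assumptions made for the example, and a real ViT adds learned projections, multiple heads, and many stacked layers.

```python
import math

# The four labels named in the abstract; the ordering is an assumption.
CLASSES = ["bacterial", "viral", "fungal", "normal"]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(queries, keys):
    """Scaled dot-product attention weights: softmax(Q K^T / sqrt(d)), row by row."""
    d = len(keys[0])
    return [
        softmax([sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys])
        for q in queries
    ]

def attend(queries, keys, values):
    """For each query, return the attention-weighted sum of the value vectors."""
    weights = attention_weights(queries, keys)
    width = len(values[0])
    return [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(width)]
        for row in weights
    ]

def classify(logits):
    """Turn four class logits into a (label, probability) pair."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs[best]

# Toy patch embeddings (illustrative numbers, not data from the paper).
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed = attend(patches, patches, patches)  # self-attention: Q = K = V

# Hypothetical logits standing in for the output of a fine-tuned classification head.
label, prob = classify([0.2, 2.5, -1.0, 0.1])
```

In this sketch every image patch attends to every other patch, which is the property the abstract credits for picking up subtle differences between pneumonia types; fine-tuning would adjust the weights that produce the embeddings and logits, which are hard-coded here.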

Keywords


Chest X-Ray classification; Fine-tuning; Medical imaging; Multi-class pneumonia; Vision transformer

DOI: http://doi.org/10.11591/ijece.v15i4.pp3996-4003

Copyright (c) 2025 Khushboo Trivedi, Chintan B. Thacker

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578

This journal is published by the Institute of Advanced Engineering and Science (IAES).