Assessing risk factors for heart disease using machine learning methods
Abstract
This paper examines machine learning methods for assessing risk factors for cardiovascular diseases. Two approaches were used to build predictive models: the extreme gradient boosting (XGBoost) algorithm and a convolutional neural network (CNN). The focus is on each model's performance in classification and regression tasks and on its ability to identify key biomarkers and risk factors, such as cholesterol, ferritin, homocysteine, and aspartate aminotransferase (AST) levels. The XGBoost parameters were optimized for tabular data, and the model predicted risk with high accuracy. The CNN model reduced the error on the training set initially but showed signs of overfitting on the validation data. Evaluation with mean squared error (MSE), the coefficient of determination (R²), the Akaike information criterion (AIC), and the Bayesian information criterion (BIC) revealed significant differences between the models. The results confirm that XGBoost is effective at analyzing tabular data and generalizing knowledge about risk factors, whereas the CNN model needs further optimization to handle sparse data. The work demonstrates the importance of choosing an appropriate model architecture and training parameters for reliable diagnosis of cardiovascular diseases.
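The four evaluation metrics named in the abstract can be computed from a model's predictions on a regression task. The sketch below is illustrative only: the abstract does not state which forms of AIC and BIC were used, so the common least-squares forms under a Gaussian error assumption (defined up to an additive constant) are assumed here, and the function name `regression_metrics` and the parameter count `n_params` are hypothetical.

```python
import numpy as np

def regression_metrics(y_true, y_pred, n_params):
    """Return MSE, R^2, AIC and BIC for a set of regression predictions.

    Assumption: AIC and BIC use the least-squares (Gaussian error) forms
    n*ln(RSS/n) + 2k and n*ln(RSS/n) + k*ln(n). The paper may define them
    differently. The residual sum of squares must be non-zero and the
    targets must not all be identical.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = y_true.size

    rss = float(np.sum((y_true - y_pred) ** 2))          # residual sum of squares
    tss = float(np.sum((y_true - y_true.mean()) ** 2))   # total sum of squares

    mse = rss / n
    r2 = 1.0 - rss / tss
    aic = n * np.log(rss / n) + 2 * n_params
    bic = n * np.log(rss / n) + n_params * np.log(n)
    return {"mse": mse, "r2": r2, "aic": float(aic), "bic": float(bic)}

# Toy values, not data from the study.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 3.0, 3.5], n_params=2)
print(metrics)
```

Lower MSE, AIC, and BIC and higher R² indicate a better fit. AIC and BIC also penalize the number of fitted parameters, which is relevant when comparing a boosted-tree model with a neural network of a very different size.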
Keywords
Biochemical indicators; Cardiovascular diseases; Machine learning technologies; Mean squared error; Pathology; Vanilla CNN; XGBoost
DOI: http://doi.org/10.11591/ijece.v14i6.pp6734-6742
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).