A new data imputation technique for efficient used car price forecasting
Abstract
This research presents a methodology for handling missing data, applied to predicting the resale value of used vehicles. The study combines a tailored feature selection algorithm with an imputation strategy based on the HistGradientBoostingRegressor, aiming to improve efficiency and accuracy while maintaining data fidelity. The approach addresses data preprocessing and missing value imputation in complex datasets. A comprehensive flowchart delineates the process from initial data acquisition and integration to the final preprocessing steps, encompassing feature engineering, data partitioning, model training, and imputation procedures. The results demonstrate the superiority of the HistGradientBoostingRegressor for imputation over conventional methods, and two boosted models, the eXtreme gradient boosting (XGBoost) regressor and the gradient boosting regressor, exhibit exceptional performance in price forecasting. The study's potential limitations include generalizability across diverse datasets. Its applications include enhancing pricing models in the automotive sector and improving data quality in large-scale market analyses.
Keywords
Feature engineering; Imputation; Missing values; Preprocessing; Regression; Used car forecasting
DOI: http://doi.org/10.11591/ijece.v15i2.pp2364-2371
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).