An approach for improved students’ performance prediction using homogeneous and heterogeneous ensemble methods

Edmund De Leon Evangelista, Benedict Descargar Sy


Web-based learning technologies used by educational institutions store massive amounts of interaction data that can help predict students’ performance with the aid of machine learning algorithms. Accordingly, various researchers have studied ensemble learning methods, as they are known to improve the predictive accuracy of traditional classification algorithms. This study proposed an approach for enhancing the predictive performance of different single classification algorithms by using them as base classifiers of homogeneous ensembles (bagging and boosting) and heterogeneous ensembles (voting and stacking). The model evaluated various single classifiers, namely multilayer perceptron or neural network (NN), random forest (RF), naïve Bayes (NB), J48, JRip, OneR, logistic regression (LR), k-nearest neighbor (KNN), and support vector machine (SVM), to determine the base classifiers of the ensembles. In addition, the study used the University of California Irvine (UCI) open-access student dataset to predict students’ performance. A comparative analysis of the model’s accuracy showed that the best-performing single classifier’s accuracy increased from 93.10% to 93.68% when it was used as a base classifier of a voting ensemble. Moreover, the results showed that the heterogeneous voting ensemble performed slightly better than the homogeneous bagging and boosting ensembles.
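To make the idea of a heterogeneous voting ensemble concrete, the following is a minimal sketch of hard (majority) voting in plain Python. It is not the authors' implementation, which was carried out in Weka. The three rule-based base classifiers, the feature names (`prior_grade`, `absences`, `study_hours`), and the thresholds are hypothetical and chosen only for illustration; they are not taken from the UCI student dataset or from the study's results.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label among the base classifiers' predictions.

    Ties are broken in favor of the label that appears first in the list.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def voting_predict(classifiers, sample):
    """Collect one prediction per base classifier, then take the majority."""
    return majority_vote([clf(sample) for clf in classifiers])


# Hypothetical base classifiers: each is a toy rule, standing in for a
# trained model such as NN, RF, or NB. Thresholds are illustrative only.
def by_grade(s):
    return "pass" if s["prior_grade"] >= 10 else "fail"


def by_absences(s):
    return "pass" if s["absences"] < 10 else "fail"


def by_study_time(s):
    return "pass" if s["study_hours"] >= 2 else "fail"


classifiers = [by_grade, by_absences, by_study_time]

# Hypothetical student record (not from the UCI dataset).
student = {"prior_grade": 9, "absences": 3, "study_hours": 4}

prediction = voting_predict(classifiers, student)
print(prediction)
```

In this toy case one base classifier votes "fail" and two vote "pass", so the ensemble outputs "pass". This shows how a voting ensemble can overrule an individual classifier's error, which is the general mechanism by which such ensembles may improve on a single classifier.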


Bagging and boosting; Ensemble methods; Machine learning; Student performance prediction; Weka experiment

