A Survey of Machine Learning Techniques for Self-tuning Hadoop Performance

Md. Armanur Rahman, J. Hossen, Venkataseshaiah C, CK Ho, Tan Kim Geok, Aziza Sultana, Jesmeen M. Z. H., Ferdous Hossain


The Apache Hadoop framework is an open-source implementation of MapReduce for processing and storing big data. However, getting the best performance from it is a major challenge because of its large number of configuration parameters. In this paper, critical issues of the Hadoop system, big data, and machine learning are highlighted, and an analysis of some machine learning techniques applied so far to improve Hadoop performance is presented. Then, a promising machine learning technique based on a deep learning algorithm is proposed for improving Hadoop system performance.


Hadoop; HDFS; machine learning; MapReduce; parameter

DOI: http://doi.org/10.11591/ijece.v8i3.pp1854-1862

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578