A smart method for spark using neural network for big data

Md. Armanur Rahman, J. Hossen, Aziza Sultana, Abdullah Al Mamun, Nor Azlina Ab. Aziz

Abstract


Apache Spark, well known for its big data handling ability, is a distributed open-source framework that uses distributed memory to process big data. Because Spark's performance is affected mostly by its predominant configuration parameters, achieving optimal results from Spark is challenging. The current practice of tuning these parameters is ineffective because it is performed manually, and manual tuning is difficult given the large parameter space and the complex interactions among the parameters. This paper proposes a more effective self-tuning approach based on a neural network, called Smart method for spark using neural network for big data (SSNNB), to avoid the disadvantages of manual parameter tuning. The approach is tested on five predominant parameters with five different data sizes. The proposed approach increases speed by around 30% compared with the default parameter configuration.
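
The general idea of neural-network-based self-tuning can be illustrated with a minimal sketch. This is not the SSNNB implementation: the abstract does not name the five tuned parameters or describe the network, so everything below is an assumption made for illustration. The two property names are real Spark settings, but the candidate values, the baseline, the synthetic `fake_runtime` function (which stands in for measured job runtimes), and the `TinyMLP` model are hypothetical. The sketch fits a small network to (configuration, runtime) pairs and then suggests the candidate with the lowest predicted runtime.

```python
import math
import random

# Hypothetical search space over two real Spark properties (values are illustrative).
CANDIDATES = [
    {"spark.executor.cores": c, "spark.sql.shuffle.partitions": p}
    for c in (1, 2, 4, 8)
    for p in (50, 100, 200, 400)
]
# Illustrative baseline used only for comparison in this sketch.
DEFAULT = {"spark.executor.cores": 1, "spark.sql.shuffle.partitions": 200}


def features(conf):
    """Scale each parameter to the range (0, 1] before feeding the network."""
    return [conf["spark.executor.cores"] / 8.0,
            conf["spark.sql.shuffle.partitions"] / 400.0]


def fake_runtime(conf, rng):
    """Synthetic stand-in for a measured job runtime (NOT real Spark data)."""
    c, p = features(conf)
    return 1.0 - 0.6 * c + (p - 0.5) ** 2 + rng.gauss(0, 0.01)


class TinyMLP:
    """One hidden tanh layer, trained with plain stochastic gradient descent."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x):
        h = self._hidden(x)
        return sum(w * hi for w, hi in zip(self.w2, h)) + self.b2

    def train_step(self, x, y, lr=0.05):
        """One gradient step on the squared error 0.5 * (prediction - y)^2."""
        h = self._hidden(x)
        err = sum(w * hi for w, hi in zip(self.w2, h)) + self.b2 - y
        for j, hj in enumerate(h):
            grad_h = err * self.w2[j] * (1.0 - hj * hj)  # backprop through tanh
            self.w2[j] -= lr * err * hj
            self.b1[j] -= lr * grad_h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_h * xi
        self.b2 -= lr * err


def tune(seed=0, epochs=300):
    """Fit the model on synthetic runtimes and pick the best predicted candidate."""
    rng = random.Random(seed)
    data = [(features(c), fake_runtime(c, rng)) for c in CANDIDATES]
    model = TinyMLP(n_in=2, n_hidden=6, rng=rng)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            model.train_step(x, y)
    best = min(CANDIDATES, key=lambda c: model.predict(features(c)))
    return model, best


model, best = tune()
print("suggested configuration:", best)
```

In a real setting the synthetic runtimes would be replaced by measurements from actual Spark jobs, and the suggested values would be passed to Spark through its normal configuration mechanism.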

Keywords


apache spark; big data; configuration parameters; machine learning; self-configuration


DOI: http://doi.org/10.11591/ijece.v11i3.pp2525-2534

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578

This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).