A smart method for spark using neural network for big data

Md. Armanur Rahman, J. Hossen, Aziza Sultana, Abdullah Al Mamun, Nor Azlina Ab. Aziz


Apache Spark, widely known for its ability to handle Big data, is a distributed open-source framework that uses distributed memory to process Big data. Because Spark's performance depends largely on its predominant configuration parameters, achieving optimal results from Spark is challenging. The current practice of tuning these parameters manually is ineffective: the parameter space is large, and the parameters interact in complex ways. This paper proposes a more effective self-tuning approach based on a neural network, called Smart method for Spark using Neural Network for Big data (SSNNB), to avoid the disadvantages of manual tuning. The approach was tested on five predominant parameters with five different data sizes. It increased speed by around 30% compared with the default parameter configuration.
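The general idea of neural-network-based self-tuning can be illustrated with a minimal sketch. It is not the SSNNB implementation: the abstract does not name the five parameters, the data sizes, or the network architecture, so everything below is an assumption. The three Spark property names are real configuration keys chosen for illustration, while the candidate values, the data sizes, the `synthetic_runtime` stand-in for measured job times, and the `TinyMLP` surrogate are hypothetical. The sketch trains a small network to predict runtime from a configuration, then recommends the candidate with the lowest predicted runtime.

```python
import itertools
import math
import random

# Hypothetical search space: real Spark property names, illustrative values.
SEARCH_SPACE = {
    "spark.executor.cores": [1, 2, 4],
    "spark.sql.shuffle.partitions": [50, 200, 400],
    "spark.memory.fraction": [0.3, 0.6, 0.9],
}
DATA_SIZES_GB = [1, 2, 4, 8, 10]  # made-up sizes, not the paper's datasets

CANDIDATES = [dict(zip(SEARCH_SPACE, values))
              for values in itertools.product(*SEARCH_SPACE.values())]


def synthetic_runtime(cfg, data_gb):
    """Made-up runtime in seconds; a real tuner would time actual Spark jobs."""
    return (10.0 * data_gb / cfg["spark.executor.cores"]
            + 0.02 * abs(cfg["spark.sql.shuffle.partitions"] - 40 * data_gb)
            + 20.0 * abs(cfg["spark.memory.fraction"] - 0.6))


def features(cfg, data_gb):
    """Scale every input to roughly [0, 1] so plain SGD stays stable."""
    return [cfg["spark.executor.cores"] / 4.0,
            cfg["spark.sql.shuffle.partitions"] / 400.0,
            cfg["spark.memory.fraction"],
            data_gb / 10.0]


class TinyMLP:
    """One tanh hidden layer and a linear output, trained with per-sample SGD."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        return h, sum(w * hj for w, hj in zip(self.w2, h)) + self.b2

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, y, lr):
        h, out = self.forward(x)
        err = out - y  # gradient of 0.5 * (out - y)^2 with respect to out
        for j, hj in enumerate(h):
            grad_h = err * self.w2[j] * (1.0 - hj * hj)  # uses w2 before update
            self.w2[j] -= lr * err * hj
            self.b1[j] -= lr * grad_h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_h * xi
        self.b2 -= lr * err


def mean_squared_error(model, samples):
    return sum((model.predict(x) - y) ** 2 for x, y in samples) / len(samples)


def train_surrogate(seed=0, epochs=200, lr=0.05):
    """Fit the surrogate on (configuration, runtime / 100) pairs."""
    rng = random.Random(seed)
    samples = [(features(c, d), synthetic_runtime(c, d) / 100.0)
               for c in CANDIDATES for d in DATA_SIZES_GB]
    model = TinyMLP(n_in=4, n_hidden=8, rng=rng)
    mse_before = mean_squared_error(model, samples)
    for _ in range(epochs):
        rng.shuffle(samples)
        for x, y in samples:
            model.train_step(x, y, lr)
    return model, mse_before, mean_squared_error(model, samples)


def recommend(model, data_gb):
    """Pick the candidate configuration with the lowest predicted runtime."""
    return min(CANDIDATES, key=lambda c: model.predict(features(c, data_gb)))


def to_conf_flags(cfg):
    """Render a configuration as spark-submit --conf arguments."""
    flags = []
    for key, value in cfg.items():
        flags += ["--conf", f"{key}={value}"]
    return flags


model, mse_before, mse_after = train_surrogate()
best = recommend(model, data_gb=8)
print(" ".join(to_conf_flags(best)))
```

In a real setting, `synthetic_runtime` would be replaced by timings collected from actual Spark runs, and the recommended flags would be passed to `spark-submit`. Training a surrogate and searching over its predictions is only one possible design; the paper itself should be consulted for how SSNNB selects configurations.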


apache spark; big data; configuration parameters; machine learning; self-configuration

DOI: http://doi.org/10.11591/ijece.v11i4.pp%25p

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

ISSN 2088-8708, e-ISSN 2722-2578