Optimized memory model for hadoop map reduce framework

Archana Bhaskar, Rajeev Ranjan

Abstract


Map Reduce is the preferred computing framework for large-scale data analysis and processing applications. Hadoop is a widely used Map Reduce framework across different communities because it is open source. Cloud service providers such as Microsoft Azure HDInsight offer resources to customers, who pay only for what they use. However, a critical challenge for cloud service providers is meeting the service level agreement (SLA) requirement of a user task, namely the task deadline. Currently, the onus is on the client to compute the amount of resources required to run a job on the cloud. This work presents a novel memory optimization model for the Hadoop Map Reduce framework, namely MOHMR (Optimized Hadoop Map Reduce), to process data in real time and utilize system resources efficiently. MOHMR presents an accurate model to compute job memory optimization and also a model to provision the amount of cloud resources required to meet the task deadline. MOHMR first builds a profile for each job and then computes the memory optimization time of the job using a greedy approach. Experiments are conducted on the Microsoft Azure HDInsight cloud platform with different applications, such as text computing and bioinformatics applications, to evaluate the performance of MOHMR over an existing model; the results show significant performance improvement in terms of computation time. Overall, good correlation is reported between practical memory optimization values and theoretical memory optimization values.
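As a rough illustration of the profile-then-provision idea described in the abstract, the following minimal sketch estimates a job's run time from a per-job profile and greedily adds resources until a deadline is met. It is not the MOHMR model itself: the profile fields, the wave-based run time estimate, and the names `JobProfile`, `estimate_runtime`, and `slots_for_deadline` are all assumptions made for illustration only.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class JobProfile:
    # Hypothetical profile fields; the abstract does not say what MOHMR records.
    map_tasks: int
    reduce_tasks: int
    avg_map_seconds: float
    avg_reduce_seconds: float


def estimate_runtime(profile: JobProfile, slots: int) -> float:
    """Estimate job time, assuming tasks run in equal-length waves over `slots`."""
    map_waves = math.ceil(profile.map_tasks / slots)
    reduce_waves = math.ceil(profile.reduce_tasks / slots)
    return (map_waves * profile.avg_map_seconds
            + reduce_waves * profile.avg_reduce_seconds)


def slots_for_deadline(profile: JobProfile, deadline_seconds: float,
                       max_slots: int) -> Optional[int]:
    """Greedily add one slot at a time until the estimate meets the deadline."""
    for slots in range(1, max_slots + 1):
        if estimate_runtime(profile, slots) <= deadline_seconds:
            return slots
    return None  # the deadline cannot be met within max_slots
```

For example, under these assumptions `slots_for_deadline(JobProfile(100, 10, 30.0, 60.0), 400, 50)` returns the smallest slot count whose estimated run time is at most 400 seconds, and `None` signals that the deadline is infeasible within the given limit.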

Keywords


big data; bioinformatics; cloud computing; Hadoop; Map Reduce; parallel computing


DOI: http://doi.org/10.11591/ijece.v9i5.pp4396-4407

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578