Tiny datablock in saving Hadoop distributed file system wasted memory

Mohammad Bahjat Al-Masadeh, Mohad Sanusi Azmi, Sharifah Sakinah Syed Ahmad

Abstract


Hadoop distributed file system (HDFS) is the file system that Hadoop uses to store all incoming data. Since its introduction, HDFS has consumed a large amount of memory to serve even a normal-sized dataset. The current file-saving mechanism in HDFS stores only one file per datablock. Thus, a file of just 5 MB takes up the whole datablock capacity, leaving the rest of the memory unavailable to other incoming files, which is a considerable waste of memory when serving a normal-sized dataset. This paper proposes a method called tiny datablock-HDFS (TD-HDFS) to increase the usability of HDFS memory and its file-hosting capability by reducing the datablock size to the minimum capacity and then merging all related datablocks into one master datablock. This master datablock consists of tiny virtual datablocks that hold related small files together, and so exploits the full memory of the master datablock. The result of this study is a running HDFS with a minimal amount of wasted memory and the same read/write performance. The results were examined through a comparison between standard HDFS file hosting and the proposed solution.
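
The waste described in the abstract can be illustrated with a small calculation. The sketch below is not the TD-HDFS implementation; it only compares two simplified accounting models. The 128 MB block size, the batch of twenty 5 MB files, and both function names are assumptions chosen for illustration, and the packed model simply places files back to back inside master datablocks.

```python
MB = 1024 * 1024


def ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def wasted_one_file_per_block(file_sizes, block_size):
    """Unused bytes when each file occupies whole datablocks on its own
    (the one-file-per-datablock model described in the abstract)."""
    wasted = 0
    for size in file_sizes:
        blocks = max(1, ceil_div(size, block_size))
        wasted += blocks * block_size - size
    return wasted


def wasted_packed(file_sizes, block_size):
    """Unused bytes when files are packed back to back into master
    datablocks (a simplification, not the paper's actual algorithm)."""
    total = sum(file_sizes)
    blocks = max(1, ceil_div(total, block_size))
    return blocks * block_size - total


BLOCK_SIZE = 128 * MB          # assumed datablock size
small_files = [5 * MB] * 20    # hypothetical batch of twenty 5 MB files

one_per_block = wasted_one_file_per_block(small_files, BLOCK_SIZE)
packed = wasted_packed(small_files, BLOCK_SIZE)

print(f"one file per datablock: {one_per_block // MB} MB unused")
print(f"packed master datablock: {packed // MB} MB unused")
```

Under these assumptions the one-file-per-datablock model leaves 123 MB unused for every file, while the packed model leaves only the unfilled tail of the last master datablock.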


Keywords


big data; datablock; datanode; Hadoop; Hadoop distributed file system; wasted memory



DOI: http://doi.org/10.11591/ijece.v13i2.pp1757-1772

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578

This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).