De-Identified Personal Health Care System Using Hadoop

Dasari Madhavi, B.V. Ramana


Hadoop technology plays a vital role in improving the quality of health care by delivering the right information to the right people at the right time, while reducing cost and time. Data for most health care functions, such as patient admission, discharge, and transfer, are maintained in Computer-based Patient Records (CPR), Personal Health Information (PHI), and Electronic Health Records (EHR). The use of medical Big Data is increasingly popular in health care services and clinical research. One of the biggest challenges for health care centers is the huge amount of data that flows into their systems daily. Crunching this Big Data and de-identifying it with traditional data mining tools is problematic. Therefore, to de-identify personal health information, the proposed MapReduce application uses JAR files that contain a combination of MapReduce (MR) code and Pig queries. The application also uses a UDF (User Defined Function) mechanism to protect the health care dataset. The de-identified personal health care system executes the MapReduce code and Pig queries on the health care dataset: the application takes an input dataset containing patient information and de-identifies the personal health information in it. De-identification using Hadoop is also suitable for social and demographic data.
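
The abstract does not specify the record schema or the exact de-identification rules, so the following is only a minimal sketch of what a map-style de-identification step might look like. It is written in plain Python rather than as the authors' MapReduce code or Pig UDF, and the field names, the secret key, and the masking rules (keyed hashing of the patient ID, redacting the name, keeping only the birth year, coarsening the ZIP code) are all assumptions chosen for illustration.

```python
import hashlib
import hmac

# Hypothetical field layout; the abstract does not describe the dataset schema.
FIELDS = ["patient_id", "name", "date_of_birth", "zip_code", "diagnosis"]

# Assumption: a secret key stored outside the dataset, so hashes cannot be
# reproduced by someone who only sees the de-identified output.
SECRET_KEY = b"replace-with-a-managed-secret"


def pseudonymize(value: str) -> str:
    """Replace an identifier with a keyed hash so records remain linkable."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def deidentify_line(line: str) -> str:
    """Map step: de-identify one comma-separated patient record.

    Assumes fields contain no embedded commas.
    """
    record = dict(zip(FIELDS, line.rstrip("\n").split(",")))
    record["patient_id"] = pseudonymize(record["patient_id"])
    record["name"] = "REDACTED"
    record["date_of_birth"] = record["date_of_birth"][:4]  # keep the year only
    record["zip_code"] = record["zip_code"][:3] + "**"     # coarsen the location
    return ",".join(record[field] for field in FIELDS)


if __name__ == "__main__":
    sample = "P001,Jane Doe,1980-04-12,53011,asthma"
    print(deidentify_line(sample))
```

Because each record is transformed independently, a function of this shape could in principle be applied line by line across the splits of a large input file, which is the property that makes de-identification a natural fit for a map phase.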


Big Data; Information Security


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578

This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).