Online Data Preprocessing: A Case Study Approach

Mohammed Zuhair Al-Taie, Seifedine Kadry, Joel Pinho Lucas

Abstract


Alongside web search and e-mail, social networking is now one of the three most popular uses of the Internet. Every day, a tremendous number of users write articles and share photos, videos, and links at a scope and scale never imagined before. However, because social network data are huge and come from heterogeneous sources, they are highly susceptible to inconsistency, redundancy, noise, and loss. For data scientists, preparing the data and putting them into a standard format is critical, because data quality directly affects the performance of the mining algorithms applied afterward. Low-quality data limit the analysis and lower the quality of mining results. To this end, this study provides an overview of the different phases of data preprocessing, with a focus on social network data. As a case study, we show how we applied preprocessing to the data we collected on Malaysia Airlines Flight MH370, which disappeared in 2014.

Keywords


Data Preprocessing; Data Science; Flight MH370; Social Networks


DOI: http://doi.org/10.11591/ijece.v9i4.pp2620-2626

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.