Inferring Student's Chat Topic in Colloquial Arabic Text using Semantic Representation

Faisal T. Khamayseh

Abstract


Because colloquial Arabic is now widespread, the collection and classification of a multi-dialectal Arabic corpus needs to be described. Colloquial Arabic today appears in largely country-based forms such as Egyptian, Iraqi, Levantine, and Tunisian. This paper presents a new method for analyzing conversations in an educational chat room using a corpus for Palestinian Arabic and the Stanford Tagger. The method represents keywords in a semantic-net-like representation to obtain the main subjects of the conversation, and it identifies the main subject of a chat with high accuracy. Using the Arabic corpus, the Stanford Tagger, and the percentage of words further improves accuracy. The study also examines the effect of pivot distribution, based on the occurrences and betweenness values of the pivots over the text. In addition, it examines some characteristics of texts written in colloquial Arabic dialect and analyzes free expressive Arabic statements. The results show that the core of the conversation can be determined by combining the occurrences of a word with its distribution over the conversation.

Keywords


Arabic Chat; Semantic net; Palestinian Arabic Corpus; Stanford Tagger

DOI: http://doi.org/10.11591/ijece.v6i4.pp1897-1906

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578

This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).