Enhancing cyberbullying detection with advanced text preprocessing and machine learning

Rakesh Bapu Dhumale; Ajay Kumar Dass; Amit Umbrajkaar; Pradeep Mane

doi:10.11591/ijece.v15i3.pp3139-3148

Abstract

Social media and internet use have grown dramatically in recent years. Cyberbullying refers to the misuse of social media to post threatening comments. It can have a devastating effect on people's lives, especially those of children and teenagers, and can lead to depression and suicidal thoughts. This paper proposes a four-step methodology for identifying cyberbullying: preprocessing, feature extraction, classification, and evaluation. The process begins with building a labeled, varied dataset. In feature extraction, Word2Vec and term frequency-inverse document frequency (TF-IDF) transform text into high-dimensional vectors: Word2Vec creates word embeddings using the skip-gram and continuous bag-of-words (CBOW) models, while TF-IDF assesses the relevance of terms in the text. A support vector machine (SVM) classifier is used in the model, and its effectiveness is compared with that of logistic regression and naïve Bayes. The maximum accuracy was 95% for the SVM classifier with skip-gram and 93% with continuous bag-of-words. Precision, recall, and F1-scores were computed for the sentiment categories; the average precision and recall were 0.77 and 0.79, respectively.
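As a rough illustration of the TF-IDF part of the feature-extraction step, the sketch below turns a few short comments into weighted term vectors using only the Python standard library. It is not the authors' implementation: the `tokenize` helper, the smoothed inverse document frequency formula, and the three example comments are assumptions chosen for illustration, and the paper's actual preprocessing and dataset are not described in the abstract.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical minimal preprocessing: lowercase the text and keep
    # only alphabetic tokens (punctuation and digits become separators).
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return cleaned.split()


def tfidf_vectors(docs):
    """Return (vocabulary, one TF-IDF vector per document)."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    df = Counter(tok for doc in tokenized for tok in set(doc))

    # Assumed smoothed IDF: log((1 + N) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}

    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        # Term frequency (relative count) times inverse document frequency.
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


# Made-up example comments, not taken from the paper's dataset.
docs = [
    "You are a great friend",
    "You are so stupid and ugly",
    "Nobody likes you, loser",
]
vocab, vectors = tfidf_vectors(docs)

for doc, vec in zip(docs, vectors):
    # Show the three highest-weighted terms of each comment.
    top = sorted(zip(vocab, vec), key=lambda pair: pair[1], reverse=True)[:3]
    print(doc, "->", [(term, round(weight, 3)) for term, weight in top])
```

In this toy corpus the word "you" occurs in every comment, so it receives a lower weight than a rarer word such as "loser" at the same term frequency. Vectors of this kind could then be passed to a classifier such as a support vector machine.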

Keywords

Cyberbullying; Detection; Online threats; Social media; Social media misuse; Support vector machines; Text classification

DOI: http://doi.org/10.11591/ijece.v15i3.pp3139-3148

Copyright (c) 2025 Rakesh Bapu Dhumale, Ajay Kumar Dass, Amit Umbrajkaar, Pradeep Mane

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578

This journal is published by the Institute of Advanced Engineering and Science (IAES).
