Offensive Language Detection Using Soft Voting Ensemble Model

Authors

Fieri, Brillian
Suhartono, Derwin

Publisher

Institute of Automation and Computer Science, Brno University of Technology

Abstract

Offensive language has become an increasingly severe problem with the rise of internet and social media use. It can be used to attack individuals or specific groups. Automatic moderation, for example with machine learning, can help detect and filter such language for those who need it. This study focuses on improving the performance of a soft voting classifier for offensive language detection by experimenting with combinations of its estimators. The model was applied to a Twitter dataset that was augmented using several augmentation techniques. Features were extracted using Term Frequency-Inverse Document Frequency (TF-IDF), sentiment analysis, and GloVe embeddings. Two types of soft voting models were studied: a machine learning-based model, whose best estimator combination was Random Forest, Decision Tree, Logistic Regression, Naïve Bayes, and AdaBoost, and a deep learning-based model, whose best estimator combination was Convolutional Neural Network, Bidirectional Long Short-Term Memory, and Bidirectional Gated Recurrent Unit. The results show that the soft voting classifier outperformed the classic machine learning and deep learning models on both the original and augmented datasets.
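
For readers unfamiliar with the term, soft voting averages the class probabilities predicted by each estimator and picks the class with the highest average, whereas hard voting takes a majority over the predicted labels. The following is a minimal sketch of that rule in plain Python, not the authors' implementation: the function names, the optional weights, and the toy probabilities are all hypothetical, and the three "models" stand in for any estimators that output class probabilities.

```python
from collections import Counter
from typing import List, Optional, Sequence


def soft_vote(
    probas: Sequence[Sequence[Sequence[float]]],
    weights: Optional[Sequence[float]] = None,
) -> List[int]:
    """Return one label per sample from averaged class probabilities.

    probas[m][i][c] is the probability that model m assigns to class c
    for sample i. Weights are optional and default to equal weighting.
    """
    n_models = len(probas)
    if weights is None:
        weights = [1.0] * n_models
    total = sum(weights)
    n_samples = len(probas[0])
    n_classes = len(probas[0][0])

    labels = []
    for i in range(n_samples):
        # Weighted mean probability of each class across all models.
        avg = [
            sum(w * probas[m][i][c] for m, w in enumerate(weights)) / total
            for c in range(n_classes)
        ]
        labels.append(max(range(n_classes), key=avg.__getitem__))
    return labels


def hard_vote(probas: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Majority vote over each model's most probable class, for contrast."""
    n_samples = len(probas[0])
    labels = []
    for i in range(n_samples):
        votes = [max(range(len(p[i])), key=p[i].__getitem__) for p in probas]
        labels.append(Counter(votes).most_common(1)[0][0])
    return labels


# Hypothetical probabilities for two tweets; class 0 = not offensive,
# class 1 = offensive. These numbers are made up for illustration.
PROBAS = [
    [[0.90, 0.10], [0.20, 0.80]],  # model A
    [[0.40, 0.60], [0.30, 0.70]],  # model B
    [[0.45, 0.55], [0.60, 0.40]],  # model C
]

if __name__ == "__main__":
    print("soft:", soft_vote(PROBAS))
    print("hard:", hard_vote(PROBAS))
```

On the first toy sample, two models lean slightly towards class 1 while one is strongly confident in class 0, so the two rules disagree: soft voting takes the confident model's margin into account, hard voting does not.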

Citation

Mendel. 2023, vol. 29, no. 1, pp. 1-6. ISSN 1803-3814
https://mendel-journal.org/index.php/mendel/article/view/211

Document type

Peer-reviewed

Document version

Published version

Language of document

en

Creative Commons license

Except where otherwise noted, this item's license is described as Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license