Self-training for cyberbully detection: Achieving high accuracy with a balanced multi-class dataset

Ahmadinejad, Mohamad Hosein

dc.contributor.advisor	Nashid, Shahriar
dc.contributor.advisor	Lisa, Fan
dc.contributor.author	Ahmadinejad, Mohamad Hosein
dc.contributor.committeemember	Samira, Sadaoui
dc.contributor.externalexaminer	Andrei, Volodin
dc.date.accessioned	2023-12-18T17:32:49Z
dc.date.available	2023-12-18T17:32:49Z
dc.date.issued	2023-08
dc.description	A Thesis Submitted to the Faculty of Graduate Studies and Research In Partial Fulfillment of the Requirements for the Degree of Master of Science in Computer Science, University of Regina. xiii, 99.
dc.description.abstract	Cyberbullying has become an alarming issue in the digital era, causing significant harm to its victims. The development of automated methods for detecting cyberbullying in social media is of paramount importance to safeguard vulnerable individuals. In this thesis, we propose a robust approach based on Machine Learning (ML) and Deep Learning (DL) techniques for cyberbully detection on social media platforms. Our approach involves the meticulous curation of a balanced dataset specifically designed for training the ML/DL models. To overcome the challenge of limited labeled data, we employ a semi-supervised self-training algorithm, which effectively expands the size of the labeled dataset. By leveraging real-world social media data, we train and test the model, evaluating its performance using key metrics such as precision, recall, and F1-score. In addition, we present our meticulously annotated dataset comprising 99,991 tweets, which we have made publicly available for future scientific investigations. This dataset serves as a valuable resource for further research in this field, facilitating the development and evaluation of novel techniques for cyberbully detection. Our results underscore the near-perfect performance of the proposed approach in the context of cyberbully detection, reaffirming the efficacy of ML and DL techniques for addressing this pervasive problem. These findings offer crucial insights for future research endeavors in this domain and hold practical implications for the development of automated systems capable of detecting and combating cyberbullying on social media platforms. By continuously advancing our understanding of cyberbullying detection and developing sophisticated ML and DL models, we can foster safer digital environments and protect individuals from the detrimental effects of cyberbullying.
dc.description.authorstatus	Student	en
dc.description.peerreview	yes	en
dc.identifier.tcnumber	TC-SRU-16190
dc.identifier.thesisurl	https://ourspace.uregina.ca/bitstreams/98a00e42-b220-496c-bda5-931c153f106b/download
dc.identifier.uri	https://hdl.handle.net/10294/16190
dc.language.iso	en	en
dc.publisher	Faculty of Graduate Studies and Research, University of Regina	en
dc.title	Self-training for cyberbully detection: Achieving high accuracy with a balanced multi-class dataset
dc.type	master thesis	en
thesis.degree.department	Department of Computer Science
thesis.degree.discipline	Computer Science
thesis.degree.grantor	Faculty of Graduate Studies and Research, University of Regina	en
thesis.degree.level	Master's	en
thesis.degree.name	Master of Science (MSc)

Files

Original bundle

Name: Ahmadinejad,MohamadHosein_MSc_CS_Thesis_2023Fall.pdf
Size: 1.52 MB
Format: Adobe Portable Document Format

Collections

Master’s and Doctoral Theses