Self-training for cyberbully detection: Achieving high accuracy with a balanced multi-class dataset

dc.contributor.advisor: Nashid, Shahriar
dc.contributor.advisor: Lisa, Fan
dc.contributor.author: Ahmadinejad, Mohamad Hosein
dc.contributor.committeemember: Samira, Sadaoui
dc.contributor.externalexaminer: Andrei, Volodin
dc.date.accessioned: 2023-12-18T17:32:49Z
dc.date.available: 2023-12-18T17:32:49Z
dc.date.issued: 2023-08
dc.description: A Thesis Submitted to the Faculty of Graduate Studies and Research In Partial Fulfillment of the Requirements for the Degree of Master of Science in Computer Science, University of Regina. xiii, 99.
dc.description.abstract: Cyberbullying has become an alarming issue in the digital era, causing significant harm to its victims. The development of automated methods for detecting cyberbullying in social media is of paramount importance to safeguard vulnerable individuals. In this thesis, we propose a robust approach based on Machine Learning (ML) and Deep Learning (DL) techniques for cyberbully detection in social media platforms. Our approach involves the meticulous curation of a balanced dataset specifically designed for training the ML/DL models. To overcome the challenge of limited labeled data, we employ a semi-supervised self-training algorithm, which effectively expands the size of the labeled dataset. By leveraging real-world social media data, we train and test the model, evaluating its performance using key metrics such as precision, recall, and F1-score. In addition, we present our meticulously annotated dataset comprising 99,991 tweets, which we have made publicly available for future scientific investigations. This dataset serves as a valuable resource for further research in this field, facilitating the development and evaluation of novel techniques for cyberbully detection. Our results underscore the near-perfect performance of the proposed approach in the context of cyberbully detection, reaffirming the efficacy of ML and DL techniques for addressing this pervasive problem. These findings offer crucial insights for future research endeavors in this domain and hold practical implications for the development of automated systems capable of detecting and combating cyberbullying in social media platforms. By continuously advancing our understanding of cyberbullying detection and developing sophisticated ML and DL models, we can foster safer digital environments and protect individuals from the detrimental effects of cyberbullying.
dc.description.authorstatus: Student (en)
dc.description.peerreview: yes (en)
dc.identifier.tcnumber: TC-SRU-16190
dc.identifier.thesisurl: https://ourspace.uregina.ca/bitstreams/98a00e42-b220-496c-bda5-931c153f106b/download
dc.identifier.uri: https://hdl.handle.net/10294/16190
dc.language.iso: en (en)
dc.publisher: Faculty of Graduate Studies and Research, University of Regina (en)
dc.title: Self-training for cyberbully detection: Achieving high accuracy with a balanced multi-class dataset
dc.type: Thesis (en)
thesis.degree.department: Department of Computer Science
thesis.degree.discipline: Computer Science
thesis.degree.grantor: University of Regina (en)
thesis.degree.level: Master's (en)
thesis.degree.name: Master of Science (MSc)
Files
Original bundle
Name: Ahmadinejad,MohamadHosein_MSc_CS_Thesis_2023Fall.pdf
Size: 1.52 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.22 KB
Format: Item-specific license agreed upon to submission