Analysis of Removing Weak Associations During Consensus Clustering

dc.contributor.advisor: Zilles, Sandra
dc.contributor.author: Naran Chirakkal, Ruckiya Sinorina
dc.contributor.committeemember: Hamilton, Howard
dc.contributor.externalexaminer: Lawler, Samantha
dc.date.accessioned: 2021-09-23T20:57:53Z
dc.date.available: 2021-09-23T20:57:53Z
dc.date.issued: 2020-08
dc.description: A Thesis Submitted to the Faculty of Graduate Studies and Research In Partial Fulfillment of the Requirements for the Degree of Master of Science in Computer Science, University of Regina. xiii, 95 p. (en_US)
dc.description.abstract: Given multiple base clusterings of a dataset, e.g., as created by multiple clustering algorithms on the same data, consensus clustering aims to generate a single robust aggregated clustering. Consensus methods measure the strength of an association between two data objects based on how often the objects are grouped together by the base clusterings. However, incorporating weak associations in the consensus process can have a negative effect on the quality of the aggregated clustering. This thesis presents our research on an automatic approach for removing weak associations during the consensus process. In particular, we propose an efficient approach called the WAT approach for removing weak associations, and two methods using the WAT approach, namely WAT(K) and WAT(GMM), are tested in this thesis. We compare our methods to a brute force method used in an existing consensus function, NegMM, which tends to be rather inefficient in terms of runtime. Our empirical analysis on multiple datasets shows that the proposed approach produces consensus clusterings that are comparable in quality to the ones produced by the original NegMM method, yet at a much lower run time. Moreover, this thesis also presents an empirical analysis to study the effect of our approach to remove the weak associations on the CSPA and MCLA consensus functions, which are well-known consensus functions from the literature. Our WAT approach improved the consensus built by CSPA significantly in many cases, but the original MCLA tends to outperform the combination of MCLA with the WAT methods. (en_US)
dc.description.authorstatus: Student (en)
dc.description.peerreview: yes (en)
dc.identifier.tcnumber: TC-SRU-14398
dc.identifier.thesisurl: https://ourspace.uregina.ca/bitstream/handle/10294/14398/NaranChirakkal_RuckiyaSinorina_MSC_CS_Spring2021.pdf
dc.identifier.uri: https://hdl.handle.net/10294/14398
dc.language.iso: en (en_US)
dc.publisher: Faculty of Graduate Studies and Research, University of Regina (en_US)
dc.title: Analysis of Removing Weak Associations During Consensus Clustering (en_US)
dc.type: Thesis (en_US)
thesis.degree.department: Department of Computer Science (en_US)
thesis.degree.discipline: Computer Science (en_US)
thesis.degree.grantor: University of Regina (en)
thesis.degree.level: Master's (en)
thesis.degree.name: Master of Science (MSc) (en_US)
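
Illustrative note on the abstract: the idea of measuring association strength by how often two objects are grouped together can be sketched with a co-association matrix. The snippet below is a minimal sketch only, not the thesis's method. The function names, the toy label lists, and the fixed cutoff of 0.5 are all assumptions made for illustration; the WAT(K) and WAT(GMM) methods described in the abstract select the cutoff automatically, and that selection step is not reproduced here.

```python
from itertools import combinations


def co_association(base_clusterings):
    """Return an n x n matrix whose (i, j) entry is the fraction of
    base clusterings that put objects i and j in the same cluster."""
    n = len(base_clusterings[0])
    m = len(base_clusterings)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = 1.0  # every object is always grouped with itself
    for i, j in combinations(range(n), 2):
        together = sum(1 for labels in base_clusterings if labels[i] == labels[j])
        matrix[i][j] = matrix[j][i] = together / m
    return matrix


def drop_weak(matrix, threshold):
    """Zero out associations below a cutoff.
    The cutoff is a hypothetical, hand-picked value in this sketch."""
    return [[v if v >= threshold else 0.0 for v in row] for row in matrix]


# Three toy base clusterings of four objects (made-up labels).
base = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
ca = co_association(base)
pruned = drop_weak(ca, threshold=0.5)
```

In this toy input, objects 0 and 1 are grouped together by every base clustering, while objects 0 and 2 are grouped together by only one of the three, so their association falls below the assumed cutoff and is set to zero before any consensus function would see it.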

Files

Original bundle

Name: NaranChirakkal_RuckiyaSinorina_MSC_CS_Spring2021.pdf
Size: 926.25 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.22 KB
Format: Item-specific license agreed upon to submission
