Learning Pattern Languages from a Small Number of Helpfully Chosen Examples

dc.contributor.advisorZilles, Sandra
dc.contributor.authorMazadi, Zeinab
dc.contributor.committeememberYang, Boting
dc.contributor.committeememberHilderman, Robert
dc.contributor.externalexaminerFarenick, Douglas
dc.date.accessioned2014-10-17T15:39:56Z
dc.date.available2014-10-17T15:39:56Z
dc.date.issued2013-08
dc.descriptionA Thesis Submitted to the Faculty of Graduate Studies and Research In Partial Fulfillment of the Requirements for the Degree of Master of Science in Computer Science, University of Regina. viii, 74 p.en_US
dc.description.abstractA pattern is a string containing variable symbols and constants. The language of a pattern is the set of all strings obtained by replacing all variables in the pattern with non-empty strings. Patterns and their languages were introduced by Angluin in 1980. Since that time, learning of pattern languages has been a topic of great interest in the research area of computational learning theory, mainly because of its relevance for many applications. Areas in which patterns are suitable for modelling data are for example bioinformatics (e.g., when representing sets of amino acid sequences) or text mining (e.g., for automated information extraction). This thesis studies learning of pattern languages in the context of computational learning theory. Computational learning theory studies various models of learning and investigates the learnability of classes of languages using each learning model. More- over, determining the number of data points (sample complexity) and the amount of computational time (time complexity) required for learning a particular class of languages in a particular model is a main goal in computational learning theory. In this thesis we focus on learning classes of pattern languages in a model called “learning from helpful examples” or “learning from teachers”. In particular, we are interested in determining the worst case number of examples that need to be com- municated between a teacher and a learner to identify the target language from all other languages in the class. The sample complexity measures we use are the teaching dimension in the classic teaching model and the recursive teaching dimension in the cooperative teaching model that was introduced later. We are interested in computing these parameters for certain classes of pattern languages over various sizes of alphabets from which the constant symbols can be taken. In particular, we aim to determine the effect of the alphabet size on these parameters. We study arbitrary patterns, regular patterns and one-variable patterns over three types of alphabets, namely finite alphabets of size at least two, singleton alphabets and infinite alphabets. Our results show that sometimes the alphabet size does in- fluence the sample complexity parameters and sometimes it does not influence them. Moreover, we demonstrate that more advanced models of teaching, like the recursive teaching model, can be more sample-efficient than the classic teaching model when learning certain kinds of pattern languages.en_US
dc.description.authorstatusStudenten
dc.description.peerreviewyesen
dc.identifier.tcnumberTC-SRU-5389
dc.identifier.thesisurlhttp://ourspace.uregina.ca/bitstream/handle/10294/5389/Mazadi_Zeinab_200299002_MSC_CS_Spring2014.pdf
dc.identifier.urihttps://hdl.handle.net/10294/5389
dc.language.isoenen_US
dc.publisherFaculty of Graduate Studies and Research, University of Reginaen_US
dc.titleLearning Pattern Languages from a Small Number of Helpfully Chosen Examplesen_US
dc.typeThesisen
thesis.degree.departmentDepartment of Computer Scienceen_US
thesis.degree.disciplineComputer Scienceen_US
thesis.degree.grantorUniversity of Reginaen
thesis.degree.levelMaster'sen
thesis.degree.nameMaster of Science (MSc)en_US
Files
Original bundle
Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
Mazadi_Zeinab_200299002_MSC_CS_Spring2014.pdf
Size:
365.16 KB
Format:
Adobe Portable Document Format
License bundle
Now showing 1 - 1 of 1
No Thumbnail Available
Name:
license.txt
Size:
2.22 KB
Format:
Item-specific license agreed upon to submission
Description:
Collections