Distinguishing Linear Sets and Pattern Languages With Membership Examples

Gao, Ziyuan

Distinguishing Linear Sets and Pattern Languages With Membership Examples

dc.contributor.advisor	Zilles, Sandra
dc.contributor.author	Gao, Ziyuan
dc.contributor.committeemember	Yang, Boting
dc.contributor.committeemember	Yao, Yiyu
dc.contributor.committeemember	Fallat, Shaun
dc.contributor.externalexaminer	Reidenbach, Daniel
dc.date.accessioned	2018-12-03T21:16:02Z
dc.date.available	2018-12-03T21:16:02Z
dc.date.issued	2017-09
dc.description	A Thesis Submitted to the Faculty of Graduate Studies and Research In Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Computer Science, University of Regina. vii, 194 p.	en_US
dc.description.abstract	In Computational Learning Theory, a binary concept is often represented by a set of pairs (x; l), where x varies over a set of instances known as a universe or instance space, and l is a label, either "+" or "-", indicating whether or not x belongs to the concept. This thesis will consider a type of machine learning task known as supervised learning, where an algorithm M is fed with a set of labelled data for a target concept belonging to a given class of concepts, and M produces a hypothesis function that is used to accurately predict the correct labels of unseen instances. The learning process typically involves two agents, a teacher and a learner. The task of a teacher is to provide labelled examples for a target concept to a learner, who in turn makes a hypothesis based on the seen data. The present thesis focusses on the teacher's role; specifically, the question: given any concept class C and a learning algorithm M, what is the minimum number of labelled examples needed by M to exactly identify any given concept belonging to C? This quantity may be conceived as a measure of the information complexity or teaching complexity of C. Our work will study the teaching complexity of three related families of concept and the erasing pattern languages. Linear sets and pattern languages are mathematical objects that are closely connected to automata theory and formal languages. In this thesis, the teaching complexity of a concept class will be measured mainly by two combinatorial parameters, the teaching dimension (TD) and the recursive teaching dimension (RTD). The TD of a concept C with respect to a class C containing C is defined as the size of a smallest sample (called a teaching set for C with respect to C) that is labelled consistently with C but not with any C0 in C distinct from C; the TD of C is the worst-case TD over all C in C (or infinity if the set consisting of the teaching dimensions of all C in C with respect to C is unbounded). Concerning the TD, we are chiefly interested in two questions with respect to any given concept class C: first, is there a decidable characterization of the concepts in C that have finite TD with respect to C; second, given a representation for concepts in C, how does the TD of C with respect to C vary with the size of the representation of C? The second main parameter studied in this work, the recursive teaching dimension (RTD), arises from the recently introduced recursive teaching model, and it is of particular interest in learning theory due to its links to other learning models as well as to the Vapnik-Chervonenkis dimension { one of the most important parameters in Statistical Learning Theory { in the context of finite concept classes. Our main finding about the RTD in this thesis is that for many classes of linear sets (resp. pattern languages), recursive teaching is significantly more sample efficient than the classical teaching protocol.	en_US
dc.description.authorstatus	Student	en
dc.description.peerreview	yes	en
dc.identifier.tcnumber	TC-SRU-8465
dc.identifier.thesisurl	https://ourspace.uregina.ca/bitstream/handle/10294/8465/GAO_Ziyuan_200332624_PHD_CS_SPRING2018.pdf
dc.identifier.uri	https://hdl.handle.net/10294/8465
dc.language.iso	en	en_US
dc.publisher	Faculty of Graduate Studies and Research, University of Regina	en_US
dc.title	Distinguishing Linear Sets and Pattern Languages With Membership Examples	en_US
dc.type	master thesis	en
thesis.degree.department	Department of Computer Science	en_US
thesis.degree.discipline	Computer Science	en_US
thesis.degree.grantor	Faculty of Graduate Studies and Research, University of Regina	en
thesis.degree.level	Doctoral -- first	en
thesis.degree.name	Doctor of Philosophy (PhD)	en_US

Files

Original bundle

Now showing 1 - 1 of 1

Name:: GAO_Ziyuan_200332624_PHD_CS_SPRING2018.pdf
Size:: 1.01 MB
Format:: Adobe Portable Document Format
Description:

Download

License bundle

Now showing 1 - 1 of 1

Name:: license.txt
Size:: 2.22 KB
Format:: Item-specific license agreed upon to submission
Description:

Download

Collections

Master’s and Doctoral Theses