Sparse Coding Tone-Like Structures in Sound Using Local Image Features

Date

2015-08

Authors

Ubbens, Jordan Robert

Publisher

Faculty of Graduate Studies and Research, University of Regina

Abstract

A trend in machine learning has emphasized the use of features which are learned algorithmically, in contrast to the hand-engineered features traditionally used in classification tasks. Classical sparse coding is a robust feature learning paradigm which represents inputs as a sparse vector of coefficients applied to a dictionary of basis functions. While sparse coding has yielded state-of-the-art results in many application domains, computational challenges often make it impractical to use. This thesis examines the application of a local image feature based sparse coding algorithm (ScSPM) to the problem domain of audio classification. The convex optimization problems involved in dictionary learning are discussed, and existing methods are reviewed. With the goal of mitigating some of the computational expense involved in sparse coding local image features, alternative image-based representations of audio are proposed which isolate the tone-like structures present in the signal. The proposed alternative representations are evaluated in a multi-class audio classification task with respect to training time as well as classification accuracy.
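
To illustrate the abstract's description of sparse coding (an input represented as a sparse coefficient vector applied to a dictionary of basis functions), the following is a minimal sketch and not the thesis's implementation. It assumes the common L1-penalized least-squares objective, 0.5*||x - D a||^2 + lam*||a||_1, and solves it with iterative soft-thresholding (ISTA), a solver chosen here only for brevity. The function names, the toy dictionary `D`, the input `x`, and the value of `lam` are all hypothetical.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t (the proximal operator of the L1 penalty)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def sparse_code_ista(x, D, lam=0.1, n_iter=500):
    """Approximately minimise 0.5*||x - D a||^2 + lam*||a||_1 over a.

    x is a list of length m. D is a list of m rows, each of length k,
    so column j of D is the j-th basis function (hypothetical layout).
    """
    m, k = len(D), len(D[0])
    # The squared Frobenius norm of D upper-bounds the Lipschitz constant
    # of the gradient, so 1/L is a safe (conservative) step size.
    L = sum(D[i][j] ** 2 for i in range(m) for j in range(k))
    a = [0.0] * k
    for _ in range(n_iter):
        # Residual r = D a - x
        r = [sum(D[i][j] * a[j] for j in range(k)) - x[i] for i in range(m)]
        # Gradient of the least-squares term: g = D^T r
        g = [sum(D[i][j] * r[i] for i in range(m)) for j in range(k)]
        # Gradient step followed by soft-thresholding
        a = [soft_threshold(a[j] - g[j] / L, lam / L) for j in range(k)]
    return a


# Toy overcomplete dictionary: three unit-norm basis functions in 2-D.
D = [[1.0, 0.0, 0.6],
     [0.0, 1.0, 0.8]]
x = [2.0, 0.0]  # a scaled copy of the first basis function

codes = sparse_code_ista(x, D, lam=0.1)
print(codes)
```

In this toy case the penalty drives the coefficients of the second and third basis functions to zero, leaving a single active coefficient slightly below 2 because of the L1 shrinkage. The cost of repeating such an optimization for every local feature is the kind of computational expense the abstract refers to.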

Description

A Thesis Submitted to the Faculty of Graduate Studies and Research In Partial Fulfillment of the Requirements for the Degree of Master of Science in Computer Science, University of Regina. viii, 76 p.
