Semantic Models for Machine Learning
In this thesis we present approaches to the creation and usage of semantic models through analysis of how the data are spread in the feature space. We aim to introduce the general notion of using feature selection techniques in machine learning applications. The applied approaches obtain new feature directions on the data, such that machine learning applications show an increase in performance. We review three principal methods that are used throughout the thesis. The first is Canonical Correlation Analysis (CCA), a method for finding correlated linear relationships between two multidimensional variables. CCA can be seen as using complex labels as a way of guiding feature selection towards the underlying semantics: it makes use of two views of the same semantic object to extract a representation of the semantics. The second is Partial Least Squares (PLS), a method similar to CCA. It selects feature directions that are useful for the task at hand, though PLS uses only one view of an object and the label as the corresponding pair. PLS can be thought of as a method that looks for directions that are good for distinguishing the different labels. The third method is the Fisher kernel, which aims to extract more information from a generative model than its output probabilities alone provide. The aim is to analyse how the Fisher score depends on the model and which aspects of the model are important in determining the Fisher score. We focus our theoretical investigation primarily on CCA and its kernel variant, providing a theoretical analysis of the method's stability using Rademacher complexity and hence deriving an error bound for new data. We conclude the thesis by applying the described approaches to problems in image, text, music and medical analysis, describing several novel applications on relevant real-world data.
The aim of the thesis is to provide a theoretical understanding of semantic models, while also providing a sound foundation for how these models can be used in practice.
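For reference, the CCA objective and the Fisher kernel mentioned in this abstract are commonly written as follows. This is the standard textbook formulation and not necessarily the notation used in the thesis; the symbols w_a, w_b (projection directions for the two views), C_aa, C_bb, C_ab (within-view and between-view covariance matrices), U_x (Fisher score) and I (Fisher information matrix) are assumptions introduced here for illustration.

```latex
% CCA: directions w_a, w_b that maximise the correlation between
% the projections of the two views
\rho = \max_{w_a, w_b}
  \frac{w_a^{\top} C_{ab}\, w_b}
       {\sqrt{\left(w_a^{\top} C_{aa}\, w_a\right)
              \left(w_b^{\top} C_{bb}\, w_b\right)}}

% Fisher score of x under a generative model P(x | \theta),
% and the Fisher kernel built from it
U_x = \nabla_{\theta} \log P(x \mid \theta), \qquad
K(x, z) = U_x^{\top} I^{-1} U_z, \qquad
I = \mathbb{E}_x\!\left[ U_x U_x^{\top} \right]
```

The Fisher score depends on the gradient of the log-likelihood with respect to the model parameters, which is why it can carry more information about the model than the output probability P(x | θ) alone.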
One-class Machine Learning Approach for fMRI Analysis
One-Class Machine Learning techniques (i.e. "bottleneck" neural networks and one-class support vector machines (SVM)) are applied to classify whether a subject is performing a task or not by looking solely at the raw fMRI slices of the subject's brain. "One-class" means that during training the system only has access to positive (i.e. task-performing) examples. "Two-class" means it has access to negative examples as well. Successful classification of data by a system trained under either of the one-class techniques was accomplished at close to the 60% level. (In contrast, an implementation of a standard two-class SVM succeeds at around the 70% level.) These results were stable over repeated experiments and for both motor and visual tasks. Since the one-class neural network technique is naturally related to dimension reduction, it is possible that this mechanism may also be used for feature selection.
Signal Extraction for Brain-Computer Interface
We use Kernel Canonical Correlation Analysis (KCCA) for detecting brain activity in functional MRI by learning a semantic representation of fMRI brain scans and their associated time frequency. The semantic space provides a common representation and enables a comparison between the fMRI and time frequency. We compare the approach against Canonical Correlation Analysis (CCA) by localising brain regions that control finger movement and regions that are involved in mental calculation. We also compare the two approaches on a simulated null data set. We hypothesise that once a link can be established between regions of the brain and a task, one could create a brain-computer interface where computer-related tasks could be activated by brain "thought" activity.
Generic object recognition by combining distinct features in machine learning
In a generic image object recognition or categorization system, the relevant features or descriptors from a characteristic point, patch or region of an image are often obtained by different approaches, and these features are often separately selected and learned by machine learning methods. In this paper, the relation between distinct features obtained by different feature extraction approaches from the same original images was studied by Kernel Canonical Correlation Analysis (KCCA). We apply a Support Vector Machine (SVM) classifier in the learnt semantic space of the combined features and compare against SVM on the raw data and previously published state-of-the-art results. Experiments show that significant improvement is achieved with the SVM in the semantic space in comparison with direct SVM classification on the raw data.
KCCA Feature Selection for fMRI Analysis
We use Kernel Canonical Correlation Analysis (KCCA) to infer brain activity in functional MRI by learning a semantic representation of fMRI brain scans and their associated activity signal. The semantic space provides a common representation and enables a comparison between the fMRI and the activity signal. We compare the approach against Canonical Correlation Analysis (CCA) and the more commonly used Ordinary Correlation Analysis (OCA) by localising “activity” on a simulated null data set. We also compare the performance of the methods on the localisation of brain regions which control finger movement and regions that are involved in mental calculation. Finally, we present an approach to reconstruct an activity signal from “unknown” testing-set fMRI scans. This is used to validate the learnt semantics as non-trivial.
Learning the semantics of multimedia content with application to web image retrieval and classification
We use kernel Canonical Correlation Analysis to learn a semantic representation of Web images and their associated text. This representation is used in two applications. In the first application we consider classification of images into one of three categories. We use SVM in the semantic space and compare against the SVM on raw data and against previously published results using ICA. In the second application we retrieve images based only on their content from a text query. The semantic space provides a common representation and enables a comparison between the text and image. We compare against a standard cross-representation retrieval technique known as the Generalised Vector Space Model.
Using String Kernels to Identify Famous Performers from their Playing Style
In this paper we show a novel application of string kernels: the problem of recognising famous pianists from their style of playing. The characteristics of performers playing the same piece are obtained from changes in beat-level tempo and beat-level loudness, which over the time of the piece form a performance worm. From such worms, general performance alphabets can be derived, and pianists’ performances can then be represented as strings. We show that when using the string kernel on this data, both Kernel Partial Least Squares and Support Vector Machines outperform the current best results. Furthermore, we suggest a new method of obtaining feature directions from the Kernel Partial Least Squares algorithm and show that, when used in conjunction with a Support Vector Machine, this can deliver better performance than methods previously used in the literature.
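As a rough illustration of how performances represented as strings can be compared, the sketch below implements a p-spectrum string kernel in plain Python. The abstract does not say which string kernel the authors used, so the choice of the p-spectrum kernel, the function names and the example strings are all assumptions made here for illustration.

```python
from collections import Counter
from math import sqrt


def spectrum_features(s, p):
    """Count every contiguous substring of length p in s."""
    return Counter(s[i:i + p] for i in range(len(s) - p + 1))


def spectrum_kernel(s, t, p=2):
    """Inner product of the p-gram count vectors of s and t."""
    fs, ft = spectrum_features(s, p), spectrum_features(t, p)
    return sum(count * ft[gram] for gram, count in fs.items())


def normalised_kernel(s, t, p=2):
    """Spectrum kernel scaled so that K(s, s) = 1 for any non-trivial s."""
    denom = sqrt(spectrum_kernel(s, s, p) * spectrum_kernel(t, t, p))
    return spectrum_kernel(s, t, p) / denom if denom else 0.0


# Hypothetical performance strings over a made-up alphabet {a, b}.
performance_1 = "abab"
performance_2 = "abba"
similarity = normalised_kernel(performance_1, performance_2, p=2)
```

A kernel matrix built from such pairwise values could then be passed to any kernel method, such as a Support Vector Machine or Kernel Partial Least Squares.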
Canonical correlation analysis; An overview with application to learning methods
We present a general method using kernel Canonical Correlation Analysis to learn a semantic representation of web images and their associated text. The semantic space provides a common representation and enables a comparison between the text and images. In the experiments we look at two approaches to retrieving images based only on their content from a text query. We compare the approaches against a standard cross-representation retrieval technique known as the Generalised Vector Space Model.
