
    Generative models for classification: classification, feature and features organization

    Classification is one of the most studied problems in machine learning and consists of assigning a label or class to input objects on the basis of quantitative information. There are two main types of classifiers: generative classifiers, based on a generative model, and discriminative classifiers, based on the concept of a separating boundary. This thesis shows what a generative model can do to solve the classification problem. In particular, a generative model can be used as a classifier, to organize the quantitative descriptions of objects, or to provide descriptions that are then classified with discriminative methods.
    Classification is one of the most studied problems in machine learning and involves the placement of data into groups based on quantitative information. Group membership is determined by whether or not a datum contains a specific "feature", a decision based on a training set of previously labeled items. In order to build robust classifiers, one has to capture various aspects of the data at the same time; this can be accomplished using generative models. Generative models are statistical models that explain the input data as tangible effects generated from a combination of hidden variables, which encode the causes of data generation, coupled with conditional independencies. These models should be fairly simple, but capable of adapting to the data; the machine learning community calls these models flexible when they are minimally structured probability models with a large number of parameters that can adapt so as to explain the input data.
Although a generative model can be used for classification using the Bayes rule and marginalization, another family of classifiers, the discriminative classifiers, achieves better performance in normal regimes, since these classifiers enable us to construct decision boundaries and incorporate the concept of discrimination during learning. On the other hand, generative models can deal with missing or hidden information and variable-length descriptions, and they are more robust to overtraining. The prevailing wisdom in the machine learning community is that an ideal classifier should combine these two complementary approaches. In this dissertation I will show what a generative model can do for classification. I will address the genotype classification task, showing with a computational biology example how a generative model outperforms discriminative classifiers; the idea here is that genotype data are too complex and not separable into classes without calculating the hidden causes that generated them (the haplotypes). Then I will present a novel family of generative models that can be used to spatially organize features in a set of images, for more efficient use with a generic discriminative or generative classifier. Finally, I will show original ways to obtain features from a generative model: by deriving kernel functions for use in discriminative methods, by using by-products of generative models as features, or by extracting similarities between samples under a generative model.
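The phrase "using the Bayes rule and marginalization" can be made concrete with the standard textbook formulation below. The symbols are assumptions introduced here for illustration and do not come from the thesis: x is an observation, c a class label, and h the hidden variables of the class-conditional model.

```latex
\hat{c}(x) = \arg\max_{c} \; p(c \mid x), \qquad
p(c \mid x) = \frac{p(c)\, p(x \mid c)}{\sum_{c'} p(c')\, p(x \mid c')}, \qquad
p(x \mid c) = \sum_{h} p(x, h \mid c)
```

The last identity is the marginalization step: the hidden causes are summed out so that each class model assigns a likelihood to the observed data alone.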

    Learning natural scene categories by selective multi-level feature extraction

    Natural scene categorization from images is a very useful task for automatic image analysis systems. Several methods in the literature address this problem with excellent results. Typically, features of several types are clustered to generate a vocabulary that describes the image collection from multiple viewpoints. This vocabulary is formed by a discrete set of visual codewords whose co-occurrence and/or composition allows the scene category to be classified. A common drawback of these methods is that features are usually extracted from the whole image, regardless of whether they derive from the natural scene to be classified or from foreground objects that may be present in it but are not characteristic of the scene. As perceptual studies indicate, objects present in an image are not useful for natural scene categorization; instead, they introduce a significant source of clutter, depending on their size. In this paper, a novel, multi-scale, statistical approach to image representation aimed at scene categorization is presented. The method is able to select, at different levels, sets of features that represent exclusively the scene, disregarding other non-characteristic clutter elements. The proposed procedure, based on a generative model, then produces a robust representation scheme that is useful for image classification. The results are convincing and support the approach even when only simple features, such as local color image histograms, are considered.

    BIOMETRICS ON VISUAL PREFERENCES: A “PUMP AND DISTILL” REGRESSION APPROACH

    We present a statistical behavioural biometric approach for recognizing people by their aesthetic preferences, using colour images. In the enrollment phase, a model is learnt for each user, using a training set of preferred images. In the recognition/authentication phase, such a model is tested with an unseen set of pictures preferred by a probe subject. The approach is dubbed "pump and distill", since the training set of each user is pumped by bagging, producing a set of image ensembles. In the distill step, each ensemble is reduced into a set of surrogates, that is, aggregates of images sharing a similar visual content. Finally, LASSO regression is performed on these surrogates; the resulting regressor, employed as a classifier, takes test images belonging to a single user, predicting his identity. The approach improves the state of the art on recognition and authentication tasks on average, on a dataset of 40000 Flickr images and 200 users. In practice, given a pool of 20 preferred images of a user, the approach recognizes his identity with an accuracy of 92%, and sets an authentication accuracy of 91%, in terms of normalized Area Under the Curve of the CMC and ROC curves, respectively.
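As a rough illustration of the "pump" and "distill" steps described in this abstract, the sketch below bootstrap-resamples a user's image feature vectors into ensembles and collapses each ensemble into one averaged surrogate. Everything here is an assumption made for illustration: the function names `pump` and `distill_mean` are hypothetical, images are taken to be plain lists of floats, and a single mean per ensemble stands in for the paper's grouping of visually similar images. The LASSO regression stage is not shown.

```python
import random


def pump(features, n_ensembles, ensemble_size, seed=0):
    """Bootstrap-resample feature vectors (sampling with replacement) into ensembles."""
    rng = random.Random(seed)
    return [rng.choices(features, k=ensemble_size) for _ in range(n_ensembles)]


def distill_mean(ensemble):
    """Collapse one ensemble into a single surrogate by averaging its vectors.

    Simplification (assumption): the abstract describes several surrogates per
    ensemble, each aggregating visually similar images; one mean stands in here.
    """
    dim = len(ensemble[0])
    return [sum(vec[i] for vec in ensemble) / len(ensemble) for i in range(dim)]


if __name__ == "__main__":
    # Three toy "images", each described by a 2-D feature vector.
    feats = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
    ensembles = pump(feats, n_ensembles=4, ensemble_size=5)
    surrogates = [distill_mean(e) for e in ensembles]
    print(surrogates)
```

The surrogates produced this way would then serve as the inputs to a sparse regressor; which features and which aggregation the authors actually use is not stated in the abstract.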

    The pictures we like are our image: continuous mapping of favorite pictures into self-assessed and attributed personality traits

    Flickr allows its users to tag the pictures they like as "favorite". As a result, many users of the popular photo-sharing platform produce galleries of favorite pictures. This article proposes new approaches, based on Computational Aesthetics, capable of inferring the personality traits of Flickr users from these galleries. In particular, the approaches map low-level features extracted from the pictures into numerical scores corresponding to the Big-Five Traits, both self-assessed and attributed. The experiments were performed over 60,000 pictures tagged as favorite by 300 users (the PsychoFlickr Corpus). The results show that it is possible to predict both self-assessed and attributed traits beyond chance. In line with the state of the art of Personality Computing, the latter are predicted with higher effectiveness (correlation up to 0.68 between actual and predicted traits).

    Free energy score spaces: using generative information in discriminative classifiers

    A score function induced by a generative model of the data can provide a feature vector of a fixed dimension for each data sample. Data samples themselves may be of differing lengths (e.g., speech segments, or other sequential data), but as a score function is based on the properties of the data generation process, it produces a fixed-length vector in a highly informative space, typically referred to as "score space". Discriminative classifiers have been shown to achieve higher performances in appropriately chosen score spaces with respect to what is achievable by either the corresponding generative likelihood-based classifiers, or the discriminative classifiers using standard feature extractors. In this paper, we present a novel score space that exploits the free energy associated with a generative model. The resulting free energy score space (FESS) takes into account the latent structure of the data at various levels, and can be shown to lead to classification performance that at least matches the performance of the free energy classifier based on the same generative model, and the same factorization of the posterior. We also show that in several typical computer vision and computational biology applications the classifiers optimized in FESS outperform the corresponding pure generative approaches, as well as a number of previous approaches combining discriminating and generative models.
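To give a feel for how a free energy can be split into a fixed-length feature vector, here is a toy sketch. It assumes a 1-D Gaussian mixture per class and the exact posterior over components, which is far simpler than the models in the paper. The function names are hypothetical, and the decomposition into entropy, prior and likelihood terms is one illustrative choice, not the authors' construction.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def free_energy_terms(x, weights, means, variances):
    """Split the free energy of a 1-D Gaussian mixture at x into per-component terms.

    With q(k) = p(k | x), the free energy sum_k q(k) * (log q(k) - log p(x, k))
    equals -log p(x). Each component contributes three terms:
    entropy q*log q, prior -q*log pi_k, and likelihood -q*log N(x | mu_k, var_k).
    """
    log_prior = [math.log(w) for w in weights]
    log_lik = [gaussian_logpdf(x, m, v) for m, v in zip(means, variances)]
    log_joint = [p + l for p, l in zip(log_prior, log_lik)]
    # log p(x) via log-sum-exp for numerical stability
    top = max(log_joint)
    log_px = top + math.log(sum(math.exp(lj - top) for lj in log_joint))
    terms = []
    for lp, ll, lj in zip(log_prior, log_lik, log_joint):
        log_q = lj - log_px          # log posterior of this component
        q = math.exp(log_q)
        terms.extend([q * log_q, -q * lp, -q * ll])
    return terms, log_px


def score_vector(x, class_models):
    """Concatenate the free energy terms of every class model into one vector."""
    features = []
    for weights, means, variances in class_models:
        terms, _ = free_energy_terms(x, weights, means, variances)
        features.extend(terms)
    return features


if __name__ == "__main__":
    model_a = ([0.5, 0.5], [0.0, 2.0], [1.0, 1.0])   # hypothetical class A
    model_b = ([0.3, 0.7], [-1.0, 4.0], [0.5, 2.0])  # hypothetical class B
    print(score_vector(1.0, [model_a, model_b]))
```

The vector length depends only on the models (three terms per component per class), not on the sample, so it could be handed to any discriminative classifier. Summing the terms of one class model recovers that model's negative log-likelihood, which is the quantity a purely generative classifier would compare across classes.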

    Capturing video structure with mixture of probabilistic index maps

    The ability to segment or separate foreground from background in video images is useful in a number of applications, including video compression, human-computer interaction, and object tracking, to name a few. Generating such a segmentation in a manner that is both reliable and visually pleasing requires the fusion of spatial and temporal information. This fusion typically requires processing a large amount of information, thereby imposing a heavy computational cost and/or requiring substantial manual interaction. This heavy computational cost unfortunately limits its applicability. In this paper a generative model to solve this problem is proposed. The model has been designed with a particular emphasis on efficiency, but it also provides visually pleasing results. The approach selects salient appearance poses of the foreground shared across the entire sequence in an unsupervised way, and uses them to better extract the foreground from the single frames. Results support the validity of the approach.

    Feature selection using Counting Grids: application to microarray data

    In this paper a novel feature selection scheme is proposed, which exploits the potential of a recent probabilistic generative model, the Counting Grid. This model is able to cluster together similar observations, highlighting the compactness of a class and its underlying structure. The proposed feature selection scheme is applied to the expression microarray scenario, a peculiar context with very few patterns and a huge number of features. Experiments on benchmark datasets show that the proposed approach is effective and stable, achieving state-of-the-art classification accuracies.

    Free energy score space

    A score function induced by a generative model of the data can provide a feature vector of a fixed dimension for each data sample. Data samples themselves may be of differing lengths (e.g., speech segments, or other sequence data), but as a score function is based on the properties of the data generation process, it produces a fixed-length vector in a highly informative space, typically referred to as a “score space”. Discriminative classifiers have been shown to achieve higher performance in appropriately chosen score spaces than is achievable by either the corresponding generative likelihood-based classifiers, or the discriminative classifiers using standard feature extractors. In this paper, we present a novel score space that exploits the free energy associated with a generative model. The resulting free energy score space (FESS) takes into account latent structure of the data at various levels, and can be trivially shown to lead to classification performance that at least matches the performance of the free energy classifier based on the same generative model, and the same factorization of the posterior. We also show that in several typical vision and computational biology applications the classifiers optimized in FESS outperform the corresponding pure generative approaches, as well as a number of previous approaches to combining discriminating and generative models

    Stel component analysis: Modeling spatial correlations in image class structure

    As a useful concept in the study of low-level image class structure, we introduce the notion of a structure element, or "stel". The notion is related to the notions of a pixel, superpixel, segment or part, but instead of referring to an element or a region of a single image, a stel is a probabilistic element of an entire image class. Stels often define clear object or scene parts as a consequence of the modeling constraint which forces the regions belonging to a single stel to have a tight distribution over local measurements, such as color or texture. This self-similarity within a region in a single image is typical of most meaningful image parts, even when in different images of similar objects the corresponding parts may not have similar local measurements. The stel itself is expected to be consistent within a class, yet flexible, which we accomplish using a novel approach we dubbed stel component analysis. Experimental results show how stel component analysis can assist in image/video segmentation and object recognition where, in particular, it can be used as an alternative to, or in conjunction with, bag-of-features and related classifiers, where stel inference provides a meaningful spatial partition of features.