    Ranking to Learn and Learning to Rank: On the Role of Ranking in Pattern Recognition Applications

    The last decade has seen a revolution in the theory and application of machine learning and pattern recognition. Through these advancements, variable ranking has emerged as an active and growing research area, and it is now beginning to be applied to many new problems. The rationale is that many pattern recognition problems are by nature ranking problems. The main objective of a ranking algorithm is to sort objects according to some criteria so that the most relevant items appear early in the produced result list. Ranking methods can be analysed from two different methodological perspectives: ranking to learn and learning to rank. The former studies methods and techniques for sorting objects in order to improve the accuracy of a machine learning model. Enhancing a model's performance can be challenging at times. For example, in pattern classification tasks, different data representations can complicate and hide the different explanatory factors of variation behind the data. In particular, hand-crafted features contain many cues that are either redundant or irrelevant, which reduces the overall accuracy of the classifier. In such cases, feature selection is used: by producing ranked lists of features, it helps to filter out the unwanted information. Moreover, in real-time systems (e.g., visual trackers), ranking approaches are used as optimization procedures that improve the robustness of a system dealing with the high variability of image streams that change over time. Conversely, learning to rank is necessary in the construction of ranking models for information retrieval, biometric authentication, re-identification, and recommender systems. In this context, the ranking model's purpose is to sort objects according to their degrees of relevance, importance, or preference as defined in the specific application.
    This thesis addresses these issues and discusses different aspects of variable ranking in pattern recognition, biometrics, and computer vision. In particular, this work explores the merits of ranking to learn by proposing novel solutions in feature selection that efficiently remove unwanted cues from the information stream. A novel graph-based ranking framework is proposed that exploits the convergence properties of power series of matrices, thereby identifying candidate features that turn out to be effective from a classification point of view. Moreover, it investigates the difficulties of ranking in real time while presenting solutions to better handle data variability in an important computer vision setting: Visual Object Tracking. The second part of this thesis focuses on the problem of learning to rank. Firstly, a scenario of automatic user re-identification and verification in text chats is considered. Here, we move from the challenging problem of feature handcrafting to automatic feature learning solutions. We explore different techniques that turn out to produce effective ranks, helping to push forward the state of the art. Moreover, we focus on advert recommendation, where deep convolutional neural networks with shallow architectures are used to rank ads according to users' preferences. We demonstrate the quality of our solutions in extensive experimental evaluations. Finally, this thesis introduces representative datasets and code libraries in different research areas that facilitate large-scale performance evaluation.

    Feature Selection via Eigenvector Centrality

    In an era where accumulating data is easy and storing it inexpensive, feature selection plays a central role in helping to reduce the high dimensionality of huge amounts of otherwise meaningless data. In this paper, we propose a graph-based method for feature selection that ranks features by identifying the most important ones within an arbitrary set of cues. Mapping the problem onto an affinity graph, where features are the nodes, the solution is given by assessing the importance of nodes through indicators of centrality, in particular Eigenvector Centrality (EC). The gist of EC is to estimate the importance of a feature as a function of the importance of its neighbors. Ranking central nodes identifies candidate features, which turn out to be effective from a classification point of view, as shown by a thorough experimental section. Our approach has been tested on 7 diverse datasets from recent literature (e.g., biological data and object recognition, among others), and compared against filter, embedded, and wrapper methods. The results are remarkable in terms of accuracy, stability, and low execution time.
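The ranking idea in this abstract can be illustrated with a minimal sketch, assuming a small hand-made affinity matrix. The abstract does not say how edge weights are computed, so the matrix `A` and the function names below are hypothetical, and plain power iteration is used here only as one standard way to obtain eigenvector centrality, not necessarily the authors' implementation.

```python
def eigenvector_centrality(adjacency, iterations=200, tol=1e-12):
    """Approximate the leading eigenvector of a non-negative matrix by power iteration."""
    n = len(adjacency)
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Each node's new score is the weighted sum of its neighbors' scores.
        updated = [sum(adjacency[i][j] * scores[j] for j in range(n)) for i in range(n)]
        norm = sum(updated)
        if norm == 0:
            raise ValueError("affinity matrix has no positive weights")
        updated = [value / norm for value in updated]
        converged = max(abs(a - b) for a, b in zip(updated, scores)) < tol
        scores = updated
        if converged:
            break
    return scores


def rank_features(adjacency):
    """Return feature indices sorted from most to least central."""
    scores = eigenvector_centrality(adjacency)
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hypothetical symmetric affinity graph over four features (assumption, not paper data).
A = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.2],
    [0.8, 0.7, 0.0, 0.1],
    [0.1, 0.2, 0.1, 0.0],
]
ranking = rank_features(A)
print(ranking)
```

In this toy graph the fourth feature is only weakly connected to the others, so it ends up last in the ranking; a selection step would then keep the top-ranked features.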

    Personality in Computational Advertising: A Benchmark

    In the last decade, new ways of shopping online have increased the possibility of buying products and services more easily and faster than ever. In this new context, personality is a key determinant in the decision making of the consumer when shopping. A person’s buying choices are influenced by psychological factors like impulsiveness; indeed, some consumers may be more susceptible to making impulse purchases than others. Since affective metadata are more closely related to the user’s experience than generic parameters, accurate predictions reveal important aspects of users’ attitudes and social life, including the attitudes of others and social identity. This work proposes research that uses a personality perspective to determine the unique associations between consumers’ buying tendencies and advert recommendations. In fact, the lack of a publicly available benchmark for computational advertising allows neither the exploration of this research direction nor the evaluation of recent algorithms. We present the ADS Dataset, a publicly available benchmark consisting of 300 real advertisements (i.e., Rich Media Ads, Image Ads, Text Ads) rated by 120 unacquainted individuals, enriched with users’ Big-Five personality factors and 1,200 personal user pictures.

    Statistical Analysis of Personality and Identity in Chats Using a Keylogging Platform

    Interacting via text chats can be considered a hybrid type of communication, in which textual information delivery follows turn-taking dynamics, resembling spoken interactions. An interesting research question is whether personality can be observed in chats, as happens in face-to-face exchanges. After an encouraging preliminary work on Skype, in this study we have set up our own chat service in which key-logging functionalities have been activated, so that the timing of each key press can be measured. Using this framework, we organized semi-structured chats between 50 subjects, whose personality traits have been analyzed through psychometric tests, and a single operator, for a total of 16 hours of conversation. On this data, we have observed that some personality traits are linked with the way we chat (measured by stylometric cues), by means of statistically significant correlations and regression studies. Finally, we have found that some of the stylometric cues are very discriminative for the recognition of a user in an identification scenario. Taken together, these findings suggest that some personality traits drive us to chat in a particular fashion, which turns out to be highly recognizable.

    Ranking to Learn:

    In an era where accumulating data is easy and storing it inexpensive, feature selection plays a central role in helping to reduce the high dimensionality of huge amounts of otherwise meaningless data. In this paper, we propose a graph-based method for feature selection that ranks features by identifying the most important ones within an arbitrary set of cues. Mapping the problem onto an affinity graph, where features are the nodes, the solution is given by assessing the importance of nodes through indicators of centrality, in particular Eigenvector Centrality (EC). The gist of EC is to estimate the importance of a feature as a function of the importance of its neighbors. Ranking central nodes identifies candidate features, which turn out to be effective from a classification point of view, as shown by a thorough experimental section. Our approach has been tested on 7 diverse datasets from recent literature (e.g., biological data and object recognition, among others), and compared against filter, embedded, and wrapper methods. The results are remarkable in terms of accuracy, stability, and low execution time. © Springer International Publishing AG 2017

    Online Feature Selection for Visual Tracking

    Object tracking is one of the most important tasks in many applications of computer vision. Many tracking methods use a fixed set of features, ignoring that the appearance of a target object may change drastically due to intrinsic and extrinsic factors. The ability to dynamically identify discriminative features would help handle appearance variability and improve tracking performance. The contribution of this work is threefold. Firstly, this paper presents a collection of several modern feature selection approaches selected among filter, embedded, and wrapper methods. Secondly, we provide extensive tests on the classification task, intended to explore the strengths and weaknesses of the proposed methods with the goal of identifying the right candidates for online tracking. Finally, we show how feature selection mechanisms can be successfully employed for ranking the features used by a tracking system while maintaining high frame rates. In particular, feature selection mounted on the Adaptive Color Tracking (ACT) system operates at over 110 FPS. This work demonstrates the importance of feature selection in online and real-time applications: our solutions improve on the baseline ACT by 3% to 7% while providing superior results compared to 29 state-of-the-art tracking methods.

    Infinite feature selection: a graph-based feature filtering approach

    We propose a filtering feature selection framework that considers a subset of features as a path in a graph, where a node is a feature and an edge indicates pairwise (customizable) relations among features, dealing with relevance and redundancy principles. Through two different interpretations (exploiting properties of power series of matrices and relying on Markov chain fundamentals), we can evaluate the values of paths (i.e., feature subsets) of arbitrary length, eventually going to infinity, hence the name Infinite Feature Selection (Inf-FS). Going to infinity makes it possible to constrain the computational complexity of the selection process and to rank the features in an elegant way, that is, by considering the value of any path (subset) containing a particular feature. We also propose a simple unsupervised strategy to cut the ranking, thereby providing the subset of features to keep. In the experiments, we analyze diverse setups with heterogeneous features, for a total of 11 benchmarks, comparing against 18 widely known and effective comparative approaches. The results show that Inf-FS performs better in almost every situation, that is, both when the number of features to keep is fixed a priori and when the choice of the subset cardinality is part of the process.
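The power-series view mentioned in this abstract can be sketched as follows, under stated assumptions. If `A` is a non-negative feature affinity matrix and `r` is a scaling factor small enough for the series to converge, the total value of all paths of every length starting at each feature is the row sum of `rA + (rA)^2 + (rA)^3 + ...`. The sketch below accumulates that sum with a fixed-point iteration instead of a matrix inverse. The matrix, the parameter `alpha`, and the choice of scaling by the largest row sum (which bounds the spectral radius) are illustrative assumptions, not details taken from the abstract.

```python
def infinite_path_scores(A, alpha=0.9, tol=1e-12, max_iter=10_000):
    """Row sums of sum_{l>=1} (r*A)^l, computed by fixed-point iteration.

    r = alpha / (largest absolute row sum) keeps the spectral radius of r*A
    below 1, so the geometric series of matrices converges.
    """
    n = len(A)
    max_row_sum = max(sum(abs(w) for w in row) for row in A)
    if max_row_sum == 0:
        raise ValueError("affinity matrix has no non-zero weights")
    r = alpha / max_row_sum
    scores = [0.0] * n
    for _ in range(max_iter):
        # s_{k+1} = r*A*(1 + s_k); its fixed point equals ((I - r*A)^-1 - I) * 1.
        shifted = [1.0 + s for s in scores]
        updated = [r * sum(A[i][j] * shifted[j] for j in range(n)) for i in range(n)]
        if max(abs(u - s) for u, s in zip(updated, scores)) < tol:
            return updated
        scores = updated
    return scores


# Hypothetical affinity matrix over four features (assumption, not paper data).
A = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.2],
    [0.8, 0.7, 0.0, 0.1],
    [0.1, 0.2, 0.1, 0.0],
]
scores = infinite_path_scores(A)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

Because every path length contributes to the score, a feature is rewarded for belonging to many high-value subsets rather than for a single strong pairwise relation.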

    Infinite Latent Feature Selection: A Probabilistic Latent Graph-Based Ranking Approach

    Feature selection is playing an increasingly significant role in many computer vision applications, spanning from object recognition to visual object tracking. However, most of the recent solutions in feature selection are not robust across different and heterogeneous sets of data. In this paper, we address this issue by proposing a robust probabilistic latent graph-based feature selection algorithm that performs the ranking step while considering all the possible subsets of features, as paths on a graph, bypassing the combinatorial problem analytically. An appealing characteristic of the approach is that it aims to discover an abstraction behind low-level sensory data, that is, relevancy. Relevancy is modelled as a latent variable in a PLSA-inspired generative process that allows the investigation of the importance of a feature when injected into an arbitrary set of cues. The proposed method has been tested on ten diverse benchmarks and compared against eleven state-of-the-art feature selection methods. Results show that the proposed approach attains the highest performance levels across many different scenarios and difficulties, thereby confirming its strong robustness while setting a new state of the art in the feature selection domain.

    Reading between the turns: statistical modeling for identity recognition and verification in chats

    Identity safekeeping has recently become an important problem for the social web: as a case study, we focus here on instant messaging platforms, proposing novel soft-biometric cues for user recognition and verification. Specifically, we design a set of features that effectively encode how a person converses: since chats are crossbreeds of written text and face-to-face verbal communication, the features inherit equally from textual authorship attribution and conversational analysis of speech. Importantly, our cues completely ignore the semantics of the chat, relying solely on non-verbal aspects, which addresses possible privacy and ethical issues. We apply our approach to a novel dataset of 94 different individuals, whose chat conversations have been recorded for an average period of five months; the recognition rate, measured as normalized AUC of the CMC curve, is 95.73%, while the verification rate, measured as normalized AUC of the ROC curve, amounts to 95.66%.
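For readers unfamiliar with the recognition metric quoted above, here is a minimal sketch of a Cumulative Match Characteristic (CMC) curve and its normalized area. It assumes that the normalized AUC is the mean of the CMC values over all ranks, which is one common convention; the abstract does not spell out the exact definition, and the rank data below are invented for illustration.

```python
def cmc_curve(true_match_ranks, gallery_size):
    """CMC(k): fraction of probes whose true identity appears within the top k.

    true_match_ranks holds, for each probe, the 1-based rank at which the
    correct identity was retrieved from the gallery.
    """
    counts = [0] * gallery_size
    for rank in true_match_ranks:
        counts[rank - 1] += 1
    curve = []
    cumulative = 0
    for count in counts:
        cumulative += count
        curve.append(cumulative / len(true_match_ranks))
    return curve


def normalized_auc(curve):
    """Area under the curve divided by the number of ranks (assumed convention)."""
    return sum(curve) / len(curve)


# Hypothetical ranks for four probes against a gallery of four identities.
curve = cmc_curve([1, 1, 2, 4], gallery_size=4)
area = normalized_auc(curve)
print(curve, area)
```

A system that always retrieves the correct identity at rank 1 would score 1.0 under this convention, and later correct matches pull the value down.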