    Apprendimento Automatico di Modelli Visuali con Dati Incompleti (Machine Learning of Visual Models with Incomplete Data)

    The goal of a learning system is to capture the patterns and regularities in training data that allow future data to be classified. Machine learning methods can generalize a classification model from labelled training data, but difficulties arise when the distribution of the training data is not explicitly modelled. Real-world applications offer a massive amount of visual data, but labelled data are not always easy to find, and the labelling process is costly, time consuming, or impossible when the necessary knowledge is lacking. This work focuses on learning discriminative visual models in scenarios with partially annotated or incomplete data. By incomplete data we mean either that only a subset of the training data is labelled or that only a fraction of the training classes is known. We evaluate the problem of learning from incomplete data in three separate computer vision applications: people tracking, novel image classification, and document image analysis. In video surveillance, the input of a tracking system can be interpreted as a set of partially labelled data, with only a few annotated instances of the target and several unannotated samples. Unannotated data may also deviate considerably from the annotated data because of occlusions or changes in pose or appearance, which makes the target association problem challenging. We use a semi-supervised learning method to solve the people tracking problem and demonstrate the effectiveness of the proposed approach through an experimental analysis. Regarding image categorization, an interesting challenge is the detection of novel categories and subcategories of objects. Assuming that objects can be organized in taxonomies, the instances to be classified may differ from the hierarchy learned from training data and may share only some of its parent nodes. Our work is devoted to deriving a model from labelled data that can generalize to classes not seen during training. Finally, the last part addresses picture segmentation in document images of old books and the retrieval of similar images from other sources. Old documents contain a wide variety of pictorial elements, which makes it difficult to collect samples representative of this heterogeneity. We propose an effective feature representation combined with Support Vector Machine classification, and an experimental evaluation shows an improvement over baseline document layout analysis methods even when a detailed model of the input space is not available.

    Transductive People Tracking in Unconstrained Surveillance

    Long-term tracking of people in unconstrained scenarios is still an open problem due to the absence of constant elements in the problem setting. The camera, when active, may move, and both the background and the target appearance may change abruptly, making most standard tracking techniques inadequate. We propose a learning approach that treats the tracking task as a semi-supervised learning (SSL) problem. Given a few target samples, the aim is to find the target occurrences in the video stream by reinterpreting the problem as label propagation on a similarity graph. We propose a solution based on graph transduction that works iteratively, frame by frame. Additionally, to avoid drifting, we introduce an update strategy based on an evolutionary clustering technique that chooses the visual templates that best describe the target appearance, evolving the model as the video is processed. Since we model people's appearance by means of covariance matrices on color and gradient information, our framework is directly related to structure learning on Riemannian manifolds. Tests on publicly available datasets and comparisons with state-of-the-art techniques allow us to conclude that our solution exhibits interesting performance in terms of tracking precision and recall in most of the considered scenarios.
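
The idea of label propagation on a similarity graph mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the Gaussian affinity, the clamped iterative update, and the names `gaussian_affinity` and `propagate_labels` are all assumptions made for illustration, and plain 2-D points stand in for the covariance descriptors the paper uses.

```python
import math


def gaussian_affinity(points, sigma=1.0):
    """Build a dense similarity graph with Gaussian edge weights (assumed kernel)."""
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                dist_sq = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                weights[i][j] = math.exp(-dist_sq / (2.0 * sigma ** 2))
    return weights


def propagate_labels(weights, labels, iterations=100):
    """Spread scores from labelled nodes to unlabelled ones.

    labels maps a node index to +1.0 (target) or -1.0 (non-target).
    Labelled nodes are clamped; every other node repeatedly takes the
    weighted average of its neighbours' scores.
    """
    n = len(weights)
    scores = [labels.get(i, 0.0) for i in range(n)]
    for _ in range(iterations):
        updated = []
        for i in range(n):
            if i in labels:
                updated.append(labels[i])  # keep known labels fixed
                continue
            total = sum(weights[i])
            if total == 0.0:
                updated.append(0.0)  # isolated node: no evidence either way
            else:
                updated.append(
                    sum(w * s for w, s in zip(weights[i], scores)) / total
                )
        scores = updated
    return scores


if __name__ == "__main__":
    # Two well-separated groups; one labelled sample in each.
    samples = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 5.0), (4.9, 5.2)]
    graph = gaussian_affinity(samples, sigma=1.0)
    print(propagate_labels(graph, {0: 1.0, 3: -1.0}))
```

The sign of each final score gives the predicted label of an unlabelled sample. In a tracking setting the nodes would be candidate detections in the current frame plus the stored target templates.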

    Active query process for digital video surveillance forensic applications

    Multimedia forensics is an emerging discipline concerned with the analysis and exploitation of digital data in support of investigations, with the aim of extracting probative elements. Among these data, visual data about people and their activities, efficiently extracted from videos, are becoming increasingly appealing for forensics, owing to the availability of large amounts of video-surveillance footage. Thus, many research studies and prototypes investigate the analysis of soft biometric data, such as people's appearance and trajectories. In this work, we propose new solutions for querying and retrieving visual data in an interactive and active fashion for soft biometrics in forensics. The proposal combines transductive learning for semi-supervised similarity search with a typical multimedia methodology based on user-guided relevance feedback, allowing active interaction with visual data on people's appearance and trajectories in large surveillance areas. The proposed approaches are very general and can be used independently of the surveillance setting and the type of video analytics tools.

    Illustrations Segmentation in Digitized Documents Using Local Correlation Features

    In this paper we propose an approach to Document Layout Analysis based on local correlation features. We identify and extract illustrations in digitized documents by learning the discriminative patterns of textual and pictorial regions. The proposal has been shown to be effective on historical datasets and to outperform the state of the art on challenging documents with a large variety of pictorial elements.

    People appearance tracing in video by spectral graph transduction

    Following people across different video sources is a challenging task: variations in the type of camera, the lighting conditions, the scene settings (e.g. crowds or occlusions) and the point of view must be accounted for. In this paper we propose a system based only on appearance information that, disregarding temporal and spatial information, can be flexibly applied to both moving and static cameras. We exploit the joint use of transductive learning and the spectral properties of graph Laplacians, formulating the people tracing problem as semi-supervised classification. The knowledge encoded in two labeled input sets of positive and negative samples of the target person, together with the continuous spectral update of these models, yields a robust approach to people tracing in surveillance video sequences. Experiments on publicly available datasets show satisfactory results and good robustness to short- and long-term occlusions.

    Appearance tracking by transduction in surveillance scenarios

    We propose a formulation of the people tracking problem as a Transductive Learning (TL) problem. TL is an effective semi-supervised learning technique by which many classification problems have recently been reinterpreted as learning labels from incomplete datasets. In our proposal, the joint exploitation of spectral graph theory and Riemannian manifold learning tools leads to a robust approach for appearance-based tracking in video surveillance scenarios. The key advantage of the presented method is a continuously updated model of the tracked target, used in the TL process, which allows the target's visual appearance to be learned online and consequently improves the tracker's accuracy. Experiments on public datasets show an encouraging advancement over alternative state-of-the-art techniques.

    Iterative active querying for surveillance data retrieval in crime detection and forensics

    Large sets of visual data are now available, both in real time and off line, at the time of investigation in multimedia forensics; however, passive querying systems often encounter difficulties in retrieving significant results. In this paper we propose an iterative active querying system for video surveillance and forensic applications based on continuous interaction between the user and the system. Positive and negative user feedback is exploited as the input of a graph-based transductive procedure for iteratively refining the initial query results. Experiments are shown using people trajectories and people appearance as distance metrics.

    Layout analysis and content enrichment of digitized books

    In this paper we describe a system for automatically analyzing old documents and creating hyperlinks between different epochs, thus opening ancient documents to young people and making them available on the web alongside both old and current content. We propose a supervised learning approach to segment the text and illustrations of digitized old documents using a texture feature based on local correlation, aimed at detecting the repeating patterns of text regions and differentiating them from pictorial elements. Moreover, we present a solution that helps the user find contemporary content connected to what is automatically extracted from the ancient documents.

    Going Beyond Counting First Authors in Author Co-citation Analysis

    The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based. We compare ACA results based on two different types of co-citation counting: the traditional type, which counts only the first of a cited work's authors, and a non-traditional type, which takes into account the first five authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
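
The difference between the two counting schemes can be made concrete with a small sketch. The abstract does not give the study's exact counting rules, so the details here are assumptions: each author pair is counted once per citing paper, co-authors of the same cited work count as co-cited with each other, and the function name `cocitation_counts` and its `max_authors` parameter are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count how many citing papers co-cite each pair of authors.

    citing_papers: list of reference lists; each reference is the ordered
    author list of one cited work.
    max_authors: how many leading authors of each cited work to consider
    (1 = first-author counting, 5 = first-five-authors counting).
    """
    counts = Counter()
    for references in citing_papers:
        cited_authors = set()
        for authors in references:
            cited_authors.update(authors[:max_authors])
        # Each unordered author pair is counted once per citing paper.
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts


if __name__ == "__main__":
    papers = [
        [["A", "B"], ["C"]],   # citing paper 1 cites works by A+B and by C
        [["A"], ["C", "D"]],   # citing paper 2 cites works by A and by C+D
    ]
    print(cocitation_counts(papers, max_authors=1))
    print(cocitation_counts(papers, max_authors=5))
```

With first-author counting only the pair (A, C) appears in this toy data, while counting up to five authors also brings in B and D, which shows how the second scheme changes the set of author pairs available for mapping.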