Identifying Anomalies in Driver Attention and Human Behavior (Identificazione di anomalie nell’attenzione del guidatore e nel comportamento delle persone)

Author: ABATI DAVIDE
Publication date: 09/03/2020

As the world grows more connected and digitized by the day, with sensors and computing devices becoming ever more pervasive, new opportunities emerge for artificial intelligence. In particular, public monitoring stands out as a critical theme, and computer vision has the potential to become the leading technology in building a safer world. In this thesis, we present solutions for public safety in two different application areas. First, we consider vehicle-based safety by developing a system capable of predicting which elements of the surrounding scene a driver attends to. Such prediction has vast potential to improve driving safety. Nevertheless, it is very complex, since driving a car is a complicated task and is highly subjective in terms of attention.
To handle attention prediction, we collect and release DR(eye)VE, a dataset consisting of driver-centric and car-centric clips, annotated with the driver's fixation points on the outer urban scene. Next, we inspect these data in depth to establish which factors most influence a driver's gaze, in terms of both motion and semantics. Guided by this evidence, we finally develop a deep neural network that, given a car-centric urban scene, identifies which regions are likely to capture the driver's attention. Second, we address surveillance-based safety by introducing an anomaly detection model capable of learning the traits that characterize normal (safe) situations and, therefore, of raising an alert whenever unexpected events appear. Learning such models without examples of abnormal conditions is the aim of anomaly detection (a.k.a. novelty detection) research. Despite its importance and a plethora of prior work, the unpredictable nature of anomalous events and their inaccessibility during training severely degrade the effectiveness of existing systems. In this context, we propose a general model consisting of a deep autoencoder equipped with a parametric density estimator, which learns the distribution of its latent representations through an autoregressive procedure. We show that a maximum likelihood objective in latent space regularizes the autoencoder's reconstruction objective and minimizes the differential entropy of the distribution spanned by latent vectors. Intuitively, this joint optimization forces the model to describe (and reconstruct) each example in terms of features that frequently appear in the training set (and are therefore more representative of normality). Extensive experiments and comparisons with the state of the art show the effectiveness of both our proposals.

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Self-Supervised Optical Flow Estimation by Projective Bootstrap

Author: Alletto Stefano
Abati Davide
Cucchiara Rita
Calderara Simone
Rigazio Luca
Publication date: 01/01/2019

Dense optical flow estimation is complex and time-consuming, with state-of-the-art methods relying either on large synthetic data sets or on pipelines requiring up to a few minutes per frame pair. In this paper, we address the problem of optical flow estimation in the automotive scenario in a self-supervised manner. We argue that optical flow can be cast as a geometrical warping between two successive video frames and devise a deep architecture to estimate this transformation in two stages. First, a dense pixel-level flow is computed with a projective bootstrap on rigid surfaces. We show how this global transformation can be approximated with a homography and extend spatial transformer layers so that they can be employed to compute the flow field implied by the transformation. Subsequently, we refine the prediction by feeding a second, deeper network that accounts for moving objects. A final reconstruction loss compares the warping of frame Xt with the subsequent frame Xt+1 and guides both estimates. The model has the speed advantages of end-to-end deep architectures while achieving competitive performance, both outperforming recent unsupervised methods and showing good generalization capabilities on new automotive data sets.
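
The abstract's notion of a "flow field implied by" a homography can be illustrated with a minimal sketch. This is not the paper's spatial transformer layer; it only shows the underlying geometry in plain Python. The function name `homography_flow` and the nested-list representation of the 3x3 matrix `H` are assumptions made for illustration.

```python
def homography_flow(H, height, width):
    """Return flow[y][x] = (dx, dy), the displacement of pixel (x, y) under H.

    H is a 3x3 homography given as nested lists (an illustrative choice).
    """
    flow = []
    for y in range(height):
        row = []
        for x in range(width):
            # Map the homogeneous pixel (x, y, 1) through H.
            u = H[0][0] * x + H[0][1] * y + H[0][2]
            v = H[1][0] * x + H[1][1] * y + H[1][2]
            w = H[2][0] * x + H[2][1] * y + H[2][2]
            # The perspective divide gives the warped position;
            # subtracting the source position gives the flow vector.
            row.append((u / w - x, v / w - y))
        flow.append(row)
    return flow


# A pure translation by (2, -3) yields the same flow vector at every pixel.
translation = [[1.0, 0.0, 2.0], [0.0, 1.0, -3.0], [0.0, 0.0, 1.0]]
field = homography_flow(translation, height=2, width=3)
```

Because a single 3x3 matrix determines the displacement of every pixel, such a field can only describe a globally rigid, roughly planar scene, which is consistent with the abstract's use of a second network to account for moving objects.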

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Latent Space Autoregression for Novelty Detection

Author: Rita Cucchiara
Simone Calderara
Angelo Porrello
Davide Abati
Publication date: 01/01/2019

Novelty detection is commonly referred to as the discrimination of observations that do not conform to a learned model of regularity. Despite its importance in different application settings, designing a novelty detector is very complex because of the unpredictable nature of novelties and their inaccessibility during the training procedure, factors which expose the unsupervised nature of the problem. In our proposal, we design a general framework in which we equip a deep autoencoder with a parametric density estimator that learns the probability distribution underlying its latent representations through an autoregressive procedure. We show that a maximum likelihood objective, optimized in conjunction with the reconstruction of normal samples, effectively acts as a regularizer for the task at hand, by minimizing the differential entropy of the distribution spanned by latent vectors. In addition to providing a very general formulation, extensive experiments on publicly available datasets show that our model delivers on-par or superior performance compared with state-of-the-art methods in one-class and video anomaly detection settings. Unlike prior works, our proposal does not make any assumption about the nature of the novelties, making our work readily applicable to diverse contexts.
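
The autoregressive estimation and joint objective described in the abstract can be written out as a short sketch. The symbols are assumptions introduced here for illustration and do not come from the listing: $\mathbf{x}$ is an input, $\hat{\mathbf{x}}$ its reconstruction, $\mathbf{z} \in \mathbb{R}^d$ its latent vector, and $\lambda$ a weighting factor. The squared-error form of the reconstruction term is likewise an assumption.

```latex
% Autoregressive factorization of the latent density (chain rule of probability)
p(\mathbf{z}) = \prod_{i=1}^{d} p(z_i \mid z_1, \dots, z_{i-1})

% Assumed joint objective: reconstruction error plus weighted negative log-likelihood
\mathcal{L} = \lVert \mathbf{x} - \hat{\mathbf{x}} \rVert_2^2
  - \lambda \sum_{i=1}^{d} \log p(z_i \mid z_1, \dots, z_{i-1})
```

The expected value of the second term is a cross-entropy, which upper-bounds the differential entropy of the latent distribution. This is one way to read the abstract's statement that the likelihood objective minimizes that entropy.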

Crossref

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Exploring Architectural Details Through a Wearable Egocentric Vision Device

Author: Giuseppe Serra
Rita Cucchiara
Stefano Alletto
Davide Abati
Publication date: 01/01/2016

Augmented user experiences in the cultural heritage domain are in increasing demand by the new digital native tourists of the 21st century. In this paper, we propose a novel solution that aims at assisting the visitor during an outdoor tour of a cultural site using the unique first-person perspective of wearable cameras. In particular, the approach exploits computer vision techniques to retrieve the details by proposing a robust descriptor based on the covariance of local features. Using a lightweight wearable board, the solution can localize the user with respect to the 3D point cloud of the historical landmark and provide them with information about the details they are currently looking at. Experimental results validate the method both in terms of accuracy and computational effort. Furthermore, user evaluation based on real-world experiments shows that the proposal is deemed effective in enriching a cultural experience.

Multidisciplinary Digital Publishing Institute

Crossref

Archivio istituzionale della ricerca - Università degli Studi di Udine

Directory of Open Access Journals

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Learning to Map Vehicles into Bird's Eye View

Author: Rita Cucchiara
Simone Calderara
Guido Borghi
Andrea Palazzi
Davide Abati
Publication date: 01/01/2017

Awareness of the road scene is an essential component for both autonomous vehicles and Advanced Driver Assistance Systems and is gaining importance both for academia and car companies. This paper presents a way to learn a semantic-aware transformation which maps detections from a dashboard camera view onto a broader bird's eye occupancy map of the scene. To this end, a huge synthetic dataset featuring 1M couples of frames, taken from both car dashboard and bird's eye view, has been collected and automatically annotated. A deep network is then trained to warp detections from the first to the second view. We demonstrate the effectiveness of our model against several baselines and observe that it is able to generalize on real-world data despite having been trained solely on synthetic ones.

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Predicting the Driver's Focus of Attention: the DR(eye)VE Project

Author: Rita Cucchiara
Davide Abati
Andrea Palazzi
Francesco Solera
Simone Calderara
Publication date: 01/01/2019

(Submitted on 10 May 2017 (v1), last revised 6 Jun 2018 (v3).) In this work we aim to predict the driver's focus of attention. The goal is to estimate what a person would pay attention to while driving, and which part of the scene around the vehicle is more critical for the task. To this end we propose a new computer vision model based on a multi-branch deep architecture that integrates three sources of information: raw video, motion and scene semantics. We also introduce DR(eye)VE, the largest dataset of driving scenes for which eye-tracking annotations are available. This dataset features more than 500,000 registered frames, matching ego-centric views (from glasses worn by drivers) and car-centric views (from a roof-mounted camera), further enriched by measurements from other sensors. Results highlight that several attention patterns are shared across drivers and can be reproduced to some extent. The indication of which elements in the scene are likely to capture the driver's attention may benefit several applications in the context of human-vehicle interaction and driver attention analysis.

Crossref

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Conditional Channel Gated Networks for Task-Aware Continual Learning

Author: Bejnordi Babak Ehteshami
Abati Davide
Cucchiara Rita
Blankevoort Tijmen
Calderara Simone
Tomczak Jakub
Publication date: 01/01/2020

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Author Instructions

Author: Instructions Author
Publication date: 04/11/2013

Crossref

Cartographic Perspectives (E-Journal - North American Cartographic Information Society, NACIS)

Going Beyond Counting First Authors in Author Co-citation Analysis

Author: Zhao Dangzhi
Publication date: 01/01/2005

The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based. We compare ACA results based on two types of co-citation counting: the traditional type, which counts only the first of a cited work's authors, and a non-traditional type, which takes into account the first 5 authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
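
The two counting schemes contrasted in the abstract can be sketched as follows. This is an illustrative simplification, not the study's procedure: the function name `cocitation_counts`, the input layout, and the author names are hypothetical. It also assumes that two authors are co-cited once per citing paper whenever both appear among the counted authors of that paper's references, which means co-authors of the same cited work are paired as well.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count author pairs that are cited together by the same paper.

    citing_papers: list of reference lists; each reference is an ordered
    list of author names. max_authors=1 is first-author counting;
    max_authors=5 counts the first five authors of each cited work.
    """
    counts = Counter()
    for cited_works in citing_papers:
        authors = set()
        for work_authors in cited_works:
            authors.update(work_authors[:max_authors])
        # Each unordered author pair is counted once per citing paper.
        for pair in combinations(sorted(authors), 2):
            counts[pair] += 1
    return counts


# Hypothetical data: two citing papers, each citing two works.
papers = [
    [["Smith", "Lee"], ["Jones"]],
    [["Smith", "Lee"], ["Jones", "Kim"]],
]
first_author_only = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

With first-author counting only the Jones and Smith pair is recorded, whereas counting up to five authors also brings in Lee and Kim, so more authors and more pairs enter the analysis from the same citation data.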

E-LIS

Variations on the Author

Author: Sayad Cecilia
Publication date: 01/01/2016

“Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourse; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.

Crossref

Kent Academic Repository