Exploiting and Transferring Prior Knowledge in Deep Learning Architectures (original title: Sfruttare e Trasferire conoscenza a priori nelle Architetture di Deep Learning)

Author: Porrello Angelo
Publication date: 2022

In the last decade, Deep Learning has emerged as a hot topic and a disruptive tool in Machine Learning and Computer Vision. It builds upon a learning paradigm in which data (e.g., videos acquired by surveillance cameras placed on a public road) play a crucial role. By leveraging a large number of data points, it is possible to learn complex, human-like tasks (e.g., recognizing abnormal actions in a video stream) with impressive results. However, while data availability is the source of the greatest strength of Deep Learning techniques, it also reveals their greatest weakness: the development of applications and services is often held back by this requirement, as acquiring and maintaining a huge amount of data are expensive activities that require expert staff and suitable equipment.
Nevertheless, the design of modern Deep Learning architectures offers several degrees of freedom that can be exploited to mitigate the lack of training data, whether partial or complete. The underlying idea is to compensate for it by incorporating the prior knowledge that humans (specifically, those who control and guide the learning process) hold about the domain at hand. Indeed, intrinsic rules and properties extend far beyond the training data and can often be identified and imposed on the learner. In image classification, the success of Convolutional Neural Networks (CNNs) over past solutions (such as Multi-Layered Neural Networks) can be mainly ascribed to this practice: the design principle of their fundamental building block (i.e., the convolution between two 2D signals) naturally reflects what was already known about natural images. In this regard, the correlations that exist between neighboring regions of an image provided a powerful insight for the development of efficient and effective models, as CNNs still prove to be.
The ultimate aim of this thesis is the investigation and proposal of novel ways of modeling and injecting prior knowledge into Deep Learning architectures. Importantly, the discussion is cross-cutting: it covers several data domains (e.g., images, videos, graph-structured data) and involves different levels of the overall training pipeline. On this latter point, we guide the reader through this research by means of the following threefold categorization: i) parameter-based approaches, which limit the space of feasible solutions to those regions reflecting geometrical properties of the data; ii) goal-driven approaches, which guide the learning process towards solutions that embody some advantageous properties; iii) data-driven approaches, which exploit data to extract the knowledge to be used to condition the training algorithm. Along with a comprehensive description of both the settings and the tools involved, we present extensive experimental results and ablation studies that demonstrate the value of the techniques proposed in this research.

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

ClusterFix: A Cluster-Based Debiasing Approach without Protected-Group Supervision

Author: Capitani Giacomo
Calderara Simone
Ficarra Elisa
Bolelli Federico
Porrello Angelo
Publication date: 01/01/2024

The failures of Deep Networks can sometimes be ascribed to biases in the data or in algorithmic choices. Existing debiasing approaches exploit prior knowledge to avoid unintended solutions; we acknowledge that, in real-world settings, it could be infeasible to gather enough prior information to characterize the bias, or doing so could even raise ethical concerns. We hence propose a novel debiasing approach, termed ClusterFix, which does not require any external hint about the nature of biases. The approach alters standard empirical risk minimization by introducing a per-example weight that encodes how critical and how far from the majority an example is. Notably, the weights consider how difficult it is for the model to infer the correct pseudo-label, which is obtained in a self-supervised manner by dividing examples into multiple clusters. Extensive experiments show that the misclassification error incurred in identifying the correct cluster makes it possible to identify examples prone to bias-related issues. As a result, our approach outperforms existing methods on standard benchmarks for bias removal and fairness.
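As a rough illustration of the per-example weighting described in the abstract above, the sketch below computes a weighted empirical risk in which examples whose cluster pseudo-label is harder to predict receive a larger weight. The function names and the exact weighting rule (difficulty normalized to a mean of 1) are assumptions made for this example; the abstract does not state the formula ClusterFix actually uses.

```python
import math


def cross_entropy(probs, label):
    """Negative log-probability assigned to the given label."""
    return -math.log(max(probs[label], 1e-12))


def example_weights(cluster_probs, pseudo_labels):
    """Hypothetical rule: weight grows with the difficulty of predicting
    the cluster pseudo-label; weights are rescaled to have mean 1."""
    difficulty = [cross_entropy(p, c) for p, c in zip(cluster_probs, pseudo_labels)]
    total = sum(difficulty)
    n = len(difficulty)
    if total == 0.0:
        return [1.0] * n  # every cluster perfectly predicted: fall back to plain ERM
    return [n * d / total for d in difficulty]


def weighted_risk(class_probs, labels, weights):
    """Weighted empirical risk: per-example losses averaged with the given weights."""
    losses = [cross_entropy(p, y) for p, y in zip(class_probs, labels)]
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)


# Toy usage: three examples, two clusters, two classes.
cluster_probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]  # predicted cluster distribution
pseudo_labels = [0, 0, 1]                              # cluster assignments
class_probs = [[0.8, 0.2], [0.3, 0.7], [0.5, 0.5]]     # predicted class distribution
labels = [0, 0, 1]
weights = example_weights(cluster_probs, pseudo_labels)
risk = weighted_risk(class_probs, labels, weights)
```

With uniform weights the function reduces to the usual mean loss, which makes the departure from standard empirical risk minimization easy to isolate.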

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Continual Semi-Supervised Learning through Contrastive Interpolation Consistency

Author: Buzzega Pietro
Porrello Angelo
Boschini Matteo
Calderara Simone
Bonicelli Lorenzo
Publication date: 01/01/2022

Continual Learning (CL) investigates how to train Deep Networks on a stream of tasks without incurring forgetting. CL settings proposed in the literature assume that every incoming example is paired with ground-truth annotations. However, this clashes with many real-world applications: gathering labeled data, which is in itself tedious and expensive, becomes infeasible when data flow as a stream. This work explores Continual Semi-Supervised Learning (CSSL): here, only a small fraction of labeled input examples are shown to the learner. We assess how current CL methods (e.g., EWC, LwF, iCaRL, ER, GDumb, DER) perform in this novel and challenging scenario, where overfitting is entangled with forgetting. Subsequently, we design a novel CSSL method that exploits metric learning and consistency regularization to leverage unlabeled examples while learning. We show that our proposal exhibits higher resilience to diminishing supervision and, even more surprisingly, that relying on only 25% supervision suffices to outperform SOTA methods trained under full supervision.
Comment: 7 pages, 2 figures, to appear in Pattern Recognition Letters, Volume 162, October 2022, Pages 9-1

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Class-Incremental Continual Learning into the eXtended DER-verse

Author: Buzzega Pietro
Porrello Angelo
Boschini Matteo
Calderara Simone
Bonicelli Lorenzo
Publication date: 19/09/2022

The staple of human intelligence is the capability of acquiring knowledge in a continuous fashion. In stark contrast, Deep Networks forget catastrophically and, for this reason, the sub-field of Class-Incremental Continual Learning fosters methods that learn a sequence of tasks incrementally, blending sequentially gained knowledge into a comprehensive prediction. This work aims at assessing and overcoming the pitfalls of our previous proposal, Dark Experience Replay (DER), a simple and effective approach that combines rehearsal and Knowledge Distillation. Inspired by the way our minds constantly rewrite past recollections and set expectations for the future, we endow our model with the abilities to i) revise its replay memory to welcome novel information regarding past data and ii) pave the way for learning yet unseen classes. We show that the application of these strategies leads to remarkable improvements; indeed, the resulting method, termed eXtended-DER (X-DER), outperforms the state of the art on both standard benchmarks (such as CIFAR-100 and miniImagenet) and a novel one introduced here. To gain a better understanding, we further provide extensive ablation studies that corroborate and extend the findings of our previous research (e.g., the value of Knowledge Distillation and flatter minima in continual learning setups).
Comment: 23 pages, 22 figures. To appear in IEEE TPAM
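The combination of rehearsal and Knowledge Distillation mentioned in the abstract above can be sketched as a two-term loss: a classification term on the incoming example plus a penalty that keeps the network's current logits on a replayed example close to the logits stored when that example entered the memory buffer. This is a minimal sketch under assumptions: the squared-error form of the distillation term, the `alpha` coefficient, and all function names are chosen for illustration, and the two X-DER additions (memory revision and preparation for unseen classes) are not modeled.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def cross_entropy(logits, label):
    """Classification loss on an example from the current task stream."""
    return -math.log(softmax(logits)[label])


def logit_distillation(current_logits, stored_logits):
    """Mean squared difference between current and stored logits (assumed form)."""
    diffs = [(a - b) ** 2 for a, b in zip(current_logits, stored_logits)]
    return sum(diffs) / len(diffs)


def replay_distillation_loss(stream_logits, stream_label,
                             buffer_logits, stored_logits, alpha=0.5):
    """Stream classification loss plus alpha-weighted distillation on a replayed example."""
    return (cross_entropy(stream_logits, stream_label)
            + alpha * logit_distillation(buffer_logits, stored_logits))


# Toy usage: the network's response on the replayed example has drifted
# from what was recorded in the buffer, so the second term is non-zero.
loss = replay_distillation_loss(
    stream_logits=[2.0, 0.5, -1.0], stream_label=0,
    buffer_logits=[0.1, 1.2, 0.3], stored_logits=[0.0, 1.5, 0.2],
)
```

When the current logits on the replayed example match the stored ones exactly, the loss reduces to the plain classification term.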

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Latent Space Autoregression for Novelty Detection

Author: Rita Cucchiara
Simone Calderara
Angelo Porrello
Davide Abati
Publication date: 01/01/2019

Novelty detection is commonly referred to as the discrimination of observations that do not conform to a learned model of regularity. Despite its importance in different application settings, designing a novelty detector is highly complex due to the unpredictable nature of novelties and their inaccessibility during the training procedure, factors which expose the unsupervised nature of the problem. In our proposal, we design a general framework where we equip a deep autoencoder with a parametric density estimator that learns the probability distribution underlying its latent representations through an autoregressive procedure. We show that a maximum likelihood objective, optimized in conjunction with the reconstruction of normal samples, effectively acts as a regularizer for the task at hand, by minimizing the differential entropy of the distribution spanned by latent vectors. In addition to providing a very general formulation, extensive experiments with our model on publicly available datasets deliver on-par or superior performance compared to state-of-the-art methods in one-class and video anomaly detection settings. Unlike prior works, our proposal does not make any assumption about the nature of the novelties, making our work readily applicable to diverse contexts.
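To make the autoregressive density idea in the abstract above concrete, the sketch below factorizes the likelihood of a latent vector as a product of conditionals, p(z) = p(z_1) p(z_2 | z_1) ..., and adds the resulting negative log-likelihood to a reconstruction error to obtain a novelty score. Everything specific here is an assumption for illustration: the paper learns its conditionals with a parametric estimator, whereas this toy uses a fixed unit-variance Gaussian centered on the mean of the preceding coordinates, and summing the two terms without normalization is a simplification.

```python
import math


def gaussian_log_conditional(value, previous):
    """Toy stand-in for a learned conditional: log N(value; mean(previous), 1)."""
    mean = sum(previous) / len(previous) if previous else 0.0
    return -0.5 * (value - mean) ** 2 - 0.5 * math.log(2.0 * math.pi)


def autoregressive_nll(z, log_conditional):
    """-log p(z) under the chain-rule factorization over latent coordinates."""
    return -sum(log_conditional(z[i], z[:i]) for i in range(len(z)))


def reconstruction_error(x, x_reconstructed):
    """Mean squared error between an input and its autoencoder reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_reconstructed)) / len(x)


def novelty_score(x, x_reconstructed, z, log_conditional):
    """Higher means more novel: poor reconstruction and/or unlikely latent code."""
    return reconstruction_error(x, x_reconstructed) + autoregressive_nll(z, log_conditional)


# Toy usage: a well-reconstructed sample with a typical latent code
# versus a poorly reconstructed one with an atypical latent code.
normal = novelty_score([1.0, 2.0], [1.0, 2.1], [0.0, 0.1, 0.0], gaussian_log_conditional)
novel = novelty_score([1.0, 2.0], [0.0, 4.0], [3.0, -3.0, 3.0], gaussian_log_conditional)
```

In this toy setup the second sample receives the higher score because both of its terms are larger.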

Crossref

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Towards Unbiased Continual Learning: Avoiding Forgetting in the Presence of Spurious Correlations

Author: Capitani Giacomo
Calderara Simone
Ficarra Elisa
Bolelli Federico
Porrello Angelo
Bonicelli Lorenzo
Publication date: 01/01/2025

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Multi-views Embedding for Cattle Re-identification

Author: Andrea Capobianco Dondona
Nicola D’Alterio
Luca Bergamini
Simone Calderara
Angelo Porrello
Ercole Del Negro
Mauro Mattioli
Publication date: 01/01/2018

The task of people re-identification has seen enormous improvements in recent years, mainly due to better image feature extraction from deep Convolutional Neural Networks (CNNs) and the availability of large datasets. However, little research has been conducted on animal identification and re-identification, even though this knowledge may be useful in a rich variety of scenarios. Here, we tackle cattle re-identification by exploiting deep CNNs and show how this task is poorly related to the human one, presenting unique challenges that make it far from being solved. We present various baselines, based either on deep architectures or on standard machine learning algorithms, and compare them with our solution. Finally, a rich ablation study is conducted to further investigate the peculiarities of this task.

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Context-guided Prompt Learning for Continual WSI Classification

Author: Corso Giulia
Miccolis Francesca
Bolelli Federico
Ficarra Elisa
Porrello Angelo
Calderara Simone
Publication date: 01/01/2025

Whole Slide Images (WSIs) are crucial in histological diagnostics, providing high-resolution insights into cellular structures. In addition to challenges like the gigapixel scale of WSIs and the lack of pixel-level annotations, privacy restrictions further complicate their analysis. For instance, in a hospital network, different facilities need to collaborate on WSI analysis without the possibility of sharing sensitive patient data. A more practical and secure approach involves sharing models capable of continual adaptation to new data. However, without proper measures, catastrophic forgetting can occur. Traditional continual learning techniques rely on storing previous data, which violates privacy restrictions. To address this issue, this paper introduces Context Optimization Multiple Instance Learning (CooMIL), a rehearsal-free continual learning framework explicitly designed for WSI analysis. It employs a WSI-specific prompt learning procedure to adapt classification models across tasks, efficiently preventing catastrophic forgetting. Evaluated on four public WSI datasets from TCGA projects, our model significantly outperforms state-of-the-art methods within the WSI-based continual learning framework. The source code is available at https://github.com/FrancescaMiccolis/CooMIL

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

U-Net Transplant: The Role of Pre-training for Model Merging in 3D Medical Segmentation

Author: Capitani Giacomo
Grana Costantino
Ficarra Elisa
Bolelli Federico
Lumetti Luca
Porrello Angelo
Calderara Simone
Publication date: 01/01/2025

Despite their remarkable success in medical image segmentation, the life cycle of deep neural networks remains a challenge in clinical applications. These models must be regularly updated to integrate new medical data and customized to meet evolving diagnostic standards, regulatory requirements, commercial needs, and privacy constraints. Model merging offers a promising solution, as it allows working with multiple specialized networks that can be created and combined dynamically instead of relying on monolithic models. While extensively studied in standard 2D classification, the potential of model merging for 3D segmentation remains unexplored. This paper presents an efficient framework that allows effective model merging in the domain of 3D image segmentation. Our approach builds upon theoretical analysis and encourages wide minima during pre-training, which we demonstrate to facilitate subsequent model merging. Using U-Net 3D, we evaluate the method on distinct anatomical structures with the ToothFairy2 and BTCV Abdomen datasets. To support further research, we release the source code and all the model weights in a dedicated repository: https://github.com/LucaLumetti/UNetTransplan
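For readers unfamiliar with model merging, the sketch below shows one common merging rule: each specialized network is expressed as an offset from a shared pre-trained checkpoint, and the offsets are added back onto the pre-trained weights. This is an illustrative assumption, not a description of the paper's procedure: the abstract does not say which merging rule is used, and the parameter layout (a dictionary mapping names to flat lists of floats), the `scale` factor, and the function names are all hypothetical.

```python
def parameter_offset(pretrained, finetuned):
    """Per-parameter difference between a specialized model and the shared checkpoint."""
    return {name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
            for name in pretrained}


def merge_models(pretrained, finetuned_models, scale=1.0):
    """Add the scaled offsets of every specialized model onto the pre-trained weights."""
    merged = {name: list(values) for name, values in pretrained.items()}
    for model in finetuned_models:
        offset = parameter_offset(pretrained, model)
        for name in merged:
            merged[name] = [m + scale * d for m, d in zip(merged[name], offset[name])]
    return merged


# Toy usage: two specialized models that each changed a different weight.
pretrained = {"w": [1.0, 2.0]}
model_a = {"w": [1.5, 2.0]}  # hypothetical model specialized on one structure
model_b = {"w": [1.0, 1.0]}  # hypothetical model specialized on another structure
merged = merge_models(pretrained, [model_a, model_b])
```

Because the two toy models modify different weights, the merged model keeps both changes; when offsets overlap, they interfere with each other, which is the situation the abstract's emphasis on wide minima during pre-training is meant to ease.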

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Monocular per-object distance estimation with Masked Object Modeling

Author: Calderara Simone
Haj Ali Fedy
Cucchiara Rita
Mancusi Gianluca
Panariello Aniello
Porrello Angelo
Publication date: 01/01/2025

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia