Contrast, Stylize and Adapt: Unsupervised Contrastive Learning Framework for Domain Adaptive Semantic Segmentation

Author: Roy Subhankar
Publication date: 01/01/2023

To overcome the domain gap between synthetic and real-world datasets, unsupervised domain adaptation methods have been proposed for semantic segmentation. The majority of previous approaches have attempted to reduce the gap at either the pixel level or the feature level, disregarding the fact that the two components interact positively. To address this, we present CONtrastive FEaTure and pIxel alignment (CON-FETI) for bridging the domain gap at both the pixel and feature levels using a unique contrastive formulation. We introduce well-estimated prototypes by including category-wise cross-domain information to link the two alignments: the pixel-level alignment is achieved using the jointly trained style transfer module with the prototypical semantic consistency, while the feature-level alignment is enforced on cross-domain features with the pixel-to-prototype contrast. Our extensive experiments demonstrate that our method outperforms existing state-of-the-art methods using DeepLabV2. Our code has been made publicly available.

Aisberg (Università degli Studi di Bergamo)

Collaborating Foundation Models for Domain Generalized Semantic Segmentation

Author: Roy Subhankar
Publication date: 01/01/2024

Domain Generalized Semantic Segmentation (DGSS) deals with training a model on a labeled source domain with the aim of generalizing to unseen domains during inference. Existing DGSS methods typically effectuate robust features by means of Domain Randomization (DR). Such an approach is often limited as it can only account for style diversification and not content. In this work, we take an orthogonal approach to DGSS and propose to use an assembly of CoLlaborative FOUndation models for Domain Generalized Semantic Segmentation (CLOUDS). In detail, CLOUDS is a framework that integrates Foundation Models of various kinds: (i) CLIP backbone for its robust feature representation, (ii) Diffusion Model to diversify the content, thereby covering various modes of the possible target distribution, and (iii) Segment Anything Model (SAM) for iteratively refining the predictions of the segmentation model. Extensive experiments show that our CLOUDS excels in adapting from synthetic to real DGSS benchmarks and under varying weather conditions, notably outperforming prior methods by 5.6% and 6.7% on averaged mIoU, respectively. Our code is available at https://github.com/yasserben/CLOUD

Aisberg (Università degli Studi di Bergamo)

Metric-Learning-Based Deep Hashing Network for Content-Based Retrieval of Remote Sensing Images

Author: Roy Subhankar
Publication date: 01/01/2021

Hashing methods have recently been shown to be very effective in the retrieval of remote sensing (RS) images due to their computational efficiency and fast search speed. Common hashing methods in RS are based on hand-crafted features on top of which they learn a hash function, which provides the final binary codes. However, these features are not optimized for the final task (i.e., retrieval using binary codes). On the other hand, modern deep neural networks (DNNs) have shown impressive success in learning optimized features for a specific task in an end-to-end fashion. Unfortunately, typical RS data sets are composed of only a small number of labeled samples, which makes the training (or fine-tuning) of big DNNs problematic and prone to overfitting. To address this problem, in this letter, we introduce a metric-learning-based hashing network, which: 1) implicitly uses a big, pretrained DNN as an intermediate representation step without the need for retraining or fine-tuning; 2) learns a semantic-based metric space where the features are optimized for the target retrieval task; and 3) computes compact binary hash codes for fast search. Experiments carried out on two RS benchmarks highlight that the proposed network significantly improves the retrieval performance under the same retrieval time when compared to the state-of-the-art hashing methods in RS.

Aisberg (Università degli Studi di Bergamo)
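
The "compact binary hash codes for fast search" step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the network from the letter: it assumes the embeddings have already been produced by some learned metric space, binarizes them with a simple zero threshold, and ranks database items by Hamming distance. The function names `binarize` and `hamming_search` and the toy embeddings are hypothetical.

```python
import numpy as np

def binarize(embeddings):
    """Turn real-valued embeddings into 0/1 hash codes by thresholding at zero.
    Assumption: a plain zero threshold stands in for the learned hashing layer."""
    return (np.asarray(embeddings) > 0).astype(np.uint8)

def hamming_search(query_code, database_codes, top_k=3):
    """Return indices and distances of the top_k codes nearest in Hamming distance."""
    # Count the bit positions where each database code differs from the query.
    distances = np.count_nonzero(database_codes != query_code, axis=1)
    # A stable sort keeps the original order among equally distant items.
    order = np.argsort(distances, kind="stable")[:top_k]
    return order, distances[order]

# Toy embeddings (hypothetical values, 4 dimensions).
database = np.array([[0.9, -0.2, 0.4, -0.7],
                     [-0.5, 0.3, -0.1, 0.8],
                     [0.7, -0.6, 0.2, 0.1]])
query = np.array([0.8, -0.1, 0.5, -0.3])

db_codes = binarize(database)
q_code = binarize(query)
idx, dist = hamming_search(q_code, db_codes, top_k=2)
print(idx, dist)
```

Because the codes are short bit vectors, comparing a query against the whole database reduces to counting differing bits, which is what makes the search fast relative to comparing real-valued features.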

Cooperative Self-Training for Multi-Target Adaptive Semantic Segmentation

Author: Roy Subhankar
Publication date: 01/01/2023

In this work we address multi-target domain adaptation (MTDA) in semantic segmentation, which consists in adapting a single model from an annotated source dataset to multiple unannotated target datasets that differ in their underlying data distributions. To address MTDA, we propose a self-training strategy that employs pseudo-labels to induce cooperation among multiple domain-specific classifiers. We employ feature stylization as an efficient way to generate image views that forms an integral part of self-training. Additionally, to prevent the network from overfitting to noisy pseudo-labels, we devise a rectification strategy that leverages the predictions from different classifiers to estimate the quality of pseudo-labels. Our extensive experiments on numerous settings, based on four different semantic segmentation datasets, validate the effectiveness of the proposed self-training strategy and show that our method outperforms state-of-the-art MTDA approaches. Code available at: https://github.com/Mael-zys/CoaST

Aisberg (Università degli Studi di Bergamo)

RaSP: Relation-aware Semantic Prior for Weakly Supervised Incremental Segmentation

Author: Roy Subhankar
Publication date: 01/01/2023

Class-incremental semantic image segmentation assumes multiple model updates, each enriching the model to segment new categories. This is typically carried out by providing expensive pixel-level annotations to the training algorithm for all new objects, limiting the adoption of such methods in practical applications. Approaches that solely require image-level labels offer an attractive alternative, yet such coarse annotations lack precise information about the location and boundary of the new objects. In this paper we argue that, since classes represent not just indices but semantic entities, the conceptual relationships between them can provide valuable information that should be leveraged. We propose a weakly supervised approach that exploits such semantic relations to transfer objectness prior from the previously learned classes into the new ones, complementing the supervisory signal from image-level labels. We validate our approach on a number of continual learning tasks, and show how even a simple pairwise interaction between classes can significantly improve the segmentation mask quality of both old and new classes. We show these conclusions still hold for longer and, hence, more realistic sequences of tasks and for a challenging few-shot scenario.

Aisberg (Università degli Studi di Bergamo)

Unlearning Personal Data from a Single Image

Author: Roy Subhankar
Publication date: 01/01/2025

Aisberg (Università degli Studi di Bergamo)

Metric-Learning-Based Deep Hashing Network for Content-Based Retrieval of Remote Sensing Images

Author: Sebe Nicu
Demir Begum
Roy Subhankar
Sangineto Enver
Publication date: 01/01/2021

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Cooperative Self-Training for Multi-Target Adaptive Semantic Segmentation

Author: Lathuilière Stéphane
Zhang Yangsong
Lu Hongtao
Ricci Elisa
Roy Subhankar
Publication date: 04/10/2022

In this work we address multi-target domain adaptation (MTDA) in semantic segmentation, which consists in adapting a single model from an annotated source dataset to multiple unannotated target datasets that differ in their underlying data distributions. To address MTDA, we propose a self-training strategy that employs pseudo-labels to induce cooperation among multiple domain-specific classifiers. We employ feature stylization as an efficient way to generate image views that forms an integral part of self-training. Additionally, to prevent the network from overfitting to noisy pseudo-labels, we devise a rectification strategy that leverages the predictions from different classifiers to estimate the quality of pseudo-labels. Our extensive experiments on numerous settings, based on four different semantic segmentation datasets, validate the effectiveness of the proposed self-training strategy and show that our method outperforms state-of-the-art MTDA approaches. Code available at: https://github.com/Mael-zys/CoaST Comment: Accepted at WACV 202

arXiv.org e-Print Archive

Archivio della ricerca - Fondazione Bruno Kessler

AutoLabel: CLIP-based framework for Open-Set Video Domain Adaptation

Author: Zara Giacomo
Ricci Elisa
Roy Subhankar
Rota Paolo
Publication date: 01/01/2023

Open-set Unsupervised Video Domain Adaptation (OUVDA) deals with the task of adapting an action recognition model from a labelled source domain to an unlabelled target domain that contains “target-private” categories, which are present in the target but absent in the source. In this work we deviate from the prior work of training a specialized open-set classifier or weighted adversarial learning by proposing to use pre-trained Language and Vision Models (CLIP). CLIP is well suited for OUVDA due to its rich representation and zero-shot recognition capabilities. However, rejecting target-private instances with CLIP's zero-shot protocol requires oracle knowledge about the target-private label names. Since these label names cannot be known in advance, we propose AutoLabel, which automatically discovers and generates object-centric compositional candidate target-private class names. Despite its simplicity, we show that CLIP, when equipped with AutoLabel, can satisfactorily reject the target-private instances, thereby facilitating better alignment between the shared classes of the two domains. The code is available at https://github.com/gzaraunitn/autolabel

Archivio della ricerca - Fondazione Bruno Kessler

Unsupervised Domain Adaptation Using Full-Feature Whitening and Colouring

Author: Sebe Nicu
Roy Subhankar
Siarohin Aliaksandr
Publication date: 01/01/2019

It is well known in computer vision that classifiers trained on source datasets do not perform well when tested on other datasets acquired under different conditions. To this end, Unsupervised Domain Adaptation (UDA) methods address the shift between the source and target domain by adapting the classifier to work well in the target domain despite having no access to the target labels. A handful of UDA methods bridge domain shift by aligning the source and target feature distributions through embedded domain alignment layers that are based on batch normalization (BN) or grouped whitening. In contrast, in this work we propose to align feature distributions with domain-specific full-feature whitening and domain-agnostic colouring transforms, abbreviated as F2WCT. The proposed F2WCT optimally aligns the feature distributions by ensuring that the source and target features have identical covariance matrices. Our claim is also substantiated by the experimental results on Digits datasets for both single-source and multi-source unsupervised adaptation settings.

Crossref

Archivio della ricerca - Fondazione Bruno Kessler
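
The claim in the F2WCT abstract above, that domain-specific whitening followed by a shared colouring transform yields identical covariance matrices, can be checked numerically with a small sketch. This is not the authors' layer: it assumes whitening by the inverse square root of each domain's covariance (computed by eigendecomposition) and uses a fixed, hypothetical colouring matrix `colour_matrix` in place of learned parameters. The synthetic `source` and `target` features are likewise assumptions.

```python
import numpy as np

def whiten(features, eps=1e-5):
    """Centre the features and map them to (approximately) identity covariance.
    Assumption: whitening by the inverse square root of the regularised covariance."""
    centered = features - features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (features.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov + eps * np.eye(cov.shape[0]))
    whitening = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    return centered @ whitening

def colour(whitened, colour_matrix, shift):
    """Apply one shared (domain-agnostic) linear transform and shift."""
    return whitened @ colour_matrix.T + shift

rng = np.random.default_rng(0)
# Synthetic source and target features with different means and covariances.
mix_source = np.array([[2.0, 0.5, 0.0], [0.0, 1.0, 0.3], [0.0, 0.0, 0.5]])
mix_target = np.array([[0.7, 0.0, 0.0], [0.4, 1.5, 0.0], [0.2, 0.1, 1.0]])
source = rng.normal(size=(500, 3)) @ mix_source
target = rng.normal(size=(400, 3)) @ mix_target + 5.0

# Hypothetical shared colouring parameters (learned in practice, fixed here).
colour_matrix = np.array([[1.0, 0.2, 0.0], [0.0, 1.0, 0.1], [0.3, 0.0, 1.0]])
shift = np.array([0.5, -0.5, 0.0])

# Whitening is computed per domain; colouring is shared across domains.
source_aligned = colour(whiten(source), colour_matrix, shift)
target_aligned = colour(whiten(target), colour_matrix, shift)

cov_source = np.cov(source_aligned, rowvar=False)
cov_target = np.cov(target_aligned, rowvar=False)
print(np.allclose(cov_source, cov_target, atol=1e-2))
```

After whitening, each domain has roughly identity covariance, so both aligned covariances are close to `colour_matrix @ colour_matrix.T` regardless of how different the raw source and target statistics were.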