
    Adversarial feature refinement for cross-view action recognition

    Apparent motion information of an action may vary dramatically from one view to another, making transfer of knowledge across views a core challenge of action recognition. Recent work has used large-scale datasets to compensate for this lack of generalization, and most state-of-the-art methods today require large amounts of training data and have a high computational cost during training. We propose a novel technique that leverages pre-trained features, refined to minimize view-related information through adversarial training inspired by domain adaptation methods. Our method is able to recognize actions from unfamiliar viewpoints and works effectively on substantially less training data than is needed to train state-of-the-art cross-view methods, with exceptional results.

    Cross-view action recognition with small-scale datasets

    Cross-view action recognition refers to the task of recognizing actions observed from viewpoints that are unfamiliar to the system. To address the complexity of the problem, state-of-the-art methods often rely on large-scale datasets, where the variability of viewpoints is appropriately represented. However, this comes at a significant price in terms of computational power, time, cost, and energy, both for gathering and annotating data and for training the model. We propose a methodological pipeline that tackles the same challenges with a specific focus on small-scale datasets and attention to the amount of resources required. The core idea of our method is to transfer knowledge from an intermediate, pre-trained representation, under the hypothesis that it may already implicitly incorporate relevant cues for the task. We rely on an effective domain adaptation strategy coupled with the design of a robust classifier that promotes view-invariant properties and allows us to efficiently generalise action recognition to unseen viewpoints. In contrast to other state-of-the-art methods that also employ alternative data modalities, our approach is purely video-based and thus has a wider field of application. We present a thorough experimental analysis justifying the design choices of the pipeline, and provide a comparison with existing approaches in the two main scenarios of one-one learning and multiple view learning, where our approach provides superior performance.

    Knowledge distillation for efficient standard scanplane detection of fetal ultrasound

    In clinical practice, the selection of ultrasound standard planes (SPs) is experience-dependent and suffers from inter-observer and intra-observer variability. Automatic recognition of SPs can help improve the quality of examinations and make the evaluations more objective. In this paper, we propose a method for the automatic identification of SPs, to be installed onboard a portable ultrasound system with limited computational power. The deep learning methodology we design is based on the concept of knowledge distillation, transferring knowledge from a large and well-performing teacher to a smaller student architecture. To this purpose, we evaluate a set of different potential teachers and students, as well as alternative knowledge distillation techniques, to balance the trade-off between performance and architectural complexity. We report a thorough analysis of fetal ultrasound data, focusing on a benchmark dataset that is, to the best of our knowledge, the only one available to date.
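The abstract does not say which distillation variants were compared, so the following is only a minimal sketch of the classical soft-target formulation, in which a student is trained to match temperature-softened teacher probabilities in addition to the ground-truth label. The function names, the temperature of 4.0, and the weight `alpha` are illustrative assumptions, not values from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Soft-target distillation loss for a single sample.

    Assumed form (hypothetical hyperparameters, not the paper's):
    alpha * T^2 * KL(teacher || student) + (1 - alpha) * cross-entropy.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL divergence between the softened teacher and student distributions.
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Standard cross-entropy against the hard label, at temperature 1.
    ce = -math.log(softmax(student_logits)[label])
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce
```

A student whose logits equal the teacher's incurs no soft-target penalty, so only the cross-entropy term remains. In practice this loss would be averaged over a batch and minimized with an automatic-differentiation framework.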

    Food Image Classification: The Benefit of In-Domain Transfer Learning

    Monitoring food intake and calories may be fundamental for a healthy lifestyle and for preventing nutrition-related illnesses. Recently, deep-learning approaches have been extensively exploited to provide an automatic analysis of food images. However, food image datasets pose peculiar challenges, including fine granularity with high intra-class and low inter-class variability. In this work, we focus on training strategies for the typical scenario where data availability and computational resources are limited. Exploiting convolutional neural networks, we show that in-domain source datasets provide a better representation than ImageNet alone, bringing a significant increase in test accuracy. Finally, we show that ensembling different CNN models further improves the learned representation.
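The abstract does not state how the CNN models are combined. One common option, shown here purely as an assumed illustration, is to average the class probabilities predicted by each model and pick the class with the highest mean. The function name `ensemble_predict` is hypothetical.

```python
def ensemble_predict(prob_lists):
    """Average per-model class probabilities.

    prob_lists holds one probability vector per model, all of the same
    length. Returns (index of the winning class, averaged probabilities).
    This averaging rule is an assumption, not the paper's stated method.
    """
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    best = max(range(n_classes), key=lambda c: avg[c])
    return best, avg
```

With three models predicting `[0.6, 0.4, 0.0]`, `[0.1, 0.5, 0.4]` and `[0.2, 0.5, 0.3]`, the ensemble selects class 1 even though the first model alone would select class 0.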

    Single view learning in action recognition

    Viewpoint is an essential aspect of how an action is visually perceived, with the motion appearing substantially different for some viewpoint pairs. Data-driven action recognition algorithms compensate for this by including a variety of viewpoints in their training data, adding to the cost of data acquisition as well as training. We propose a novel methodology that leverages deep pre-trained features to learn actions from a single viewpoint, using domain adaptation for knowledge transfer. We demonstrate the effectiveness of this pipeline on three different datasets (IXMAS, MoCA and NTU RGB+D) and compare with both classical and deep learning methods. Our method requires little training data and demonstrates unparalleled cross-view action recognition accuracy for single view learning.

    Anomaly detection in feature space for detecting changes in phytoplankton populations

    Plankton organisms are fundamental components of the earth’s ecosystem. Zooplankton feeds on phytoplankton and is preyed upon by fish and other aquatic animals, placing it at the core of the aquatic food chain. Phytoplankton, on the other hand, has a crucial role in climate regulation: it has produced almost 50% of the total oxygen in the atmosphere and is responsible for fixing around a quarter of the earth’s total carbon dioxide. Importantly, plankton can be regarded as a good indicator of environmental perturbations, as it can react to even slight environmental changes with corresponding modifications in morphology and behavior. At a population level, the biodiversity and the concentration of individuals of specific species may shift dramatically due to environmental changes. Thus, in this paper, we propose an anomaly detection-based framework to recognize heavy morphological changes in phytoplankton at a population level, starting from images acquired in situ. Given that an initial annotated dataset is available, we propose to build a parallel architecture, training one anomaly detection algorithm for each available class on top of deep features extracted by a pre-trained Vision Transformer and further reduced in dimensionality with PCA. We then define global anomalies as samples rejected by all the trained detectors, and propose to empirically identify a threshold on the global anomaly count over time as an indicator that field experts and institutions can use to investigate potential environmental perturbations. We test the proposed approach on two publicly available datasets (WHOI22 and WHOI40) of grayscale microscopic images of phytoplankton collected with the Imaging FlowCytobot acquisition system, obtaining high performance in detecting both in-class and out-of-class samples. Finally, we build a dataset of 15 classes acquired by the WHOI across four years, showing that the proposed approach’s ability to identify anomalies is preserved when tested on images of the same classes acquired over a timespan of years.
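The decision logic described in the abstract (one detector per class, a global anomaly when every detector rejects a sample, and an alarm based on the anomaly count over time) can be sketched as follows. This is a toy illustration under stated assumptions: feature vectors are taken as already extracted and reduced, and the per-class detector is a simple centroid-distance rule standing in for whichever anomaly detection algorithm the authors used. All function names and the factor `k` are hypothetical.

```python
import math


def fit_detector(features, k=3.0):
    """Fit a toy one-class detector: class centroid plus distance threshold.

    The threshold is mean + k * std of training distances to the centroid
    (a stand-in rule, not the detector used in the paper).
    """
    dim = len(features[0])
    centroid = [sum(f[d] for f in features) / len(features) for d in range(dim)]
    dists = [math.dist(f, centroid) for f in features]
    mean = sum(dists) / len(dists)
    std = math.sqrt(sum((d - mean) ** 2 for d in dists) / len(dists))
    return centroid, mean + k * std


def is_global_anomaly(sample, detectors):
    """True when every per-class detector rejects the sample."""
    return all(math.dist(sample, centroid) > threshold
               for centroid, threshold in detectors.values())


def alarm_windows(samples_by_window, detectors, max_count):
    """Indices of time windows whose global-anomaly count exceeds max_count."""
    alarms = []
    for index, samples in enumerate(samples_by_window):
        count = sum(is_global_anomaly(s, detectors) for s in samples)
        if count > max_count:
            alarms.append(index)
    return alarms
```

Here `detectors` is a dictionary mapping a class name to the pair returned by `fit_detector`, and `max_count` plays the role of the empirically chosen threshold on the global anomaly count.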

    FasterVideo: Efficient Online Joint Object Detection and Tracking

    Object detection and tracking in videos are essential and computationally demanding building blocks for current and future visual perception systems. To reduce the gap between the efficiency of available methods and the computational requirements of real-world applications, we propose to rethink one of the most successful methods for image object detection, Faster R-CNN, and extend it to the video domain. Specifically, we extend the detection framework to learn instance-level embeddings, which prove beneficial for data association and re-identification. Focusing on the computational aspects of detection and tracking, our proposed method reaches the very high computational efficiency needed for relevant applications, while still competing with recent state-of-the-art methods, as shown in the experiments we conduct on standard object tracking benchmarks (code available at https://github.com/Malga-Vision/fastervideo).

    Learning dictionaries of kinematic primitives for action classification

    This paper proposes a method based on visual motion primitives to address the problem of action understanding. The approach builds, in an unsupervised way, a dictionary of kinematic primitives from a set of sub-movements obtained by segmenting the velocity profile of an action at its local minima, derived directly from the optical flow. The dictionary is then used to describe each sub-movement as a linear combination of atoms using sparse coding. The descriptive capability of the proposed motion representation is experimentally validated on the MoCA dataset, a collection of synchronized multi-view videos and motion capture data of cooking activities. The results show that the approach, despite its simplicity, performs well in action classification, especially when the motion primitives are combined over time. The method is also shown to be tolerant to viewpoint changes, and can thus support cross-view action recognition. Overall, the method may be seen as a backbone of a general approach to action understanding, with potential applications in robotics.
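The segmentation step mentioned in the abstract, cutting a velocity profile at its local minima, can be illustrated with a short sketch. It assumes the profile is already available as a list of per-frame speed magnitudes (the optical-flow computation is not shown), and the strict-minimum rule and the function name are illustrative choices rather than details from the paper.

```python
def split_at_local_minima(speed):
    """Split a speed profile into sub-movements at strict interior minima.

    A cut is placed at index i when speed[i] is lower than both neighbours.
    Adjacent sub-movements share the boundary sample.
    """
    cuts = [i for i in range(1, len(speed) - 1)
            if speed[i] < speed[i - 1] and speed[i] < speed[i + 1]]
    bounds = [0] + cuts + [len(speed) - 1]
    return [speed[a:b + 1] for a, b in zip(bounds, bounds[1:])]
```

For the profile `[0, 2, 5, 1, 4, 6, 2, 3]` the minima fall at indices 3 and 6, giving the sub-movements `[0, 2, 5, 1]`, `[1, 4, 6, 2]` and `[2, 3]`. Real profiles are noisy, so a practical version would likely smooth the signal first.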

    Guest Editorial Assistive Computing Technologies for Human Well-Being

    Well-being is a complex concept that can be affected by long-term or temporary disabilities, as well as by the natural process of aging. Now that meaningful computing methodologies have reached maturity and the scale of the problem is fully recognized, we face the objective of designing ad hoc technologies with real potential to improve the quality of life of fragile individuals. Technology may contribute in different directions: by providing health-care providers with well-being assessment tools, by designing computer-assisted monitoring and rehabilitation methods that help maintain independence, or by proposing assistive aids to compensate for disabilities. The aim of this special issue is to promote a dialogue between healthcare and technology researchers in order to conceive effective solutions that tackle the real needs of fragile people, thus improving their well-being.