
    Combination of spatially-modulated ToF and structured light for MPI-free depth estimation

    Multi-path Interference (MPI) is one of the major sources of error in Time-of-Flight (ToF) camera depth measurements. A possible solution for its removal is based on the separation of direct and global light through the projection of multiple sinusoidal patterns. In this work we extend this approach by applying a Structured Light (SL) technique to the same projected patterns. This makes it possible to compute two depth maps from a single ToF acquisition, one with the Time-of-Flight principle and the other with the Structured Light principle. The two depth fields are finally combined using a Maximum-Likelihood approach in order to obtain an accurate depth estimate free from MPI error artifacts. Experimental results demonstrate that the proposed method has very good MPI correction properties, with state-of-the-art performance.
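
As a rough illustration of the Maximum-Likelihood combination step, the sketch below fuses two depth estimates of the same pixel. It assumes independent Gaussian noise with known variance on each estimate, in which case the ML estimate reduces to inverse-variance weighting. That noise model, the function name `fuse_depth_ml`, and its parameters are assumptions made for illustration; the abstract does not state the likelihood model actually used.

```python
def fuse_depth_ml(d_tof, var_tof, d_sl, var_sl):
    """Fuse two depth estimates of one pixel (hypothetical helper).

    Assumes independent Gaussian noise on each estimate, so the
    maximum-likelihood depth is the inverse-variance weighted mean.
    """
    if var_tof <= 0 or var_sl <= 0:
        raise ValueError("variances must be positive")
    w_tof = 1.0 / var_tof  # weight of the ToF estimate
    w_sl = 1.0 / var_sl    # weight of the structured-light estimate
    fused = (w_tof * d_tof + w_sl * d_sl) / (w_tof + w_sl)
    fused_var = 1.0 / (w_tof + w_sl)  # variance of the fused estimate
    return fused, fused_var


if __name__ == "__main__":
    # Equal confidence in both estimates gives their plain average.
    print(fuse_depth_ml(2.0, 1.0, 4.0, 1.0))
```

Under this assumption the fused estimate always leans toward the measurement with the smaller variance, and its variance is never larger than that of either input.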

    Continual semantic segmentation via repulsion-attraction of sparse and disentangled latent representations

    Deep neural networks suffer from the major limitation of catastrophically forgetting old tasks when learning new ones. In this paper we focus on class-incremental continual learning in semantic segmentation, where new categories are made available over time while previous training data is not retained. The proposed continual learning scheme shapes the latent space to reduce forgetting whilst improving the recognition of novel classes. Our framework is driven by three novel components, which we also effortlessly combine on top of existing techniques. First, prototype matching enforces latent space consistency on old classes, constraining the encoder to produce similar latent representations for previously seen classes in the subsequent steps. Second, feature sparsification makes room in the latent space to accommodate novel classes. Finally, contrastive learning is employed to cluster features according to their semantics while tearing apart those of different classes. Extensive evaluation on the Pascal VOC2012 and ADE20K datasets demonstrates the effectiveness of our approach, which significantly outperforms state-of-the-art methods.

    Head-mounted gesture controlled interface for human-computer interaction

    This paper proposes a novel human-computer interaction system exploiting gesture recognition. It is based on the combined usage of a head-mounted display and a multi-modal sensor setup that also includes a depth camera. The depth information is used both to seamlessly include augmented reality elements in the real world and as input for a novel gesture-based interface. Reliable gesture recognition is obtained through a real-time algorithm exploiting novel feature descriptors arranged in a multi-dimensional structure fed to an SVM classifier. The system has been tested with various augmented reality applications, including an innovative human-computer interaction scheme where virtual windows can be arranged in the real world observed by the user.

    Predictive image compression for interactive remote visualization

    The standard way of remotely visualizing 3D models is to first download them to the user's computer and then render them. This approach is plagued by many problems, which can be overcome by moving the rendering to the server side and essentially turning interactive visualization into the transmission of user-prompted images. How to effectively compress the rendered views that need to be transmitted to the client is a fundamental issue within this remote visualization scenario. This work presents a predictive compression scheme for remote visualization based on image-based rendering. Its experimental performance is discussed with respect to the use of JPEG and JPEG2000 within it.
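
The general idea behind predictive compression can be sketched as follows: if server and client can both form the same prediction of the next view, only the prediction error needs to be transmitted. The sketch treats an image as a flat list of integer pixel values and leaves out how the prediction is formed (the abstract attributes this to image-based rendering) and how the residual is coded (for example with JPEG or JPEG2000). The function names are hypothetical, and this is not the specific scheme of the paper.

```python
def encode_residual(rendered, predicted):
    """Server side: per-pixel prediction error for one view (hypothetical helper)."""
    if len(rendered) != len(predicted):
        raise ValueError("images must have the same number of pixels")
    return [r - p for r, p in zip(rendered, predicted)]


def decode_residual(residual, predicted):
    """Client side: rebuild the view from the same prediction plus the residual."""
    if len(residual) != len(predicted):
        raise ValueError("images must have the same number of pixels")
    return [e + p for e, p in zip(residual, predicted)]


if __name__ == "__main__":
    rendered = [120, 121, 119, 200]   # view rendered on the server
    predicted = [118, 121, 120, 190]  # prediction available on both sides
    residual = encode_residual(rendered, predicted)
    print(residual)
    print(decode_residual(residual, predicted) == rendered)
```

When the prediction is good, the residual values cluster near zero and are typically cheaper to code than the full image. With a lossy residual codec the reconstruction would only approximate the rendered view.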

    3D scanning of cultural heritage with consumer depth cameras

    Three-dimensional reconstruction of cultural heritage objects is an expensive and time-consuming process. Recent consumer real-time depth acquisition devices, like Microsoft Kinect, allow very fast and simple acquisition of 3D views. However, 3D scanning with such devices is a challenging task due to the limited accuracy and reliability of the acquired data. This paper introduces a 3D reconstruction pipeline suited to using consumer depth cameras as hand-held scanners for cultural heritage objects. Several new contributions have been made to achieve this result. They include an ad-hoc filtering scheme that exploits the model of the error on the acquired data and a novel algorithm for the extraction of salient points exploiting both depth and color data. The salient points are then used within a modified version of the ICP algorithm that exploits both geometry and color distances to precisely align the views even when geometry information is not sufficient to constrain the registration. The proposed method, although applicable to generic scenes, has been tuned to the acquisition of sculptures, for which the experimental results indicate rather interesting performance.
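
To illustrate how color can help when geometry alone is ambiguous, the sketch below picks the closest correspondence for a point using a distance that sums squared geometric and squared color differences. The weighting scheme, the `color_weight` parameter, and the function names are assumptions for illustration; the abstract does not specify how the modified ICP combines the two distances.

```python
def combined_distance(p, q, color_weight=0.1):
    """Squared geometric distance plus weighted squared color distance.

    p and q are (xyz, rgb) pairs of 3-tuples; color_weight is a
    hypothetical tuning parameter, not a value taken from the paper.
    """
    geo = sum((a - b) ** 2 for a, b in zip(p[0], q[0]))
    col = sum((a - b) ** 2 for a, b in zip(p[1], q[1]))
    return geo + color_weight * col


def closest_match(point, candidates, color_weight=0.1):
    """Index of the candidate with the smallest combined distance."""
    return min(
        range(len(candidates)),
        key=lambda i: combined_distance(point, candidates[i], color_weight),
    )


if __name__ == "__main__":
    point = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0))  # red point at the origin
    candidates = [
        ((0.1, 0.0, 0.0), (0.0, 0.0, 1.0)),     # nearer, but blue
        ((0.2, 0.0, 0.0), (1.0, 0.0, 0.0)),     # slightly farther, same color
    ]
    print(closest_match(point, candidates, color_weight=0.0))  # geometry only
    print(closest_match(point, candidates, color_weight=0.1))  # geometry + color
```

In this toy case the geometry-only search selects the nearer blue point, while the combined distance selects the point with the matching color. A full ICP loop would repeat such matching and re-estimate the rigid transform at each iteration.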

    METHOD AND APPARATUS FOR CODING OF SPATIAL DATA

    The invention describes a method for representing geometry information for use in scalable coding of piecewise smooth spatial data sets. The method may also be applicable to vector data such as motion, where this data tends to exhibit piecewise smooth characteristics. The hierarchical geometry representation detailed in this invention is spatially scalable and amenable to embedded quantization and coding techniques. These features enable the geometry representation to be incorporated into highly scalable image coding schemes to attain efficient compression and output bit-streams with embedded resolution and quality scalability. Central elements of the invention are: the hierarchical representation of geometry information, which describes points of discontinuity in the input data set; a rate-distortion driven estimation process to construct the geometry representation; a process to prioritize the geometry information according to its influence on compression performance; and methods for efficient coding of the geometry information that facilitate resolution and quality scalability.