    High-accuracy camera calibration and scene acquisition

    In this thesis we present new approaches in the field of camera calibration and high-accuracy scene acquisition. The first part is devoted to the camera calibration problem exploiting targets composed of circular features. Specifically, we start by improving some previous work on a family of fiducial markers, which are used as calibration targets to recover both extrinsic and intrinsic camera parameters. Then, by using the same geometric concepts developed for the markers, we present a method to calibrate a pinhole camera by observing a set of generic coplanar circles. In the second part we move our attention to unconstrained (non-pinhole) camera models. We begin by asking whether such models can be effectively applied also to quasi-central cameras and present a calibration technique that exploits active targets to estimate the large number of parameters required. Then, we apply a similar method to calibrate a structured-light projector during the range-map acquisition process to improve both accuracy and coverage. Finally, we propose a way to lower the complexity of a complete unconstrained model toward a pinhole configuration while still allowing a completely generic distortion map. In the last part we study two different scene acquisition problems, namely industry-grade 3D geometry measurements and dichromatic model parameter recovery from multi-spectral images. In the former, we propose a novel visual-inspection device for the dimensional assessment of metallic pipe intakes. In the latter, we formulate a state-of-the-art optimization approach for the simultaneous recovery of the optical flow and the dichromatic coefficients of a scene by analyzing two subsequent frames.

    On-the-Go Reflectance Transformation Imaging with Ordinary Smartphones

    Reflectance Transformation Imaging (RTI) is a popular technique that allows the recovery of per-pixel reflectance information by capturing an object under different lighting conditions. This can later be used to reveal surface details and interactively relight the subject. Such a process, however, typically requires dedicated hardware setups to recover the light direction from multiple locations, making it tedious when performed outside the lab. We propose a novel RTI method that can be carried out by recording videos with two ordinary smartphones. The LED flash of one device is used to illuminate the subject while the other captures the reflectance. Since the LED is mounted close to the camera lenses, we can infer the light direction for thousands of images by freely moving the illuminating device while observing a fiducial marker surrounding the subject. To deal with such an amount of data, we propose a neural relighting model that reconstructs object appearance for arbitrary light directions from extremely compact reflectance distribution data compressed via Principal Component Analysis (PCA). Experiments show that the proposed technique can be easily performed in the field, with a resulting RTI model that can outperform state-of-the-art approaches involving dedicated hardware setups.

    A Geometric Model for Polarization Imaging on Projective Cameras

    The vast majority of Shape-from-Polarization (SfP) methods work under the oversimplified assumption of using orthographic cameras. Indeed, it is still unclear how Stokes vector projection behaves when the incoming rays are not orthogonal to the image plane. In this paper, we try to answer this question with a new geometric model describing how a general projective camera captures the light polarization state. Based on the optical properties of a tilted polarizer, our model is implemented as a pre-processing operation acting on raw images, followed by a scene-independent rotation of the reconstructed normal field. Moreover, our model is consistent with state-of-the-art forward and inverse renderers (such as Mitsuba3 and ART), intrinsically enforces physical constraints among the captured channels, and handles the demosaicing of DoFP sensors. Experiments on existing and new datasets demonstrate the accuracy of the model when applied to commercially available polarimetric cameras.

    Embedding Shepard’s Interpolation into CNN Models for Unguided Depth Completion

    When acquiring sparse data samples, an interpolation method is often needed to fill in the missing information. An example application, known as “depth completion”, consists of estimating dense depth maps from sparse observations (e.g. LiDAR acquisitions). To do this, algorithmic methods fill the depth image by performing a sequence of basic image processing operations, while recent approaches propose data-driven solutions, mostly based on Convolutional Neural Networks (CNNs), to predict the missing information. In this work, we combine learning-based and classical algorithmic approaches to exploit the performance of the former together with the generalization ability of the latter. First, we define a novel architecture block called IDWBlock. This component allows embedding Shepard’s interpolation (or Inverse Distance Weighting, IDW) into a CNN model, with the advantage of requiring a small number of parameters regardless of the kernel size. Second, we propose two network architectures involving a combination of the IDWBlock and learning-based depth completion techniques. In the experimental section, we test the models’ performance on the KITTI depth completion benchmark and the NYU-depth-v2 dataset, showing that they are strongly robust to input sparsity under different densities and patterns.
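
    For readers unfamiliar with the classical technique named in this abstract, the following is a minimal sketch of plain Shepard's interpolation (IDW) on scattered samples. It is not the IDWBlock described by the authors; the function name `idw_interpolate` and the parameters `power` and `eps` are assumptions chosen for illustration.

```python
def idw_interpolate(samples, query, power=2.0, eps=1e-12):
    """Shepard's interpolation at `query`.

    samples: list of (point, value) pairs, where point is a coordinate tuple.
    power:   exponent applied to the distance (hypothetical default of 2).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for point, value in samples:
        # Squared Euclidean distance between the sample and the query.
        d2 = sum((p - q) ** 2 for p, q in zip(point, query))
        if d2 < eps:
            # The query coincides with a sample: return its value directly.
            return value
        weight = 1.0 / d2 ** (power / 2.0)
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Two sparse depth samples; the query lies halfway between them.
samples = [((0.0, 0.0), 1.0), ((2.0, 0.0), 3.0)]
midpoint = idw_interpolate(samples, (1.0, 0.0))
```

    Each missing value is a weighted average of the known samples, with weights that decay with distance, so the only free parameter in this sketch is the exponent.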

    A Neural Reflectance Field Model for Accurate Relighting in RTI Applications

    Reflectance Transformation Imaging (RTI) is a computational photography technique in which an object is acquired from a fixed point of view under different light directions. The aim is to estimate the light transport function at each point so that the object can be interactively relit in a physically accurate way, revealing its surface characteristics. In this paper, we propose a novel RTI approach describing surface reflectance as an implicit neural representation acting as a “relightable image” for a specific object. We propose to represent the light transport function with a Neural Reflectance Field (NRF) model, feeding it with pixel coordinates, light direction, and a latent vector encoding the per-pixel reflectance in a neighbourhood. These vectors, computed during training, allow a more accurate relighting than a pure implicit representation (i.e., relying only on positional encoding), enabling the NRF to handle complex surface shadings. Moreover, they can be efficiently stored with the learned NRF for compression and transmission. As an additional contribution, we propose a novel synthetic dataset containing objects of various shapes and materials created with physically based rendering software. An extensive experimental section shows that the proposed NRF accurately models the light transport function for challenging datasets in synthetic and real-world scenarios.

    Exploring Audio Compression as Image Completion in Time-Frequency Domain

    Audio compression is usually achieved with algorithms that exploit spectral properties of the given signal such as frequency or temporal masking. In this paper we propose to tackle the problem from a different point of view, considering the time-frequency domain of an audio signal as an intensity map to be reconstructed via a data-driven approach. The compression stage removes some selected input values from the time-frequency representation of the original signal. Then, decompression works by reconstructing the missing samples as an image completion task. Our method is divided into two main parts. First, we analyse the feasibility of a data-driven audio reconstruction with missing samples in its time-frequency representation. To do so, we exploit an existing CNN model designed for depth completion, involving a sequence of sparse convolutions to deal with absent values. Second, we propose a method to select the values to be removed at the compression stage, maximizing the perceived audio quality of the decompressed signal. In the experimental section we validate the proposed technique on some standard audio datasets and provide an extensive study on the quality of the reconstructed signal under different conditions.

    Semi-supervised Segmentation of 3D Surfaces Using a Weighted Graph Representation

    A wide range of cheap and simple-to-use 3D scanning devices has recently been introduced in the market. These tools are no longer aimed at research labs and highly skilled professionals. Conversely, they are mostly designed to allow inexperienced users to easily and independently acquire surfaces and whole objects. In this scenario, the demand for automatic or semi-automatic algorithms for 3D data processing is increasing. Specifically, in this paper we concentrate on the segmentation task applied to the acquired surfaces. Such a problem is well known to be ill-defined both for 2D images and 3D objects. In fact, even with a perfect understanding of the scene, many different and incompatible semantic or syntactic segmentations can coexist. For these reasons, we refrain from any attempt to offer an automatic solution. Instead we introduce a semi-supervised procedure that exploits an initial set of seeds selected by the user. In our framework segmentation happens by iteratively visiting a weighted graph representation of the surface starting from the supplied seeds. The assignment of each element is driven by a greedy approach that accounts for the curvature between adjacent triangles. The proposed technique does not require edge detection or the fitting of parametrized surfaces, and its implementation is very straightforward. Still, despite its simplicity, tests made on scanned 3D objects show its effectiveness and ease of use. © 2011 Springer-Verlag Berlin Heidelberg
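
    The seeded, greedy graph visit outlined in this abstract can be illustrated with a generic region-growing sketch. This is not the authors' implementation: it assumes each triangle is a graph node, that each edge weight is some curvature-like cost between adjacent triangles, and that the cheapest frontier edge is always expanded first. The names `grow_regions`, `adjacency` and `seeds` are hypothetical.

```python
import heapq


def grow_regions(adjacency, seeds):
    """Greedy seeded labelling of a weighted graph.

    adjacency: {node: [(neighbour, weight), ...]} (assumed symmetric).
    seeds:     {node: label} chosen by the user.
    """
    labels = {}
    # Seeds enter the frontier with zero cost so they are labelled first.
    frontier = [(0.0, node, label) for node, label in seeds.items()]
    heapq.heapify(frontier)
    while frontier:
        cost, node, label = heapq.heappop(frontier)
        if node in labels:
            continue  # already claimed through a cheaper edge
        labels[node] = label
        for neighbour, weight in adjacency.get(node, []):
            if neighbour not in labels:
                heapq.heappush(frontier, (weight, neighbour, label))
    return labels


# A strip of six triangles with a sharp crease between triangles 2 and 3.
edges = [(0, 1, 0.1), (1, 2, 0.1), (2, 3, 0.9), (3, 4, 0.1), (4, 5, 0.1)]
adjacency = {}
for a, b, w in edges:
    adjacency.setdefault(a, []).append((b, w))
    adjacency.setdefault(b, []).append((a, w))

segmentation = grow_regions(adjacency, {0: "A", 5: "B"})
```

    In this toy strip the high-cost crease is crossed last, so the two regions meet at it.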

    A 5 degrees of freedom multi-user pointing device for interactive whiteboards

    Interactive whiteboards are nowadays rather common equipment in classrooms as they provide large advantages in terms of expressive power. Despite the radical paradigm shift, their interaction model is firmly tied to the archetypal concept of strokes and gestures over a whiteboard. In this paper we introduce a novel pointing device that enables one to escape the surface-based interaction, by means of robust and occlusion-resilient multi-camera 3D tracking. More precisely, we designed a frequency-based active pen. By means of a camera network, such a pen can be localized in a 3D frame featuring the same 5 degrees of freedom exposed by a real whiteboard marker. Our approach allows many pointers to be used at the same time, by reliably assigning a unique and permanent identity to each one. By leveraging these capabilities, interaction designers can conceive new and inventive interaction models. A few of them have been implemented within this study and are described in the experimental part of this work.

    Multi-view horizon-driven sea plane estimation for stereo wave imaging on moving vessels

    In the last few years we have seen an increasing popularity of stereo imaging as an effective tool to investigate wind sea waves at short and medium scales. Given the advances of computer vision techniques, the recovery of a scattered point cloud from a sea surface area is nowadays a well-established technique producing excellent results both in terms of wave data resolution and accuracy. Nevertheless, almost all the subsequent analysis tasks, from the recovery of directional wave spectra to the estimation of significant wave height, are bound to two limiting conditions. First, wave data are required to be aligned to the mean sea plane. Second, a uniform distribution of 3D point samples is assumed. Since the stereo-camera rig is tilted with respect to the sea surface, perspective distortion does not allow these conditions to be met. Errors due to this problem are even more challenging if the optical instrumentation is mounted on a moving vessel, so that the mean sea plane cannot be simply obtained by averaging data from multiple subsequent frames. We address the first problem with two main contributions. First, we propose a novel horizon estimation technique to recover the attitude of a moving stereo rig with respect to the sea plane. Second, an effective weighting scheme is described to account for the non-uniform sampling of the scattered data in the estimation of the sea-plane distance. The interplay of the two allows us to provide a precise point cloud alignment without any external positioning sensor or rig viewpoint pre-calibration. The advantages of the proposed technique are evaluated throughout an experimental section spanning both synthetic and real-world scenarios.
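
    As a rough illustration of the second contribution, once the plane normal is known (for instance from a horizon estimate), the plane distance can be computed as a weighted mean of the point projections onto that normal. The sketch below shows only this generic idea. The weighting scheme actually used by the authors is not given in the abstract, so the `weights` argument and the function name `plane_distance` are assumptions.

```python
def plane_distance(points, normal, weights):
    """Weighted estimate of the plane offset d in n . p = d.

    points:  list of 3D points (x, y, z) in the camera frame.
    normal:  plane normal, not necessarily unit length.
    weights: one non-negative weight per point (hypothetical scheme,
             e.g. larger for sparsely sampled far-away regions).
    """
    length = sum(c * c for c in normal) ** 0.5
    unit = [c / length for c in normal]
    # Signed distance of each point from the origin along the normal.
    projections = [sum(u * p for u, p in zip(unit, point)) for point in points]
    weighted = sum(w * d for w, d in zip(weights, projections))
    return weighted / sum(weights)


points = [(0.0, 0.0, 9.0), (1.0, 0.0, 10.0), (5.0, 5.0, 11.0)]
distance = plane_distance(points, normal=(0.0, 0.0, 2.0), weights=[1.0, 1.0, 2.0])
```

    With uniform weights this reduces to the plain average of the projections, which is biased toward densely sampled regions close to the camera.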