Multi-View Guided Multi-View Stereo
This paper introduces a novel deep framework for dense 3D reconstruction from multiple image frames, leveraging a sparse set of depth measurements gathered jointly with image acquisition.
Given a deep multi-view stereo network, our framework uses sparse depth hints to guide the neural network by modulating the plane-sweep cost volume built during the forward step, enabling us to consistently infer much more accurate depth maps.
Moreover, since multiple viewpoints can provide additional depth measurements, we propose a multi-view guidance strategy that increases the density of the sparse points used to guide the network, thus leading to even more accurate results. We evaluate our Multi-View Guided framework within a variety of state-of-the-art deep multi-view stereo networks, demonstrating its effectiveness at improving the results achieved by each of them on the BlendedMVG and DTU datasets.
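The cost-volume modulation described above can be illustrated with a minimal NumPy sketch. The abstract does not state the exact modulation function, so the Gaussian weighting used here is an assumption, as are the function name `modulate_cost_volume`, the parameters `k` and `c`, and the convention that higher volume values mean better matches. The multi-view densification of hints is not shown.

```python
import numpy as np

def modulate_cost_volume(volume, depth_hypotheses, hints, valid, k=10.0, c=0.1):
    """Scale a plane-sweep volume around sparse depth hints (illustrative only).

    volume:           (D, H, W) matching scores, higher = better (assumption)
    depth_hypotheses: (D,) depth of each sweeping plane
    hints:            (H, W) sparse depth values, ignored where valid is False
    valid:            (H, W) boolean mask, True where a hint exists
    k, c:             height and width of the Gaussian (hypothetical defaults)
    """
    # Distance between every plane depth and the per-pixel hint: (D, H, W).
    diff = depth_hypotheses[:, None, None] - hints[None, :, :]
    gauss = k * np.exp(-(diff ** 2) / (2.0 * c ** 2))
    v = valid[None, :, :].astype(volume.dtype)
    # Pixels without a hint are multiplied by 1 and stay unchanged.
    return volume * (1.0 - v + v * gauss)

# Toy example: a flat volume with a single hint of depth 3.0 at pixel (0, 0).
volume = np.ones((4, 2, 2))
depths = np.array([1.0, 2.0, 3.0, 4.0])
hints = np.zeros((2, 2))
hints[0, 0] = 3.0
valid = np.zeros((2, 2), dtype=bool)
valid[0, 0] = True
out = modulate_cost_volume(volume, depths, hints, valid, k=10.0, c=0.5)
best_plane = int(out[:, 0, 0].argmax())  # index of the plane closest to the hint
```

In this toy case the hinted pixel peaks at the plane whose depth matches the hint, while pixels without hints keep their original scores.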
Lightweight Self-Supervised Depth Estimation with few-beams LiDAR Data
This paper proposes a lightweight yet effective self-supervised depth completion network trained on monocular videos and sparse raw LiDAR measurements only. Specifically, we utilize a multi-stage network architecture that relies on inexpensive CNN layers. We introduce a novel guided sparse convolution operator combining sparse and dense data to extract depth features. To mitigate the impact of outliers commonly present in the sparse raw LiDAR data, we adopt a distance-dependent outlier mask that incorporates an elastic threshold mechanism to selectively discard such points. Our experimental results on the KITTI dataset show the favorable trade-off between accuracy and efficiency achieved by our model, reaching state-of-the-art performance on self-supervised depth estimation from few-beams LiDAR (4-beams), depth completion (64-beams) and a few hundred depth points, using a fraction of the parameters. Our code will be available at https://github.com/franky-ciomp/GSCNN/
Active Stereo Without Pattern Projector
This paper proposes a novel framework integrating the principles of active stereo in standard passive camera systems without a physical pattern projector. We virtually project a pattern over the left and right images according to the sparse measurements obtained from a depth sensor. Any such device can be seamlessly plugged into our framework, allowing for the deployment of a virtual active stereo setup in any possible environment and overcoming the limitations of pattern projectors, such as limited working range or environmental conditions. Experiments on indoor and outdoor datasets, featuring both long and close range, support the effectiveness of our approach, which boosts the accuracy of both stereo algorithms and deep networks.
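The idea of virtually projecting a pattern can be sketched as follows. This is only an illustration under stated assumptions: the images are rectified and grayscale, the sensor's depth measurements have already been converted to disparities in pixels, and each hinted pixel simply receives one random intensity shared by both views. The function name `project_virtual_pattern` and the random-dot pattern are hypothetical choices, not details given in the abstract.

```python
import numpy as np

def project_virtual_pattern(left, right, disparity_hints, valid, seed=0):
    """Paint identical random intensities at corresponding pixels (sketch).

    left, right:     (H, W) rectified grayscale images as floats
    disparity_hints: (H, W) sparse disparities in pixels (assumed precomputed
                     from the depth sensor measurements)
    valid:           (H, W) boolean mask, True where a hint exists
    """
    rng = np.random.default_rng(seed)
    left_out, right_out = left.copy(), right.copy()
    width = left.shape[1]
    ys, xs = np.nonzero(valid)
    for y, x in zip(ys, xs):
        # In a rectified pair, left pixel x corresponds to right pixel x - d.
        xr = x - int(round(disparity_hints[y, x]))
        if 0 <= xr < width:
            value = rng.random()
            left_out[y, x] = value
            right_out[y, xr] = value
    return left_out, right_out

# Toy example: one hint with disparity 2 at row 1, column 4.
left = np.zeros((3, 6))
right = np.zeros((3, 6))
hints = np.zeros((3, 6))
valid = np.zeros((3, 6), dtype=bool)
hints[1, 4] = 2.0
valid[1, 4] = True
left_p, right_p = project_virtual_pattern(left, right, hints, valid)
```

After the call, the hinted left pixel and its match in the right image carry the same value, which is what makes the correspondence easier for a stereo matcher to find.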
Method for determining the confidence of a disparity map through a self-adaptive learning of a neural network, and sensor system thereof
The proposed method aims at determining the confidence of a disparity map through a self-adaptive learning of a neural network, and sensor system thereof
Matching-space Stereo Networks for Cross-domain Generalization
End-to-end deep networks represent the state of the art for stereo matching. While excelling on images framing environments similar to the training set, major drops in accuracy occur in unseen domains (e.g., when moving from synthetic to real scenes). In this paper we introduce a novel family of architectures, namely Matching-Space Networks (MS-Nets), with improved generalization properties. By replacing learning-based feature extraction from image RGB values with matching functions and confidence measures from conventional wisdom, we move the learning process from the color space to the Matching Space, avoiding over-specialization to domain-specific features. Extensive experimental results on four real datasets highlight that our proposal leads to superior generalization to unseen environments over conventional deep architectures, keeping accuracy on the source domain almost unaltered. Our code is available at https://github.com/ccj5351/MS-Nets
On-Site Adaptation for Monocular Depth Estimation with a Static Camera
We introduce a novel technique for easing the deployment of an off-the-shelf monocular depth estimation network in unseen environments. Specifically, we target a very common setting, in which a fixed camera is mounted high above the ground to monitor an environment, and highlight the limitations of state-of-the-art monocular networks deployed in such a setup. To this end, we develop an on-site adaptation technique capable of 1) improving the accuracy of estimated depth maps in the presence of moving subjects, such as pedestrians, cars, and others; 2) refining the overall structure of the predicted depth map, to make it more consistent with the real 3D structure of the scene; 3) recovering absolute metric depth, usually lost by state-of-the-art solutions. Experiments on synthetic and real datasets confirm the effectiveness of our proposal.
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting: the traditional type, which counts only the first author of a cited work, and a non-traditional type, which takes into account the first five authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
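The two counting schemes compared above can be sketched in a few lines of Python. The data layout, the function name `cocitation_counts`, and the author labels are hypothetical. The sketch also assumes that coauthors of the same cited work count as co-cited with each other, which is one possible convention and not necessarily the one used in the study.

```python
from collections import Counter
from itertools import combinations

def cocitation_counts(citing_papers, max_authors=1):
    """Count author co-citations (illustrative sketch).

    citing_papers: list of reference lists; each reference is a list of
                   author names in byline order.
    max_authors:   number of leading authors of each cited work to count
                   (1 = traditional first-author counting).
    """
    counts = Counter()
    for references in citing_papers:
        # Authors credited by this citing paper, each counted once per paper.
        cited = set()
        for authors in references:
            cited.update(authors[:max_authors])
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts

# Two citing papers, each citing two works (placeholder author labels).
papers = [
    [["A", "B"], ["C", "D"]],
    [["C", "B"], ["A"]],
]
first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

With first-author counting only the pair (A, C) is recorded, whereas counting up to five authors also links B and D to the others, which shows how the choice of counting changes the data fed into later mappings.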
