Extending computer vision techniques to recognition problems in 3d volumetric baggage imagery
We investigate the application of computer vision techniques to rigid object recognition
in Computed Tomography (CT) security scans of baggage items. This imagery
is of poor resolution and is complex in nature: items of interest can be imaged in
any orientation and copious amounts of clutter, noise and artefacts are prevalent.
We begin with a novel 3D extension to the seminal SIFT keypoint descriptor
that is evaluated through specific instance recognition in the volumetric data. We
subsequently compare the performance of the SIFT descriptor against a selection of
alternative descriptor methodologies. We demonstrate that the 3D SIFT descriptor
is notably outperformed by simpler descriptors which appear to be more suited for
use in noise and artefact-prone CT imagery.
Rigid object class recognition in 3D volumetric baggage data has received little
attention in prior work. We evaluate contrasting techniques between a traditional
approach derived from interest point descriptors and a novel technique based on
modelling of the primary components of the primate visual cortex.
We initially demonstrate class recognition through the implementation of a codebook
approach. A variety of aspects relating to codebook generation are investigated
(codebook size, assignment method) using a range of feature descriptors. Recognition
of a number of object classes is performed and results from this show that the
choice of descriptor is a critical aspect.
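
The codebook step described above can be illustrated with a minimal sketch. It assumes the codebook (cluster centres) has already been learned, and it contrasts two common assignment methods: hard (nearest centre) and soft (Gaussian-weighted). The function names, the `sigma` parameter and the toy 2D descriptors are illustrative assumptions, not details taken from the thesis.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def encode_hard(descriptors, codebook):
    """Hard assignment: each descriptor votes for its nearest codeword."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)), key=lambda k: sq_dist(d, codebook[k]))
        hist[nearest] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]

def encode_soft(descriptors, codebook, sigma=1.0):
    """Soft assignment: each descriptor spreads one vote over all codewords
    with Gaussian weights (sigma is a hypothetical tuning parameter)."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        weights = [math.exp(-sq_dist(d, c) / (2.0 * sigma ** 2)) for c in codebook]
        norm = sum(weights) or 1.0
        for k, w in enumerate(weights):
            hist[k] += w / norm
    total = sum(hist) or 1.0
    return [h / total for h in hist]

# Toy data: a codebook of size 2 and three descriptors.
codebook = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[0.0, 1.0], [1.0, 0.0], [9.0, 9.0]]
hard_hist = encode_hard(descriptors, codebook)
soft_hist = encode_soft(descriptors, codebook, sigma=2.0)
```

The resulting normalised histogram is the fixed-length vector that a classifier would consume; changing the codebook size changes its length.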
Finally, we present a unique extension to the established standard model of the
visual cortex: a volumetric implementation. The visual cortex model comprises a
hierarchical structure of alternating simple and complex operations that has demonstrated
excellent class recognition results using 2D imagery. We derive 3D extensions
to each layer in the hierarchy resulting in class recognition results that significantly
outperform those achieved using the earlier traditional codebook approach.
Overall we present several novel solutions to object recognition within 3D CT
security images that are supported by strong statistical results
Self localization and mapping using optical and thermal imagery
A mobile robot that starts from an unknown position in an unknown environment, with
the task of exploring its surroundings, has to be able to build an environmental map and
localize itself inside that map. Achieving a solution to this problem allows the exploration of
areas that can be dangerous or inaccessible for humans.
In our implementation we decide to use two primary sensors for the environment
exploration: an optical and a thermal camera. Prior work on the combined use of optical and
thermal sensors for the Simultaneous Localization And Mapping (SLAM) problem is limited.
The innovative aspect of this work is based on this combined use of a secondary thermal
camera as an additional visual sensor for navigation under varying environmental conditions.
A secondary innovative aspect is that we focus our attention on both cameras, using them as
two separate and independent sensors and combine the information in the final stage of
environmental mapping.
During the mobile robot navigation the two cameras capture images of the environment
and SURF feature points are extracted and matched between successive scenes. Using prior
work on a bearing-only SLAM approach as a reference, a feature initialization method is
implemented that allows each new good candidate feature (optical or thermal) to be
initialized with a sum of Gaussians that represents a set of possible spatial positions of the
detected feature. Using successive observations, it is possible to estimate the environment
coordinates of the feature and add it to the Extended Kalman Filter (EKF) dynamic state
vector. The EKF state vector contains the information about the position of the
6-degree-of-freedom mobile robot and the environmental landmark coordinates. The update of this
information is managed by the EKF algorithm, a statistical method that predicts
the state vector and updates it based on the available sensor information.
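
The predict/update cycle of an EKF with a bearing-only measurement can be sketched as follows. This is a deliberately reduced illustration, not the thesis implementation: the state is assumed to be only a 2D robot position (rather than a 6-degree-of-freedom pose plus landmarks), the landmark position is assumed known, and the motion model is a simple odometry increment. All function and variable names are hypothetical.

```python
import math

def ekf_predict(x, P, u, Q):
    """Predict step: move by odometry increment u and inflate covariance by Q.
    The motion Jacobian is the identity for this simplified model."""
    x_pred = [x[0] + u[0], x[1] + u[1]]
    P_pred = [[P[i][j] + Q[i][j] for j in range(2)] for i in range(2)]
    return x_pred, P_pred

def ekf_update_bearing(x, P, z, landmark, r_var):
    """Update step with one bearing measurement z (radians) to a known landmark."""
    dx, dy = landmark[0] - x[0], landmark[1] - x[1]
    q = dx * dx + dy * dy
    z_hat = math.atan2(dy, dx)                 # predicted bearing
    H = [dy / q, -dx / q]                      # Jacobian of the bearing w.r.t. (x, y)
    # Innovation, wrapped to (-pi, pi]
    nu = math.atan2(math.sin(z - z_hat), math.cos(z - z_hat))
    PHt = [P[0][0] * H[0] + P[0][1] * H[1],
           P[1][0] * H[0] + P[1][1] * H[1]]
    S = H[0] * PHt[0] + H[1] * PHt[1] + r_var  # scalar innovation variance
    K = [PHt[0] / S, PHt[1] / S]               # Kalman gain
    x_new = [x[0] + K[0] * nu, x[1] + K[1] * nu]
    # (I - K H) P, using H P = PHt^T because P is symmetric
    P_new = [[P[i][j] - K[i] * PHt[j] for j in range(2)] for i in range(2)]
    return x_new, P_new

# Robot believed to be at the origin; landmark at (1, 0); measured bearing 0.1 rad.
x0, P0 = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]
zero_Q = [[0.0, 0.0], [0.0, 0.0]]
x1, P1 = ekf_predict(x0, P0, [0.0, 0.0], zero_Q)
x2, P2 = ekf_update_bearing(x1, P1, 0.1, (1.0, 0.0), 0.01)
```

A single bearing constrains only the direction perpendicular to the line of sight, which is why only one diagonal entry of the covariance shrinks here, and why bearing-only SLAM needs several observations before a landmark position can be estimated.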
The final methodology is tested in indoor and outdoor environments with several different
light conditions and robot trajectories producing results that are robust in terms of noise in the
images and in other sensor data (i.e. encoders and GPS). The use of the thermal camera
improves the number of landmarks detected during the navigation adding useful information
about the explored area
Real-time object detection using monocular vision for low-cost automotive sensing systems
This work addresses the problem of real-time object detection in automotive environments
using monocular vision. The focus is on real-time feature detection,
tracking, depth estimation using monocular vision and finally, object detection by
fusing visual saliency and depth information.
Firstly, a novel feature detection approach is proposed for extracting stable and
dense features even in images with very low signal-to-noise ratio. This methodology
is based on image gradients, which are redefined to take account of noise as
part of their mathematical model. Each gradient is based on a vector connecting a
negative to a positive intensity centroid, where both centroids are symmetric about
the centre of the area for which the gradient is calculated. Multiple gradient vectors
define a feature with its strength being proportional to the underlying gradient
vector magnitude. The evaluation of the Dense Gradient Features (DeGraF) shows
superior performance over other contemporary detectors in terms of keypoint density,
tracking accuracy, illumination invariance, rotation invariance, noise resistance
and detection time.
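
The centroid-based gradient described above can be illustrated with a small sketch. It assumes the positive centroid is the intensity-weighted centroid of a window and that the negative centroid is its mirror image about the window centre, which is one reading of the symmetry stated in the text; the actual DeGraF formulation may differ in its details. The function name is hypothetical.

```python
def centroid_gradient(window):
    """Return (gx, gy): the vector from the negative to the positive centroid.

    window is a list of rows of non-negative intensities. The positive centroid
    is intensity-weighted; the negative centroid is assumed to be its reflection
    about the window centre.
    """
    height, width = len(window), len(window[0])
    total = float(sum(sum(row) for row in window))
    if total == 0.0:
        return (0.0, 0.0)  # no intensity, no gradient
    centre_y, centre_x = (height - 1) / 2.0, (width - 1) / 2.0
    pos_y = sum(y * v for y, row in enumerate(window) for v in row) / total
    pos_x = sum(x * v for row in window for x, v in enumerate(row)) / total
    neg_y, neg_x = 2.0 * centre_y - pos_y, 2.0 * centre_x - pos_x
    return (pos_x - neg_x, pos_y - neg_y)

# A vertical edge: dark on the left, bright on the right.
edge = [[0, 0, 10],
        [0, 0, 10],
        [0, 0, 10]]
flat = [[1, 1, 1],
        [1, 1, 1],
        [1, 1, 1]]
edge_gradient = centroid_gradient(edge)   # points towards the bright side
flat_gradient = centroid_gradient(flat)   # zero vector for a uniform patch
```

Because the centroids are weighted averages over the whole window, the resulting vector has sub-pixel resolution and averages out pixel noise, which is consistent with the noise-resistance claim in the text.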
The DeGraF features form the basis for two new approaches that perform dense
3D reconstruction from a single vehicle-mounted camera. The first approach tracks
DeGraF features in real-time while performing image stabilisation with minimal
computational cost. This means that despite camera vibration the algorithm can
accurately predict the real-world coordinates of each image pixel in real-time by comparing
each motion-vector to the ego-motion vector of the vehicle. The performance
of this approach has been compared to different 3D reconstruction methods in order
to determine their accuracy, depth-map density, noise-resistance and computational
complexity. The second approach proposes the use of local frequency analysis of
gradient features for estimating relative depth. This novel method is based on the
fact that DeGraF gradients can accurately measure local image variance with subpixel
accuracy. It is shown that the local frequency by which the centroid oscillates
around the gradient window centre is proportional to the depth of each gradient
centroid in the real world. The lower computational complexity of this methodology
comes at the expense of depth map accuracy as the camera velocity increases, but
it is at least five times faster than the other evaluated approaches.
This work also proposes a novel technique for deriving visual saliency maps by
using Division of Gaussians (DIVoG). In this context, saliency maps express how
different each image pixel is from its surrounding pixels across multiple pyramid
levels. This approach is shown to be both fast and accurate when evaluated against
other state-of-the-art approaches. Subsequently, the saliency information is combined
with depth information to identify salient regions close to the host vehicle.
The fused map allows faster detection of high-risk areas where obstacles are likely
to exist. As a result, existing object detection algorithms, such as the Histogram of
Oriented Gradients (HOG) can execute at least five times faster.
In conclusion, through a step-wise approach computationally-expensive algorithms
have been optimised or replaced by novel methodologies to produce a fast object
detection system that is aligned to the requirements of the automotive domain
On artefact reduction, segmentation and classification of 3D computed tomography imagery in baggage security screening
This work considers novel image-processing and computer-vision techniques to
advance the automated analysis of low-resolution, complex 3D volumetric Computed
Tomography (CT) imagery obtained in the aviation-security-screening domain.
Novel research is conducted in three key areas: image quality improvement,
segmentation and classification.
A sinogram-completion Metal Artefact Reduction (MAR) technique is presented.
The presence of multiple metal objects in the scanning Field of View
(FoV) is accounted for via a distance-driven weighting scheme. The technique is
shown to perform comparably to the state-of-the-art medical MAR techniques in
a quantitative and qualitative comparative evaluation.
A materials-based technique is proposed for the segmentation of unknown objects
from low-resolution, cluttered volumetric baggage-CT data. Initial coarse
segmentations, generated using dual-energy techniques, are refined by partitioning
at automatically-detected regions. Partitioning is guided by a novel random-forest-based
quality metric (trained to recognise high-quality, single-object segments). A
second segmentation-quality measure is presented for quantifying the quality of
full segmentations. In a comparative evaluation, the proposed method is shown to
produce similar-quality segmentations to the state-of-the-art at reduced processing
times.
A codebook model constructed using an Extremely Randomised Clustering
(ERC) forest for feature encoding, a dense-feature-sampling strategy and a Support
Vector Machine (SVM) classifier is presented. The model is shown to offer
improvements in accuracy over the state-of-the-art 3D visual-cortex model at reduced
processing times, particularly in the presence of noise and artefacts.
The overall contribution of this work is a novel, fully-automated and efficient
framework for the classification of objects in cluttered 3D baggage-CT imagery. It
extends the current state-of-the-art by improving classification performance in the
presence of noise and artefacts; by automating the previously-manual isolation of
objects and by decreasing processing times by several orders of magnitude
Effective temporal change detection in low altitude aerial imagery: using 3D structure and colour to detect scene change in models generated from 2D imagery.
Unmanned Aerial Vehicles (UAVs) are now commonplace and their sensor solutions are
producing ever increasing volumes of data. Typically the data is based around the theme
of remote sensing of the Earth, and is gathered by a multitude of sensors for differing
applications. The requirement to process the data gathered into useful information grows
as does the demand for intelligent systems to assist with this. The most common, cost
effective and readily available sensor solution is through standard camera photography,
and offers the most usable data format without specialist tools. This also allows for proven
methods to process the data gathered by a UAV through image processing and
computer vision. One consistent theme in computer vision research is the drive for the
ability to accurately reconstruct 3D scenes from 2D imagery through the process of
Structure from Motion (SfM). This thesis details the research into the use of this 3D
imagery, specifically aiding the ability to detect temporal change in dynamic scenes. This
work presents a new technique to increase probability of detection and reduce
computation required for such a process, the 3D Structure and Colour (3DSAC)
differencing technique. The technique also provides a visualisation capability that makes best
use of the algorithm for additional end-user analysis beyond the purely mathematical. Three
scenarios containing complex non-uniform changes are presented, which assess and
validate the capability of this technique to cope with dynamic scenes. The weighted
3DSAC algorithm lets the end user configure whether more emphasis is placed
on structural or colour changes. Finally, through the implementation and
evaluation of other current state of the art techniques for describing 3D points, the
research shows the 3DSAC technique is more performant with imagery gathered by low
altitude UAVs.
Engineering and Physical Sciences (EPSRC)
PhD in Aerospac
Automatic Rain Drop Detection for Improved Sensing in Automotive Computer Vision Applications
The presence of raindrop induced distortion can have a significant negative impact
on computer vision applications. Here we address the problem of visual raindrop
distortion in standard colour video imagery for use in non-static, automotive
computer vision applications where the scene can be observed to be changing over
subsequent consecutive frames. We utilise current state of the art research
into salience mapping as a means of initial detection
of potential raindrop candidates. We further expand on this prior state of the art
work to construct a combined feature rich descriptor of shape information (Hu
moments), isolation of raindrop pixel information from context, and texture
(saliency derived) within an improved visual bag of words verification framework.
Support Vector Machine and Random Forest classification were utilised for
verification of potential candidates, and the effects of increasing discrete cluster
centre counts on detection rates were studied.
This novel approach of utilising extended shape information, isolation of context,
and texture, along with increasing cluster counts, achieves a notable 13% increase
in precision (92%) and 10% increase in recall (86%) against prior state of the art.
False positive rates were also observed to decrease with a minimal false positive
rate of 14% observed
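
For reference, the three reported measures follow the standard definitions sketched below. The confusion-matrix counts in the example are invented purely to exercise the function and are not the figures behind the percentages quoted in the abstract.

```python
def detection_metrics(tp, fp, fn, tn):
    """Return (precision, recall, false_positive_rate) from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0   # fraction of detections that are correct
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # fraction of true raindrops detected
    fpr = fp / (fp + tn) if (fp + tn) else 0.0         # fraction of non-raindrops flagged
    return precision, recall, fpr

# Hypothetical counts for illustration only.
precision, recall, fpr = detection_metrics(tp=8, fp=2, fn=2, tn=8)
```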
Distributed scene reconstruction from multiple mobile platforms
Recent research on mobile robotics has produced new designs that provide
house-hold robots with omnidirectional motion. The image sensor embedded
in these devices motivates the application of 3D vision techniques on them
for navigation and mapping purposes. In addition to this, distributed cheap-sensing
systems acting as a unitary entity have recently been discovered as an
efficient alternative to expensive mobile equipment.
In this work we present an implementation of a visual reconstruction method,
structure from motion (SfM), on a low-budget, omnidirectional mobile platform,
and extend this method to distributed 3D scene reconstruction with
several instances of such a platform.
Our approach overcomes the challenges posed by the platform. The unprecedented
levels of noise produced by the image compression typical of
the platform are handled by our feature filtering methods, which ensure
suitable feature matching populations for epipolar geometry estimation by
means of a strict quality-based feature selection. The robust pose estimation
algorithms implemented, along with a novel feature tracking system,
enable our incremental SfM approach to novelly deal with ill-conditioned
inter-image configurations provoked by the omnidirectional motion. The
feature tracking system developed efficiently manages the feature scarcity
produced by noise and outputs quality feature tracks, which allow robust
3D mapping of a given scene even if - due to noise - their length is shorter
than what is usually assumed for performing stable 3D reconstructions.
The distributed reconstruction from multiple instances of SfM is attained
by applying loop-closing techniques. Our multiple reconstruction system
merges individual 3D structures and resolves the global scale problem with
minimal overlaps, whereas in the literature 3D mapping is obtained by overlapping
stretches of sequences. The performance of this system is demonstrated
in the 2-session case.
The management of noise, the stability against ill-configurations and the
robustness of our SfM system are validated in a number of experiments and
compared with state-of-the-art approaches. Possible future research areas
are also discussed
The development of novel adjuncts to aid in the diagnosis of Epithelial Misplacement
Epithelial Misplacement (EM) is a benign phenomenon that occurs within polyps
most commonly associated with the sigmoid colon. It is brought about by
the colon's convulsive nature, which forces a polyp's surface epithelium into its
submucosa and also causes bleeding.
This is problematic as the Bowel Cancer Screening Programme (BCSP) uses
positive Faecal Occult Blood (FOB) test results to identify patients that require
pathological review. As EM polyps bleed, they get selected for assessment and this
results in them being sectioned and stained. In these cross sections, submucosal
glandular tissue will be found that looks like it has formed due to metastatic
mechanisms. This can lead to ambiguous diagnoses that will cause some patients
to undergo unnecessary surgery.
It is postulated that this can be prevented if the continuity of the EM samples could
be measured. This is because only in the EM cases will the submucosal epithelial
tissue remain in continuity with the surface. To test this, volumes representative of
9 samples of cancer and 13 cases of EM were segmented and the number of 26-connected
three-dimensional (3D) components in each was recorded. These counts were used
with the 99% confidence limits of the two-tailed Mann-Whitney U statistic to
test the null hypothesis that the cancer cases were as connected as the EM
samples. In this instance, no significant differences were found and so the benefit
of measuring the connectivity of these pathologies is questionable.
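
The connectivity test above can be sketched in two steps: counting connected components in a binary volume, then comparing the two groups of counts with the Mann-Whitney U statistic. The sketch assumes that the text refers to 26-connectivity (voxels sharing a face, edge or corner), represents a volume as nested lists, and computes only U itself, not the confidence limits. Function names and toy data are hypothetical.

```python
from collections import deque
from itertools import product

def count_components_26(volume):
    """Count 26-connected foreground components in volume[z][y][x]."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    offsets = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]
    seen, count = set(), 0
    for start in product(range(nz), range(ny), range(nx)):
        if not volume[start[0]][start[1]][start[2]] or start in seen:
            continue
        count += 1                      # new component: flood-fill it
        seen.add(start)
        queue = deque([start])
        while queue:
            cz, cy, cx = queue.popleft()
            for dz, dy, dx in offsets:
                n = (cz + dz, cy + dy, cx + dx)
                if (0 <= n[0] < nz and 0 <= n[1] < ny and 0 <= n[2] < nx
                        and volume[n[0]][n[1]][n[2]] and n not in seen):
                    seen.add(n)
                    queue.append(n)
    return count

def mann_whitney_u(a, b):
    """Return min(U_a, U_b), using average ranks for tied values."""
    pooled = sorted(list(a) + list(b))
    rank_of, i = {}, 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2.0   # mean of ranks i+1 .. j
        i = j
    rank_sum_a = sum(rank_of[v] for v in a)
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2.0
    u_b = len(a) * len(b) - u_a
    return min(u_a, u_b)

# Two voxels touching only at a corner form one 26-connected component.
diagonal = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
n_diagonal = count_components_26(diagonal)
u_separated = mann_whitney_u([1, 2, 3], [4, 5, 6])   # fully separated groups
u_identical = mann_whitney_u([1, 2], [1, 2])         # identical groups
```

A U of zero indicates complete separation of the two groups, while values near half the product of the group sizes indicate heavy overlap, the situation the abstract reports for the connectivity counts.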
It was because of this that Immunohistochemical (IHC) alternatives were
considered. It was found that Collagen IV antibody staining correctly differentiated
nine samples of EM from ten cases of cancer. The Mann-Whitney U statistic found
this to be highly significant, p < 0.001, and future investigations should concentrate
on automating this analysis.
Although Collagen IV provided a good classification, it relied upon the subjective
assessment of a pathologist. Therefore, the use of epithelial specific IR spectra was
also investigated and this enabled the eleven EM and nine cancer cases that were
investigated to be accurately classified 80% of the time upon cross validation. The
collection of epithelial specific spectra relied upon a novel digital staining
technique that has much application within future research.
This study demonstrates that the intermodal registration of complementary
modalities is of benefit to the disease classification problem. This technique has
potential to be used in the correct identification of EM but more work is required
Completing unknown portions of 3D scenes by 3D visual propagation
Institute of Perception, Action and Behaviour
As the requirement for more realistic 3D environments is pushed forward by the computer {graphics | movie | simulation | games} industry, attention turns away from the creation of purely synthetic, artist derived environments towards the use of real world captures from the 3D world in which we live.
However, common 3D acquisition techniques, such as laser scanning and stereo capture, are realistically only 2.5D in nature - such that the backs and occluded portions of objects cannot be realised from a single uni-directional viewpoint. Although multi-directional capture has existed for some time, this incurs additional temporal and computational cost with no existing guarantee that the resulting acquisition will be free of minor holes, missing surfaces and the like.
Drawing inspiration from the study of human abilities in 3D visual completion, we consider the automated completion of these hidden or missing portions in 3D scenes originally acquired from 2.5D (or 3D) capture. We propose an approach based on the visual propagation of available scene knowledge from the known (visible) scene areas to these unknown (invisible) 3D regions (i.e. the completion of unknown volumes via visual propagation - the concept of volume completion).
Our proposed approach uses a combination of global surface fitting, to derive an initial underlying geometric surface completion, together with a 3D extension of nonparametric texture synthesis in order to provide the propagation of localised structural 3D surface detail (i.e. surface relief). We further extend our technique both to the combined completion of 3D surface relief and colour and additionally to hierarchical surface completion that offers both improved structural results and computational efficiency gains over our initial non-hierarchical technique.
To validate the success of these approaches we present the completion and extension of numerous 2.5D (and 3D) surface examples with relief ranging in natural, man-made, stochastic, regular and irregular forms. These results are evaluated both subjectively within our definition of plausible completion and quantitatively by statistical analysis in the geometric and colour domains
- …
