    Extending computer vision techniques to recognition problems in 3d volumetric baggage imagery

    We investigate the application of computer vision techniques to rigid object recognition in Computed Tomography (CT) security scans of baggage items. This imagery is of poor resolution and is complex in nature: items of interest can be imaged in any orientation and copious amounts of clutter, noise and artefacts are prevalent. We begin with a novel 3D extension to the seminal SIFT keypoint descriptor that is evaluated through specific instance recognition in the volumetric data. We subsequently compare the performance of the SIFT descriptor against a selection of alternative descriptor methodologies. We demonstrate that the 3D SIFT descriptor is notably outperformed by simpler descriptors which appear to be more suited for use in noise and artefact-prone CT imagery. Rigid object class recognition in 3D volumetric baggage data has received little attention in prior work. We contrast a traditional approach derived from interest point descriptors with a novel technique based on modelling the primary components of the primate visual cortex. We initially demonstrate class recognition through the implementation of a codebook approach. A variety of aspects relating to codebook generation are investigated (codebook size, assignment method) using a range of feature descriptors. Recognition of a number of object classes is performed and the results show that the choice of descriptor is a critical aspect. Finally, we present a unique extension to the established standard model of the visual cortex: a volumetric implementation. The visual cortex model comprises a hierarchical structure of alternating simple and complex operations that has demonstrated excellent class recognition results using 2D imagery. We derive 3D extensions to each layer in the hierarchy, resulting in class recognition results that significantly outperform those achieved using the earlier traditional codebook approach.
Overall we present several novel solutions to object recognition within 3D CT security images that are supported by strong statistical results.
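
As a rough illustration of the codebook idea mentioned in the abstract above, the following minimal sketch encodes a set of feature descriptors as a histogram over codewords using hard assignment (each descriptor votes for its nearest codeword). The function names, the Euclidean distance and the toy two-dimensional descriptors are assumptions made for illustration; the abstract does not specify the thesis's actual descriptors, codebook construction or assignment methods.

```python
import math


def nearest_codeword(descriptor, codebook):
    """Return the index of the codeword closest to the descriptor (Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: math.dist(descriptor, codebook[i]))


def encode(descriptors, codebook):
    """Build an L1-normalised histogram of hard codeword assignments."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        hist[nearest_codeword(d, codebook)] += 1.0
    total = sum(hist)
    # An empty descriptor set yields an all-zero histogram.
    return [h / total for h in hist] if total else hist


# Toy example (hypothetical values): two codewords, three descriptors.
codebook = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(1.0, 0.0), (0.0, 1.0), (9.0, 9.0)]
histogram = encode(descriptors, codebook)
```

The resulting fixed-length histogram is what a classifier would typically consume; a soft-assignment variant would spread each vote over several nearby codewords instead.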

    Self localization and mapping using optical and thermal imagery

    A mobile robot that starts from an unknown position in an unknown environment and is tasked with exploring its surroundings has to be able to build an environmental map and localize itself inside that map. Solving this problem allows the exploration of areas that can be dangerous or inaccessible for humans. In our implementation we use two primary sensors for the environment exploration: an optical and a thermal camera. Prior work on the combined use of optical and thermal sensors for the Simultaneous Localization And Mapping (SLAM) problem is limited. The innovative aspect of this work is the use of a secondary thermal camera as an additional visual sensor for navigation under varying environmental conditions. A secondary innovative aspect is that we treat the two cameras as separate and independent sensors and combine their information in the final stage of environmental mapping. During the mobile robot navigation the two cameras capture images of the environment and SURF feature points are extracted and matched between successive scenes. Using prior work on bearing-only SLAM as a reference, a feature initialization method is implemented that allows each new good candidate feature (optical or thermal) to be initialized with a sum of Gaussians that represents a set of possible spatial positions of the detected feature. Using successive observations, it is possible to estimate the environment coordinates of the feature and add it to the Extended Kalman Filter (EKF) dynamic state vector. The EKF state vector contains the position of the 6-degree-of-freedom mobile robot and the environmental landmark coordinates. This information is maintained by the EKF algorithm, a statistical method that predicts the state vector and updates it based on the available sensor information.
The final methodology is tested in indoor and outdoor environments with several different light conditions and robot trajectories, producing results that are robust in terms of noise in the images and in other sensor data (i.e. encoders and GPS). The use of the thermal camera increases the number of landmarks detected during navigation, adding useful information about the explored area.
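
The predict-then-update cycle described in the abstract above can be sketched with a deliberately simplified scalar, linear Kalman filter. This is not the thesis's filter: the real system is an Extended Kalman Filter over a 6-degree-of-freedom pose plus landmark coordinates, which requires Jacobians of nonlinear motion and observation models. The function names and numeric values below are assumptions chosen only to show the structure of the two steps.

```python
def predict(x, p, u, q):
    """Prediction step: move the estimate by control input u and
    inflate its variance by the process noise variance q."""
    return x + u, p + q


def update(x, p, z, r):
    """Update step: blend the predicted estimate with a direct
    observation z whose measurement noise variance is r."""
    k = p / (p + r)            # Kalman gain: how much to trust the observation
    x_new = x + k * (z - x)    # correct the estimate by the weighted innovation
    p_new = (1.0 - k) * p      # the variance shrinks after incorporating z
    return x_new, p_new


# Hypothetical one-dimensional example: odometry says "moved 1.0",
# then a sensor observes position 2.0.
x_pred, p_pred = predict(x=0.0, p=1.0, u=1.0, q=0.5)
x_est, p_est = update(x_pred, p_pred, z=2.0, r=0.5)
```

In an EKF the same two steps apply, with the scalar arithmetic replaced by matrix operations and the models linearised around the current estimate.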

    Real-time object detection using monocular vision for low-cost automotive sensing systems

    This work addresses the problem of real-time object detection in automotive environments using monocular vision. The focus is on real-time feature detection, tracking, depth estimation using monocular vision and finally, object detection by fusing visual saliency and depth information. Firstly, a novel feature detection approach is proposed for extracting stable and dense features even in images with very low signal-to-noise ratio. This methodology is based on image gradients, which are redefined to take account of noise as part of their mathematical model. Each gradient is based on a vector connecting a negative to a positive intensity centroid, where both centroids are symmetric about the centre of the area for which the gradient is calculated. Multiple gradient vectors define a feature with its strength being proportional to the underlying gradient vector magnitude. The evaluation of the Dense Gradient Features (DeGraF) shows superior performance over other contemporary detectors in terms of keypoint density, tracking accuracy, illumination invariance, rotation invariance, noise resistance and detection time. The DeGraF features form the basis for two new approaches that perform dense 3D reconstruction from a single vehicle-mounted camera. The first approach tracks DeGraF features in real-time while performing image stabilisation with minimal computational cost. This means that despite camera vibration the algorithm can accurately predict the real-world coordinates of each image pixel in real-time by comparing each motion-vector to the ego-motion vector of the vehicle. The performance of this approach has been compared to different 3D reconstruction methods in order to determine their accuracy, depth-map density, noise-resistance and computational complexity. The second approach proposes the use of local frequency analysis of gradient features for estimating relative depth.
This novel method is based on the fact that DeGraF gradients can accurately measure local image variance with subpixel accuracy. It is shown that the local frequency by which the centroid oscillates around the gradient window centre is proportional to the depth of each gradient centroid in the real world. The lower computational complexity of this methodology comes at the expense of depth map accuracy as the camera velocity increases, but it is at least five times faster than the other evaluated approaches. This work also proposes a novel technique for deriving visual saliency maps by using Division of Gaussians (DIVoG). In this context, saliency maps express how different each image pixel is from its surrounding pixels across multiple pyramid levels. This approach is shown to be both fast and accurate when evaluated against other state-of-the-art approaches. Subsequently, the saliency information is combined with depth information to identify salient regions close to the host vehicle. The fused map allows faster detection of high-risk areas where obstacles are likely to exist. As a result, existing object detection algorithms, such as the Histogram of Oriented Gradients (HOG), can execute at least five times faster. In conclusion, through a step-wise approach, computationally-expensive algorithms have been optimised or replaced by novel methodologies to produce a fast object detection system that is aligned to the requirements of the automotive domain.

    On artefact reduction, segmentation and classification of 3D computed tomography imagery in baggage security screening

    This work considers novel image-processing and computer-vision techniques to advance the automated analysis of low-resolution, complex 3D volumetric Computed Tomography (CT) imagery obtained in the aviation-security-screening domain. Novel research is conducted in three key areas: image quality improvement, segmentation and classification. A sinogram-completion Metal Artefact Reduction (MAR) technique is presented. The presence of multiple metal objects in the scanning Field of View (FoV) is accounted for via a distance-driven weighting scheme. The technique is shown to perform comparably to the state-of-the-art medical MAR techniques in a quantitative and qualitative comparative evaluation. A materials-based technique is proposed for the segmentation of unknown objects from low-resolution, cluttered volumetric baggage-CT data. Initial coarse segmentations, generated using dual-energy techniques, are refined by partitioning at automatically-detected regions. Partitioning is guided by a novel random-forest-based quality metric (trained to recognise high-quality, single-object segments). A second segmentation-quality measure is presented for quantifying the quality of full segmentations. In a comparative evaluation, the proposed method is shown to produce similar-quality segmentations to the state-of-the-art at reduced processing times. A codebook model constructed using an Extremely Randomised Clustering (ERC) forest for feature encoding, a dense-feature-sampling strategy and a Support Vector Machine (SVM) classifier is presented. The model is shown to offer improvements in accuracy over the state-of-the-art 3D visual-cortex model at reduced processing times, particularly in the presence of noise and artefacts. The overall contribution of this work is a novel, fully-automated and efficient framework for the classification of objects in cluttered 3D baggage-CT imagery.
It extends the current state-of-the-art by improving classification performance in the presence of noise and artefacts, by automating the previously-manual isolation of objects and by decreasing processing times by several orders of magnitude.

    Effective temporal change detection in low altitude aerial imagery: using 3D structure and colour to detect scene change in models generated from 2D imagery.

    Unmanned Aerial Vehicles (UAVs) are now commonplace and their sensor solutions are producing ever increasing volumes of data. Typically the data is based around the theme of remote sensing of the Earth, and is gathered by a multitude of sensors for differing applications. The requirement to process the gathered data into useful information grows, as does the demand for intelligent systems to assist with this. The most common, cost-effective and readily available sensor solution is standard camera photography, which offers the most usable data format without specialist tools. It also allows proven methods from image processing and computer vision to be applied to the data gathered by a UAV. One consistent theme in computer vision research is the drive to accurately reconstruct 3D scenes from 2D imagery through the process of Structure from Motion (SfM). This thesis details research into the use of this 3D imagery, specifically to aid the detection of temporal change in dynamic scenes. This work presents a new technique that increases the probability of detection and reduces the computation required for such a process: the 3D Structure and Colour (3DSAC) differencing technique. The technique also provides a visualisation capability that uses the algorithm's output to support additional end-user analysis beyond the numerical results. Three scenarios with complex non-uniform changes are presented, which assess and validate the technique's capability to cope with dynamic scenes. The weighted 3DSAC algorithm lets the end user configure whether more emphasis is placed on structural or colour changes. Finally, through the implementation and evaluation of other current state-of-the-art techniques for describing 3D points, the research shows that the 3DSAC technique performs better with imagery gathered by low-altitude UAVs. Engineering and Physical Sciences (EPSRC) PhD in Aerospac

    Automatic Rain Drop Detection for Improved Sensing in Automotive Computer Vision Applications

    The presence of raindrop-induced distortion can have a significant negative impact on computer vision applications. Here we address the problem of visual raindrop distortion in standard colour video imagery for use in non-static, automotive computer vision applications where the scene can be observed to be changing over consecutive frames. We utilise current state-of-the-art research into salience mapping as a means of initial detection of potential raindrop candidates. We further expand on this prior state-of-the-art work to construct a combined feature-rich descriptor of shape information (Hu moments), isolation of raindrop pixel information from context, and texture (saliency derived) within an improved visual bag-of-words verification framework. Support Vector Machine and Random Forest classification were utilised for verification of potential candidates, and the effects of increasing discrete cluster centre counts on detection rates were studied. This novel approach of utilising extended shape information, isolation of context, and texture, along with increasing cluster counts, achieves a notable 13% increase in precision (92%) and 10% increase in recall (86%) against prior state of the art. False positive rates were also observed to decrease, with a minimum false positive rate of 14% observed.
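
For readers less familiar with the metrics quoted in the abstract above, the sketch below computes precision and recall from detection counts using their standard definitions. The counts are invented purely for illustration and are not taken from the paper; the function name is likewise an assumption.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true-positive, false-positive
    and false-negative counts, guarding against empty denominators."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts: 8 raindrops correctly detected, 2 false alarms,
# 2 raindrops missed.
precision, recall = precision_recall(tp=8, fp=2, fn=2)
```

Precision measures how many reported detections are genuine, while recall measures how many genuine raindrops are found, so a verification stage that rejects false candidates mainly raises precision.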

    Distributed scene reconstruction from multiple mobile platforms

    Recent research on mobile robotics has produced new designs that provide household robots with omnidirectional motion. The image sensor embedded in these devices motivates the application of 3D vision techniques to them for navigation and mapping purposes. In addition, distributed cheap sensing systems acting as a unitary entity have recently been identified as an efficient alternative to expensive mobile equipment. In this work we present an implementation of a visual reconstruction method, structure from motion (SfM), on a low-budget, omnidirectional mobile platform, and extend this method to distributed 3D scene reconstruction with several instances of such a platform. Our approach overcomes the challenges posed by the platform. The unprecedented levels of noise produced by the image compression typical of the platform are handled by our feature filtering methods, which ensure suitable feature matching populations for epipolar geometry estimation by means of a strict quality-based feature selection. The robust pose estimation algorithms implemented, along with a novel feature tracking system, enable our incremental SfM approach to deal with ill-conditioned inter-image configurations provoked by the omnidirectional motion. The feature tracking system developed efficiently manages the feature scarcity produced by noise and outputs quality feature tracks, which allow robust 3D mapping of a given scene even if, due to noise, their length is shorter than is usually assumed for performing stable 3D reconstructions. The distributed reconstruction from multiple instances of SfM is attained by applying loop-closing techniques. Our multiple reconstruction system merges individual 3D structures and resolves the global scale problem with minimal overlaps, whereas in the literature 3D mapping is obtained by overlapping stretches of sequences. The performance of this system is demonstrated in the 2-session case.
The management of noise, the stability against ill-configurations and the robustness of our SfM system are validated in a number of experiments and compared with state-of-the-art approaches. Possible future research areas are also discussed.

    The development of novel adjuncts to aid in the diagnosis of Epithelial Misplacement

    Epithelial Misplacement (EM) is a benign phenomenon that occurs within polyps most commonly associated with the sigmoid colon. It is brought about by the colon's convulsive nature, which forces a polyp's surface epithelium into its submucosa and also causes bleeding. This is problematic as the Bowel Cancer Screening Programme (BCSP) uses positive Faecal Occult Blood (FOB) test results to identify patients that require pathological review. As EM polyps bleed, they are selected for assessment and are therefore sectioned and stained. In these cross sections, submucosal glandular tissue will be found that looks as if it has formed through metastatic mechanisms. This can lead to ambiguous diagnoses that will cause some patients to undergo unnecessary surgery. It is postulated that this can be prevented if the continuity of the EM samples could be measured, because only in the EM cases will the submucosal epithelial tissue remain in continuity with the surface. To test this, volumes representative of 9 samples of cancer and 13 cases of EM were segmented and the number of 26-connected three-dimensional (3D) components in each was recorded. These counts were used with the 99% confidence limits of the two-tailed Mann-Whitney U statistic to test the null hypothesis that the cancer cases were as connected as the EM samples. In this instance, no significant differences were found and so the benefit of measuring the connectivity of these pathologies is questionable. Because of this, Immunohistochemical (IHC) alternatives were considered. It was found that Collagen IV antibody staining correctly differentiated nine samples of EM from ten cases of cancer. The Mann-Whitney U statistic found this to be highly significant, p < 0.001, and future investigations should concentrate on automating this analysis. Although Collagen IV provided a good classification, it relied upon the subjective assessment of a pathologist.
Therefore, the use of epithelial-specific IR spectra was also investigated and this enabled the eleven EM and nine cancer cases that were investigated to be accurately classified 80% of the time upon cross validation. The collection of epithelial-specific spectra relied upon a novel digital staining technique that has much application within future research. This study demonstrates that the intermodal registration of complementary modalities is of benefit to the disease classification problem. This technique has potential to be used in the correct identification of EM but more work is required.
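
The connectivity measurement described in the abstract above relies on counting 3D connected components. The sketch below shows one common way to do this, assuming that the "26" refers to 26-connectivity, in which two voxels are neighbours if they share a face, an edge or a corner. The nested-list volume representation, the function name and the toy volume are assumptions for illustration, and the volume is assumed to be non-empty and rectangular; the thesis's own segmentation and counting pipeline is not described in the abstract.

```python
from collections import deque
from itertools import product

# The 26 neighbour offsets: every (dz, dy, dx) in {-1, 0, 1}^3 except the origin.
OFFSETS = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]


def count_components_26(volume):
    """Count 26-connected foreground components in a binary volume
    given as nested lists indexed volume[z][y][x]."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    seen = set()
    count = 0
    for z, y, x in product(range(depth), range(height), range(width)):
        if not volume[z][y][x] or (z, y, x) in seen:
            continue
        # Found an unvisited foreground voxel: flood-fill its component.
        count += 1
        seen.add((z, y, x))
        queue = deque([(z, y, x)])
        while queue:
            cz, cy, cx = queue.popleft()
            for dz, dy, dx in OFFSETS:
                nz, ny, nx = cz + dz, cy + dy, cx + dx
                if (0 <= nz < depth and 0 <= ny < height and 0 <= nx < width
                        and volume[nz][ny][nx] and (nz, ny, nx) not in seen):
                    seen.add((nz, ny, nx))
                    queue.append((nz, ny, nx))
    return count


# Toy 2x2x2 volume: two voxels touching only at a corner.
diagonal = [[[1, 0], [0, 0]],
            [[0, 0], [0, 1]]]
n_components = count_components_26(diagonal)
```

Under 26-connectivity the two corner-touching voxels form a single component, whereas under 6-connectivity (faces only) they would count as two, so the choice of connectivity directly affects the component counts being compared.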

    Completing unknown portions of 3D scenes by 3D visual propagation

    Institute of Perception, Action and Behaviour. As the requirement for more realistic 3D environments is pushed forward by the computer {graphics | movie | simulation | games} industry, attention turns away from the creation of purely synthetic, artist-derived environments towards the use of real-world captures from the 3D world in which we live. However, common 3D acquisition techniques, such as laser scanning and stereo capture, are realistically only 2.5D in nature, such that the backs and occluded portions of objects cannot be realised from a single uni-directional viewpoint. Although multi-directional capture has existed for some time, this incurs additional temporal and computational cost with no existing guarantee that the resulting acquisition will be free of minor holes, missing surfaces and the like. Drawing inspiration from the study of human abilities in 3D visual completion, we consider the automated completion of these hidden or missing portions in 3D scenes originally acquired from 2.5D (or 3D) capture. We propose an approach based on the visual propagation of available scene knowledge from the known (visible) scene areas to these unknown (invisible) 3D regions (i.e. the completion of unknown volumes via visual propagation: the concept of volume completion). Our proposed approach uses a combination of global surface fitting, to derive an initial underlying geometric surface completion, together with a 3D extension of nonparametric texture synthesis in order to provide the propagation of localised structural 3D surface detail (i.e. surface relief). We further extend our technique both to the combined completion of 3D surface relief and colour and additionally to hierarchical surface completion that offers both improved structural results and computational efficiency gains over our initial non-hierarchical technique.
To validate the success of these approaches we present the completion and extension of numerous 2.5D (and 3D) surface examples with relief ranging across natural, man-made, stochastic, regular and irregular forms. These results are evaluated both subjectively within our definition of plausible completion and quantitatively by statistical analysis in the geometric and colour domains.