Electronic Letters on Computer Vision and Image Analysis (ELCVIA - Universitat Autònoma de Barcelona)
    343 research outputs found

    Automatic Date Fruit Recognition Using Outlier Detection Techniques and Gaussian Mixture Models

    In this paper, we propose a method for automatically recognizing different date varieties. Outlier samples can significantly degrade recognition results, so we prune outliers from the samples of each variety separately using the Pruning Local Distance-based Outlier Factor (PLDOF) method. Samples of the same variety can differ noticeably in their visual characteristics and can therefore take several visual appearances. To account for this intra-variety variation, we model each variety with a Gaussian Mixture Model (GMM) in which each component corresponds to one visual appearance. The Expectation-Maximization (EM) algorithm is used for parameter estimation, and the Davies-Bouldin index is used to automatically and precisely estimate the number of components (i.e., appearances). Compared to related studies, the proposed method 1) is capable of recognizing samples despite the noticeable variation, in terms of maturity stage and hardness degree, within some varieties; 2) achieves a high recognition rate in spite of the presence of outlier samples; 3) is capable of distinguishing between highly confusable varieties; 4) is fully automatic, as it requires neither physical measurements nor human assistance. For testing purposes, we introduce a new benchmark that includes more varieties (11) than previous studies. Experiments show that our method significantly outperforms several methods, reaching a recognition rate of 97.8%.
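
    The abstract above selects the number of mixture components with the Davies-Bouldin index. As a minimal sketch of how that index is computed (not the authors' implementation), the function below scores a hard clustering; the function name, the toy points and the use of hard assignments (for example, each sample assigned to its most likely GMM component) are assumptions made for illustration. Lower values indicate more compact, better separated clusters.

```python
import math

def davies_bouldin(points, labels):
    """Davies-Bouldin index of a hard clustering (lower is better).

    Hypothetical helper: `points` is a list of equal-length tuples,
    `labels` gives one cluster id per point. Needs at least two
    clusters with distinct centroids.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    # Centroid of each cluster (coordinate-wise mean).
    centroids = {
        label: [sum(coord) / len(members) for coord in zip(*members)]
        for label, members in clusters.items()
    }
    # Scatter: mean distance of the members to their centroid.
    scatter = {
        label: sum(math.dist(p, centroids[label]) for p in members) / len(members)
        for label, members in clusters.items()
    }

    keys = list(clusters)
    total = 0.0
    for i in keys:
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in keys if j != i
        )
    return total / len(keys)

# Toy data (assumption): two tight groups far apart.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = davies_bouldin(points, [0, 0, 1, 1])  # groups match the geometry
bad = davies_bouldin(points, [0, 1, 0, 1])   # groups cut across it
```

    One way to use it would be to fit candidate models with different numbers of components and keep the one with the lowest index; the abstract does not state the exact selection procedure.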

    Probability-Possibility Theories Based Iris Biometric Recognition System

    The performance and robustness of iris-based recognition systems still suffer from imperfections in the biometric information. This paper attempts to address these imperfections, which are an important problem for real systems. We propose a new method for iris recognition based on uncertainty theories to handle imperfect iris features. Several factors degrade iris data in different ways, such as poor quality of the acquired pictures; partial occlusion of the iris region by light spots, lenses, eyeglasses, hair or eyelids; and adverse illumination and/or contrast. All of these factors are open problems in the field of iris recognition: they affect iris segmentation, feature extraction and decision making, and they appear as imperfections in the extracted iris features. The aim of our experiments is to model the variability and ambiguity in the iris data with the uncertainty theories. This paper illustrates the importance of these theories for modeling and/or treating the encountered imperfections. Several comparative experiments are conducted on two subsets of the CASIA-V4 iris image database, namely Interval and Synthetic. Compared to a typical iris recognition system relying on the uncertainty theories, experimental results show that our proposed model improves the iris recognition system in terms of Equal Error Rate (EER), Area Under the receiver operating characteristic Curve (AUC) and Accuracy Recognition Rate (ARR) statistics.
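
    The Equal Error Rate reported above is the operating point at which the false acceptance rate equals the false rejection rate. The sketch below approximates it by sweeping a threshold over the observed scores; the function name, the toy scores and the convention that a higher score means a better match are assumptions, not details taken from the paper.

```python
def equal_error_rate(genuine, impostor):
    """Approximate EER by sweeping thresholds over the observed scores.

    Assumption: higher score = more similar; a comparison is accepted
    when its score is >= the threshold.
    """
    best_gap, eer = float("inf"), None
    for threshold in sorted(set(genuine) | set(impostor)):
        # False rejection rate: genuine comparisons below the threshold.
        frr = sum(s < threshold for s in genuine) / len(genuine)
        # False acceptance rate: impostor comparisons at or above it.
        far = sum(s >= threshold for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer

# Hypothetical similarity scores with one overlap on each side.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.6]
eer = equal_error_rate(genuine_scores, impostor_scores)
```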

    Selection of Wavelet Basis Function for Image Compression – A Review

    Wavelets have been suggested as a platform for various tasks in image processing. The advantage of wavelets lies in their time-frequency resolution. The availability of different basis functions, in the form of different wavelets, has made wavelet analysis attractive for many applications. The performance of a particular technique depends on the wavelet coefficients obtained after applying the wavelet transform, and the coefficients for a specific input signal depend on the basis functions used in the transform. To this end, this paper presents different basis functions and their features. Because image compression has relied heavily on the wavelet transform for the past few decades, the basis function for image compression should be selected with care. This paper also presents the factors influencing the performance of image compression.
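
    To make concrete how the coefficients depend on the chosen basis, here is a minimal sketch of one level of the Haar transform, the simplest wavelet basis. The Haar choice, the function names and the toy signal are assumptions for illustration; the review covers other basis functions that are not reproduced here.

```python
import math

def haar_forward(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    scale = 1 / math.sqrt(2)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) * scale for a, b in pairs]  # low-pass: local averages
    detail = [(a - b) * scale for a, b in pairs]  # high-pass: local differences
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward exactly (up to floating-point rounding)."""
    scale = 1 / math.sqrt(2)
    out = []
    for a, d in zip(approx, detail):
        out += [(a + d) * scale, (a - d) * scale]
    return out

# Toy signal (assumption): mostly piecewise constant, one step inside a pair.
signal = [4, 4, 2, 2, 5, 7, 1, 1]
approx, detail = haar_forward(signal)
restored = haar_inverse(approx, detail)
```

    On this signal only one detail coefficient is non-zero, which is the kind of sparsity a compression scheme exploits; a different basis would distribute the energy differently.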

    Facial attributes recognition using computer vision to detect drowsiness and distraction in drivers

    Driving is an activity that requires a high degree of concentration from the driver, since the slightest negligence is sufficient to cause an accident with the consequent material and/or human losses. According to the most recent study published by the World Health Organization (WHO) in 2013, an estimated 1.25 million people died as a result of traffic accidents, whereas between 20 and 50 million survived but were left with chronic conditions. Many of these accidents are caused by what is known as inattention. This term covers different conditions such as distraction and drowsiness, which are precisely the ones that cause the most fatalities. Many publications and studies have tried to quantify the consequences of inattention (and its subtypes), but there is no exact number of accidents caused by inattention, since these studies have been carried out in different places and time frames and, therefore, under different conditions. Overall, it has been estimated that inattention causes between 25% and 75% of accidents and near-accidents. A study on drowsiness while driving in ten European countries found that fatigue risks increasing reaction time by 86% and that it is the fourth leading cause of death on Spanish roads. Distraction is also a major contributor to fatal accidents in Spain. According to the Directorate General of Traffic (DGT), distraction is the first violation found in fatal accidents, accounting for 13.15% of the cases. Overall, considering both distraction and drowsiness, the latest statistics on inattentive driving among Spanish drivers are alarming: inattention appears as the leading cause of fatalities (36%), well above excessive speed (21%) or alcohol consumption (11%). This PhD thesis is a direct consequence of the abovementioned figures, and its purpose is to provide mechanisms that help reduce the effects of driver inattention using computer vision techniques.
The extraction of facial attributes can be used to detect inattention robustly. Specifically, the research establishes a frame of reference to characterize distraction in drivers in order to provide solid foundations for future research [1]. Based on this research [1], an architecture based on the analysis of visual characteristics has been proposed, constructed and validated, using computer vision and machine learning techniques to detect both distraction and drowsiness [2]. It integrates several innovative elements in order to operate in a completely autonomous way for the robust detection of the main visual indicators that characterize both distraction and drowsiness in the driver: (1) a review of the role of computer vision technology applied to the development of monitoring systems to detect distraction [3]; (2) a face processing algorithm based on Local Binary Patterns (LBP) and Support Vector Machine (SVM) to detect facial attributes [4]; (3) a detection unit for the presence/absence of the driver using both a marker and a machine learning algorithm [2]; (4) a robust face tracking algorithm based on both the position of the camera and the face detection algorithm [2]; (5) a face alignment and normalization algorithm to improve the eyes state detection [3]; (6) driver drowsiness detection based on the eyes state detection over time [2]; (7) driver distraction detection based on the position of the head over time [2]. This architecture has been validated, firstly, on reference databases to test its individual modules and, secondly, with users in real environments, obtaining excellent results in both cases with a computational load suitable for embedded devices in vehicle environments [2]. In the tests performed in real-world settings, 16 drivers carried out several activities imitating different signs of sleepiness and distraction.
Overall, an accuracy of 93.11% is obtained considering all activities and all drivers [2]. Additionally, other contributions of this thesis have been experimentally validated in controlled settings, but are expected to be included in the abovementioned architecture: (1) a glasses detection algorithm applied prior to the detection of the eyes state [3] (the eyes state cannot be accurately obtained if the driver is wearing glasses or sunglasses [1]); (2) a face recognition and spoofing detection algorithm to identify the driver [5]; (3) extraction of physiological information (Heart Rate, Respiration Rate and Heart Rate Variability) from the user's face [6] (using this information, cognitive load and stress can be obtained [1]); (4) a real-time big data architecture to process a large number of relatively small-sized images [7]. Therefore, future work will include these points to complete the architecture.
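
The face processing step in this abstract relies on Local Binary Patterns. As a rough illustration of the basic 3x3 LBP operator only (the thesis may use a different variant, and the SVM stage is omitted), the sketch below encodes each interior pixel by comparing its eight neighbours with the centre value. The neighbour ordering, the function names and the tiny image are assumptions.

```python
# Neighbour offsets (row, col), clockwise from the top-left pixel.
# The bit ordering is a convention chosen for this sketch.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(img, r, c):
    """8-bit LBP code of the interior pixel (r, c) of a 2D list of intensities."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        # A neighbour at least as bright as the centre sets its bit.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist

# Hypothetical 3x3 patch: bright top row, darker elsewhere.
patch = [
    [9, 9, 9],
    [1, 5, 1],
    [1, 1, 1],
]
code = lbp_code(patch, 1, 1)
hist = lbp_histogram(patch)
```

A histogram like this, typically computed per image region and concatenated, is the kind of feature vector that could then be passed to a classifier such as an SVM.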

    Detail Enhanced Multi-Exposure Image Fusion Based On Edge Preserving Filters

    Recent computational photography techniques play a significant role in overcoming the limitations of standard digital cameras when handling the wide dynamic range of real-world scenes that contain both brightly and poorly illuminated areas. In many such techniques [1,2,3], it is often desirable to fuse details from images captured at different exposure settings while avoiding visual artifacts. One such technique is High Dynamic Range (HDR) imaging, which provides a way to recover radiance maps from photographs taken with conventional imaging equipment. HDR image composition requires knowledge of the exposure times and the Camera Response Function (CRF), which is needed to linearize the image data before combining Low Dynamic Range (LDR) exposures into an HDR image. One of the long-standing challenges in HDR imaging technology is the limited Dynamic Range (DR) of conventional display devices and printing technology. As a result, these devices are unable to reproduce the full DR. Although the DR can be reduced by tone-mapping, this comes at an unavoidable trade-off with increased computational cost. Therefore, it is desirable to maximize the information content of the scene synthesized from a set of multi-exposure images without computing an HDR radiance map or applying tone-mapping. This research attempts to develop a novel detail-enhanced multi-exposure image fusion approach based on texture features, which exploits the edge-preserving and intra-region smoothing properties of nonlinear diffusion filters based on Partial Differential Equations (PDE). With the captured multi-exposure image series, we first decompose the images into Base Layers (BLs) and Detail Layers (DLs) to extract sharp details and fine details, respectively. The magnitude of the gradient of the image intensity is used to encourage smoothness in homogeneous regions in preference to inhomogeneous regions.
In the next step, texture features of the BL (i.e., local range) are used to generate a decision mask that guides the fusion of BLs in a multi-resolution fashion. Finally, a well-exposed fused image is obtained that combines the fused BL and the DL at each scale across all the input exposures. The combination of edge-preserving filters with a Laplacian pyramid is shown to lead to texture detail enhancement in the fused image. Furthermore, a non-linear adaptive filter, which has a better response near strong edges, is employed for BL and DL decomposition. The texture details are then added to the fused BL to reconstruct a detail-enhanced LDR version of the image. This increases the robustness of the texture details while avoiding the gradient reversal artifacts near strong edges that may appear in the fused image after DL enhancement. Finally, we propose a novel technique for exposure fusion in which a Weighted Least Squares (WLS) optimization framework is used to refine the weight maps of BLs and DLs, which leads to a new, simple weighted average fusion framework. Computationally simple texture features (i.e., DL) and a color saturation measure are preferred for quickly generating weight maps that control the contribution of each image in the input set of multi-exposure images. Instead of employing intermediate HDR reconstruction and tone-mapping steps, a well-exposed fused image is generated for display on conventional display devices. Simulation results are compared with a number of existing single-resolution and multi-resolution techniques to show the benefits of the proposed scheme in a variety of cases. Moreover, the approaches proposed in this thesis are effective for blending flash and no-flash image pairs and multi-focus images, that is, input images photographed with and without flash and images focused on different targets, respectively. A further advantage of the present technique is that it is well suited for detail enhancement in the fused image.
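
The weighted average fusion mentioned above can be sketched as a per-pixel normalized blend of the input exposures. This is only an illustration of the final blending step under stated assumptions: the weight maps are supplied directly as toy values rather than derived from texture and saturation measures, no WLS refinement or pyramid is applied, and the function name and the `eps` guard are hypothetical.

```python
def fuse_exposures(exposures, weights, eps=1e-12):
    """Per-pixel weighted average of N same-sized grayscale images.

    `exposures` and `weights` are lists of 2D lists with identical shapes.
    `eps` (assumption) avoids division by zero where all weights vanish.
    """
    height, width = len(exposures[0]), len(exposures[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = sum(w[y][x] for w in weights) + eps
            blended = sum(w[y][x] * img[y][x] for img, w in zip(exposures, weights))
            fused[y][x] = blended / total
    return fused

# Toy 1x2 images (assumption): an under-exposed and an over-exposed frame.
under = [[0.2, 0.9]]
over = [[0.8, 0.1]]
w_under = [[1.0, 0.0]]  # second pixel judged unreliable in this frame
w_over = [[1.0, 1.0]]
fused = fuse_exposures([under, over], [w_under, w_over])
```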

    Contributions to the Problem of Fight Detection in Video

    While action detection has become an important line of research in computer vision, the detection of particular events such as violence, aggression or fights has received comparatively little attention. Such detection may be extremely useful in several video surveillance scenarios, such as psychiatric wards and prisons, or even in smartphone cameras. The clear practical applications have led to a surge of interest in developing violence detectors.

    Selection of relevant information to improve Image Classification using Bag of Visual Words

    One of the main challenges in computer vision is image classification. Nowadays the number of images increases exponentially every day; therefore, it is important to classify them in a reliable way. The conventional image classification pipeline usually consists of extracting local image features, encoding them as a feature vector and classifying it using a previously created model. With regard to feature encoding, the Bag of Words model and its extensions, such as pyramid matching and weighted schemes, have achieved quite good results and have become the state-of-the-art methods. The process described above is not perfect, and computers, as well as humans, may make mistakes in any of the steps, causing a drop in classification performance. Some of the primary sources of error in large-scale image classification are the presence of multiple objects in the image, small or very thin objects, incorrect annotations and fine-grained recognition tasks, among others. Based on those problems and the steps of a typical image classification pipeline, the motivation of this PhD thesis was to provide some guidelines for improving the quality of the extracted features in order to obtain better classification results. The contributions of the PhD thesis demonstrate how good feature selection can improve fine-grained classification, and that a large training data set may not even be needed to learn the key features of each class and to predict with good results.
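
    The encoding step of the pipeline described above can be sketched as follows: each local descriptor is assigned to its nearest visual word, and the image is represented by the normalized histogram of those assignments. The codebook, the descriptors and the function name are toy assumptions; in practice the codebook is usually learned by clustering, and the thesis's feature selection is not shown.

```python
import math

def bovw_histogram(descriptors, codebook):
    """Normalized Bag of Visual Words histogram.

    Each descriptor votes for its nearest codeword (Euclidean distance).
    """
    hist = [0] * len(codebook)
    for descriptor in descriptors:
        nearest = min(
            range(len(codebook)),
            key=lambda k: math.dist(descriptor, codebook[k]),
        )
        hist[nearest] += 1
    total = sum(hist)
    return [count / total for count in hist] if total else hist

# Hypothetical 2D descriptors and a two-word codebook.
codebook = [(0, 0), (10, 10)]
descriptors = [(1, 0), (0, 1), (9, 9), (1, 1)]
histogram = bovw_histogram(descriptors, codebook)
```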

    Video Processing for Remote Respiration Monitoring

    Monitoring of vital signs is a key tool in medical diagnostics. Among fundamental vital parameters, the Respiratory Rate (RR) plays an important role as an indicator of possible pathological events. For this reason, respiration needs to be carefully monitored in order to detect potential signs indicating possible changes of health conditions. In this work, novel techniques for the visualization and analysis of respiration by remote and non-invasive video monitoring, based on the study of breathing-related movements, are proposed. The lack of large video databases, associated with clinical data, essential for performance evaluation and optimization of the video processing-based algorithms, is also addressed; statistical models of respiration and apnea events are proposed together with proper simulators, useful to test the remote monitoring algorithms.

    Depth Data Error Modeling of the ZED 3D Vision Sensor from Stereolabs

    The ZED camera is a binocular vision system that can be used to provide a 3D perception of the world. It can be applied in autonomous robot navigation, virtual reality, tracking, motion analysis and so on. This paper proposes a mathematical error model for the depth data estimated by the ZED camera at its several operating resolutions. To do so, the ZED is attached to an Nvidia Jetson TK1 board, providing an embedded system that processes the raw data acquired by the ZED from a 3D checkerboard. Corners are extracted from the checkerboard using RGB data, and these points are reconstructed in 3D using disparity data calculated by the ZED camera, yielding a partially ordered point cloud of corners, regularly distributed in 3D space, whose coordinates are computed by the device software. The ideal world (3D) positions of these corners are also known with respect to the origin of a coordinate frame that is set empirically in the pattern. The given (computed) coordinates from the camera's data and the known (ideal) coordinates of each corner can thus be compared to estimate the error between the given and ideal point locations of the detected corner cloud. Subsequently, using a curve fitting technique, we obtain the equations that model the RMS (Root Mean Square) error. This procedure is repeated for several resolutions of the ZED sensor and at several distances. The results show that the sensor is most effective up to a maximum distance of approximately sixteen meters, in real time, which allows its use in robotic or other online applications.
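
    The comparison between computed and ideal corner positions described above reduces to an RMS error over matched 3D points. The sketch below shows that computation only; the function name and coordinates are made up for illustration, and the curve fitting of RMS error against distance is not shown because the abstract does not give the model form.

```python
import math

def rms_error(measured, ideal):
    """Root Mean Square of the Euclidean distances between matched 3D points."""
    squared = [math.dist(m, i) ** 2 for m, i in zip(measured, ideal)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical corner coordinates in meters (x, y, z).
measured_corners = [(0.0, 0.0, 1.1), (1.0, 0.0, 2.0)]
ideal_corners = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.2)]
error = rms_error(measured_corners, ideal_corners)
```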

    Efficient Iris and Eyelids Detection from Facial Sketch Images

    In this paper, we propose a simple yet effective method for automatic iris and eyelid detection in facial sketch images. Our system uses the Circular Hough Transformation (CHT) algorithm for iris localization and a low-level grayscale analysis for eyelid contour segmentation. We limit the input to facial sketch photos with frontal pose, invariant illumination, neutral expression and no occlusions. The CUHK and IIIT-D sketch databases are used to obtain the experimental results. To validate the proposed algorithm, experiments are conducted against ground truth for iris and eyelid segmentation prepared at our lab. The iris segmentation from the proposed method gives the best accuracy of 92.93 and 86.71 based on F-measure evaluation for IIIT-D and CUHK, respectively. For eyelid segmentation, on the other hand, the proposed algorithm achieves an average standard deviation of 4, which indicates the closeness of the proposed method to the ground truth.
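
    The F-measure used above to score iris segmentation can be computed pixel-wise from a predicted mask and a ground-truth mask. The sketch below assumes binary masks stored as nested lists and the balanced F1 form; the paper's exact evaluation protocol is not given in the abstract, and the masks here are invented.

```python
def f_measure(pred, truth):
    """Pixel-wise F1 score between two binary masks (2D lists of 0/1)."""
    tp = fp = fn = 0
    for row_pred, row_truth in zip(pred, truth):
        for p, t in zip(row_pred, row_truth):
            if p and t:
                tp += 1      # predicted iris, truly iris
            elif p and not t:
                fp += 1      # predicted iris, actually background
            elif t and not p:
                fn += 1      # missed iris pixel
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical 2x3 masks.
predicted = [[1, 1, 0], [0, 1, 0]]
ground_truth = [[1, 0, 0], [0, 1, 1]]
score = f_measure(predicted, ground_truth)
```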

    258 full texts, 343 metadata records