Electronic Letters on Computer Vision and Image Analysis (ELCVIA - Universitat Autònoma de Barcelona)
    343 research outputs found

    Modelling and Analysis of Facial Expressions Using Optical Flow Derived Divergence and Curl Templates

    Facial expressions are an integral part of non-verbal paralinguistic communication, as they provide cues significant in perceiving one’s emotional state. Assessment of emotions through expressions is an active research domain in computer vision due to its potential applications in multi-faceted domains. In this work, an approach is presented where facial expressions are modelled and analyzed with divergence and curl templates derived from dense optical flow, which embody the ideal motion pattern of facial features as an expression unfolds on the face. Two classification schemes, based on a multi-class support vector machine and k-nearest neighbour, are employed for evaluation. Promising results validate the efficiency of the approach. They come from a comparative analysis with state-of-the-art techniques on the Extended Cohn Kanade database, and with human cognition and the pre-trained Microsoft face application programming interface on the Karolinska Directed Emotional Faces database.
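
    As an illustration of the quantities named in this abstract, the following minimal sketch computes the divergence and curl of a dense 2-D flow field with finite differences. It is not the authors' implementation: the function name `divergence_and_curl`, the list-of-rows layout, and the synthetic test fields are assumptions made for the example, and the optical flow itself is taken as given.

```python
def divergence_and_curl(u, v):
    """Return (div, curl) grids for a 2-D flow field stored as lists of rows.

    u[y][x] is the horizontal flow component and v[y][x] the vertical one.
    Derivatives use central differences inside the grid and one-sided
    differences on the border. The grid must be at least 2 x 2.
    """
    h, w = len(u), len(u[0])

    def ddx(f, y, x):
        x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
        return (f[y][x1] - f[y][x0]) / (x1 - x0)

    def ddy(f, y, x):
        y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
        return (f[y1][x] - f[y0][x]) / (y1 - y0)

    # Divergence: du/dx + dv/dy (local expansion or contraction).
    div = [[ddx(u, y, x) + ddy(v, y, x) for x in range(w)] for y in range(h)]
    # Curl (z component): dv/dx - du/dy (local rotation).
    curl = [[ddx(v, y, x) - ddy(u, y, x) for x in range(w)] for y in range(h)]
    return div, curl


# Synthetic fields (assumed for illustration, not real facial motion).
size = 5
expand_u = [[float(x) for x in range(size)] for y in range(size)]   # u = x
expand_v = [[float(y) for x in range(size)] for y in range(size)]   # v = y
rotate_u = [[-float(y) for x in range(size)] for y in range(size)]  # u = -y
rotate_v = [[float(x) for x in range(size)] for y in range(size)]   # v = x

expand_div, expand_curl = divergence_and_curl(expand_u, expand_v)
rotate_div, rotate_curl = divergence_and_curl(rotate_u, rotate_v)
```

    A purely expanding field gives non-zero divergence and zero curl, while a purely rotating field gives the opposite. This is why the two maps capture complementary aspects of motion.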

    Investigation of Solar Flare Classification to Identify Optimal Performance

    A solar flare is a brief, intense brightening observed on the sun. Because solar flares consist of high-energy photons and particles, they produce strong electric fields and currents that can disrupt space-borne or ground-based technological systems. Extracting the important features of flares for prediction is also a challenging task. Convolutional Neural Networks have gained significant popularity in classification and localization tasks. This paper focuses on classifying solar flares that emerged in different years by stacking convolutional layers followed by max pooling layers. Following AlexNet, the pooling layer employed in this paper is overlapping pooling. In addition, two activation functions, ELU and CReLU, are used to investigate how many convolutional layers with a particular activation function give the best results on this dataset, since datasets in this domain are always small. The proposed investigation can be further used in novel solar prediction systems.
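
    The building blocks named in this abstract can be sketched in a few lines. The example below is an assumption-laden illustration, not the paper's network: it shows overlapping max pooling with a 3 x 3 window and stride 2 (the AlexNet setting), plus the ELU and CReLU activations, on plain Python lists. The function names and the toy input are hypothetical.

```python
import math


def elu(x, alpha=1.0):
    """Exponential Linear Unit: identity for x > 0, alpha * (e^x - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def crelu(values):
    """Concatenated ReLU: ReLU(x) for every input followed by ReLU(-x).

    The output is twice as long as the input, so both the positive and the
    negative part of each response are kept.
    """
    return [max(x, 0.0) for x in values] + [max(-x, 0.0) for x in values]


def max_pool2d(image, kernel=3, stride=2):
    """Max pooling without padding; windows overlap whenever stride < kernel."""
    h, w = len(image), len(image[0])
    pooled = []
    for top in range(0, h - kernel + 1, stride):
        row = []
        for left in range(0, w - kernel + 1, stride):
            row.append(max(image[y][x]
                           for y in range(top, top + kernel)
                           for x in range(left, left + kernel)))
        pooled.append(row)
    return pooled


# Toy 5 x 5 feature map with values 0..24 (assumed for illustration).
feature_map = [[y * 5 + x for x in range(5)] for y in range(5)]
pooled = max_pool2d(feature_map)          # [[12, 14], [22, 24]]
activated = crelu([1.0, -2.0])            # [1.0, 0.0, 0.0, 2.0]
```

    With a 3 x 3 window and stride 2, neighbouring windows share one row or column. That sharing is what "overlapping" means here.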

    Processing and Representation of Multispectral Images Using Deep Learning Techniques

    This thesis implements innovative computer vision techniques for visible and near-infrared spectrum images, applying deep learning through convolutional networks, especially GAN architectures, which specialize in generating information. It also includes meta-learning techniques to tackle the problem of determining the similarity of images from different spectra. With this type of convolutional network, different supervised and unsupervised techniques have been created to solve challenging problems, such as detecting the similarity of patches from different spectra (visible-infrared), colorizing near-infrared images, estimating a vegetation index (NDVI), and removing haze from RGB images using NIR images. For all these techniques, different variants of GAN networks, such as standard, conditional, stacked, and cyclic, have been used. A metric-based meta-learning approach has also been implemented. Together with the adversarial network models, the use of multiple loss functions is proposed to improve generalization and increase the effectiveness of the models. The experiments were performed with paired and unpaired images, for the supervised and unsupervised architectures respectively. The experimental results obtained with each of the approaches implemented in the doctoral work were shown to be more effective than state-of-the-art techniques.
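
    For readers unfamiliar with the vegetation index mentioned above, the sketch below evaluates the standard NDVI definition, (NIR - Red) / (NIR + Red), per pixel. It only illustrates the target quantity. The thesis estimates NDVI with GAN-based models, which are not reproduced here, and the function name `ndvi`, the `eps` guard, and the sample reflectance values are assumptions.

```python
def ndvi(nir, red, eps=1e-12):
    """Per-pixel Normalized Difference Vegetation Index.

    nir and red are equally sized grids (lists of rows) of non-negative
    reflectance values. Pixels whose NIR + Red sum is (almost) zero are set
    to 0.0 to avoid division by zero; that choice is an assumption.
    """
    return [
        [(n - r) / (n + r) if (n + r) > eps else 0.0
         for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir, red)
    ]


# Hypothetical reflectance values, chosen only to show the index range.
nir_band = [[0.5, 0.3], [0.0, 0.2]]
red_band = [[0.1, 0.3], [0.0, 0.6]]
index = ndvi(nir_band, red_band)
```

    Values near +1 indicate strong near-infrared reflectance relative to red, which is typical of healthy vegetation. Values near zero or below indicate bare surfaces, water, or shadow.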

    Recognition of Devanagari Scene Text Using Autoencoder CNN

    Scene text recognition is a well-rooted research domain covering a diverse application area. Recognition of scene text is challenging due to the complex nature of scene images. Various structural characteristics of the script also influence the recognition process. Text and background segmentation is a mandatory step in the scene text recognition process. A text recognition system produces the most accurate results if the structural and contextual information is preserved by the segmentation technique. Therefore, an attempt is made here to develop a robust foreground/background segmentation (separation) technique that produces the highest recognition results. A ground-truth dataset containing Devanagari scene text images is prepared for the experimentation. An encoder-decoder convolutional neural network model is used for text/background segmentation. The model is trained with Devanagari scene text images for pixel-wise classification of text and background. The segmented text is then recognized using an existing OCR engine (Tesseract). The word-level and character-level recognition rates are computed and compared with those of other existing segmentation techniques to establish the effectiveness of the proposed technique.

    Social Video Advertisement Replacement and its Evaluation in Convolutional Neural Networks

    This paper introduces a method that uses deep convolutional neural networks (CNNs) to automatically replace an advertisement (AD) photo in social (or self-media) videos, and provides a suitable evaluation method to compare different CNNs. An AD photo can replace a picture inside a video. However, if a human being occludes the replaced picture in the original video, the newly pasted AD photo will cover the part of the person that occluded it. A deep learning algorithm is implemented to segment the human being from the video. The segmented human pixels are then pasted back onto the occluded area, so that the AD photo replacement appears natural in the video. This process requires the predicted occlusion edge to be close to the ground truth occlusion edge, so that the AD photo can be occluded naturally. Therefore, this research introduces a curve fitting method to measure the error of the predicted occlusion edge. Using this method, three CNN methods are applied and compared for AD replacement: mask regions convolutional neural network (Mask RCNN), a recurrent network for video object segmentation (ROVS), and DeeplabV3. The experimental results show the comparative segmentation accuracy of the different models, with DeeplabV3 showing the best performance.

    Accuracy improvement of the inSAR quality-guided phase unwrapping based on a modified PDV map.

    In this paper, an accuracy improvement of the quality-guided phase unwrapping algorithm is proposed. Our proposal is based on a modified phase derivative variance (PDV), which provides more detail on local variations, especially for important patterns such as fringes and edges. Distorted regions may therefore be re-unwrapped according to this new, reliable PDV. The proposed improvement benefits not only accuracy but also running time: the obtained results show that the running time with our proposal is less than that of a skillful optimization-based algorithm. To prove effectiveness, the experimental test is carried out on simulated and real data, and the comparison is made under several relevant criteria.
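
    To make the quality map concrete, here is a minimal sketch of the conventional phase derivative variance computed from a wrapped phase image. The paper's modification is not described in this abstract and is not reproduced. The function names, the clipping of windows at the image border, and the normalization by the number of window samples are assumptions of this example.

```python
import math


def wrap(angle):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def phase_derivative_variance(phase, k=3):
    """Conventional PDV quality map for a wrapped phase grid (lists of rows).

    Wrapped forward differences are taken in x and y, and for each pixel the
    root of the summed squared deviations from the local mean is computed in
    a k x k window (clipped at the border). Low values mean reliable phase.
    The result has one row and one column fewer than the input.
    """
    h, w = len(phase), len(phase[0])
    dx = [[wrap(phase[y][x + 1] - phase[y][x]) for x in range(w - 1)]
          for y in range(h - 1)]
    dy = [[wrap(phase[y + 1][x] - phase[y][x]) for x in range(w - 1)]
          for y in range(h - 1)]
    r = k // 2

    def spread(d, cy, cx):
        vals = [d[y][x]
                for y in range(max(cy - r, 0), min(cy + r + 1, h - 1))
                for x in range(max(cx - r, 0), min(cx + r + 1, w - 1))]
        mean = sum(vals) / len(vals)
        return math.sqrt(sum((val - mean) ** 2 for val in vals)), len(vals)

    quality = []
    for cy in range(h - 1):
        row = []
        for cx in range(w - 1):
            sx, n = spread(dx, cy, cx)
            sy, _ = spread(dy, cy, cx)
            row.append((sx + sy) / n)
        quality.append(row)
    return quality


# A smooth wrapped ramp (reliable everywhere) and a phase step (unreliable
# along the discontinuity). Both inputs are synthetic.
ramp = [[wrap(0.5 * x + 0.3 * y) for x in range(8)] for y in range(8)]
step = [[0.0 if x < 4 else 2.0 for x in range(8)] for y in range(8)]
ramp_pdv = phase_derivative_variance(ramp)
step_pdv = phase_derivative_variance(step)
```

    In a quality-guided unwrapper, pixels are typically visited in order of increasing PDV, so smooth regions are unwrapped before regions near discontinuities.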

    Distilling Structure from Imagery: Graph-based Models for the Interpretation of Document Images

    From its early stages, the Pattern Recognition and Computer Vision community has considered the importance of leveraging structural information when understanding images. Graphs have usually been proposed as a suitable model to represent this kind of information due to their flexibility and representational power, which can codify both the components, objects, or entities and their pairwise relationships. Even though graphs have been successfully applied to a huge variety of tasks, their symbolic and relational nature has always imposed some limitations compared to statistical approaches. Indeed, some trivial mathematical operations have no equivalent in the graph domain. For instance, at the core of many pattern recognition applications there is a need to compare two objects. This operation, which is trivial for feature vectors defined in ℝⁿ, is not properly defined for graphs. In this thesis, we have investigated the importance of structural information from two perspectives: traditional graph-based methods and the new advances in Geometric Deep Learning. On the one hand, we explore the problem of defining a graph representation and how to deal with it in a large-scale and noisy scenario. On the other hand, Graph Neural Networks are proposed, first, to redefine Graph Edit Distance methodologies as a metric learning problem and, second, to apply them in a real use case: the detection of repetitive patterns that define tables in invoice documents. As an experimental framework, we have validated the different methodological contributions in the domain of Document Image Analysis and Recognition.

    Identification of Suitable Contrast Enhancement Technique for Improving the Quality of Astrocytoma Histopathological Images.

    Contrast enhancement plays an important part in image processing. In histology, applying a contrast enhancement technique is necessary since it can help pathologists diagnose sample slides by increasing the visibility of the morphological features of cells in an image. Various techniques have been proposed to enhance the contrast of microscopic images. This paper therefore studies the effectiveness of contrast enhancement techniques in enhancing Ki67 images of astrocytoma. Three contrast enhancement techniques, namely contrast stretching, histogram equalization, and CLAHE, were proposed to enhance the sample images. The performance of each technique was compared by computing seven quantitative measures. The CLAHE technique was preferred for enhancing the contrast of the astrocytoma images. This technique produces good results, especially in contrast enhancement, edge conservation and enhancement, brightness preservation, and minimal distortion of the enhanced images.
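
    Two of the three techniques compared in this abstract are simple enough to sketch directly. The example below shows linear contrast stretching and global histogram equalization for an 8-bit grayscale image held as lists of rows. CLAHE is omitted because it additionally requires tiling, histogram clipping, and interpolation. The function names and the tiny sample image are assumptions, and nothing here reproduces the paper's evaluation.

```python
def contrast_stretch(image, out_min=0, out_max=255):
    """Linearly map the darkest pixel to out_min and the brightest to out_max."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # flat image: nothing to stretch
        return [row[:] for row in image]
    scale = (out_max - out_min) / (hi - lo)
    return [[round(out_min + (p - lo) * scale) for p in row] for row in image]


def equalize_histogram(image, levels=256):
    """Global histogram equalization via the cumulative distribution function."""
    flat = [p for row in image for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1

    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(flat)
    if total == cdf_min:  # only one gray level present
        return [row[:] for row in image]

    # Lookup table that spreads the occupied gray levels over the full range.
    lut = [max(round((c - cdf_min) * (levels - 1) / (total - cdf_min)), 0)
           for c in cdf]
    return [[lut[p] for p in row] for row in image]


# Hypothetical low-contrast patch with gray levels squeezed into 100..130.
patch = [[100, 110], [120, 130]]
stretched = contrast_stretch(patch)     # [[0, 85], [170, 255]]
equalized = equalize_histogram(patch)   # [[0, 85], [170, 255]]
```

    On this patch the two methods agree because every gray level occurs exactly once. On real images they differ: stretching preserves the shape of the histogram, while equalization flattens it.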

    Scale Invariant Mask R-CNN for Pedestrian Detection

    Pedestrian detection is a challenging and active research area in computer vision. Recognizing pedestrians helps in various utility applications such as event detection in overcrowded areas and gender and gait classification. In this domain, the most recent research is based on instance segmentation using Mask R-CNN. Most pedestrian detection methods use features of different body portions to identify a person. This feature-based approach is not efficient enough to differentiate pedestrians in real time when the background is changing. To overcome this drawback, this paper proposes a combined approach for multiple pedestrian detection that pairs scale-invariant feature map generation, for detecting small pedestrians, with Mask R-CNN. A new database was created by recording the behavior of students at prominent places of an engineering institute. This database is comparatively new for pedestrian detection in the academic environment. The proposed Scale-invariant Mask R-CNN has been tested on the newly created database and compared with the Caltech [1], INRIA [2], MS COCO [3], ETH [4], and KITTI [5] databases. The experimental results show a significant performance improvement in pedestrian detection compared to existing approaches to pedestrian detection and instance segmentation. Finally, we conclude and investigate directions for future research.

    A review of movie recommendation system: Limitations, Survey and Challenges

    Recommendation systems are a major and popular research area that helps people make proper decisions. A recommendation system helps a user find beneficial information among the wide variety of data available. In a movie recommendation system, recommendation is based either on the similarity between users (Collaborative Filtering) or on a particular user’s activity and the content that user wants to engage with (Content Based Filtering). To overcome the limitations of each, a combination of collaborative and content-based filtering is generally used so that a better recommendation system can be developed. Various similarity measures are also used to find the similarity between users for recommendation. In this paper, we review different similarity measures. Various companies use recommendation systems to increase their profit and benefit their customers: Facebook recommends friends, LinkedIn recommends jobs, Pandora recommends music, Netflix recommends movies, and Amazon recommends products. This paper mainly concentrates on a brief review of the different techniques and methods for movie recommendation, so that research in recommendation systems can be explored.
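
    As a concrete example of user-to-user similarity in collaborative filtering, the sketch below computes cosine and Pearson similarity over the movies two users have both rated. The abstract does not list which measures the review covers, so these two common choices are an assumption, as are the function names, the user names, and the ratings.

```python
import math


def co_rated(a, b):
    """Items rated by both users, in a stable order."""
    return sorted(set(a) & set(b))


def cosine_similarity(a, b):
    """Cosine of the angle between two users' rating vectors on co-rated items."""
    items = co_rated(a, b)
    if not items:
        return 0.0
    dot = sum(a[i] * b[i] for i in items)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in items))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in items))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pearson_similarity(a, b):
    """Pearson correlation on co-rated items; removes each user's rating bias."""
    items = co_rated(a, b)
    if len(items) < 2:
        return 0.0
    mean_a = sum(a[i] for i in items) / len(items)
    mean_b = sum(b[i] for i in items) / len(items)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in items)
    den = (math.sqrt(sum((a[i] - mean_a) ** 2 for i in items))
           * math.sqrt(sum((b[i] - mean_b) ** 2 for i in items)))
    return num / den if den else 0.0


# Hypothetical ratings on a 1-5 scale; movie keys are placeholders.
alice = {"movie_a": 5, "movie_b": 3, "movie_c": 4}
bob = {"movie_a": 4, "movie_b": 2, "movie_c": 3, "movie_d": 5}
carol = {"movie_a": 1, "movie_b": 5, "movie_c": 2}

alice_bob = pearson_similarity(alice, bob)      # same taste, stricter rater
alice_carol = pearson_similarity(alice, carol)  # opposite preferences
```

    Bob rates every shared movie one point lower than Alice, so Pearson treats them as perfectly correlated, whereas cosine similarity is slightly below 1. This difference is one reason a review of similarity measures matters for recommendation quality.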

    258 full texts, 343 metadata records
    Updated in last 30 days.