Electronic Letters on Computer Vision and Image Analysis (ELCVIA - Universitat Autònoma de Barcelona)
    343 research outputs found

    Pre-trained CNNs as Feature-Extraction Modules for Image Captioning: An Experimental Study

    No full text
    In this work, we present a thorough experimental study of feature extraction using Convolutional Neural Networks (CNNs) for the task of image captioning in the context of deep learning. We perform a set of 72 experiments on 12 image classification CNNs pre-trained on the ImageNet [29] dataset. The features are extracted from the last layer after removing the fully connected layer and fed into the captioning model. We use a unified captioning model with a fixed vocabulary size across all the experiments to study the effect of changing the CNN feature extractor on image captioning quality. The scores are calculated using the standard metrics in image captioning. We find a strong relationship between the model structure and the image captioning dataset and show that VGG models give the lowest quality for image captioning feature extraction among the tested CNNs. In the end, we recommend a set of pre-trained CNNs for each of the image captioning evaluation metrics we want to optimise, and show the connection between our results and previous works. To our knowledge, this work is the most comprehensive comparison between feature extractors for image captioning.

    Deep Learning Based Models for Offline Gurmukhi Handwritten Character and Numeral Recognition

    No full text
    Over the last few years, several researchers have worked on handwritten character recognition and have proposed various techniques to improve the recognition performance for Indic and non-Indic scripts. Here, a Deep Convolutional Neural Network is proposed that learns deep features for offline Gurmukhi handwritten character and numeral recognition (HCNR). The proposed network works efficiently in training as well as testing and exhibits good recognition performance. Two primary datasets comprising offline handwritten Gurmukhi characters and Gurmukhi numerals have been employed in the present work. The testing accuracies achieved using the proposed network are 98.5% for characters and 98.6% for numerals.

    Analysis of the Measurement Matrix in Directional Predictive Coding for Compressive Sensing of Medical Images

    No full text
    Compressive sensing of 2D signals involves three fundamental steps: sparse representation, linear measurement with a measurement matrix, and recovery of the signal. This paper focuses on analyzing the efficiency of various measurement matrices for compressive sensing of medical images based on theoretical predictive coding. During encoding, the prediction is efficiently chosen by four directional predictive modes for block-based compressive sensing measurements. In this work, Gaussian, Bernoulli, Laplace, Logistic, and Cauchy random matrices are used as the measurement matrices. While decoding, the same optimal prediction is de-quantized. Peak signal-to-noise ratio and sparsity are used for evaluating the performance of measurement matrices. The experimental results show that spatially directional predictive coding (SDPC) with Laplace measurement matrices performs better than scalar quantization (SQ) and differential pulse code modulation (DPCM) methods. The results indicate that the Laplace measurement matrix is the most suitable for compressive sensing of medical images.
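
    The five random measurement matrices and the PSNR metric named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the unit scale parameters, the block size, and the seed are all assumptions made for illustration, and the directional prediction and quantization steps are omitted.

```python
import math
import random


def sample_entry(kind, rng):
    """Draw one matrix entry from the named zero-centred distribution.

    Unit scale is assumed for every distribution (not stated in the abstract).
    """
    if kind == "gaussian":
        return rng.gauss(0.0, 1.0)
    if kind == "bernoulli":
        return rng.choice((-1.0, 1.0))
    # The remaining cases use inverse-CDF sampling with u in (0, 1).
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    if kind == "laplace":
        return math.log(2.0 * u) if u < 0.5 else -math.log(2.0 * (1.0 - u))
    if kind == "logistic":
        return math.log(u / (1.0 - u))
    if kind == "cauchy":
        return math.tan(math.pi * (u - 0.5))
    raise ValueError(f"unknown matrix kind: {kind}")


def measurement_matrix(kind, m, n, seed=0):
    """Build an m x n random measurement matrix as a list of rows."""
    rng = random.Random(seed)
    return [[sample_entry(kind, rng) for _ in range(n)] for _ in range(m)]


def measure(phi, x):
    """Compute the linear measurements y = phi @ x for a flattened block x."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length signals."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical 4x4 image block, flattened to 16 pixel values.
block = [float(v) for v in range(100, 116)]
for kind in ("gaussian", "bernoulli", "laplace", "logistic", "cauchy"):
    phi = measurement_matrix(kind, m=8, n=16)
    y = measure(phi, block)
    print(kind, len(y))
```

    Each matrix here takes 8 measurements of a 16-pixel block, so the signal is sampled at half its original length. A recovery algorithm, which this sketch does not include, would be needed before PSNR could be compared across matrix types.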

    Robust Pedestrian Detection and Path Prediction using Improved YOLOv5

    No full text
    In vision-based surveillance systems, pedestrian recognition and path prediction are critical concerns. Advanced computer vision applications, however, confront numerous challenges due to differences in pedestrian postures and scales, backgrounds, and occlusion. To tackle these challenges, we present a YOLOv5-based deep learning method for pedestrian recognition and path prediction. The updated YOLOv5 model is first used to detect pedestrians of various sizes and proportions. The proposed path prediction method is then used to estimate the pedestrian's path based on motion data. The suggested method deals with partial occlusion circumstances to reduce object occlusion-induced progression and loss, and links recognition results with motion attributes. The path prediction algorithm then uses motion and directional data to estimate the direction of the pedestrian's movement. According to the experimental results, the proposed method outperforms the existing methods. Finally, we draw conclusions and discuss future work.

    Cricket Video Highlight Generation Methods: A Review

    No full text
    The extraction of key events from a video to best represent its contents is known as video summarization. In this study, the game of cricket is specifically considered for extracting important events such as boundaries, sixes and wickets. Cricket video highlight generation frameworks require extensive key event identification. These key events can be identified by extracting the audio, visual and textual features from any cricket video. The prediction accuracy of cricket video summarization mainly depends on the game rules, the players' form and skill, and different natural conditions. This paper provides a complete survey of the latest research in cricket video summarization methods. It includes a quantitative evaluation of the outcomes of the existing frameworks. This extensive review strongly recommends developing deep learning-assisted video summarization approaches for cricket video because of their more representative feature extraction and classification capability compared with conventional edge and texture features and classifiers. The scope of this analysis also includes future visions and research opportunities in cricket highlight generation.

    A Multi-staged Feature-Attentive Network for Fashion Clothing Classification and Attribute Prediction

    No full text
    In visual fashion clothing analysis, many researchers have been attracted by the success of deep learning. In this work, we introduce a multi-staged feature-attentive network for clothing category classification and attribute prediction. The proposed network has a landmark-independent structure, whereas existing landmark-dependent structures require a lot of manpower for landmark annotation and also suffer from inter- and intra-individual variability. Our focus in this work is on strengthening feature extraction by incorporating low-level and high-level feature fusion within the fashion network. We aim at multi-level contextual features which utilise spatial and channel-wise information to create contextual feature supervision. Further, we incorporate a semi-supervised learning approach to improve fashion clothing analysis by sharing knowledge between labelled and unlabelled data. To the best of our knowledge, this is the first attempt to investigate semi-supervised learning in fashion clothing analysis by adopting a multitask architecture which simultaneously learns the clothing categories and their attributes. We evaluated the proposed approach on the large-scale DeepFashion-C dataset, while the unlabelled dataset was obtained from six publicly available fashion datasets. Experimental results show that the proposed architectures for supervised and semi-supervised learning, based on deep convolutional neural networks, considerably outperform many state-of-the-art techniques in fashion clothing analysis.

    Object Detection and Statistical Analysis of Microscopy Image Sequences

    No full text
    Confocal microscope images are widely used in medical diagnosis and research. The automatic interpretation of this type of image is very important, but it is a challenging endeavor in image processing, since these images are heavily contaminated with noise and have low contrast and low resolution. This work deals with the problem of analyzing the penetration velocity of a chemotherapy drug in an ocular tumor called retinoblastoma. Primary retinoblastoma cell cultures are exposed to the drug topotecan, and the penetration evolution is documented by producing sequences of microscopy images. It is possible to quantify the penetration rate of topotecan because it produces fluorescence emission under laser excitation, which is captured by the camera. In order to estimate the topotecan penetration time in the whole retinoblastoma cell culture, a procedure based on an active contour detection algorithm, a neural network classifier, and a statistical model with its validation is proposed. This new inference model allows the penetration time to be estimated. Results show that the mean penetration time strongly depends on tumorsphere size and on the chemotherapeutic treatment that the patient has previously received.

    Application of computer vision to egg detection on a production line in real time.

    No full text
    In this paper we investigate the application of computer vision to the problem of egg detection on a production line in real time. For this purpose, dedicated software was designed and implemented that exploits the advantages of neural network or template matching approaches. To verify the correctness of the developed software and to confirm its applicability to real-life problems, a number of carefully designed experiments have been carried out. These experiments reveal which approaches are best suited for supporting egg detection on a production line in real time.
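
    As a rough illustration of the template matching approach mentioned above, the sketch below slides a small template over a grayscale image and returns the position with the lowest sum of squared differences (SSD). The abstract does not say which matching score the authors use, so SSD, the function name, and the toy data are assumptions.

```python
def match_template(image, template):
    """Return (row, col, ssd) of the best SSD match, or None if nothing fits.

    Both arguments are 2D lists of grayscale values; lower SSD is better.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best = None
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            ssd = sum(
                (image[r + i][c + j] - template[i][j]) ** 2
                for i in range(th)
                for j in range(tw)
            )
            if best is None or ssd < best[2]:
                best = (r, c, ssd)
    return best


# Hypothetical 5x5 frame with a bright 2x2 "egg" patch at row 2, column 1.
frame = [[0] * 5 for _ in range(5)]
patch = [[9, 8], [7, 9]]
for i in range(2):
    for j in range(2):
        frame[2 + i][1 + j] = patch[i][j]

print(match_template(frame, patch))
```

    This exhaustive search is quadratic in both image and template size, so a real-time system would likely need a faster formulation.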

    An efficient hybrid approach for medical images enhancement

    No full text
    Medical images have various critical uses in the field of medical science and healthcare engineering. These images contain information about many severe diseases. Health professionals identify various diseases by observing medical images. The quality of medical images directly affects the accuracy of detection and diagnosis of various diseases. Therefore, the quality of images must be as good as possible. Different approaches exist today for the enhancement of medical images, but the resulting image quality is not good. In this paper, we propose a novel approach that uses principal component analysis (PCA), multi-scale switching morphological operator (MSMO) and contrast limited adaptive histogram equalization (CLAHE) methods in a unique sequence for this purpose. We have conducted exhaustive experiments on a large number of images of various modalities such as MRI, ultrasound, and retina. The results demonstrate that the quality of medical images processed by the proposed approach is significantly improved and better than that of other existing methods in this field.

    A Deep Learning-based Lung Cancer Classification of CT Images using Augmented Convolutional Neural Networks

    No full text
    Lung cancer ranks second among cancers worldwide in both prevalence and lethality, for both women and men. The applicability of machine learning and pattern classification in lung cancer detection and classification is proposed. Pattern classification algorithms can classify the input data into different classes according to the characteristic features in the input. Early identification of lung cancer using pattern recognition can save lives by analyzing a significant number of Computed Tomography images. Convolutional Neural Networks have recently achieved remarkable results in various deep learning applications, including lung cancer detection. The use of augmentation to improve the accuracy of a Convolutional Neural Network is proposed. Data augmentation is used to generate suitable training samples from existing training sets by employing various transformations such as scaling, rotation, and contrast modification. The LIDC-IDRI database is used to assess the networks. The proposed work showed an overall accuracy of 95%. Precision, recall, and F1 score for benign test data are 0.93, 0.96, and 0.95, respectively, and 0.96, 0.93, and 0.95 for malignant test data. The proposed system has impressive results when compared to other state-of-the-art approaches.
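
    The F1 score reported above is the harmonic mean of precision and recall. The short sketch below recomputes it from the rounded figures in the abstract; the function name is an assumption, and the authors presumably worked from unrounded values.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Rounded precision and recall as quoted in the abstract.
benign_f1 = f1_score(0.93, 0.96)
malignant_f1 = f1_score(0.96, 0.93)
print(round(benign_f1, 3), round(malignant_f1, 3))
```

    From the rounded inputs both classes come out at about 0.945, which matches the reported 0.95 to within rounding of the published precision and recall.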

    258 full texts, 343 metadata records
    Updated in last 30 days.