Electronic Letters on Computer Vision and Image Analysis (ELCVIA - Universitat Autònoma de Barcelona)
DAE-MLP Based Feature Extraction for Hyperspectral Image Classification of Saint Clair River
Hyperspectral remote sensing has emerged as a powerful tool for vegetation classification due to its ability to capture detailed spectral information. This study introduces a novel methodology for vegetation classification using exclusively hyperspectral imagery. The proposed approach comprises atmospheric correction using the FLAASH algorithm, followed by dimensionality reduction using PCA and segmentation through ROI selection and the Spectral Angle Mapper (SAM) module. Subsequently, a deep autoencoder is employed for feature extraction, paving the way for classification using the Multi-Layer Perceptron (MLP) algorithm. The effectiveness of this methodology is evaluated using a hyperspectral image of the Saint Clair River, successfully classifying the image into six main classes (water 1, water 2, grass, tree, reed, and corn) and an 'unclassified' category encompassing concrete, roads, bricks, wood, and more. Our findings demonstrate the efficacy of this approach in accurately classifying and mapping vegetation in river ecosystems, offering a promising solution in the face of limited hyperspectral datasets.
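As an illustration of the Spectral Angle Mapper step named in this abstract, the following is a minimal sketch of the general SAM idea: each pixel spectrum is compared with reference spectra by the angle between them, and the pixel takes the label of the closest reference. The function names, the reference spectra and the `max_angle` cut-off are assumptions made for the example and are not taken from the paper.

```python
import math

def spectral_angle(pixel, reference):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    cos_theta = dot / (norm_p * norm_r)
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)

def classify_pixel(pixel, references, max_angle=0.1):
    """Return the label of the closest reference, or 'unclassified'.

    max_angle is a hypothetical cut-off in radians.
    """
    best_label, best_angle = "unclassified", max_angle
    for label, ref in references.items():
        angle = spectral_angle(pixel, ref)
        if angle < best_angle:
            best_label, best_angle = label, angle
    return best_label

# Hypothetical three-band reference spectra, for illustration only.
references = {
    "water": [0.10, 0.05, 0.02],
    "grass": [0.05, 0.10, 0.50],
}
print(classify_pixel([0.20, 0.10, 0.04], references))
print(classify_pixel([1.0, 1.0, 1.0], references))
```

Because the angle ignores vector length, a pixel that is a brighter or darker copy of a reference spectrum still matches it, which is one reason SAM is commonly used under varying illumination.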
Deep Learning-Based Video Anomaly Detection Using Optimised Attention-Enhanced Autoencoders
Anomaly detection in video is essential for applications like surveillance, healthcare, and industrial monitoring. Convolutional autoencoders detect anomalies by reconstructing normal patterns and computing the reconstruction error in relation to ground truth. Frames with errors above a threshold are flagged as abnormal. Existing approaches rely on fixed thresholds, which may not adapt well to varying lighting conditions, leading to false positives or missed anomalies. This work proposes a novel autoencoder (SESAA) that combines self-attention with squeeze-and-excitation (SE) blocks and improves video anomaly detection through an adaptive thresholding technique. The technique leverages reconstruction cost, peak signal-to-noise ratio (PSNR) and frame brightness for optimal threshold identification, enhancing adaptability to different scenarios. We assess our model against dynamic threshold methods using ROC and AUC metrics. Experiments on three benchmark datasets validate the efficacy of our method in precise anomaly detection through optimal thresholding.
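The quantities this abstract relies on can be sketched as follows: reconstruction error as mean squared error, PSNR derived from it, and a data-dependent threshold. The threshold rule shown here (mean plus `k` standard deviations of the per-frame errors) is a common stand-in chosen for illustration; it is not the paper's adaptive technique, which the abstract does not specify. Frames are assumed to be flat lists of pixel intensities.

```python
import math
from statistics import mean, pstdev

def mse(frame, recon):
    """Mean squared reconstruction error between a frame and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(frame, recon)) / len(frame)

def psnr(frame, recon, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for a perfect reconstruction."""
    err = mse(frame, recon)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)

def flag_anomalies(errors, k=2.0):
    """Flag frames whose error exceeds mean + k * std (illustrative rule only)."""
    threshold = mean(errors) + k * pstdev(errors)
    return [e > threshold for e in errors]

# Hypothetical per-frame reconstruction errors; the last frame stands out.
errors = [1.0, 1.1, 0.9, 1.0, 1.0, 5.0]
print(flag_anomalies(errors))
```

A threshold computed from the error distribution of each video moves with overall reconstruction quality, which is the basic motivation for replacing a fixed threshold.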
A Systematic Framework for Sanskrit Character Recognition Using Deep Learning
Sanskrit is widely acknowledged to be among the world’s oldest surviving classical languages, and yet its usage has continued to decline unabated in the present milieu. Such insidious erosion of popularity is directly attributable to the absence of native speakers of the language and the perceived inaccessibility of Sanskrit to contemporary audiences. Nevertheless, the language remains historically and culturally inseparable from the subcontinent, with numerous religious manuscripts, epigraphical inscriptions, edicts and scientific literature written in the Sanskrit script. Attempts made to resuscitate the language have been largely unsuccessful, as they have relied extensively on laborious human transcription and translation. Such manual endeavors can be superseded by computational techniques that facilitate the efficient transcription of voluminous manuscripts written in the Sanskrit script.
The emergence of deep learning frameworks has enabled researchers to overcome the drawbacks of conventional machine learning algorithms in developing efficient and extensible character recognition systems. However, the advancement of character recognition frameworks varies across different Indic scripts.
In this context, this paper introduces an extensible framework for the transcription of handwritten Sanskrit manuscripts. In the absence of a benchmark dataset of handwritten Sanskrit characters, the authors introduce a comprehensive dataset to facilitate further downstream segmentation. The dataset, on augmentation, comprises over a hundred thousand samples and has been collected from over a hundred individuals. The paper explores an integrated approach to segmentation and accordingly delineates a systematic methodology for effectively segmenting Sanskrit words, incorporating techniques such as thresholding, zone-based classification, median bisection and projection profiles. The proposed technique accommodates a diverse array of characters and modifiers present in the Sanskrit script. Subsequently, a concurrent deep learning architecture parallelizes transcription using neural networks (CNN and Residual Networks). The deep learning models show accuracies exceeding 90%. This paper attempts to benchmark the significance of systematic approaches to machine transcription of low-resource languages.
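Of the segmentation techniques listed in this abstract, projection profiles are the simplest to show in code. The sketch below assumes a thresholded binary image stored as a list of rows (1 for ink, 0 for background) and splits it wherever a column contains no ink. It is a generic illustration with hypothetical function names; it does not handle connected strokes or modifiers and is not the paper's full pipeline.

```python
def vertical_profile(image):
    """Count ink pixels in each column of a binary image."""
    return [sum(column) for column in zip(*image)]

def segment_columns(image):
    """Return (start, end) column ranges separated by empty columns.

    The end index is exclusive.
    """
    profile = vertical_profile(image)
    segments, start = [], None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                      # a new ink run begins
        elif count == 0 and start is not None:
            segments.append((start, x))    # the ink run ends at a gap
            start = None
    if start is not None:
        segments.append((start, len(profile)))
    return segments

# Tiny hypothetical binary image: two ink regions separated by one empty column.
image = [
    [1, 1, 0, 0, 1],
    [0, 1, 0, 1, 1],
]
print(vertical_profile(image))
print(segment_columns(image))
```

The same idea applied to rows instead of columns gives a horizontal profile, which is typically used to separate text lines before splitting words or characters.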
An Inclusive review on deep learning techniques and their scope in handwriting recognition
Deep learning is a category of machine learning algorithms that can combine raw inputs into layers of intermediate features. These algorithms have demonstrated strong results in many fields; in particular, deep learning has achieved human-level performance across a number of domains in computer vision and pattern recognition. To achieve state-of-the-art performance in diverse domains, deep learning uses different architectures, and these architectures use activation functions to perform computations between the hidden and output layers. This paper presents a survey of existing studies of deep learning in the handwriting recognition field. Although recent progress indicates that deep learning methods provide a valuable means of speeding up or providing accurate results in handwriting recognition, the extensive literature survey finds that deep learning has yet to revolutionize the field and must still resolve many of its most pressing challenges, even though promising advances have been made on the prior state of the art. Additionally, the limited availability of labelled training data presents problems in this domain. Nevertheless, the survey foresees deep learning enabling changes at both bench and bedside, with the potential to transform several domains such as image processing, speech recognition, computer vision, machine translation, robotics and control, medical imaging, medical information processing, bioinformatics, natural language processing, cyber security, and many others.
An Efficient Deep Learning based License Plate Recognition for Smart Cities
Combining computer vision algorithms with deep learning technologies has opened up a wide range of applications. Currently, with heavy vehicle traffic, it is very difficult to trace and capture vehicular information through traffic surveillance on roads, in parking areas or for safety purposes. Here, we explore such a use case, in which a deep learning model is trained to detect and recognize the license plate of a vehicle. In the proposed method, an object detection model, EfficientDet-D0, is trained on a custom dataset for license plate detection, and the optical character recognition model Tesseract is used for recognition. We use a novel license plate extraction algorithm that reduces false localization, followed by character recognition in a pipeline manner. We also explore model quantization to compress the model at reduced precision for efficient edge-based deployment in an end application. We focus our study on Indian vehicles and evaluate the performance on standard datasets such as CCPD and UFPR, achieving 97.9% in license plate localization and 95.15% in end-to-end detection and recognition. We have implemented the method on Raspberry Pi 3 and NVIDIA Jetson Nano devices with improved performance. Compared with the state of the art, we achieve 2×, 3.8× and 2.5× on CPU, GPU and edge platforms, respectively.
Classification of radiological patterns of tuberculosis with a Convolutional neural network in x-ray images
In this paper we propose the classification of radiological patterns associated with the presence of tuberculosis in X-ray images. It was observed that two to six patterns (consolidation, fibrosis, opacity, pleural, nodules and cavitations) are present in the radiographs of the patients. Specialists consider the type of TB pattern in order to provide appropriate treatment, but not all medical centres have specialists who can immediately interpret radiological patterns. The aim is therefore to classify patterns by means of a convolutional neural network to help make a more accurate diagnosis on X-rays, so that doctors can recommend immediate treatment and thus avoid infecting more people. For the classification of tuberculosis patterns, a proprietary convolutional neural network (CNN) was proposed and compared against the VGG16, InceptionV3 and ResNet-50 architectures, which were selected based on the results of other radiograph classification research [1]–[3]. The Macro-average AUC-SVM metric was 0.80 for both the proposed architecture and InceptionV3, 0.79 for ResNet-50, and 0.75 for VGG16. The proposed architecture and InceptionV3 therefore give the best classification results.
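To make the macro-averaged AUC in this abstract concrete, the sketch below computes a one-vs-rest AUC for each class as the fraction of positive/negative score pairs ranked correctly, then takes the unweighted mean over classes. The abstract does not explain the "SVM" part of its metric name, so that part is not modelled here; the labels and scores are invented for the example.

```python
def binary_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

def macro_auc(y_true, y_score, classes):
    """Unweighted mean of one-vs-rest AUCs; y_score[i][k] is the score for classes[k]."""
    aucs = []
    for k, cls in enumerate(classes):
        labels = [1 if y == cls else 0 for y in y_true]
        scores = [row[k] for row in y_score]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / len(aucs)

# Hypothetical two-class example with per-class scores.
y_true = ["a", "b", "a", "b"]
y_score = [[0.9, 0.1], [0.2, 0.8], [0.25, 0.75], [0.3, 0.7]]
print(macro_auc(y_true, y_score, ["a", "b"]))
```

Macro-averaging gives every class the same weight regardless of how many samples it has, which matters when some radiological patterns are much rarer than others.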
A Study on CNN-Based and Handcrafted Extraction Methods with Machine Learning for Automated Classification of Breast Tumors from Ultrasound Images
In this paper, we present an efficient procedure for automatically classifying ultrasound images of benign and malignant breast tumors. We evaluated our approach using four openly available datasets and investigated two categories of feature extraction methods: handcrafted methods (Local Binary Pattern (LBP), Histogram of Oriented Gradients (HOG)) and methods based on convolutional neural network (CNN) models. For classification, we explored three classifiers: linear support vector machines (SVM), k-nearest neighbors (KNN), and artificial neural networks (ANN). Two experiments were conducted: the first aimed to design a classifier for each individual dataset, whereas the second aimed to develop a unified classifier for the ensemble of datasets. The results demonstrate that the ANN classifier combined with the early stopping (ES) criterion is very effective in both experiments, outperforming KNN and SVM with 100% accuracy. Additionally, using CNN models as feature extraction methods proved effective. Among these CNNs, ResNet50, InceptionV3 and DenseNet201 achieve 100% accuracy in the first experiment, while DenseNet201 achieves 100% accuracy in the second experiment. Comparative analysis with existing research demonstrates the competitiveness or superiority of the proposed procedure.
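As a small illustration of the handcrafted features mentioned in this abstract, the following sketch computes the basic 3×3 Local Binary Pattern: each of the eight neighbours contributes one bit, set when the neighbour is at least as bright as the centre pixel, and the codes are collected into a 256-bin histogram that serves as the feature vector. The neighbour ordering and function names are assumptions of this example; the paper's exact LBP variant is not stated in the abstract.

```python
# Neighbour offsets in clockwise order starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(image, r, c):
    """8-bit LBP code for the interior pixel at row r, column c."""
    center = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist

# Hypothetical 3x3 grey-level patch with a single bright neighbour.
patch = [
    [9, 0, 0],
    [0, 5, 0],
    [0, 0, 0],
]
print(lbp_code(patch, 1, 1))
```

The resulting histogram can be passed to any of the classifiers named above (SVM, KNN or ANN) as a fixed-length texture descriptor.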
A Novel Framework Based On Deep Neural Network For Determining The Melting Point Of Crystalline Chemical Substances
Deep learning is a subset of machine learning that uses artificial neural networks inspired by human cognitive systems. Although it is a relatively new approach, it has recently become very popular and effective. In many applications where machine learning has been successful to some degree, deep learning has become the most successful approach. Following these successes, the proposed deep learning model is designed for a melting point detection apparatus, which determines the melting point of chemical substances and is generally used in the pharmaceutical and chemical industries. The proposed model uses a deep neural network (DNN) to classify images of a chemical’s state (solid or liquid). It is built with the TensorFlow framework and the Keras library, uses activation functions such as ReLU and sigmoid together with MaxPool and Flatten layers, and is implemented in Python. The model can determine the melting point of chemicals in real time on a single-board computer. The input image data focuses on the chemical’s state, which falls into one of two categories: solid or liquid. The DNN was chosen for the training process because it provides high accuracy. The results are discussed in terms of image classification accuracy in percent. For the two class labels, the maximum accuracy is 99.72% and the maximum validation accuracy is 99.37%, the same as for the liquid images, and the average accuracy is 84.17% or higher after a certain number of epochs.
Multi-Biometric System Based On The Fusion Of Fingerprint And Finger-Vein
Biometrics is the process of measuring the unique biological traits of an individual for identification and verification purposes. Multiple features are used to enhance the security and robustness of the system. This study concentrates exclusively on the finger and employs two modalities: fingerprint and finger vein. The proposed system utilizes feature extraction for finger vein and two matching algorithms, namely ridge-based matching and minutiae-based matching, to derive matching scores for both biometrics. The scores from the two modalities are combined using four fusion approaches: holistic fusion, non-linear fusion, sum rule-based fusion, and Dempster-Shafer theory. The ultimate decision is made using the performance metrics and the Receiver Operating Characteristic (ROC) curve of the fusion technique with the best results. The proposed technique is tested on images collected from the “Nanjing University Posts and Telecommunications - Fingerprint and Finger vein dataset (NUPT-FPV).” According to the results, which were obtained for 840 input images, the proposed system achieves an Equal Error Rate (EER) of 0% with Dempster-Shafer-based fusion and 14% with the other three fusion techniques. Also, the False Acceptance Rate (FAR) is very low, at 0%, for all the fusion techniques, which is crucial for security and preventing unauthorized access.
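The score-level fusion and the error rates reported in this abstract can be sketched as follows. The example shows only the sum rule, implemented here as an average of two already-normalized scores, together with FAR, FRR and an EER estimate taken at the threshold where FAR and FRR are closest. The score values and function names are hypothetical, and the holistic, non-linear and Dempster-Shafer fusion rules are not reproduced.

```python
def sum_rule_fusion(scores_a, scores_b):
    """Average two lists of matching scores assumed to share the range [0, 1]."""
    return [(a + b) / 2 for a, b in zip(scores_a, scores_b)]

def far_frr(genuine, impostor, threshold):
    """False acceptance and false rejection rates when accepting scores >= threshold."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr

def equal_error_rate(genuine, impostor):
    """Estimate the EER at the candidate threshold where FAR and FRR are closest."""
    best_gap, best_eer = None, None
    for t in sorted(set(genuine + impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer

# Hypothetical fused scores for genuine and impostor comparisons.
genuine = sum_rule_fusion([0.9, 0.8, 0.7], [0.9, 0.8, 0.7])
impostor = sum_rule_fusion([0.1, 0.2, 0.3], [0.1, 0.2, 0.3])
print(equal_error_rate(genuine, impostor))
```

An EER of 0% means some threshold separates every genuine score from every impostor score in the test data, as in this toy example where the two score sets do not overlap.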
 
Image-based Mangifera Indica Leaf Disease Detection using Transfer Learning for Deep Learning Methods
Mangifera Indica, ordinarily known as mango, comes from a large tree. The leaf of the mango tree has human health benefits; mango leaf extract is used for treating various diseases, including cancer and diabetes. It also has antioxidant and antimicrobial biological activity. Leaf disease, including fungal disease, is a severe threat to nutrition and food security. Sometimes, it leads to decreased productivity and a huge loss for the farmers. Observing and determining whether a leaf is infected with the naked eye is unreliable and inconsistent. Technological advancement has helped people in agriculture in several ways, and deep learning methods are a promising approach to spotting leaf diseases with high accuracy. A mango leaf disease detection model is developed with the pre-trained ResNet18 model, which is used in transfer learning along with the Fast.ai framework. Around 2000 images were used, including images of healthy and infected leaves. The trained model achieved an accuracy of 99.88% and performed well compared to the existing state-of-the-art methods.