International Journal of Advances in Intelligent Informatics
    235 research outputs found

    Detecting signal transition in dynamic sign language using R-GB LSTM method

    Sign Language Recognition (SLR) helps deaf people communicate with hearing people. However, SLR still has difficulty detecting the dynamic movements of connected sign language, which reduces detection accuracy. The difficulty stems from the transitional gestures that occur between words in a sentence. Several researchers have tried to solve the problem of transition gestures in dynamic sign language, but none have been able to produce an accurate solution. The R-GB LSTM method detects transition gestures within a sentence based on labelled words and transition gestures stored in a model. If a gesture to be processed during training matches a transition gesture stored in the pre-training process and its probability value is greater than 0.5, it is categorized as a transition gesture. Subsequently, the detected gestures are eliminated according to the gesture's time value (t). To evaluate the effectiveness of the proposed method, we conducted an experiment in which 20 representative words in Indonesian Sign Language (SIBI) were selected for modelling with the R-GB LSTM technique. The method achieved an average accuracy of 80% for gesture sentences and 88.57% for gesture words. We used a confusion matrix to calculate accuracy, specificity, and sensitivity. This study is a step toward sign language recognition systems with improved accuracy and practicality, which could enhance communication and accessibility for deaf and hard-of-hearing communities.
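The filtering rule in this abstract (a gesture whose transition probability exceeds 0.5 is treated as a transition and removed by its time value t) can be illustrated with a minimal sketch. The `GestureSegment` structure, its field names, and the sample probabilities are hypothetical; the abstract does not say how R-GB LSTM represents gestures internally.

```python
from dataclasses import dataclass
from typing import List

TRANSITION_THRESHOLD = 0.5  # probability cut-off stated in the abstract


@dataclass
class GestureSegment:
    t: int               # time index of the segment (hypothetical unit)
    label: str           # predicted label for the segment
    p_transition: float  # predicted probability that this is a transition


def remove_transitions(segments: List[GestureSegment],
                       threshold: float = TRANSITION_THRESHOLD) -> List[GestureSegment]:
    """Drop segments whose transition probability exceeds the threshold."""
    kept = [s for s in segments if s.p_transition <= threshold]
    # Return the remaining word gestures in temporal order.
    return sorted(kept, key=lambda s: s.t)


# Made-up sentence: two word gestures separated by one transition gesture.
sentence = [
    GestureSegment(t=0, label="word_a", p_transition=0.10),
    GestureSegment(t=1, label="transition", p_transition=0.91),
    GestureSegment(t=2, label="word_b", p_transition=0.30),
]
words = [s.label for s in remove_transitions(sentence)]
print(words)
```

Only the thresholding step is shown here; the probabilities would come from the trained model, which is outside the scope of this sketch.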

    AI-Driven Analysis: Optimizing Tertiary Education Policy through Machine Learning Insights

    Tertiary education is pivotal in equipping individuals with the necessary knowledge and skills for success, prompting global initiatives for equitable access to quality higher education. The Philippines' Universal Access to Quality Tertiary Education (UAQTE) Act exemplifies this commitment by providing free tertiary education to eligible Filipino students. This study evaluates the UAQTE program's implementation through the perspectives of student beneficiaries, employing a combined approach of qualitative analysis and machine learning techniques. The study utilizes supervised and unsupervised machine learning to analyze student responses, specifically multiclass text classification using BERT and topic modeling with BERTopic. The results reveal insights into students' experiences and perceptions of the UAQTE program. While BERT demonstrates effectiveness in certain categories, challenges such as overfitting and balancing sequence length versus model performance are identified. BERTopic highlights the importance of capturing two-word combinations for enhancing topic coherence. Key themes identified through both approaches include "Educational Opportunity," "Program Implementation," "Financial Support," and "Appreciation and Gratitude," emphasizing their significance within the UAQTE program. Alignment between machine learning analyses and domain experts' insights underscores the relevance and effectiveness of the methodologies employed. Recommendations for optimizing the UAQTE program include refining focus areas, strengthening support systems, incorporating two-word combinations in analysis, and fostering continuous monitoring and interdisciplinary collaboration. By leveraging insights from qualitative and machine learning analyses, administrators can make informed decisions to enhance program effectiveness and comprehensively address students' diverse needs

    Emergency sign language recognition from variant of convolutional neural network (CNN) and long short term memory (LSTM) models

    Sign language is the primary communication tool used by the deaf community and people with speaking difficulties, especially during emergencies. Numerous deep learning models have been proposed to solve the sign language recognition problem. Recently, Bidirectional LSTM (BLSTM) has been proposed as a replacement for Long Short-Term Memory (LSTM), as it may improve the learning of long-term dependencies as well as increase the accuracy of the model. However, the performance of LSTM and BLSTM in the LRCN model architecture has not been sufficiently compared in sign language interpretation applications. Therefore, this study focused on an in-depth analysis of the LRCN model, including 1) training the CNN from scratch and 2) modeling with pre-trained CNN, VGG-19, and ResNet50. In addition, the ConvLSTM model, a special variant of LSTM designed for video input, has also been modeled and compared with the LRCN for emergency sign language recognition. Within the LRCN variants, the performance of a small CNN network was compared with pre-trained VGG-19 and ResNet50V2. A dataset of emergency Indian Sign Language with eight classes is used to train the models. The model with the best performance is the VGG-19 + LSTM model, with a testing accuracy of 96.39%. Small LRCN networks, which are 5 CNN subunits + LSTM and 4 CNN subunits + BLSTM, have 95.18% testing accuracy. This performance is on par with our best-proposed model, VGG + LSTM. By incorporating bidirectional LSTM (BLSTM) into deep learning models, the ability to understand long-term dependencies can be improved. This can enhance accuracy in reading sign language, leading to more effective communication during emergencies.

    Analyzing computer vision models for detecting customers: a practical experience in a Mexican retail

    Computer vision has become an important technology for obtaining meaningful data from visual content and providing valuable information for enhancing security controls, marketing, and logistic strategies in diverse industrial and business sectors. The retail sector constitutes an important part of the worldwide economy. Analyzing customer data and shopping behaviors has become essential to deliver the right products to customers, maximize profits, and increase competitiveness. In-person shopping is still a predominant form of retail despite the appearance of online retail outlets. As such, in-person retail is adopting computer vision models to monitor store products and customers. This research paper presents the development of a computer vision solution by Lytica Company to detect customers in Steren’s physical retail stores in Mexico. Current computer vision models such as SSD Mobilenet V2, YOLO-FastestV2, YOLOv5, and YOLOXn were analyzed to find the most accurate system according to the conditions and characteristics of the available devices. Some of the challenges addressed during the analysis of videos were obstruction and proximity of the customers, lighting conditions, position and distance of the camera relative to the customer when entering the store, image quality, and scalability of the process. Models were evaluated with the F1-score metric: 0.64 with YOLO-FastestV2, 0.74 with SSD Mobilenet V2, 0.86 with YOLOv5n, 0.86 with YOLOv5xs, and 0.74 with YOLOXn. Although YOLOv5 achieved the best performance, YOLOXn presented the best balance between performance and FPS (frames per second) rate, considering the limited hardware and computing power conditions.
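The F1-score used to compare the detectors is the harmonic mean of precision and recall. A minimal sketch of the computation follows; the detection counts are invented for illustration and are not taken from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from true positives, false positives and false negatives."""
    if tp == 0:
        return 0.0  # no correct detections: precision and recall are both 0
    precision = tp / (tp + fp)  # share of detections that are real customers
    recall = tp / (tp + fn)     # share of real customers that were detected
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for one detector on one video.
example_f1 = f1_score(tp=86, fp=14, fn=14)
print(round(example_f1, 2))
```

Because F1 penalizes both missed customers and false detections, it is a reasonable single number for ranking detectors, though it says nothing about frame rate, which the abstract weighs separately.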

    Domain adaptation for driver's gaze mapping for different drivers and new environments

    Distracted driving is a leading cause of traffic accidents, and often arises from a lack of visual attention on the road. To enhance road safety, monitoring a driver's visual attention is crucial. Appearance-based gaze estimation using deep learning and Convolutional Neural Networks (CNN) has shown promising results, but it faces challenges when applied to different drivers and environments. In this paper, we propose a domain adaptation-based solution for gaze mapping, which aims to accurately estimate a driver's gaze in diverse drivers and new environments. Our method consists of three steps: pre-processing, facial feature extraction, and gaze region classification. We explore two strategies for input feature extraction, one utilizing the full appearance of the driver and environment and the other focusing on the driver's face. Through unsupervised domain adaptation, we align the feature distributions of the source and target domains using a conditional Generative Adversarial Network (GAN). We conduct experiments on the Driver Gaze Mapping (DGM) dataset and the Columbia Cave-DB dataset to evaluate the performance of our method. The results demonstrate that our proposed method reduces the gaze mapping error, achieves better performance on different drivers and camera positions, and outperforms existing methods. We achieved an average Strictly Correct Estimation Rate (SCER) accuracy of 81.38% and 93.53% and Loosely Correct Estimation Rate (LCER) accuracy of 96.69% and 98.9% for the two strategies, respectively, indicating the effectiveness of our approach in adapting to different domains and camera positions. Our study contributes to the advancement of gaze mapping techniques and provides insights for improving driver safety in various driving scenarios

    Granularity-aware legal question answering: a case study of Indonesian government regulations

    Question answering (QA) technologies are crucial for building conversational AI.  Current research related to QA for the legal domain lacks focus on the organized structure of laws, which are hierarchically segmented into components at varying levels of detail. To address this gap, we propose a new task of granularity-aware legal QA, which accounts for the underlying granularity levels of law components. Our approach encompasses task formulation, dataset creation, and model development. Under the Indonesian jurisdiction, we consider four law component granularity levels: chapters (bab), articles (pasal), sections (ayat), and letters (huruf). We include 15 government regulations (Peraturan Pemerintah) of Indonesia related to labor affairs and build a legal QA dataset with granularity information. We then design a solution for such a task—the first IR system to account for legal component granularity. We implement a customized retriever-reranker pipeline in which the retriever accepts law components of multiple granularities and the reranker is trained for granularity-aware ranking. We leverage BM25 and BERT models as retriever and reranker, respectively, yielding an end-to-end exact match accuracy of 35.68%, which offers a significant improvement (20%) over a strong baseline. The use of reranker also improves the granularity accuracy from 44.86% to 63.24%. In practical context, such a solution can help provide more precise answers, not only from legal chatbots, but also other conversational AI that deals with hierarchically-structured documents
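The retrieval stage described in this abstract, where law components of several granularities compete in one BM25 ranking, might look roughly like the sketch below. It covers only the retriever; the BERT reranker is omitted. The component identifiers, example texts, whitespace tokenizer, and the `K1`/`B` values (common BM25 defaults) are assumptions, not details from the paper.

```python
import math
from collections import Counter
from typing import List, Tuple

K1, B = 1.5, 0.75  # common BM25 defaults, not values reported in the paper


def tokenize(text: str) -> List[str]:
    return text.lower().split()


def bm25_rank(query: str,
              components: List[Tuple[str, str, str]]) -> List[Tuple[float, str, str]]:
    """Rank (granularity, component_id, text) triples by BM25 score."""
    docs = [tokenize(text) for _, _, text in components]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Number of components containing each term.
    doc_freq = Counter(term for d in docs for term in set(d))
    ranked = []
    for (granularity, comp_id, _), tokens in zip(components, docs):
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            norm = tf[term] + K1 * (1 - B + B * len(tokens) / avg_len)
            score += idf * tf[term] * (K1 + 1) / norm
        ranked.append((score, granularity, comp_id))
    return sorted(ranked, key=lambda r: r[0], reverse=True)


# Hypothetical components at two granularity levels (pasal and ayat).
components = [
    ("pasal", "P1", "workers are entitled to annual leave"),
    ("ayat", "P1-1", "annual leave is twelve working days"),
    ("pasal", "P2", "employers must pay wages on time"),
]
ranked = bm25_rank("annual leave days", components)
print(ranked[0][1], ranked[0][2])
```

Keeping the granularity label next to each candidate is what would let a downstream reranker prefer, say, an ayat over its parent pasal when the question calls for the finer unit.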

    Weather classification using meta-based random forest fusion of transfer learning models

    Weather classification into multiple categories is an essential task for many applications, including farming, military, transport, airlines, navigation, etc. Few studies have addressed this field, and the current state-of-the-art methods have limitations, including low accuracy and a limited set of weather conditions. In this study, a new weather classification method based on meta-based fusion of transfer deep learning models is introduced. The study takes into account all possible weather conditions and utilizes the fusion technique to improve the performance. First, the weather images are pre-processed and a data augmentation process is performed. These images are fed into five transfer deep learning models (XceptionNet, VGG16, ResNet50V2, InceptionV3, and DenseNet201). Then, the meta-based random forest fusion, the meta-based bagging fusion, and the score-level fusion are applied. Finally, all individual and fusion models are evaluated. Experiments were conducted on the WEAPD dataset, which includes 11 categories. Results show that the best performance is obtained by the meta-based random forest fusion method, with 96% accuracy. The current study is also compared with the current state-of-the-art methods, and the comparison indicates the robustness and high performance of the current study, in particular that it achieves the best performance on the WEAPD dataset compared to studies that used the same dataset. The current study indicates that meta-based RF fusion is a promising methodology to address the weather classification problem. This outcome can be used by future studies to improve weather classification fusion and ensemble methodologies.
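Of the three fusion strategies named in this abstract, score-level fusion is the simplest to illustrate: the class scores produced by each model are combined and the highest fused score wins. The sketch below assumes a mean rule, which the abstract does not specify, and uses three invented score vectors over three made-up classes rather than the five models and 11 WEAPD categories.

```python
from typing import Dict, List


def score_level_fusion(model_scores: Dict[str, List[float]],
                       classes: List[str]) -> str:
    """Average per-class scores across models and return the top class."""
    n_models = len(model_scores)
    fused = [
        sum(scores[i] for scores in model_scores.values()) / n_models
        for i in range(len(classes))
    ]
    best = max(range(len(classes)), key=lambda i: fused[i])
    return classes[best]


# Hypothetical softmax outputs for one image.
classes = ["rain", "snow", "fog"]
model_scores = {
    "VGG16": [0.6, 0.3, 0.1],
    "InceptionV3": [0.2, 0.5, 0.3],
    "DenseNet201": [0.5, 0.4, 0.1],
}
prediction = score_level_fusion(model_scores, classes)
print(prediction)
```

A meta-based fusion would instead train a second-stage classifier (a random forest in the best-performing variant) on these per-model scores, which is not shown here.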

    Imputation of missing microclimate data of coffee-pine agroforestry with machine learning

    This research presents a comprehensive analysis of various imputation methods for addressing missing microclimate data in the context of coffee-pine agroforestry land in UB Forest. Utilizing big data and machine learning methods, the research evaluates the effectiveness of imputing missing microclimate data with Interpolation, Shifted Interpolation, K-Nearest Neighbors (KNN), and Linear Regression methods across multiple time frames: 6 hours, daily, weekly, and monthly. The performance of these methods is assessed using four evaluation metrics: Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and Mean Absolute Percentage Error (MAPE). The results indicate that Linear Regression consistently outperforms other methods across all time frames, demonstrating the lowest error rates in terms of MAE, MSE, RMSE, and MAPE. This finding underscores the robustness and precision of Linear Regression in handling the variability inherent in microclimate data within agroforestry systems. The research highlights the critical role of accurate data imputation in agroforestry research and points towards the potential of machine learning techniques in advancing environmental data analysis. The insights gained from this research offer a reliable methodological approach for enhancing the accuracy of microclimate models in agroforestry, thereby facilitating informed decision-making for sustainable ecosystem management.
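The four error metrics named in this abstract can be computed from paired observed and imputed values as in the sketch below. The temperature readings are invented for illustration; MAPE is undefined when an observed value is zero, which this sketch assumes does not occur.

```python
import math
from typing import Dict, List


def imputation_errors(observed: List[float], imputed: List[float]) -> Dict[str, float]:
    """Return MAE, MSE, RMSE and MAPE (in percent) for imputed values."""
    n = len(observed)
    diffs = [o - p for o, p in zip(observed, imputed)]
    mae = sum(abs(d) for d in diffs) / n
    mse = sum(d * d for d in diffs) / n
    rmse = math.sqrt(mse)
    # Assumes no observed value is zero.
    mape = 100.0 * sum(abs(d) / abs(o) for d, o in zip(diffs, observed)) / n
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape}


# Hypothetical held-out readings and the values an imputation method filled in.
observed = [20.0, 25.0, 40.0]
imputed = [22.0, 24.0, 36.0]
errors = imputation_errors(observed, imputed)
print(errors)
```

In an evaluation of this kind, known values are typically hidden, imputed, and then compared against the originals, so lower values on all four metrics indicate a better method.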

    A comparison of machine learning methods for knowledge extraction model in A LoRa-Based waste bin monitoring system

    Knowledge Extraction Model (KEM) is a system that extracts knowledge through an IoT-based smart waste bin emptying scheduling classification. Classification is a difficult problem and requires an efficient classification method. This research contributes a KEM system that classifies waste bin emptying schedules using the best-performing Machine Learning method. The research aims to compare the performance of the Machine Learning methods Decision Tree, Naïve Bayes, K-Nearest Neighbor, Support Vector Machine, and Multi-Layer Perceptron, in order to recommend one for the KEM system. Performance testing was performed on accuracy, recall, precision, F-measure, and ROC curves using the cross-validation method with ten observations. The experimental results show that the Decision Tree performs best for accuracy, recall, precision, and ROC curve. In contrast, the K-NN method obtains the highest F-measure performance. KEM can be implemented to extract knowledge from data sets created in various other IoT-based systems.

    Empirical study of 3D-HPE on HOI4D egocentric vision dataset based on deep learning

    3D hand pose estimation (3D-HPE) is one of the tasks performed on data obtained from an egocentric vision camera (EVC), alongside hand detection, segmentation, and gesture recognition, and is applied in fields such as HCI, HRI, VR, AR, healthcare, and support for visually impaired people. In these applications, hand point cloud data obtained from EV is very challenging because the hand is often obscured by gaze direction and other objects. Our paper performs a comparative study on 3D right-hand pose estimation (3D-R-HPE) from the HOI4D dataset with four cameras used to collect and animate the dataset. This is a very challenging dataset and was published at CVPR 2022. We use CNNs (P2PR PointNet, Hand PointNet, V2V-PoseNet, and HandFoldingNet - HFNet) to fine-tune the 3D-HPE model based on the point cloud data (PCD) of the hand. The resulting average error (Erra) of 3D-HPE is as follows: P2PR PointNet, 32.71 mm; Hand PointNet, 35.12 mm; V2V-PoseNet, 26.32 mm; and HFNet, 20.49 mm. HFNet is the latest CNN (in 2021) with the best results. This estimation error is small and can be applied and modeled to automatically detect, estimate, and recognize hand pose from the data obtained by EV. The average processing speed is 5.4 fps when done on the GPU with HFNet, which is the fastest. Detailed quantitative and qualitative results are presented that are beneficial to various applications such as human-computer interaction, virtual and augmented reality, and healthcare, particularly in challenging scenarios involving occlusions and complex datasets.
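An average 3D joint error of the kind reported in this abstract is commonly computed as the mean Euclidean distance between predicted and ground-truth joint positions. The sketch below assumes that definition, which the abstract does not spell out, and uses two invented joints with coordinates in millimetres.

```python
import math
from typing import List, Tuple

Point3D = Tuple[float, float, float]


def mean_joint_error(predicted: List[Point3D], ground_truth: List[Point3D]) -> float:
    """Mean Euclidean distance between matching joints, in the input unit."""
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(distances) / len(distances)


# Hypothetical two-joint hand: one joint is exact, the other is 5 mm off.
predicted = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
ground_truth = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
error_mm = mean_joint_error(predicted, ground_truth)
print(error_mm)
```

`math.dist` requires Python 3.8 or later; a real evaluation would average this value over all joints of the hand skeleton and all test frames.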
