International Journal of Advances in Intelligent Informatics
235 research outputs found
RadEval: A novel semantic evaluation framework for radiology report
The evaluation of automatically generated radiology reports remains a critical challenge, as conventional metrics fail to capture the semantic, clinical, and contextual correctness required for automatic medical analysis. This study proposes RadEval, a semantic-aware evaluation framework that integrates domain-specific knowledge and contextual embeddings to assess the quality of generated radiology reports using a four-level scoring system. Given a reference report and a report predicted from a radiology image, RadEval first extracts relevant medical entities using a fine-tuned biomedical NER model. These entities are normalized through ontology mapping using RadLex concept identifiers to resolve lexical variation. Semantically related entities are then clustered using BioBERT's contextual embeddings to capture deeper semantic similarity. In addition, predicted abnormality tags are incorporated to weight clinically significant terms during score aggregation. The final semantic score reflects a weighted combination of exact match, ontology match, and contextual similarity, modulated by tag importance. Experiments were conducted on the MIMIC-CXR dataset, which contains over 200,000 report pairs. Comparative evaluations show that RadEval outperforms traditional metrics, achieving an F1-score of 0.69, compared to 0.56 for BERTScore. With this method, the clinical interpretation of the predicted report relative to the reference report is captured more precisely. These findings suggest that the RadEval method provides a more accurate and clinically aligned framework for evaluating medical report generation models.
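The score aggregation described in this abstract can be illustrated with a minimal sketch. The abstract does not give the actual formula, so the function name `semantic_score`, the mixing weights `alphas`, and the normalization by total tag weight are all assumptions made for illustration only.

```python
def semantic_score(exact, ontology, contextual, tag_weights, alphas=(0.5, 0.3, 0.2)):
    """Tag-weighted average of per-entity match scores (illustrative only).

    exact, ontology, contextual: per-entity scores in [0, 1].
    tag_weights: per-entity importance, e.g. larger for abnormality terms.
    alphas: hypothetical mixing weights for the three match types.
    """
    a_exact, a_onto, a_ctx = alphas
    total_weight = sum(tag_weights)
    if total_weight == 0:
        return 0.0  # no entities to score
    weighted = 0.0
    for e, o, c, w in zip(exact, ontology, contextual, tag_weights):
        # Combine the three match signals, then scale by tag importance.
        weighted += w * (a_exact * e + a_onto * o + a_ctx * c)
    return weighted / total_weight


# Two entities: the first (an abnormality term, weight 2) matches exactly;
# the second matches only through the ontology and partly in context.
score = semantic_score(
    exact=[1.0, 0.0],
    ontology=[1.0, 1.0],
    contextual=[1.0, 0.5],
    tag_weights=[2.0, 1.0],
)
```

Dividing by the total tag weight keeps the score in [0, 1] whenever the mixing weights sum to 1, which makes scores comparable across reports with different numbers of entities.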
GAN-Enhanced multimodal fusion and ensemble learning for imbalanced chest X-Ray classification
Chest X-ray (CXR) classification tasks often suffer from severe class imbalance, resulting in biased predictions and suboptimal diagnostic performance. To address this challenge, we propose an integrated framework that combines high-fidelity data augmentation using Generative Adversarial Networks (GANs), ensemble learning via hard and soft voting, and multimodal feature fusion. The method begins by partitioning the majority class into multiple subsets, which are individually balanced through GAN-generated synthetic images. Deep learning models, specifically DenseNet201 and EfficientNetV2B3, are trained separately on each balanced subset. These models are then combined using ensemble voting to improve robustness. Additionally, features extracted from the most performant models are fused and used to train traditional classifiers such as Logistic Regression, Multilayer Perceptron, CatBoost, and XGBoost. Evaluations on a publicly available CXR dataset demonstrate consistent improvements across key metrics, including accuracy, precision, recall, F1-score, AUROC, AUPRC, MCC, and G-mean. This framework shows superior performance in multiclass scenarios
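Hard and soft voting, as named in this abstract, can be sketched in a few lines. This is a generic illustration and not the authors' implementation; the function names and the toy inputs are hypothetical.

```python
from collections import Counter


def hard_vote(predictions):
    """predictions: one label list per model; returns the majority label per sample.

    Ties go to the label seen first among the models (Counter ordering).
    """
    voted = []
    for sample_labels in zip(*predictions):
        voted.append(Counter(sample_labels).most_common(1)[0][0])
    return voted


def soft_vote(probabilities):
    """probabilities: one [n_samples][n_classes] list per model.

    Returns the class with the highest mean probability per sample.
    """
    n_models = len(probabilities)
    voted = []
    for sample_probs in zip(*probabilities):
        n_classes = len(sample_probs[0])
        mean = [sum(p[k] for p in sample_probs) / n_models for k in range(n_classes)]
        voted.append(max(range(n_classes), key=mean.__getitem__))
    return voted


# Three models voting on three samples (labels), two models on two samples (probabilities).
hard = hard_vote([[0, 1, 2], [0, 2, 2], [1, 1, 2]])
soft = soft_vote([
    [[0.6, 0.4], [0.4, 0.6]],
    [[0.3, 0.7], [0.45, 0.55]],
])
```

Soft voting can overturn a model's confident-looking label when another model is more confident in a different class, as in the first sample of the probability example.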
LC Map: a robust chaotic function for enhancing cryptographic security through key sensitivity and randomness analysis
The security of digital image data has become increasingly critical in modern communication systems. While chaos-based cryptography offers a promising solution, many existing algorithms lack rigorous security validation. This paper introduces the Logistic-Circle Map (LC Map), a novel one-dimensional compound chaotic system designed to provide a robust and efficient foundation for image encryption. By composing the Logistic Map and the Circle Map, the LC Map exhibits a broader chaotic range and higher dynamical complexity. The performance and security of an LC Map-based encryption scheme are extensively validated using a comprehensive dataset of 24 digital images. Security analysis demonstrates that the algorithm is highly resistant to brute-force, statistical, and differential attacks. It provides a vast key space and demonstrates very strong key sensitivity, both confirmed through experimental evaluation. Test results show near-ideal performance on standard security metrics, with a Number of Pixels Change Rate (NPCR) approaching 99.6%, a Unified Average Changing Intensity (UACI) approaching 33.4%, and an information entropy value nearing the theoretical maximum of 8. Further quantitative comparative analysis demonstrates the superiority of the LC Map in balancing security and computational efficiency. Thus, the LC Map is presented as a rigorously validated component for the development of future image cryptosystems
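The three security metrics quoted in this abstract have standard definitions for 8-bit images, sketched below on flattened pixel lists. The function names are illustrative; the LC Map itself is not reproduced here because the abstract does not give its equation.

```python
import math
from collections import Counter


def npcr(c1, c2):
    """Number of Pixels Change Rate: percentage of positions where the two cipher images differ."""
    differing = sum(1 for a, b in zip(c1, c2) if a != b)
    return 100.0 * differing / len(c1)


def uaci(c1, c2):
    """Unified Average Changing Intensity: mean absolute difference relative to 255, in percent."""
    return 100.0 * sum(abs(a - b) for a, b in zip(c1, c2)) / (255 * len(c1))


def entropy(pixels):
    """Shannon entropy in bits per pixel; the maximum is 8 for 8-bit data."""
    n = len(pixels)
    return -sum((k / n) * math.log2(k / n) for k in Counter(pixels).values())


# Toy flattened "cipher images" with values in 0..255.
c1 = [0, 255, 10, 10]
c2 = [0, 0, 10, 20]
rate = npcr(c1, c2)
intensity = uaci(c1, c2)
uniform_entropy = entropy(list(range(256)))
```

In a differential-attack test, `c1` and `c2` would be the encryptions of two plain images that differ in a single pixel.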
Ensemble semi-supervised learning in facial expression recognition
Facial Expression Recognition (FER) plays a crucial role in human-computer interaction, yet improving its accuracy remains a significant challenge. This study aims to enhance the robustness and effectiveness of FER systems by integrating multiple machine learning techniques within a semi-supervised learning framework. The primary objective is to develop a more effective ensemble model that combines Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN), Support Vector Classifier (SVC), and Random Forest classifiers, utilizing both labeled and unlabeled data. The research implements data augmentation and feature extraction techniques, utilizing advanced architectures such as VGG19, ResNet50, and InceptionV3 to improve the quality and representation of facial expression data. Evaluations were conducted across three dataset scenarios: original, feature-extracted, and augmented, using various label-to-unlabeled ratios. The results indicate that the ensemble model achieved a notable accuracy improvement of 87% on the augmented dataset compared to individual classifiers and other ensemble methods, demonstrating superior performance in handling occlusions and diverse data conditions. However, several limitations exist. The study’s reliance on the JAFFE dataset may restrict its generalizability, as it may not cover the full range of facial expressions encountered in real-world scenarios. Additionally, the effect of label-to-unlabeled ratios on the model's performance requires further exploration. Computational efficiency and training time were also not evaluated, which are critical considerations for practical implementation. For future research, it is recommended to employ cross-validation methods for more robust performance evaluation, explore additional data augmentation techniques, optimize ensemble configurations, and address the computational efficiency of the model to better advance FER technologies
Gender classification performance optimization based on facial images using LBG-VQ and MB-LBP
In computer vision and machine learning, and in gender classification from facial images in particular, feature extraction is an essential step. Various features can be extracted from images, including texture features. Several prior studies show that the Linde-Buzo-Gray vector quantization (LBG-VQ) and multi-block local binary pattern (MB-LBP) methods can extract texture features from images. LBG-VQ has produced less optimal gender classification performance on the FEI facial images dataset, whereas MB-LBP has produced better performance on the FERET facial images dataset. Therefore, this study examines gender classification performance when the LBG-VQ and MB-LBP methods are applied independently or in combination on the FEI facial images dataset. Three preprocessing stages precede feature extraction: noise removal, illumination adjustment, and conversion from RGB to grayscale. The extracted features are then used to train several classifiers, namely Naïve Bayes, SVM, KNN, Random Forest, and Logistic Regression, and the trained models are evaluated using K-Fold Cross Validation. The study found that MB-LBP tends to improve performance compared to LBG-VQ. Furthermore, the best classification model, with a performance of 91.928%, was obtained by applying Logistic Regression with MB-LBP to LBG-VQ quantized images. In conclusion, this study produced an optimized gender classification model based on the FEI facial images dataset.
Cocoa bean quality identification using a computer vision-based color and texture feature extraction
A pressing issue in the downstream processing of cocoa beans is the need for a strict quality control system. Manual visual inspection of raw cocoa beans falls short of this need, which calls for advanced technological solutions, especially in the context of Industry 4.0. This paper introduces an image-processing approach that extracts color and texture features to identify cocoa bean quality. Image acquisition involved capturing video with a data acquisition box connected to a conveyor, yielding a dataset of image samples of Good-quality and Poor-quality non-cut cocoa beans. The methodology includes multi-stage pre-processing, sharpening techniques, and a comparative analysis of feature extraction using Hue-Saturation-Value (HSV) and Gray Level Co-occurrence Matrix (GLCM) features, selected by correlation. The study used the 15 features with the highest correlation. Classification used a Support Vector Machine (SVM) with an RBF kernel under several parameter settings. Several performance measures were used to compare the approaches, and the results show that pre-processing without sharpening achieves better accuracy, with the HSV and GLCM combination reaching 0.99 accuracy. Adequate lighting during data acquisition is crucial for accuracy. This study demonstrates the efficacy of image processing in cocoa bean quality identification, addressing a gap in industrial-scale implementation of technological solutions and advancing quality control in the cocoa industry.
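As an illustration of GLCM texture features, the sketch below builds a normalized co-occurrence matrix for one pixel offset and derives three common statistics. The abstract does not list which GLCM features or offsets the study used, so the choice of contrast, angular second moment, and homogeneity, along with all function names, is an assumption.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for the offset (dy, dx).

    image: 2D list of integer gray levels in range(levels).
    Assumes the image is large enough that at least one pixel pair exists.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Contrast, angular second moment (ASM), and homogeneity of a normalized GLCM."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(p[i][j] * (i - j) ** 2 for i, j in cells),
        "asm": sum(p[i][j] ** 2 for i, j in cells),
        "homogeneity": sum(p[i][j] / (1 + (i - j) ** 2) for i, j in cells),
    }


# Vertical stripes: every horizontal neighbor pair changes gray level.
striped = glcm_features(glcm([[0, 1], [0, 1]], levels=2))
# Horizontal bands: horizontal neighbors always share a gray level.
banded = glcm_features(glcm([[0, 0], [1, 1]], levels=2))
```

High contrast with low homogeneity indicates a rough texture along the chosen offset, which is the kind of signal a classifier such as an SVM can use alongside color features.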
BERT-Enhanced Bi-LSTM with weighted cross-entropy for multilingual sentiment classification
With the increasing volume of multilingual user-generated content across social media platforms, effective sentiment analysis (SA) becomes crucial, especially for low-resource languages. However, traditional models relying on context-independent embeddings, such as Word2Vec, GloVe, and fastText, struggle to handle the complexity of multilingual sentiment classification. To address this, we propose an Automatic Multilingual Sentiment Detection (AMSD) framework that leverages the contextual capabilities of BERT for feature extraction and a Bidirectional Long Short-Term Memory (Bi-LSTM) network for classification. Our method, termed Elite Opposition Cross-Entropy Weighted Bi-LSTM (EOCEWBi-LSTM), integrates elite opposition-based learning to optimize hyperparameters and enhance classification accuracy. A weighted cross-entropy loss function further refines the model's sensitivity to class imbalance, thereby improving its performance. The model is trained and evaluated on the NEP_EDUSET corpus, comprising 45,434 tweets in English, Hindi, and Tamil. Experimental results demonstrate notable improvements in precision, recall, F1-score, and accuracy across both high-resource and low-resource languages. The proposed EOCEWBi-LSTM achieves an F1-score of 93.83% and an accuracy of 93.83%, outperforming existing methods. EOCEWBi-LSTM provides an effective solution for multilingual sentiment analysis, especially for languages with limited resources.
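A weighted cross-entropy loss of the kind mentioned in this abstract can be sketched as follows. The abstract does not state how the class weights are chosen or how the loss is normalized, so the inverse-frequency weights and the division by the sum of weights are assumptions, and the function names are hypothetical.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels, n_classes):
    """Weight each class by N / (K * n_c), so rare classes count more (a common heuristic)."""
    counts = Counter(labels)
    n = len(labels)
    return [n / (n_classes * counts[c]) if counts[c] else 0.0 for c in range(n_classes)]


def weighted_cross_entropy(probs, labels, class_weights):
    """Weighted mean of -log p[true class], normalized by the sum of the applied weights."""
    total_loss = 0.0
    total_weight = 0.0
    for p, y in zip(probs, labels):
        w = class_weights[y]
        total_loss += -w * math.log(p[y])
        total_weight += w
    return total_loss / total_weight


# Imbalanced toy batch: three samples of class 0, one of class 1.
labels = [0, 0, 0, 1]
weights = inverse_frequency_weights(labels, n_classes=2)

# A model that is right on the majority class but unsure on the minority sample.
probs = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1], [0.5, 0.5]]
plain = weighted_cross_entropy(probs, labels, [1.0, 1.0])
weighted = weighted_cross_entropy(probs, labels, weights)
```

With these weights the single minority sample contributes as much to the loss as the three majority samples together, so the weighted loss is higher than the plain one for this batch.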
Human Capital Decision Intelligence (HCDI) architecture in microbiology laboratory based on machine learning and operations research models
The Human Capital Decision Intelligence (HCDI) system integrates human-computer interaction in a microbiology laboratory, using machine learning and operations research to classify new tasks and then recommend assignments to each person. The models evaluated in building this system are Support Vector Machine, Gaussian Naive Bayes, Multinomial Logistic Regression, and Artificial Neural Network (ANN). The results show that the ANN model is the most consistent and reliable across various training ratios, as indicated by the model quality metrics. The selected ANN model is combined with a linear programming approach to optimize workload distribution. The integrated system successfully manages new job scenarios and recommends staff based on competencies and availability. It also ensures assignments do not exceed maximum workload limits and finds alternatives when key staff are unavailable. The implementation of the HCDI system has a positive impact on various factors, including the fair distribution of tasks, enhanced staff performance monitoring, and significantly improved operational efficiency and human resource management in the microbiology laboratory. The system is designed to be easy to use and to support collaboration between laboratory staff and computational models. Beyond supporting personnel management decision-making, the system demonstrates how artificial intelligence and operations research can be combined to address the needs of the microbiology laboratory environment.
Edge optimized multimodal cross fusion model with statistical validation for multi crop disease detection
Accurate and timely crop disease detection is crucial for mitigating agricultural losses and ensuring food security, particularly in resource-limited settings. Traditional diagnostic methods are inefficient and prone to errors, while existing deep learning models, such as ResNet50 and Inception V3, struggle with generalizability and computational efficiency. This study proposes a Dynamic Edge-Optimized Multimodal Fusion (DEMF) model, integrating EfficientNetV2 and MobileNetV2 to enhance feature learning and scalability. The model was trained on a 76-class dataset comprising PlantVillage and locally collected images of crop diseases, ensuring robustness across diverse conditions. Feature fusion via concatenation, combined with compound scaling and transfer learning, enabled the model to capture fine-grained patterns of disease. Extensive experiments, including ablation studies and comparative evaluations against DenseNet-121, DenseNet-50, AlexNet, and ResNet-152, validated the model’s superiority. The proposed model achieved 99.2% accuracy, a Kappa of 0.9919, and an AUC of 0.9999, outperforming benchmarks. Statistical validation confirmed significant improvements (p<0.05) and stability. To enhance accessibility, an AI-powered mobile application was deployed on the Google Play Store, enabling real-time disease detection and actionable recommendations. This research advances transfer learning, feature fusion, and statistical validation for robust, scalable crop disease detection in low-resource environments.
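The Kappa value reported in this abstract is presumably Cohen's kappa, which corrects raw agreement for agreement expected by chance. A minimal sketch of that standard statistic follows; the function name and toy labels are illustrative and unrelated to the study's data.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two label sequences of equal length."""
    n = len(y_true)
    # Observed agreement: fraction of samples where the labels match.
    p_o = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: product of the marginal label frequencies, summed over classes.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        return 1.0  # both sequences contain a single identical label
    return (p_o - p_e) / (1.0 - p_e)


kappa = cohen_kappa([0, 0, 1, 1], [0, 0, 1, 0])
perfect = cohen_kappa([0, 1, 2, 1], [0, 1, 2, 1])
```

Here three of four predictions agree (0.75 accuracy), but half of that agreement is expected by chance, so kappa is lower than accuracy. With many balanced classes, chance agreement is small and kappa approaches accuracy.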
Optimized image-based grouping of e-commerce products using deep hierarchical clustering
Managing large and constantly evolving product catalogs is a significant challenge for e-commerce platforms, especially when visually similar products cannot be reliably distinguished using text-based methods. This study proposes a product grouping method that combines a fine-tuned EfficientNetV2M model with an adaptive Agglomerative Clustering strategy. Unlike conventional CNN-based approaches, which have limited scalability and a fixed number of clusters, the proposed method dynamically adjusts similarity thresholds and automatically forms clusters for unseen product variations. By linking deep visual feature extraction with adaptive clustering, the method enhances flexibility in handling product diversity. Experiments on the Shopee product image dataset show that it achieves a high Normalized Mutual Information (NMI) score of 0.924, outperforming standard baselines. These results demonstrate the method’s effectiveness in automating catalog organization and offer a scalable solution for inventory management and personalized recommendations in e-commerce platforms
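Clustering with a distance threshold instead of a fixed number of clusters can be sketched as below. This is an illustration under stated assumptions: it uses single linkage with cosine distance, where cutting at a threshold is equivalent to finding connected components, while the abstract specifies neither the linkage nor the distance, and the adaptive threshold adjustment is not reproduced. All names are hypothetical.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity of two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def threshold_clusters(features, threshold):
    """Single-linkage agglomerative clustering cut at a distance threshold.

    Items closer than the threshold are merged with union-find, so the number
    of clusters is decided by the data and not fixed in advance.
    """
    n = len(features)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if cosine_distance(features[i], features[j]) <= threshold:
                parent[find(i)] = find(j)

    # Relabel cluster roots as consecutive ids in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]


# Toy 2-D "image embeddings": two near-duplicate pairs.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
tight = threshold_clusters(embeddings, threshold=0.1)
loose = threshold_clusters(embeddings, threshold=0.9)
```

A tight threshold keeps the two product groups apart, while a loose one merges everything, which shows why the threshold choice matters for unseen product variations.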