IAES International Journal of Artificial Intelligence (IJ-AI)
1769 research outputs found
Multiclass instance segmentation optimization for fetal heart image object interpretation
This research aims to develop a multi-class instance segmentation model for segmenting, detecting, and classifying objects in fetal heart ultrasound images derived from fetal heart ultrasound videos. Previous studies have performed object detection on fetal heart images, identifying nine anatomical classes, and have conducted instance segmentation on fetal heart images for six anatomical classes. This research seeks to expand the scope by increasing the number of classes to ten, encompassing four main chambers: left atrium (LA), right atrium (RA), left ventricle (LV), and right ventricle (RV); four valves: tricuspid valve (TV), pulmonary valve (PV), mitral valve (MV), and aortic valve (AV); the aorta (Ao); and the spine. By developing an instance segmentation method for segmenting ten anatomical structures of the fetal heart, this research aims to make a significant contribution to improving medical image analysis in healthcare. It also aims to pave the way for further research on fetal heart diseases using AI. The instance segmentation approach is expected to enhance the accuracy of segmenting fetal heart images and allow for more efficient identification and labeling of each anatomical structure in the fetal heart
Solving k-city multiple travelling salesman using genetic algorithm
This paper addresses a novel variant of the classical multiple traveling salesman problem (MTSP), namely the k-city multiple traveling salesman problem (k-MTSP). The problem can be described as follows. Let there be n cities, m salesmen positioned at a depot city, and a predefined positive value k. The distance between each pair of cities is known. The objective of the k-MTSP is to determine a collection of m closed tours for the salesmen that covers exactly k (including the depot city) of the n cities such that the total distance covered is minimum. The k-MTSP can be seen as a combination of subset selection and permutation characteristics. A thorough literature review indicates that, to the best of the authors' knowledge, this study of the k-MTSP is the first of its kind. The paper introduces a zero-one integer linear programming (0-1 ILP) formulation alongside an efficient genetic algorithm (GA) designed to address the k-MTSP. No comparative studies were carried out due to the absence of existing studies on the k-MTSP. However, the developed GA is tested on various benchmark test cases from TSPLIB and the results are reported, which may serve as a basis for further comparative studies. Overall findings demonstrate that the GA consistently produces the best solutions within reasonable computational times for relatively small and medium test cases, suggesting its robustness and effectiveness in tackling the k-MTSP. However, to enhance consistency and efficiency, particularly for larger datasets, further algorithm improvements are necessary
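The problem statement above can be made concrete with a small sketch that evaluates the k-MTSP objective for a candidate solution. This is not the paper's 0-1 ILP or GA; it only checks feasibility and sums tour lengths. The function names, the Euclidean distance, the reading that the m tours together cover exactly k cities, and the requirement that each salesman visits at least one non-depot city are all assumptions made for illustration.

```python
from math import dist


def tour_length(tour, coords):
    """Length of a closed tour that starts at tour[0] and returns to it."""
    # Pair each city with its successor; the last city wraps back to the first.
    return sum(dist(coords[a], coords[b]) for a, b in zip(tour, tour[1:] + tour[:1]))


def kmtsp_cost(tours, coords, k, depot=0):
    """Total distance of the closed tours, or ValueError if the solution is infeasible.

    tours  : one list of city indices per salesman, each starting at the depot
    coords : list of (x, y) city coordinates (Euclidean distance is an assumption)
    k      : number of cities to cover, counting the depot
    """
    visited = set()
    for tour in tours:
        if not tour or tour[0] != depot:
            raise ValueError("every tour must start at the depot")
        others = tour[1:]
        if not others:
            # Assumption: an idle salesman is not allowed.
            raise ValueError("each salesman must visit at least one city")
        if depot in others or len(set(others)) != len(others) or visited & set(others):
            raise ValueError("each non-depot city may be visited at most once")
        visited |= set(others)
    if len(visited) + 1 != k:
        raise ValueError("the tours must cover exactly k cities including the depot")
    return sum(tour_length(t, coords) for t in tours)


if __name__ == "__main__":
    coords = [(0, 0), (3, 0), (3, 4), (0, 4), (10, 10)]  # city 4 is left unvisited
    print(kmtsp_cost([[0, 1, 2], [0, 3]], coords, k=4))
```

A search method such as a GA would call an evaluation like `kmtsp_cost` as its fitness function, choosing both which k cities to visit and the order in which each salesman visits them.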
A deep learning-based framework for automatic detection of COVID-19 using chest X-ray and CT-scan images
COVID-19 has profoundly impacted global public health, underscoring the need for rapid detection methods. Radiography and radiologic imaging, especially chest X-rays, enable swift diagnosis of infected individuals. This study examines the use of machine learning to identify COVID-19 from X-ray images. By gathering a dataset of 9,000 chest X-rays and CT scans from public resources, meticulously vetted by board-licensed radiologists to confirm COVID-19 presence, the research sets a robust foundation. However, further validation is essential: expanding datasets to encompass enough COVID-19 cases enhances convolutional neural network (CNN) accuracy. Among various machine learning techniques, deep learning excels in identifying distinct patterns in the imaging characteristics discernible in chest radiographs of COVID-19 patients. Yet extensive validation across diverse datasets and clinical trials is crucial to ensure the robustness and generalizability of these models. The discussion extends to further complexities, including ethical considerations around patient privacy and the integration of intelligent technology into clinical workflows. Collaborating closely with healthcare professionals ensures this technology complements the established diagnostic approach. Despite the potential to detect COVID-19 using chest X-ray imaging findings, thorough research and validation, alongside ethical deliberations, are vital before implementing it in the healthcare field. The results show that the proposed model achieved classification accuracy and F1 score of 96% and 98%, respectively, for the X-ray images
Exploring bibliometric trends in speech emotion recognition (2020-2024)
Speech Emotion Recognition (SER) is crucial in various real-world applications, including healthcare, human-computer interaction, and affective computing. By enabling systems to detect and respond to human emotions through vocal cues, SER enhances user experience, supports mental health monitoring, and improves adaptive technologies. This research presents a bibliometric analysis of SER based on 68 articles from 2020 to early 2024. The findings show a significant increase in publications each year, reflecting the growing interest in SER research. The analysis highlights various approaches in preprocessing, data sources, feature extraction, and emotion classification. India and China emerged as the most active contributors, with external funding, particularly from the NSFC, playing a significant role in the advancement of SER research. SVM remains the most widely used classification model, followed by KNN and CNN. However, several critical challenges persist, including inconsistent data quality, cross-linguistic variability, limited emotional diversity in datasets, and the complexity of real-time implementation. These limitations hinder the generalizability and scalability of SER systems in practical environments. Addressing these gaps is essential to enhance SER performance, especially for multimodal and multilingual applications. This study provides a detailed understanding of SER research trends, offering valuable insights for future advances in speech-based emotion recognition
Image analysis and machine learning techniques for accurate detection of common mango diseases in warm climates
Mangoes are valuable crops grown in warm climates, but they often suffer from diseases that harm both the trees and the fruits. This paper proposes a new way to use machine learning to detect these diseases early in mango plants. We focused on common issues like mango fruit diseases, leaf diseases, powdery mildew, anthracnose/blossom blight, and dieback, which are particularly problematic in places like Bangladesh. Our method starts by improving the quality of images of mango plants and then extracting important features from these images. We use a technique called k-means clustering to divide the images into meaningful parts for analysis. After extracting ten key features, we tested various ways to classify the diseases. The random forest algorithm stood out, accurately identifying diseases with a 97.44% success rate. This research is crucial for Bangladesh, where mango farming is essential for the economy. By spotting diseases early, we can improve mango production, quality, and the livelihoods of farmers. This automated system offers a practical way to manage mango diseases in regions with similar climates
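As a rough illustration of the k-means segmentation step mentioned above, the following sketch clusters pixel intensities into k groups with Lloyd's algorithm. It is an assumption-laden toy: the abstract does not say which pixel features, which k, or which initialization the authors used, so the grayscale input, the evenly spaced initial centers, and the name `kmeans_1d` are all hypothetical.

```python
def kmeans_1d(values, k, iters=20):
    """Cluster scalar values (e.g. grayscale pixel intensities) into k groups.

    Returns (labels, centers). Initial centers are spread evenly over the
    value range, which is an illustrative choice, not the paper's method.
    """
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value joins its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = sum(members) / len(members)
    return labels, centers


if __name__ == "__main__":
    # Dark pixels (e.g. healthy tissue) versus bright pixels (e.g. lesion), purely illustrative.
    pixels = [10, 12, 11, 200, 210, 205]
    print(kmeans_1d(pixels, k=2))
```

In an image pipeline, the labels would be reshaped back to the image grid so that each cluster marks one region from which features are then extracted.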
Comparative evaluation of left ventricle segmentation using improved pyramid scene parsing network in echocardiography
Automatic segmentation of the left ventricle is a challenging task due to the presence of artifacts and speckle noise in echocardiography. This paper studies the ability of a fully supervised network based on the pyramid scene parsing network (PSPNet) to perform echocardiographic left ventricular segmentation. First, the lightweight MobileNetv2 is selected to replace ResNet, adjusting the encoder structure of the network and reducing the computational complexity, and the pyramid scene parsing module is integrated to construct the PSPNet. Second, dilated convolution and feature fusion are introduced to propose an improved PSPNet model, and the impact of pre-training and transfer learning on segmentation performance is studied. Finally, the public dataset of the challenge on endocardial three-dimensional ultrasound segmentation (CETUS) is used to train and test PSPNet models with different backbones and initializations. The results demonstrate that the improved PSPNet model has strong segmentation advantages in terms of accuracy and running speed. Compared with the two classic algorithms VGG and Unet, the dice similarity coefficient (DSC) index is increased by an average of 7.6%, Hausdorff distance (HD) is reduced by 2.9%, and the mean intersection over union (mIoU) is improved by 8.8%. Additionally, the running time is greatly shortened, indicating good clinical application potential
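For readers unfamiliar with the overlap metrics reported above, here is a minimal sketch of the Dice similarity coefficient and intersection over union for a single pair of binary masks. The nested-list mask representation, the function names, and the convention of returning 1.0 when both masks are empty are assumptions; the paper's evaluation code is not available here, and mIoU would additionally average IoU over classes or images.

```python
def _overlap_counts(pred, truth):
    """Return (intersection, pred_area, truth_area) for two binary masks."""
    inter = sum(p & t for pr, tr in zip(pred, truth) for p, t in zip(pr, tr))
    pred_area = sum(sum(row) for row in pred)
    truth_area = sum(sum(row) for row in truth)
    return inter, pred_area, truth_area


def dice(pred, truth):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|)."""
    inter, a, b = _overlap_counts(pred, truth)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if a + b == 0 else 2 * inter / (a + b)


def iou(pred, truth):
    """Intersection over union: |A ∩ B| / |A ∪ B|."""
    inter, a, b = _overlap_counts(pred, truth)
    union = a + b - inter
    return 1.0 if union == 0 else inter / union


if __name__ == "__main__":
    pred = [[1, 1, 0],
            [0, 1, 0]]
    truth = [[1, 0, 0],
             [0, 1, 1]]
    print(dice(pred, truth), iou(pred, truth))
```

The two metrics are monotonically related for a single mask pair (DSC = 2·IoU / (1 + IoU)), so they rank individual predictions identically, although their averages over a dataset can differ.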
An algorithm for controlling the transmission of video streams in a flying ad hoc network
This article discusses the enhancement of video surveillance in various territories through the implementation of a flying ad hoc network (FANET). The primary purpose of the surveillance is search and rescue operations. To optimize the quality of FANET video broadcasting, a decision-making algorithm for video stream management is introduced. This algorithm evaluates the likelihood of achieving high-quality video transmission. Depending on the assessed probabilities, the algorithm recommends one of the following actions: initiating a new video stream transmission, reducing the average length of wireless channels, or discontinuing the transmission of low-information video streams. Computational experiments demonstrate a significant improvement in the accuracy of decision-making regarding the management of video stream transmission in the FANET when utilizing the proposed algorithm
Deep transfer learning for classification of ECG signals and lip images in multimodal biometric authentication systems
Authentication plays an essential role in diverse kinds of applications that require security. Several authentication methods have been developed, but biometric authentication has gained considerable attention from the research community and industry due to its reliability and robustness. This study investigates multimodal authentication techniques utilizing electrocardiogram (ECG) signals and lip images. Leveraging transfer learning from pre-trained ResNet and VGG16 models, features are extracted from ECG signals and photos of the lip area of the face. Subsequently, a convolutional neural network (CNN) classifier is employed for classification based on the extracted features. The dataset used in this study comprises ECG signals and lip images, representing distinct biometric modalities. The primary objective of the study is to improve the reliability and precision of multimodal authentication systems through the integration of transfer learning and CNN classification. Verification results show that the suggested method is successful in producing trustworthy authentication using multimodal biometric traits. The experimental analysis shows that the proposed deep transfer learning-based model achieved average accuracy, F1-score, precision, and recall of 0.962, 0.970, 0.965, and 0.966, respectively
Early goat disease detection using temperature models: k-nearest neighbor, decision tree, naive Bayes, and random forest
This study aims to aid livestock activities by enabling early detection of diseases in goats through body temperature measurement. Early detection is crucial to prevent disease spread and improve livestock welfare. Using the knowledge discovery in databases (KDD) methodology, the study involves collecting, processing, and analyzing goat body temperature data. Four algorithms—k-nearest neighbor (KNN), decision tree, naive Bayes, and random forest—were used to develop disease detection models. The decision tree algorithm was found to be the most accurate, achieving 100% accuracy. This demonstrates its effectiveness in detecting diseases based on body temperature. Implementing this model is expected to significantly benefit farmers by helping maintain the health and productivity of their livestock
Optimizing nitik batik classification through comparative analysis of image augmentation
Nitik batik is one of the most intricate and culturally significant motifs in Yogyakarta's batik tradition, characterized by its complex, geometric dot-based patterns. The unique challenges of automatically classifying nitik batik motifs stem from the high variability within the class and the limited availability of training data. This study investigates how different image data augmentation techniques can enhance the performance of a random forest classifier for nitik batik motifs. Techniques such as geometric transformations (flip, rotate, and scaling), intensity transformations (cut-out, grid mask, and random erasing), non-instance level augmentation (pairing samples), and unconditional image generation (deep convolutional generative adversarial network (DCGAN)) were used to expand the dataset and improve the model's ability to generalize. The results show that specific techniques, notably flip, cut-out, and DCGAN, significantly improved classification accuracy, with flip achieving the highest accuracy improvement of 20.20%, followed by cut-out at 19.27% and DCGAN at 16.25%. Moreover, DCGAN demonstrated the lowest standard deviation (0.78%), indicating high stability and robustness in classification performance across multiple validation folds. These findings suggest that augmentation techniques effectively improve classification accuracy and enhance the model's ability to generalize from limited and complex datasets
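Two of the augmentations named above, flip and cut-out, can be sketched on a toy image stored as a nested list of pixel values. This is only an illustration of the general techniques: the abstract does not give the flip axis, patch size, fill value, or patch placement used in the study, so the parameters and function names below are hypothetical.

```python
def horizontal_flip(image):
    """Mirror the image left to right; returns a new nested list."""
    return [row[::-1] for row in image]


def cutout(image, top, left, size, fill=0):
    """Mask a size x size square whose top-left corner is (top, left).

    The patch is clipped at the image border. In practice the position is
    usually drawn at random per sample; it is a parameter here so the
    example stays deterministic.
    """
    out = [row[:] for row in image]  # copy so the input is not modified
    for r in range(max(top, 0), min(top + size, len(out))):
        for c in range(max(left, 0), min(left + size, len(out[r]))):
            out[r][c] = fill
    return out


if __name__ == "__main__":
    img = [[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]]
    print(horizontal_flip(img))
    print(cutout(img, top=0, left=1, size=2))
```

Each augmented copy keeps the label of its source image, which is how such transformations enlarge a small training set without new annotation.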