International Journal of Advances in Intelligent Informatics
235 research outputs found
Automatic note generator for Javanese gamelan music accompaniment using deep learning
Javanese gamelan is a traditional form of music from Indonesia with a variety of styles and patterns. One of these patterns is the harmony music of the Bonang Barung and Bonang Penerus instruments. When playing gamelan, the resulting patterns vary with the music's rhythm or dynamics, which can be challenging for novice players who are unfamiliar with the gamelan rules and with a notation system that provides only melodic notes. Unlike in modern music, where harmony notes are often the same for all instruments, harmony music in Javanese gamelan is vital in establishing the character of a song. With technological advancements, musical compositions can be generated automatically without human participation, which has become a trend in music generation research. This study proposes a method to generate musical accompaniment notes for harmony music using a bidirectional long short-term memory (BiLSTM) network and compares it with recurrent neural network (RNN) and long short-term memory (LSTM) models, using numerical notation to represent musical data, which makes it easier to learn the variations of harmony music in Javanese gamelan. This method replaces the gamelan composer in completing the notation for all the instruments in a song. To evaluate the generated harmonic music, note distance, dynamic time warping (DTW), and cross-correlation techniques were used to measure the distance between the system-generated results and the gamelan composer's creations. In addition, audio features were extracted and used to visualize the audio. The experimental results show that all models produced better accuracy when using all features of the song, reaching around 90%, compared to using only two features (rhythm and note of melody), which reached 65-70%. Furthermore, the BiLSTM model produced musical harmonies that were more similar to the original music (+93%) than those generated by the LSTM (+92%) and RNN (+90%). This study can be applied to performing Javanese gamelan music.
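As an illustration of the DTW comparison mentioned in this abstract, here is a minimal sketch of the textbook dynamic time warping distance between two note sequences. The note sequences, the absolute-difference cost, and the function name `dtw_distance` are assumptions made for the example; the abstract does not specify the exact cost function or encoding the authors used.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    Uses the absolute difference between note numbers as the local cost
    (an assumption for this sketch).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] is matched again (stretch b)
                cost[i][j - 1],      # b[j-1] is matched again (stretch a)
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# Hypothetical sequences in numbered notation (not data from the paper).
composer = [1, 2, 3, 5, 6, 5, 3, 2]
generated = [1, 2, 2, 3, 5, 6, 5, 3, 2]  # same melody with one repeated note

if __name__ == "__main__":
    print(dtw_distance(composer, generated))
```

Because DTW allows one note to align with several notes in the other sequence, a generated line that merely repeats a note is not penalized, while a wrong pitch is.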
Pneumonia Detection on X-Ray Imaging using Softmax Output in Multilevel Meta Ensemble Algorithm of Deep Convolutional Neural Network Transfer Learning Models
Pneumonia is the leading cause of death from a single infection worldwide in children. A proven clinical method for diagnosing pneumonia is a chest X-ray. However, the resulting X-ray images are often unclear, leading to subjective judgments, and the diagnostic process takes a long time. One remedy is to apply advanced deep learning, namely transfer learning with deep convolutional neural networks (Deep CNN) and a modified multilevel meta ensemble learning scheme using softmax outputs. The purpose of this research was to improve the accuracy of the pneumonia classification model. This study proposes a classification model with a meta-ensemble approach using five classification algorithms: Xception, ResNet15V2, InceptionV3, VGG16, and VGG19. The ensemble is built in two levels. The first-level ensemble combines the outputs of the Xception, ResNet15V2, and InceptionV3 algorithms. The output of the first-level ensemble is then reused in the next learning process, where it is combined with the outputs of the remaining algorithms, VGG16 and VGG19; this is the second-level ensemble. Both levels use KNN as the classification model. In the experiments, the proposed model achieved better accuracy than the others, with a test accuracy of 98.272%. This research could help doctors as a recommendation tool for making more accurate and timely diagnoses, thus speeding up treatment and reducing the risk of complications.
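The two-level structure described in this abstract can be sketched roughly as follows. This is only an illustration under several assumptions: the softmax vectors are made-up numbers rather than real network outputs, the level-1 output is represented as KNN vote fractions, `k=1` is chosen arbitrarily, and the level-1 outputs for the training rows are computed on the training set itself (a real stacking setup would normally use held-out predictions). All function and variable names are hypothetical.

```python
import math
from collections import Counter


def concat(*vectors):
    """Flatten several probability vectors into one feature vector."""
    return [value for vector in vectors for value in vector]


def knn_predict_proba(train_x, train_y, x, k=1, n_classes=2):
    """Vote fractions among the k nearest training points (Euclidean distance)."""
    nearest = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return [votes.get(c, 0) / k for c in range(n_classes)]


# Hypothetical softmax outputs [p_normal, p_pneumonia] per base model.
# Columns: Xception, ResNet, InceptionV3, VGG16, VGG19, true label.
TRAIN = [
    ([0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4], [0.8, 0.2], 0),
    ([0.8, 0.2], [0.9, 0.1], [0.6, 0.4], [0.7, 0.3], [0.9, 0.1], 0),
    ([0.2, 0.8], [0.1, 0.9], [0.3, 0.7], [0.4, 0.6], [0.2, 0.8], 1),
    ([0.1, 0.9], [0.3, 0.7], [0.2, 0.8], [0.3, 0.7], [0.1, 0.9], 1),
]


def predict(sample, k=1):
    """Two-level stacking: KNN on three models, then KNN on that output plus two more."""
    labels = [row[5] for row in TRAIN]
    # Level 1: features are the concatenated softmax outputs of the first three models.
    l1_x = [concat(*row[:3]) for row in TRAIN]
    l1_train_out = [knn_predict_proba(l1_x, labels, x, k) for x in l1_x]
    # Level 2: level-1 output combined with the VGG16 and VGG19 softmax outputs.
    l2_x = [concat(out, row[3], row[4]) for out, row in zip(l1_train_out, TRAIN)]
    l1_out = knn_predict_proba(l1_x, labels, concat(*sample[:3]), k)
    l2_out = knn_predict_proba(l2_x, labels, concat(l1_out, sample[3], sample[4]), k)
    return max(range(len(l2_out)), key=l2_out.__getitem__)


if __name__ == "__main__":
    unseen = ([0.15, 0.85], [0.2, 0.8], [0.25, 0.75], [0.35, 0.65], [0.2, 0.8])
    print(predict(unseen))
```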
IDSX-Attention: Intrusion detection system (IDS) based hybrid MADE-SDAE and LSTM-Attention mechanism
An Intrusion Detection System (IDS) is essential for automatically monitoring cyber-attack activity. Adopting machine learning to develop automatic cyber-attack detection has become an important research topic in the last decade. Deep learning is a popular machine learning approach that has recently been applied in IDS applications. Over the last five years, deep learning models with complex layered architectures have been adopted to increase IDS detection effectiveness. Unfortunately, most deep learning models generate a large number of false negatives, so that detection errors dominate and degrade the performance of IDS applications. This paper integrates three components: a statistical model that removes outliers during pre-processing, an SDAE that reduces data dimensionality, and an LSTM-Attention network that performs attack classification. The model was applied to the NSL-KDD dataset and evaluated using accuracy, F1, recall, and confusion matrix measures. The results showed that the proposed IDSX-Attention outperformed the baseline models (SDAE, LSTM, PCA-LSTM, and Mutual Information (MI)-LSTM), achieving more than a 2% improvement on average. This study demonstrates the potential of the proposed IDSX-Attention, particularly as a deep learning approach, in enhancing the effectiveness of IDS and addressing the challenges in cyber threat detection. It highlights the importance of integrating statistical models, deep learning, and dimensionality reduction mechanisms to improve IDS detection. Further research can explore the integration of other deep learning algorithms and datasets to validate the proposed model's effectiveness and improve the performance of IDS.
Leveraging social media data using latent dirichlet allocation and naïve bayes for mental health sentiment analytics on Covid-19 pandemic
In Malaysia, during the early stages of the COVID-19 pandemic, the negative impact on mental health became noticeable. The public's psychological and behavioral responses intensified as the COVID-19 outbreak progressed. A heightened perception of severity, vulnerability, impact, and fear was the factor that contributed to higher anxiety. Social media data can be used to track Malaysian sentiments in the COVID-19 era. However, such data is usually found on the internet as unlabeled text, and manually decoding it is complicated. Furthermore, traditional data-gathering approaches, such as filling out a survey form, may not completely capture the sentiments. This study uses a text mining technique called Latent Dirichlet Allocation (LDA) on social media data to discover mental health topics during the COVID-19 pandemic. A model is then developed using a hybrid approach that combines a lexicon-based method with a Naïve Bayes classifier. Accuracy, precision, recall, and F-measure are used to evaluate the sentiment classification. The results show that the best lexicon-based technique is VADER, with 72% accuracy, compared to TextBlob, with 70% accuracy. These sentiment results allow for a better understanding and handling of the pandemic. The top three topics are identified and further classified into positive and negative comments. In conclusion, the developed model can assist healthcare workers and policymakers in making the right decisions in upcoming pandemic outbreaks.
Fault diagnosis-based SDG transfer for zero-sample fault symptom
Traditional fault diagnosis models cannot achieve good diagnosis accuracy when a new, unseen fault class appears in the test set but no sample of this fault exists in the training set. Studying the unseen cause-effect problem of fault symptoms is therefore extremely challenging. As various faults often occur in a chemical plant, it is necessary to perform fault cause-effect diagnosis to find the root cause of a fault. However, only limited fault cause-effect data are typically available for constructing a reliable cause-effect diagnosis model. Worse still, measurement noise often contaminates the collected data. These problems are very common in industrial operations. Yet previously developed data-driven approaches rarely include cause-effect relationships between variables, particularly in the zero-shot setting. This causes incorrect inference of seen faults and makes it impossible to predict unseen faults. This study combines zero-shot learning, conditional variational autoencoders (CVAE), and the signed directed graph (SDG) to solve the above problems. Specifically, the learning approach determines the cause-effect relationships of all the faults using SDG with physics knowledge to obtain the fault descriptions. SDG is used to determine the attributes of the seen and unseen faults. Unlike the seen fault label space, attributes make it easy to create an unseen fault space from a seen fault space. Once the corresponding attribute spaces of the failure causes are available, some failure causes are learned in advance by a CVAE model from the available fault data. The advantage of the CVAE is that process variables are mapped into the latent space for dimension reduction and measurement noise reduction, so the latent data can more accurately represent the actual behavior of the process. Then, with the extended space spanned by unseen attributes, the migration capabilities can predict the unseen causes of failure and infer the causes of the unseen failures. Finally, the feasibility of the proposed method is verified using data collected from chemical reaction processes.
Detection of multi-class arrhythmia using heuristic and deep neural network on edge device
Heart disease is a heart condition that sometimes causes a person to die suddenly. One indication is a rhythm disorder known as arrhythmia. Multi-class arrhythmia detection follows two steps: QRS complex detection and arrhythmia classification based on the QRS complex morphology. We propose an edge device that detects QRS complexes based on variance analysis (QVAT) and classifies arrhythmia based on the QRS complex spectrogram. The classifier is a two-dimensional convolutional neural network (2D CNN). We use a single-board computer and a neural network compute stick to implement the edge device. The outcome is a prototype device that cardiologists can use as a supporting tool for analysing ECG signals and that patients can use for self-tests to check their heart health. To evaluate the performance of our edge device, we tested it on the MIT-BIH database because other methods also use this data. The QVAT sensitivity and positive predictivity are 99.81% and 99.90%, respectively. Our classifier's accuracy, sensitivity, positive predictivity, specificity, and F1-score are 99.82%, 99.55%, 99.55%, 99.89%, and 99.55%, respectively. The experimental results for arrhythmia classification show that our method outperforms the others, while for R-peak detection, the QVAT implemented on an edge device is comparable to the other methods. In future work, we can improve the performance of R-peak detection using the double-check algorithm in QVAT and cross-check the QRS complex detection by adding one class to the classifier, namely the non-QRS class.
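For readers less familiar with the metrics reported in this abstract, the following sketch computes them from the four cells of a binary confusion matrix using the standard definitions. The counts are hypothetical and are not taken from the paper; for a multi-class classifier these quantities would typically be computed per class and then averaged.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from true/false positive/negative counts."""
    sensitivity = tp / (tp + fn)           # also called recall
    positive_predictivity = tp / (tp + fp) # also called precision
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F1 is the harmonic mean of positive predictivity and sensitivity.
    f1 = (2 * positive_predictivity * sensitivity
          / (positive_predictivity + sensitivity))
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "positive_predictivity": positive_predictivity,
        "specificity": specificity,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical counts for one class treated as "positive".
    for name, value in binary_metrics(tp=90, fp=10, tn=880, fn=20).items():
        print(f"{name}: {value:.4f}")
```

Since F1 is a harmonic mean, it equals sensitivity whenever sensitivity and positive predictivity are equal, which is consistent with the three identical 99.55% figures reported above.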
Lightweight pyramid residual features with attention for person re-identification
Person re-identification is a computer vision problem that aims to retrieve images of the same person from image collections (or galleries). It is very useful for searching for or tracking people in a closed environment (like a mall or building). A notable issue in person re-identification is that models are usually designed for accuracy alone rather than for both accuracy and computing cost, which matters for devices with limited computing power. In this paper, we propose a lightweight residual network with pyramid attention for person re-identification. The lightweight residual network, adopted from the residual network (ResNet) model used for CIFAR dataset experiments, consists of no more than two million parameters. An additional pyramid feature extraction network and an attention module are added to the network to improve the classifier's performance. We use a CPFE (Context-aware Pyramid Features Extraction) network that utilizes atrous convolution with different dilation rates to extract the pyramid features. In addition, two different attention networks are used for the classifier: channel-wise and spatial-based attention networks. The proposed classifier is tested using the widely used Market-1501 and DukeMTMC-reID person re-identification datasets. Experiments on these datasets show that our proposed classifier performs well and outperforms the classifier without CPFE and attention networks. Further investigation and an ablation study show that our proposed classifier has higher information density compared with other person re-identification methods.
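To make the idea of atrous (dilated) convolution with different dilation rates concrete, here is a one-dimensional sketch in plain Python. It is not the CPFE network itself: the signal, the kernel, and the dilation rates (1, 2, 3) are assumptions for illustration, and real implementations operate on 2D feature maps with learned kernels.

```python
def atrous_conv1d(signal, kernel, dilation):
    """'Valid' 1D dilated convolution (cross-correlation, as in deep learning).

    Kernel taps are spaced `dilation` samples apart, so the receptive field
    grows to (len(kernel) - 1) * dilation + 1 without adding parameters.
    """
    span = (len(kernel) - 1) * dilation
    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(len(signal) - span)
    ]


def pyramid_features(signal, kernel, dilations=(1, 2, 3)):
    """Apply the same kernel at several dilation rates (a feature 'pyramid')."""
    return {d: atrous_conv1d(signal, kernel, d) for d in dilations}


if __name__ == "__main__":
    ramp = list(range(10))       # hypothetical input: 0, 1, ..., 9
    difference = [1, 0, -1]      # hypothetical 3-tap kernel
    for rate, out in pyramid_features(ramp, difference).items():
        print(rate, out)
```

On the ramp input, the same three-tap kernel compares samples 2, 4, and 6 positions apart as the dilation rate increases, which is how a pyramid of dilation rates captures context at several scales.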
Improving convolutional neural network based on hyperparameter optimization using variable length genetic algorithm for english digit handwritten recognition
Convolutional Neural Networks (CNNs) perform well compared to other deep learning models in image recognition, especially on handwritten alphanumeric datasets. A key challenge for CNNs is finding an architecture with the right hyperparameters. Usually, this is done by trial and error. A genetic algorithm (GA) has been widely used for automatic hyperparameter optimization. However, the original GA with a fixed chromosome length can yield suboptimal solutions because a CNN has a variable number of hyperparameters depending on the depth of the model. Previous work proposed variable chromosome lengths to overcome this drawback of the original GA. This paper proposes a variable-length GA that adds global hyperparameters, namely the optimizer and learning rate, to systematically and automatically tune CNN hyperparameters and improve performance. We optimize seven hyperparameters: learning rate, optimizer, kernel, filter, activation function, number of layers, and pooling. The experimental results show that a population of 25 produces the best fitness value and average fitness. In addition, the comparison results show that the proposed model is superior to the basic model based on accuracy. The experimental results show that the proposed model is about 99.18% higher than the baseline model.
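The variable-length chromosome idea can be sketched as follows. This is a toy illustration, not the authors' algorithm: the gene value ranges, the crossover and mutation operators, the selection scheme, and above all `toy_fitness` are assumptions. A real run would decode each chromosome into a CNN, train it, and use validation accuracy as the fitness. Only the population size of 25 and the seven hyperparameter names come from the abstract.

```python
import random

# Hypothetical search space for the seven hyperparameters named in the abstract.
OPTIMIZERS = ["sgd", "adam", "rmsprop"]
LEARNING_RATES = [1e-2, 1e-3, 1e-4]
FILTERS = [16, 32, 64]
KERNELS = [3, 5]
ACTIVATIONS = ["relu", "tanh"]
POOLING = ["max", "avg", "none"]
MAX_LAYERS = 6


def random_layer(rng):
    """One per-layer gene group; the number of groups is the network depth."""
    return {"filters": rng.choice(FILTERS), "kernel": rng.choice(KERNELS),
            "activation": rng.choice(ACTIVATIONS), "pooling": rng.choice(POOLING)}


def random_chromosome(rng):
    """Global genes (optimizer, learning rate) plus a variable-length layer list."""
    return {"optimizer": rng.choice(OPTIMIZERS),
            "learning_rate": rng.choice(LEARNING_RATES),
            "layers": [random_layer(rng) for _ in range(rng.randint(1, MAX_LAYERS))]}


def crossover(p1, p2, rng):
    """Independent cut points in each parent, so the child's length can differ."""
    cut1 = rng.randint(0, len(p1["layers"]))
    cut2 = rng.randint(0, len(p2["layers"]))
    layers = (p1["layers"][:cut1] + p2["layers"][cut2:])[:MAX_LAYERS]
    if not layers:
        layers = [random_layer(rng)]
    return {"optimizer": rng.choice([p1, p2])["optimizer"],
            "learning_rate": rng.choice([p1, p2])["learning_rate"],
            "layers": [dict(layer) for layer in layers]}


def mutate(ch, rng, rate=0.2):
    """Randomly change a global gene, or insert or remove a layer."""
    if rng.random() < rate:
        ch["optimizer"] = rng.choice(OPTIMIZERS)
    if rng.random() < rate:
        ch["learning_rate"] = rng.choice(LEARNING_RATES)
    if rng.random() < rate and len(ch["layers"]) < MAX_LAYERS:
        ch["layers"].insert(rng.randint(0, len(ch["layers"])), random_layer(rng))
    if rng.random() < rate and len(ch["layers"]) > 1:
        ch["layers"].pop(rng.randrange(len(ch["layers"])))
    return ch


def toy_fitness(ch):
    # Placeholder only: stands in for "train the decoded CNN, return accuracy".
    return -abs(len(ch["layers"]) - 3) + (ch["optimizer"] == "adam")


def evolve(generations=10, pop_size=25, seed=0):
    rng = random.Random(seed)
    pop = [random_chromosome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=toy_fitness, reverse=True)
        parents = pop[: pop_size // 2]  # keep the better half unchanged
        children = [mutate(crossover(*rng.sample(parents, 2), rng), rng)
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=toy_fitness)


if __name__ == "__main__":
    print(evolve())
```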
Detecting and monitoring the development stages of wild flowers and plants using computer vision: approaches, challenges and opportunities
Wild flowers and plants play an important role in protecting biodiversity and providing various ecosystem services. However, some of them are endangered or threatened and are entitled to preservation and protection. This study represents a first step toward developing a computer vision system and a supporting mobile app for detecting and monitoring the development stages of wild flowers and plants, aiming to contribute to their preservation. It first introduces the related concepts. It then surveys related work and categorizes existing solutions, presenting their key features, strengths, and limitations. The most promising solutions and techniques are identified. Insights on open issues and research directions in the topic are also provided. This paper paves the way for a wider adoption of recent computer vision results in this field and for the proposal of a mobile application that uses YOLO convolutional neural networks to detect the stages of development of wild flowers and plants.
Deep learning pest detection on Indonesian red chili pepper plant based on fine-tuned YOLOv5
This research developed a pest detection model for Indonesian red chili pepper based on fine-tuned YOLOv5. Indonesian red chili pepper is the third largest vegetable commodity produced in Indonesia. Pest attacks disrupt the quantity and quality of crop yields. To control pests effectively, it is necessary to detect the type of pest correctly. A viable solution is to leverage computer vision and deep learning technologies. However, no previous studies have developed a pest detection model for Indonesian red chili pepper based on this technology. YOLOv5 is a variant of the YOLO object detection algorithm, which has major advantages in terms of computation cost and execution speed. The dataset comprises 4,994 image files collected from a chili plantation in Bengkulu province, Indonesia, covering 4 different classes and a total of 10,683 pests. Each image is 1216 x 1216 px, with the smallest, largest, and average object dimensions being 2%, 35%, and 4% of the image dimensions. The model is trained by fine-tuning YOLOv5s with patience values of 100, 200, and 300 as the early-stopping parameter. The evaluation of the trained model is based on train loss, validation loss, and mAP@0.5:0.95. The best-trained model is from the 445th epoch with patience 100, with the best confidence value of 0.321 and the highest TF1 of 0.74. When the best-trained model is tested on the test dataset, the mAP@0.5 for all classes is 81.3%. The model not only detected large pests but was also able to detect objects that were small relative to the image size. The best-trained model's best mAP@0.5 performance and speed are 82.6% and 20 ms/image, or 50 fps, on an NVIDIA P100 GPU.
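The role of patience as an early-stopping parameter, and the relation between per-image latency and frames per second, can be sketched as below. This is a generic illustration, not the YOLOv5 training code: the validation scores are made up, and `early_stop` and `frames_per_second` are hypothetical helper names.

```python
def early_stop(val_scores, patience):
    """Stop once `patience` epochs pass without a new best validation score.

    Returns (best_epoch, best_score, last_epoch_run), with 0-based epochs.
    """
    best_epoch, best_score, last_epoch = -1, float("-inf"), -1
    for epoch, score in enumerate(val_scores):
        last_epoch = epoch
        if score > best_score:
            best_epoch, best_score = epoch, score
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs
    return best_epoch, best_score, last_epoch


def frames_per_second(ms_per_image):
    """Convert per-image latency in milliseconds to throughput in images/second."""
    return 1000.0 / ms_per_image


if __name__ == "__main__":
    # Hypothetical validation scores per epoch (not from the paper).
    scores = [0.50, 0.60, 0.65, 0.64, 0.66, 0.65, 0.65, 0.64, 0.63]
    print(early_stop(scores, patience=3))
    print(frames_per_second(20))  # the 20 ms/image figure quoted in the abstract
```

A larger patience lets training continue through longer plateaus, which is why the same run can end at different epochs for patience values such as 100, 200, and 300.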