149 research outputs found
Analysis of Taxpayer Behavior to Predict Motor Vehicle Tax Payments Using the Weighted Majority Voting Ensemble Approach
Non-compliant taxpayer behavior causes Motor Vehicle Tax (MVT) revenues to fall short of predetermined targets. The resulting loss of income means that several regional development targets may not be achieved. Regional Governments therefore need to predict MVT payments in order to set better future targets. This research analyzes taxpayer behavior to predict whether future MVT payments will be compliant, late, or unpaid. The proposed approach starts by analyzing and obtaining a dataset of taxpayer behavioral features. An ensemble classifier based on Weighted Majority Voting (WMV) is then used to predict payments. The individual classifiers in the WMV ensemble are tuned with the GridSearchCV technique, which searches for the hyperparameter values that give each model its highest accuracy. The weight derived from each model's accuracy is converted into a ranking that sets its number of votes, in order to maximize model performance. Next, feature ablation analysis is carried out to understand the contribution of each feature to model performance. The proposed system is evaluated using the confusion matrix, accuracy, precision, recall, and F1-score. The results show that the WMV method, with an accuracy of 96.247%, outperforms the individual classifiers in predicting MVT payments based on taxpayer behavior.
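The rank-based voting step described in this abstract can be illustrated with a minimal sketch. It assumes that the least accurate classifier receives one vote and each more accurate classifier receives one more; the abstract does not give the exact conversion, so the function names, the accuracies, and the labels below are hypothetical.

```python
from collections import defaultdict


def rank_weights(accuracies):
    """Convert accuracies into vote counts (assumed scheme: rank 1 = least accurate)."""
    order = sorted(range(len(accuracies)), key=lambda i: accuracies[i])
    weights = [0] * len(accuracies)
    for rank, idx in enumerate(order, start=1):
        weights[idx] = rank
    return weights


def weighted_majority_vote(predictions, weights):
    """Return the label whose supporters hold the largest total number of votes."""
    tally = defaultdict(int)
    for label, weight in zip(predictions, weights):
        tally[label] += weight
    return max(tally, key=tally.get)


# Hypothetical validation accuracies of four tuned classifiers.
weights = rank_weights([0.91, 0.95, 0.88, 0.93])
# Each classifier's prediction for one taxpayer.
decision = weighted_majority_vote(["late", "compliant", "late", "compliant"], weights)
```

In this sketch a tie between labels is resolved in favor of the label seen first; a real system would need an explicit tie-breaking rule.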
Road Damage Detection Using YOLOv7 with Cluster Weighted Distance-IoU NMS
Road damage can occur anywhere, and potholes are one of its most common forms. Previous research on image-based pothole detection used the Faster Region-based Convolutional Neural Network (Faster R-CNN) method, which has a long inference time because it is a two-stage detector. Object detectors also require a post-processing step, non-maximum suppression (NMS), that keeps only the best prediction for each object. However, the original NMS handles small objects, distant objects, and pairs of objects close to each other poorly. This research therefore uses YOLOv7 as the object detector, because it has a better mean Average Precision (mAP) and a lower inference time than other object detection methods, together with an improved NMS method, Cluster Weighted Distance Intersection over Union (DIoU) NMS (CWD-NMS), to handle small or closely spaced potholes. YOLOv7 was trained on a new, independently collected pothole dataset combined with public datasets from previous research, and its detection results were better than those of Faster R-CNN. The YOLOv7 method was trained under various scenarios; the best one used the best checkpoint without a scheduler. With CWD-NMS, the mAP.5 and mAP.5-.95 values were 89.20% and 63.30%, with an inference time of 10.30 milliseconds per image.
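As background for the NMS variant named in this abstract, the following is a minimal sketch of plain DIoU-based NMS, not the paper's cluster weighted CWD-NMS. It assumes boxes in (x1, y1, x2, y2) format and a hypothetical suppression threshold of 0.5. DIoU subtracts the squared distance between box centers, normalized by the squared diagonal of the smallest enclosing box, from the IoU.

```python
def diou(a, b):
    """DIoU of two boxes (x1, y1, x2, y2): IoU minus normalized center distance."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union if union > 0 else 0.0
    # Squared distance between the two box centers.
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both boxes.
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
        + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou - d2 / c2 if c2 > 0 else iou


def diou_nms(boxes, scores, threshold=0.5):
    """Keep boxes in descending score order unless they overlap a kept box too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(diou(boxes[i], boxes[k]) <= threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (0.5, 0.5, 10.5, 10.5), (20, 20, 30, 30)]
kept = diou_nms(boxes, scores=[0.9, 0.8, 0.7])
```

Because the center-distance penalty lowers the score of boxes whose centers are far apart, two nearby but distinct objects are less likely to suppress each other than under IoU alone.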
Combination of Historical Stock Data and External Factors In Improving Stock Price Prediction Performance
Stock price prediction remains a major focus for investors. Previous studies often rely on technical analysis of historical stock price data and ignore external factors that can affect stock prices. This research addresses that shortcoming with a stock price prediction model that combines historical stock data (date, high, low, open, close, adjusted close, and volume) with external factors such as days, interest rates, inflation, and dividends. The data cover 33 companies from 11 industrial sectors in Indonesia over 2267 trading days, and prediction performance is evaluated using MSE, MAPE, and R-squared. The results show a significant improvement in the evaluation metrics when external factors are added, which underlines the importance of such factors in improving the prediction analysis and the reliability of the prediction model. The approach is expected not only to overcome the limitations of traditional methods but also to use a combination of deep learning and machine learning to improve prediction accuracy. The research thus offers new insights for financial analysis and helps investors make more informed and less risky decisions.
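The three evaluation metrics named in this abstract have standard definitions, sketched below in plain Python. The sample prices are invented for illustration and are not taken from the study; MAPE is expressed as a percentage and assumes that no true value is zero.

```python
def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (true values must be non-zero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical closing prices and model predictions.
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]
scores = (mse(actual, predicted), mape(actual, predicted), r_squared(actual, predicted))
```

Lower MSE and MAPE and a higher R-squared indicate a better fit, so an improvement from adding external factors would show up as movement in those directions.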
Improved Lip-Reading Language Using Gated Recurrent Units
Lip-reading is one of the most challenging problems in computer vision because it requires a large amount of training data and high computation time and power, and must cope with variation in word length. Previous methods, such as Mel Frequency Cepstrum Coefficients (MFCC) with Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) with LSTM, still yield low accuracy or long processing times because they use LSTM. In this study, we address this problem with a novel approach that offers high accuracy and low time consumption. In particular, we propose a lip-reading pipeline consisting of face detection, lip detection, filtering of the data to avoid overfitting due to data imbalance, CNN-based image feature extraction, MFCC-based voice feature extraction, and model training using LSTM and Gated Recurrent Units (GRU). Experiments on the Lip Reading Sentences dataset show that, compared to the state of the art, our proposed framework obtains higher accuracy when the input array dimension is deep, and lower time consumption.
Network Intrusion Detection System with Time-Based Sequential Cluster Models using LSTM and GRU
Technological development and the growth of the internet have had a positive and revolutionary impact on many areas of human life, such as banking, health, and science. Open Data and Open APIs also make it easier for entities to exchange data and information without the restrictions imposed by different regions and geographical areas. However, this openness also makes data vulnerable to theft, viruses, and various other types of cyber attacks. The large-scale data exchange that occurs across the network makes it challenging to detect unusual activity and new cyber attacks. Therefore, an Intrusion Detection System (IDS) is essential. An IDS helps system administrators detect cyber attacks and network anomalies, thus minimizing the risk of data leaks and intrusions. This research develops a new approach that uses time-based sequential clustered datasets with Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) models. The IDS model was implemented using the CIC-IDS 2018 dataset, which has more than 4 million rows. The LSTM and GRU models are used to classify the various attacks from sequential data that is ordered by time and clustered by destination port and protocol, such as TCP and UDP. The models were evaluated using accuracy, precision, recall, and F1-score, and the results show that the time-based sequential clustered LSTM and GRU models reach an accuracy of up to 97.21%. This suggests that the new approach is good enough to be applied to future IDS models.
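The data preparation described in this abstract (records ordered by time and clustered by destination port and protocol) could look roughly like the sketch below. The field names `timestamp`, `dst_port`, `protocol`, and `features`, the fixed window length, and the non-overlapping windows are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import defaultdict


def build_sequences(flows, seq_len):
    """Group flows by (destination port, protocol), sort each group by time,
    and cut it into non-overlapping windows of seq_len feature vectors."""
    groups = defaultdict(list)
    for flow in flows:
        groups[(flow["dst_port"], flow["protocol"])].append(flow)

    sequences = {}
    for key, items in groups.items():
        items.sort(key=lambda f: f["timestamp"])
        feats = [f["features"] for f in items]
        # Incomplete trailing windows are dropped in this sketch.
        sequences[key] = [feats[i:i + seq_len]
                          for i in range(0, len(feats) - seq_len + 1, seq_len)]
    return sequences


# Hypothetical flow records with one feature each.
flows = [
    {"timestamp": 3, "dst_port": 80, "protocol": "TCP", "features": [3]},
    {"timestamp": 1, "dst_port": 80, "protocol": "TCP", "features": [1]},
    {"timestamp": 2, "dst_port": 53, "protocol": "UDP", "features": [9]},
    {"timestamp": 2, "dst_port": 80, "protocol": "TCP", "features": [2]},
    {"timestamp": 4, "dst_port": 80, "protocol": "TCP", "features": [4]},
]
windows = build_sequences(flows, seq_len=2)
```

Each window is then a time-ordered sequence from a single port and protocol cluster, which is the shape of input a recurrent model such as an LSTM or GRU expects.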
Optimization of the Electronic Nose Sensor Array for Asthma Detection Based on Genetic Algorithm
The human body releases several gases and volatile organic compounds through exhaled breath. These compounds can be used as markers of lung disease, including asthma, and an electronic nose can help determine a patient's condition. The main problem is selecting appropriate sensors, based on their characteristics and performance in detecting various gases, so that the system is optimal while still providing high accuracy. Genetic algorithms are well suited to feature selection problems and can effectively handle noise and collinearity through three main genetic operators: crossover, mutation, and selection. This study applies the method to determine the optimal number of gas sensors for distinguishing healthy people from asthma suspects using exhaled breath. Several classification methods are combined with the selected gas sensor arrays to obtain an optimized electronic nose: support vector machine (SVM), random forest (RF), extreme gradient boosting (XGBoost), artificial neural network (ANN), one-dimensional convolutional neural network (1D-CNN), long short-term memory (LSTM), gated recurrent unit (GRU), 1D CNN-LSTM, and 1D CNN-GRU. These machine-learning approaches are commonly used in electronic nose systems and can be highly accurate, depending on their parameters. The experimental results showed that the genetic algorithm selected five gas sensors that produced a particular sensor pattern for the exhaled breath of asthma suspects. The 1D-CNN model was chosen as the classification method for the asthma dataset, with an accuracy of 96.6%, a precision of 96.1%, a recall of 95.5%, and an F1-score of 95.6%.
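The three genetic operators named in this abstract can be sketched for sensor-subset selection as follows. This is an illustrative toy, not the study's implementation: the binary mask encoding, tournament selection, one-point crossover, elitism, all parameter values, and the fitness function (which rewards a hypothetical set of informative sensors and penalizes the others) are assumptions. In practice the fitness would be a classifier's validation score.

```python
import random


def genetic_select(n_features, fitness, pop_size=20, generations=30,
                   mutation_rate=0.1, seed=0):
    """Search for a binary mask over features (1 = sensor kept) with high fitness."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def pick():
        # Selection: binary tournament between two random individuals.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randint(1, n_features - 1)  # crossover: one-point
            child = p1[:cut] + p2[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best


USEFUL = {0, 2, 5}  # hypothetical informative sensors, for the toy fitness only


def toy_fitness(mask):
    """Reward useful sensors, penalize every other selected sensor."""
    return sum(1.0 if i in USEFUL else -0.5 for i, g in enumerate(mask) if g)


mask = genetic_select(8, toy_fitness)
```

The returned mask marks which of the eight hypothetical sensors would be kept; fixing the seed makes the search reproducible.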
Stacking-based ensemble learning for identifying artist signatures on paintings
Identifying artist signatures on paintings is essential for authenticating artworks and advancing digital humanities. An artist's signature is a consistent element included in each painting that the artist creates, providing a unique identifier for their work. Traditional methods that rely on expert analysis and manual comparison are time-consuming and prone to human error. Although convolutional neural networks (CNNs) have shown promise in automating this process, existing single-model approaches struggle with the diversity and complexity of artistic styles, which limits their performance and generalizability. Therefore, this study proposes an ensemble learning approach that integrates the predictive power of multiple CNN-based models. The proposed framework leverages the strengths of three state-of-the-art CNNs: EfficientNetB4, ResNet-50, and Xception. These models were trained independently, and their predictions were combined using a meta-learning strategy. To address class imbalance, data augmentation techniques and weighted loss functions were employed. The experimental results on a dataset of more than 8,000 paintings from 50 artists demonstrate significant improvements over individual CNN architectures and other ensemble methods, indicating that the approach effectively captures complex features and improves generalizability.
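The stacking idea in this abstract, in which a meta-learner is trained on the outputs of independently trained base models, can be sketched with NumPy as below. The abstract does not say which meta-learner was used, so a least-squares linear model stands in here purely for illustration. The base-model probabilities are invented, and the helper names are hypothetical.

```python
import numpy as np


def stack_features(prob_list):
    """Concatenate each base model's class probabilities into one meta-feature row."""
    return np.hstack(prob_list)


def fit_meta(meta_x, labels, n_classes):
    """Fit a linear meta-learner to one-hot targets by least squares (stand-in choice)."""
    targets = np.eye(n_classes)[labels]
    weights, *_ = np.linalg.lstsq(meta_x, targets, rcond=None)
    return weights


def predict_meta(meta_x, weights):
    """Predict the class with the highest meta-learner score."""
    return np.argmax(meta_x @ weights, axis=1)


labels = np.array([0, 1, 2, 1])
# Invented outputs of two base models for four paintings and three artists:
# one model is always right, the other is uninformative.
model_a = np.eye(3)[labels]
model_b = np.full((4, 3), 1.0 / 3.0)

meta_x = stack_features([model_a, model_b])
weights = fit_meta(meta_x, labels, n_classes=3)
predicted = predict_meta(meta_x, weights)
```

In a real pipeline the meta-learner should be fitted on base-model predictions for held-out data, so that it does not simply learn from base models that have memorized the training set.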
- …
