Jurnal Politeknik Negeri Batam (PoliBatam)
3001 research outputs found
SemetonBug: Next-Generation Machine Learning-Powered Code Analyzer for Precision Bug Detection and Dynamic Error Localization
Bug detection in Python programming is a crucial challenge in software development. This research proposes SemetonBug, a machine learning-based system for automatically detecting bugs in Python code. The system utilizes a Random Forest Classifier as the main model, with features extracted from the syntactic structure of the code using an Abstract Syntax Tree (AST). The dataset consists of 200 Python files, divided into 100 files with bugs and 100 files without bugs. The model is optimized using Grid Search Cross Validation, with the best combination of n_estimators = 300, max_depth = 20, min_samples_split = 5, and min_samples_leaf = 2. Evaluation results show that the model achieves 85% accuracy, 0.84 precision, 0.87 recall, and 0.86 F1-score. The detected bugs are stored in an Excel file for further analysis. By leveraging machine learning, SemetonBug enhances efficiency and accuracy in bug identification compared to traditional rule-based methods. These findings highlight the potential of machine learning models in improving software quality and reducing coding errors automatically
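The abstract states that features come from the syntactic structure of the code via an AST, but it does not say which features. As a minimal sketch under that gap, the snippet below counts AST node types with Python's standard `ast` module and maps them onto a fixed vocabulary. The function names and the choice of node counts as features are assumptions made for illustration, not the authors' method.

```python
import ast
from collections import Counter


def ast_node_counts(source: str) -> Counter:
    """Count how often each AST node type occurs in a piece of Python source."""
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


def to_feature_vector(counts: Counter, vocabulary: list) -> list:
    """Turn node counts into a fixed-length vector (hypothetical feature layout)."""
    return [counts.get(name, 0) for name in vocabulary]


if __name__ == "__main__":
    sample = "def f(x):\n    if x:\n        return 1\n    return 0\n"
    vocab = ["FunctionDef", "If", "Return", "While"]
    print(to_feature_vector(ast_node_counts(sample), vocab))
```

Vectors like these could then be passed to a Random Forest classifier. The hyperparameter names reported in the abstract (n_estimators, max_depth, min_samples_split, min_samples_leaf) match those of scikit-learn's `RandomForestClassifier`, although the abstract does not name the library.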
Explainable Transformer and Machine Learning Models in Predicting Tuberculosis Treatment Outcomes. A Systematic Review
Tuberculosis (TB) remains a major health challenge, and predicting treatment outcomes continues to be difficult in real-world settings. Recent advances in Artificial Intelligence (AI), particularly transformer-based models, have shown promise in modelling longitudinal, multimodal, and heterogeneous TB data. However, their clinical adoption is constrained by limited interpretability, fairness concerns, and deployment challenges. This study presents a systematic literature review of explainable transformer and machine learning models used for TB prognosis. Following PRISMA guidelines, searches across ACM, IEEE Xplore, PubMed, and ScienceDirect identified 17 peer-reviewed studies published between 2020 and 2025 that met the inclusion criteria. The review synthesises evidence on predictive performance, explainability techniques, and deployment considerations. Findings indicate that transformer-based and deep learning models generally outperform conventional machine learning approaches on longitudinal and multimodal data. In contrast, traditional models remain competitive for tabular clinical datasets. Explainability approaches are dominated by feature importance methods and SHAP, with limited use of intrinsic transformer interpretability mechanisms. Persistent challenges include data scarcity, limited generalisability, computational overhead, insufficient evaluation of fairness, and weak alignment with real-world TB care workflows. Building on these findings, the study proposes the Explainable Transformer Adoption Model for TB Prognosis (ETAMTB) as a conceptual clinical adoption framework integrating multimodal transformers, explainability layers, clinician-facing interfaces, and deployment enablers. 
Overall, the review concludes that effective AI adoption in TB care requires balancing predictive performance, interpretability, and equity, and that explainable transformers should currently be viewed as promising but largely experimental tools rather than deployment-ready solutions.
Leveraging Convolutional Neural Networks and Random Forests for Advanced Sentiment Classification of Social Media Responses on Public Services
In the digital era, social media has become a significant channel for citizens to express their opinions on government services. In Indonesia, particularly in the context of municipal issues, understanding public sentiment is essential to improving public service delivery. This study analyzes user comments from Facebook, Instagram, Twitter, and YouTube to capture public responses toward local government performance. Departing from previous studies that typically employ binary or three-level classifications, this research implements a five-category sentiment scheme: Very Good, Good, Fair, Poor, and Very Poor. A hybrid model combining a Convolutional Neural Network (CNN) for feature extraction and a Random Forest (RF) classifier is proposed to address this multi-class task. The model achieves 87% accuracy, outperforming the individual CNN and RF models. The results demonstrate the potential of social media–based sentiment analysis to enhance public service quality in Indonesia
Application of the Hybrid Entropy–VIKOR Method for Urban EV Charging Station Prioritization in Central Java
The rapid growth of electric vehicles (EVs) in Indonesia necessitates strategic and data-driven planning of public electric vehicle charging stations (EVCS/SPKLU), particularly in urban areas with high mobility and economic activity such as Central Java Province. This study aims to determine priority locations for EVCS development using an objective hybrid Multi-Criteria Decision Making (MCDM) approach. Official secondary data from the Central Java Provincial Statistics Agency (BPS) for the 2023-2024 period are employed, involving 12 urban areas as decision alternatives. Criteria weighting is performed using the Entropy method to minimize subjectivity, while alternative ranking is conducted using the VIKOR method to obtain the best compromise solution. Six criteria are considered, including installed electrical capacity, population density, motor vehicle density, gross regional domestic product (GRDP) per capita, percentage of regional area, and the number of commercial facilities. The results indicate that Cilacap Regency (Q = 0.000), Banyumas Regency (Purwokerto) (Q = 0.271), and Tegal Regency (Q = 0.492) are the highest-priority locations for EVCS development. Ranking validation using the Normalized Discounted Cumulative Gain (NDCG) yields a value of 0.963, indicating a very high level of agreement with the reference ranking, while the Spearman rank correlation coefficient of 0.832 reflects a strong positive consistency. 
The novelty of this study lies in integrating up-to-date regional statistical indicators with a fully objective Entropy-VIKOR framework complemented by ranking validation, providing a reliable data-driven decision-support tool for policymakers and investors in regional EVCS infrastructure planning.
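The Entropy weighting and VIKOR ranking steps described above can be sketched as follows. This is an illustrative implementation of the standard formulas, not the authors' code. It assumes that every criterion is a benefit criterion (larger is better), that all values are positive, and that the compromise parameter is v = 0.5. The abstract states none of these, and the toy matrix is invented.

```python
import math


def entropy_weights(matrix):
    """Objective criterion weights from Shannon entropy (rows = alternatives)."""
    m, n = len(matrix), len(matrix[0])
    k = 1.0 / math.log(m)
    diversities = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        entropy = -k * sum((x / total) * math.log(x / total) for x in column if x > 0)
        diversities.append(1.0 - entropy)
    norm = sum(diversities)
    return [d / norm for d in diversities]


def min_max_scale(values):
    """Rescale values to [0, 1]; a constant list maps to zeros."""
    lo, hi = min(values), max(values)
    return [(x - lo) / (hi - lo) if hi > lo else 0.0 for x in values]


def vikor_q(matrix, weights, v=0.5):
    """VIKOR Q index per alternative; lower Q means higher priority."""
    n = len(weights)
    best = [max(row[j] for row in matrix) for j in range(n)]   # assumes benefit criteria
    worst = [min(row[j] for row in matrix) for j in range(n)]
    s_values, r_values = [], []
    for row in matrix:
        terms = []
        for j in range(n):
            span = best[j] - worst[j]
            terms.append(weights[j] * (best[j] - row[j]) / span if span else 0.0)
        s_values.append(sum(terms))   # group utility S_i
        r_values.append(max(terms))   # individual regret R_i
    return [v * s + (1 - v) * r
            for s, r in zip(min_max_scale(s_values), min_max_scale(r_values))]


if __name__ == "__main__":
    toy = [[10.0, 8.0], [6.0, 5.0], [3.0, 2.0]]  # invented data, 3 alternatives x 2 criteria
    w = entropy_weights(toy)
    print(w, vikor_q(toy, w))
```

An alternative that is best on both the group utility S and the individual regret R receives Q = 0, which is consistent with the Q = 0.000 reported for the top-ranked location.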
Sentiment Analysis of President Prabowo's Performance on Twitter (X) with a Comparative Study of SVM, XGBoost, and AdaBoost
This study was conducted to understand how Twitter (X) users respond to President Prabowo's performance through machine learning-based sentiment analysis. Data was collected using a dataset crawling approach, then processed through a series of pre-processing stages such as cleansing, case folding, tokenisation, stopword removal, and stemming before being converted into a numerical representation with TF-IDF. The class imbalance problem was addressed by applying SMOTE so that the model could learn more evenly. Three classification algorithms, SVM, XGBoost, and AdaBoost, were tested with the help of GridSearchCV to obtain the best parameter configuration. The research evaluation showed that the XGBoost algorithm was able to provide the best performance with an accuracy of 0.8443, followed by the SVM algorithm with an RBF kernel, which achieved an accuracy of 0.8135. The AdaBoost algorithm came in third with an accuracy of 0.7868. These findings indicate that the boosting approach, especially XGBoost, is better able to handle complex language patterns and high-dimensional text data characteristics. Overall, this study provides an overview of public opinion trends on social media and can be used as a reference for the development of sentiment analysis models in future research.
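To illustrate the class-balancing step, the snippet below sketches the core idea of SMOTE: a synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. It is a simplified, standard-library illustration. The function name, the parameters `k` and `seed`, and the toy points are assumptions; the study presumably used an existing SMOTE implementation applied to TF-IDF vectors.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (q - b) for b, q in zip(base, neighbour)])
    return synthetic


if __name__ == "__main__":
    minority_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # invented toy data
    for point in smote_sketch(minority_points, n_new=3):
        print(point)
```

Because every synthetic point lies between two real minority samples, oversampling of this kind is normally applied only to the training split, so that the evaluation data remain untouched.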
Application of the 360-Degree Rotoscoping Technique in 2D Animation Using Toon Boom Harmony
In this digital era, modern animation has developed as a discipline that integrates elements of art and technology. As an art form, animation is based on fundamental principles that serve as the foundation of its knowledge, such as slow in–slow out, squash and stretch, anticipation, and staging. Technological advancements have further enriched these principles and led to the emergence of various animation techniques, one of which is rotoscoping, first introduced by Fleischer Studio in 1917. The rotoscoping technique allows animators to replicate human movements precisely by tracing over live-action footage, resulting in animation with a high level of smoothness and accuracy. As visual demands have grown, Motion Capture has emerged as a technique capable of recording movement from a full 360 degrees, although its use is limited by cost and equipment requirements. In the context of 2D animation, software such as Toon Boom Harmony provides a solution through features that support effective rotoscoping, including onion skinning, timeline flexibility, and vector layering. This study focuses on the application of 360-degree rotoscoping techniques in 2D animation, emphasizing perspective, lighting, and shadow to create the illusion of realistic depth. Based on the results of a questionnaire with 10 respondents, the 360-degree rotoscoping technique achieved an overall percentage score of 78.89%, categorized as good.
Implementation of the Random Forest Algorithm for Anomaly Detection of Phishing Attacks on Computer Networks
Phishing attacks are among the most common and dangerous cybersecurity threats, as they exploit manipulation techniques to steal sensitive user information. This research focuses on leveraging the Random Forest algorithm to identify anomalies caused by phishing attacks in computer network environments. Random Forest was selected for its strong classification performance and its ability to handle a wide variety of data types with minimal overfitting. The experimental dataset consists of captured network traffic containing both benign activities and malicious events labeled as phishing. The data underwent pre-processing and feature selection before a Random Forest model was trained. The experimental results show that the model achieved 98% accuracy, with 98% precision, 98% recall, and a 98% F1-score. This study also reveals that URL features such as the percentage of external links redirecting back to the original domain, frequent domain name mismatches, the number of hyphens (-) in the URL, and the presence of data submission via email are relevant and effective in distinguishing phishing from non-phishing URLs. These findings confirm that Random Forest can serve as an effective method for identifying phishing attacks based on URL characteristics.
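Two of the URL features named above, the number of hyphens and data submission via email, are simple enough to sketch with the standard library. The function name, the returned keys, and the idea of passing form action targets as a list are assumptions made for illustration; the abstract does not describe how the features were actually computed.

```python
from urllib.parse import urlparse


def simple_url_features(url, form_actions=()):
    """Compute a few illustrative phishing-related features (hypothetical layout)."""
    host = urlparse(url).hostname or ""
    return {
        "num_hyphens_url": url.count("-"),
        "num_hyphens_host": host.count("-"),
        # True if any form on the page submits its data to an email address
        "submits_to_email": any(a.lower().startswith("mailto:") for a in form_actions),
    }


if __name__ == "__main__":
    print(simple_url_features(
        "http://secure-login-example.com/a-b",
        form_actions=["mailto:collector@example.com"],
    ))
```

Features such as the share of external links or domain name mismatches need the page content and are omitted from this sketch.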
Comparative Analysis of Foot Sole Classification Models: Evaluating Logistic Regression, SVM, and Random Forest
Accurate classification of foot sole types can support applications in healthcare, sports, and biometrics, such as diagnosing high arch or flat foot conditions, improving the design of custom orthotics, and enhancing gait analysis to improve sports performance. When applied to large-scale datasets, traditional methods for foot sole classification are inefficient, as they are often manual, time-consuming, and prone to human error. Machine learning can significantly improve the accuracy and efficiency of this process through automation. The proposed method compares a Logistic Regression model with Support Vector Machines (SVM) and Random Forest using Orange Data Mining. The performance of these algorithms varies with the complexity of the data and the model parameters. Three foot types are considered in the image analysis: normal arch, flat foot, and high arch. The pre-trained models used are Inception V3, VGG-19, and SqueezeNet. The Logistic Regression model showed the best overall performance, with an AUC of 0.973, a Classification Accuracy (CA) of 0.933, and an MCC of 0.902, and demonstrated reliability and a balance between precision and recall.
Comparative Analysis of IndoBERT and Classic Machine Learning Models for Sentiment Classification of Education Policy on Social Media X
Leadership changes provide an opportunity for new education policies, generating complex public opinions on social media X that often contain implicit sentiments such as satire, which makes automated analysis challenging. This study addresses this challenge through a comparative analysis that evaluates the effectiveness of the IndoBERT model in capturing nuanced, implicit sentiments compared to traditional machine learning classifiers (SVM, Naïve Bayes, Logistic Regression, KNN, and Random Forest). The research used a dataset of Indonesian-language tweets collected via crawling. The data were pre-processed (cleaning, case folding, etc.) and labeled (positive/negative) using a hybrid Lexicon-LLM approach. TF-IDF was used for feature extraction for the machine learning models, while IndoBERT used its internal tokenization. Models were evaluated using accuracy, precision, recall, and F1-score. The results showed that the IndoBERT model performed best with an accuracy of 97%, outperforming the strongest traditional models, Random Forest and SVM, which each reached 95%. This study concludes that the IndoBERT model is a superior and more robust solution for analyzing nuanced public sentiment on educational policies, demonstrating a greater ability to understand complex context and implicit language compared to traditional TF-IDF-based methods.
Implementation of SSL-Vision Transformer (ViT) for Multi-Lung Disease Classification on X-Ray Images
Chest X-ray imaging is one of the most widely used modalities for lung disease screening; however, manual interpretation remains challenging due to overlapping pathological patterns and the frequent presence of multiple coexisting abnormalities. In recent years, Vision Transformer (ViT) models have demonstrated strong potential for medical image analysis by capturing global contextual relationships. Nevertheless, their performance is highly dependent on large-scale labeled datasets, which are costly and difficult to obtain in clinical settings. To address this limitation, this study proposes a Self-Supervised Learning Vision Transformer (SSL-ViT) framework for multi-label lung disease classification using the CheXpert-v1.0-small dataset. The proposed approach leverages self-supervised pretraining to learn robust and transferable visual representations from unlabeled chest X-ray images prior to supervised fine-tuning. A total of twelve clinically relevant thoracic disease labels are retained, while non-disease labels are excluded to enhance interpretability and reduce confounding effects. Experimental results demonstrate that SSL-ViT achieves a high recall of 0.73 and a peak AUC of 0.75 on the test set, indicating strong sensitivity in detecting pathological cases. Compared to the baseline ViT model, SSL-ViT exhibits a recall-oriented performance profile that is particularly suitable for screening applications, where minimizing false negatives is critical. Furthermore, Grad-CAM visualizations confirm that the model focuses on anatomically meaningful lung regions, supporting its clinical relevance. These findings suggest that SSL-enhanced Vision Transformers provide a robust and effective solution for multi-label chest X-ray screening tasks