International Journal of Advances in Data and Information System
    161 research outputs found

    Sentiment Analysis of Twitter Users Towards Kartu Prakerja Program Using the Naive Bayes Method

    This study analyzes the sentiment of Twitter users toward the Indonesian government’s Kartu Prakerja program, using the Naive Bayes method for classification. Launched in 2020 to enhance employability skills amid the COVID-19 pandemic, the program has drawn varied public responses. A total of 836 tweets containing the keyword "Kartu Prakerja" were collected using the Twitter API and analyzed to determine sentiment distribution. Results indicate a predominance of neutral sentiment (800 tweets), with only 17 positive and 22 negative tweets. The Naive Bayes method achieved an accuracy of 95%, demonstrating its effectiveness in sentiment classification. However, comparisons with other methods, such as Support Vector Machine (SVM) and Recurrent Neural Network (RNN), show that these techniques yield higher accuracy (98.34% and 96%, respectively). This research highlights the importance of sentiment analysis in understanding public perceptions and informs policymakers about areas needing improvement. The findings underscore the potential of integrating advanced machine learning techniques to enhance sentiment analysis and provide insights into the effectiveness of government programs like Kartu Prakerja.
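The classification step described in this abstract can be illustrated with a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing. The tokenization (lowercase, whitespace split), the function names, and the toy tweets are assumptions made for illustration; they are not the study's data or its exact pipeline.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(docs):
    """Count documents per class and words per class from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_counts[label] += 1
        for tok in text.lower().split():
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab

def predict(model, text):
    """Return the label with the highest log posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in text.lower().split():
            if tok not in vocab:
                continue  # ignore words never seen in training
            score += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy tweets, invented for this sketch only.
train = [
    ("program is helpful and great", "positive"),
    ("great training helpful", "positive"),
    ("registration failed bad system", "negative"),
    ("bad slow failed", "negative"),
    ("registration opens today", "neutral"),
    ("program opens registration today", "neutral"),
]
model = train_naive_bayes(train)
print(predict(model, "helpful great training"))
print(predict(model, "bad slow system"))
```

With a class distribution as skewed as the one reported (mostly neutral tweets), accuracy alone can look high even when minority classes are rarely predicted, so per-class precision and recall are worth checking alongside it.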

    Ensemble Stacking of Machine Learning Approach for Predicting Corrosion Inhibitor Performance of Pyridazine Compounds

    Corrosion is a major challenge affecting various industrial sectors, leading to increased operational costs and decreased equipment efficiency. The use of organic corrosion inhibitors is one of the promising solutions. This study applies an ensemble algorithm with a stacking method to estimate the corrosion inhibition efficiency of pyridazine-derived compounds. Various molecular characteristics of pyridazine compounds were used as inputs to predict inhibition efficiency values. After evaluating several boosting models, the stacking technique was chosen as it showed the best results. Stacking Model 6, which combines XGB, LGBM, and CatBoost as the base models with Random Forest as the meta-model, produced the most accurate prediction with an RMSE of 0.055. These findings indicate that machine learning approaches can effectively and efficiently predict corrosion inhibitor performance. This method offers a faster and more economical alternative to conventional experimental methods.

    Indonesian to Bengkulu Malay Statistical Machine Translation System

    Machine translation is an automatic tool that translates text from one language to another. This research focuses on developing Statistical Machine Translation (SMT) from Indonesian to Bengkulu Malay and evaluating the quality of the machine translation output. The training and testing data consist of parallel corpora taken from Bengkulu Malay dictionaries and online resources for Indonesian corpora, with a total of 5261 parallel sentence pairs. Several steps are performed in SMT. The initial step is preprocessing, aimed at preparing the parallel corpus. After that, a training phase is conducted, where the parallel corpus is processed to build language and translation models. Subsequently, a testing phase is carried out, followed by an evaluation phase. Based on the research results, various factors influence the quality of SMT translation output. The most important factor is the quantity and quality of the parallel corpus used as the foundation for developing translation and language models. The machine translation output is evaluated automatically using the Bilingual Evaluation Understudy (BLEU) metric. The scores obtained with 500, 1500, 2500, 4000, and 5261 sentences are 80.56%, 90.86%, 92.50%, 92.91%, and 94.48%, respectively.
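As a rough illustration of the evaluation metric named in this abstract, the sketch below computes a simplified sentence-level BLEU score: the geometric mean of clipped n-gram precisions (n = 1 to 4) multiplied by a brevity penalty. It assumes a single reference, whitespace tokenization, and no smoothing; the study's corpus-level setup and tooling are not stated in the abstract, so this is not a reproduction of its scores.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Simplified single-reference, unsmoothed sentence-level BLEU."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU 0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)

print(bleu("the cat sat on the mat today", "the cat sat on the mat"))
```

Scores from this function lie between 0 and 1; multiplying by 100 gives the percentage form used in the abstract.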

    Comparative Analysis of Cryptocurrency Prediction based on Deep Learning, Decision Tree, Gradient Boosted Tree, Random Tree, and k-NN Model

    Cryptocurrency is a digital or virtual currency that uses cryptography to secure transactions and control the creation of new units. Bitcoin, one of the most popular cryptocurrencies, offers various advantages such as security, transparency, and efficiency. The value of Bitcoin changes over time, like that of regular currencies, and predicting it can be just as important. The prediction can be made with multiple algorithms. The purpose of this research is to compare five algorithms in predicting Bitcoin value based on Root Mean Squared Error (RMSE) and R-squared (R2). The five algorithms compared can effectively model changes in the value of Bitcoin. Based on the experiment, Random Forest outperformed the other algorithms based on its RMSE and R2 results.
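The two comparison metrics can be written out as a short sketch. It takes R2 to be the coefficient of determination, which is an assumption about the abstract's wording, and the function names and sample values are illustrative only.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical prices, invented for this sketch only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(rmse(observed, predicted), r_squared(observed, predicted))
```

A lower RMSE and an R2 closer to 1 both indicate a better fit; a model that always predicts the mean of the observed values scores an R2 of 0.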

    Classification of Students' Academic Performance Using Neural Network and C4.5 Model

    Education involves deliberately creating an environment and learning process to empower students to fully utilize their academic and non-academic potential. It encompasses fostering spiritual qualities, religious understanding, self-discipline, cognitive abilities, and skills necessary for personal, societal, national, and state development. Madrasah Aliyah, in particular, emphasizes preparing participants for higher studies in areas of their interest, thereby showcasing their academic prowess. The evaluation of educational models like Neural Networks is crucial for ensuring their effectiveness in problem-solving. This involves testing and assessing the performance of the Neural Network model to ensure its accuracy and reliability. Similarly, the C4.5 method, based on condition data mining, is utilized to measure classification performance by assessing accuracy, precision, and recall. Research findings indicate that the neural network algorithm is more adept at accurately classifying students' academic abilities compared to the C4.5 algorithm. With an accuracy of 92.6% for the neural network algorithm and 80.6% for the C4.5 algorithm, the former classifies students' academic abilities more accurately. This highlights the suitability of the neural network approach for classifying academic abilities in Madrasah Aliyah. Furthermore, the insights gained from this classification process can be extrapolated to benefit other madrasas.

    Improvement The Accuracy of Convolutional Neural Network with Using Undersampling Method on Unbalanced Credit Card Dataset

    In this study, we address the challenge of imbalanced data in credit card fraud detection by proposing a novel approach that leverages Convolutional Neural Networks (CNNs) and undersampling techniques. The imbalance in the dataset, typical of real-world financial transactions, often leads to biased models favoring the majority class. To mitigate this, we employ undersampling to balance the classes, thereby enhancing the CNN's ability to learn from minority instances crucial for fraud detection. Our method is validated on a large unbalanced credit card dataset, demonstrating significant improvements in accuracy compared to traditional CNN models trained on imbalanced data. We evaluate our approach using standard performance metrics, including precision, recall, and F1-score, showcasing its effectiveness in accurately identifying fraudulent transactions while minimizing false positives. Furthermore, we provide insights into the CNN's decision-making process through visualization techniques, shedding light on its ability to discern fraudulent patterns within the data. Our findings highlight the importance of addressing class imbalance in fraud detection tasks and underscore the efficacy of undersampling in enhancing the performance of deep learning models, particularly CNNs, in handling imbalanced datasets.
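The class-balancing step can be sketched as follows. The abstract does not say which undersampling variant was used, so this example assumes plain random undersampling: every class is reduced to the size of the smallest one. The function name, the seed parameter, and the toy data are hypothetical.

```python
import random
from collections import defaultdict

def random_undersample(rows, labels, seed=0):
    """Randomly drop samples so every class matches the smallest class size."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    n_min = min(len(members) for members in by_class.values())
    balanced = []
    for label, members in by_class.items():
        # Sample without replacement from each class.
        for row in rng.sample(members, n_min):
            balanced.append((row, label))
    rng.shuffle(balanced)  # avoid leaving the classes grouped together
    out_rows = [row for row, _ in balanced]
    out_labels = [label for _, label in balanced]
    return out_rows, out_labels

# Hypothetical data: 95 legitimate (0) and 5 fraudulent (1) transactions.
rows = list(range(100))
labels = [1 if i < 5 else 0 for i in rows]
balanced_rows, balanced_labels = random_undersample(rows, labels)
print(len(balanced_rows), sum(balanced_labels))
```

Undersampling should be applied to the training split only; the test split keeps its original imbalance so that the reported metrics reflect realistic conditions.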

    Data-Driven Analytical Model Using Machine Learning Algorithms: A Case Study on Clean and Healthy Living Behaviour in Surabaya City's Coastal Areas

    The objective of this article is to use machine learning technology, specifically the Support Vector Machine (SVM) approach with a linear kernel, to analyze and predict clean and healthy living behavior (CHLB) in coastal dwellings in Surabaya City. To train the SVM model, the researchers collected health and environmental data from the region. The resulting model predicts household CHLB status with an 83% accuracy rate. The most important variables in this prediction are the amount of community access to appropriate sanitary facilities, the health of households, and the sustainability of public areas that meet health requirements. These findings have crucial implications for attempts to improve CHLB in Surabaya's coastal areas in compliance with the aims of the National Medium-Term Development Plan (RPJMN). Furthermore, the findings of this study can be used to build more targeted and long-term health policies in coastal communities.

    Integrated Multi-Income Stream Performance Dashboard: a Japanese Corporate Banking Case

    In response to the complex operational challenges faced by Japanese Corporate Banking (JCB), arising from the coexistence of disparate core banking systems post-merger, this study aims to address inherent issues affecting marketing performance monitoring. The existing condition at JCB is characterized by data inconsistency, limited system interoperability, and fragmented income tracking through multiple Excel reports and management systems. Recognizing the gaps in the current setup, the research question revolves around how to enhance marketing performance monitoring effectively. The research objectives, therefore, encompass the development and implementation of a tailored integrated report utilizing the CRISP-DM methodology. This innovative performance dashboard harmoniously consolidates data from diverse sources, presenting a cohesive representation crucial for comprehensive marketing performance assessment. Leveraging advanced methodologies like data normalization and cross-platform integration, the research approach ensures streamlined income tracking, mitigating existing limitations. The data, drawn from various product applications, undergoes meticulous processing to facilitate a unified view on the integrated dashboard. The anticipated result is a significant improvement in monitoring efficiency, heightened data accuracy, and an empowered decision-making process within JCB's operations. The business implication of this initiative is the tangible enhancement of the bank's ability to comprehensively assess income performance, thereby elevating the quality of strategic decision-making and reinforcing JCB's competitive positioning in the banking sector.

    Managing Inherent IT Business Risk against Cyber Threats: a Decision Analysis Case Study of an Oil and Gas Company

    XYZ, an anonymized oil and gas company, aims to enhance cyber resilience by strategically managing inherent risk profiles in cybersecurity, aligned with business needs and stakeholder expectations. This research addresses challenges including Information Security Control determination, proficiency improvement in risk management, and Information Security Management System (ISMS) preparedness. Additionally, it tackles procurement strategy for Security Operations Control across XYZ Group, operating under PSC Gross Split, Cost Recovery, and Non-PSC statuses. The study uses diverse frameworks such as problem tree analysis, stakeholders’ power-interest matrix, MITRE ATT&CK, NIST 800-53, COBIT 2019, ISO 27005:2022, KAMI 5.0, and SMART; the data analyzed include risk documents, interviews, and cyber-attack data. The research establishes effective IS Control for risk mitigation, readiness for ISMS implementation, strategic programs enhancing risk management capability, and refined Security Operations Control procurement. These outcomes, incorporated into a collaborative contract structure, significantly mitigate cyber threats and potential impacts, such as disruptions to operations, revenue reduction, increased costs, data theft, and non-compliance.

    Predicting Methanol Space-Time Yield from CO2 Hydrogenation Using Machine Learning: Statistical Evaluation of Penalized Regression Techniques

    This study investigates the effectiveness of machine learning techniques, specifically the penalized regression models Ridge Regression, Lasso Regression, and Elastic Net Regression, in predicting methanol space-time yield (STY) from CO2 hydrogenation data. Using a dataset derived from Cu-based catalyst research, the study implemented a comprehensive preprocessing approach, including data cleaning, imputation, outlier removal, and normalization. The models were rigorously evaluated through 10-fold cross-validation and tested on unseen data. Ridge Regression outperformed the other models, achieving the lowest Root Mean Squared Error (RMSE) of 0.7706, Mean Absolute Error (MAE) of 0.5627, and Mean Squared Error (MSE) of 0.5938. In comparison, Lasso and Elastic Net Regression models exhibited higher error metrics. Feature importance analysis revealed that Gas Hourly Space Velocity (GHSV) and Molar Masses of Support significantly influence catalytic activity. These findings suggest that Ridge Regression is a promising tool for accurately predicting methanol production, providing valuable insights for optimizing catalytic processes and advancing sustainable practices in chemical engineering.
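The evaluation procedure named in this abstract, ridge regression scored by k-fold cross-validation, can be sketched in a deliberately reduced form. The example below assumes a single input feature, an unpenalized intercept, and an interleaved fold assignment; the names `fit_ridge_1d`, `kfold_rmse`, and `alpha`, and the synthetic data, are hypothetical and do not reflect the study's dataset or implementation.

```python
import math

def fit_ridge_1d(xs, ys, alpha):
    """One-feature ridge fit: slope w = Sxy / (Sxx + alpha), intercept unpenalized."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    w = sxy / (sxx + alpha)
    return w, y_mean - w * x_mean

def kfold_rmse(xs, ys, alpha, k=10):
    """Pool squared errors over k held-out folds and return the overall RMSE."""
    n = len(xs)
    sq_errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th sample forms one fold
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        w, b = fit_ridge_1d(train_x, train_y, alpha)
        sq_errors += [(ys[i] - (w * xs[i] + b)) ** 2 for i in test_idx]
    mse = sum(sq_errors) / len(sq_errors)
    return math.sqrt(mse)

# Synthetic noiseless line y = 2x + 1, used only to exercise the code.
xs = [float(i) for i in range(50)]
ys = [2.0 * x + 1.0 for x in xs]
print(kfold_rmse(xs, ys, alpha=0.0), kfold_rmse(xs, ys, alpha=1000.0))
```

Because RMSE is the square root of MSE, the two figures reported in the abstract can be checked against each other: 0.7706 squared is approximately 0.5938.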

    158 full texts, 161 metadata records
    Updated in last 30 days.