International Journal of Advances in Data and Information System
    161 research outputs found

    Comparative Analysis of Artificial Neural Networks, Linear Regression, Random Forest, and Support Vector Machine for Predicting Poverty Levels in Indonesia

    Poverty remains a persistent and complex challenge in Indonesia, driven by multiple interrelated socioeconomic factors. Accurate poverty prediction is essential to support effective policy formulation and targeted interventions. This study evaluates and compares the performance of four machine learning models for predicting poverty levels in Indonesia: Artificial Neural Networks (ANN), Linear Regression, Random Forest, and Support Vector Machine (SVM). A quantitative approach is employed using provincial-level data from 2015 to 2023, consisting of 306 observations and 13 socioeconomic indicators related to education, employment, health, infrastructure, and economic conditions. Data preprocessing includes data cleaning, Min–Max normalization, and feature selection. Model performance is assessed using Mean Squared Error (MSE), Mean Absolute Error (MAE), and the coefficient of determination (R²). The results show that ANN achieves the best predictive performance, with the lowest MSE (0.0132) and MAE (0.0815), and the highest R² value (0.924). Random Forest and SVM demonstrate competitive performance, while Linear Regression yields the weakest accuracy. These findings confirm the effectiveness of ANN for poverty prediction and support its use in data-driven poverty reduction policies in Indonesia
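The preprocessing and evaluation steps named in the abstract above (Min–Max normalization, MSE, MAE, and R²) can be sketched in plain Python. This is a minimal illustration, not the authors' pipeline; the function names are assumptions chosen for this example.

```python
def min_max_normalize(values):
    """Rescale a list of numbers linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def mse(y_true, y_pred):
    """Mean Squared Error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean Absolute Error: average of absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Undefined (division by zero) when y_true is constant.
    """
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Because the targets are normalized to [0, 1], error values such as the reported MSE and MAE are expressed on that rescaled scale rather than in the original units.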

    Enhancing Medical Data Security Through Blockchain Smart Contract and Decentralized Application

    This research studies the implementation of decentralized applications (DApps) combined with blockchain technology and IPFS for storing patient medical data. The goal is to increase the security, transparency, and access control of stored medical data so that only legitimate users can access it. The proposed system uses smart contracts on the Ethereum network to manage the access rights of users (doctors, patients, and admins) and ensures data integrity through blockchain immutability. Patient medical records are stored in IPFS and referenced by their Content Identifier (CID). The implementation results show that the system can safely process medical information, keeping patients in full control of their information and restricting data access to scheduled times. The system also shows the potential of blockchain- and IPFS-based applications in achieving a more efficient health ecosystem focused on safeguarding people's data.

    KNN-MVO-SMOTE Algorithm for Air Quality Imbalanced Data Classification

    This research addresses air pollution, a pressing global issue influenced by geographic and temporal factors, using advanced machine-learning techniques to enhance air quality classification. By integrating the K-Nearest Neighbors (KNN) algorithm with the Synthetic Minority Over-sampling Technique (SMOTE) and Multi-Verse Optimization (MVO), we tackle challenges like data imbalance and parameter optimization. Our novel approach, which combines SMOTE and MVO within the KNN framework, has significantly increased classification accuracy to 97%, substantially improving over previous methods. The dataset includes diverse geographic and temporal data, with potential biases acknowledged and addressed. This study highlights the efficacy of merging MVO and SMOTE to optimize classification models, making a substantial contribution to environmental analysis and the fight against air pollution. Future research will explore AutoML technology to improve algorithmic optimization, offering more efficient and adaptive solutions. This pioneering effort emphasizes the critical role of technological innovation in tackling environmental challenges and marks a significant advancement in combating global air pollution
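The oversampling idea behind SMOTE, mentioned in the abstract above, is to create synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbors. The sketch below illustrates only that interpolation step in plain Python; it is not the authors' KNN-MVO-SMOTE method, and the names `nearest_neighbors` and `smote_like_oversample` are hypothetical.

```python
import math
import random


def nearest_neighbors(point, others, k):
    """Return the k points in `others` closest to `point` (Euclidean distance)."""
    return sorted(others, key=lambda o: math.dist(point, o))[:k]


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from a list of minority-class tuples.

    Each synthetic point lies on the line segment between a randomly chosen
    minority point and one of its k nearest minority neighbors.
    Requires at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbor = rng.choice(nearest_neighbors(base, others, k))
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        )
    return synthetic
```

Since every synthetic point is a convex combination of two existing minority points, the new samples stay inside the region already occupied by the minority class.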

    A Performance Enhancement Strategy for Sentiment Classification Models On Political Social Media Using Hyperparameter Tuning And Boosting

    This study aims to develop an optimized machine learning-based sentiment classification model for election-related issues. A dataset comprising 10,001 entries was collected from the social media platform X and manually labeled into three sentiment classes: positive, negative, and neutral. The preprocessing stage involved text cleaning, stemming, and feature transformation using the Term Frequency-Inverse Document Frequency (TF-IDF) method. To address class imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) was employed. Three baseline classification algorithms—K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Gaussian Naive Bayes (GNB)—were initially evaluated to establish a performance benchmark. Model development proceeded by applying hyperparameter optimization using the Optuna framework and further enhancing the models via boosting with Extreme Gradient Boosting (XGBoost). Experimental results revealed that the combination of SVM with Optuna and XGBoost achieved the best performance, reaching 97% accuracy, precision, recall, and F1-score across all classes. In contrast, the KNN and GNB models experienced a notable decline in performance following hyperparameter tuning, although partial recovery was observed when combined with boosting. These findings suggest that hyperparameter tuning and boosting are not universally effective across all classifiers, yet their synergistic application significantly enhances performance in SVM-based models. This study highlights the importance of model-specific optimization strategies in building robust sentiment analysis systems, particularly for handling unbalanced public opinion data in social media contexts
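The TF-IDF feature transformation named in the abstract above can be illustrated with a small plain-Python sketch. It uses the textbook weighting (term frequency times the log of inverse document frequency), which is an assumption: the abstract does not say which variant or library was used, and common libraries apply smoothed versions that give different numbers.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf  = occurrences of the term / number of tokens in the document
    idf = log(number of documents / number of documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors
```

A term that appears in every document gets a weight of zero under this definition, so only terms that distinguish documents contribute to the feature vector.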

    Enhancing GERD Disease Prediction using Extra Tree Classifier Tuned by Komodo Mlipir Algorithm

    Gastroesophageal reflux disease (GERD) is a prevalent gastrointestinal disorder characterized by the backward flow of gastric contents into the esophagus, often causing heartburn and regurgitation, with a global prevalence of approximately 13.98%. Early detection is essential to prevent severe complications such as esophagitis, esophageal strictures, and esophageal cancer. However, conventional diagnostic methods are often limited by inadequate healthcare resources and high cost, particularly in developing countries. Machine learning is a promising alternative for disease detection, improving accuracy through data pattern identification, and has already been used for detecting diseases such as breast cancer and diabetes. This study proposes an enhanced GERD prediction model that uses the Extra Tree classifier with hyperparameters optimized by the Komodo Mlipir Algorithm (KMA). The study uses a GERD dataset from the Harvard Dataverse consisting of 1200 rows and 69 features. The results show that the KMA-tuned Extra Tree classifier achieved an F1-score of 0.97, highlighting the effectiveness of KMA in enhancing model performance. Compared to the previous study, the proposed KMA-optimized Extra Tree model performed better, demonstrating the effectiveness of metaheuristic optimization in GERD prediction.

    The Importance of Literacy on Artificial Intelligence for the Higher Education Students: A Systematic Literature Review

    The rapid development of AI technology makes AI literacy crucial in providing individuals with an understanding of the essential functions of AI and its ethical application in higher education. This study used a scoping literature review method, searching the Scopus, Web of Science, Science Direct, and Sage Journals databases. The search results were screened against the eligibility criteria: the search returned 153 publications, of which eleven met the criteria and were reviewed in this study. The study shows that AI literacy is essential in higher education. Educators and higher education institutions are responsible for providing programs that support the development of AI literacy skills in students, which students need in order to deal with the development of AI technology. However, the lack of studies that evaluate the importance of AI literacy and its implications limits in-depth understanding of this topic.

    Comparison of Text Classification Techniques in Fake News Detection in the Digital Information Age

    This study compares text classification techniques for detecting fake news in the digital information age, focusing on two deep learning methods: Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN). The increasing spread of fake news through digital platforms emphasizes the importance of developing effective methods for identifying inaccurate information. A news dataset was collected from various sources, and both models were applied to it for text classification. Model performance was then measured using accuracy, precision, recall, and F1-score. The results showed that, although both models have their own advantages, CNN achieved better processing speed and classification accuracy than RNN. These findings provide important insights for the development of more efficient and effective fake news detection systems in the digital age.

    Machine Learning and Density Functional Theory Investigation of Corrosion Inhibition Capability of Ionic Liquid

    This study investigated the corrosion inhibition potential of ionic liquid compounds using a QSPR-based machine learning predictive model combined with DFT calculations. The Gradient Boosting (GB) model was identified as the most effective predictor, demonstrating excellent accuracy with a high R² value of 0.98. Additionally, the model exhibited low RMSE (0.95), MAE (0.84), and MAD (0.94) values. The predicted corrosion inhibition efficiencies (CIE) for three new ionic liquid compounds (IL1, IL2, and IL3) were 88.95, 90.82, and 93.16, respectively, which aligned well with experimental data. By integrating DFT simulations into the data updating process, facilitated by machine learning, the approach proved invaluable for identifying new corrosion inhibitors. This work highlighted the continuous refinement of data related to the corrosion inhibition effects of ionic liquid compounds

    Prediction of Rice Harvesting During the Rainy Season in Kabupaten Lamongan Using Stochastic Frontier Analysis

    The agricultural sector plays a critical role in ensuring national food security, yet it faces challenges in achieving technical efficiency due to limited land and input resources. This study aims to model and predict the technical efficiency of rice production in Lamongan Regency during the rainy season using a data science-driven Stochastic Frontier Analysis (SFA) approach. The dataset includes key inputs such as land area, labor, fertilizer, and environmental variables. The methodology involved data preprocessing, feature selection based on Pearson correlation and VIF thresholds, and model validation using metrics like R-squared, MAPE, and log-likelihood. The SFA model demonstrated high predictive capability, with R² values exceeding 0.91 in cross-validation and MAPE under 15%. The low gamma value (γ = 0.0100) indicates minimal yet consistent inefficiency. The results suggest that integrating SFA with data science techniques provides an effective framework for identifying inefficiencies and can serve as a decision-support system for evidence-based agricultural policy
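For readers unfamiliar with the gamma parameter in the abstract above, the following is a sketch of the standard stochastic production frontier and the variance-ratio definition of gamma. It assumes the study follows this common parameterization, which the abstract does not state; the symbols $y_i$, $x_i$, $\beta$, $v_i$, and $u_i$ are introduced here for illustration.

```latex
\begin{align}
  \ln y_i &= x_i^{\top}\beta + v_i - u_i, \qquad u_i \ge 0, \\
  \mathrm{TE}_i &= \exp(-u_i), \\
  \gamma &= \frac{\sigma_u^{2}}{\sigma_u^{2} + \sigma_v^{2}}, \qquad 0 \le \gamma \le 1.
\end{align}
```

Here $v_i$ is symmetric random noise and $u_i$ is the one-sided inefficiency term. Under this reading, a gamma near zero means that almost all deviation from the frontier is attributed to noise rather than to technical inefficiency.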

    Revisiting Cyber Threats in Government Sectors: A Systematic Review of Attacks, Challenges, and Policy-Level Defenses

    This paper presents a systematic literature review (SLR) based on the PRISMA framework, synthesizing 128 peer-reviewed studies published between 2020 and 2024, drawn from major scholarly databases. The review investigates cyber threats specifically targeting government institutions and identifies phishing, ransomware, malware, and denial-of-service (DoS) attacks as the most prevalent attack vectors affecting government sector environments. In addition to these threats, the study highlights persistent institutional limitations, such as outdated infrastructure, fragmented inter-agency coordination, limited technical capacity, and regulatory gaps, which hinder effective cybersecurity governance and response. To address these challenges, the review compiles both proactive and reactive mitigation strategies, emphasizing the need for SOC design principles such as scalability, interoperability, inter-agency coordination, and resilience in cyber operations. The paper synthesizes its findings into a taxonomy of threat profiles and contextual constraints, offering a foundational reference for building government-specific SOC models. It also outlines future research directions related to operational validation, capability maturity modeling, and institutional alignment in public-sector cybersecurity architectures.

    158 full texts, 161 metadata records. Updated in last 30 days.