Jurnal Matematika, Statistika dan Komputasi
    562 research outputs found

    Comparison of Basic Statistics and Machine Learning Classification Algorithms in Kalimantan Poverty Prediction with Handling Missing Data

    Full text link
    Poverty is a crucial development challenge in Indonesia, including in the regencies/cities of Kalimantan, which require more attention. Because poverty is influenced by many factors, this research compares the accuracy of basic statistical and machine learning models in predicting poverty rates and identifies the factors that affect them. The contribution of this research is a performance comparison combined with the handling of missing data. The three models considered are binary logistic regression with backward stepwise selection, random forest, and extremely randomized trees (extra trees). The data are secondary data from Statistics Indonesia (BPS) for the five provinces in Kalimantan; in pre-processing, missing values are imputed with k-nearest neighbors (KNN). The results show that binary logistic regression is the most accurate of the three models, with a balanced accuracy of 75%. Based on this best model, the study also finds significant predictors of the poverty rate of regencies/cities in Kalimantan: population density, average years of schooling, and per capita expenditure on food.
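The two technical ingredients named in this abstract, KNN imputation and balanced accuracy, can be sketched in a few lines. This is a minimal illustration only: the function names, the toy rows, and the confusion-matrix counts are hypothetical and are not taken from the study, and the distance rule (Euclidean over columns observed in both rows) is an assumption.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest rows.

    Distance is Euclidean over the columns observed in both rows (an assumption).
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(sum((a - b) ** 2 for a, b in shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda c: c[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical toy data: the last row is missing its second value.
rows = [[1.0, 10.0], [1.2, 12.0], [5.0, 50.0], [1.1, None]]
filled = knn_impute(rows, k=2)
print(filled[3])  # the gap is filled from the two closest rows

# Hypothetical confusion-matrix counts for an imbalanced two-class problem.
score = balanced_accuracy(tp=8, fn=2, tn=12, fp=8)
print(f"balanced accuracy = {score:.2f}")
```

Balanced accuracy averages the per-class recall, so it is less flattering than plain accuracy when one class dominates. In practice a library routine such as scikit-learn's `KNNImputer` would normally replace the hand-written imputer.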

    A Study of Noetherian and Artinian (R,S)-Modules

    No full text
    As science develops, the module has been generalized to a new algebraic structure called an (R,S)-module. Some module concepts can also be extended to the (R,S)-module structure. The concepts related to Noetherian and Artinian modules have not yet been developed for (R,S)-modules. This study therefore examines how the Noetherian and Artinian module concepts extend to (R,S)-modules. In this paper, we present the definition of a Noetherian (R,S)-module, the definition of an Artinian (R,S)-module, and the resulting properties in the (R,S)-module structure.

    Modeling the Probability of River Water Pollution Using a Geographically Weighted Logistic Regression Model (Case Study: River Water DO Data in East Kalimantan)

    Full text link
    Geographically Weighted Logistic Regression (GWLR) is a local binary logistic regression model applied to spatially heterogeneous (point-type spatial) data. In this study, the GWLR parameters are estimated by Maximum Likelihood Estimation (MLE) at each observation location using spatial weights. The spatial weights are calculated with the adaptive tricube function, which depends on the distance between observation locations and on the bandwidth; the optimal bandwidth is selected using the Akaike Information Criterion (AIC). The aim of this research is to identify the factors influencing the probability of river water pollution in East Kalimantan Province by fitting a GWLR model to 2022 Dissolved Oxygen (DO) data, and to interpret the best model. The data are secondary data provided by the Environmental Agency (Dinas Lingkungan Hidup, DLH) of East Kalimantan Province. The GWLR model is found to fit the data, based on a test comparing the GWLR model with the global model and on simultaneous parameter testing, with a McFadden R-squared of 61.1% and an AIC of 29.629. Partial parameter testing identifies nitrate concentration and water color degree as the local factors influencing the probability of river water pollution in East Kalimantan. Under the fitted model, an increase in either nitrate concentration or water color degree increases the probability of river water pollution.
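As an illustration of the adaptive tricube weighting mentioned above, the sketch below uses the common textbook form w = (1 - (d/h)^3)^3 for d < h and 0 otherwise, with the bandwidth h taken as the distance to the k-th nearest neighbour. The abstract does not give the exact formula used in the study, so the functional form, the function name, and the toy coordinates are all assumptions.

```python
import math


def adaptive_tricube_weights(points, i, k):
    """Weights of every location relative to location i.

    The bandwidth h is the distance from location i to its k-th nearest
    neighbour, so it widens where observations are sparse.
    """
    xi, yi = points[i]
    dists = [math.hypot(x - xi, y - yi) for x, y in points]
    h = sorted(dists)[k]  # index 0 is the location itself (distance 0)
    return [(1 - (d / h) ** 3) ** 3 if d < h else 0.0 for d in dists]


# Hypothetical monitoring locations on a line, for easy checking by hand.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (4.0, 0.0)]
weights = adaptive_tricube_weights(points, i=0, k=2)
print(weights)
```

With k = 2 the bandwidth is 2, so the location itself gets weight 1, the neighbour at distance 1 gets (1 - 1/8)^3, and locations at or beyond the bandwidth get weight 0. These weights would then enter the local likelihood maximized at each location.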

    Clustering and Portfolio Optimization on LQ45 Stocks with Fuzzy C-Means and Single Index Model

    Full text link
    When investing in stocks, spreading capital across several assets by forming a portfolio is generally recommended. An optimal portfolio can be selected using the Fuzzy C-Means and Single Index Model methods. This research used stock data from the LQ45 Index from January 27, 2020, to November 27, 2024, and formed three portfolios. Portfolio 1 consists of ADRO, ANTM, and PTBA, with an expected return of 0.001123, a risk of 0.000670, and a Sharpe index of 1.393. Portfolio 2 consists of BBNI, BMRI, and INCO, with an expected return of 0.000456, a risk of 0.000524, and a Sharpe index of 0.509. Portfolio 3 consists of BBCA and INKP, with an expected return of 0.000343, a risk of 0.000453, and a Sharpe index of 0.338. Portfolio 1 is recommended for investors who are very risk tolerant, portfolio 2 for investors who are slightly risk tolerant, and portfolio 3 for investors who are risk averse. Based on these results, it can be concluded that building portfolios by clustering with Fuzzy C-Means, followed by portfolio weighting based on the Single Index Model, produces optimal portfolios with a fairly high Sharpe index.
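The Sharpe index figures above can be related to the reported returns and risks with a short calculation. The sketch assumes the usual definition, excess return over a risk-free rate divided by risk. The abstract does not state the risk-free rate, so the value 0.00019 below is a hypothetical figure back-solved from the reported numbers, not a figure from the study.

```python
def sharpe_index(expected_return, risk, risk_free_rate):
    """Excess return per unit of risk."""
    return (expected_return - risk_free_rate) / risk


# Hypothetical: back-solved from the reported figures, not stated in the abstract.
ASSUMED_RISK_FREE = 0.00019

# (expected return, risk, Sharpe index) as reported in the abstract.
reported = {
    "Portfolio 1": (0.001123, 0.000670, 1.393),
    "Portfolio 2": (0.000456, 0.000524, 0.509),
    "Portfolio 3": (0.000343, 0.000453, 0.338),
}

computed = {}
for name, (ret, risk, sharpe) in reported.items():
    computed[name] = sharpe_index(ret, risk, ASSUMED_RISK_FREE)
    print(f"{name}: computed {computed[name]:.3f}, reported {sharpe}")
```

Under that assumed rate the computed values agree with the reported ones to within rounding of the inputs, which suggests, but does not confirm, that the study used this definition.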

    Fuzzy MEWMA Control Chart with Median Transformation for Manufacturing Multivariate Process Control

    Full text link
    The Multivariate Exponentially Weighted Moving Average (MEWMA) control chart was developed to detect small shifts in the mean vector and is robust. Conventional control charts have limitations in handling ambiguity in a process. The fuzzy MEWMA control chart is proposed to detect small shifts under uncertain conditions. When the fuzzy data distribution is asymmetric, the median transformation method is used. Quality control is crucial for the garment industry: clothing is designed to fit human body proportions, so ambiguity in the process and small measurement shifts can affect measurement accuracy. This study uses the fuzzy MEWMA control chart with median transformation for quality control in a multivariate manufacturing process, specifically in the garment industry. The purpose of this study is to determine the upper control limit (UCL), evaluate performance, and implement the fuzzy MEWMA control chart with median transformation. The findings show that the UCL at an alpha-cut level of 0.6 for three quality characteristics increases as the lambda value decreases. The performance evaluation indicates that, when small process shifts occur, lambda 0.05 and lambda 0.1 perform better than other lambda values. The production control results for uniform manufacturing at a garment company in Palu City show two observations above the UCL, which can serve as an early warning for the company.
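For reference, the conventional (crisp) MEWMA statistic on which the fuzzy chart builds can be written as follows. This is the standard textbook form, assuming observations X_t centred at the target mean with covariance matrix Sigma; the fuzzy representation and the median transformation used in the study are not reproduced here.

```latex
\begin{aligned}
Z_t &= \lambda X_t + (1-\lambda) Z_{t-1}, \qquad Z_0 = 0,\quad 0 < \lambda \le 1,\\
\Sigma_{Z_t} &= \frac{\lambda}{2-\lambda}\left[1-(1-\lambda)^{2t}\right]\Sigma,\\
T_t^2 &= Z_t^{\top}\,\Sigma_{Z_t}^{-1}\,Z_t, \qquad \text{signal when } T_t^2 > \mathrm{UCL}.
\end{aligned}
```

A smaller lambda puts more weight on past observations, which is why small lambda values are typically preferred for detecting small sustained shifts.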

    Estimating Reinsurance Premiums Using Pareto Conjugate Priors and Extreme Value Methods: Studies Case of Fire Insurance Claims in Denmark

    Full text link
    Loss distributions in insurance are typically right-skewed with heavy tails. As a result, modelling such distributions often involves heavy-tailed distributions, such as the Pareto family, Cauchy, Student-t, and mixture distributions. This study employs the Generalized Inverse Gaussian (GIG) distribution as a conjugate prior to the Pareto distribution. The GIG distribution is characterized by three parameters and includes the modified Bessel function of the third kind in its density, which makes parameter estimation using the likelihood method challenging. Therefore, a Bayesian estimation approach is adopted, using two prior distributions from the GIG family: the Inverse Gaussian and the Reciprocal Inverse Gaussian. The modelling is carried out within the framework of Extreme Value Theory (EVT), focusing on excess values over a specified threshold and the probability of claims exceeding that threshold. The results of this analysis can be used to derive a premium estimation formula that insurance companies can apply when reinsuring their claims with a reinsurance company.

    Multiresponse Nonparametric Regression Model with Mixed Estimator of Truncated Spline and Kernel for Poverty Indicators Analysis in Nusa Tenggara

    Full text link
    Regression analysis is a statistical method used to describe the relationship between response variables and predictor variables. It can be classified into parametric, nonparametric, and semiparametric regression, depending on whether the form of the regression curve is fully known, unknown, or partially known. This study aims to obtain a multiresponse nonparametric regression model with a mixed truncated spline and kernel estimator. The model is applied to poverty indicators in 32 districts/cities in Nusa Tenggara. The response variables are the Percentage of Poor People, the Poverty Depth Index, and the Poverty Severity Index, while the predictors are the Human Development Index, the Open Unemployment Rate, and Gross Regional Domestic Product (GRDP) per capita. The estimation method is Weighted Least Squares (WLS). The results show that the Human Development Index can be approximated by a truncated spline function, while the Open Unemployment Rate and GRDP per capita can be approximated by a kernel function. The multiresponse nonparametric regression model with a mixture of truncated spline and kernel estimators can therefore be used to model poverty indicators in Nusa Tenggara. The best model, selected by the smallest GCV value, has one knot point and two bandwidths and produces an R² value of 89.86%.
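The two building blocks of such a mixed estimator can be sketched separately: a truncated linear spline with one knot, and a kernel smoother. The abstract does not state the spline degree or the kernel estimator used, so the linear degree, the Nadaraya-Watson form with a Gaussian kernel, the function names, and all numbers below are assumptions chosen for illustration.

```python
import math


def truncated_linear_spline(x, coefs, knots):
    """f(x) = b0 + b1*x + sum_j c_j * max(x - knot_j, 0)."""
    b0, b1, *knot_coefs = coefs
    return b0 + b1 * x + sum(c * max(x - k, 0.0)
                             for c, k in zip(knot_coefs, knots))


def nadaraya_watson(x, xs, ys, bandwidth):
    """Gaussian-kernel weighted average of ys around x."""
    weights = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


# Hypothetical coefficients: slope 2 before the knot at 2.0, slope 0.5 after it.
before_knot = truncated_linear_spline(1.0, [1.0, 2.0, -1.5], [2.0])
after_knot = truncated_linear_spline(3.0, [1.0, 2.0, -1.5], [2.0])

# Hypothetical data: a point midway between two observations.
smoothed = nadaraya_watson(1.0, xs=[0.0, 2.0], ys=[1.0, 3.0], bandwidth=0.5)
print(before_knot, after_knot, smoothed)
```

The spline term changes slope at the knot, while the kernel term averages nearby responses with weights controlled by the bandwidth. In an additive mixed model, one predictor would enter through the first function and the others through the second.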

    Comparative Analysis of ARIMA and LSTM Methods for Sea Surface Temperature Forecasting in the Sunda Strait

    Full text link
    The Sunda Strait is an important area for Indonesia because it is a main domestic and international transportation route. As a body of water, the Sunda Strait has weather conditions that are greatly influenced by sea surface temperature (SST), so SST forecasting is crucial for supporting maritime transportation. This study compares the performance of the ARIMA and LSTM methods in forecasting SST in the Sunda Strait. The data are daily SST data for the Sunda Strait from August 20, 2022, to January 1, 2024. The best ARIMA model obtained is ARIMA(1,1,1), which has significant parameters overall, the smallest AIC and BIC values, and diagnostic results that meet the model assumptions. For the LSTM, the best hyperparameter combination is 150 neurons, 150 epochs, and a batch size of 32, which produces the lowest MSE value of 0.003799. The two methods are compared using the MAPE values of the training and test data. The LSTM method is superior to the ARIMA method, with a MAPE of 0.512% for training data and 0.564% for testing data. The LSTM forecasts show a similar pattern in both training and testing data, and the LSTM forecasts for the following 30 periods show a fluctuating pattern throughout the day.
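The MAPE criterion used to compare the two methods is simple to compute. The sketch below uses hypothetical actual and forecast values chosen for easy mental checking; they are not SST data from the study.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100 * sum(errors) / len(errors)


# Hypothetical values: relative errors of 10%, 5%, and 0%.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
result = mape(actual, forecast)
print(f"MAPE = {result:.2f}%")
```

Because each error is divided by the actual value, MAPE is scale free, which makes it convenient for comparing models on the same series. It is undefined when any actual value is zero.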

    Forecasting Rice Prices in Gorontalo Province Using Hybrid Singular Spectrum Analysis (SSA) and Triple Exponential Smoothing Methods (TES)

    Full text link
    Gorontalo Province is currently experiencing unstable rice prices from the government, which makes it difficult for people to meet their food needs, especially rice. Several factors can influence rice price instability: high demand from other regions, the area of harvested land, and weather conditions such as drought, floods, and the spread of pests that can destroy rice plants. These factors can cause the price of rice to rise and fall from month to month, so this study forecasts future rice prices. The forecasting method is a hybrid of Singular Spectrum Analysis (SSA) and Triple Exponential Smoothing (TES), and forecasting accuracy is assessed with the Mean Absolute Percentage Error (MAPE). The hybrid SSA-TES forecast obtained a MAPE of 0.04352537, or 4.35%. The hybrid method is considered to perform well when its MAPE is below 10%, a criterion that this result meets.

    D-Optimal Design with Split Plot Approach for Quadratic Mixture-Amount Experiments (Case Study of Three Components with Composition Constraints)

    Full text link
    Mixture-amount experiments (MAE) are influenced by both the proportions of the components and the total amount. A traditional MAE runs a classical mixture experiment at each total amount, which complicates complete randomization; therefore, a split-plot design is suggested. In this design, the whole-plot factor is the total mixture amount, whereas the subplot factor is the composition of the materials. Another issue is that the number of experimental runs grows as the number of materials and the number of total amounts increase. This study proposes a split-plot approach that uses a point-exchange algorithm based on the D-optimality criterion to generate an efficient design. The model is a quadratic mixture-amount model, which can capture the linear, quadratic, and interaction effects between material proportions and total mixture amounts. The case study involves three components with proportion constraints and three levels of the mixture amount: high, medium, and low. The results show that the algorithm generates optimal points generally located at the edge of the design region, and that increasing the number of experimental units improves the stability of designs involving total mixture amounts.
