FORUM STATISTIKA DAN KOMPUTASI
    119 research outputs found

    LAD-LASSO: SIMULATION STUDY OF ROBUST REGRESSION IN HIGH DIMENSIONAL DATA

    A common issue in regression is that the number of predictor variables often exceeds the number of observations (p > n), a setting known as high-dimensional data. The classical problem in this setting is multicollinearity. It becomes worse when the data are subject to heavy-tailed errors or outliers, which may appear in the responses and/or the predictors. For this reason, Wang et al. (2007) combined Least Absolute Deviation (LAD) regression, which is useful for robust regression, with LASSO, a popular choice for shrinkage estimation and variable selection, to form LAD-LASSO. Extensive simulation studies demonstrate that LAD-LASSO performs satisfactorily, and better than LASSO, on high-dimensional datasets that contain outliers.
    Keywords: high dimensional data, LAD-LASSO, robust regression
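To make the robustness argument concrete, here is a minimal sketch contrasting an absolute-error (LAD) loss with a squared-error loss under the same L1 penalty. It is an illustration only, not the simulation design of the paper: the data, the coefficient vector `beta` and the single penalty weight `lam` are made-up assumptions, and the abstract does not state how the penalty is weighted.

```python
# Toy comparison of an absolute-error (LAD) loss and a squared-error loss,
# each with the same L1 (lasso) penalty. All data below are made up.

def residuals(X, y, beta):
    """Residuals y_i - x_i . beta for each observation."""
    return [yi - sum(b * xij for b, xij in zip(beta, xi)) for xi, yi in zip(X, y)]

def lad_lasso_objective(X, y, beta, lam):
    """Sum of absolute residuals plus lam * sum of |beta_j|."""
    return sum(abs(r) for r in residuals(X, y, beta)) + lam * sum(abs(b) for b in beta)

def lasso_objective(X, y, beta, lam):
    """Sum of squared residuals plus lam * sum of |beta_j|."""
    return sum(r * r for r in residuals(X, y, beta)) + lam * sum(abs(b) for b in beta)

X = [[1.0, 0.0], [2.0, 1.0], [3.0, 0.0], [4.0, 1.0]]
beta = [2.0, 0.0]                    # hypothetical coefficient vector
y_clean = [2.0, 4.0, 6.0, 8.0]       # fitted exactly by beta
y_outlier = [2.0, 4.0, 6.0, 58.0]    # last response shifted by 50

print(lad_lasso_objective(X, y_clean, beta, 1.0))    # 2.0
print(lad_lasso_objective(X, y_outlier, beta, 1.0))  # 52.0
print(lasso_objective(X, y_outlier, beta, 1.0))      # 2502.0
```

A single outlier of size 50 adds 50 to the absolute-error objective but 2500 to the squared-error one, which is why a squared-error fit is pulled much harder toward outlying responses.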

    MODELING OF DENGUE HEMORRHAGIC FEVER IN BOGOR USING BAYESIAN SUR-SAR

    The purposes of this research are (1) to develop a Seemingly Unrelated Regression (SUR) system built from correlated Spatial Autoregressive (SAR) models with a Bayesian approach, for dynamic analysis of the spatial and non-spatial contributions to Dengue Hemorrhagic Fever (DHF) cases in Bogor, and (2) to evaluate the efficiency of parameter estimation with the SUR system. A Markov Chain Monte Carlo (MCMC) sampling scheme was used to estimate all model parameters, with a number of iterations for which the burn-in period was identified. The results indicated that the pattern of DHF spread in Bogor was similar throughout 2009-2011, that nearby areas played a significant role in the incidence of DHF in a given area of the city, and that the non-spatial contributions to DHF cases included in the model were dynamic over 2009-2011. A gain in the efficiency of parameter estimation for the yearly SAR models of DHF in Bogor during 2009-2011 can be obtained by combining all the SAR models in a SUR system.

    GEOGRAPHICALLY WEIGHTED REGRESSION (GWR) INCLUDED THE DATA CONTAINING MULTICOLLINEARITY

    One source of spatial effects at each location is spatial heterogeneity. Besides spatial heterogeneity, the number of independent variables (X) can cause local multicollinearity, in which one or more independent variables are correlated with other variables at each observation location. A method that can address both spatial heterogeneity and local multicollinearity in the Geographically Weighted Regression (GWR) model is Geographically Weighted Principal Components Analysis (GWPCA). This research aims to examine the feasibility of the GWPCAR model for 2010 PDRB (gross regional domestic product) data from 113 districts/cities in Java. The analysis indicates that the GWPCA method can overcome the local multicollinearity problem, as shown by variance inflation factor (VIF) values smaller than 10.
    Keywords: Local Multicollinearity, Geographically Weighted Principal Components Analysis
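The VIF threshold of 10 can be illustrated with a small sketch. It computes an ordinary (global, unweighted) VIF for the two-predictor case, where VIF = 1 / (1 - r^2) and r is the Pearson correlation between the predictors. This is an assumption-laden simplification: the paper works with local, geographically weighted quantities, and the variables below are made up.

```python
# Global VIF for the special case of exactly two predictors.
# With two predictors, R^2 from regressing one on the other equals r^2.

def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def vif_two_predictors(x1, x2):
    """VIF = 1 / (1 - r^2) for a model with only x1 and x2."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]        # moderately correlated with x1 (made up)
x3 = [1.1, 2.0, 2.9, 4.1, 5.0]        # almost a copy of x1 (made up)

print(vif_two_predictors(x1, x2) < 10)   # True: below the usual threshold
print(vif_two_predictors(x1, x3) > 10)   # True: severe collinearity
```

Replacing correlated predictors with principal components, as GWPCA does locally, yields uncorrelated inputs, which is what drives the VIF values down.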

    TEMPERATURE CHANGES IN CLOUD FOREST OF KHAO NAN NATIONAL PARK, SOUTHERN THAILAND DURING 2000 - 2015

    Khao Nan National Park (KNNP) is part of the Nakhon Si Thammarat mountain range and is one of the cloud forest areas of southern Thailand. A cloud forest is characterized by abundant flora, especially epiphytes, and by the presence of clouds even in the dry season. The aim of this study was to investigate the temperature pattern and variation at Khao Nan. We downloaded land surface temperature (LST) data recorded by the MODIS Earth satellites every eight days from 2000 to 2015, in square-kilometer grid boxes covering Khao Nan National Park, to investigate the time series of temperature variation. Cubic spline modeling was used to fit the pattern of daytime LST from the satellite images, and generalized estimating equations (GEE) were used for parameter estimation. The results showed that temperature had a similar pattern and variation across Khao Nan National Park during 2000-2015. The GEE analysis led to the conclusion that the temperature changed during 2000-2005, 2006-2009 and 2010-2015.
    Keywords: Temperature changes, cloud forest, Khao Nan National Park

    COMPARISON OF LOW BIRTH WEIGHT RATE ESTIMATES BASED ON DIFFERENT AGGREGATE LEVELS DATA USING LOGISTIC REGRESSION MODEL

    Low birth weight (LBW) is defined as a live-born infant's birth weight of less than 2,500 grams, regardless of gestational age. LBW is associated with infant mortality, infant morbidity, inhibited growth, slow cognitive development and chronic diseases in later life. This matters because, with a high LBW rate, a generation can hardly grow to its full potential. Many risk factors, direct or indirect, can put a birth at high risk of LBW: genetics, obstetrics, nutritional intake, diseases, toxic exposures, pregnancy care and social factors. With these factors measured, statistical modelling can be used to estimate the rate of LBW at the group level or its probability at the individual level. Because the response is binary, the logistic regression model is commonly used. Data on LBW cases and risk factors came from the 2012 Indonesian Demographic and Health Survey (IDHS). The published national LBW rate was 7.3%, with provincial rates between 4.7% and 15.7%. Although the national rate was considered low, the wide variation in provincial rates showed that the problem was not being handled well. These rates cannot, however, be measured yearly because the survey runs on a five-year cycle. When risk factor data are available, a model can be built to estimate the LBW rates. A further problem arises when only aggregate-level data are available instead of individual-level data. The purpose of this study was therefore to compare models based on different aggregation levels and their estimated provincial rates. The comparison covered the individual birth level, mother level, household level and census block (cluster) level. Models from the first three levels were quite similar, with an adequate number of significant parameters, while the cluster-level model yielded only a few significant parameters. Nevertheless, LBW rate estimates from the cluster-level model were the closest to the direct estimates, although the variance of these estimates was still higher than that of the other models.
    Keywords: Low Birth-Weight, IDHS, Logistic Regression, GLM, Aggregate Data
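The link between individual-level probabilities and a group-level rate can be sketched as follows. This is a minimal illustration, assuming an already fitted logistic model; the intercept, coefficients and covariate rows are hypothetical and do not come from the IDHS data or the paper.

```python
import math

def logistic(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def predicted_probability(intercept, coefs, x):
    """Individual-level probability of the event for covariate vector x."""
    return logistic(intercept + sum(c * v for c, v in zip(coefs, x)))

def group_rate(intercept, coefs, rows):
    """Group-level rate estimated as the mean of individual probabilities."""
    probs = [predicted_probability(intercept, coefs, x) for x in rows]
    return sum(probs) / len(probs)

# Hypothetical fitted model with one standardized risk factor.
intercept, coefs = -2.5, [0.8]
province_rows = [[-1.0], [0.0], [0.5], [2.0]]   # made-up births in one province

print(group_rate(intercept, coefs, province_rows))
```

Averaging predicted probabilities over the births in a province gives a model-based rate that can be set beside the direct survey estimate, which is the kind of comparison the abstract describes.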

    EBLUP METHOD OF TIME SERIES AND CROSS-SECTION DATA FOR ESTIMATING EDUCATION INDEX IN DISTRICT PURWAKARTA

    Since decentralisation was implemented in Indonesia, more detailed information about the condition of each area has become necessary for evaluating the development carried out by the government. The success of a region's development can be seen through the Human Development Index (HDI). The HDI consists of three basic dimensions; one of them, knowledge, is measured by the education index, which is in turn based on the Adult Literacy Rate and Mean Years of Schooling. Education is one of the important factors in improving human development, and an increase in the education index raises the HDI of an area. Purwakarta's vision is to become a district that excels in education in West Java, but its education index is still below that of West Java province. One step that can be taken is to obtain information on the education index of each district in Purwakarta, with the aim of setting the right policy in each region. Direct estimation of the components of the HDI for districts is not feasible because the estimates have a large variance, owing to the small sample size. This study proposes performing the estimation with small area estimation, which uses information from surrounding areas to increase the effective sample size and lower the standard error. Some surveys are conducted regularly every year; for indirect estimation in such surveys, the efficiency of estimating the education index at district level can be improved by including a random area effect as well as a random time effect (Sadik and Notodipuro, 2006). This study therefore uses Empirical Best Linear Unbiased Prediction (EBLUP), combining time series and cross-section data, to estimate the education index at the district level in Purwakarta. Direct estimation of the education index produces a larger variance than our method: comparing the mean square error (MSE) of the direct and indirect methods shows that the direct method has the largest MSE.
    Keywords: Indirect Estimation, Small Area Estimator, EBLUP, Time Series and Cross-Section, HDI, Education Index

    SURVIVAL ANALYSIS WITH EXTENDED COX MODEL ABOUT DURABILITY DEBTOR EFFORTS ON CREDIT RISK

    Survival analysis was applied to motorcycle credit financing data on loans that turned bad soon after the credit began, with sixteen covariates considered. The model commonly used in survival analysis is the Cox proportional hazards model, which relies on the proportional hazards assumption. The extended Cox model was selected to improve on the Cox proportional hazards model when one or more covariates did not meet that assumption. The extended Cox model is an extension of the Cox model that involves time-dependent variables. Covariates that do not meet the proportional hazards assumption are interacted with an appropriate function of time to obtain time-dependent covariates, so the model contains both time-independent and time-dependent covariates. The parameters of these covariates are estimated using the partial maximum likelihood method. A likelihood ratio test was used to determine whether the extended Cox model is a suitable model for the data in a particular case. The results indicate that the extended Cox model with an appropriate function of time provides the best model.
    Keywords: Credit Risk, Survival Analysis, Cox Proportional Hazard, Extended Cox Model
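The effect of interacting a covariate with a function of time can be sketched in a few lines. This is only an illustration: the abstract does not say which time function was chosen, so g(t) = log(t) and the coefficient values below are assumptions.

```python
import math

def hazard_ratio(beta, delta, t, g=math.log):
    """Hazard ratio at time t for a one-unit increase in a covariate x
    whose effect is beta * x + delta * x * g(t).
    delta = 0 recovers the ordinary proportional hazards case."""
    return math.exp(beta + delta * g(t))

# Hypothetical coefficients, not estimates from the paper.
beta, delta = 0.5, -0.2

for month in (1.0, 6.0, 12.0):
    print(month, hazard_ratio(beta, delta, month))
```

When `delta` is zero the ratio is the same at every time point, which is exactly the proportional hazards assumption; a nonzero `delta` lets the ratio drift with time, which is what the extended model allows.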

    CLUSTERING PROVINCE IN INDONESIA BY COMMUNICATION TECHNOLOGY RELATED VARIABLES

    Technology in Indonesia is developing rapidly, and almost all systems used in daily life now rely on it. One such technology is communication technology, an important tool for sending information and for making communication easier and faster. It is therefore important to study the current condition of communication technology in Indonesia. Communication technology is also one of the government's focuses in national development. However, it is not easy to assess the state of communication technology in Indonesia, because the country covers a large area with varied geography. The purpose of this research was to determine the grouping of provinces in Indonesia, in order to strengthen the communication sector in support of national development. The method used is hierarchical cluster analysis, with the Cubic Clustering Criterion (CCC) used to determine the best method and the optimal number of clusters. The data are secondary data from Statistics Indonesia (BPS) and the Ministry of Communication and Information. The results showed that the communication technology variables yield 3 clusters: the 1st cluster consists of 21 provinces, the 2nd of 7 provinces and the 3rd of 3 provinces.
    Keywords: Communications Technology, Cluster Analysis, Hierarchical Method, Cubic Clustering Criterion (CCC)

    AUTOREGRESSIVE MOVING AVERAGE (ARMA) MODEL FOR DETECTING SPATIAL DEPENDENCE IN INDONESIAN INFANT MORTALITY DATA

    Infant mortality is an important indicator that must be monitored seriously. It is associated with several determinants, such as the infant's characteristics, maternal and fertility factors, housing conditions, geographical area and policy. It can also be influenced by spatial dependence between regencies in Indonesia, because social and economic activity in one regency depends on social and economic activity in other regencies, especially neighboring ones. The infant mortality data were obtained from the Indonesian Demographic and Health Survey (IDHS) published by Statistics Indonesia (BPS). In BPS publications, data are always sorted by regency code from smallest to largest, so closeness of regency codes reflects closeness of the regencies themselves. The infant mortality data by regency can therefore be treated as analogous to time series data, and the relationship between regencies can be examined using an Autoregressive Moving Average (ARMA) model. If the ARMA parameters are significant, we can conclude that there is spatial dependence in infant mortality in Indonesia. This paper focuses on whether spatial dependence exists in Indonesia's infant mortality data using the ARMA approach. The Autocorrelation Function (ACF) showed a significant effect up to lag 3, and the Partial Autocorrelation Function (PACF) showed a significant effect up to lag 1. Based on the Bayesian Information Criterion (BIC), the AR(1) model fitted the data well. This shows that the probability of infant mortality in one regency is affected by the probability of infant mortality in neighboring regencies.
    Keywords: ARMA, spatial dependence, infant mortality, IDHS
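The idea of reading a code-ordered sequence as a time series can be sketched with a plain sample autocorrelation function. The values below are made up for illustration, and the ±1.96/sqrt(n) band is the usual large-sample approximation for white noise, not necessarily the test used in the paper.

```python
def sample_acf(series, lag):
    """Sample autocorrelation of a sequence at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom

# Hypothetical rates, listed in order of regency code (made-up numbers).
rates = [30.0, 32.0, 31.0, 35.0, 36.0, 34.0, 38.0, 40.0]

bound = 1.96 / len(rates) ** 0.5   # approximate 95% band under white noise
for lag in (1, 2, 3):
    r = sample_acf(rates, lag)
    print(lag, r, abs(r) > bound)
```

For an AR(1) model, the lag-1 sample autocorrelation is also the Yule-Walker estimate of the autoregressive coefficient, so a large value at lag 1 corresponds to neighboring codes having similar rates.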

    A SIMULATION STUDY OF LOGARITHMIC TRANSFORMATION MODEL IN SPATIAL EMPIRICAL BEST LINEAR UNBIASED PREDICTION (SEBLUP) METHOD OF SMALL AREA ESTIMATION

    Many studies have sought to improve the quality of estimates in small area estimation (SAE). The standard method, known as EBLUP (Empirical Best Linear Unbiased Prediction), has been extended by incorporating spatial effects into the model. This modification is known as SEBLUP (Spatial EBLUP), since it incorporates the spatial correlations that exist among the small areas. The observed data (the variables of concern) usually have a large variance and tend to have a nonsymmetric distribution, and therefore tend to show a nonlinear relationship between the concomitant variables and the variables of concern. The results showed that the SEBLUP method with a logarithmic transformation produces a better estimator than the other methods.
    Keywords: EBLUP, SAE, SEBLUP

    89 full texts, 119 metadata records. Updated in last 30 days.