FORUM STATISTIKA DAN KOMPUTASI
MODEL AVERAGING, AN ALTERNATIVE APPROACH TO MODEL SELECTION IN HIGH DIMENSIONAL DATA ESTIMATION
Model averaging is an alternative to classical model selection in model estimation. Model selection procedures, such as forward or stepwise regression, use criteria such as AIC and BIC to choose the single model that best fits the data. Model averaging, by contrast, estimates one model whose parameters are weighted averages of the parameters of the approximating models. Instead of basing inference and prediction on one chosen model, model averaging addresses model uncertainty by including all candidate models in the prediction model. This paper describes some of its developments, applications, and challenges, with emphasis on frequentist model averaging. Keywords: model selection, frequentist model averaging, high dimensional data
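The weighted averaging described in this abstract can be sketched as follows. The abstract does not say which weights the paper uses; this sketch assumes smoothed-AIC weights, one common frequentist choice, and the AIC values and coefficient vectors are hypothetical numbers chosen only for illustration.

```python
import math

def akaike_weights(aic_values):
    """Turn AIC values into normalized weights (assumed weighting scheme)."""
    best = min(aic_values)
    # Subtracting the smallest AIC keeps the exponentials numerically stable.
    raw = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(raw)
    return [r / total for r in raw]

def average_coefficients(coefs, weights):
    """Weighted average of per-model coefficient vectors.

    A coefficient that a candidate model omits is entered as 0.0.
    """
    n_params = len(coefs[0])
    return [sum(w * c[j] for w, c in zip(weights, coefs)) for j in range(n_params)]

# Hypothetical candidate models: AIC values and fitted coefficients.
aics = [100.0, 102.0, 110.0]
coefs = [
    [1.0, 0.5, 0.0],  # model 1 omits the third predictor
    [1.2, 0.0, 0.3],  # model 2 omits the second predictor
    [0.9, 0.4, 0.2],  # full model
]

weights = akaike_weights(aics)
averaged = average_coefficients(coefs, weights)
print(weights)
print(averaged)
```

The model with the lowest AIC receives the largest weight, but the other candidates still contribute, which is the contrast with selecting a single best model.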
ESTIMATION OF UNEMPLOYMENT RATE USING SMALL AREA ESTIMATION MODEL BASED ON A ROTATING PANEL NATIONAL LABOR FORCE SURVEY
In Indonesia, labor force participation data are collected through Sakernas (the National Labor Force Survey), which is conducted as a quarterly rotating panel survey. Because the rotation groups differ in their time-in-panel and observation strategy, bias may be present. In addition, the sample size is insufficient to give adequate precision for direct estimates at the district level. It is therefore necessary to study how to estimate parameters from a rotating panel survey when the sample size is insufficient. Existing small area estimation (SAE) models that accommodate the bias component due to rotation assume only that the effect over time follows a random walk process, so a more general model is needed. We propose an SAE model at the rotation group level that combines ideas from the time-series multi-level model and the Rao-Yu model. The model will be applied to Sakernas data to estimate the quarterly unemployment rate at the district level. Key words: Sakernas, rotating panel survey, time-series multi-level model, Rao-Yu model
RANDOM PARAMETER MODELS OF FERTILIZER RESPONSE FOR CORN USING SKEWED DISTRIBUTIONS
Random parameter models have been found to determine the optimum fertilizer dose better than fixed parameter models. A major restriction, however, is the normality assumption. This study introduces random parameter models of fertilizer response using skewed distributions from a Bayesian perspective. The method is applied to data from multilocation trials of potassium fertilization on corn. We compare the Linear Plateau, Spillman-Mitscherlich, and Quadratic random parameter models under different random error distributions (normal, skew-normal, Student-t, and skew-t) using the Deviance Information Criterion (DIC). The results show that the normal linear plateau model has the smallest DIC among the models compared. The correlation between observed and fitted values was significant. Key words: fertilizer response model, mixed effects, skewed distributions, DIC
SMALL AREA ESTIMATION FOR ESTIMATING THE NUMBER OF INFANT MORTALITY USING MIXED EFFECTS ZERO INFLATED POISSON MODEL
Demographic and Health Survey Indonesia (DHSI) is a national survey designed to provide information on birth rates, mortality rates, family planning, and health. DHSI was conducted by BPS in cooperation with the National Population and Family Planning Institution (BKKBN), the Indonesian Ministry of Health (KEMENKES), and USAID. According to the DHSI 2012 publication, the infant mortality rate for the five years preceding the survey was 32 per 1000 live births. In this paper, Small Area Estimation (SAE) is used to estimate the number of infant deaths in the districts of West Java. SAE is a special case of Generalized Linear Mixed Models (GLMM). Here the incidence of infant mortality is modeled with a Poisson distribution, which assumes equidispersion. Overdispersion is handled with the negative binomial and quasi-likelihood models. Based on the analysis, the quasi-likelihood model is the best model for overcoming the overdispersion problem. However, a check of the residual assumptions showed that the model residuals still formed two normal distributions. To resolve this, a Mixed Effect Zero Inflated Poisson (ZIP) model was used. The small area estimation is based on the basic area level model. The mean square error (MSE), obtained by a bootstrap method, is used to measure the accuracy of the small area estimates. Keywords: SAE, GLMM, Mixed Effect ZIP Model, Bootstrap
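LASSO : SOLUSI ALTERNATIF SELEKSI PEUBAH DAN PENYUSUTAN KOEFISIEN MODEL REGRESI LINIER (LASSO: An Alternative Solution for Variable Selection and Coefficient Shrinkage in Linear Regression Models)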
A method known as LASSO has recently been developed for variable selection and coefficient shrinkage in linear regression. It offers an alternative for data with highly correlated independent variables, where least squares produces high-variance estimates. In simulation, the method is not better than forward selection when many of the parameters are zero, nor than ridge regression when all parameter values are close to zero. However, because the true parameters are unknown in practice and the LASSO performs consistently across all conditions, the LASSO is preferable to ridge regression or forward selection. Keywords: LASSO, least square, forward selection, ridge, cross validation
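The difference between LASSO and ridge shrinkage can be illustrated in the special case of an orthonormal design, where both estimators have closed forms in terms of the least squares coefficients. This is a minimal sketch under that assumption, not the simulation setup of the paper; the coefficient values and the penalty `lam` are hypothetical.

```python
def soft_threshold(z, lam):
    """LASSO solution for one coefficient under an orthonormal design.

    Minimizes 0.5 * (z - b)**2 + lam * abs(b) over b.
    """
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def ridge_shrink(z, lam):
    """Ridge solution for one coefficient under an orthonormal design.

    Minimizes 0.5 * (z - b)**2 + 0.5 * lam * b**2 over b.
    """
    return z / (1.0 + lam)

# Hypothetical least squares coefficients and penalty.
ols = [3.0, -0.4, 0.2, -2.5]
lam = 0.5

lasso = [soft_threshold(z, lam) for z in ols]
ridge = [ridge_shrink(z, lam) for z in ols]
print(lasso)
print(ridge)
```

Under these assumptions the LASSO sets small coefficients exactly to zero, which is how it performs selection, while ridge only scales every coefficient toward zero.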
SURVIVAL ANALYSIS OF CUSTOMER IN POSTPAID TELECOMMUNICATION INDUSTRY
Business competition among mobile telecommunication providers in Indonesia is tightening, giving rise to customer defection, which has serious consequences for business performance. Customers currently face numerous options, which puts them at risk of churning. Reducing customer defection is therefore one of the challenges facing the Division of Loyalty and Retention. It is thus important to build a churn model that can be applied in practice to predict customers' tendency to churn and to identify the prognostic factors that influence churn. Survival analysis models such as Cox’s proportional hazard model have been very successful in previous research investigating the relationship between survival time and possible prognostic factors. In this research, Cox’s proportional hazard model of customer lifetime is effective in distinguishing the relative risk between churned customers and others, and between loyal customers and short-time customers, together with their significant prognostic factors. Simulating the estimated survival probability over time for particular combinations of the most significant characteristics affecting the tendency to churn then makes it possible to predict the lifetime to the churn event and to compare survival performance across combinations. Finally, the research yields simple, helpful, and applicable results as a basis for decisions on optimizing the company's customer retention efforts and treatment resources. Key words: Churn, Cox’s proportional hazard model, customer retention, survival analysis, telecommunication industry
KLASIFIKASI RANCANGAN FAKTORIAL PECAHAN JENUH TIGA TARAF DALAM 27 RUN (Classification of Saturated Three-Level Fractional Factorial Designs in 27 Runs)
This paper classifies the set of saturated three-level designs OA(27, 3^13), which is useful for determining the best design to apply. The A3 and A4 criteria cannot be used because they take the same value for all arrays. Assuming that only three factors are active, the projection aberration criterion using the A3(3) vector sorts the 68 non-isomorphic OAs into 54 classes. The two best arrays according to this criterion are presented as a reference for use. Keywords: saturated design, orthogonal array, classification, projection aberration
KAJIAN METODE THURSTONE DALAM PENENTUAN ASPEK PENTING PADA SISTEM TRANSAKSI NON TUNAI (A Study of the Thurstone Method for Determining Important Aspects of Cashless Transaction Systems)
Perception data are very commonly measured on an ordinal scale. Thurstone introduced methods for processing ordinal data, including the Thurstone method, the method of equal appearing intervals, and the method of successive intervals. The basic principle of these methods is to transform data from an ordinal scale to an interval scale so that they can be interpreted meaningfully. In addition, the methods can rank attributes and measure how much one attribute differs in importance from another. In the case of respondents' importance ratings for using cashless transactions, goodness-of-fit tests for the Thurstone method and the method of successive intervals indicate that the models describe the actual data reasonably well, with discrepancy levels of 2.3% and 4.5%, respectively. The Thurstone method is relatively insensitive to changes in the weight or rating scale of an attribute. It considers only the outcome of paired comparisons between two attributes and cannot capture differences in the ratings that respondents give to those attributes. The importance ratings show that the three most important aspects of cashless transactions are, in order, ease of use or accessibility, security, and transaction speed. Convenience is the aspect with the lowest importance compared with the other attributes. Keywords: Thurstone method, cashless transactions
PENENTUAN DOMAIN DENGAN TEKNIK VARIOGRAM (Domain Determination Using the Variogram Technique)
Scoring models for predicting the classification of prospective customers are often built using logistic regression and several other models. Classification can also be carried out with a simple naive Bayesian classifier. Although it relies on an assumption that data generally violate and on far simpler computation, the technique can produce predictive accuracy that is not disappointing. This paper illustrates the use of the simple naive Bayesian classifier for predicting the collectibility status of prospective customers and compares it with logistic regression and a generalized additive model. Keywords: simple naive Bayesian classifier
PENDUGAAN SELANG KEPERCAYAAN BOOTSTRAP BAGI ARAH RATA-RATA DATA SIRKULAR (Bootstrap Confidence Interval Estimation of Mean Direction for Circular Data)
The confidence interval is an estimator based on the sampling distribution. When the sampling distribution cannot be derived from the population distribution, the bootstrap method can be used to estimate it. Three methods for estimating the bootstrap confidence interval for circular data are the equal-tailed arc (ETA), symmetric arc (SYMA), and likelihood-based arc (LBA). In this study, the three methods were evaluated through a simulation study. The most important evaluation criteria were true coverage and interval width. The simulation results indicated that, for all methods, the interval width shortened as the concentration parameter increased. True coverage approached the confidence level when the concentration parameter was one or more. For small concentration parameters, all three methods appeared unstable. Based on true coverage, SYMA was the best, while in terms of interval width, LBA was the best. Considering both criteria, ETA gave the best result. ETA was applied to estimate the period of Dengue Fever outbreaks in Bengkulu. The estimates showed that the outbreaks in 2009 ran from October through January, in 2010 from January through March, and in 2011 from June through September. Keywords: Circular, Bootstrap confidence interval, Equal-tailed arc, Symmetric arc, Likelihood-based arc
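A bootstrap arc for the mean direction can be sketched as below. This is a simplified percentile-style arc with equal tail probabilities, written under the assumption that deviations of the resampled mean directions from the sample mean direction are wrapped to [-pi, pi); it is not claimed to reproduce the exact ETA, SYMA, or LBA constructions evaluated in the paper. The function names and the sample angles are hypothetical.

```python
import math
import random

def mean_direction(angles):
    """Mean direction in radians on [0, 2*pi), from summed sines and cosines."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.atan2(s, c) % (2 * math.pi)

def wrap(d):
    """Wrap an angular difference to the interval [-pi, pi)."""
    return (d + math.pi) % (2 * math.pi) - math.pi

def bootstrap_arc(angles, n_boot=2000, level=0.95, seed=0):
    """Percentile-style bootstrap arc around the sample mean direction."""
    rng = random.Random(seed)
    center = mean_direction(angles)
    # Resample with replacement and record the signed deviation from center.
    devs = sorted(
        wrap(mean_direction([rng.choice(angles) for _ in angles]) - center)
        for _ in range(n_boot)
    )
    alpha = 1.0 - level
    lo = devs[int(math.floor(alpha / 2 * (n_boot - 1)))]
    hi = devs[int(math.ceil((1 - alpha / 2) * (n_boot - 1)))]
    two_pi = 2 * math.pi
    return (center + lo) % two_pi, center, (center + hi) % two_pi

# Hypothetical sample: angles scattered around 0, so some wrap past 2*pi.
sample = [(0.3 * math.sin(k)) % (2 * math.pi) for k in range(1, 31)]
lower, center, upper = bootstrap_arc(sample)
print(lower, center, upper)
```

Working with wrapped deviations rather than raw angles keeps the arc meaningful when the data straddle the 0 / 2*pi boundary, as month-of-year data running from October through January would.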