An alternative marginal likelihood estimator for phylogenetic models
Bayesian phylogenetic methods are generating noticeable enthusiasm in the field of molecular systematics. Many phylogenetic models are often at stake, and different approaches are used to compare them within a Bayesian framework. The Bayes factor, defined as the ratio of the marginal likelihoods of two competing models, plays a key role in Bayesian model selection. We focus on an alternative estimator of the marginal likelihood, whose computation is still a challenging problem. Several computational solutions have been proposed, none of which can be considered to outperform the others simultaneously in terms of simplicity of implementation, computational burden and precision of the estimates. Practitioners and researchers, often led by available software, have so far favored the simplicity of the harmonic mean estimator (HM) and the arithmetic mean estimator (AM). However, it is known that the resulting estimates of the Bayesian evidence in favor of one model are biased and often inaccurate, to the point of having infinite variance, so that the reliability of the corresponding conclusions is doubtful. Our new implementation of the generalized harmonic mean (GHM) idea recycles MCMC simulations from the posterior and shares the computational simplicity of the original HM estimator but, unlike it, overcomes the infinite variance issue. The alternative estimator is applied to simulated phylogenetic data and produces fully satisfactory results, outperforming the simple estimators currently provided by most of the publicly available software.
Subjects: Computation (stat.CO); Quantitative Methods (q-bio.QM); Applications (stat.AP); Methodology (stat.ME)
Cite as: arXiv:1001.2136 [stat.CO]
http://arxiv.org/abs/1001.2136
(Submitted on 13 Jan 2010 (v1), last revised 20 Jun 2010 (this version, v2))
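To make the harmonic mean (HM) estimator mentioned in the abstract concrete, here is a minimal sketch on a toy conjugate model rather than a phylogenetic one. Everything in it is an assumption made for illustration: the model (y_i ~ Normal(theta, 1) with prior theta ~ Normal(0, tau2)), the function names, and the use of exact posterior draws as a stand-in for MCMC output. It shows only the plain HM estimator, not the GHM implementation proposed in the paper. The toy model is chosen because its marginal likelihood has a closed form, so the estimate can be compared with the exact value.

```python
import math
import random


def log_likelihood(theta, ys):
    """Log-likelihood of the toy model y_i ~ Normal(theta, 1)."""
    return sum(-0.5 * math.log(2 * math.pi) - 0.5 * (y - theta) ** 2 for y in ys)


def exact_log_marginal(ys, tau2):
    """Closed-form log marginal likelihood under the prior theta ~ Normal(0, tau2)."""
    n, s = len(ys), sum(ys)
    ss = sum(y * y for y in ys)
    return (-0.5 * n * math.log(2 * math.pi)
            - 0.5 * math.log(1 + n * tau2)
            - 0.5 * (ss - tau2 * s * s / (1 + n * tau2)))


def posterior_draws(ys, tau2, n_draws, rng):
    """Exact draws from the conjugate posterior (a stand-in for MCMC output)."""
    precision = len(ys) + 1 / tau2
    mean = sum(ys) / precision
    sd = math.sqrt(1 / precision)
    return [rng.gauss(mean, sd) for _ in range(n_draws)]


def harmonic_mean_log_marginal(ys, draws):
    """Plain HM estimator: log of 1 / mean(1 / L(theta_i)), computed in log space."""
    neg = [-log_likelihood(t, ys) for t in draws]
    m = max(neg)
    log_mean = m + math.log(sum(math.exp(v - m) for v in neg) / len(neg))
    return -log_mean


rng = random.Random(0)
ys = [rng.gauss(1.0, 1.0) for _ in range(20)]
draws = posterior_draws(ys, tau2=10.0, n_draws=5000, rng=rng)
print("exact log marginal:", exact_log_marginal(ys, 10.0))
print("HM estimate:       ", harmonic_mean_log_marginal(ys, draws))
```

Rerunning with different seeds or a more diffuse prior (larger `tau2`) gives a feel for how unstable the plain HM estimate can be, which is the weakness the abstract refers to.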
A unit level small area model with misclassified covariates
Model-based small area estimation relies on mixed effects regression models that link the small areas and borrow strength from similar domains. When the auxiliary variables used in the models are measured with error, small area estimators that ignore the measurement error may be worse than direct estimators. Alternative small area estimators accounting for measurement error have been proposed in the literature, but only for continuous auxiliary variables. Adopting a Bayesian approach, we extend the unit level model to account for measurement error in both continuous and categorical covariates. For the discrete variables, we model the misclassification probabilities and estimate them jointly with all the unknown model parameters. We test our model through a simulation study. The effect of the proposed model is illustrated through an application to data from the Ethiopia Demographic and Health Survey, where we focus on women's malnutrition: a dramatic problem in developing countries and an important indicator of the socio-economic progress of a country.
A multivariate approach to the analysis of air quality in a high environmental risk area
A new robust Bayesian small area estimation via α-stable model for estimating the proportion of athletic students in California
In the last few years, diabetes mellitus and obesity have proved to be among the fastest-growing chronic diseases in youth in the United States. The number of new diabetes cases is dramatically increasing and, for the moment, effective therapy does not exist. Experts believe that one of the causes of this increase is the decline in exercise behavior. The California Education Code requires local educational agencies (LEAs) to administer the FITNESSGRAM, the Physical Fitness Test (PFT), to Californian students of public schools. This test evaluates six fitness areas, and experts have determined that a passing result on all six areas of the test represents a fitness level that offers some protection against the diseases associated with physical inactivity. We consider 2015–2016 data provided by the California Department of Education (CDE): for each Californian county, we aim at estimating the county-level proportion of students with a score equal to six. To account for the heterogeneity of the phenomenon and the presence of outlying counties, we extend the standard area-level model by specifying the random effects as a symmetric α-stable (SαS) distribution that can accommodate different types of outlying observations. The model can accurately estimate the county-level proportion of students with a score equal to six. Results highlight some interesting relationships with social and economic situations in each county. The performance of the proposed model is also investigated through an extensive simulation study.
Ercolano M.R., Carli P., Arima S., Fogliano V., Tardella L., Barone A. Complex network analysis of traits affecting tomato organoleptic quality. The 5th Solanaceae Genome Workshop. Cologne (Germany), 12-16 October 2008, p. 172
Exploiting blank spots for model-based background correction in discovering genes with DNA array data
A model-based approach to lagoon ecosystem classification
The EU Water Framework Directive recognizes benthic macroinvertebrates as good biological indicators of the quality of transitional waters, as they are mainly exposed to natural variability patterns characteristic of these ecosystems, due to their life cycles and space-use behavior. Here, we address the classification of the ecological status of three lagoons in Apulia (Italy), using three multimetric indices based on benthic macroinvertebrates (namely M-AMBI, BITS and ISS) that are likely to respond differently to different sources of stress and natural variability. Lagoon classification is usually based on the discretization of such indices by standard classification boundaries, with only partial consideration of the natural variability of ecosystem properties and possible inaccuracies of the classification procedures. We first consider a Bayesian hierarchical model to study the effects of abiotic covariates and external anthropogenic pressure indicators on the multimetric indices, taking into account their correlation structure. In order to further investigate the possible contrasting behavior of the three indices in terms of lagoon classification, we propose a cumulative proportional odds model for the discretized version of the indices as a function of the same explanatory ecological variables. This model allows us to understand how abiotic variables and anthropogenic pressures affect the classification into different ecological statuses and to evaluate the agreement between indices in terms of classification. Both models have been estimated in a fully Bayesian framework by a Markov chain Monte Carlo posterior simulation algorithm.
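As a rough illustration of what a cumulative proportional odds model computes, the sketch below turns a linear predictor into probabilities for ordered classes via P(Y <= k | x) = logistic(c_k - x·beta). The parameterization, the function names, the five-class setup, and all numeric values are assumptions for illustration only; they are not taken from the paper, which estimates its model in a Bayesian framework.

```python
import math


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probabilities(x, beta, cutpoints):
    """Class probabilities under P(Y <= k | x) = logistic(c_k - x.beta).

    `cutpoints` must be strictly increasing; K cutpoints give K + 1 ordered classes.
    """
    eta = sum(b * xi for b, xi in zip(beta, x))
    cumulative = [logistic(c - eta) for c in cutpoints] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(Y = k) = P(Y <= k) - P(Y <= k - 1)
        previous = c
    return probs


# Hypothetical values: two covariates and four cutpoints, i.e. five ordered classes.
cutpoints = [-2.0, -0.5, 0.5, 2.0]
probs = category_probabilities([0.5, -1.0], [0.8, 0.3], cutpoints)
print([round(p, 3) for p in probs])
```

Under this sign convention, a larger linear predictor shifts probability toward the higher classes, and the same coefficients `beta` apply at every cutpoint, which is the "proportional odds" assumption.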
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting: the traditional type, which only counts the first one among a cited work's authors, and a non-traditional type, which takes into account the first 5 authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
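The difference between the two counting schemes can be sketched in a few lines. This is a toy illustration under stated assumptions, not the study's procedure: the data layout (each citing paper is a list of cited works, each cited work an ordered list of authors), the function name `cocitation_counts`, the author labels, and the rule that a pair is counted at most once per citing paper are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists, max_authors):
    """Count author pairs co-cited by the same citing paper.

    Only the first `max_authors` authors of each cited work are considered:
    max_authors=1 gives first-author counting, max_authors=5 the broader scheme.
    """
    counts = Counter()
    for references in reference_lists:
        cited_authors = set()
        for authors in references:
            cited_authors.update(authors[:max_authors])
        # Each unordered pair is counted once per citing paper.
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts


# Two hypothetical citing papers, each citing two works.
reference_lists = [
    [["A", "B"], ["C"]],
    [["C", "D"], ["A"]],
]
first_author = cocitation_counts(reference_lists, max_authors=1)
first_five = cocitation_counts(reference_lists, max_authors=5)
print(dict(first_author))
print(dict(first_five))
```

In this toy data, first-author counting sees only the pair (A, C), while counting the first five authors also brings in the co-authors B and D, which is the kind of extra signal the abstract describes.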
Estimating child and infant mortality in Egypt through a Bayesian approach for small area
In the Egyptian context, the delayed fertility transition compared to neighboring countries can be in part ascribed to the delay in the fall of infant mortality rates. Infant mortality was high in Egypt until the 1980s. Since then, significant progress has been recorded: in 2001, the number of deaths per 1,000 births was 38, against 97 in 1984. However, differences between governorates are still significant: in urban governorates, the 2008 level is 29 deaths per 1,000 births, while in rural Upper Egypt mortality was about 39 ‰. No previous studies have attempted to estimate infant and child mortality in Egypt for small geographical areas. Strong socio-economic differences and inequalities exist between urban and rural settings, between Upper and Lower Egypt, and even between small areas in the same region or city. Those differences justify the need to calculate infant and child mortality rates at the local level. We address this problem using a Bayesian hierarchical model for small areas: model-based estimators will be derived and their precision compared with alternative estimators proposed in the literature. We use data from the Egyptian Demographic and Health Surveys (1995 and 2005), the Egyptian population register and the Egyptian Population and Housing Census (1996 and 2006).
