1,721,037 research outputs found

    Bayesian modeling of spatio-temporal data with R

    Applied sciences, both physical and social, such as atmospheric, biological, climate, demographic, economic, ecological, environmental, oceanic and political, routinely gather large volumes of spatial and spatio-temporal data in order to make wide-ranging inference and prediction. Ideally such inferential tasks should be approached through modeling, as modeling automatically aids in estimation of uncertainties in all conclusions drawn from such data. Unified Bayesian modeling, implemented through user-friendly software packages, provides a crucial key to unlocking the full power of these methods for solving challenging practical problems. Keeping the applied scientists in mind, this book presents most of the modeling with the help of R commands written in a purposefully developed R package to facilitate spatio-temporal modeling. However, the presentation in the book does not lose sight of mathematical and statistical rigor, as it presents the underlying theories of Bayesian inference and computation in stand-alone chapters in the first part, which would appeal to final-year undergraduate or postgraduate students majoring in mathematics or statistics who are in search of such modeling.

    Bayesian Estimation and Model Choice in Item Response Models

    Item response models are essential tools for analyzing results from many educational and psychological tests. Such models are used to quantify the probability of correct response as a function of unobserved examinee ability and other parameters explaining the difficulty and the discriminatory power of the questions in the test. Some of these models also incorporate a threshold parameter for the probability of the correct response to account for the effect of guessing the correct answer in multiple-choice tests. In this article we consider fitting of such models using the Gibbs sampler. A data augmentation method to analyze a normal-ogive model incorporating a threshold guessing parameter is introduced and compared with a Metropolis-Hastings sampling method. The proposed method is an order of magnitude more efficient than the existing method. Another objective of this paper is to develop Bayesian model choice techniques for model discrimination. A predictive approach based on a variant of the Bayes factor is used and compared with another decision-theoretic method which minimizes an expected loss function on the predictive space. A classical model choice technique based on a modified likelihood ratio test statistic is shown as one component of the second criterion. As a consequence the Bayesian methods proposed in this paper are contrasted with the classical approach based on the likelihood ratio test. Several examples are given to illustrate the methods.
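
    As a rough illustration of the kind of response function described above, the sketch below evaluates a normal-ogive curve with a guessing threshold. The parameterisation `guessing + (1 - guessing) * Phi(discrimination * (ability - difficulty))` and all function and argument names are assumptions chosen for illustration; the abstract does not state the exact form used in the article, and this is not the article's Gibbs sampler.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_correct(ability: float, discrimination: float,
                 difficulty: float, guessing: float) -> float:
    """Probability of a correct response under an assumed normal-ogive form.

    The guessing parameter acts as a lower threshold: even a very
    low-ability examinee answers correctly with probability `guessing`.
    """
    if not 0.0 <= guessing <= 1.0:
        raise ValueError("guessing must lie in [0, 1]")
    ogive = normal_cdf(discrimination * (ability - difficulty))
    return guessing + (1.0 - guessing) * ogive


if __name__ == "__main__":
    # Four-option multiple-choice item: assumed guessing threshold of 0.25.
    for theta in (-2.0, 0.0, 2.0):
        print(theta, round(prob_correct(theta, 1.0, 0.0, 0.25), 4))
```

    When ability equals difficulty the ogive term is one half, so the probability sits midway between the guessing threshold and one.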

    Introduction to probability, statistics & R: Foundations for data-based sciences

    A strong grasp of elementary statistics and probability, along with basic skills in using R, is essential for various scientific disciplines reliant on data analysis. This book serves as a gateway to learning statistical methods from scratch, assuming a solid background in high school mathematics. Readers gradually progress from basic concepts to advanced statistical modelling, with examples from actuarial, biological, ecological, engineering, environmental, medical, and social sciences highlighting the real-world relevance of the subject. An accompanying R package enables seamless practice and immediate application, making it ideal for beginners. The book comprises 19 chapters divided into five parts. Part I introduces basic statistics and the R software package, teaching readers to calculate simple statistics and create basic data graphs. Part II delves into probability concepts, including rules and conditional probability, and introduces widely used discrete and continuous probability distributions (e.g., binomial, Poisson, normal, log-normal). It concludes with the central limit theorem and joint distributions for multiple random variables. Part III explores statistical inference, covering point and interval estimation, hypothesis testing, and Bayesian inference. This part is intentionally less technical, making it accessible to readers without an extensive mathematical background. Part IV addresses advanced probability and statistical distribution theory, assuming some familiarity with (or concurrent study of) mathematical methods like advanced calculus and linear algebra. Finally, Part V focuses on advanced statistical modelling using simple and multiple regression and analysis of variance, laying the foundation for further studies in machine learning and data science applicable to various data and decision analytics contexts.
    Based on years of teaching experience, this textbook includes numerous exercises and makes extensive use of R, making it ideal for year-long data science modules and courses. In addition to university courses, the book amply covers the syllabus for the Actuarial Statistics 1 examination of the Institute and Faculty of Actuaries in London. It also provides a solid foundation for postgraduate studies in statistics and probability, or a reliable reference for statistics.

    Dynamically updated spatially varying parameterizations of hierarchical Bayesian models for spatial data

    Fitting hierarchical Bayesian models to spatially correlated data sets using Markov chain Monte Carlo (MCMC) techniques is computationally expensive. Complicated covariance structures of the underlying spatial processes, together with a high-dimensional parameter space, mean that the number of calculations required grows cubically with the number of spatial locations at each MCMC iteration. This creates a need for efficient model parameterisations that hasten the convergence and improve the mixing of the associated algorithms. We consider partially centred parameterisations (PCPs) which lie on a continuum between what are known as the centred (CP) and non-centred parameterisations (NCP). By introducing a weight matrix we remove the conditional posterior correlation between the fixed and the random effects, and hence construct a PCP which achieves immediate convergence for a three-stage model, based on multiple Gaussian processes with known covariance parameters. When the covariance parameters are unknown we dynamically update the parameterisation within the sampler. The PCP outperforms both the CP and the NCP and leads to a fully automated algorithm which has been demonstrated in two simulation examples. The effectiveness of the spatially varying PCP is illustrated with a practical data set of nitrogen dioxide concentration levels. Supplemental materials consisting of appendices, data sets and computer code to reproduce the results are available online.

    A comparison of Bayesian models for daily ozone concentration levels

    Recently, there has been a surge of interest in Bayesian space-time modeling of daily maximum eight-hour average ozone concentration levels. Hierarchical models based on well-known time series modeling methods such as the dynamic linear models (DLM) and the auto-regressive (AR) models are often used in the literature. The DLM, developed as a result of the popularity of Kalman filtering methods, provides a dynamical state-space system that is thought to evolve from a pair of state and observation equations. The AR models, on the other hand, cast in a Bayesian hierarchical setting, have recently been developed through a pair of models where a measurement error model is formulated at the top level and an AR model for the true ozone concentration levels is postulated at the next level. Each of the modeling scenarios is set in an appropriate multivariate setting to model the spatial dependence. This paper compares these two methods in hierarchical Bayesian settings. A simplified skeletal version of the DLM taken from Dou et al. (2009) is compared theoretically with a matching hierarchical AR model. The comparisons reveal many important differences in the induced space-time correlation structures. Further comparisons of the variances of the predictive distributions by conditioning on different sets of data for each model show superior performances of the AR models under certain conditions. These theoretical investigations are followed up by a simulation study and a real data example implemented using Markov chain Monte Carlo (MCMC) methods for modeling daily maximum eight-hour average ozone concentration levels observed in the state of New York in the months of July and August, 2006. The hierarchical AR model is chosen by all the model choice criteria considered in this example.
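
    The two-level structure of the hierarchical AR model described above (a measurement error model on top of an AR model for the true levels) can be sketched by simulating data from it. This is a deliberately simplified, single-site illustration: it omits the spatial dependence and any covariates, and the function name, the AR(1) order, and all parameter names are assumptions rather than the paper's specification.

```python
import random


def simulate_hierarchical_ar(n_days, rho, intercept, sigma_eta, sigma_eps, seed=0):
    """Simulate a single-site, two-level AR(1) model (illustrative only).

    Level 2 (process):      true_t = intercept + rho * true_{t-1} + eta_t
    Level 1 (measurement):  obs_t  = true_t + eps_t
    where eta_t and eps_t are independent zero-mean Gaussian errors.
    """
    if abs(rho) >= 1.0:
        raise ValueError("rho must satisfy |rho| < 1 for a stationary process")
    rng = random.Random(seed)
    # Start the latent process at its stationary mean.
    previous = intercept / (1.0 - rho)
    true_levels, observed = [], []
    for _ in range(n_days):
        current = intercept + rho * previous + rng.gauss(0.0, sigma_eta)
        true_levels.append(current)
        observed.append(current + rng.gauss(0.0, sigma_eps))
        previous = current
    return true_levels, observed


if __name__ == "__main__":
    truth, obs = simulate_hierarchical_ar(
        n_days=62, rho=0.5, intercept=2.0, sigma_eta=0.3, sigma_eps=0.1, seed=1
    )
    print(len(truth), len(obs))
```

    Fitting such a model in a Bayesian way would place priors on the autoregressive coefficient, the intercept and the two variances and sample them by MCMC; the simulation only shows how the measurement and process levels stack.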

    An evaluation of European air pollution regulations for particulate matter monitored from a heterogeneous network

    Statistical methods are needed for evaluating many aspects of air pollution regulations increasingly adopted by many different governments in the European Union. Atmospheric particulate matter (PM) is an important air pollutant for which regulations have been issued recently. A challenging task here is to evaluate the regulations based on data monitored on a heterogeneous network where PM has been observed at a number of sites and a surrogate has been observed at some other sites. This paper develops a hierarchical Bayesian joint space-time model for the PM measurements and its surrogate, between which the exact relationship is unknown, and applies the methods to analyse spatio-temporal data obtained from a number of sites in Northern Italy. The model is implemented using MCMC techniques and methods are developed to meet the regulatory demands. These enable full inference with regard to process unknowns, calibration, validation, predictions in time and space and evaluation of regulatory standards.

    A space-time model for joint modeling of ocean temperature and salinity levels as measured by Argo Floats

    The world's climate is to a large extent driven by the transport of heat and fresh water in the oceans. Regular monitoring, studying, understanding and forecasting of temperature and salinity at different depths of the oceans are a great scientific challenge. Temperature at the ocean surface can be measured from space. However, salinity cannot yet be measured by satellites, and space-based measurements can only ever give us values at the surface. Until recently temperature and salinity measurements within the oceans have had to come from expensive research ships. The Argo float program has been funded by various nations to collect actual measurements and rectify this problem. A Bayesian hierarchical model is proposed in this paper describing the spatio-temporal behaviour of the joint distribution of temperature and salinity levels. The model is obtained as a kernel-convolution effect of a single latent spatio-temporal process. Additional terms in the mean describe non-stationarity arising in time and space. Predictive Bayesian model selection criteria have been used to validate the models using data for the year 2003. Illustrative annual prediction maps along with their uncertainty maps are also obtained. Markov chain Monte Carlo methods are used throughout in the implementation.

    spTimer: Spatio-Temporal Bayesian Modeling Using R

    Hierarchical Bayesian modeling of large point-referenced space-time data is increasingly becoming feasible in many environmental applications due to the recent advances in both statistical methodology and computational power. Implementation of these methods using the Markov chain Monte Carlo (MCMC) computational techniques, however, requires development of problem-specific and user-written computer code, possibly in a low-level language. This programming requirement is hindering the widespread use of the Bayesian model-based methods among practitioners, and hence there is an urgent need to develop high-level software that can analyze large data sets rich in both space and time. This paper develops the package spTimer for hierarchical Bayesian modeling of stylized environmental space-time monitoring data as a contributed software package in the R language that is fast becoming a very popular statistical computing platform. The package is able to fit, and to spatially and temporally predict, large amounts of space-time data using three recently developed Bayesian models. The user is given control over many options regarding covariance function selection, distance calculation, prior selection and tuning of the implemented MCMC algorithms, although suitable defaults are provided. The package has many other attractive features such as on-the-fly transformations and an ability to spatially predict temporally aggregated summaries on the original scale, which saves the problem of storage when using MCMC methods for large datasets. A simulation example, with more than a million observations, and a real-life data example are used to validate the underlying code and to illustrate the software capabilities.