
    Bayesian inference for Poisson and multinomial log-linear models

    Categorical data frequently arise in applications in the social sciences. In such applications, the class of log-linear models, based on either a Poisson or (product) multinomial response distribution, is a flexible model class for inference and prediction. In this paper we consider the Bayesian analysis of both Poisson and multinomial log-linear models. It is often convenient to model multinomial or product multinomial data as observations of independent Poisson variables. For multinomial data, Lindley (1964) showed that this approach leads to valid Bayesian posterior inferences when the prior density for the Poisson cell means factorises in a particular way. We develop this result to provide a general framework for the analysis of multinomial or product multinomial data using a Poisson log-linear model. Valid finite population inferences are also available, which can be particularly important in modelling social data. We then focus particular attention on multivariate normal prior distributions for the log-linear model parameters. Here, an improper prior distribution for certain Poisson model parameters is required for valid multinomial analysis, and we derive conditions under which the resulting posterior distribution is proper. We also consider the construction of prior distributions across models, and for model parameters, when uncertainty exists about the appropriate form of the model. We present classes of Poisson and multinomial models, invariant under certain natural groups of permutations of the cells. We demonstrate that, if prior belief concerning the model parameters is also invariant, as is the case in a 'reference' analysis, then choice of prior distribution is considerably restricted. The analysis of multivariate categorical data in the form of a contingency table is considered in detail. We illustrate the methods with two examples.
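The Poisson-multinomial connection that this abstract builds on can be checked numerically at the likelihood level: independent Poisson counts, conditioned on their total, follow a multinomial distribution with cell probabilities proportional to the Poisson means. The sketch below is an illustration only and does not reproduce the paper's Bayesian result, which concerns priors and posteriors. The function names and the example counts and means are assumptions made for this example.

```python
import math


def poisson_pmf(y, mu):
    """Probability of observing count y from a Poisson with mean mu."""
    return math.exp(-mu) * mu ** y / math.factorial(y)


def multinomial_pmf(counts, probs):
    """Multinomial probability of `counts` given cell probabilities `probs`."""
    coef = math.factorial(sum(counts))
    for y in counts:
        coef //= math.factorial(y)  # exact integer division at every step
    p = float(coef)
    for y, q in zip(counts, probs):
        p *= q ** y
    return p


def conditional_poisson_pmf(counts, means):
    """P(counts | total) for independent Poisson cells with the given means."""
    joint = math.prod(poisson_pmf(y, m) for y, m in zip(counts, means))
    # The total of independent Poissons is Poisson with the summed mean.
    total = poisson_pmf(sum(counts), sum(means))
    return joint / total


# Hypothetical three-cell table and Poisson cell means.
counts = [3, 1, 4]
means = [2.0, 0.5, 3.5]
probs = [m / sum(means) for m in means]

lhs = conditional_poisson_pmf(counts, means)
rhs = multinomial_pmf(counts, probs)
print(math.isclose(lhs, rhs, rel_tol=1e-9))
```

The two quantities agree up to floating-point error for any choice of positive means, which is why a Poisson model can stand in for a multinomial one once the total is treated as fixed.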

    Efficient construction of reversible jump Markov chain Monte Carlo proposal distributions - Discussion

    The major implementational problem for reversible jump Markov chain Monte Carlo methods is that there is commonly no natural way to choose jump proposals since there is no Euclidean structure in the parameter space to guide our choice. We consider mechanisms for guiding the choice of proposal. The first group of methods is based on an analysis of acceptance probabilities for jumps. Essentially, these methods involve a Taylor series expansion of the acceptance probability around certain canonical jumps and turn out to have close connections to Langevin algorithms. The second group of methods generalizes the reversible jump algorithm by using the so-called saturated space approach. These allow the chain to retain some degree of memory so that, when proposing to move from a smaller to a larger model, information is borrowed from the last time that the reverse move was performed. The main motivation for this paper is that, in complex problems, the probability that the Markov chain moves between such spaces may be prohibitively small, as the probability mass can be very thinly spread across the space. Therefore, finding reasonable jump proposals becomes extremely important. We illustrate the procedure by using several examples of reversible jump Markov chain Monte Carlo applications including the analysis of autoregressive time series, graphical Gaussian modeling and mixture modelling.

    Ecological inference for 2 × 2 tables - Discussion

    A fundamental problem in many disciplines, including political science, sociology and epidemiology, is the examination of the association between two binary variables across a series of 2 × 2 tables, when only the margins are observed, and one of the margins is fixed. Two unobserved fractions are of interest, with only a single response per table, and it is this non-identifiability that is the inherent difficulty lying at the heart of ecological inference. Many methods have been suggested for ecological inference, often without a probabilistic model; we clarify the form of the sampling distribution and critique previous approaches within a formal statistical framework, thus allowing clarification and examination of the assumptions that are required under all approaches. A particularly difficult problem is choosing between models with and without contextual effects. Various Bayesian hierarchical modelling approaches are proposed to allow the formal inclusion of supplementary data, and/or prior information, without which ecological inference is unreliable. Careful choice of the prior within such models is required, however, since there may be considerable sensitivity to this choice, even when the model assumed is correct and there are no contextual effects. This sensitivity is shown to be a function of the number of areas and the distribution of the proportions in the fixed margin across areas. By explicitly providing a likelihood for each table, the combination of individual level survey data and aggregate level data is straightforward and we illustrate that survey data can be highly informative, particularly if these data are from a survey of the minority population within each area. This strategy is related to designs that are used in survey sampling and in epidemiology. An approximation to the suggested likelihood is discussed, and various computational approaches are described. Some extensions are outlined including the consideration of multiway tables, spatial dependence and area-specific (contextual) variables. Voter registration-race data from 64 counties in the US state of Louisiana are used to illustrate the methods. Copyright 2004 Royal Statistical Society.
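The non-identifiability described in this abstract can be made concrete with the classical deterministic bounds on the two unobserved fractions. The sketch below is not the hierarchical Bayesian approach of the paper. It assumes a single table in which a known fraction x of the population falls in the first row, an observed overall response fraction y, and the accounting identity y = p0 * x + p1 * (1 - x). The symbols x, y, p0, p1 and the function name are assumptions introduced for illustration.

```python
def fraction_bounds(x, y):
    """Deterministic bounds on (p0, p1) given y = p0*x + p1*(1 - x).

    x: fraction of the population in the first row (the fixed margin).
    y: observed overall response fraction.
    Returns ((p0_low, p0_high), (p1_low, p1_high)).
    """
    if not 0.0 < x < 1.0:
        raise ValueError("x must lie strictly between 0 and 1")
    if not 0.0 <= y <= 1.0:
        raise ValueError("y must lie between 0 and 1")
    # Each bound comes from forcing the other fraction to 0 or 1.
    p0_low = max(0.0, (y - (1.0 - x)) / x)
    p0_high = min(1.0, y / x)
    p1_low = max(0.0, (y - x) / (1.0 - x))
    p1_high = min(1.0, y / (1.0 - x))
    return (p0_low, p0_high), (p1_low, p1_high)


# Hypothetical area: 90% of people in the first row, 50% overall response.
p0_range, p1_range = fraction_bounds(0.9, 0.5)
print(p0_range, p1_range)
```

With these hypothetical inputs the bounds on p0 are fairly tight while p1 remains unrestricted, which is one way to see why the margins alone cannot pin down both fractions and why supplementary data or prior information matter.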

    Bayesian model determination for multivariate ordinal and binary data

    Different conditional independence specifications for ordinal categorical data are compared by calculating a posterior distribution over classes of graphical models. The approach is based on the multivariate ordinal probit model where the data are considered to have arisen as truncated multivariate normal random vectors. By parameterising the precision matrix of the associated multivariate normal in Cholesky form, ordinal data models corresponding to directed acyclic conditional independence graphs for the latent variables can be specified and conveniently computed. Where one or more of the variables are binary this parameterisation is particularly compelling, as necessary constraints on the latent variable distribution can be imposed in such a way that a standard, fully normalised, prior can still be adopted. For comparing different directed graphical models a reversible jump Markov chain Monte Carlo (MCMC) approach is proposed. Where interest is focussed on undirected graphical models, this approach is augmented to allow switches in the orderings of variables of associated directed graphs, hence allowing the posterior distribution over decomposable undirected graphical models to be computed. The approach is illustrated with several examples, involving both binary and ordinal variables, and directed and undirected graphical model classes.

    Model-based inference for categorical survey data subject to non-ignorable non-response (with discussion)

    We consider non-response models for a single categorical response with categorical covariates whose values are always observed. We present Bayesian methods for ignorable models and a particular non-ignorable model, and we argue that standard methods of model comparison are inappropriate for comparing ignorable and non-ignorable models. Uncertainty about ignorability of non-response is incorporated by introducing parameters describing the extent of non-ignorability into a pattern mixture specification and integrating over the prior uncertainty associated with these parameters. Our approach is illustrated using polling data from the 1992 British general election panel survey. We suggest sample size adjustments for surveys when non-ignorable non-response is expected.

    Bayesian forecasting of mortality rates by using latent Gaussian models

    We provide forecasts for mortality rates by using two different approaches. First we employ dynamic non-linear logistic models based on the Heligman–Pollard formula. Second, we assume that the dynamics of the mortality rates can be modelled through a Gaussian Markov random field. We use efficient Bayesian methods to estimate the parameters and the latent states of the models proposed. Both methodologies are tested with past data and are used to forecast mortality rates both for large (UK and Wales) and small (New Zealand) populations up to 21 years ahead. We demonstrate that predictions for individual survivor functions and other posterior summaries of demographic and actuarial interest are readily obtained. Our results are compared with other competing forecasting methods.
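For readers unfamiliar with the Heligman–Pollard formula mentioned in this abstract, the sketch below evaluates its standard static eight-parameter form, in which the odds of death at age x are the sum of a childhood term, an accident hump and a senescent term: q/(1 - q) = A^((x + B)^C) + D exp(-E (ln x - ln F)^2) + G H^x. It does not reproduce the dynamic, time-varying version the paper describes. The function name and all parameter values are hypothetical and chosen only to give a plausible-looking curve.

```python
import math


def heligman_pollard_qx(age, A, B, C, D, E, F, G, H):
    """Probability of death at `age` under the static Heligman-Pollard curve."""
    if age <= 0:
        raise ValueError("age must be positive because the formula uses ln(age)")
    childhood = A ** ((age + B) ** C)  # rapidly declining early-life mortality
    hump = D * math.exp(-E * (math.log(age) - math.log(F)) ** 2)  # accident hump
    senescent = G * H ** age  # exponential rise at older ages
    odds = childhood + hump + senescent
    return odds / (1.0 + odds)  # convert odds q/(1-q) back to q


# Hypothetical parameter values, for illustration only.
params = dict(A=0.0005, B=0.01, C=0.1, D=0.001, E=10.0, F=20.0, G=0.00005, H=1.1)

curve = {age: heligman_pollard_qx(age, **params) for age in range(1, 101)}
print(curve[40] < curve[80])
```

Working on the odds scale keeps every fitted probability strictly between 0 and 1, whatever positive values the three components take.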