Log-mean linear regression models for binary responses with an application to multimorbidity
In regression models for categorical data a linear model is typically related to the response variables via a transformation of probabilities called the link function. We introduce an approach based on two link functions for binary data named the log-mean and the log-mean linear methods. The choice of the link function plays a key role in the interpretation of the model, and our approach is especially appealing in terms of interpretation of the effects of covariates on the association of responses. Similarly to Poisson regression, the log-mean and log-mean linear regression coefficients of single outcomes are log-relative-risks, and we show that the relative risk interpretation is also maintained in the regressions of the association of responses. Furthermore, certain collections of zero log-mean linear regression coefficients imply that the relative risks for joint responses factorize with respect to the corresponding relative risks for marginal responses. This work is motivated by the analysis of a data set obtained from a case–control study aimed at investigating the effect of human immunodeficiency virus infection on multimorbidity, i.e. the simultaneous presence of two or more non-infectious comorbidities in one patient.
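The relative-risk reading of a single-outcome coefficient under a log link can be checked with a small worked example. The sketch below uses made-up 2 × 2 counts (not data from the study) and a saturated model with one binary covariate; it illustrates only the single-outcome case, not the log-mean linear parameterization of joint responses described in the abstract.

```python
import math

# Hypothetical counts (illustrative only): keys are the binary covariate x.
counts = {
    0: {"events": 30, "total": 200},
    1: {"events": 45, "total": 150},
}

def risk(x):
    """Observed P(Y = 1 | x) for covariate level x."""
    return counts[x]["events"] / counts[x]["total"]

# Saturated log-link model: log P(Y = 1 | x) = alpha + beta * x.
# With a single binary covariate the coefficients follow directly
# from the observed risks, so no iterative fitting is needed.
alpha = math.log(risk(0))
beta = math.log(risk(1)) - math.log(risk(0))

# The exponentiated slope is the relative risk P(Y=1 | x=1) / P(Y=1 | x=0).
relative_risk = risk(1) / risk(0)

print(f"beta = {beta:.4f}, exp(beta) = {math.exp(beta):.4f}, RR = {relative_risk:.4f}")
```

Under a logit link the exponentiated coefficient would be an odds ratio instead, which is why the choice of link matters for interpretation.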
On the comparison of regression coefficients across multiple logistic models with binary predictors
In many applied contexts, it is of interest to identify the extent to which a given association measure changes its value as different sets of variables are included in the analysis. We consider logistic regression models where interest lies in the effect of a focal binary explanatory variable on a specific response, and a further collection of binary covariates is available. We provide a methodological framework for the joint analysis of the full set of coefficients of the focal variable computed across all the models obtained by adding or removing predictors from the set of covariates. The result is obtained by applying a specific log-hybrid linear expansion of the joint distribution of the variables that implicitly comprises all the regression coefficients of interest. In this way, we obtain a method that allows one to verify, in a flexible way, a wide range of scientific hypotheses involving the comparison of multiple logistic regression coefficients both in nested and in non-nested models. The proposed methodology is illustrated through a test bed example and an empirical application.
Pairwise Likelihood Inference for Nested Hidden Markov Chain Models for Multilevel Longitudinal Data
In the context of multilevel longitudinal data, where sample units are collected in clusters, an important aspect that should be accounted for is the unobserved heterogeneity between sample units and between clusters. For this aim, we propose an approach based on nested hidden (latent) Markov chains, which are associated with every sample unit and with every cluster. The approach allows us to account for the previously mentioned forms of unobserved heterogeneity in a dynamic fashion; it also allows us to account for the correlation that may arise between the responses provided by the units belonging to the same cluster. Under the assumed model, computing the manifest distribution of these response variables is infeasible even with a few units per cluster. Therefore, we make inference on this model through a composite likelihood function based on all the possible pairs of subjects within each cluster. Properties of the composite likelihood estimator are assessed by simulation. The proposed approach is illustrated through an application to a dataset concerning a sample of Italian workers in which a binary response variable for the worker receiving an illness benefit was repeatedly observed. Supplementary materials for this article are available online.
Expected posterior priors for model comparison in a class of discrete graphical models
The implementation of the Bayesian paradigm to model comparison can be problematic. In particular, prior distributions on the parameter space of each candidate model require special care. While it is well known that improper priors cannot be routinely used for Bayesian model comparison, we claim that the use of proper conventional priors under each model should also be regarded as suspicious, especially when comparing models having different dimensions. The basic idea is that priors should not be assigned separately under each model; rather they should be related across models, in order to acquire some degree of compatibility, and thus allow fairer and more robust comparisons. In this connection, the intrinsic prior as well as the expected posterior prior (EPP) methodology represent a useful tool. In this paper we develop a procedure based on EPP to perform Bayesian model comparison for discrete undirected decomposable graphical models, although our method could also be adapted to deal with directed acyclic graph models. We present two possible approaches: one based on imaginary data, and one which makes use of a limited number of actual data. The methodology is illustrated through the analysis of a 2 × 3 × 4 contingency table.
Observation-driven models for discrete-valued time series
Statistical inference for discrete-valued time series has not been developed like traditional methods for time series generated by continuous random variables. Some relevant models exist, but the lack of a homogeneous framework raises some critical issues. For instance, it is not trivial to explore whether models are nested and it is quite arduous to derive stochastic properties which simultaneously hold across different specifications. In this paper, inference for a general class of first order observation-driven models for discrete-valued processes is developed. Stochastic properties such as stationarity and ergodicity are derived under easy-to-check conditions, which can be directly applied to all the models encompassed in the class and for every distribution which satisfies mild moment conditions. Consistency and asymptotic normality of quasi-maximum likelihood estimators are established, with the focus on the exponential family. Finite sample properties and the use of information criteria for model selection are investigated through Monte Carlo studies. An empirical application to count data is discussed, concerning a test-bed time series on the spread of an infection.
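As a concrete picture of a first-order observation-driven model for counts, the sketch below simulates a linear Poisson autoregression in which the conditional mean depends on the previous observation and the previous conditional mean. This particular specification, the parameter names `omega`, `alpha`, `beta`, and the helper functions are assumptions chosen for illustration; the abstract does not state which models its general class contains.

```python
import math
import random

def sample_poisson(rate, rng):
    """Draw a Poisson variate by Knuth's multiplication method (suits small rates)."""
    threshold = math.exp(-rate)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k

def simulate(omega, alpha, beta, n, seed=0):
    """Simulate y_t ~ Poisson(lam_t) with lam_t = omega + alpha*y_{t-1} + beta*lam_{t-1}.

    Assumes omega > 0, alpha >= 0, beta >= 0 and alpha + beta < 1, the usual
    stationarity condition for this linear specification.
    """
    rng = random.Random(seed)
    lam = omega / (1.0 - alpha - beta)  # start at the stationary mean
    ys, lams = [], []
    for _ in range(n):
        y = sample_poisson(lam, rng)
        ys.append(y)
        lams.append(lam)
        # Observation-driven update: the next mean is a function of past data only.
        lam = omega + alpha * y + beta * lam
    return ys, lams

ys, lams = simulate(omega=1.0, alpha=0.3, beta=0.2, n=20000)
print("sample mean:", sum(ys) / len(ys), "stationary mean:", 1.0 / (1.0 - 0.3 - 0.2))
```

Because the conditional mean is a deterministic function of past observations, the likelihood can be evaluated recursively, which is what makes quasi-maximum likelihood estimation practical for this kind of model.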
Profile graphical models
This thesis concerns the theory and the inference of a new class of independence models based on a graphical representation that we name profile graphs. Multiple graph models are special cases in this class and the compatibility in terms of independence structure is derived with respect to chain graph models of different types. Inference and model selection based on both Lasso methodology and Bayesian theory are studied and implemented. The thesis is composed of four chapters. In the first chapter, we present a literature review of multiple and chain graphs. Markov properties, parameterization and inference are reviewed for undirected, bidirected, LWF chain and regression graphs. In the second chapter, a class of profile graphs is introduced for modelling the effect of an external factor on the independence structure of a multivariate set of variables. Conditional and marginal independence structures are explored by using profile undirected and bidirected graphical models, respectively. These two families of graphical models are formally defined with their corresponding Markov properties. Furthermore, necessary conditions are derived to induce, for any profile undirected and bidirected graph model, a compatible class of chain graph models of different type known as LWF chain graph and regression graph, respectively. In the third chapter, we propose two Bayesian approaches for the selection of Ising models associated with multiple undirected graphs. We devise a Bayesian exact-likelihood inference for low-dimensional binary response data, based on conjugate priors for log-linear parameters. We also propose a quasi-likelihood Bayesian approach for fitting high-dimensional multiple Ising graphs, where the normalization constant is computationally intractable. In both methods, we define a Markov Random Field prior on the graph structures, which encourages the selection of the same edges in related graphs. Finally, in the fourth chapter we present some final remarks on Chapters 2 and 3.
Stationarity of a general class of observation driven models for discrete valued processes
