On a normalized random measure with independent increments relevant to Bayesian nonparametric inference
Conjugacy as a distinctive feature of the Dirichlet process
Recently the class of normalized random measures with independent increments, which contains the Dirichlet process as a particular case, has been introduced. Here a new technique for deriving moments of these random probability measures is proposed. It is shown that, a priori, most of the appealing properties featured by the Dirichlet process are preserved. When passing to posterior computations, we obtain a characterization of the Dirichlet process as the only conjugate member of the whole class of normalized random measures with independent increments
Bayesian nonparametric analysis for a generalized Dirichlet process prior
This paper considers a generalization of the Dirichlet process which is obtained by suitably normalizing superposed independent gamma processes with increasing integer-valued scale parameters. A comprehensive treatment of this random probability measure is provided. We prove results concerning its finite-dimensional distributions, moments, predictive distributions and the distribution of its mean. Most expressions are given in terms of multiple hypergeometric functions, thus highlighting the interplay between Bayesian Nonparametrics and special functions. Finally, a suitable simulation algorithm is applied in order to compute quantities of statistical interest
Bayesian nonparametric estimation of the probability of discovering new species
We consider the problem of evaluating the probability of discovering a certain number of new species in a new sample of population units, conditional on the number of species recorded in a basic sample. We use a Bayesian nonparametric approach. The different species proportions are assumed to be random and the observations from the population exchangeable. We provide a Bayesian estimator, under quadratic loss, for the probability of discovering new species which can be compared with well-known frequentist estimators. The results we obtain are illustrated through a numerical example and an application to a genomic dataset concerning the discovery of new genes by sequencing additional single-read sequences of cDNA fragments
Hierarchical mixture modelling with normalized inverse Gaussian priors
In recent years the Dirichlet process prior has enjoyed great success in the context of Bayesian mixture modeling. The idea of overcoming the discreteness of its realizations by exploiting it in hierarchical models, combined with the development of suitable sampling techniques, represents one of the reasons for its popularity. In this article we propose the normalized inverse-Gaussian (N–IG) process as an alternative to the Dirichlet process to be used in Bayesian hierarchical models. The N–IG prior is constructed via its finite-dimensional distributions. This prior, although sharing the discreteness property of the Dirichlet prior, is characterized by a more elaborate and sensible clustering which makes use of all the information contained in the data. Whereas in the Dirichlet case the mass assigned to each observation depends solely on the number of times that it occurred, for the N–IG prior the weight of a single observation depends heavily on the whole number of ties in the sample. Moreover, expressions corresponding to relevant statistical quantities, such as a priori moments and the predictive distributions, are as tractable as those arising from the Dirichlet process. This implies that well-established sampling schemes can be easily extended to cover hierarchical models based on the N–IG process. The mixture of N–IG process and the mixture of Dirichlet process are compared using two examples involving mixtures of normals
A Bayesian Nonparametric approach for comparing clustering structures in EST libraries
Inference for Expressed Sequence Tags (ESTs) data is considered. We focus on evaluating the redundancy of a cDNA library and, more importantly, on comparing different libraries on the basis of their clustering structure. The numerical results we achieve allow us to assess the effect of an error correction procedure for EST data and to study the compatibility of single EST libraries with respect to merged ones. The proposed method is based on a Bayesian nonparametric approach that makes it possible to understand the clustering mechanism that generates the observed data. As a specific nonparametric model we use the two-parameter Poisson–Dirichlet (PD) process. The PD process represents a tractable nonparametric prior which is a natural candidate for modeling data arising from discrete distributions. It allows prediction and testing in order to analyze the clustering structure featured by the data. We show how a full Bayesian analysis can be performed and describe the corresponding computational algorithm
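As a rough illustration of the clustering mechanism behind the two-parameter Poisson–Dirichlet process mentioned above, the sketch below uses its standard sequential predictive rule: given n observations grouped into k clusters with sizes n_1, ..., n_k, the next observation is new with probability (theta + sigma*k)/(theta + n) and joins cluster j with probability (n_j - sigma)/(theta + n). The function names, the seed argument and the restriction to 0 <= sigma < 1 and theta > 0 are assumptions made for this example only; this is not the computational algorithm described in the abstract.

```python
import random


def pd_predictive(counts, sigma, theta):
    """Predictive probabilities of the two-parameter PD process.

    counts: current cluster sizes n_1..n_k (illustrative input format).
    Returns (probability of a new cluster, list of probabilities of
    joining each existing cluster). Assumes 0 <= sigma < 1, theta > 0.
    """
    n = sum(counts)
    k = len(counts)
    prob_new = (theta + sigma * k) / (theta + n)
    probs_existing = [(c - sigma) / (theta + n) for c in counts]
    return prob_new, probs_existing


def sample_partition(n, sigma, theta, seed=0):
    """Sequentially allocate n observations and return the cluster sizes."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n):
        _, probs_existing = pd_predictive(counts, sigma, theta)
        u = rng.random()
        acc = 0.0
        for j, p in enumerate(probs_existing):
            acc += p
            if u < acc:
                counts[j] += 1  # observation joins existing cluster j
                break
        else:
            counts.append(1)  # observation opens a new cluster
    return counts


if __name__ == "__main__":
    sizes = sample_partition(200, sigma=0.5, theta=1.0)
    print(len(sizes), "clusters; sizes:", sorted(sizes, reverse=True))
```

Setting sigma = 0 recovers the Dirichlet process case, in which the weight of an existing cluster depends only on its size and the probability of a new cluster does not depend on k.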
An asymptotic analysis of a class of discrete nonparametric priors
In this paper we analyze the asymptotic behaviour of Gibbs-type priors, which represent a natural generalization of the Dirichlet process. After determining their topological support, we investigate their consistency according to the “what if”, or frequentist, approach, which postulates the existence of a “true” distribution P_0. We provide a full taxonomy of their limiting behaviors: consistency holds essentially always for discrete P_0, whereas inconsistency may occur for diffuse P_0. Such findings are further illustrated by means of three special cases admitting closed form expressions and exhibiting a wide range of asymptotic behaviors. For both Gibbs-type priors and discrete nonparametric priors in general, the possible inconsistency should not be interpreted as evidence against their use tout court. It rather represents an indication that they are designed for modeling discrete distributions and evidence against their use in the case of diffuse P_0
Robustifying Bayesian nonparametric mixtures for count data
Our motivating application stems from surveys of natural populations and is characterized by large spatial heterogeneity in the counts, which makes parametric approaches to modeling local animal abundance too restrictive. We adopt a Bayesian nonparametric approach based on mixture models and innovate with respect to the popular Dirichlet process mixture of Poisson kernels by increasing the model flexibility at the level of both the kernel and the nonparametric mixing measure. This allows us to derive accurate and robust estimates of the distribution of local animal abundance and of the corresponding clusters. The application and a simulation study for different scenarios also yield some general methodological implications. Adding flexibility solely at the level of the mixing measure does not improve inferences, since its impact is severely limited by the rigidity of the Poisson kernel with considerable consequences in terms of bias. However, once a kernel more flexible than the Poisson is chosen, inferences can be robustified by choosing a prior more general than the Dirichlet process. Therefore, to improve the performance of Bayesian nonparametric mixtures for count data one has to enrich the model simultaneously at both levels, the kernel and the mixing measure
Bayesian clustering in nonparametric hierarchical mixture models.
The problem of estimating the number of components of the mixture that generated a data set can be addressed in a Bayesian framework by means of nonparametric mixture models. The development of appropriate simulation methods has encouraged the adoption, in applied work, of hierarchical mixture models based on the Dirichlet process. Here we study nonparametric models that are alternatives to the Dirichlet process mixture and compare the different clustering structures they induce on the data. The analysis is completed by the study of two real data sets.
A note on the simulation of Lévy processes with a view towards applications
In recent years several techniques for simulating purely discontinuous Lévy processes have been developed. Due to the "infinite activity" character of most Lévy processes the achievement of satisfactory approximations is not a trivial issue. By means of two examples, one related to an optimal storage problem and the other to Bayesian nonparametric inference, the behavior of the so-called inverse Lévy measure algorithm is studied. Some hints for overcoming possible difficulties are given
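To make the idea of the inverse Lévy measure algorithm concrete, here is a minimal sketch for one case where the inversion is available in closed form: a stable subordinator with Lévy density c * x**(-1 - alpha) on (0, inf), 0 < alpha < 1. Its tail mass is N(x) = (c / alpha) * x**(-alpha), so the i-th largest jump is obtained by solving N(J_i) = G_i, where G_1 < G_2 < ... are the arrival times of a unit-rate Poisson process. The choice of this process, the function name and the parameters are assumptions for illustration; the abstract does not say which processes its two examples use, and for other Lévy measures (for example the gamma process) the inversion generally has to be done numerically.

```python
import random


def inverse_levy_stable_jumps(alpha, c, n_jumps, seed=0):
    """Return the n_jumps largest jumps, in decreasing order, of a stable
    subordinator with Levy density c * x**(-1 - alpha) over a unit interval.

    Hypothetical helper: truncating at n_jumps is what makes this an
    approximation of the infinite-activity process.
    """
    rng = random.Random(seed)
    jumps = []
    arrival = 0.0
    for _ in range(n_jumps):
        # Arrival times of a unit-rate Poisson process (sums of Exp(1) gaps).
        arrival += rng.expovariate(1.0)
        # Invert the tail mass: (c / alpha) * x**(-alpha) = arrival.
        jumps.append((alpha * arrival / c) ** (-1.0 / alpha))
    return jumps


if __name__ == "__main__":
    jumps = inverse_levy_stable_jumps(alpha=0.5, c=1.0, n_jumps=1000)
    total = sum(jumps)
    weights = [j / total for j in jumps]  # normalized jumps as random weights
    print("largest weights:", [round(w, 4) for w in weights[:5]])
```

Because the jumps come out in decreasing order, the truncation discards only the smallest ones; how many terms are needed for a satisfactory approximation depends on how quickly the tail mass grows near zero, which is the kind of difficulty the abstract alludes to.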
