
    Bayesian analysis of Amazon’s best-selling books via finite nested mixture models

    Online shopping has become increasingly common in recent years and has influenced how we form our preferences and choose the items to buy. This influence also applies to the books we read: other readers’ online reviews are one of the most used tools to determine the next book we will buy. The increasing use of e-commerce websites has also led to a large availability of data to study how the users’ ratings interact with other variables. Here, we consider a dataset of Amazon’s best-selling books in the period 2009-2019. In particular, we study the similarities of the distributions of ratings and prices across different years. To fully capture the complexity of the observed data, we make use of flexible Bayesian nested mixture models to simultaneously avoid strict parametric assumptions and study the clustering structure of observations and years

    Mechanical strength of adhesively bonded joints using polymeric additive manufacturing

    This paper investigates the combined use of one of the most widespread additive manufacturing techniques, fused deposition modeling, with polymeric materials and structural adhesive. The aim is twofold: first, to enhance adhesive performance by exploiting the capability of additive manufacturing to tailor the bonding surface of the adherend, and second, to overcome one of the main limitations of 3D printing, i.e. the quite small printing volume, by means of adhesive bonding. Bonding multiple parts together without loss of performance could open new possibilities for this technology. The present research analyzes, using a Design of Experiment technique, a wide set of single lap joints with two adhesives and seven different surface morphologies. The results highlight that adhesive bonding does not undermine the load-carrying capacity of the joints or their stiffness, and, in some cases, it causes a slight improvement of the peak force. The morphology of the surface plays only a small role in the performance of the system, since it cannot provide a strong mechanical interlocking of the parts due to peel stresses and because of the predominant effect of stress concentrations at the corners, which cause substrate failure.

    The generalized nested common atoms model

    Bayesian hierarchical nonparametric models offer a convenient framework for modeling nested data, where observations are organized into groups. These priors jointly accommodate the dependence among groups and among observations within the same group in a flexible way. Several recent instances of such models have combined nested levels of Dirichlet processes and a common sequence of atoms, a formulation that allows for multi-layered partitions, i.e., a simultaneous clustering of observations and groups. However, using a common set of atoms can lead to a forced high prior correlation between the generated random measures. This characteristic can cause shortcomings in the clustering results and even biased density estimation. Extending the nested process with more general stick-breaking specifications for the weights alleviates these issues. Specifically, the proposed generalized Common Atoms Model enhances the flexibility of the dependence structure and improves density estimation. Three..
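As a rough illustration of what a stick-breaking specification for the weights looks like, the sketch below draws truncated stick-breaking weights. It is not the model from the abstract: the function name `stick_breaking`, the truncation level, and the parameters `alpha` and `sigma` are assumptions chosen for illustration. With `sigma = 0` the break proportions follow Beta(1, alpha), the Dirichlet process case; `sigma > 0` gives the Pitman-Yor form, one familiar example of a more general specification.

```python
import random

def stick_breaking(n_atoms, alpha=1.0, sigma=0.0, seed=0):
    """Draw truncated stick-breaking weights (illustrative sketch).

    Break proportions are V_k ~ Beta(1 - sigma, alpha + k * sigma).
    sigma = 0 corresponds to the Dirichlet process; 0 < sigma < 1
    corresponds to the Pitman-Yor process.
    """
    rng = random.Random(seed)
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for k in range(1, n_atoms):
        v = rng.betavariate(1.0 - sigma, alpha + k * sigma)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    # Truncation: the last atom absorbs whatever mass is left.
    weights.append(remaining)
    return weights

dp_weights = stick_breaking(10, alpha=1.0, sigma=0.0)
py_weights = stick_breaking(10, alpha=1.0, sigma=0.5)
```

Both calls return ten non-negative weights that sum to one; the two parameter settings only change how quickly the mass decays across atoms.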

    Going Beyond Counting First Authors in Author Co-citation Analysis

    The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed
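The difference between the two counting schemes can be made concrete with a small sketch. The data, the function name `cocitation_counts`, and the choice to count each author pair at most once per citing paper are all assumptions made for illustration; the study's exact counting rules are not given in the abstract.

```python
from collections import Counter
from itertools import combinations

def cocitation_counts(citing_papers, max_authors=1):
    """Count author co-citations (illustrative sketch).

    citing_papers: one reference list per citing paper; each reference
        is an ordered list of the cited work's author names.
    max_authors: how many leading authors of each cited work to count
        (1 = traditional first-author counting).
    """
    counts = Counter()
    for references in citing_papers:
        cited_authors = set()
        for authors in references:
            cited_authors.update(authors[:max_authors])
        # Assumption: a pair is counted once per citing paper.
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts

# Hypothetical data: two citing papers, each with two references.
papers = [
    [["Smith", "Lee"], ["Garcia", "Chen", "Lee"]],
    [["Smith"], ["Chen", "Garcia"]],
]
first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

In this toy data, first-author counting never links Chen and Garcia, while counting up to five authors records that pair in both citing papers.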

    THE ROLE OF INTRINSIC DIMENSION IN HIGH-RESOLUTION PLAYER TRACKING DATA—INSIGHTS IN BASKETBALL

    Following the introduction of high-resolution player tracking technology, a new range of statistical analysis has emerged in sports, specifically in basketball. However, such high-dimensional data are often challenging for statistical inference and decision making. In this article we employ a state-of-the-art Bayesian mixture model that allows the estimation of heterogeneous intrinsic dimension (ID) within a dataset, and we propose some theoretical enhancements. Informally, the ID can be seen as an indicator of complexity and dependence of the data at hand, and it is usually assumed unique. Our method provides the capacity to reveal valuable insights about the hidden dynamics of sports interactions in space and time which helps to translate complex patterns into more coherent statistics. The application of this technique is illustrated using NBA basketball players’ tracking data, allowing effective classification and clustering. In movement data the analysis identified key stages of offensive actions, such as creating space for passing, preparation/shooting, and following through which are relevant for invasion sports. We found that the ID value spikes, reaching a peak between four and eight seconds in the offensive part of the court, after which it declines. In shot charts we obtained groups of shots that produce substantially higher and lower successes. Overall, game-winners tend to have a larger intrinsic dimension, indicative of greater unpredictability and unique shot placements. Similarly, we found higher ID values in plays when the score margin is smaller rather than larger. The exploitation of these results can bring clear strategic advantages in sports games

    Nonignorable nonresponse in a multilevel framework: assessing the efficacy of job training courses

    The effectiveness of institutions that provide public services to individuals (schools, universities, training courses, hospitals) is often assessed through indicators built from user surveys. A common problem with such surveys is nonresponse, which often follows a non-random mechanism that depends in particular on the variable of interest (employment status, satisfaction, etc.). The problem is further complicated by the hierarchical structure of the data (users at the first level and operating units at the second), which requires multilevel models for polytomous data (employment status, level of satisfaction, etc.) that can account for the possibly non-random nonresponse mechanism. The model is illustrated through an application to evaluating the relative effectiveness of training providers, based on trainees' probability of employment one year after the end of the courses. The results show that ignoring nonresponse, which is often assumed to be random, can produce considerable distortions in the rankings and in comparisons among providers, entirely invalidating the conclusions drawn from them. The problem may also affect other similar surveys, such as Alma Laurea, and the employment rates derived from them. It follows that, without adequate methodological treatment of nonresponse, the indicators used to evaluate the effectiveness of providers may be unreliable.

    A two-stage Bayesian semiparametric model for novelty detection with robust prior information

    Novelty detection methods aim at partitioning the test units into already observed and previously unseen patterns. However, two significant issues arise: there may be considerable interest in identifying specific structures within the novelty, and contamination in the known classes could completely blur the actual separation between manifest and new groups. Motivated by these problems, we propose a two-stage Bayesian semiparametric novelty detector, building upon prior information robustly extracted from a set of complete learning units. We devise a general-purpose multivariate methodology that we also extend to handle functional data objects. We provide insights on the model behavior by investigating the theoretical properties of the associated semiparametric prior. From the computational point of view, we propose a suitable ξ-sequence to construct an independent slice-efficient sampler that takes into account the difference between manifest and novelty components. We showcase our model performance through an extensive simulation study and applications on both multivariate and functional datasets, in which diverse and distinctive unknown patterns are discovered.

    Variations on the Author

    “Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourses; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.

    Appropriate Similarity Measures for Author Cocitation Analysis

    We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance.
    Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
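To make the comparison between the Pearson correlation and the cosine concrete, the sketch below computes both on two hypothetical cocitation profiles. The profiles and function names are invented for illustration, and the property shown (padding both profiles with zeros changes the Pearson correlation but not the cosine) is one well-known difference between the two measures, not necessarily the argument made in the paper.

```python
import math

def cosine(x, y):
    """Cosine of the angle between two profile vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def pearson(x, y):
    """Pearson correlation: the cosine of the mean-centered vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])

# Hypothetical cocitation profiles of two authors with four other authors.
a = [4, 3, 0, 0]
b = [3, 4, 0, 0]

# Extend the field with six authors who are cocited with neither.
a_padded = a + [0] * 6
b_padded = b + [0] * 6

cos_before, cos_after = cosine(a, b), cosine(a_padded, b_padded)
r_before, r_after = pearson(a, b), pearson(a_padded, b_padded)
```

Here the cosine is identical before and after padding, while the Pearson correlation increases even though nothing about the two authors' shared cocitations has changed.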