    Probabilistic topic models for sequence data

    Probabilistic topic models are widely used in different contexts to uncover the hidden structure in large text corpora. One of the main (and perhaps strong) assumptions of these models is that the generative process is bag-of-words, i.e., each token is independent of the previous one. We extend the popular Latent Dirichlet Allocation model by exploiting three different conditional Markovian assumptions: (i) the token generation depends on the current topic and on the previous token; (ii) the topic associated with each observation depends on the topic associated with the previous one; (iii) the token generation depends on the current and previous topics. For each of these modeling assumptions we present a Gibbs Sampling procedure for parameter estimation. Experimental evaluation over real-world data shows the performance advantages, in terms of recall and precision, of the sequence-modeling approaches. © 2013 The Author(s)
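
    As an illustration of assumption (i) only, the following minimal sketch samples one document in which each token depends on its own topic and on the previous token. It is not the authors' implementation: the function name, the parameter names (`theta`, `phi_first`, `phi_bigram`), the separate distribution for the first token, and the symmetric Dirichlet draws are all assumptions made for this example, and the Gibbs sampler mentioned in the abstract is not shown.

```python
import numpy as np


def generate_bigram_topic_doc(n_tokens, theta, phi_first, phi_bigram, rng):
    """Sample one document under assumption (i): token ~ p(w | topic, previous token).

    All names are hypothetical, chosen for this sketch:
      theta      -- (K,) document-level topic proportions
      phi_first  -- (K, V) per-topic distribution for the first token
      phi_bigram -- (K, V, V); phi_bigram[k, prev] is a distribution over the next token
    """
    n_topics, vocab_size = phi_first.shape
    topics, tokens = [], []
    prev = None
    for _ in range(n_tokens):
        # As in plain LDA, the topic is drawn independently for each position.
        z = rng.choice(n_topics, p=theta)
        # The token distribution is conditioned on the previous token when one exists.
        dist = phi_first[z] if prev is None else phi_bigram[z, prev]
        w = rng.choice(vocab_size, p=dist)
        topics.append(int(z))
        tokens.append(int(w))
        prev = w
    return topics, tokens


rng = np.random.default_rng(0)
K, V = 3, 5  # toy sizes, assumed for illustration
theta = rng.dirichlet(np.ones(K))
phi_first = rng.dirichlet(np.ones(V), size=K)        # shape (K, V)
phi_bigram = rng.dirichlet(np.ones(V), size=(K, V))  # shape (K, V, V)
topics, tokens = generate_bigram_topic_doc(10, theta, phi_first, phi_bigram, rng)
```

    Dropping the dependence on `prev` (always using `phi_first[z]`) recovers the bag-of-words generative step of standard LDA, which makes the extra conditioning explicit.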

    Hierarchical latent factors for preference data

    In this work we propose a probabilistic hierarchical generative approach for users' preference data, which is designed to overcome the limitations of current methodologies in Recommender Systems and thus to achieve both prediction and recommendation accuracy. The Bayesian Hierarchical User Community Model (BH-UCM) models both the popularity of items and the distribution over item ratings. An extensive evaluation over two popular benchmark datasets shows that the combined modeling of item popularity and rating provides a powerful framework both for rating prediction and for the generation of accurate recommendation lists. Copyright (c) 2012 - Edizioni Libreria Progetto and the authors

    A probabilistic hierarchical approach for pattern discovery in collaborative filtering data (Extended Abstract)

    This paper presents a hierarchical probabilistic approach to collaborative filtering which allows the discovery and analysis of both global patterns (i.e., the tendency of some products to be universally appreciated) and local patterns (the tendency of users within a community to express a common preference on the same group of items). The core of our approach is a probabilistic co-clustering strategy, arranged in a hierarchical fashion: first, user communities are discovered, and then the information provided by each user community is used to discover topics, grouping items into categories. The experimental evaluation shows that the proposed model achieves competitive prediction accuracy with respect to state-of-the-art collaborative filtering approaches.

    Temporal Recurrent Activation Networks

    We tackle the problem of predicting whether a target user (or group of users) will be active within an event stream before a time horizon. Our solution, called PATH, leverages recurrent neural networks to learn an embedding of the past events. The embedding allows us to capture influence and susceptibility between users, and places closer (the representations of) users who frequently become active in different event streams within a small time interval. We conduct an experimental evaluation on real-world data and compare our approach with related work.

    Survival Factorization on Diffusion Networks

    In this paper we propose a survival factorization framework that models information cascades by tying together social influence patterns, topical structure and temporal dynamics. This is achieved through the introduction of a latent space which encodes: (a) the relevance of an information cascade to a topic; (b) the topical authoritativeness and the susceptibility of each individual involved in the information cascade; and (c) temporal topical patterns. By exploiting the cumulative properties of the survival function and of the likelihood of the model on a given adoption log, which records the observed activation times of users and side-information for each cascade, we show that the inference phase is linear in the number of users and in the number of adoptions. The evaluation on both synthetic and real-world data shows the effectiveness of the model in detecting the interplay between topics and social influence patterns, which ultimately provides high accuracy in predicting users' activation times. Code and data related to this chapter are available at: https://doi.org/10.6084/m9.figshare.5411341

    Knowledge discovery in databases

    The huge amount of data generated by daily-life data sources represents a big opportunity for development and advancement in several fields: scientific research, social life and industry. At the same time, analyzing these big repositories is a hard challenge, since the overload of information can overwhelm our capability to read and understand data, making it difficult to find useful pieces of information. In this discussion we give a general overview of Knowledge Discovery in Databases as a scientific discipline that provides methodologies, techniques and tools for dealing with Big Data in order to find underlying knowledge that can be exploited in decision-making processes.