    Efficient temporal mining of micro-blog texts and its application to event discovery

    In this paper we present a novel method for clustering words in micro-blogs, based on the similarity of the related temporal series. Our technique, named SAX*, uses the Symbolic Aggregate ApproXimation algorithm to discretize the temporal series of terms into a small set of levels, leading to a string for each. We then define a subset of “interesting” strings, i.e. those representing patterns of collective attention. Sliding temporal windows are used to detect co-occurring clusters of tokens with the same or similar string. To assess the performance of the method we first tune the model parameters on a 2-month 1% Twitter stream, during which a number of world-wide events of differing type and duration (sports, politics, disasters, health, and celebrities) occurred. Then, we evaluate the quality of all discovered events in a 1-year stream, “googling” with the most frequent cluster n-grams and manually assessing how many clusters correspond to published news in the same temporal slot. Finally, we perform a complexity evaluation and we compare SAX* with three alternative methods for event discovery. Our evaluation shows that SAX* is at least one order of magnitude less complex than other temporal and non-temporal approaches to micro-blog clustering. © 2015, The Author(s)
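    The discretization step described in this abstract can be illustrated with a minimal sketch of classic SAX: z-normalize a term's time series, average it over equal-width segments, and map each segment to a letter using breakpoints that split a standard normal distribution into equiprobable regions. This is not the paper's SAX* variant; the function names, the alphabet, the segment count, and the toy counts are assumptions made for illustration only.

```python
from statistics import NormalDist, mean, pstdev


def sax_word(series, n_segments, alphabet="abc"):
    """Turn a numeric series into a SAX string (classic SAX, not SAX*).

    Assumes len(series) >= n_segments.
    """
    mu, sigma = mean(series), pstdev(series)
    # z-normalize; a flat series (sigma == 0) maps to all zeros
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]
    # Piecewise Aggregate Approximation: mean of each equal-width segment
    n = len(z)
    paa = [
        mean(z[i * n // n_segments:(i + 1) * n // n_segments])
        for i in range(n_segments)
    ]
    # Breakpoints that cut N(0, 1) into len(alphabet) equiprobable regions
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]
    # Each segment gets the letter of the region its mean falls into
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in paa)


def group_by_word(term_series, n_segments, alphabet="abc"):
    """Group terms whose series share the same SAX string."""
    groups = {}
    for term, series in term_series.items():
        word = sax_word(series, n_segments, alphabet)
        groups.setdefault(word, []).append(term)
    return groups


# Hypothetical per-slot term counts inside one temporal window
counts = {
    "goal": [0, 0, 0, 0, 10, 10, 0, 0],
    "match": [1, 1, 1, 1, 12, 11, 1, 1],
    "coffee": [5, 5, 5, 5, 5, 5, 5, 5],
}
groups = group_by_word(counts, n_segments=4)
```

    In this toy window the two bursty terms share a string and land in the same group, while the flat term does not. The abstract additionally restricts attention to a subset of "interesting" strings and slides the window over time, which the sketch leaves out.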

    A Gendered Analysis of Leadership in Enterprise Social Networks

    The present study is concerned with the analysis of women’s leadership in a less formal work environment, such as an enterprise social network. Our aim is to answer the following research questions: RQ1: Are Enterprise Social Networks a conducive environment to support the emergence of women’s informal leadership? RQ2: If the answer to RQ1 is positive, do women actually exploit this opportunity?

    Detecting Network Leaders in Enterprises

    This paper describes an interdisciplinary study aimed at analyzing leadership in less formal collaboration environments, such as enterprise social networks (ESNs). To conduct our research, we defined a measure of network leadership which draws on organization theory and on a computational model based on multiplex networks. This model, along with a social network analysis toolkit developed in the context of the present study, enabled the systematic empirical analysis of a large ESN, as a function of gender, time, roles, and discussed topics.

    What women like: a gendered analysis of twitter users' interests based on a twixonomy

    In this paper we analyze the distribution of interests in a large population of Twitter users (the full set of 40 million users in 2009 and a sample of about 100 thousand New York users in 2014), as a function of gender. To model interests, we associate "topical" friends in users' friendship lists (friends representing an interest rather than a social relation between peers) with Wikipedia categories. A word-sense disambiguation algorithm is used for selecting the appropriate wikipage for each topical friend. Starting from the set of wikipages representing the population's interests, we extract the sub-graph of Wikipedia categories connected to these pages, and we then prune cycles to induce a directed acyclic graph, which we call the Twixonomy. We use a novel method for reducing the computational requirements of cycle detection on very large graphs. For any category at any generalization level in the Twixonomy, it is then possible to estimate the gender distribution of Twitter users interested in that category. We analyze both the population of "celebrities", i.e. male and female Twitter users with an associated wikipage, and the population of "peers", i.e. male and female users who follow celebrities.
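    The cycle-pruning step can be sketched with a plain depth-first search that drops every back edge, which is the textbook way to turn a directed graph into an acyclic one. This is not the paper's optimized method for very large graphs, which the abstract does not describe; the function name, the adjacency-dict representation, and the toy category graph are assumptions for illustration.

```python
_EXHAUSTED = object()  # sentinel marking that a node has no successors left


def prune_cycles(graph):
    """Return a copy of `graph` (dict: node -> list of successors) with
    every DFS back edge removed, so that the result is acyclic.

    Iterative DFS, so deep category chains do not hit the recursion limit.
    Which edge of a cycle is dropped depends on the traversal order.
    """
    nodes = set(graph) | {v for succs in graph.values() for v in succs}
    state = dict.fromkeys(nodes, 0)  # 0 = unvisited, 1 = on stack, 2 = done
    dag = {n: [] for n in nodes}
    for root in sorted(nodes):  # sorted only to make the result repeatable
        if state[root]:
            continue
        state[root] = 1
        stack = [(root, iter(graph.get(root, ())))]
        while stack:
            u, successors = stack[-1]
            v = next(successors, _EXHAUSTED)
            if v is _EXHAUSTED:
                state[u] = 2  # all successors handled
                stack.pop()
            elif state[v] == 1:
                continue  # back edge to a node on the current path: drop it
            else:
                dag[u].append(v)
                if state[v] == 0:
                    state[v] = 1
                    stack.append((v, iter(graph.get(v, ()))))
    return dag


# Hypothetical category graph containing one cycle
categories = {
    "Sports": ["Football"],
    "Football": ["Clubs"],
    "Clubs": ["Sports", "Teams"],
}
dag = prune_cycles(categories)
```

    Every cycle must contain at least one back edge with respect to the search, so removing all of them is sufficient. A real taxonomy would also need a policy for choosing which edge of a cycle to cut, which this sketch leaves to traversal order.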

    Recommendation of micro-blog users based on hierarchical interest profiles

    Many recent works have addressed the task of recommending to Twitter users whom they should follow, among them the WTF (Who To Follow) service provided by Twitter. Recommenders are based either on the user’s network structure, on some notion of topical similarity with other users, or on both. In this paper, we propose to accomplish the recommendation task in two steps: First, we profile users and classify them as belonging to a target community (depending e.g., on their political affiliation, preferred football team, favorite coffee shop, etc.). Then, we fine-tune recommendations for selected populations. We cast both problems of user classification and recommendation as one of itemset mining, where items are either users’ authoritative friends or semantic categories associated to friends, extracted from WiBi, the Wikipedia Bitaxonomy. In addition to evaluating our profiler and recommender on several populations, we also show that semantic categories allow for very fine-grained population studies, and make it possible to recommend not only whom to follow, but also topics of interest, users interested in the same topic, and more.

    Hashtag sense clustering based on temporal similarity

    Hashtags are creative labels used in micro-blogs to characterize the topic of a message/discussion. Regardless of the use for which they were originally intended, hashtags cannot be used as a means to cluster messages with similar content. First, because hashtags are created in a spontaneous and highly dynamic way by users in multiple languages, the same topic can be associated with different hashtags, and conversely, the same hashtag may refer to different topics in different time periods. Second, contrary to common words, hashtag disambiguation is complicated by the fact that no sense catalogs (e.g., Wikipedia or WordNet) are available; and, furthermore, hashtag labels are difficult to analyze, as they often consist of acronyms, concatenated words, and so forth. A common way to determine the meaning of hashtags has been to analyze their context, but, as we have just pointed out, hashtags can have multiple and variable meanings. In this article, we propose a temporal sense clustering algorithm based on the idea that semantically related hashtags have similar and synchronous usage patterns
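    The idea that related hashtags have similar and synchronous usage patterns can be made concrete with a small sketch that compares usage counts slot by slot. The abstract does not say which similarity measure the article uses, so Pearson correlation, the 0.9 threshold, the function names, and the toy counts below are all assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if one is flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def similar_pairs(series, threshold=0.9):
    """List hashtag pairs whose aligned usage series correlate strongly."""
    tags = sorted(series)
    return [
        (a, b)
        for i, a in enumerate(tags)
        for b in tags[i + 1:]
        if pearson(series[a], series[b]) >= threshold
    ]


# Hypothetical counts per time slot, aligned on the same slots
usage = {
    "#nba": [1, 2, 10, 9, 2, 1],
    "#nbafinals": [0, 1, 9, 10, 1, 0],
    "#oscars": [8, 9, 1, 0, 1, 2],
}
pairs = similar_pairs(usage)
```

    Because the series are compared slot by slot, two hashtags only score highly when their peaks coincide in time, which captures the "synchronous" part of the idea. Only the pair `("#nba", "#nbafinals")` passes the threshold here.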

    Large scale homophily analysis in twitter using a twixonomy

    In this paper we perform a large-scale homophily analysis on Twitter using a hierarchical representation of users' interests which we call a Twixonomy. In order to build a population, community, or single-user Twixonomy we first associate "topical" friends in users' friendship lists (i.e. friends representing an interest rather than a social relation between peers) with Wikipedia categories. A word-sense disambiguation algorithm is used to select the appropriate wikipage for each topical friend. Starting from the set of wikipages representing "primitive" interests, we extract all paths connecting these pages with topmost Wikipedia category nodes, and we then prune the resulting graph G efficiently so as to induce a directed acyclic graph. This graph is the Twixonomy. Then, to analyze homophily, we compare different methods to detect communities in a peer friends Twitter network, and then for each community we compute the degree of homophily on the basis of a measure of pairwise semantic similarity. We show that the Twixonomy provides a means for describing users' interests in a compact and readable way and allows for a fine-grained homophily analysis. Furthermore, we show that mid-low level categories in the Twixonomy represent the best balance between informativeness and compactness of the representation.
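    The per-community homophily score mentioned in this abstract can be sketched as the mean pairwise similarity of members' interest sets. The abstract does not specify the semantic similarity measure, so Jaccard overlap of category sets stands in for it here; the function names and the toy data are assumptions.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def community_homophily(interests, community):
    """Mean pairwise similarity of the members' interest sets.

    `interests` maps a user to a set of categories; `community` is an
    iterable of users. Returns 0.0 for communities with fewer than two users.
    """
    pairs = list(combinations(sorted(community), 2))
    if not pairs:
        return 0.0
    total = sum(jaccard(interests[u], interests[v]) for u, v in pairs)
    return total / len(pairs)


# Hypothetical users with categories taken at one level of the hierarchy
interests = {
    "u1": {"Jazz", "Football"},
    "u2": {"Jazz", "Football"},
    "u3": {"Jazz"},
}
score = community_homophily(interests, ["u1", "u2", "u3"])
```

    Replacing each user's categories with their ancestors at a chosen level of the hierarchy before scoring is one way to explore the trade-off between informativeness and compactness that the abstract reports.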