1,721,050 research outputs found

    The complexity of LSH feasibility

    No full text
    In this paper we study the complexity of the following feasibility problem: given an n × n similarity matrix S as input, is there a locality sensitive hash (LSH) for S? We show that the LSH feasibility problem is NP-hard even in the following strong promise version: either S admits an LSH or S is at l1-distance at least n2 - ε{lunate} from every similarity that admits an LSH. We complement this hardness result by providing an over(O, ̃) (3n) algorithm for the LSH feasibility problem, which improves upon the naïve nΘ (n) time algorithm; we prove that this running time is tight, modulo constants, under the Exponential Time Hypothesis. © 2014 Elsevier B.V. All rights reserved

    Identification and measurement of social influence and correlation

    No full text
    Techniques for detecting social influence between users in a set of users with regard to an activity are provided. Data for each user of the set of users is received that includes a time value at which the user became active with regard to the activity, and includes at least one indication of another user in the set of users associated with the user A first estimate of social correlation in the set of users is determined based on the data. The data is modified. For instance, the data may be modified according to a shuffle test and/or an edge reversal test. A decond estimate of social correlation in the set of users is determined based on the modified data. The first estimate is compared to the second estimate to determine a degree of social influence in the set of users

    Sort me if you can: How to sort dynamic data

    No full text
    We formulate and study a new computational model for dynamic data. In this model the data changes gradually and the goal of an algorithm is to compute the solution to some problem on the data at each time step, under the constraint that it only has a limited access to the data each time. As the data is constantly changing and the algorithm might be unaware of these changes, it cannot be expected to always output the exact right solution; we are interested in algorithms that guarantee to output an approximate solution. In particular, we focus on the fundamental problems of sorting and selection, where the true ordering of the elements changes slowly. We provide algorithms with performance close to the optimal in expectation and with high probability. © 2009 Springer Berlin Heidelberg

    Submodular stochastic probing on matroids

    Full text link
    In a stochastic probing problem we are given a universe E, where each element e in E is active independently with probability p in [0,1], and only a probe of e can tell us whether it is active or not. On this universe we execute a process that one by one probes elements - if a probed element is active, then we have to include it in the solution, which we gradually construct. Throughout the process we need to obey inner constraints on the set of elements taken into the solution, and outer constraints on the set of all probed elements. This abstract model was presented in [Gupta and Nagaraja, IPCO 2013], and provides a unified view of a number of problems. Thus far all the results in this general framework pertain only to the case in which we are maximizing a linear objective function of the successfully probed elements. In this paper we generalize the stochastic probing problem by considering a monotone submodular objective function. We give a (1-1/e)/(k_in+k_out+1)-approximation algorithm for the case in which we are given k_in greater than 0 matroids as inner constraints and k_out greater than 1 matroids as outer constraints. There are two main ingredients behind this result. First is a previously unpublished stronger bound on the continuous greedy algorithm due to Vondrak. Second is a rounding procedure that also allows us to obtain an improved 1/(k_in+k_out)-approximation for linear objective functions

    Sorting and selection on dynamic data

    No full text
    We formulate and study a new computational model for dynamic data. In this model, the data changes gradually and the goal of an algorithm is to compute the solution to some problem on the data at each time step, under the constraint that it only has limited access to the data each time. As the data is constantly changing and the algorithm might be unaware of these changes, it cannot be expected to always output the exact right solution; we are interested in algorithms that guarantee to output an approximate solution. In particular, we focus on the fundamental problems of sorting and selection, where the true ordering of the elements changes slowly. We provide algorithms with performance close to the optimal in expectation and with high probability. (C) 2010 Elsevier B.V. All rights reserved

    Event Detection via Communication Pattern Analysis

    No full text
    Social media applications such as Twitter provide a powerful medium through which users can communicate their observations with friends and with the world at large. We have witnessed live reporting of many events, from soccer games in Johannesburg to revolutions in Cairo and Tunis, and these reports have in many ways rivaled the content provided by the official media. Tapping into this valuable resource is a challenge, due to the heterogeneity and noise inherent in real-time text, diversity of languages, and fast-evolving linguistic norms. In this paper we seek to analyze a tweet stream to au-tomatically discover points in time when an important event happens, and to classify such events based on the type of the sentiments they evoke, using only non-textual features of the tweeting pattern. This results not only in a robust way of an-alyzing tweet streams independent of the languages used; it also provides insights about how users behave on social me-dia websites. For example, we observe that users often re-act to an exciting external event by decreasing the volume of communication with other users. We explain this effect through a model of how users switch between producing in-formation or sentiments and sharing others ’ news or senti-ments. We develop and evaluate our models and algorithms using several Twitter data sets, focusing in particular on the tweets sent during the soccer World Cup of 2010. This data set has the feature that the underlying ground truth is well-defined and known whereby goals serve as events

    Going Beyond Counting First Authors in Author Co-citation Analysis

    Full text link
    The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed

    Algorithms on evolving graphs

    Full text link
    Motivated by applications that concern graphs that are evolving and massive in nature, we define a new general framework for computing with such graphs. In our framework, the graph changes over time and an algorithm can only track these changes by explicitly probing the graph. This framework captures the inherent tradeoff between the complexity of maintaining an up-to-date view of the graph and the quality of results computed with the available view. We apply this framework to two classical graph connectivity problems, namely, path connectivity and minimum spanning trees, and obtain efficient algorithms. © 2012 ACM

    Variations on the Author

    Full text link
    “Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer that rather than express opinions propels discourses; an interviewer that is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent form the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship
    corecore