1,721,066 research outputs found

    Differentially Oblivious Database Joins: Overcoming the Worst-Case Curse of Fully Oblivious Algorithms

    Full text link
    Numerous high-profile works have shown that access patterns to even encrypted databases can leak secret information and sometimes even lead to reconstruction of the entire database. To thwart access pattern leakage, the literature has focused on oblivious algorithms, where obliviousness requires that the access patterns leak nothing about the input data. In this paper, we consider the Join operator, an important database primitive that has been extensively studied and optimized. Unfortunately, any fully oblivious Join algorithm would require always padding the result to the worst-case length which is quadratic in the data size N. In comparison, an insecure baseline incurs only O(R + N) cost where R is the true result length, and in the common case in practice, R is relatively short. As a typical example, when R = O(N), any fully oblivious algorithm must inherently incur a prohibitive, N-fold slowdown relative to the insecure baseline. Indeed, the (non-private) database and algorithms literature invariably focuses on studying the instance-specific rather than worst-case performance of database algorithms. Unfortunately, the stringent notion of full obliviousness precludes the design of efficient algorithms with non-trivial instance-specific performance. To overcome this worst-case performance barrier of full obliviousness and enable algorithms with good instance-specific performance, we consider a relaxed notion of access pattern privacy called (ε, δ)-differential obliviousness (DO), originally proposed in the seminal work of Chan et al. (SODA'19). Rather than insisting that the access patterns leak no information whatsoever, the relaxed DO notion requires that the access patterns satisfy (ε, δ)-differential privacy. We show that by adopting the relaxed DO notion, we can obtain efficient database Join mechanisms whose instance-specific performance approximately matches the insecure baseline, while still offering a meaningful notion of privacy to individual users. Complementing our upper bound results, we also prove new lower bounds regarding the performance of any DO Join algorithm. Differential obliviousness (DO) is a new notion and is a relatively unexplored territory. Following the pioneering investigations by Chan et al. and others, our work is among the very first to formally explore how DO can help overcome the worst-case performance curse of full obliviousness; moreover, we motivate our work with database applications. Our work shows new evidence why DO might be a promising notion, and opens up several exciting future directions

    MPC for MPC: Secure Computation on a Massively Parallel Computing Architecture

    Full text link
    Massively Parallel Computation (MPC) is a model of computation widely believed to best capture realistic parallel computing architectures such as large-scale MapReduce and Hadoop clusters. Motivated by the fact that many data analytics tasks performed on these platforms involve sensitive user data, we initiate the theoretical exploration of how to leverage MPC architectures to enable efficient, privacy-preserving computation over massive data. Clearly if a computation task does not lend itself to an efficient implementation on MPC even without security, then we cannot hope to compute it efficiently on MPC with security. We show, on the other hand, that any task that can be efficiently computed on MPC can also be securely computed with comparable efficiency. Specifically, we show the following results: - any MPC algorithm can be compiled to a communication-oblivious counterpart while asymptotically preserving its round and space complexity, where communication-obliviousness ensures that any network intermediary observing the communication patterns learn no information about the secret inputs; - assuming the existence of Fully Homomorphic Encryption with a suitable notion of compactness and other standard cryptographic assumptions, any MPC algorithm can be compiled to a secure counterpart that defends against an adversary who controls not only intermediate network routers but additionally up to 1/3 - η fraction of machines (for an arbitrarily small constant η) - moreover, this compilation preserves the round complexity tightly, and preserves the space complexity upto a multiplicative security parameter related blowup. As an initial exploration of this important direction, our work suggests new definitions and proposes novel protocols that blend algorithmic and cryptographic techniques

    Perfectly Oblivious (Parallel) RAM Revisited, and Improved Constructions

    Full text link
    Oblivious RAM (ORAM) is a technique for compiling any RAM program to an oblivious counterpart, i.e., one whose access patterns do not leak information about the secret inputs. Similarly, Oblivious Parallel RAM (OPRAM) compiles a parallel RAM program to an oblivious counterpart. In this paper, we care about ORAM/OPRAM with perfect security, i.e., the access patterns must be identically distributed no matter what the program’s memory request sequence is. In the past, two types of perfect ORAMs/OPRAMs have been considered: constructions whose performance bounds hold in expectation (but may occasionally run more slowly); and constructions whose performance bounds hold deterministically (even though the algorithms themselves are randomized). In this paper, we revisit the performance metrics for perfect ORAM/OPRAM, and show novel constructions that achieve asymptotical improvements for all performance metrics. Our first result is a new perfectly secure OPRAM scheme with O(log³ N/log log N) expected overhead. In comparison, prior literature has been stuck at O(log³ N) for more than a decade. Next, we show how to construct a perfect ORAM with O(log³ N/log log N) deterministic simulation overhead. We further show how to make the scheme parallel, resulting in an perfect OPRAM with O(log⁴ N/log log N) deterministic simulation overhead. For perfect ORAMs/OPRAMs with deterministic performance bounds, our results achieve subexponential improvement over the state-of-the-art. Specifically, the best known prior scheme incurs more than √N deterministic simulation overhead (Raskin and Simkin, Asiacrypt'19); moreover, their scheme works only for the sequential setting and is not amenable to parallelization. Finally, we additionally consider perfect ORAMs/OPRAMs whose performance bounds hold with high probability. For this new performance metric, we show new constructions whose simulation overhead is upper bounded by O(log³ /log log N) except with negligible in N probability, i.e., we prove high-probability performance bounds that match the expected bounds mentioned earlier

    Unraveling Universally Closest Refinements via Symmetric Density Decomposition and Fisher Market Equilibrium

    Full text link
    We investigate the closest distribution refinements problem, which involves a vertex-weighted bipartite graph as input, where the vertex weights on each side sum to 1 and represent a probability distribution. A refinement of one side’s distribution is an edge distribution that corresponds to distributing the weight of each vertex from that side to its incident edges. The objective is to identify a pair of distribution refinements for both sides of the bipartite graph such that the two edge distributions are as close as possible with respect to a specific divergence notion. This problem is a generalization of transportation, in which the special case occurs when the two closest distributions are identical. The problem has recently emerged in the context of composing differentially oblivious mechanisms. Our main result demonstrates that a universal refinement pair exists, which is simultaneously closest under all divergence notions that satisfy the data processing inequality. Since differential obliviousness can be examined using various divergence notions, such a universally closest refinement pair offers a powerful tool in relation to such applications. We discover that this pair can be achieved via locally verifiable optimality conditions. Specifically, we observe that it is equivalent to the following problems, which have been traditionally studied in distinct research communities: (1) hypergraph density decomposition, and (2) symmetric Fisher Market equilibrium. We adopt a symmetric perspective of hypergraph density decomposition, in which hyperedges and nodes play equivalent roles. This symmetric decomposition serves as a tool for deriving precise characterizations of optimal solutions for other problems and enables the application of algorithms from one problem to another. This connection allows existing algorithms for computing or approximating the Fisher market equilibrium to be adapted for all the aforementioned problems. For example, this approach allows the well-known iterative proportional response process to provide approximations for the corresponding problems with multiplicative error in distributed settings, whereas previously, only absolute error had been achieved in these contexts. Our study contributes to the understanding of various problems within a unified framework, which may serve as a foundation for connecting other problems in the future

    Going Beyond Counting First Authors in Author Co-citation Analysis

    Full text link
    The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed

    Variations on the Author

    Full text link
    “Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer that rather than express opinions propels discourses; an interviewer that is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent form the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship

    Appropriate Similarity Measures for Author Cocitation Analysis

    Full text link
    We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance.information science;Pearson correlation;cosine;similarity measure;author cocitation analysis

    Dispelling the Myths Behind First-author Citation Counts

    Full text link
    We conducted a full-scale evaluative citation analysis study of scholars in the XML research field to explore just how different from each other author rankings resulting from different citation counting methods actually are, and to demonstrate the capability of emerging data and tools on the Web in supporting more realistic citation counting methods. Our results contest some common arguments for the continued use of first-author citation counts in the evaluation of scholars, such as high correlations between author rankings by first-author citation counts and other citation counting methods, and high costs of using more realistic citation counting methods that are not well-supported by the ISI databases. It is argued that increasingly available digital full text research papers make it possible for citation analysis studies to go beyond what the ISI databases have directly supported and to employ more sophisticated methods
    corecore