
    Efficient multi-source RTP stream relaying in overlay networks

    P2P streaming systems exploit the high scalability of P2P networks to propagate multimedia streams over wide areas in a distributed fashion. A P2P network is made up of several logical connections among peer nodes, which can form a mesh or a tree overlay. Interactive real-time streaming imposes very strict delay constraints: in e-learning environments, for instance, only a high level of synchrony with the live event allows users to put questions to the lecturer in real time. Despite their resilience to peer churn, mesh overlays suffer from a high and variable delay, which hampers their adoption in interactive applications. Playback discontinuity, caused by missing chunks, may be a serious problem too, since it may compromise speech intelligibility. By contrast, multi-tree MDC-based overlays can guarantee smooth playback continuity, since a stream can still be played back when some descriptions are lost. The queuing and processing delays of the nodes should be kept under control to reduce the overall end-to-end delay on each tree branch. In this work we describe a framework for the construction of reliable tree overlays: we designed a smooth switch-to-fallback mechanism to quickly replace nodes leaving the overlay, a channel for packet recovery and a very efficient engine that collects MDC descriptions coming from parent nodes. The experimental data reported in the last section reveal low CPU usage and a very low processing delay in collecting MDC descriptions.

    AFQN: approximate Qn estimation in data streams

    We present AFQN (Approximate Fast Qn), a novel algorithm for approximate computation of the Qn scale estimator in a streaming setting, in the sliding window model. It is well known that computing the Qn estimator exactly may be too costly for some applications, and the problem is exacerbated in the streaming setting, in which the time available to process incoming data stream items is short. In this paper we show how to efficiently and accurately approximate the Qn estimator. As an application, we show the use of AFQN for fast detection of outliers in data streams. In particular, the outliers are detected in the sliding window model, with a simple check based on the Qn scale estimator. Extensive experimental results on synthetic and real datasets confirm the validity of our approach, showing update rates up to three times faster. Our contributions are the following: (i) to the best of our knowledge, we present the first approximation algorithm for online computation of the Qn scale estimator in a streaming setting and in the sliding window model; (ii) we show how to take advantage of our UDDSketch algorithm for quantile estimation in order to quickly compute the Qn scale estimator; (iii) as an example of a possible application of the Qn scale estimator, we discuss how to detect outliers in an input data stream.
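
For reference, the quantity being approximated can be written down directly. The sketch below is a naive, exact computation of Qn following the standard definition: the k-th smallest pairwise absolute difference, with k = C(h, 2) and h = floor(n/2) + 1, scaled by a consistency constant of about 2.2219 for Gaussian data. It is not AFQN; the function name `qn_scale` is an illustrative assumption, and the quadratic number of pairwise differences suggests why exact recomputation is costly on a stream.

```python
from math import comb


def qn_scale(values, consistency=2.2219):
    """Naive exact Qn: scaled k-th smallest pairwise absolute difference.

    Illustrative only (hypothetical helper, not the AFQN algorithm).
    Builds all n*(n-1)/2 differences, so it costs O(n^2 log n) time.
    """
    n = len(values)
    if n < 2:
        raise ValueError("Qn needs at least two observations")
    h = n // 2 + 1          # roughly half of the sample
    k = comb(h, 2)          # rank of the order statistic to select
    diffs = sorted(
        abs(values[i] - values[j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    return consistency * diffs[k - 1]


print(qn_scale([1, 2, 3, 4, 5]))      # 2.2219
print(qn_scale([1, 2, 3, 4, 1000]))   # 2.2219: the extreme value does not inflate the scale
```

The second call illustrates the robustness that makes Qn attractive as a dispersion measure: replacing one observation with an extreme value leaves the estimate unchanged in this small sample.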

    Fast online computation of the Qn estimator with applications to the detection of outliers in data streams

    We present FQN (Fast Qn), a novel algorithm for online computation of the Qn scale estimator. The algorithm works in the sliding window model, computing the Qn scale estimator in the current window. We thoroughly compare our algorithm for online Qn with the state-of-the-art competing algorithm by Nunkesser et al., and show that FQN (i) is faster, requiring only O(s) time in the worst case, where s is the length of the window; (ii) has a computational complexity that does not depend on the input distribution; and (iii) requires less space. To the best of our knowledge, our algorithm is the first that allows online computation of the Qn scale estimator in worst-case time linear in the size of the window. As an example of a possible application, besides its use as a robust measure of statistical dispersion, we show how to use the Qn estimator for fast detection of outliers in data streams. Extensive experimental results on both synthetic and real datasets confirm the validity of our approach.
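
A minimal sketch of Qn-based outlier detection in the sliding window model follows. The abstract does not spell out the detection rule, so the check used here (flag an item whose distance from the window median exceeds `threshold` times the window's Qn) is an assumption, as are the names `detect_outliers`, `window_size` and `threshold`. The window's Qn is recomputed naively on every arrival, which is the baseline cost that an online algorithm such as FQN is designed to avoid.

```python
from collections import deque
from math import comb
from statistics import median


def qn_scale(window, consistency=2.2219):
    """Naive exact Qn of the items currently in the window (illustrative)."""
    xs = list(window)
    h = len(xs) // 2 + 1
    k = comb(h, 2)
    diffs = sorted(abs(a - b) for i, a in enumerate(xs) for b in xs[i + 1:])
    return consistency * diffs[k - 1]


def detect_outliers(stream, window_size=5, threshold=3.0):
    """Flag items far from the median of the previous `window_size` items.

    Assumed rule: |x - median| > threshold * Qn. Flagged items still enter
    the window, so the window always holds the most recent raw items.
    """
    window = deque(maxlen=window_size)
    flagged = []
    for index, x in enumerate(stream):
        if len(window) == window_size:
            scale = qn_scale(window)
            if scale > 0 and abs(x - median(window)) > threshold * scale:
                flagged.append((index, x))
        window.append(x)  # the oldest item is evicted automatically
    return flagged


readings = [10, 11, 9, 10, 12, 10, 50, 11, 9, 10]
print(detect_outliers(readings))  # [(6, 50)]
```

Because Qn is robust, the flagged value 50 can stay in the window without distorting the scale used for the items that follow it.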

    Parallel Mining of Correlated Heavy Hitters on Distributed and Shared-Memory Architectures

    We present parallel algorithms for mining Correlated Heavy Hitters from a two-dimensional data stream. In particular, we design and implement a message-passing, a shared-memory and a hybrid algorithm. To the best of our knowledge, these are the first parallel algorithms solving the problem. We show, through experimental results, that our algorithms provide very good scalability, whilst retaining the accuracy of their sequential counterparts.
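
To make the problem statement concrete, here is an exact, sequential reference computation. It assumes the usual formulation of Correlated Heavy Hitters: in a stream of (x, y) pairs of length n, a primary item x is frequent if its count exceeds `phi1 * n`, and y is correlated with a frequent x if the count of the pair (x, y) exceeds `phi2` times the count of x. The abstract does not state the definition, so the thresholds `phi1` and `phi2` and the function name are assumptions, and this exact counting is not the parallel algorithm the paper describes.

```python
from collections import Counter


def correlated_heavy_hitters(pairs, phi1, phi2):
    """Exact CHH by full counting (illustrative reference, not a sketch).

    Returns {x: [y, ...]} for every frequent primary item x and every
    secondary item y that is frequent among the pairs starting with x.
    """
    pairs = list(pairs)
    n = len(pairs)
    primary = Counter(x for x, _ in pairs)   # frequency of each x
    joint = Counter(pairs)                   # frequency of each (x, y)
    result = {}
    for x, fx in primary.items():
        if fx > phi1 * n:
            result[x] = sorted(
                y for (px, y), fxy in joint.items()
                if px == x and fxy > phi2 * fx
            )
    return result


stream = [("a", 1)] * 5 + [("a", 2)] + [("b", 1)] * 3 + [("c", 9)]
print(correlated_heavy_hitters(stream, phi1=0.25, phi2=0.5))  # {'a': [1], 'b': [1]}
```

Exact counters need memory proportional to the number of distinct pairs, which is why streaming algorithms for this problem rely on bounded-size summaries instead.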

    Data stream fusion for accurate quantile tracking and analysis

    UDDSKETCH is a recent algorithm for accurate tracking of quantiles in data streams, derived from the DDSKETCH algorithm. UDDSKETCH provides accuracy guarantees covering the full range of quantiles independently of the input distribution and greatly improves the accuracy with regard to DDSKETCH. In this paper we show how to compress and fuse two or more data streams (or datasets) by leveraging the mergeability of the UDDSKETCH data summaries. In general, two summaries on two data streams are said to be mergeable if there exists an algorithm that combines the two summaries into a single one related to the union of the two datasets, while preserving the error and size guarantees. The mergeability of a sketch enables the parallel and distributed processing of high-volume data streams, which can be compressed and fused by means of such mergeable data structures. Applications that depend on accurate quantile tracking and require parallel and/or distributed processing include estimating the latency of a web site, database query optimizers and succinctly summarizing the distribution of values occurring over a sensor network. We prove that UDDSKETCH is fully mergeable and introduce PUDDSKETCH, a parallel version of UDDSKETCH suitable for message-passing based architectures. We formally prove its correctness and compare it to a parallel version of DDSKETCH, showing through extensive experimental results that our parallel algorithm almost always outperforms the parallel DDSKETCH algorithm with regard to the overall accuracy in determining the quantiles. Moreover, we also design and implement parallel versions of both the state-of-the-art KLL and REQ sequential algorithms in order to compare PUDDSKETCH with the corresponding parallel algorithms. Our experiments clearly show that PUDDSKETCH is faster than or on par with them in parallel running time, whilst simultaneously providing greater accuracy.
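
The mergeability property can be illustrated with a deliberately simplified summary. The class below is a logarithmic-bucket histogram in the style of DDSKETCH (bucket index ceil(log_gamma(x)) with gamma = (1 + alpha) / (1 - alpha)); it has no bound on the number of buckets and omits the collapsing step that UDDSKETCH adds, so it is only a sketch of the idea under those assumptions. The class name `LogBucketSketch` and its methods are hypothetical.

```python
import math
from collections import Counter


class LogBucketSketch:
    """Simplified log-bucket quantile summary for positive values."""

    def __init__(self, alpha=0.01):
        self.alpha = alpha
        self.gamma = (1 + alpha) / (1 - alpha)
        self.buckets = Counter()   # bucket index -> number of items
        self.count = 0

    def add(self, x):
        if x <= 0:
            raise ValueError("this sketch handles positive values only")
        self.buckets[math.ceil(math.log(x, self.gamma))] += 1
        self.count += 1

    def merge(self, other):
        """Fuse another summary into this one by adding bucket counts."""
        if other.alpha != self.alpha:
            raise ValueError("summaries must share the same alpha")
        self.buckets.update(other.buckets)   # Counter.update adds counts
        self.count += other.count

    def quantile(self, q):
        if self.count == 0:
            raise ValueError("empty sketch")
        rank = q * (self.count - 1)
        seen = 0
        for index in sorted(self.buckets):
            seen += self.buckets[index]
            if seen > rank:
                # Representative value of the bucket (gamma^(i-1), gamma^i].
                return 2 * self.gamma ** index / (self.gamma + 1)


# Two "streams" summarized independently, then fused.
odd, even, whole = LogBucketSketch(), LogBucketSketch(), LogBucketSketch()
for x in range(1, 1001):
    (odd if x % 2 else even).add(x)
    whole.add(x)
odd.merge(even)
print(odd.buckets == whole.buckets)   # True
print(odd.quantile(0.5))
```

Merging by bucket-wise addition gives exactly the summary that a single pass over the union would have produced, which is what allows each worker in a parallel or distributed setting to summarize its own partition and combine the results afterwards.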

    Going Beyond Counting First Authors in Author Co-citation Analysis

    The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting: the traditional type, which counts only the first of a cited work's authors, and a non-traditional type, which takes into account the first 5 authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
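
The two counting rules can be contrasted on a toy example. In the sketch below each citing paper is a list of cited works, and each cited work is an ordered list of author names; setting `max_authors=1` gives first-author counting and `max_authors=5` gives the first-five-authors variant. The function name, the invented author names, and the choice to count each author pair at most once per citing paper (including co-authors of the same cited work) are assumptions made for illustration, not details taken from the study.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count how many citing papers co-cite each pair of authors.

    Only the first `max_authors` authors of each cited work are considered.
    """
    counts = Counter()
    for references in citing_papers:
        cited_authors = set()
        for authors in references:
            cited_authors.update(authors[:max_authors])
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts


# Two citing papers; author names are invented.
citing_papers = [
    [["Smith", "Lee"], ["Garcia"]],
    [["Smith", "Lee"], ["Chen", "Garcia"]],
]
first_author = cocitation_counts(citing_papers, max_authors=1)
first_five = cocitation_counts(citing_papers, max_authors=5)
print(first_author[("Garcia", "Smith")])   # 1
print(first_five[("Garcia", "Smith")])     # 2
```

In this toy data the second citing paper links Garcia and Smith only when non-first authors are counted, which is the kind of relationship first-author counting cannot see.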

    Variations on the Author

    “Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourse; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.

    Appropriate Similarity Measures for Author Cocitation Analysis

    We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance.
    Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
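
As a small numerical illustration of how the two measures can behave differently, the sketch below computes the cosine and the Pearson correlation of two cocitation profiles, then appends two extra authors with whom neither profile has any cocitations. The cosine is unaffected by the shared zeros, while the Pearson correlation changes. The profiles are invented, and this particular property is offered only as an example; the abstract does not say which arguments the paper relies on.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length profiles."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0:
        raise ValueError("cosine is undefined for an all-zero profile")
    return dot / norm


def pearson(u, v):
    """Pearson correlation, i.e. the cosine of the mean-centered profiles."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return cosine([a - mean_u for a in u], [b - mean_v for b in v])


# Invented cocitation profiles of two authors over three other authors.
u, v = [4, 2, 1], [2, 4, 1]
# The same profiles after adding two authors never cocited with either.
u_padded, v_padded = u + [0, 0], v + [0, 0]

print(cosine(u, v), cosine(u_padded, v_padded))     # identical values
print(pearson(u, v), pearson(u_padded, v_padded))   # the second is much larger
```

Because Pearson centers each profile on its mean, the set of authors included in the analysis affects the result even when the added authors carry no cocitation information.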

    Dispelling the Myths Behind First-author Citation Counts

    We conducted a full-scale evaluative citation analysis study of scholars in the XML research field to explore just how much author rankings produced by different citation counting methods actually differ from each other, and to demonstrate the capability of emerging data and tools on the Web in supporting more realistic citation counting methods. Our results contest some common arguments for the continued use of first-author citation counts in the evaluation of scholars, such as high correlations between author rankings by first-author citation counts and other citation counting methods, and high costs of using more realistic citation counting methods that are not well supported by the ISI databases. It is argued that increasingly available digital full-text research papers make it possible for citation analysis studies to go beyond what the ISI databases have directly supported and to employ more sophisticated methods.