    On the effective clustering of multidimensional data sequences

    In this paper, we investigate the problem of clustering multidimensional data sequences such as video streams. Each sequence is represented by a small number of hyper-rectangular clusters for subsequent indexing and similarity search processing. We present a linear clustering algorithm that guarantees the predefined level of clustering quality, and show its effectiveness via experiments on various video data sets. (C) 2001 Elsevier Science B.V. All rights reserved

    Heterogeneous image database selection on the Web

    Image databases on the Web have heterogeneous characteristics since they use different similarity measures and queries are processed depending on their own schemes. In the content-based image retrieval from distributed sites, it is crucial that the metaserver has the capability to find objects, similar to a given query object in terms of the global similarity measure, from different image databases with different local similarity measures. In this paper, we investigate the problem of finding databases, which contain more objects relevant to a given query than other databases, from many image databases dispersed on the Web. This problem is referred to as a database selection problem. We propose a new selection method to determine candidate databases. The selection of databases is based on the hybrid estimator using a few sample objects and compressed histogram information of image databases. Extensive experiments on a large number of image data demonstrate that our proposed method improves the effectiveness of distributed content-based retrieval in a heterogeneous environment. (C) 2002 Elsevier Science Inc. All rights reserved

    Space-efficient cubes for OLAP range-sum queries

    Data cubes support a powerful data analysis method called the range-sum query. The range-sum query is widely used in finding trends and in discovering relationships among attributes in diverse database applications. A range-sum query computes aggregate information over an online analytical processing (OLAP) data cube in specified query ranges. Existing techniques for range-sum queries on data cubes use an additional cube called the prefix sum cube (PC), to store the cumulative sums of data, causing a high space overhead. This space overhead not only leads to extra costs for storage devices, but also causes additional propagations of updates and longer access time on physical devices. In this paper, we present a new cube representation called 'the PC Pool', which drastically reduces the space of the PC in a large data warehouse. The PC Pool decreases the update propagation caused by the dependency between values in cells of the PC. We develop an effective algorithm, which finds dense sub-cubes from a large data cube. We perform an extensive experiment with diverse data sets, and examine the space reduction and performance of our proposed method with respect to various dimensions of the data cube and query sizes. Experimental results show that our method reduces the space of the PC while having a reasonable query performance. (C) 2003 Elsevier B.V. All rights reserved
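To make the prefix sum cube (PC) idea concrete, here is a minimal two-dimensional sketch of the general technique the abstract describes, not of the PC Pool itself. The function names `build_prefix_cube` and `range_sum` are illustrative assumptions, as is the choice of zero padding to simplify boundary handling.

```python
def build_prefix_cube(data):
    """Build a 2-D prefix sum cube with one row and column of zero padding.

    prefix[i][j] holds the sum of data[0..i-1][0..j-1].
    """
    rows, cols = len(data), len(data[0])
    prefix = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        row_sum = 0  # running sum of the current row up to column j
        for j in range(cols):
            row_sum += data[i][j]
            prefix[i + 1][j + 1] = prefix[i][j + 1] + row_sum
    return prefix


def range_sum(prefix, r1, c1, r2, c2):
    """Sum of data[r1..r2][c1..c2] (inclusive bounds) by inclusion-exclusion."""
    return (prefix[r2 + 1][c2 + 1] - prefix[r1][c2 + 1]
            - prefix[r2 + 1][c1] + prefix[r1][c1])


data = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
prefix = build_prefix_cube(data)
print(range_sum(prefix, 1, 1, 2, 2))  # sum of the lower-right 2x2 block
```

Each query reads only four cells of the PC, but the PC is as large as the data cube, and changing a single cell `data[r][c]` changes every `prefix[i][j]` with `i > r` and `j > c`. That is the space overhead and update propagation the abstract refers to.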

    Going Beyond Counting First Authors in Author Co-citation Analysis

    The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed
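The two counting schemes compared above can be sketched as follows. This is a hypothetical illustration, not the study's procedure: the function name `cocitation_counts`, the input layout (one list of cited works per citing paper, each work given as an ordered author list), and the decision to count co-authors of the same cited work as co-cited are all assumptions.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count author pairs that are cited together by the same citing paper.

    citing_papers: list of reference lists; each reference is an ordered
    list of author names. Only the first `max_authors` authors of each
    cited work are considered (1 = traditional first-author counting).
    """
    counts = Counter()
    for references in citing_papers:
        cited_authors = set()
        for authors in references:
            cited_authors.update(authors[:max_authors])
        # Each unordered pair is counted once per citing paper.
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts


papers = [
    [["A", "B"], ["C"]],   # paper 1 cites a work by A and B, and a work by C
    [["A"], ["C", "D"]],   # paper 2 cites a work by A, and a work by C and D
]
first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
print(first_author)
print(first_five)
```

With first-author counting only the pair (A, C) appears; counting up to five authors also brings in B and D, so more authors enter the co-citation matrix.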

    Variations on the Author

    “Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourses; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.

    Dynamic Update Cube for Range-Sum Queries

    Proceedings of 27th International Conference on Very Large Data Bases, September 11-14, 2001, Roma, Italy. This work was supported by the Korea Research Foundation Grant (KRF-2000-041-E00262).

    Appropriate Similarity Measures for Author Cocitation Analysis

    We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance.
    Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
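As a small illustration of how the Pearson correlation and the cosine can disagree on cocitation profiles, the sketch below computes both for two toy profiles. The abstract does not spell out its argument, so the property shown here (sensitivity to added zero entries) is only one commonly discussed difference, and the profiles and function names are assumptions. Both functions assume non-zero, non-constant input vectors.

```python
import math


def cosine(x, y):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson(x, y):
    """Pearson correlation, computed as the cosine of mean-centered vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])


# Toy cocitation profiles of two authors with respect to two other authors.
x, y = [3, 1], [1, 3]
# The same profiles after adding two authors with whom neither is co-cited.
x_padded, y_padded = x + [0, 0], y + [0, 0]

print(cosine(x, y), cosine(x_padded, y_padded))
print(pearson(x, y), pearson(x_padded, y_padded))
```

The cosine is unchanged by the added zeros, whereas the Pearson correlation moves from strongly negative to positive, because the zeros shift the means used for centering.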