1,721,229 research outputs found

    Clustering by discovery on maps

    No full text
    This paper describes an algorithm for clustering on-line learned maps, requiring no 'a priori' knowledge of the environment. It uses the nearest neighbour rule, coupled with a threshold mechanism. Experimental results are shown on real and fictitious maps. © 1992

    On the complexity of finding bounds for projection cardinalities in relational databases

    No full text
    We address the problem of deriving lower and upper bounds for the cardinality of the projections of a database relation, given a set of functional dependencies on the relation schema and measures of the cardinalities of the attributes in the schema. It is shown that deciding whether a given number is the least upper bound of a projection cardinality is an NP-complete problem, whereas determining whether the greatest lower bound and the least upper bound coincide can be easily solved in linear time. © 1992

    CICERO: An assistant for planning visits to a museum

    No full text
    In this paper we present Cicero, a system for the assisted planning of personalized itineraries to visit the Ducal Palace in Urbino, Italy. The graphic interface gives users a structured and exhaustive view of the artistic content of the museum, and enables them to choose consciously which aspects to privilege during the visit. According to the user preferences, the system derives a set of constraints and then determines a personalized itinerary by means of a heuristic path-planning algorithm

    Domains and Active Domains: What This Distinction Implies for the Estimation of Projection Sizes in Relational Databases

    No full text
    Database optimizers require statistical information about data distributions in order to evaluate result sizes and access plan costs for processing user queries. In this context, we consider the problem of estimating the size of the projections of a database relation, when measures on attribute domain cardinalities are maintained in the system. Our main theoretical contribution is a new formal model (AD), valid under the hypotheses of attribute independence and uniform distribution of attribute values, derived considering the difference between time-invariant domain (the set of values that an attribute can assume) and time-dependent “active domain” (the set of values that are actually assumed, at a certain time). Early models developed under the same assumptions are shown to be formally incorrect Since the AD model is computationally high-demanding, we also introduce an approximate, easy-to-compute model (A2D) that, unlike previous approximations, yields low errors on all the parameter space of the active domain cardinalities. Finally, we extend the A2D model to the case of nonuniform distributions and present experimental results confirming the good behavior of the model. © 1995 IEE

    On the optimal ordering of multiple-field tables

    No full text
    Consider a table of n d-dimensional records, grouped into b buckets, and a set Q = {(Qh, wh)} of weighed partial-match conditions, where wh is the relative frequency of Qh. Let n(Qh) be the number of records which satisfy Qh, and b(Qh) the number of buckets in which these records are found. The problem we consider is the individuation of the optimal ordering field which minimizes the sum of accessed buckets, B(Q) = Σhwh × b(Qh). An exact solution requires the records to be sorted according to the values of each of the d fields in turn with an overall time complexity, evaluated as a function of bucket accesses, of O(d × b log b). We present an approximate O(b) algorithm which estimates the optimal ordering without the need to sort the table at all. The algorithm makes use of a probabilistic counting technique, known as linear counting, which requires a single scan of the table. Experimental results show that in most cases the approximate solution agrees with the exact one. © 1994

    Access cost estimation for physical database design

    No full text
    In this work we propose models for access cost estimation that are suitable in the physical design of a relational database when a set of secondary indexes has to be built on some attributes of the relations. The models are tailored to deal with distinct kinds of queries (partial-match, interval, join, etc.) and are based on a measure of association, the clustering factor, which applies between an attribute and the physical location of records in a file as well as between two (sets of) attributes. The use of clustering factors and the value selectivity of a query (i.e. how many distinct values satisfy a query) allow design time models to be derived without previously needing to estimate the record selectivities (i.e. how many records satisfy a query) or the corresponding access costs of all the query instances that can occur at run time. In practice, unlike previous approaches to the problem, run time models are derived by specializing design time models, rather than vice versa. Estimation of access costs with alternative ordering criteria is also considered, and a model is proposed that allows the primary attribute to be chosen wothout the need to sort the tuples. The proposed models achieve a good tradeoff between accuracy and simplicity, without being based on restrictive assumptions as to data, and easily allow the design process to take advantage of semantic information about the application domain even if the data are not yet loaded in the database. © 1993

    Going Beyond Counting First Authors in Author Co-citation Analysis

    Full text link
    The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed
    corecore