    Project Triton : A study into delivering targeted information to an individual based on implicit and explicit data.

    The World Wide Web is frequently seen as a source of knowledge; however, much of this remains undiscovered by its users. In recent times, recommender systems (e.g. Digg and Last.fm) have attempted to bridge this gap, alerting users to previously untapped knowledge. As more socially oriented services appear on the Web (e.g. Facebook and MySpace), it has never been easier to obtain information pertaining to an individual’s interests. At present, solutions for automated data recommendation tend to be highly topic-specific (recommending only a certain topic such as news) and often only allow access to the system through monolithic interfaces. This report aims to detail the stages, from research to evaluation, involved in creating an extensible framework that will operate without the need for human intervention. The framework will feature several proof-of-concept plugins residing in a custom workflow, which target information that is useful to the user. Information will be retrieved automatically through data-gathering plugins (such as feed processing and page scraping), while users’ interests will be obtained implicitly (for example, using header information to derive location) or explicitly (taking advantage of social network APIs such as Facebook Connect). Finally, third parties will be able to integrate the framework into their own solutions using the customisable XML API (written in PHP), so that their products can provide custom user interfaces without style constraints.

    Discretizing continuous attributes in AdaBoost for text categorization

    We focus on two recently proposed algorithms in the family of "boosting"-based learners for automated text classification, ADABOOST.MH and ADABOOST.MHKR. While the former is a realization of the well-known ADABOOST algorithm specifically aimed at multilabel text categorization, the latter is a generalization of the former based on the idea of learning a committee of classifier sub-committees. Both algorithms have been among the best performers in text categorization experiments so far. A problem in the use of both algorithms is that they require documents to be represented by binary vectors, indicating presence or absence of the terms in the document. As a consequence, these algorithms cannot take full advantage of the "weighted" representations (consisting of vectors of continuous attributes) that are customary in information retrieval tasks, and that provide a much more significant rendition of the document's content than binary representations. In this paper we address the problem of exploiting the potential of weighted representations in the context of ADABOOST-like algorithms by discretizing the continuous attributes through the application of entropy-based discretization methods. We present experimental results on the Reuters-21578 text categorization collection, showing that for both algorithms the version with discretized continuous attributes outperforms the version with traditional binary representations.
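To make the idea of entropy-based discretization concrete, here is a minimal Python sketch. It is not the paper's method: the abstract does not say which discretization procedure was used, so the sketch assumes the simplest variant, a single cut point per term chosen to minimize the weighted class entropy of the two resulting intervals. The function names (`entropy`, `best_cut_point`, `binarize`) and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def best_cut_point(values, labels):
    """Return the threshold minimizing the weighted class entropy of the
    two intervals it creates, or None if all values are identical.

    Assumption: one binary split per attribute; real entropy-based methods
    may split recursively and apply a stopping criterion.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_threshold, best_score = None, float("inf")
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no boundary between equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if score < best_score:
            best_threshold, best_score = (lo + hi) / 2, score
    return best_threshold


def binarize(values, threshold):
    """Map continuous weights to the binary attribute a boosting learner expects."""
    return [1 if v > threshold else 0 for v in values]


# Hypothetical term weights for six documents and their category labels.
weights = [0.0, 0.125, 0.25, 0.5, 0.75, 1.0]
in_category = [0, 0, 0, 1, 1, 1]
threshold = best_cut_point(weights, in_category)
binary_feature = binarize(weights, threshold)
```

Under this assumption, each term's continuous weight becomes a binary attribute ("weight above the cut point or not"), which is the kind of input a learner restricted to binary vectors can consume while still reflecting some of the information in the weights.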

    Going Beyond Counting First Authors in Author Co-citation Analysis

    The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type, which counts only the first of a cited work's authors, and a non-traditional type, which takes into account the first five authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
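The difference between the two counting schemes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's procedure: the function name `cocitation_counts`, the `max_authors` parameter, and the toy data are hypothetical, and the sketch assumes that two authors are co-cited once per citing paper whose reference list contains works by both. It also counts co-authors of the same cited work as co-cited with each other, which is a design choice the abstract does not specify.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count, for each unordered author pair, how many citing papers cite both.

    citing_papers: list of reference lists; each reference is an ordered
                   list of author names for one cited work.
    max_authors:   how many leading authors of each cited work to consider
                   (1 = first-author counting, 5 = first-five counting).
    """
    counts = Counter()
    for reference_list in citing_papers:
        cited_authors = set()
        for authors in reference_list:
            cited_authors.update(authors[:max_authors])
        # Each pair is counted at most once per citing paper.
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts


# Two hypothetical citing papers, each with two cited works.
citing_papers = [
    [["A", "B"], ["C"]],
    [["A"], ["C", "D"]],
]
first_author = cocitation_counts(citing_papers, max_authors=1)
first_five = cocitation_counts(citing_papers, max_authors=5)
```

With first-author counting, only the pair (A, C) appears in this toy data; with first-five counting, authors B and D also enter the co-citation matrix, which shows how the choice of counting scheme changes the data that the later statistical analysis and mapping operate on.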

    Variations on the Author

    “Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourse; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.