Ontology-based information extraction experiences, framework, algorithms and tools
A significant portion of the information collected by enterprises and organizations resides in text documents and is thus inherently unstructured. Turning it into a structured form is the aim of Information Extraction (IE). Depending on the approach, the output of an IE process can fill forms, populate relational tables, or even be presented through an ontology. This last approach, known in the literature as Ontology Based Information Extraction (OBIE), is particularly interesting, since ontologies may facilitate integration with other corporate and external data and enable data management and governance at an abstract, conceptual level. However, although OBIE has so far been the subject of several investigations, how to exploit the reasoning abilities offered by an ontology to improve the extraction process has not yet been specifically studied. This thesis is intended to be a first step in that direction.
Starting from the experience we gained implementing OBIE systems with open-source technologies, and with the intent to address the weaknesses we encountered, we propose a formal framework for OBIE, called Ontology Based Document Spanning (OBDS). We devise our proposal by revisiting the Ontology Based Data Access (OBDA) paradigm, a sophisticated form of semantic data integration from relational databases, and by building on the investigation of Document Spanners, a recent formal study of rule-based information extraction that follows database principles. The reasoning service of main interest in OBDS, as usual in ontology based data management approaches, is Query Answering (QA). We provide an analysis of this service in different settings and propose algorithms for QA, in the spirit of OBDA. Here we show how the ontology plays a major role by mediating the extraction of information from text. To demonstrate the applicability of our approach in practice, we illustrate Mastro System-T, an OBDS tool that we have implemented using robust industrial technologies and tested on large document datasets. Last but not least, we formally treat the problem of Entity Resolution (ER), which recurs in the OBIE context, as it does in information integration approaches in general.
Arguments against the Troll
We envision an improved social Web, in which the Trolls' disruptive power is inhibited or restricted, and the content produced by and shared among community members can gain authoritativeness. We believe that argumentation theories have the potential to make a key contribution to this vision. We sketch a research path in this direction and discuss some research questions.
Exploiting Macro-actions and Predicting Plan Length in Planning as Satisfiability
The use of automatically learned knowledge for a planning domain can significantly improve the performance of a generic planner when solving a problem in this domain. In this work, we focus on the well-known SAT-based approach to planning and investigate two types of learned knowledge that have not been studied in this planning framework before: macro-actions and planning horizon. Macro-actions are sequences of actions that typically occur in the solution plans, while a planning horizon of a problem is the length of a (possibly optimal) plan solving it. We propose a method that uses a machine learning tool for building a predictive model of the optimal planning horizon, and variants of the well-known planner SatPlan and solver MiniSat that can exploit macro-actions and learned planning horizons to improve their performance. An experimental analysis illustrates the effectiveness of the proposed techniques.
An Effective Approach to Realizing Planning Programs
Planning programs are loose, high-level, declarative representations of the behavior of agents acting in a domain and following a path of goals to achieve. Such programs are specified through transition systems that can include cycles and decisions to make at certain points. We investigate a new effective approach for solving the problem of realizing a planning program, i.e., informally, for finding and combining a collection of plans that guarantee the executability of the planning program. We focus on deterministic domains and propose a general algorithm that solves the problem by exploiting a planning technique that handles goal constraints and preferences. A preliminary experimental analysis indicates that our approach dramatically outperforms the existing method based on formal verification and synthesis techniques. Copyright © 2011, Association for the Advancement of Artificial Intelligence. All rights reserved.
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
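The difference between the two counting schemes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's procedure: the function name `cocitation_counts`, the `max_authors` parameter, and the toy data are all hypothetical, and the sketch assumes that co-authors of the same cited work also count as co-cited with each other.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count how often each pair of authors is cited by the same paper.

    citing_papers: one reference list per citing paper; each reference
                   is an ordered list of author names.
    max_authors:   how many leading authors of each cited work get credit
                   (1 = first-author counting, 5 = first five authors).
    """
    counts = Counter()
    for references in citing_papers:
        # Collect every credited author cited by this paper (once each).
        cited = set()
        for authors in references:
            cited.update(authors[:max_authors])
        # Assumption: any two credited authors in the same reference list
        # are co-cited, including co-authors of a single cited work.
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: two citing papers and their reference lists.
papers = [
    [["Smith", "Lee"], ["Garcia", "Chen"]],
    [["Smith", "Lee"], ["Chen"]],
]
first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

With first-author counting, Lee never appears in any pair, whereas counting the first five authors links Lee to Smith and Chen in both citing papers.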
Variations on the Author
“Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourses; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.
On Managing Temporal Information for Handling Durative Actions in LPG
This paper describes how LPG manages ordering constraints to fully handle the durative actions introduced by the recent standard language PDDL2.1. LPG is a domain-independent planner that took part in the third International Planning Competition (Toulouse, 2002), showing excellent performance.
Appropriate Similarity Measures for Author Cocitation Analysis
We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance.
Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
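To make the two named measures concrete, here is a minimal Python sketch comparing the cosine and the Pearson correlation on toy cocitation profiles. The profiles and function names are hypothetical. The zero-padding scenario shows one concern often raised about Pearson in this debate; it is an assumption here, since the abstract does not state its specific argument.

```python
import math


def cosine(x, y):
    """Cosine of the angle between two cocitation profiles."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson(x, y):
    """Pearson correlation: the cosine of the mean-centered profiles."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])


# Hypothetical cocitation counts of two authors with two other authors.
x_short, y_short = [3, 1], [1, 3]

# Same profiles after adding four authors co-cited with neither of them.
x_long = x_short + [0, 0, 0, 0]
y_long = y_short + [0, 0, 0, 0]

cos_short, cos_long = cosine(x_short, y_short), cosine(x_long, y_long)
r_short, r_long = pearson(x_short, y_short), pearson(x_long, y_long)
```

In this toy case the cosine is 0.6 for both pairs of profiles, while the Pearson correlation moves from -1 to 5/11 (about 0.45) once the unrelated authors are added.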
