
    Comparing State-of-the-art Dependency Parsers on the Italian Stanford Dependency Treebank

    In the last decade, many accurate dependency parsers have been made publicly available. It can be difficult for non-experts to select a good off-the-shelf parser among those available. This is even harder for languages other than English, because parsers have been tested mainly on English treebanks. Our analysis focuses on Italian and relies on the Italian Stanford Dependency Treebank (ISDT). This work aims to help non-experts understand how difficult it is to apply a specific dependency parser to a new language or treebank, and to choose a parser that meets their needs.

    An N-Best Representation for Bidirectional Parsing Strategy

    In speech understanding systems, the interface between acoustic and linguistic modules is often represented by the N best sequences that match the input signal. Together they form a set that is linguistically analyzed in order to find the interpretation of the input. An appropriate representation of the N-Best could make linguistic processing more efficient. Here a representation based on a context-free model is proposed, obtained by an algorithm borrowed from the data compression field. This algorithm is based on the subword tree of the concatenation of the N best sequences. The proposed representation seems particularly appropriate when coupled with a bidirectional parser, and some experiments demonstrate that the approach is worth pursuing. These experiments compare the proposed representation with sequential processing of the N hypotheses given by the acoustic module. The comparison considers the efficiency attained in the two cases, in terms of (partial) analyses constructed by the linguistic module. The obtained results are presented and discussed.
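The idea of folding an N-best list into a context-free representation can be illustrated with a small sketch. It does not use the subword-tree algorithm the abstract refers to; it substitutes a much simpler Re-Pair-style digram replacement, another grammar-based compression technique, purely to show how word sequences shared across hypotheses can be factored into reusable rules. The function names, the `<Xn>` nonterminal naming, and the `min_count` parameter are all hypothetical.

```python
from collections import Counter

def replace_pair(seq, a, b, nt):
    """Replace non-overlapping occurrences of the pair (a, b) with nt, left to right."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and seq[i] == a and seq[i + 1] == b:
            out.append(nt)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out

def build_grammar(hypotheses, min_count=2):
    """Factor repeated adjacent pairs across all hypotheses into binary rules.

    Assumption: no input word looks like a generated nonterminal ("<X0>", ...).
    Returns the rewritten hypotheses and a dict nonterminal -> (left, right).
    """
    seqs = [list(h) for h in hypotheses]
    rules = {}
    while True:
        pairs = Counter()
        for s in seqs:
            pairs.update(zip(s, s[1:]))
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < min_count:
            break
        nt = f"<X{len(rules)}>"
        rules[nt] = (a, b)
        seqs = [replace_pair(s, a, b, nt) for s in seqs]
    return seqs, rules

def expand(symbol, rules):
    """Expand a symbol back into the words it derives."""
    if symbol not in rules:
        return [symbol]
    left, right = rules[symbol]
    return expand(left, rules) + expand(right, rules)
```

With such a grammar, a parser could in principle analyze the right-hand side of a shared rule once and reuse the result for every hypothesis that contains it, instead of re-analyzing each of the N sequences from scratch.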

    LearningPinocchio: Adaptive Information Extraction for Real World Applications

    The new frontier of research on Information Extraction from texts is portability without any knowledge of Natural Language Processing. The market potential is very large in principle, provided that a suitably easy-to-use and effective methodology is available. In this paper we describe LearningPinocchio, a system for adaptive Information Extraction from texts that is based on this idea and is having good commercial and scientific success. Real world applications have been built and evaluation licenses have been released to external companies for application development. We present a number of applications developed with it and report on an evaluation performed by an independent company. Finally we discuss the suitability of the IE technology behind the system with respect to the requirements mentioned in the introduction and draw some conclusions.

    LearningPinocchio: Adaptive Information Extraction for Real World Applications

    In this paper we describe LearningPinocchio, a robust system for adaptive Information Extraction from texts. Real world applications have been built and licenses have been released to external companies for application development. We first discuss some requirements for Adaptive IE tools. Then we describe LearningPinocchio, present a number of applications developed with it and report on an evaluation performed by an independent company. Finally we discuss the suitability of the IE technology behind the system with respect to the requirements mentioned in the introduction and draw some conclusions.

    Full Text Parsing using Cascades of Rules: an Information Extraction Perspective

    This paper proposes an approach to full parsing suitable for Information Extraction from texts. Sequences of cascades of rules deterministically analyze the text, building unambiguous structures. First, basic chunks are analyzed; then argumental relations are recognized; finally, modifier attachment is performed and the global parse tree is built. The approach was shown to work for three languages and different domains. It was implemented in the IE module of FACILE, an EU project for multilingual text classification and IE.
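The three-stage cascade described above (chunks, then argumental relations, then modifier attachment) can be sketched as follows. This is a toy illustration under stated assumptions, not the FACILE implementation: the tag set (`DET`, `ADJ`, `NOUN`, `VERB`, `PREP`), the chunk labels, and every rule are hypothetical, and the input is assumed to be already POS-tagged.

```python
NP_TAGS = ("DET", "ADJ", "NOUN")

def chunk(tagged):
    """Stage 1: deterministically group (word, tag) pairs into basic chunks."""
    chunks, i = [], 0
    while i < len(tagged):
        word, tag = tagged[i]
        if tag in NP_TAGS:
            j = i + 1
            # Extend the noun chunk; a new determiner starts a new chunk.
            while j < len(tagged) and tagged[j][1] in NP_TAGS and tagged[j][1] != "DET":
                j += 1
            chunks.append(("NP", [w for w, _ in tagged[i:j]]))
            i = j
        elif tag == "VERB":
            chunks.append(("VG", [word]))
            i += 1
        elif tag == "PREP":
            chunks.append(("P", [word]))
            i += 1
        else:
            i += 1  # tags the toy rules do not cover are skipped
    return chunks

def link_arguments(chunks):
    """Stage 2: nearest bare NP to the left is subject, to the right is object."""
    relations = []
    for k, (label, _) in enumerate(chunks):
        if label != "VG":
            continue
        for s in range(k - 1, -1, -1):
            if chunks[s][0] == "NP" and (s == 0 or chunks[s - 1][0] != "P"):
                relations.append(("subj", k, s))
                break
        for o in range(k + 1, len(chunks)):
            if chunks[o][0] == "VG":
                break
            if chunks[o][0] == "NP" and chunks[o - 1][0] != "P":
                relations.append(("obj", k, o))
                break
    return relations

def attach_modifiers(chunks):
    """Stage 3: attach each preposition + NP to the chunk just before it."""
    return [("mod", k - 1, k + 1)
            for k in range(1, len(chunks) - 1)
            if chunks[k][0] == "P" and chunks[k + 1][0] == "NP"]

def parse(tagged):
    """Run the cascade; relations are (label, head chunk index, dependent chunk index)."""
    chunks = chunk(tagged)
    return chunks, link_arguments(chunks) + attach_modifiers(chunks)
```

Each stage consumes only the output of the previous one and never revises it, which is what keeps the analysis deterministic and the resulting structure unambiguous.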