
JIGSAW: An Algorithm for Word Sense Disambiguation

Author: SEMERARO Giovanni
BASILE PIERPAOLO
Publication date: 01/01/2007

Archivio istituzionale della ricerca - Università di Bari

UBA: Using Automatic Translation and Wikipedia for Cross-Lingual Lexical Substitution

Author: SEMERARO Giovanni
BASILE PIERPAOLO
Publication date: 01/01/2010

Archivio istituzionale della ricerca - Università di Bari

UNIBA @ EVALITA 2009 - Lexical Substitution Task

Author: SEMERARO Giovanni
BASILE PIERPAOLO
Publication date: 01/01/2009

Archivio istituzionale della ricerca - Università di Bari

UNIBA : Super Sense Tagging at EVALITA 2011

Author: BASILE PIERPAOLO
Publication date: 01/01/2012

Archivio istituzionale della ricerca - Università di Bari

QuestionCube: A framework for question answering

Author: MOLINO PIERO
BASILE PIERPAOLO
Publication date: 01/01/2012

QuestionCube is a framework for Question Answering (QA) that combines several techniques to retrieve passages containing the exact answers to natural language questions. It exploits: (a) Natural Language Processing algorithms for analyzing questions and candidate answers in both English and Italian; (b) probabilistic Information Retrieval models for retrieving candidate answers; and (c) Machine Learning methods for question classification. The data source for the answers is a collection of unstructured text documents stored in search indices. This paper gives an overview of the QuestionCube framework architecture, together with a description of Wikiedi, a QA system for Wikipedia that exploits the proposed framework.

Archivio istituzionale della ricerca - Università di Bari

Encoding syntactic dependencies using Random Indexing and Wikipedia as a corpus

Author: CAPUTO ANNALINA
BASILE PIERPAOLO
Publication date: 01/01/2012

Distributional approaches are based on a simple hypothesis: the meaning of a word can be inferred from its usage. Applying that idea to the vector space model makes it possible to construct a WordSpace in which words are represented by points in a geometric space. Similar words lie close together in this space, and the definition of "word usage" depends on the definition of the context used to build the space, which can be the whole document, the sentence in which the word occurs, a fixed window of words, or a specific syntactic context. However, in its original formulation WordSpace can take into account only one definition of context at a time. We propose an approach based on vector permutation and Random Indexing to encode several syntactic contexts in a single WordSpace. We build our WordSpace from the WaCkypedia EN corpus, a 2009 dump of the English Wikipedia (about 800 million tokens) annotated with syntactic information provided by a full dependency parser. The effectiveness of our approach is evaluated using the GEometrical Models of natural language Semantics (GEMS) 2011 Shared Evaluation data.
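
The general idea of combining Random Indexing with vector permutation can be illustrated with a minimal sketch. This is not the authors' implementation: the dimensionality, the number of non-zero entries, the use of a simple rotation as the permutation, and all function names (`index_vector`, `permute`, `build_word_space`, `cosine`) are assumptions chosen for illustration. Each word gets a sparse random index vector, each dependency type gets its own rotation, and a head word's vector accumulates the rotated index vectors of its dependents.

```python
import random
from collections import defaultdict

DIM = 512      # dimensionality of the reduced space (assumed value)
NONZERO = 8    # number of +1/-1 entries per index vector (assumed value)


def index_vector(word, dim=DIM, nonzero=NONZERO):
    """Sparse ternary random vector, seeded by the word so it is reproducible."""
    rng = random.Random(word)
    vec = [0] * dim
    for pos in rng.sample(range(dim), nonzero):
        vec[pos] = rng.choice((-1, 1))
    return vec


def permute(vec, shift):
    """Rotate the vector right by `shift` positions (one shift per dependency type)."""
    shift %= len(vec)
    return vec[-shift:] + vec[:-shift]


def build_word_space(triples, relation_shift):
    """For each head word, sum the rotated index vectors of its dependents."""
    space = defaultdict(lambda: [0] * DIM)
    for head, relation, dependent in triples:
        rotated = permute(index_vector(dependent), relation_shift[relation])
        space[head] = [a + b for a, b in zip(space[head], rotated)]
    return space


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = (sum(a * a for a in u) ** 0.5) * (sum(b * b for b in v) ** 0.5)
    return dot / norm if norm else 0.0


# Hypothetical (head, relation, dependent) triples, as a dependency parser might emit.
triples = [
    ("eat", "subj", "cat"), ("eat", "obj", "fish"),
    ("devour", "subj", "cat"), ("devour", "obj", "fish"),
    ("swallow", "subj", "fish"), ("swallow", "obj", "cat"),
]
space = build_word_space(triples, {"subj": 1, "obj": 2})
print(cosine(space["eat"], space["devour"]))   # same dependents in the same roles
print(cosine(space["eat"], space["swallow"]))  # same dependents, roles swapped
```

Because the rotation differs per relation, "cat as subject" and "cat as object" contribute different vectors, so a single space can keep several syntactic contexts apart.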

Archivio istituzionale della ricerca - Università di Bari

Natural browsing

Author: Intonti Luigi
BASILE PIERPAOLO
Publication date: 01/01/2010

Natural Browsing is an ongoing industrial research project that aims to develop a framework able to automatically build a knowledge base from unstructured data. The project relies on NLP methods and Semantic Web technologies to mine facts from data.

Archivio istituzionale della ricerca - Università di Bari

UNIBA-CORE: Combining Strategies for Semantic Textual Similarity.

Author: CAPUTO ANNALINA
SEMERARO Giovanni
BASILE PIERPAOLO
Publication date: 01/01/2013

This paper describes the UNIBA participation in the Semantic Textual Similarity (STS) core task 2013. We exploited three different systems for computing the similarity between two texts. One system serves as the baseline and is the best model that emerged from our previous participation in STS 2012. This system is based on a distributional model of semantics that can also take into account the syntactic structures that glue words together. In addition, we investigated two different learning strategies that exploit both syntactic and semantic features. The former uses a combination strategy to combine the best machine learning techniques trained on the 2012 training and test sets. The latter tries to overcome the limitation of working with different datasets with varying characteristics by selecting only the most suitable dataset for training.

Archivio istituzionale della ricerca - Università di Bari

Encoding syntactic dependencies by vector permutation

Author: CAPUTO ANNALINA
SEMERARO Giovanni
BASILE PIERPAOLO
Publication date: 01/01/2011

Archivio istituzionale della ricerca - Università di Bari

UNIBA-SENSE at CLEF 2008: Semantic N-Levels Search Engine

Author: CAPUTO ANNALINA
SEMERARO Giovanni
BASILE PIERPAOLO
Publication date: 01/01/2008

Archivio istituzionale della ricerca - Università di Bari