Ontology Learning from Text: An Overview
Buitelaar P, Cimiano P, Magnini B. Ontology Learning from Text: An Overview. In: Buitelaar P, Cimiano P, Magnini B, eds. Ontology Learning from Text: Methods, Evaluation and Applications. Frontiers in Artificial Intelligence and Applications. Vol 123. Amsterdam: IOS Press; 2005: 3-12
Dynamic Task-Oriented Dialogue: A Comparative Study of Llama-2 and BERT in Slot Value Generation
Recent advancements in instruction-based language models have demonstrated exceptional performance across various natural language processing tasks. We present a comprehensive analysis of the performance of two open-source language models, BERT and Llama-2, in the context of dynamic task-oriented dialogues. Focusing on the Restaurant domain and utilizing the MultiWOZ 2.4 dataset, our investigation centers on the models’ ability to generate predictions for masked slot values within text. The dynamic aspect is introduced through simulated domain changes, mirroring real-world scenarios where new slot values are incrementally added to a domain over time. This study contributes to the understanding of instruction-based models’ effectiveness in dynamic natural language understanding tasks when compared to traditional language models and emphasizes the significance of open-source, reproducible models in advancing research within the academic community
Evaluating Distributional Representations of Verb Semantic Selection
The purpose of this paper is to evaluate whether distributional techniques applied to lexical sets, i.e. the set of fillers of verb argument slots, constitute a useful heuristic to model verb semantic selection. To achieve this purpose, we extract the word vectors corresponding to our lexical set vocabulary from the word2vec distributional semantic model, and then perform k-means clustering on these. We focus on verbs undergoing the causative/inchoative alternation as a case study, as they offer an interesting challenge due to the theoretical assumption that the lexical sets of the transitive Object (O) and the intransitive Subject (S) overlap. We analyze the obtained clusters from a qualitative point of view, calculate the prototype vector based on the cluster centroid, and evaluate them against the human judgments on verb semantic selection acquired from a lexical resource. We present an in-depth linguistic analysis of the Italian verb suonare ‘to ring, to play’. The analysis demonstrates that automatically obtained clusters and human judgments based on manual clustering match closely, although the centroids appear not to be systematically the best indicators of the cluster semantics, and metonymic uses lead to incorrect automatic analysis
Grounding Lexical Sets of Causative-Inchoative Verbs with Word Embedding
Lexical sets contain the words filling the argument positions of a verb in one of its senses. They can be extracted from corpora automatically. The purpose of this paper is to demonstrate that their vector representation based on word embedding provides insights into many linguistic phenomena, such as causative-inchoative verbs. A first experiment aims at investigating the internal structure of the sets, which are known to be cognitively radial and continuous categories. A second experiment shows that the distance between the intransitive subject set and transitive object set is correlated with the spontaneity of the event expressed by the verb, defined according to morphological coding and frequency
Designing a Methodology for Semantic Type Tagging of Argument Positions
A verb argument position can be described by the semantic type that characterizes the words filling that position. We investigate a number of linguistic issues underlying the tagging of an Italian corpus with the semantic types provided by the T-PAS (Typed Predicate-Argument Structure) resource. Our main interest is to evaluate whether our annotation methodology can be employed effectively for the extension of the annotation of the corpus associated with the resource. In order to achieve this goal we compare quantitative data about the tagging and qualitative data derived from the Inter-Annotator Agreement
Lexical Opposition in Discourse Contrast
We investigate the connection between lexical opposition and discourse relations, with a focus on the relation of contrast, in order to evaluate whether opposition participates in discourse relations. Through a corpus-based analysis of Italian documents, we show that the relation between opposition and contrast is not crucial, although not insignificant in the case of implicit relations. The correlation is even weaker when other discourse relations are taken into account
Contrast-Ita Bank: A Corpus Annotated with Discourse Contrast Relations
We present Contrast-Ita Bank, a corpus annotated with discourse contrast relations in Italian. We annotate both explicit and implicit contrast relations, following the schema proposed in the Penn Discourse Treebank. We provide and discuss quantitative data about the new resource
Mining the web to validate answers to natural language questions
Answer validation is the ability to automatically judge the relevance of a candidate answer with respect to a given question. This paper investigates answer validation following a data-driven approach that considers textual passages (i.e. snippets) retrieved from the Web as the main source of information. Snippets are then analyzed in order to maximize the density of relevant keywords belonging both to the question and to the answer. Results obtained on a corpus of human-judged factoid question-answer pairs submitted by participants at TREC-2001 show a satisfactory success rate (86%). In addition, the efficiency of the methodology (documents are not downloaded) makes the approach suitable for integration as a module in the architecture of a question answering system
Tagging Semantic Types for Verb Argument Positions
Verb argument positions can be described by the semantic types that characterise the words filling that position. We investigate a number of linguistic issues underlying the tagging of an Italian corpus with the semantic types provided by the T-PAS (Typed Predicate-Argument Structure) resource. We report both quantitative data about the tagging and a qualitative analysis of cases of disagreement between two annotators
