A kernel-based approach for irony and sarcasm detection in Italian
This paper describes the UNITOR system that participated in the Irony Detection in Italian Tweets task (IronITA) within the context of EvalIta 2018. The system is a cascade of Support Vector Machine classifiers. Specific features and kernel functions have been proposed to tackle the different subtasks: Irony Classification and Sarcasm Classification. The proposed system ranked first in the Sarcasm Detection subtask (out of 7 submissions), while it ranked sixth (out of 17 submissions) in the Irony Detection task.
Towards Compositional Tree Kernels
Distributional Compositional Semantics (DCS) methods combine lexical vectors according to algebraic operators or functions to model the meaning of complex linguistic phrases. On the other hand, several textual inference tasks rely on supervised kernel-based learning, where Tree Kernels (TKs) have been shown to be suitable for modeling syntactic and semantic similarity between linguistic instances. While the modeling of DCS for complex phrases is still an open research issue, TKs do not account for compositionality. In this paper, a novel kernel called Compositionally Smoothed Partial Tree Kernel is proposed, integrating DCS operators into the TK estimation. Empirical results over Semantic Text Similarity and Question Classification tasks show the contribution of semantic compositions with respect to traditional TKs.
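As a rough illustration of what "combining lexical vectors according to algebraic operators" can mean, the sketch below composes two-word phrases by vector addition and by pointwise multiplication, then compares the phrases with cosine similarity. The toy vectors, the word list, and the function names are assumptions made for this example; the abstract does not say which operators the proposed kernel uses.

```python
import math

def compose_add(u, v):
    """Additive composition: element-wise sum of two word vectors."""
    return [a + b for a, b in zip(u, v)]

def compose_mult(u, v):
    """Multiplicative composition: element-wise product of two word vectors."""
    return [a * b for a, b in zip(u, v)]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical 3-dimensional word vectors, invented for illustration only.
lexicon = {
    "red":     [0.9, 0.1, 0.3],
    "crimson": [0.8, 0.2, 0.3],
    "car":     [0.1, 0.9, 0.4],
    "idea":    [0.2, 0.1, 0.9],
}

red_car = compose_add(lexicon["red"], lexicon["car"])
crimson_car = compose_add(lexicon["crimson"], lexicon["car"])
red_idea = compose_add(lexicon["red"], lexicon["idea"])

sim_close = cosine(red_car, crimson_car)  # phrases sharing a head noun
sim_far = cosine(red_car, red_idea)       # phrases sharing only the modifier

red_car_mult = compose_mult(lexicon["red"], lexicon["car"])
```

With these toy vectors, "red car" comes out closer to "crimson car" than to "red idea", which is the kind of phrase-level similarity signal a compositional kernel could draw on.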
Making sense of kernel spaces in neural learning
Kernel-based and Deep Learning methods are two of the most popular approaches in Computational Natural Language Learning. Although these models are rather different and characterized by distinct strengths and weaknesses, both have had an impressive impact on the accuracy of complex Natural Language Processing tasks. An advantage of kernel-based methods is their capability of exploiting structured information induced from examples. For instance, Sequence or Tree kernels operate over structures reflecting linguistic evidence, such as syntactic information encoded in syntactic parse trees. Deep Learning approaches are very effective as they can learn non-linear decision functions; however, general models require input instances to be explicitly modeled via vectors or tensors, and operating on structured data is made possible only by using ad hoc architectures. In this work, we discuss a novel architecture that efficiently combines kernel methods and neural networks, in an attempt to get the best of the two paradigms. The so-called Kernel-based Deep Architecture (KDA) adopts a Nyström-based projection function to approximate any valid kernel function and convert any structure it operates on (for instance, linguistic structures, such as trees) into dense linear embeddings. These can be used as input to a Deep Feed-forward Neural Network that exploits such embeddings to learn non-linear classification functions. KDA is a mathematically justified integration of expressive kernel functions and deep neural architectures, with several advantages: it (i) directly operates over complex non-tensor structures, e.g., trees, without ad hoc manual feature engineering or architectural design, (ii) achieves a drastic reduction of the computational cost w.r.t. pure kernel methods, and (iii) exploits the non-linearity of Deep Architectures to produce accurate models.
We evaluated the KDA on three rather different semantic inference tasks: Semantic Parsing, Question Classification, and Community Question Answering. Results show that the KDA achieves state-of-the-art accuracy, with a computational cost that is much lower than the one necessary to train and test a pure kernel-based method, such as the SVM algorithm.
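The Nyström-based projection mentioned in this abstract can be sketched in a few lines. The example below is a minimal illustration and not the KDA implementation: it assumes an RBF kernel over plain vectors as a stand-in for a tree kernel, picks the first 20 points as landmarks, and uses hypothetical names (`rbf_kernel`, `nystrom_projection`). The idea is to eigendecompose the kernel matrix over the landmarks and map each instance to a dense vector whose dot products approximate the original kernel values.

```python
import numpy as np

def rbf_kernel(a, b, gamma=0.5):
    """Stand-in kernel on vectors; a tree kernel would take its place in a KDA."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_projection(kernel, landmarks, eps=1e-10):
    """Build an embedding function from a kernel and a set of landmark instances."""
    K_ll = kernel(landmarks, landmarks)        # kernel matrix among landmarks
    eigvals, eigvecs = np.linalg.eigh(K_ll)    # K_ll = U diag(s) U^T
    keep = eigvals > eps                       # drop numerically null directions
    proj = eigvecs[:, keep] / np.sqrt(eigvals[keep])  # U s^(-1/2)

    def embed(x):
        # Kernel evaluations against the landmarks, projected to a dense vector.
        return kernel(x, landmarks) @ proj

    return embed

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))       # toy instances (assumed data)
landmarks = X[:20]                 # landmark selection is an assumption here

embed = nystrom_projection(rbf_kernel, landmarks)
Z = embed(X)                       # dense embeddings, one row per instance
approx = Z @ Z.T                   # approximated kernel matrix
exact = rbf_kernel(X, X)           # exact kernel matrix for comparison
```

In an architecture like the one described, rows of `Z` would be the input to a feed-forward network. Each new instance then needs kernel evaluations only against the landmarks rather than against the whole training set, which is where the reduction in cost would come from.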
Bootstrapping large scale polarity lexicons through advanced distributional methods
Recent interest in Sentiment Analysis has drawn attention to effective methods for detecting opinions and sentiments in texts. Many approaches in the literature are based on hand-coded resources that model the prior polarity of words or multi-word expressions. The development of such resources is expensive and language-dependent, so they cannot fully cover linguistic sentiment phenomena. This paper presents an automatic method for deriving large-scale polarity lexicons based on Distributional Models of Lexical Semantics. Given a set of heuristically annotated sentences from Twitter, we transfer the sentiment information from sentences to words. The approach is mostly unsupervised, and experiments on different Sentiment Analysis tasks in English and Italian show the benefits of the generated resources.
End-to-end Dependency Parsing via Auto-regressive Large Language Models
This paper presents a straightforward application of Large Language Models (LLMs) for Dependency Parsing. The parsing process is approached as a sequence-to-sequence task, where a language model takes a sentence as input and generates a bracketed form, allowing for the deterministic derivation of the dependency graph. The experimental evaluation explores the feasibility of utilizing LLMs for this purpose, while also assessing the process’s sustainability with modest parameter sizes (training on a single GPU with limited resources) and investigating the impact of incorporating multilingual data during training. The results demonstrate that an end-to-end dependency parsing process can indeed be formulated using a task-agnostic architecture.
GAN-BERT: Generative adversarial learning for robust text classification with a bunch of labeled examples
Recent Transformer-based architectures, e.g., BERT, provide impressive results in many Natural Language Processing tasks. However, most of the adopted benchmarks are made of thousands (sometimes hundreds of thousands) of examples. In many real scenarios, obtaining high-quality annotated data is expensive and time-consuming; in contrast, unlabeled examples characterizing the target task can, in general, be easily collected. One promising method to enable semi-supervised learning has been proposed in image processing, based on Semi-Supervised Generative Adversarial Networks. In this paper, we propose GAN-BERT, which extends the fine-tuning of BERT-like architectures with unlabeled data in a generative adversarial setting. Experimental results show that the requirement for annotated examples can be drastically reduced (down to only 50-100 annotated examples), while still obtaining good performance in several sentence classification tasks.
On the impact of linguistic information in kernel-based deep architectures
Kernel methods enable the direct usage of structured representations of textual data during language learning and inference tasks. On the other hand, deep neural networks are effective in learning non-linear decision functions. Recent works demonstrated that expressive kernels and deep neural networks can be combined in a Kernel-based Deep Architecture (KDA), a common framework that allows structured information to be explicitly modeled in a neural network. This combination achieves state-of-the-art accuracy in different semantic inference tasks. This paper investigates the impact of linguistic information on the performance reachable by a KDA by studying the benefits that different kernels can bring to the inference quality. We believe that the expressiveness of data representations will play a key role in the widespread adoption of neural networks in AI problem solving. We experimentally evaluated the adoption of different kernels (each characterized by a growing expressive power) in a Question Classification task. Results suggest the importance of rich kernel functions in optimizing the accuracy of a KDA.
