    Making sense of kernel spaces in neural learning

    Kernel-based and Deep Learning methods are two of the most popular approaches in Computational Natural Language Learning. Although these models are rather different, each with distinct strengths and weaknesses, both have had an impressive impact on the accuracy of complex Natural Language Processing tasks. An advantage of kernel-based methods is their ability to exploit structured information induced from examples. For instance, Sequence or Tree kernels operate over structures reflecting linguistic evidence, such as the syntactic information encoded in parse trees. Deep Learning approaches are very effective because they can learn non-linear decision functions; however, general models require input instances to be explicitly modeled as vectors or tensors, and operating on structured data is possible only through ad hoc architectures. In this work, we discuss a novel architecture that efficiently combines kernel methods and neural networks, in an attempt to get the best of both paradigms. The so-called Kernel-based Deep Architecture (KDA) adopts a Nyström-based projection function to approximate any valid kernel function and to convert any structure the kernel operates on (for instance, linguistic structures such as trees) into dense linear embeddings. These embeddings are then fed to a Deep Feed-forward Neural Network that uses them to learn non-linear classification functions. KDA is a mathematically justified integration of expressive kernel functions and deep neural architectures, with several advantages: it (i) directly operates over complex non-tensor structures, e.g., trees, without ad hoc manual feature engineering or architectural design, (ii) achieves a drastic reduction of the computational cost w.r.t. pure kernel methods, and (iii) exploits the non-linearity of Deep Architectures to produce accurate models. We evaluated the KDA on three rather different semantic inference tasks: Semantic Parsing, Question Classification, and Community Question Answering. Results show that the KDA achieves state-of-the-art accuracy at a computational cost much lower than that required to train and test a pure kernel-based method, such as the SVM algorithm.
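The Nyström-based projection mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel over plain vectors stands in for a structured kernel such as a tree kernel, the choice of the first 10 points as landmarks is arbitrary, and the names `rbf_kernel` and `nystrom_projection` are hypothetical. The feed-forward network that would consume the embeddings is omitted.

```python
import numpy as np

def rbf_kernel(a, b, gamma=0.5):
    """Gaussian kernel; a stand-in (assumption) for any valid kernel."""
    diff = a[:, None, :] - b[None, :, :]
    return np.exp(-gamma * np.sum(diff ** 2, axis=-1))

def nystrom_projection(landmarks, kernel, eps=1e-10):
    """Return a function mapping examples to Nystrom embeddings."""
    w = kernel(landmarks, landmarks)            # l x l Gram matrix of the landmarks
    eigvals, eigvecs = np.linalg.eigh(w)        # W = U S U^T (W is symmetric)
    keep = eigvals > eps                        # drop numerically null directions
    proj = eigvecs[:, keep] / np.sqrt(eigvals[keep])   # U S^{-1/2}

    def embed(x):
        # Evaluate the kernel against the landmarks only, then project.
        return kernel(x, landmarks) @ proj

    return embed

rng = np.random.default_rng(0)
data = rng.normal(size=(50, 3))                 # toy inputs, not linguistic data
landmarks = data[:10]                           # arbitrary landmark choice
embed = nystrom_projection(landmarks, rbf_kernel)

z = embed(data)                                 # dense embeddings, one row per example
approx = z @ z.T                                # approximated Gram matrix
exact = rbf_kernel(data, data)
```

Dot products between embeddings approximate kernel values, and they reproduce them up to numerical error on the landmarks themselves. Each new example costs one kernel evaluation per landmark, which is where the saving over a pure kernel machine would come from.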

    On the impact of linguistic information in kernel-based deep architectures

    Kernel methods enable the direct use of structured representations of textual data during language learning and inference tasks. On the other hand, deep neural networks are effective in learning non-linear decision functions. Recent work has demonstrated that expressive kernels and deep neural networks can be combined in a Kernel-based Deep Architecture (KDA), a common framework that allows structured information to be modeled explicitly in a neural network. This combination achieves state-of-the-art accuracy in different semantic inference tasks. This paper investigates the impact of linguistic information on the performance reachable by a KDA by studying the benefits that different kernels can bring to the inference quality. We believe that the expressiveness of data representations will play a key role in the widespread adoption of neural networks in AI problem solving. We experimentally evaluated the adoption of different kernels (each characterized by a growing expressive power) in a Question Classification task. Results suggest the importance of rich kernel functions in optimizing the accuracy of a KDA.

    UNITOR: Aspect Based Sentiment Analysis with Structured Learning

    In this paper, the UNITOR system participating in the SemEval-2014 Aspect Based Sentiment Analysis competition is presented. The task is tackled by exploiting Kernel Methods within the Support Vector Machine framework. Aspect Term Extraction is modeled as a sequential tagging task, tackled through SVMhmm. Aspect Term Polarity, Aspect Category and Aspect Category Polarity detection are tackled as classification problems in which multiple kernels are linearly combined to generalize over several kinds of linguistic information. In the challenge, the UNITOR system achieves good results, placing between 2nd and 8th in almost all rankings among about 30 competitors.
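A linear combination of kernels, as described above, can be sketched as follows. The two toy kernels (`bow_kernel`, `bigram_kernel`) and the weights are assumptions chosen for illustration; they are not the kernels used by UNITOR.

```python
from collections import Counter

def bow_kernel(a, b):
    """Linear kernel over bag-of-words counts (toy lexical kernel)."""
    ca, cb = Counter(a.split()), Counter(b.split())
    return float(sum(ca[w] * cb[w] for w in ca))

def bigram_kernel(a, b):
    """Linear kernel over word-bigram counts (crude stand-in for a sequence kernel)."""
    def bigrams(s):
        toks = s.split()
        return Counter(zip(toks, toks[1:]))
    ca, cb = bigrams(a), bigrams(b)
    return float(sum(ca[g] * cb[g] for g in ca))

def combine(kernels, weights):
    """Weighted sum of kernels; non-negative weights keep the result a valid kernel."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")

    def combined_kernel(a, b):
        return sum(w * k(a, b) for k, w in zip(kernels, weights))

    return combined_kernel

combined = combine([bow_kernel, bigram_kernel], [1.0, 0.5])
score = combined("the food was great", "the service was great")
```

Because a non-negative weighted sum of valid kernels is itself a valid kernel, the combined function can be passed to any kernel machine in place of a single kernel.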

    Going Beyond Counting First Authors in Author Co-citation Analysis

    The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based. We compare ACA results based on two different types of co-citation counting: the traditional type, which counts only the first author of each cited work, and a non-traditional type, which takes into account the first 5 authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
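The difference between the two counting schemes can be made concrete with a small sketch. The data layout, the author names, and the function `cocitation_counts` are hypothetical. The sketch also assumes that each author pair is counted at most once per citing paper and that co-authors of the same cited work count as co-cited; the study may define these details differently.

```python
from collections import Counter
from itertools import combinations

def cocitation_counts(citing_papers, max_authors=1):
    """Count author pairs cited together, using the first `max_authors` of each cited work."""
    counts = Counter()
    for cited_works in citing_papers:
        authors = set()
        for author_list in cited_works:
            authors.update(author_list[:max_authors])
        # Each unordered pair is counted once per citing paper.
        for pair in combinations(sorted(authors), 2):
            counts[pair] += 1
    return counts

# Each citing paper is a list of cited works; each cited work is its ordered author list.
papers = [
    [["Adams", "Baker"], ["Chen"]],
    [["Chen", "Diaz"], ["Baker", "Adams"]],
]

first_only = cocitation_counts(papers, max_authors=1)   # traditional first-author counting
first_five = cocitation_counts(papers, max_authors=5)   # first five authors of each work
```

With first-author counting, Adams and Baker are never linked because they only appear together on a single work; once the first five authors are counted, more pairs become visible.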

    Learning to Generate Examples for Semantic Processing Tasks

    Although recent Transformer-based architectures, such as BERT, have achieved impressive results in semantic processing tasks, their fine-tuning stage still requires large-scale training resources. Data Augmentation (DA) techniques can usually help in low-resource settings. In Text Classification tasks, the objective of DA is the generation of well-formed sentences that (i) represent the desired task category and (ii) are novel with respect to existing sentences. In this paper, we propose a neural approach that automatically learns to generate new examples using a pre-trained sequence-to-sequence model. We first learn a task-oriented similarity function that we use to pair similar examples. Then, we use these example pairs to train a model to generate examples. Experiments in low-resource settings show that augmenting the training material with the proposed strategy systematically improves the results on text classification and natural language inference tasks by up to 10% accuracy, outperforming existing DA approaches.

    Learning to Solve NLP Tasks in an Incremental Number of Languages

    In real scenarios, a multilingual model trained to solve NLP tasks on a set of languages may be required to support new languages over time. Unfortunately, straightforward retraining on a dataset containing annotated examples for all the languages is both expensive and time-consuming, especially as the number of languages grows. Moreover, the original annotated material may no longer be available due to storage or business constraints. Retraining only on the new language data will inevitably result in Catastrophic Forgetting of previously acquired knowledge. We propose a Continual Learning strategy that updates a model to support new languages over time, while maintaining consistent results on previously learned languages. We define a Teacher-Student framework in which the existing model "teaches" a student model what it knows about the languages it supports, while the student is also trained on a new language. We report an experimental evaluation on several tasks, including Sentence Classification, Relational Learning and Sequence Labeling.
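One common way to realize a Teacher-Student setup is a loss that mixes imitation of the teacher's output distribution with supervised training on gold labels. The sketch below shows that generic recipe only; the abstract does not specify the paper's objective, and the function names, the mixing weight `alpha`, and the toy logits are all assumptions.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, predicted_probs):
    """Cross-entropy between a target distribution and a predicted one."""
    return -sum(t * math.log(p) for t, p in zip(target_probs, predicted_probs) if t > 0)

def student_loss(student_logits, teacher_logits=None, gold_label=None, alpha=0.5):
    """Mix teacher imitation with supervised loss; either term may be absent."""
    probs = softmax(student_logits)
    loss = 0.0
    if teacher_logits is not None:
        # Imitation term: match the teacher's soft predictions.
        loss += alpha * cross_entropy(softmax(teacher_logits), probs)
    if gold_label is not None:
        # Supervised term: annotated example of the new language.
        one_hot = [1.0 if i == gold_label else 0.0 for i in range(len(probs))]
        loss += (1 - alpha) * cross_entropy(one_hot, probs)
    return loss

agree = student_loss([2.0, 0.0], teacher_logits=[2.0, 0.0])
disagree = student_loss([0.0, 2.0], teacher_logits=[2.0, 0.0])
supervised = student_loss([2.0, 0.0], gold_label=0)
```

The imitation term needs only the teacher's predictions, not the original annotated data, which is what makes this family of losses attractive when that data is no longer available.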

    UNITOR: combining syntactic and semantic kernels for Twitter sentiment analysis

    In this paper, the UNITOR system participating in the SemEval-2013 Sentiment Analysis in Twitter task is presented. The polarity detection of a tweet is modeled as a classification task, tackled through a Multiple Kernel approach. This approach makes it possible to combine the contributions of complex kernel functions, such as the Latent Semantic Kernel and the Smoothed Partial Tree Kernel, to implicitly integrate syntactic and lexical information from annotated examples. In the challenge, the UNITOR system achieves good results, even though no manual feature engineering is performed and no manually coded resources are employed. These kernels in fact embed distributional models of lexical semantics to obtain an expressive generalization of tweets.

    KeLP at SemEval-2016 task 3: Learning semantic relations between questions and answers

    This paper describes the KeLP system participating in the SemEval-2016 Community Question Answering (cQA) task. The challenge tasks are modeled as binary classification problems: kernel-based classifiers are trained on the SemEval datasets and their scores are used to sort the instances and produce the final ranking. All classifiers and kernels have been implemented within the Kernel-based Learning Platform called KeLP. Our primary submission ranked first in Subtask A, third in Subtask B and second in Subtask C. These ranks are based on MAP, which is the reference score of the challenge. Our approach outperforms all the other systems with respect to all the other challenge metrics.
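Sorting instances by classifier score and evaluating the ranking with MAP can be sketched as follows. This is a textbook Mean Average Precision, not the official SemEval scorer, which may apply additional conventions; the scores and relevance labels are invented for illustration.

```python
def average_precision(scored_items):
    """scored_items: list of (classifier_score, is_relevant) pairs for one question."""
    ranked = sorted(scored_items, key=lambda pair: pair[0], reverse=True)
    hits, precision_sum = 0, 0.0
    for rank, (_, relevant) in enumerate(ranked, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank    # precision at this relevant item
    return precision_sum / hits if hits else 0.0

def mean_average_precision(queries):
    """Average the per-question average precision."""
    return sum(average_precision(q) for q in queries) / len(queries)

queries = [
    [(0.9, True), (0.2, False), (0.5, True)],   # relevant items ranked 1st and 2nd
    [(0.1, True), (0.8, False)],                # relevant item ranked 2nd
]
map_score = mean_average_precision(queries)
```

Only the ordering induced by the classifier scores matters here, so a binary classifier needs no calibration to be used as a ranker.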