Attention Mechanism and Interpretability of Deep Learning for Natural Language Processing in the Biomedical Domain
The Impact of Self-Interaction Attention on the Extraction of Drug-Drug Interactions
Since many medical treatments require taking multiple drugs, the discovery of how these drugs interact with each other, potentially causing health problems for patients, is the subject of a large number of documents. In order to obtain this information from free text, several methods involving deep learning have been proposed over the years. In this paper we introduce a Recurrent Neural Network-based method combined with the Self-Interaction Attention Mechanism. The method is applied to the DDIExtraction-2013 task, a popular challenge concerning the extraction and classification of drug-drug interactions. Our focus is to show its effect on the tendency to predict the majority class and how it differs from other types of attention mechanisms.
Applying Self-Interaction Attention for Extracting Drug-Drug Interactions
Discovering the effects of taking several drugs simultaneously is a very important field in medical research that could improve the effectiveness of healthcare and avoid adverse drug reactions, which can cause health problems for patients. Although there are several pharmacological databases containing information on drugs, this type of information is often expressed in the form of free text. Analyzing sentences in order to extract drug-drug interactions was the objective of the DDIExtraction-2013 task. Although the challenge took place six years ago, interest in this task is still active, and several new methods based on Recurrent Neural Networks and Attention Mechanisms have been designed. In this paper, we propose a model that combines bidirectional Long Short Term Memory (LSTM) networks with the Self-Interaction Attention Mechanism. Experimental analysis shows that, over several input configurations, this model improves classification accuracy by reducing the tendency to predict the majority class, a tendency that results in false negatives.
Deep learning for classification of radiology reports with a hierarchical schema
Radiological reports are a valuable source of textual information, which can be exploited to improve clinical care and to support research. Such information can be extracted and put into a structured form using machine learning techniques. Some of these techniques rely not only on the classification labels but also on the manual annotation of relevant snippets, which is a time-consuming job and requires domain experts. In this paper, we apply deep learning techniques, and in particular Long Short Term Memory (LSTM) networks, to perform this task relying only on the classification labels. We focus on the classification of chest computed tomography reports in Italian according to a classification schema proposed for this task by the radiologists of Spedali Civili di Brescia. Each report is classified according to this schema using a combination of neural network classifiers. The result is a novel classification system, which we compare to a previous system based on standard machine learning techniques that used annotations of relevant snippets.
Machine Learning Models for Predicting Short-Long Length of Stay of COVID-19 Patients
During 2020 and 2021, managing limited healthcare resources and hospital beds was a fundamental aspect of the fight against the COVID-19 pandemic. Predicting the length of stay in advance, and in particular identifying whether a patient is going to stay in the hospital for more or less than a week, can provide important support in handling resource allocation. However, there have been significant changes in terms of containment measures, virus diffusion, new treatments, vaccines, and new variants of SARS-CoV-2 during this period. These changes pose several concept drift issues that can limit the usefulness of machine learning in this context. In this work, we present a machine learning system trained and tested using data from more than 6000 hospitalised patients in northern Italy, distributed over almost two years of the pandemic. We show how machine learning can be effective even when analysing data over this long period of time, also exploiting a model that predicts the patient's outcome in terms of discharge or death. Furthermore, learning from data that also include deceased patients is a common issue in predicting the length of stay, because these patients have severe conditions similar to those of patients with a long stay but may actually have a very short hospitalisation. For this purpose, we present a method for handling data from both surviving and deceased patients, which exploits more patient records and increases the robustness of the model and its performance in this task. Finally, we investigate the features that are most relevant to the prediction of the simplified length of stay.
An Analysis on How Pre-Trained Language Models Learn Different Aspects
By now, it is widely known that pre-trained Neural Language Models (NLM) and Large Language Models (LLM) possess remarkable capabilities and are able to solve many Natural Language Processing tasks. However, not as much is understood about how Transformer-based models acquire this ability during their complex training process. In this context, an interesting line of work has surfaced in the last few years: the study of the so-called learning trajectories. Several studies tested the knowledge acquired by a model not only when it was fully trained, but also in its checkpoints, i.e. intermediate versions of the model at different stages of its training. Nonetheless, most of these works focused on simple tasks, often analysing single grammatical aspects (such as part-of-speech tags, transitive verbs, etc.) without a proper comparison with more complex tasks and with semantics-based aspects. In this paper, we consider two additional tasks to study the learning trajectory of NLMs and to compare different aspects. The first one consists of classifying a sentence as grammatically correct or incorrect, using a novel dataset which can contain several types of errors. The second one is a purely semantic task that revolves around understanding whether a sentence is funny or not. In our experimental evaluation, we compare the learning trajectories on these two tasks with those on three simpler grammatical aspects. Thus, we highlight the most important similarities and divergences in how these types of knowledge are learned by three GPT-NeoX models. Moreover, we analyse the behaviour of each layer of the models considered, verifying whether there are significant differences among them.
Recurrent Neural Networks for Daily Estimation of COVID-19 Prognosis with Uncertainty Handling
Most ML-based applications for COVID-19 assess the general condition of a patient, are trained and tested on cohorts of patients collected over a short period of time, and are capable of providing an alarm a few days in advance, helping clinicians in emergency situations to monitor hospitalised patients and identify potentially critical situations at an early stage. However, the pandemic continues to evolve due to new variants, treatments, and vaccines; datasets covering short periods may fail to capture this aspect. In addition, these applications often avoid dealing with the uncertainty associated with the predictions provided by machine learning models, potentially causing costly mistakes. In this work, we present a system based on Recurrent Neural Networks (RNN) for the daily estimate of the prognosis of COVID-19 patients that is built and tested using data collected over a long period of time. Our system achieves high predictive performance and uses an algorithm to effectively determine and discard those patients for whom the RNN cannot predict the prognosis with sufficient confidence.
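The abstract does not describe how low-confidence patients are identified, so the following is only a minimal sketch of the general idea of abstaining from uncertain predictions. It assumes a binary prognosis model that outputs a probability per patient; the function name `predict_or_abstain`, the threshold of 0.8, and the sample probabilities are all hypothetical and are not taken from the paper.

```python
def predict_or_abstain(probabilities, threshold=0.8):
    """Return 1 or 0 for confident predictions and None where the model abstains.

    probabilities: predicted probability of the positive class, one per patient.
    threshold: minimum confidence required to issue a prediction (assumed value).
    """
    decisions = []
    for p in probabilities:
        # Confidence in the predicted class, whichever class that is.
        confidence = max(p, 1.0 - p)
        if confidence >= threshold:
            decisions.append(1 if p >= 0.5 else 0)
        else:
            # Not confident enough: discard this patient instead of guessing.
            decisions.append(None)
    return decisions


# Hypothetical daily probabilities for four patients.
daily_probs = [0.95, 0.55, 0.10, 0.70]
decisions = predict_or_abstain(daily_probs)
```

Raising the threshold discards more patients but should leave fewer confident mistakes among the remaining predictions; the right trade-off depends on the clinical setting.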
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting: the traditional type, which counts only the first of a cited work's authors, and a non-traditional type, which takes into account the first 5 authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
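The difference between the two counting schemes can be illustrated with a short sketch. It assumes that each citing paper is given as a list of cited works and each cited work as an ordered list of author names; the function name `cocitation_counts`, the `max_authors` parameter, and the sample data are hypothetical. As a simplification, co-authors of the same cited work also count as co-cited with each other here, which may differ from the procedure used in the study.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count how often each pair of authors is cited by the same paper.

    citing_papers: list of reference lists; each reference is an ordered
        list of author names.
    max_authors: number of leading authors credited per cited work
        (1 = traditional first-author counting).
    """
    counts = Counter()
    for references in citing_papers:
        # Collect every author credited by this citing paper, once each.
        cited = set()
        for authors in references:
            cited.update(authors[:max_authors])
        # Each unordered pair of credited authors is co-cited once per paper.
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts


# Hypothetical data: two citing papers, each citing two works.
papers = [
    [["Smith", "Lee"], ["Jones"]],
    [["Smith"], ["Jones", "Lee"]],
]
first_only = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

With first-author counting, Lee never appears in the counts because Lee is never a first author; counting the first five authors brings Lee into the co-citation data.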
