1,720,964 research outputs found
XFlow: Cross-Modal Deep Neural Networks for Audiovisual Classification
In recent years, there have been numerous developments toward solving multimodal tasks, aiming to learn a stronger representation than through a single modality. Certain aspects of the data can be particularly useful in this case - for example, correlations in the space or time domain across modalities - but should be wisely exploited in order to benefit from their full predictive potential. We propose two deep learning architectures with multimodal cross connections that allow for dataflow between several feature extractors (XFlow). Our models derive more interpretable features and achieve better performances than models that do not exchange representations, usefully exploiting correlations between audio and visual data, which have a different dimensionality and are nontrivially exchangeable. This article improves on the existing multimodal deep learning algorithms in two essential ways: 1) it presents a novel method for performing cross modality (before features are learned from individual modalities) and 2) extends the previously proposed cross connections that only transfer information between the streams that process compatible data. Illustrating some of the representations learned by the connections, we analyze their contribution to the increase in discrimination ability and reveal their compatibility with a lip-reading network intermediate representation. We provide the research community with Digits, a new data set consisting of three data types extracted from videos of people saying the digits 0-9. Results show that both cross-modal architectures outperform their baselines (by up to 11.5%) when evaluated on the AVletters, CUAVE, and Digits data sets, achieving the state-of-the-art results
Algorithmic Concept-Based Explainable Reasoning
Recent research on graph neural network (GNN) models successfully applied GNNs to classical graph algorithms and combinatorial optimisation problems. This has numerous benefits, such as allowing applications of algorithms when preconditions are not satisfied, or reusing learned models when sufficient training data is not available or can’t be generated. Unfortunately, a key hindrance of these approaches is their lack of explainability, since GNNs are black-box models that cannot be interpreted directly. In this work, we address this limitation by applying existing work on concept-based explanations to GNN models. We introduce concept-bottleneck GNNs, which rely on a modification to the GNN readout mechanism. Using three case studies we demonstrate that: (i) our proposed model is capable of accurately learning concepts and extracting propositional formulas based on the learned concepts for each target class; (ii) our concept-based GNN models achieve comparative performance with state-of-the-art models; (iii) we can derive global graph concepts, without explicitly providing any supervision on graph-level concepts
Continuous Neural Algorithmic Planners
Neural algorithmic reasoning studies the problem of learning algorithms with neural networks, especially using graph architectures. A recent proposal, XLVIN, reaps the benefits of using a graph neural network that simulates the value iteration algorithm in deep reinforcement learning agents. It allows model-free planning without access to privileged information about the environment, which is usually unavailable. However, XLVIN only supports discrete action spaces, and is hence nontrivially applicable to most tasks of real-world interest. We expand XLVIN to continuous action spaces by discretization, and evaluate several selective expansion policies to deal with the large planning graphs. Our proposal, CNAP, demonstrates how neural algorithmic reasoning can make a measurable impact in higher-dimensional continuous control settings, such as MuJoCo, bringing gains in low-data settings and outperforming model-free baselines
Principal neighbourhood aggregation for graph nets
Graph Neural Networks (GNNs) have been shown to be effective models for different predictive tasks on graph-structured data. Recent work on their expressive power has focused on isomorphism tasks and countable feature spaces. We extend this theoretical framework to include continuous features—which occur regularly in real-world input domains and within the hidden layers of GNNs—and we demonstrate the requirement for multiple aggregation functions in this context. Accordingly, we propose Principal Neighbourhood Aggregation (PNA), a novel architecture combining multiple aggregators with degree-scalers (which generalize the sum aggregator). Finally, we compare the capacity of different models to capture and exploit the graph structure via a novel benchmark containing multiple tasks taken from classical graph theory, alongside existing benchmarks from real-world domains, all of which demonstrate the strength of our model. With this work, we hope to steer some of the GNN research towards new aggregation methods which we believe are essential in the search for powerful and robust models
Predicting Patient Outcomes with Graph Representation Learning
Recent work on predicting patient outcomes in the Intensive Care Unit (ICU) has focused heavily on the physiological time series data, largely ignoring sparse data such as diagnoses and medications. When they are included, they are usually concatenated in the late stages of a model, which may struggle to learn from rarer disease patterns. Instead, we propose a strategy to exploit diagnoses as relational information by connecting similar patients in a graph. To this end, we propose LSTM-GNN for patient outcome prediction tasks: a hybrid model combining Long Short-Term Memory networks (LSTMs) for extracting temporal features and Graph Neural Networks (GNNs) for extracting the patient neighbourhood information. We demonstrate that LSTM-GNNs outperform the LSTM-only baseline on length of stay prediction tasks on the eICU database. More generally, our results indicate that exploiting information from neighbouring patient cases using graph neural networks is a promising research direction, yielding tangible returns in supervised learning performance on Electronic Health Records
ChronoMID—Cross-modal neural networks for 3-D temporal medical imaging data
ChronoMID—neural networks for temporally-varying, hence Chrono, Medical Imaging Data—makes the novel application of cross-modal convolutional neural networks (X-CNNs) to the medical domain. In this paper, we present multiple approaches for incorporating temporal information into X-CNNs and compare their performance in a case study on the classification of abnormal bone remodelling in mice. Previous work developing medical models has predominantly focused on either spatial or temporal aspects, but rarely both. Our models seek to unify these complementary sources of information and derive insights in a bottom-up, data-driven approach. As with many medical datasets, the case study herein exhibits deep rather than wide data; we apply various techniques, including extensive regularisation, to account for this. After training on a balanced set of approximately 70000 images, two of the models—those using difference maps from known reference points—outperformed a state-of-the-art convolutional neural network baseline by over 30pp (> 99% vs. 68.26%) on an unseen, balanced validation set comprising around 20000 images. These models are expected to perform well with sparse data sets based on both previous findings with X-CNNs and the representations of time used, which permit arbitrarily large and irregular gaps between data points. Our results highlight the importance of identifying a suitable description of time for a problem domain, as unsuitable descriptors may not only fail to improve a model, they may in fact confound it
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation
counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings
are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that
only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into
account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed
Neural Distance Embeddings for Biological Sequences
The development of data-dependent heuristics and representations for biological sequences that reflect their evolutionary distance is critical for large-scale biological research. However, popular machine learning approaches, based on continuous Euclidean spaces, have struggled with the discrete combinatorial formulation of the edit distance that models evolution and the hierarchical relationship that characterises real-world datasets. We present Neural Distance Embeddings (NeuroSEED), a general framework to embed sequences in geometric vector spaces, and illustrate the effectiveness of the hyperbolic space that captures the hierarchical structure and provides an average 22% reduction in embedding RMSE against the best competing geometry. The capacity of the framework and the significance of these improvements are then demonstrated devising supervised and unsupervised NeuroSEED approaches to multiple core tasks in bioinformatics. Benchmarked with common baselines, the proposed approaches display significant accuracy and/or runtime improvements on real-world datasets. As an example for hierarchical clustering, the proposed pretrained and from-scratch methods match the quality of competing baselines with 30x and 15x runtime reduction, respectively
Combinatorial Optimization and Reasoning with Graph Neural Networks
Combinatorial optimization is a well-established area in operations research and computer science. Until recently, its methods have mostly focused on solving problem instances in isolation, ignoring the fact that they often stem from related data distributions in practice. However, recent years have seen a surge of interest in using machine learning, especially graph neural networks, as a key building block for combinatorial tasks, either directly as solvers or by enhancing the former. This paper presents a conceptual review of recent key advancements in this emerging field, aiming at researchers in both optimization and machine learning
- …
