Modeling cognition with generative neural networks: The case of orthographic processing
This thesis investigates the potential of generative neural networks to model cognitive processes. In contrast to many popular connectionist models, the computational framework adopted in this research work emphasizes the generative nature of cognition, suggesting that one of the primary goals of cognitive systems is to learn an internal model of the surrounding environment that can be used to infer causes and make predictions about upcoming sensory information. In particular, we consider a powerful class of recurrent neural networks that learn probabilistic generative models from experience in a completely unsupervised way, by extracting high-order statistical structure from a set of observed variables. Notably, this type of network can be conveniently formalized within the more general framework of probabilistic graphical models, which provides a unified language to describe both neural networks and structured Bayesian models. Moreover, recent advances make it possible to extend basic network architectures into more powerful systems, which exploit multiple processing stages to perform learning and inference over hierarchical models, or which exploit delayed recurrent connections to process sequential information. We argue that these advanced network architectures constitute a promising alternative to the more traditional, feed-forward, supervised neural networks, because they more neatly capture the functional and structural organization of cortical circuits, providing a principled way to combine top-down, high-level contextual information with bottom-up sensory evidence. We provide empirical support justifying the use of these models by studying how efficient implementations of hierarchical and temporal generative networks can extract information from large datasets containing thousands of patterns.
In particular, we perform computational simulations of the recognition of handwritten and printed characters belonging to different writing scripts, which are subsequently combined spatially or temporally to build more complex orthographic units, such as those constituting English words.
A Comparison of Recurrent and Convolutional Deep Learning Architectures for EEG Seizure Forecasting
Considerable research effort is being devoted to discovering predictive markers of seizures, which would make it possible to build forecasting systems that could mitigate the risk of injuries and clinical complications in epileptic patients. Although electroencephalography (EEG) is the most widely used tool to monitor abnormal brain electrical activity, no commercial devices can reliably anticipate seizures from EEG signal analysis at present. Recent advances in Artificial Intelligence, particularly deep learning algorithms, show promise in enhancing EEG classifier forecasting accuracy by automatically extracting relevant spatio-temporal features from EEG recordings. In this study, we systematically compare the predictive accuracy of two leading deep learning architectures: recurrent models based on Long Short-Term Memory networks (LSTMs) and Convolutional Neural Networks (CNNs). To this aim, we consider a data set of long-term, continuous multi-channel EEG recordings collected from 29 epileptic patients using a standard set of 20 channels. Our results demonstrate the superior performance of deep learning algorithms, which can achieve up to 99% accuracy, sensitivity, and specificity, compared to more traditional machine learning approaches, which settle around 75% in all evaluation metrics. Our results also show that feeding the recordings from all electrodes as input makes it possible to exploit useful channel correlations to learn more robust predictive features, compared to convolutional models that treat each channel independently. We conclude that deep learning architectures hold promise for enhancing the diagnosis and prediction of epileptic seizures, offering potential benefits to those affected by such debilitating neurological conditions.
Neural Networks for Sequential Data: A Pre-training Approach based on Hidden Markov Models
In the last few years, research has highlighted the critical role of unsupervised pre-training strategies in improving the performance of artificial neural networks. However, the scope of existing pre-training methods is limited to static data, whereas many learning tasks require dealing with temporal information. We propose a novel approach to pre-training sequential neural networks that exploits a simpler, first-order Hidden Markov Model to generate an approximate distribution of the original dataset. The learned distribution is used to generate a smoothed dataset that is used for pre-training. In this way, it is possible to drive the connection weights into a better region of the parameter space, where subsequent fine-tuning on the original dataset can be more effective. This novel pre-training approach is model-independent and can be readily applied to different network architectures. The benefits of the proposed method, in terms of both accuracy and training time, are demonstrated on a prediction task using four datasets of polyphonic music. The flexibility of the proposed strategy is shown by applying it to two different recurrent neural network architectures, and we also empirically investigate the impact of different hyperparameters on the performance of the proposed pre-training strategy.
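The "smoothed dataset" step described in the abstract above can be illustrated with a minimal sketch. It assumes the HMM parameters have already been fitted to the original sequences and that each time step is a binary vector (for instance a piano-roll frame) emitted with independent per-state Bernoulli probabilities. The function names `sample_hmm` and `make_smoothed_dataset`, the parameter names `pi`, `A`, `B`, and the emission form are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def sample_hmm(pi, A, B, length, rng):
    """Sample one binary sequence from a first-order HMM.

    pi: (K,) initial state probabilities
    A:  (K, K) transition matrix, each row sums to 1
    B:  (K, D) per-state Bernoulli emission probabilities (assumed form)
    Returns the hidden state path (length,) and observations (length, D).
    """
    K, D = B.shape
    states = np.empty(length, dtype=int)
    obs = np.empty((length, D), dtype=np.int8)
    state = rng.choice(K, p=pi)
    for t in range(length):
        states[t] = state
        # Each of the D units is switched on independently with prob B[state, d].
        obs[t] = rng.random(D) < B[state]
        # First-order dynamics: the next state depends only on the current one.
        state = rng.choice(K, p=A[state])
    return states, obs


def make_smoothed_dataset(pi, A, B, n_sequences, length, seed=0):
    """Draw n_sequences samples from the HMM to use as pre-training data."""
    rng = np.random.default_rng(seed)
    return np.stack(
        [sample_hmm(pi, A, B, length, rng)[1] for _ in range(n_sequences)]
    )


if __name__ == "__main__":
    # Toy parameters standing in for an HMM fitted to the real data.
    pi = np.array([0.6, 0.4])
    A = np.array([[0.9, 0.1], [0.2, 0.8]])
    B = np.array([[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]])
    smoothed = make_smoothed_dataset(pi, A, B, n_sequences=5, length=16)
    print(smoothed.shape)
```

Under these assumptions, a sequential network would first be trained on the sampled array and then fine-tuned on the original sequences; both training steps depend on the chosen architecture and are left out of the sketch.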
A HMM-based pre-training approach for sequential data
Much recent research has highlighted the critical role of unsupervised pre-training in improving the performance of neural network models. However, extensions of those architectures to the temporal domain introduce additional issues, which often make it difficult to obtain good performance in a reasonable time. We propose a novel approach to pre-train sequential neural networks in which a simpler, approximate distribution generated by a linear model is first used to drive the weights into a better region of the parameter space. After this smooth distribution has been learned, the network is fine-tuned on the more complex real dataset. The benefits of the proposed method are demonstrated on a prediction task using two datasets of polyphonic music, and the general validity of this strategy is shown by applying it to two different recurrent neural network architectures.
An emergentist perspective on the origin of number sense
The finding that human infants and many other animal species are sensitive to numerical quantity has been widely interpreted as evidence for evolved, biologically determined numerical capacities across unrelated species, thereby supporting a ‘nativist’ stance on the origin of number sense. Here, we tackle this issue within the ‘emergentist’ perspective provided by artificial neural network models, and we build on computer simulations to discuss two different approaches to thinking about the innateness of number sense. The first, illustrated by artificial life simulations, shows that numerical abilities can be supported by domain-specific representations emerging from evolutionary pressure. The second assumes that numerical representations need not be genetically pre-determined but can emerge from the interplay between innate architectural constraints and domain-general learning mechanisms, instantiated in deep learning simulations. We show that deep neural networks endowed with basic visuospatial processing exhibit remarkable performance in numerosity discrimination before any experience-dependent learning, whereas unsupervised sensory experience with visual sets leads to subsequent improvement of number acuity and reduces the influence of continuous visual cues. The emergent neuronal code for numbers in the model includes both numerosity-sensitive (summation coding) and numerosity-selective response profiles, closely mirroring those found in monkey intraparietal neurons. We conclude that a form of innatism based on architectural and learning biases is a fruitful approach to understanding the origin and development of number sense.
This article is part of a discussion meeting issue ‘The origins of numerical abilities’.
Probabilistic Models and Generative Neural Networks: Towards an Unified Framework for Modeling Normal and Impaired Neurocognitive Functions
Assessment of sequential Boltzmann machines on a lexical processing task
The Recurrent Temporal Restricted Boltzmann Machine is a promising probabilistic model for processing temporal data. It has been shown to learn physical dynamics from videos (e.g. bouncing balls), but its ability to process sequential data has not been tested on symbolic tasks. Here we assess its capabilities on learning sequences of letters corresponding to English words. It emerged that the model is able to extract local transition rules between items of a sequence (i.e. English graphotactic rules), but it does not seem to be suited to encode a whole word.
Self-Communicating Deep Reinforcement Learning Agents Develop External Number Representations
Symbolic numbers are a remarkable product of human cultural development. The developmental process involved the creation and progressive refinement of material representational tools, such as notched tallies, knotted strings, and counting boards. In this paper, we introduce a computational framework that allows the investigation of how material representations might support number processing in a deep reinforcement learning scenario. In this framework, agents can use an external, discrete state to communicate information to solve a simple numerical estimation task. We find that different perceptual and processing constraints result in different emergent representations, whose specific characteristics can facilitate the learning and communication of numbers.
