Training multi-layer binary neural networks with random local binary error signals

Author: Roveri Manuel
Pittorino Fabrizio
Colombo Luca
Publication date: 01/01/2025

Binary neural networks (BNNs) significantly reduce computational complexity and memory usage in machine and deep learning by representing weights and activations with just one bit. However, most existing training algorithms for BNNs rely on quantization-aware floating-point stochastic gradient descent (SGD), limiting the full exploitation of binary operations to the inference phase only. In this work, we propose, for the first time, a fully binary and gradient-free training algorithm for multi-layer BNNs, eliminating the need for back-propagated floating-point gradients. Specifically, the proposed algorithm relies on local binary error signals and binary weight updates, employing integer-valued hidden weights that serve as a synaptic metaplasticity mechanism, thereby enhancing its neurobiological plausibility. Our proposed solution enables the training of binary multi-layer perceptrons by using exclusively XNOR, Popcount, and increment/decrement operations. Experimental results on multi-class classification benchmarks show test accuracy improvements of up to +35.47% over the only existing fully binary single-layer state-of-the-art solution. Compared to full-precision SGD, our solution improves test accuracy by up to +35.30% under the same total memory demand, while also reducing computational cost by two to three orders of magnitude in terms of the total number of Boolean gates. The proposed algorithm is made available to the scientific community as a public repository

Archivio istituzionale della ricerca - Politecnico di Milano
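
The abstract above states that training uses only XNOR, Popcount, and increment/decrement operations. The following is a minimal sketch of why XNOR and Popcount are enough to compute a dot product between two ±1 vectors; it is not the authors' implementation. The helper names `pack_bits` and `binary_dot`, and the encoding (bit 1 for +1, bit 0 for -1), are illustrative assumptions.

```python
def pack_bits(signs):
    """Pack a sequence of +1/-1 values into an integer (bit i set means +1)."""
    word = 0
    for i, s in enumerate(signs):
        if s not in (1, -1):
            raise ValueError("entries must be +1 or -1")
        if s == 1:
            word |= 1 << i
    return word


def binary_dot(a_word, b_word, n):
    """Dot product of two packed +1/-1 vectors of length n.

    XNOR marks the positions where the two vectors agree; each agreement
    contributes +1 and each disagreement -1, so dot = 2 * popcount(xnor) - n.
    """
    mask = (1 << n) - 1                # keep only the n valid bits
    agree = ~(a_word ^ b_word) & mask  # XNOR restricted to n bits
    popcount = bin(agree).count("1")   # number of agreeing positions
    return 2 * popcount - n


a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
result = binary_dot(pack_bits(a), pack_bits(b), len(a))
reference = sum(x * y for x, y in zip(a, b))  # ordinary integer dot product
```

Here `result` and `reference` agree, which is the identity a binary layer can rely on to replace multiply-accumulate operations with bitwise ones.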

Quantifying Cryptocurrency Unpredictability: A Comprehensive Study of Complexity and Forecasting

Author: Puoti Francesco
Roveri Manuel
Pittorino Fabrizio
Publication date: 01/01/2025

This paper offers a thorough examination of univariate predictability in cryptocurrency time series. By combining complexity measures and model predictions, we explore the cryptocurrency time-series forecasting task, focusing on the USD exchange rates of Litecoin, Binance Coin, Bitcoin, Ethereum, and XRP. On the one hand, to assess the complexity and randomness of these time series, we perform a comparative analysis using Brownian and colored noise as benchmarks. The results obtained from the Complexity-Entropy causality plane and power density spectrum analysis reveal that cryptocurrency time series closely resemble Brownian noise when analyzed in a univariate context. On the other hand, applying a wide range of statistical, machine learning, and deep learning models for time-series forecasting demonstrates the low predictability of cryptocurrencies. Notably, our analysis reveals that simpler models such as Naive models consistently outperform the more complex machine and deep learning ones in forecasting accuracy across different forecast horizons and time windows. The combined study of complexity and forecasting accuracy highlights the difficulty of predicting the cryptocurrency market. These findings provide valuable insights into the inherent characteristics of cryptocurrency data and highlight the need to reassess the challenges associated with predicting cryptocurrency price movements.

Archivio istituzionale della ricerca - Politecnico di Milano
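
The abstract above reports that Naive models outperform more complex ones but does not specify which variant is used. As a sketch under the assumption that "Naive" means a last-value (persistence) forecast, the baseline and an error metric can be written as follows. The function names and the toy numbers are hypothetical and are not taken from the paper.

```python
def naive_forecast(series, horizon):
    """Persistence forecast: repeat the last observed value `horizon` times."""
    if not series:
        raise ValueError("series must contain at least one observation")
    return [series[-1]] * horizon


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Toy data for illustration only (not real exchange rates).
history = [100.0, 102.0, 101.0]
future = [103.0, 99.0]

forecast = naive_forecast(history, len(future))
error = mean_absolute_error(future, forecast)
```

For a series that behaves like a random walk, the last observed value is the best predictor of the next one in expectation, which is consistent with the abstract's finding that the series resemble Brownian noise.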

Deep learning via message passing algorithms based on belief propagation

Author: Perugini Gabriele
Pittorino Fabrizio
Zecchina Riccardo
Lucibello Carlo
Publication date: 01/01/2022

Message-passing algorithms based on the Belief Propagation (BP) equations constitute a well-known distributed computational scheme. It is exact on tree-like graphical models and has also proven to be effective in many problems defined on graphs with loops (from inference to optimization, from signal processing to clustering). The BP-based scheme is fundamentally different from stochastic gradient descent (SGD), on which the current success of deep networks is based. In this paper, we present and adapt to mini-batch training on GPUs a family of BP-based message-passing algorithms with a reinforcement field that biases distributions towards locally entropic solutions. These algorithms are capable of training multi-layer neural networks with discrete weights and activations with performance comparable to SGD-inspired heuristics (BinaryNet) and are naturally well-adapted to continual learning. Furthermore, using these algorithms to estimate the marginals of the weights allows us to make approximate Bayesian predictions that have higher accuracy than point-wise solutions

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

Archivio istituzionale della Ricerca - Bocconi

Impact of dendritic non-linearities on the computational capabilities of neurons

Author: Lauditi Clarissa
Malatesta Enrico M.
Pittorino Fabrizio
Baldassi Carlo
Zecchina Riccardo
Brunel Nicolas
Publication date: 01/01/2025

These nonlinearities have motivated mathematical descriptions of single neurons as two-layer computational units, which have been shown to substantially increase the computational abilities of neurons compared to linear dendritic integration. However, current analytical studies are restricted to neurons with unconstrained synaptic weights and implausible dendritic nonlinearities. Here we introduce a two-layer model with sign-constrained synaptic weights and a biologically plausible form of dendritic nonlinearity and investigate its properties using both statistical physics methods and numerical simulations. We find that the dendritic nonlinearity enhances both the number of possible learned input-output associations and the learning speed. We characterize how capacity and learning speed depend on the implemented nonlinearity and the levels of dendritic and somatic inhibition. We calculate analytically the distribution of synaptic weights in networks close to maximal capacity and find that the dendritic nonlinearity increases the fraction of zero-weight (“silent” or “potential”) synapses, compared with the standard perceptron model, when no or weak robustness constraints are present, while the opposite occurs with strong robustness constraints. We test our model on standard real-world benchmark datasets and observe empirically that the nonlinearity improves generalization performance and makes it possible to capture more complex input-output relations than the perceptron model.

Archivio istituzionale della Ricerca - Bocconi

Entropic gradient descent algorithms and wide flat minima

Author: Pittorino Fabrizio
Baldassi Carlo
Zecchina Riccardo
Lucibello Carlo
Feinauer Christoph
Perugini Gabriele
Demyanenko Elizaveta
Publication date: 01/01/2021

No abstract available

Archivio istituzionale della Ricerca - Bocconi

Shaping the learning landscape in neural networks around wide flat minima

Author: Baldassi Carlo
Zecchina Riccardo
Pittorino Fabrizio
Publication date: 01/01/2020

Learning in deep neural networks takes place by minimizing a nonconvex high-dimensional loss function, typically by a stochastic gradient descent (SGD) strategy. The learning process is observed to be able to find good minimizers without getting stuck in local critical points and such minimizers are often satisfactory at avoiding overfitting. How these 2 features can be kept under control in nonlinear devices composed of millions of tunable connections is a profound and far-reaching open question. In this paper we study basic nonconvex 1- and 2-layer neural network models that learn random patterns and derive a number of basic geometrical and algorithmic features which suggest some answers. We first show that the error loss function presents few extremely wide flat minima (WFM) which coexist with narrower minima and critical points. We then show that the minimizers of the cross-entropy loss function overlap with the WFM of the error loss. We also show examples of learning devices for which WFM do not exist. From the algorithmic perspective we derive entropy-driven greedy and message-passing algorithms that focus their search on wide flat regions of minimizers. In the case of SGD and cross-entropy loss, we show that a slow reduction of the norm of the weights along the learning process also leads to WFM. We corroborate the results by a numerical study of the correlations between the volumes of the minimizers, their Hessian, and their generalization performance on real data

Archivio istituzionale della ricerca - Politecnico di Milano

Archivio istituzionale della Ricerca - Bocconi

Deep Networks on Toroids: Removing Symmetries Reveals the Structure of Flat Regions in the Landscape Geometry

Author: Ferraro Antonio
Perugini Gabriele
Pittorino Fabrizio
Baldassi Carlo
Zecchina Riccardo
Feinauer Christoph
Publication date: 01/01/2022

We systematize the approach to the investigation of deep neural network landscapes by basing it on the geometry of the space of implemented functions rather than the space of parameters. Grouping classifiers into equivalence classes, we develop a standardized parameterization in which all symmetries are removed, resulting in a toroidal topology. On this space, we explore the error landscape rather than the loss. This lets us derive a meaningful notion of the flatness of minimizers and of the geodesic paths connecting them. Using different optimization algorithms that sample minimizers with different flatness we study the mode connectivity and relative distances. Testing a variety of state-of-the-art architectures and benchmark datasets, we confirm the correlation between flatness and generalization performance; we further show that in function space flatter minima are closer to each other and that the barriers along the geodesics connecting them are small. We also find that minimizers found by variants of gradient descent can be connected by zero-error paths composed of two straight lines in parameter space, i.e. polygonal chains with a single bend. We observe similar qualitative results in neural networks with binary weights and activations, providing one of the first results concerning the connectivity in this setting. Our results hinge on symmetry removal, and are in remarkable agreement with the rich phenomenology described by some recent analytical studies performed on simple shallow models

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

Archivio istituzionale della Ricerca - Bocconi

Chaos and Correlated Avalanches in Excitatory Neural Networks with Synaptic Plasticity

Author: Ibáñez Berganza Miguel
Pittorino Fabrizio
Burioni Raffaella
di Volo Matteo
Vezzani Alessandro
Publication date: 01/01/2017

A collective chaotic phase with power law scaling of activity events is observed in a disordered mean field network of purely excitatory leaky integrate-and-fire neurons with short-term synaptic plasticity. The dynamical phase diagram exhibits two transitions from quasisynchronous and asynchronous regimes to the nontrivial, collective, bursty regime with avalanches. In the homogeneous case without disorder, the system synchronizes and the bursty behavior is reflected into a period doubling transition to chaos for a two dimensional discrete map. Numerical simulations show that the bursty chaotic phase with avalanches exhibits a spontaneous emergence of persistent time correlations and enhanced Kolmogorov complexity. Our analysis reveals a mechanism for the generation of irregular avalanches that emerges from the combination of disorder and deterministic underlying chaotic dynamics

Crossref

Archivio istituzionale della Ricerca - Università degli Studi di Parma

Dinamica complessa emergente in reti neurali con plasticità sinaptica (Emergent complex dynamics in neural networks with synaptic plasticity)

Author: Pittorino Fabrizio
Publication date: 2017

This thesis studies the dynamical regimes that emerge in a neural network in the presence of short-term synaptic plasticity. In particular, the aim is to characterize and study the collective regimes of synchronization, chaos, and criticality. Thanks to the measures developed in the thesis, it has been possible to draw with great precision the previously unknown phase diagram of the leaky integrate-and-fire single-neuron model coupled with the Tsodyks-Uziel-Markram model for short-term synaptic plasticity, on both a mean-field and a disordered topology. It has also been possible to elucidate (also analytically, by reducing the dynamics to a few simple coupled equations) the mechanism by which the model becomes chaotic in the mean-field phase, and by which it preserves chaos and generates power-law distributed avalanches in the disordered topology.
