Experimental Comparison of Theory-Guided Deep Learning Algorithms
The enrichment of machine learning models with domain knowledge has a growing impact on modern engineering and physics problems. This trend stems from the fact that the rise of deep learning algorithms is closely associated with an increasing demand for data that is not acceptable or available in many use cases. In this context, the incorporation of physical knowledge or a-priori constraints has been shown to be beneficial in many tasks. On the other hand, this collection of approaches is context-specific, and it is difficult to generalize them to new problems. In this paper, we experimentally compare some of the most widely used theory injection strategies to perform a systematic analysis of their advantages. Selected state-of-the-art algorithms have been reproduced for different use cases to evaluate their effectiveness with smaller training data and to discuss how the underlying strategies can fit into new application contexts.
A Comparative Study of Neural Ordinary Differential Equations and Neural Operators for Modeling Temporal Dynamics
Capturing the dynamics of relational systems is a key challenge in the natural sciences, with applications ranging from simulating molecular interactions to analyzing particle mechanics. Machine learning approaches have made significant progress in this area by using graph neural networks to learn and visualize spatial interactions effectively. Neural ordinary differential equations (Neural ODEs) and neural operators (NO) represent two distinct paradigms. However, a clear comparative understanding of when to prefer one over the other is still lacking. To address this gap, we present the first systematic comparison between two representative architectures: EGNO (Equivariant Graph Neural Operator) and SEGNO (Second-order Equivariant Graph Neural Ordinary Differential Equation). Through a series of experiments, we investigate their strengths and limitations across various simulation scenarios in multi-step trajectory prediction tasks. Specifically, we employ rollout strategies and different input/output configurations, including multiple and irregularly sampled time steps. Our findings highlight a key trade-off between precision and stability that is central to model selection. SEGNO demonstrates superior robustness and stability over long prediction horizons, making it well-suited for tasks requiring reliable long-term forecasting. Conversely, EGNO offers higher precision during the early stages of the trajectory and better leverages diverse training configurations, thanks to its discretization-invariant design. In summary, Neural Operators (EGNO) are preferable when short-term accuracy and data efficiency are critical, while Neural ODEs (SEGNO) are advantageous for scenarios demanding stable long-term predictions. This work not only clarifies the practical advantages of each approach but also lays the groundwork for informed model selection and future hybrid strategies in dynamical system modeling.
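The rollout strategy mentioned in the abstract can be illustrated with a minimal sketch. It assumes a generic one-step predictor whose output is fed back as the next input; `rollout` and `toy_step` are hypothetical names, and the toy step (a semi-implicit Euler update of a unit harmonic oscillator) merely stands in for a learned model such as EGNO or SEGNO, whose actual interfaces are not described here.

```python
def rollout(step_fn, initial_state, n_steps):
    """Autoregressive rollout: feed each prediction back in as the next input."""
    trajectory = []
    state = initial_state
    for _ in range(n_steps):
        state = step_fn(state)
        trajectory.append(state)
    return trajectory


def toy_step(state, dt=0.1):
    """Hypothetical stand-in for a learned one-step model.

    Advances a unit harmonic oscillator (acceleration = -position)
    with a semi-implicit Euler update.
    """
    position, velocity = state
    velocity = velocity - dt * position
    position = position + dt * velocity
    return (position, velocity)


if __name__ == "__main__":
    states = rollout(toy_step, (1.0, 0.0), n_steps=5)
    for position, velocity in states:
        print(f"{position:.4f} {velocity:.4f}")
```

Because every step consumes the previous prediction, one-step errors compound over the horizon, which is why long-horizon stability and early-step precision can diverge between models.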
Hermes, a low-latency transactional storage for binary data streams from remote devices
In many contexts where data is streamed on a large scale, such as video surveillance systems, there is a dual requirement: secure data storage and continuous access to audio and video content by third parties, such as human operators or specific business logic, even while the media files are still being collected. However, using transactions to ensure data persistence often limits system throughput and latency. This paper presents a solution that enables both high ingestion rates with transactional data persistence and near real-time, low-latency access to the stream during collection. This immediate access enables the prompt application of specialized data engineering algorithms during data acquisition. The proposed solution is particularly suitable for binary data sources such as audio and video recordings in surveillance systems, and it can be extended to various big data scenarios via well-defined general interfaces. The scalability of the approach is based on the microservice architecture. Preliminary results obtained with Apache Kafka and MongoDB replica sets show that the proposed solution provides up to 3 times higher throughput and 2.2 times lower latency compared to standard multi-document transactions
Prediction of coffee consumption using Graph Neural Networks and Explainable AI
Accurately forecasting regional sales in heterogeneous locations presents a complex challenge that extends beyond the capabilities of traditional predictive models. In this study, we focus on predicting coffee sales for a local coffee company in Italy by integrating machine learning techniques with graph-based deep learning models. We begin by establishing a baseline using a Multi-Layer Perceptron (MLP) and subsequently apply six Graph Neural Network (GNN) architectures (GCN, GAT, GraphSAGE, GIN, ChebNet, and GraphConv) to capture spatial dependencies among distribution points. To enhance model interpretability and guide feature selection, we incorporate Integrated Gradients from the Explainable AI (XAI) framework. Experimental results demonstrate that GNNs consistently outperform the MLP baseline, particularly in capturing location-driven relational patterns. In particular, GraphSAGE and ChebNet outperformed the other architectures. The integration of graph-based modeling with interpretable learning provides valuable insights for optimizing sales strategies in geographically distributed markets.
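As a rough illustration of the Integrated Gradients attribution named in the abstract, the sketch below approximates the path integral of the gradient from a baseline to the input with a midpoint Riemann sum. The function `toy_model` and its hand-written gradient are hypothetical placeholders, not the sales models of the study; with a real network the gradient would come from automatic differentiation.

```python
def integrated_gradients(grad_fn, x, baseline, n_steps=50):
    """Approximate Integrated Gradients with a midpoint Riemann sum.

    grad_fn(point) must return the gradient of the model output
    with respect to each input feature at `point`.
    """
    avg_grads = [0.0] * len(x)
    for k in range(n_steps):
        alpha = (k + 0.5) / n_steps
        # Point on the straight line from the baseline to the input.
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(grad_fn(point)):
            avg_grads[i] += g / n_steps
    # Scale the averaged gradient by the input-baseline difference.
    return [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grads)]


def toy_model(x):
    """Hypothetical differentiable model used only for illustration."""
    return x[0] * x[1] + x[2] ** 2


def toy_model_grad(x):
    """Analytic gradient of toy_model."""
    return [x[1], x[0], 2.0 * x[2]]


if __name__ == "__main__":
    x, baseline = [1.0, 2.0, 3.0], [0.0, 0.0, 0.0]
    attributions = integrated_gradients(toy_model_grad, x, baseline)
    # Completeness: attributions sum to model(x) - model(baseline).
    print(attributions, sum(attributions), toy_model(x) - toy_model(baseline))
```

The completeness property checked at the end is what makes the per-feature scores usable for ranking features: they add up to the change in the model output between the baseline and the input.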
Combining fault-tolerant persistence and low-latency streaming access to binary data for AI models
In many AI-enabled scenarios, such as video surveillance systems, besides requiring the data to be stored safely, human operators and AI models must also be able to access audio and video streams continuously while media files are still being collected. However, system throughput and latency are often limited by the use of transactionality to guarantee data persistence. This paper presents a solution providing both high ingestion rates with transactional data persistence and low-latency access to the stream during collection in near real-time. This enables the AI algorithms to be immediately applied as soon as the data is received. The binary data sources fit well with the audio and video capture of surveillance or similar systems, but the proposed solution can be extended through well-defined general interfaces. The scalability of the proposed approach is based on the microservice architecture. Using Apache Kafka and MongoDB replica sets, preliminary results show that the proposed solution provides up to 6 times larger throughput and 4.5 times lower latency than current standard multi-document transactions
Lorentz-invariant augmentation for high-energy physics
In recent years, machine learning models for jet tagging in high-energy physics have gained considerable attention. However, many existing approaches overlook the physical invariants that jets must adhere to, particularly the fundamental spacetime symmetry governed by Lorentz transformations.
In this study, we propose a model-agnostic training strategy that incorporates theory-guided data augmentation to simulate the effects of Lorentz transformations on jet data. We specifically focus on the state-of-the-art baseline ParticleNet, a neural network architecture designed for the direct processing of particle clouds for jet tagging. To evaluate the effectiveness of our approach, we conduct experiments with different augmentation strategies and assess the performance of the augmented models on the widely used top-tagging reference dataset.
The results show that even a small amount of data augmentation increases the robustness of the model to Lorentz boost attacks, i.e., transformations with a high boost parameter β.
While the accuracy of the baseline model decreases rapidly with increasing intensity of the transformation β, the augmented models exhibit more stable performance.
Remarkably, models that underwent a moderate level of augmentation demonstrated a statistically significant performance boost on transformations beyond those seen at training time. This finding highlights the potential of the data augmentation strategy in enhancing model accuracy while preserving the essential physical properties of the jets.
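A minimal sketch of what a Lorentz-boost augmentation could look like is given below. It assumes each jet constituent is a four-momentum tuple (E, px, py, pz) in natural units, that the boost acts along the z axis, and that β is drawn uniformly from a symmetric range; none of these choices is stated in the abstract, and the function names are hypothetical.

```python
import math
import random


def lorentz_boost_z(particle, beta):
    """Boost one four-momentum (E, px, py, pz) along z with velocity beta (units of c)."""
    if not -1.0 < beta < 1.0:
        raise ValueError("beta must satisfy |beta| < 1")
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    energy, px, py, pz = particle
    return (
        gamma * (energy - beta * pz),
        px,
        py,
        gamma * (pz - beta * energy),
    )


def invariant_mass_sq(particle):
    """Squared invariant mass E^2 - |p|^2, unchanged by any Lorentz boost."""
    energy, px, py, pz = particle
    return energy ** 2 - px ** 2 - py ** 2 - pz ** 2


def augment_jet(jet, max_beta, rng):
    """Apply one randomly drawn boost to every constituent of a jet.

    Assumption: beta is sampled uniformly in [-max_beta, max_beta].
    """
    beta = rng.uniform(-max_beta, max_beta)
    return [lorentz_boost_z(p, beta) for p in jet]


if __name__ == "__main__":
    jet = [(5.0, 1.0, 2.0, 3.0), (2.0, 0.5, 0.5, 1.0)]
    boosted = augment_jet(jet, max_beta=0.5, rng=random.Random(0))
    for before, after in zip(jet, boosted):
        print(invariant_mass_sq(before), invariant_mass_sq(after))
```

The same boost is applied to all constituents of a jet so that the jet as a whole is transformed consistently; the invariant mass check confirms that the augmentation changes the frame but not the underlying physics.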
A Model-based Curriculum Learning Strategy for Training SegFormer
The use of Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) in computer vision has opened up new research directions in this area. However, a significant drawback of these models is the large amount of data required to obtain competitive results. This critical issue limits their application in domains where large labeled data collections are unavailable. Some strategies have been proposed to use relatively limited labeled data sets to train CNN-based models. Curriculum learning is one of the currently available strategies to train deep learning models faster and with less data. However, to our knowledge, curriculum learning techniques have never been used at the model level to support ViT training for semantic segmentation. We propose a new curriculum learning technique tailored to ViT models to fill this gap. The results show the effectiveness of the proposed strategy in training ViT models from scratch to solve the semantic segmentation task.
Correlating Espresso Quality with Coffee-Machine Parameters by Means of Association Rule Mining
Coffee is among the most popular beverages in many cities all over the world, being both at the core of the busiest shops and a long-standing tradition of recreational and social value for many people. Among the many coffee variants, espresso attracts the interest of different stakeholders: from citizens consuming espresso around the city, to local business activities, coffee-machine vendors and international coffee industries. The quality of espresso is one of the most discussed and investigated issues. So far, it has been addressed by means of human experts, electronic noses, and chemical approaches. The current work, instead, proposes a data-driven approach exploiting association rule mining. We analyze a real-world dataset of espresso brewing by professional coffee-making machines, and extract all correlations among external quality-influencing variables and actual metrics determining the quality of the espresso. Thanks to the application of association rule mining, a powerful, exhaustive, and explainable data-driven approach, results are expressed in the form of human-readable rules combining the variables of interest, such as the grinder settings, the extraction time, and the dose amount. Novel insights from real-world coffee extractions collected in the field are presented, together with a data-driven approach able to uncover insights into espresso quality and its impact on both the life of consumers and the choices of coffee-making industries.
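To make the notion of human-readable rules concrete, the following sketch computes support, confidence, and lift for rules with one or two antecedent items. The five transactions are invented toy records, not data from the study, and the brute-force enumeration is an assumption made for brevity; practical mining would typically rely on an algorithm such as Apriori or FP-Growth.

```python
from itertools import combinations


def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def mine_rules(transactions, min_support, min_confidence):
    """Enumerate rules `antecedent -> consequent` with 1 or 2 antecedent items."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    rules = []
    for size in (1, 2):
        for antecedent in combinations(items, size):
            supp_antecedent = support(transactions, antecedent)
            if supp_antecedent == 0:
                continue
            for consequent in items:
                if consequent in antecedent:
                    continue
                supp_rule = support(transactions, antecedent + (consequent,))
                if supp_rule < min_support:
                    continue
                confidence = supp_rule / supp_antecedent
                if confidence >= min_confidence:
                    lift = confidence / support(transactions, (consequent,))
                    rules.append((antecedent, consequent, supp_rule, confidence, lift))
    return rules


# Invented toy transactions (NOT data from the study): each record is a set
# of discretized brewing variables plus a quality label.
TOY_TRANSACTIONS = [
    {"grind=fine", "dose=high", "quality=good"},
    {"grind=fine", "dose=high", "quality=good"},
    {"grind=coarse", "dose=low", "quality=poor"},
    {"grind=fine", "dose=low", "quality=good"},
    {"grind=coarse", "dose=high", "quality=poor"},
]

if __name__ == "__main__":
    for antecedent, consequent, supp, conf, lift in mine_rules(
        TOY_TRANSACTIONS, min_support=0.4, min_confidence=0.9
    ):
        print(f"{set(antecedent)} -> {consequent}: "
              f"support={supp:.2f}, confidence={conf:.2f}, lift={lift:.2f}")
```

Each rule reads as "when the antecedent holds, the consequent tends to hold", with confidence measuring how often and lift measuring how much more often than chance.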
Trusting deep learning natural-language models via local and global explanations
Despite the high accuracy offered by state-of-the-art deep natural-language models (e.g., LSTM, BERT), their application in real-life settings is still widely limited, as they behave like a black box to the end-user. Hence, explainability is rapidly becoming a fundamental requirement of future-generation data-driven systems based on deep-learning approaches. Several attempts to bridge the existing gap between accuracy and interpretability have been made. However, robust and specialized eXplainable Artificial Intelligence solutions, tailored to deep natural-language models, are still missing. We propose a new framework, named T-EBANO, which provides innovative prediction-local and class-based model-global explanation strategies tailored to deep learning natural-language models. Given a deep NLP model and the textual input data, T-EBANO provides an objective, human-readable, domain-specific assessment of the reasons behind the automatic decision-making process. Specifically, the framework extracts sets of interpretable features by mining the inner knowledge of the model. Then, it quantifies the influence of each feature during the prediction process by exploiting the normalized Perturbation Influence Relation index at the local level and the novel Global Absolute Influence and Global Relative Influence indexes at the global level. The effectiveness and the quality of the local and global explanations obtained with T-EBANO are demonstrated on an extensive set of experiments addressing different tasks, such as a sentiment-analysis task performed by a fine-tuned BERT model and a toxic-comment classification task performed by an LSTM model. The quality of the explanations proposed by T-EBANO, and, specifically, the correlation between the influence index and human judgment, has been evaluated by humans in a survey with more than 4000 judgments.
To demonstrate the generality of T-EBANO and its model/task-independent methodology, experiments with other models (ALBERT, ULMFiT) on popular public datasets (AG News and CoLA) are also discussed in detail.
