HAL-CentraleSupelec
77624 research outputs found
Semantic Parsing for Evaluating Large Language Models: Separating Linguistic Abilities with YARN
International audience. We evaluate large language models (LLMs) through semantic parsing into Yarn, a structured meaning representation that distinguishes predicate-argument structure from higher-level linguistic features such as tense, aspect, and modality. For evaluation, we employ SmatchY, a fine-grained metric designed to assess different layers of meaning independently. Our experiments test multiple LLMs under varied conditions, including inference modes, linearization formats (JSON and logic-inspired CFG), and the presence or absence of auxiliary supervision via partial semantic parses. Results show that model performance is highly sensitive to both representational design and supervision, with no single configuration consistently outperforming the others. While some models gain from additional semantic information in prompts, others are negatively affected. A layer-wise analysis indicates that surface-level features such as temporality and negation are captured more reliably than deeper semantic phenomena like quantification. Consistent with prior work, our findings highlight the limited capacity of current LLMs to generate fully formal meaning representations
In the Search for Truth: Refining and Exploring Variability in Neuroimaging Pipelines
Neuroimaging pipelines can be designed as highly configurable software workflows whose variants may produce different analysis results and scientific conclusions. Managing this analytical variability is particularly challenging because outputs are complex, high-dimensional brain maps and no ground truth exists to assess their correctness. We model neuroimaging pipelines as a software product line and leverage the resulting feature model in a novel two-step process. First, we refine the configuration space by learning constraints from observed outputs and integrating them into the feature model. Second, we explore the refined space to identify influential features, characterize clusters, and predict unseen configurations. Evaluations on two case studies show that a small subset of features explains most observed variability and that accurate predictions can be achieved from a limited number of executed variants. They also demonstrate the complementarity of the two steps: refining the configuration space by excluding undesirable variants improves the interpretability and predictive power of subsequent exploration. Beyond neuroimaging, our results demonstrate how feature models can support both specialization and structured exploration of complex configuration spaces without ground truth
Taxonomy of Interaction Techniques to Mitigate Inappropriate Social Interactions in Social Virtual Reality
International audience. Social Virtual Reality (SVR) enables users to embody avatars and interact with others in shared virtual environments. On platforms such as Rec Room, VRChat, and Meta Horizon Worlds, inappropriate social behaviours are common and can lead to harmful emotional experiences for users. To prevent or mitigate these negative effects, SVR platforms offer a range of actions, referred to as safety tools, that allow users to control the information and interactions they are exposed to. Although prior research has analysed and proposed initial classifications of these tools, a structured, theory-driven approach to their characterization is still lacking. In this paper, we propose a taxonomy of safety tools for SVR platforms inspired by existing literature in social psychology and human-computer interaction. In order to characterise these tools, we focus on the area of social interaction, which occurs between a sender and a receiver, within a social context of the SVR medium. Our approach not only provides a model of social interaction in SVR, but also describes how safety tools function, including the processes of selecting and manipulating them, factors that are critical to their effectiveness. Finally, we apply our framework to characterise existing safety tools, identify their limitations, and present design guidelines for future SVR platforms
CVE-LMTune: Tuning LMs for Multi-Taxonomy Vulnerability Classification
CVE-LMTune is a tool that provides an automated pipeline for collecting and preprocessing labeled vulnerability data from multiple sources, as well as for constructing a unified dataset for fine-tuning and evaluating language models in multi-label classification across CWE, CAPEC, and MITRE ATT&CK. The tool integrates both fine-tuning and evaluation pipelines and offers advanced support for hierarchical classification. In particular, it enables a hierarchical approach based on a DAG of specialized models, where each model handles a specific taxonomy level, with the goal of improving overall classification performance
Modeling washboard patterns on unpaved roads through transport dynamics
International audience
Privacy Meets Regulations: Shaping the Future of Work
International audience. Crowdworking platforms enable diverse workers to execute tasks for various requesters, contributing to the growth of the gig economy and the emergence of competing and complementary independent platforms. This has led to the development of multi-platform crowdworking systems, where workers and requesters often engage with multiple platforms. Recently, there has been an increasing interest among governmental, legal, and social institutions in enforcing regulations, such as minimum and maximum work hours, on these platforms. Consequently, collaboration among platforms within multi-platform systems is essential to enforce these cross-platform regulations effectively. However, while such collaboration necessitates the transparent sharing of information regarding tasks and participants, it is crucial to preserve the privacy of all involved participants. This paper outlines a vision for regulating, preserving privacy, and structuring future multi-platform crowdworking environments. We propose a potential instance of a multi-platform crowdworking system capable of enforcing a significant subset of practical global regulations across distributed independent platforms while preserving privacy through the use of lightweight anonymous tokens and fault-tolerant protocols
How important are inter-dataset interactions for large scale analysis of fMRI data: A multi-dimensional comparison
International audience. Analyzing multi-subject functional magnetic resonance imaging (fMRI) data requires methods that can jointly capture shared and individual patterns of brain activity across participants. Joint blind source separation (JBSS) techniques, such as independent vector analysis (IVA), provide a principled framework for this purpose by modeling dependencies across subjects while identifying distinct functional networks. Constrained IVA variants, including adaptive-reverse cIVA-G (ar-cIVA-G) and threshold-free cIVA-G (tf-cIVA-G), further enhance interpretability through the use of reference templates and inter-subject correlation constraints. Alternatively, regression-based methods like IVA-G regression (regIVA-G) and reference-guided component analysis (RGCA) process subjects individually, aligning their components to references with improved computational efficiency. Despite their potential, systematic evaluations of reference-based JBSS approaches for fMRI analysis remain limited. In this work, we present a comparative study of these methods to assess their capacity for identifying schizophrenia-related biomarkers using real fMRI data from subjects with schizophrenia and healthy controls. Our results demonstrate that both constrained IVA and regression-based methods effectively extract meaningful biomarkers, while the latter achieve comparable performance at substantially reduced computational cost
Demystifying Performer Attention: Handle Genome-Length Sequences Efficiently
https://thekhair.github.io/ This document provides a comprehensive tutorial on attention mechanisms, starting from the fundamental self-attention mechanism and progressing to the efficient Performer attention. We explain all mathematical concepts with clarity, using gene sequence analysis as a motivating example throughout. The document includes step-by-step explanations, comparative analyses, practical examples, and complete PyTorch implementation code for Performer attention. All concepts are presented in an accessible manner suitable for both beginners and experienced practitioners in machine learning and computational biology
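As a rough illustration of the idea behind Performer attention, the sketch below approximates softmax attention with positive random features so that the key/value summaries are computed once and reused for every query, giving cost linear in sequence length. It is not the tutorial's PyTorch code: it is a dependency-free Python sketch, and the function names (`positive_features`, `performer_attention`), the feature count, and the use of independent Gaussian projections (rather than orthogonal ones) are all assumptions made for illustration.

```python
import math
import random


def positive_features(x, omegas):
    """Positive random features phi(x) with E[phi(q) . phi(k)] = exp(q . k)."""
    half_sq_norm = 0.5 * sum(v * v for v in x)
    m = len(omegas)
    return [
        math.exp(sum(w_i * x_i for w_i, x_i in zip(w, x)) - half_sq_norm) / math.sqrt(m)
        for w in omegas
    ]


def performer_attention(queries, keys, values, num_features=64, seed=0):
    """Linear-time approximation of softmax(Q K^T / sqrt(d)) V (illustrative sketch)."""
    rng = random.Random(seed)
    d = len(queries[0])
    d_v = len(values[0])
    scale = d ** -0.25  # splits the 1/sqrt(d) factor between queries and keys
    # Hypothetical choice: i.i.d. Gaussian projection vectors.
    omegas = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(num_features)]
    q_feats = [positive_features([scale * x for x in q], omegas) for q in queries]
    k_feats = [positive_features([scale * x for x in k], omegas) for k in keys]

    # Key/value summaries, computed once and shared by all queries.
    kv = [
        [sum(kf[f] * v[c] for kf, v in zip(k_feats, values)) for c in range(d_v)]
        for f in range(num_features)
    ]
    k_sum = [sum(kf[f] for kf in k_feats) for f in range(num_features)]

    out = []
    for qf in q_feats:
        denom = sum(a * b for a, b in zip(qf, k_sum))  # normalizer, always positive
        out.append(
            [sum(qf[f] * kv[f][c] for f in range(num_features)) / denom for c in range(d_v)]
        )
    return out
```

Because every feature is positive, each output row is a convex combination of the value rows, mirroring the behavior of exact softmax attention; how closely the weights match softmax depends on the number of random features.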
Analyse Structurelle des changements de mode dans les DAE multimodes (Structural Analysis of Mode Changes in Multimode DAEs)
Hybrid systems are an important concept in Cyber-Physical Systems modeling, for which multiphysics modeling from first principles and the reuse of models from libraries are key. To achieve this, DAEs must be used to specify the dynamics in each discrete state (or mode in our context). This led to the development of DAE-based equational languages supporting multiple modes, of which Modelica is a popular standard. Mode switching can be time- or state-based. Impulsive behaviors can occur at mode changes. While mode changes are well understood in particular physics (e.g., contact mechanics), this is not the case in physics-agnostic paradigms such as Modelica. This situation causes difficulties for the compilation of programs, often requiring users to manually "smooth out" mode changes. In this paper, we propose a novel approach for the hot restart at mode changes in such paradigms. We propose a mathematical meaning for hot restarts (such a mathematical meaning does not exist in general), as well as a combined structural-and-impulse analysis for mode changes, generating the hot restart even in the presence of impulses. Our algorithm detects at compile time if the mode change is insufficiently specified, in which case it returns diagnostic information to the user.
Modeling cyber-physical systems relies on modeling from the principles of physics while reusing predefined library models as much as possible. This requires differential-algebraic equations (DAEs) that admit several modes (a switched DAE, or a hybrid DAE). The modeling standard is the Modelica language. Mode changes can be triggered externally or by conditions on the states. These mode changes are known and handled within particular physics (mechanics with contacts). The situation is different in a general multi-physics setting, which is nonetheless that of Modelica and other multi-physics modeling languages. In this paper, we propose a new approach for the hot restart following a mode change. Note that there is no mathematical definition of what a solution is in our general setting. Our method uses a structural analysis coupled with a symbolic computation of impulsive behaviors. It applies at compile time and makes it possible to detect, before any simulation, whether the submitted model is possibly insufficiently specified
A Floyd-Warshall Approach to Value Computation in Markov Decision Processes (Extended Version)
International audience. Value and policy iteration are classical algorithms to maximize the average discounted reward of an MDP. They rely on a breadth-first exploration strategy in the future of each state to update its value and possibly change the action policy at this state. This paper revisits this paradigm and examines a depth-first search strategy. It reformulates the average reward computation as an integral over (future) paths that is better expressed in the formalism of weighted automata. Policy evaluation can then be solved by a Floyd-Warshall algorithm, which gathers at once the rewards along possibly infinite runs. This reformulation opens the way to new approximation schemes for the value function. The same formalism also gives access to other quantities of interest, such as the gradient of the average reward with respect to model or policy parameters, or the variance of the reward. The behavior and performance of this value estimation scheme are illustrated on several benchmarks
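To make the path-sum view concrete, here is a minimal sketch of policy evaluation for a fixed policy using a Floyd-Warshall-style elimination over the real semiring (the classical algebraic-path, or Kleene-star, construction). It is not the paper's algorithm: the discounted expected-return setting, the function names (`kleene_plus`, `evaluate_policy`), and the dense list-of-lists matrix layout are assumptions chosen for illustration.

```python
def kleene_plus(weights):
    """Total weight of all paths of length >= 1 between each pair of states.

    Floyd-Warshall-style elimination over the real semiring: at step k,
    paths are additionally allowed to pass through state k, and the cycles
    through k are summed in closed form as a geometric series.
    """
    n = len(weights)
    a = [row[:] for row in weights]
    for k in range(n):
        loop = 1.0 / (1.0 - a[k][k])  # 1 + c + c^2 + ... for cycle weight c < 1
        # Build a fresh matrix so every update reads the step k-1 values.
        a = [
            [a[i][j] + a[i][k] * loop * a[k][j] for j in range(n)]
            for i in range(n)
        ]
    return a


def evaluate_policy(transitions, rewards, gamma):
    """Expected discounted return of each state under a fixed policy.

    transitions[i][j] is the probability of moving from state i to state j,
    rewards[i] is the reward collected in state i, and 0 <= gamma < 1.
    """
    n = len(transitions)
    discounted = [[gamma * transitions[i][j] for j in range(n)] for i in range(n)]
    plus = kleene_plus(discounted)
    # Value = own reward + rewards gathered along all (possibly infinite) runs.
    return [rewards[i] + sum(plus[i][j] * rewards[j] for j in range(n)) for i in range(n)]
```

For example, `evaluate_policy([[0.5, 0.5], [0.0, 1.0]], [1.0, 0.0], 0.9)` yields values that satisfy the Bellman equation V = r + gamma * P * V, the same solution that iterative policy evaluation converges to.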