Portail HAL de l'Institut Polytechnique de Paris
28935 research outputs found
Data Provenance Auditing of Fine-Tuned Large Language Models with a Text-Preserving Technique
We propose a system for marking sensitive or copyrighted texts to detect their use in fine-tuning large language models under black-box access with statistical guarantees. Our method builds digital “marks” using invisible Unicode characters organized into (“cue”, “reply”) pairs. During an audit, prompts containing only “cue” fragments are issued to trigger regurgitation of the corresponding “reply”, indicating document usage. To control false positives, we compare against held-out counterfactual marks and apply a ranking test, yielding a verifiable bound on the false positive rate. The approach is minimally invasive, scalable across many sources, robust to standard processing pipelines, and achieves high detection power even when marked data is a small fraction of the fine-tuning corpus.
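The ranking test mentioned in this abstract can be illustrated with a generic rank-based p-value. This is a minimal sketch, not the authors' procedure: the function name `rank_test_p_value` and the notion of a numeric "score" (for example, how closely a model's output matches a given “reply”) are assumptions made for illustration. The idea is that if the model never saw the marked text, the true mark's score should rank no better than the scores of held-out counterfactual marks.

```python
def rank_test_p_value(true_score, counterfactual_scores):
    """Rank-based p-value for the true mark against counterfactual marks.

    Assumption: a higher score means stronger evidence of regurgitation.
    If the true mark is exchangeable with the counterfactuals (the model
    did not train on it), this p-value is at most alpha with probability
    at most alpha.
    """
    # Count counterfactual marks that score at least as high as the true mark.
    n_at_least = sum(1 for s in counterfactual_scores if s >= true_score)
    # The +1 terms count the true mark itself, keeping the test valid.
    return (1 + n_at_least) / (1 + len(counterfactual_scores))


if __name__ == "__main__":
    # Hypothetical scores: the true mark clearly beats 99 counterfactuals.
    p = rank_test_p_value(0.9, [0.1] * 99)
    print(p)  # the smallest p-value reachable with 99 counterfactuals
```

With n counterfactual marks the smallest attainable p-value is 1/(n+1), so the number of held-out marks limits how small a false positive rate can be certified.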
Non-Exchangeable Mean Field Markov Decision Processes with common noise : from Bellman equation to quantitative propagation of chaos
We study infinite-horizon Markov Decision Processes (MDPs) with a continuum of heterogeneous agents interacting through a common noise, without assuming exchangeability. We introduce the framework of Conditional Non-Exchangeable Mean Field MDPs (CNEMF-MDPs) in both a strong formulation and a label-state formulation. We establish the equivalence between these two formulations by showing that the control problem can be lifted to a standard MDP defined on the Wasserstein space $\mathcal{P}_\lambda(I \times \mathcal{X})$, where $I$ denotes the label (heterogeneity) space, $\mathcal{X}$ is the individual state space, and $\lambda$ specifies the fixed distribution of agent labels. Within this framework, we characterize the value function as the unique fixed point of an appropriate Bellman operator acting on $\mathcal{P}_\lambda(I \times \mathcal{X})$. Our second contribution is a quantitative analysis of the propagation of chaos for this non-exchangeable setting with common noise. We derive sharp finite-population bounds by comparing the Bellman operator of the finite $N$-agent MDP, defined on the high-dimensional space $\mathcal{X}^N$, with its infinite-agent counterpart. This comparison yields explicit constructions of near-optimal policies for the $N$-agent system from $\epsilon$-optimal policies of the limiting CNEMF-MDP.
It’s All About the Confidence: An Unsupervised Approach for Multilingual Historical Entity Linking using Large Language Models
Despite the recent advancements in NLP with the advent of Large Language Models (LLMs), Entity Linking (EL) for historical texts remains challenging due to linguistic variation, noisy inputs, and evolving semantic conventions. Existing solutions either require substantial training data or rely on domain-specific rules that limit scalability. In this paper, we present MHEL-LLaMo (Multilingual Historical Entity Linking with Large Language MOdels), an unsupervised ensemble approach combining a Small Language Model (SLM) and an LLM. MHEL-LLaMo leverages a multilingual bi-encoder (BELA) for candidate retrieval and an instruction-tuned LLM for NIL prediction and candidate selection via prompt chaining. Our system uses the SLM's confidence scores to discriminate between easy and hard samples, applying the LLM only to hard cases. This strategy reduces computational costs while preventing hallucinations on straightforward cases. We evaluate MHEL-LLaMo on four established benchmarks in six European languages (English, Finnish, French, German, Italian and Swedish) from the 19th and 20th centuries. Results demonstrate that MHEL-LLaMo outperforms state-of-the-art models without requiring fine-tuning, offering a scalable solution for low-resource historical EL. Our error analysis reveals that 41% of false predictions exhibit semantic proximity to ground truth entities, highlighting the LLM's accurate disambiguation of historical references.
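The confidence-based routing described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the MHEL-LLaMo implementation: `retrieve` and `llm_select` are hypothetical callables standing in for the bi-encoder retriever and the prompted LLM, and the threshold value of 0.8 is arbitrary.

```python
from typing import Callable, List, Optional, Tuple

# A candidate is an (entity_id, confidence) pair; both names are hypothetical.
Candidate = Tuple[str, float]


def link_entity(
    mention: str,
    retrieve: Callable[[str], List[Candidate]],
    llm_select: Callable[[str, List[Candidate]], Optional[str]],
    threshold: float = 0.8,  # assumed cut-off between easy and hard samples
) -> Optional[str]:
    """Return an entity id, or None when the mention is judged NIL."""
    # Rank retrieved candidates by the small model's confidence.
    candidates = sorted(retrieve(mention), key=lambda c: c[1], reverse=True)
    # Easy case: the small model is confident, so the LLM is never called.
    if candidates and candidates[0][1] >= threshold:
        return candidates[0][0]
    # Hard case: defer NIL prediction and candidate selection to the LLM.
    return llm_select(mention, candidates)


if __name__ == "__main__":
    # Stub components, for demonstration only.
    def fake_retrieve(mention: str) -> List[Candidate]:
        return [("Q90", 0.95)] if mention == "Paris" else [("Q1", 0.3), ("Q2", 0.4)]

    def fake_llm(mention: str, candidates: List[Candidate]) -> Optional[str]:
        return None  # this stub always predicts NIL

    print(link_entity("Paris", fake_retrieve, fake_llm))
    print(link_entity("Lutetia", fake_retrieve, fake_llm))
```

Routing only low-confidence mentions to the LLM is what keeps the cost proportional to the number of hard samples rather than to the size of the corpus.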
The Micro-Folies network: a potential “networked heritage”?
Hypothèses research blog "Musées, patrimoines et pouvoir symbolique. Enjeux (géo)politiques et territoriaux du Patrimoine", https://mppsgeo.hypotheses.org/271
Constraints on gravitational waves from the 2024 Vela pulsar glitch
Among known neutron stars, the Vela pulsar is one of the best targets for gravitational-wave searches. It is also one of the most prolific in terms of glitches, sudden frequency changes in a pulsar's rotation. Such glitches could cause a variety of transient gravitational-wave signals. Here we search for signals associated with a Vela glitch on 29 April 2024 in data of the two LIGO detectors from the fourth LIGO–Virgo–KAGRA observing run. We search both for seconds-scale burst-like emission, primarily from fundamental (f-)mode oscillations, and for longer quasi-monochromatic transients up to four months in duration, primarily from quasi-static quadrupolar deformations. We find no significant detection candidates, but for the first time we set direct observational upper limits on gravitational strain amplitude that are stricter than what can be indirectly inferred from the overall glitch energy scale. We discuss the short- and long-duration observational constraints in the context of specific emission models. These results demonstrate the potential of gravitational-wave probes of glitching pulsars as detector sensitivity continues to improve.
Division algorithms for norm-Euclidean real quadratic fields -- part I
We give a Euclidean division algorithm for the real quadratic fields $\mathbb{Q}(\sqrt{m})$ for , with the property that the norm of the remainder depends on the first Euclidean minimum of the field. In each case, we cover the square with hyperbolas and give a list of these, together with regions covered. We mechanize the proofs as much as we can, using exact computations, in order to be able to reproduce them.
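To make the notion of Euclidean division in a real quadratic field concrete, here is a minimal sketch for the classical case $\mathbb{Z}[\sqrt{2}]$, chosen here as an assumption because it is the simplest norm-Euclidean example; it is not the hyperbola-covering algorithm of the paper. Elements $a + b\sqrt{2}$ are stored as integer pairs `(a, b)`, the quotient is obtained by rounding the exact coordinates of $u/v$ to the nearest integers, and the function names are hypothetical. For this ring the remainder satisfies $|N(r)| \le |N(v)|/2$.

```python
from fractions import Fraction


def mul(u, v):
    """Multiply (a + b*sqrt(2)) by (c + d*sqrt(2))."""
    a, b = u
    c, d = v
    return (a * c + 2 * b * d, a * d + b * c)


def norm(u):
    """Field norm N(a + b*sqrt(2)) = a^2 - 2*b^2 (may be negative)."""
    a, b = u
    return a * a - 2 * b * b


def divmod_sqrt2(u, v):
    """Return (q, r) in Z[sqrt(2)] with u = v*q + r and |N(r)| <= |N(v)| / 2."""
    n = norm(v)
    if n == 0:  # only happens for v = 0, since sqrt(2) is irrational
        raise ZeroDivisionError("division by zero in Z[sqrt(2)]")
    c, d = v
    # u / v = u * conj(v) / N(v), computed exactly with rationals.
    num = mul(u, (c, -d))
    x, y = Fraction(num[0], n), Fraction(num[1], n)
    # Round each coordinate to a nearest integer (error at most 1/2 each).
    q = (round(x), round(y))
    vq = mul(v, q)
    r = (u[0] - vq[0], u[1] - vq[1])
    return q, r


if __name__ == "__main__":
    q, r = divmod_sqrt2((7, 3), (2, 1))
    print(q, r, norm(r))
```

Using `Fraction` keeps every step exact, in the same spirit as the exact computations the abstract mentions.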
Sliced ReLU attention: Quasi-linear contextual expressivity via sorting
We introduce sliced ReLU attention, a new attention mechanism that departs structurally from both softmax and its approximation alternatives. Instead of applying a nonlinearity to pairwise dot products, we operate on one-dimensional projections of key–query differences and leverage sorting to obtain quasi-linear complexity. This construction yields a differentiable, non-symmetric kernel that can be computed in O(n log(n)) through a sorting procedure, making it suitable for very long contexts. Beyond computational benefits, the model retains strong theoretical expressive power: we establish two in-context expressivity results, previously known for softmax attention, showing that sliced ReLU attention preserves the ability to perform nontrivial sequence-to-sequence disentangling tasks and satisfies a contextual universal approximation property. Finally, we illustrate the potential practical interest of this kernel in small to medium-scale experiments.
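The role of sorting can be illustrated on a single one-dimensional projection. The sketch below is an assumption-laden toy, not the paper's attention layer: it takes scalar projected queries `q`, keys `k` and scalar values `v` (all hypothetical names) and computes `out[i] = sum_j ReLU(q[i] - k[j]) * v[j]`. Since the ReLU is nonzero only for keys below the query, the sum equals `q[i] * sum(v[j]) - sum(k[j] * v[j])` over those keys, which sorted keys and prefix sums deliver in O(n log n) instead of O(n^2).

```python
from bisect import bisect_right
from itertools import accumulate


def relu_kernel_sum_naive(q, k, v):
    """Quadratic-time reference: out[i] = sum_j max(q[i] - k[j], 0) * v[j]."""
    return [sum(max(qi - kj, 0.0) * vj for kj, vj in zip(k, v)) for qi in q]


def relu_kernel_sum_sorted(q, k, v):
    """Same quantity in O(n log n) using one sort and prefix sums."""
    pairs = sorted(zip(k, v))  # sort keys, carrying their values along
    k_sorted = [kj for kj, _ in pairs]
    # Prefix sums of v and of k*v over the sorted keys, with a leading zero.
    cum_v = [0.0] + list(accumulate(vj for _, vj in pairs))
    cum_kv = [0.0] + list(accumulate(kj * vj for kj, vj in pairs))
    out = []
    for qi in q:
        # Number of keys <= qi; keys equal to qi contribute zero either way.
        m = bisect_right(k_sorted, qi)
        out.append(qi * cum_v[m] - cum_kv[m])
    return out


if __name__ == "__main__":
    q = [0.0, 1.5, 3.0]
    k = [1.0, 2.0, 0.5]
    v = [2.0, -1.0, 4.0]
    print(relu_kernel_sum_naive(q, k, v))
    print(relu_kernel_sum_sorted(q, k, v))
```

A full layer would presumably repeat this over several projection directions and vector-valued values; that part is left out here because the abstract does not specify it.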
Search for the pair production of long-lived supersymmetric partners of the tau lepton in proton-proton collisions at $\sqrt{s}$ = 13 TeV
Gauge-mediated supersymmetry-breaking models provide a strong motivation to search for a supersymmetric partner of the tau lepton (stau) with a macroscopic lifetime. Long-lived stau decays produce tau leptons that are displaced from the primary proton-proton interaction vertex, leading to an unconventional signature. This paper presents a search for the direct production of long-lived staus decaying within the CMS tracker volume in proton-proton collisions at $\sqrt{s}$ = 13 TeV, performed for the first time with an identification algorithm based on a graph neural network dedicated to displaced tau leptons. The data sample, corresponding to an integrated luminosity of 138 fb$^{-1}$, was recorded with the CMS experiment at the CERN LHC between 2016 and 2018. This search excludes, at 95% confidence level, stau masses, $m_{\tilde{\tau}}$, in the 126260 (906425) GeV range for a proper decay length of 50 mm in the maximally mixed (mass-degenerate) scenario, while for $m_{\tilde{\tau}}$ = 200 GeV, stau proper decay lengths are excluded in the range 2194 (6333) mm. These results improve the exclusion limits compared to previous searches, and extend the parameter space explored in the context of supersymmetry.
A projection scheme for an incompressible soft material poromechanics model
In this work, we propose and analyse a new scheme to discretize the linearized version of a rather general poromechanics model adapted to biological tissue perfusion. This model, which is related to, albeit different from, the Biot equations, involves unsteady solid and fluid momentum balance equations that are further coupled through an incompressibility constraint, a pore pressure and permeability terms. The key feature of the scheme is to decouple the solid, fluid and pressure unknowns at each time step by means of a projection method, composed of a prediction and a correction step. We perform a complete stability analysis of the scheme depending on the implicit or explicit treatment of friction and pressure in the prediction step. Several boundary conditions are considered, including conditions coupling the solid and fluid phases on the boundary that are imposed at the discrete level using a Robin-Robin method. In the case of Dirichlet boundary conditions, we also provide a fully discrete error estimate as long as a discrete inf-sup condition is satisfied. The scheme properties and robustness with respect to physical parameters are illustrated by numerical experiments. Finally, its computational performance is compared with that of a monolithic approach.
A 3D-shell model of left atrial electromechanics
The thin-walled nature of the atrial myocardium can lead to artificial stiffening when full 3D electromechanical models are discretized using standard finite elements. In this work, we propose an electromechanical model of the left atrium based on a 3D-shell formulation that overcomes these limitations. The model incorporates both passive and active components of atrial tissue mechanics, while atrioventricular interaction is described by the coupling with a 0D electromechanical model of the left ventricle. The proposed approach is assessed under physiological and pathological conditions and systematically compared with the standard full 3D formulation. The results demonstrate the superior robustness and computational efficiency of the proposed 3D-shell electromechanical model.