INRIA a CCSD electronic archive server
    122,212 research outputs found

    Reward Hacking: Do You Fully Know What Your AI Agent Can Do?

    No full text
    This "popular science article" explains what "reward hacking" is and presents a few examples. We show that this problem is not specific to AI but affects all areas related to utility-driven decision-making, and that similar problems are studied in other fields such as economics and psychology. Finally, we explain why this problem may become particularly important in the field of artificial intelligence.

    Internal control of the transition kernel for stochastic lattice dynamics

    No full text
    In [5], we designed impulsive and feedback controls for harmonic chains with a point thermostat. In this work, we study the internal control for stochastic lattice dynamics, with the goal of controlling the transition kernel of the kinetic equation in the limit. A major novelty of the work is the introduction of a new geometric combinatorial argument, used to establish paths for the controls.

    Feedback-MPPI: Fast Sampling-Based MPC via Rollout Differentiation – Adios low-level controllers

    No full text
    Model Predictive Path Integral (MPPI) control is a powerful sampling-based approach suitable for complex robotic tasks due to its flexibility in handling nonlinear dynamics and non-convex costs. However, its applicability in real-time, high-frequency robotic control scenarios is limited by computational demands. This paper introduces Feedback-MPPI (F-MPPI), a novel framework that augments standard MPPI by computing local linear feedback gains derived from sensitivity analysis inspired by Riccati-based feedback used in gradient-based MPC. These gains allow for rapid closed-loop corrections around the current state without requiring full re-optimization at each timestep. We demonstrate the effectiveness of F-MPPI through simulations and real-world experiments on two robotic platforms: a quadrupedal robot performing dynamic locomotion on uneven terrain and a quadrotor executing aggressive maneuvers with onboard computation. Results illustrate that incorporating local feedback significantly improves control performance and stability, enabling robust, high-frequency operation suitable for complex robotic systems.
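
For readers unfamiliar with the baseline, the following is a minimal sketch of a plain MPPI loop, not of F-MPPI itself: it does not compute the feedback gains described in the abstract. The 1-D integrator dynamics, the quadratic tracking cost, and all names and parameters (`rollout_cost`, `mppi_step`, `run_mppi`, `sigma`, `lam`) are assumptions chosen for illustration.

```python
import math
import random

def rollout_cost(x0, controls, target, dt):
    """Simulate x_{t+1} = x_t + u_t * dt and sum the squared tracking error."""
    x, cost = x0, 0.0
    for u in controls:
        x = x + u * dt
        cost += (x - target) ** 2
    return cost

def mppi_step(x0, nominal, target, rng, num_samples=256, sigma=1.0, lam=1.0, dt=0.1):
    """One MPPI update: perturb the nominal controls, weight rollouts by exp(-cost/lam)."""
    horizon = len(nominal)
    noises, costs = [], []
    for _ in range(num_samples):
        eps = [rng.gauss(0.0, sigma) for _ in range(horizon)]
        perturbed = [u + e for u, e in zip(nominal, eps)]
        noises.append(eps)
        costs.append(rollout_cost(x0, perturbed, target, dt))
    best = min(costs)  # subtract the minimum cost for numerical stability
    weights = [math.exp(-(c - best) / lam) for c in costs]
    total = sum(weights)
    # New nominal = old nominal + weighted average of the sampled perturbations.
    return [
        u + sum(w * eps[t] for w, eps in zip(weights, noises)) / total
        for t, u in enumerate(nominal)
    ]

def run_mppi(x0=0.0, target=1.0, horizon=10, steps=50, dt=0.1, seed=0):
    """Receding-horizon loop: re-plan, apply the first control, shift the plan."""
    rng = random.Random(seed)
    x, nominal = x0, [0.0] * horizon
    for _ in range(steps):
        nominal = mppi_step(x, nominal, target, rng, dt=dt)
        x = x + nominal[0] * dt           # apply only the first planned control
        nominal = nominal[1:] + [0.0]     # warm-start the next plan
    return x
```

Every call to `mppi_step` re-samples and re-simulates all rollouts, which is the per-timestep cost that the abstract says limits high-frequency use.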

    Data Paper: HotPig, a behavioural dataset of pigs under heat stress

    No full text
    The widespread use of videos in modern indoor livestock facilities, coupled with the availability of efficient and low-cost computer vision algorithms, provides strong incentives for continuously monitoring farm animal behaviour. Deciphering how pigs behave when experiencing prolonged heat stress is particularly important for animal welfare, as it helps us to better understand how animals use various thermoregulation and heat dissipation mechanisms. Data were collected on 24 pigs that were video-monitored day and night under two contrasting conditions: thermoneutral (TN, 22 °C) and heat stress (HS, 32 °C). All pigs were housed individually and had free access to water and to an automatic feeder delivering pellets four times a day. After acquisition, videos were processed using YOLOv11, a real-time object detection algorithm that uses a convolutional neural network (CNN), to extract the following behavioural traits: drinking, willingness to eat, lying down, standing up, moving around, curiosity towards the littermate housed in the neighbouring pen, and contact between the two animals (cuddling). Traits were sampled at one-minute resolution (each minute corresponds to 150 frames processed) for a continuous period of 16 days, spanning the two thermal conditions (9 days on TN, 6 days on HS, 1 day back to TN). Consistency with the automatic electronic feeder's data (also provided) was thoroughly checked. The dataset allows quantitative criteria to be analysed to decipher inter-individual differences in animal behaviour and their dynamic adaptation to heat stress. This dataset can be used to train machine learning methods for behaviour prediction from videos in conventional growing pigs.

    A Greedy Constructive Heuristic for Executing Cloud-based Workflows with Data Confidentiality Restrictions

    No full text
    Over the past decade, many scientific experiments have shifted from on-premise environments to the cloud. While clouds offer flexibility, scalability, and cost-effectiveness, security and confidentiality remain issues. This is particularly true when experiments are modeled as workflows and executed using cloud-based workflow systems. These systems typically use multiple virtual machines (VMs) and shared cloud storage to execute the workflow and store the files generated during workflow execution. If these files are accessed by malicious users, they could reveal sensitive information about the workflow's results or structure. To mitigate these risks, data dispersion and techniques such as encryption can be employed, but they need to be carefully integrated into the workflow scheduling process. For example, dispersing data to storage far from the processing VM may increase workflow makespan and costs. In this manuscript, we propose CYCLOPS, an approach designed to execute workflows efficiently in clouds while addressing data confidentiality requirements. CYCLOPS incorporates a mathematical model and a Greedy Constructive Heuristic to optimize workflow scheduling. We evaluated the approach using both synthetic and real-world workflows. The results demonstrate that CYCLOPS enhances workflow execution efficiency while ensuring that data confidentiality is maintained.

    MaskCaptioner: Learning to Jointly Segment and Caption Object Trajectories in Videos

    No full text
    Dense Video Object Captioning (DVOC) is the task of jointly detecting, tracking, and captioning object trajectories in a video, requiring the ability to understand spatio-temporal details and describe them in natural language. Due to the complexity of the task and the high cost associated with manual annotation, previous approaches resort to disjoint training strategies, potentially leading to suboptimal performance. To circumvent this issue, we propose to generate captions about spatio-temporally localized entities leveraging a state-of-the-art VLM. By extending the LVIS and LV-VIS datasets with our synthetic captions (LVISCap and LV-VISCap), we train MaskCaptioner, an end-to-end model capable of jointly detecting, segmenting, tracking and captioning object trajectories. Moreover, with pretraining on LVISCap and LV-VISCap, MaskCaptioner achieves state-of-the-art DVOC results on three existing benchmarks: VidSTG, VLN and BenSMOT. The datasets and code are available at https://www.gabriel.fiastre.fr/maskcaptioner/

    On the Kantorovich contraction of Markov semigroups

    No full text
    This paper develops a novel operator-theoretic framework to study the contraction properties of Markov semigroups with respect to a general class of Kantorovich semi-distances, which notably includes Wasserstein distances. The rather simple contraction cost framework developed in this article, which combines standard Lyapunov techniques with local contraction conditions, helps to unify and simplify many arguments in the stability of Markov semigroups, as well as to improve upon some existing results. Our results can be applied to both discrete-time and continuous-time Markov semigroups, and we illustrate their wide applicability in the context of (i) Markov transitions on models with boundary states, including bounded domains with entrance boundaries, (ii) operator products of a Markov kernel and its adjoint, including two-block-type Gibbs samplers, (iii) iterated random functions and (iv) diffusion models, including overdamped Langevin diffusion with convex-at-infinity potentials.
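
As a reading aid, the standard textbook form of the objects named above can be written as follows. This is a sketch under assumptions: the cost $c$, the constants $C$ and $\lambda$, and the exponential form of the estimate are generic notation, not taken from the paper, whose precise statements may differ.

```latex
% Kantorovich semi-distance for a cost function c, where \Pi(\mu,\nu)
% denotes the set of couplings of the probability measures \mu and \nu
W_c(\mu,\nu) \;=\; \inf_{\pi \in \Pi(\mu,\nu)} \int c(x,y)\,\pi(dx,dy)

% A typical contraction estimate for a Markov semigroup (P_t)_{t \ge 0},
% with assumed constants C \ge 1 and \lambda > 0
W_c(\mu P_t, \nu P_t) \;\le\; C\,e^{-\lambda t}\,W_c(\mu,\nu)
```

Taking $c(x,y) = d(x,y)^p$ for a metric $d$ recovers the $p$-th power of the Wasserstein distance of order $p$.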

    Internal stresses control the coiling dynamics of tendrils in climbing plants

    No full text
    Plant tendrils are mechanosensitive and highly motile organs known for touch-induced differential growth. The resulting coiling dynamics under external traction reveal that non-homogeneous internal stress over the tendril cross-section is fundamental to understanding the system. External loading inhibits curvature generation and can fully suppress it above a threshold force. Remarkably, however, the onset of internal differential stress persists even under the highest applied traction forces. We develop an autotropic morphoelastic growth (AMG) model, grounded in a bi-strip geometry and Kirchhoff rod theory, which captures these main features. In particular, the AMG model reproduces the observed 1/4 ratio between the intrinsic curvature generated at high force and that at zero force. According to the AMG model, this ratio depends solely on the twist-to-bend ratio, a parameter determined by the plant species.

    A Unified Tactile Servoing Framework based on Hybrid Force-Position Control

    No full text
    In robotics, traditional force control lacks local contact information. Tactile sensors provide rich feedback on physical interaction, but remain notably difficult to integrate into real-time control loops. This paper proposes a unified tactile servoing formulation that allows explicit control of contact pose and force at the Center of Pressure (CoP). Unlike conventional tactile servoing techniques, which are tightly coupled to a specific sensor and often treat the contact wrench as a disturbance to be rejected, our approach relies on a generic and physically grounded feature space. We derive a hybrid force-position control law based on the Jacobian at the CoP that naturally decouples force and motion subspaces, ensuring geometric consistency during the contact interaction. Our framework, validated on a robotic manipulator with vision-based tactile sensing, demonstrates robust contact maintenance and force tracking capabilities, outperforming standard image-based and pose-based controllers in tasks requiring precise regulation of the physical interaction.

    Algebraizing Higher-Order Effects

    No full text
    We present a technique for expressing higher-order effects as algebraic ones. Algebraic effects provide a clean and modular account of computational effects, but they exclude higher-order effects such as local or catch. The standard representation of higher-order effects breaks the separation between syntax and interpretation that algebraic effects rely on. Our proposal, effect algebraization, transforms higher-order effects into combinations of algebraic effects. Each higher-order effect is split into a pair of operations, Open and Close, that together capture the same behavior using only algebraic constructs. The approach is illustrated on reader effects, ask and local. We also sketch a proof of semantics preservation: for every interpretation of the original higher-order effects, there exists a corresponding interpretation of the algebraized ones yielding the same result.
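
The Open/Close idea can be illustrated for the reader effect with a short sketch. The operation names `Ask`, `Open` and `Close` follow the abstract, but the encoding with Python generators and a stack-based handler (`run_reader`) is an assumption made for illustration, not the paper's construction.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Ask:
    """Algebraic operation: request the current environment value."""

@dataclass
class Open:
    """Marks the start of a scope in which the environment is modified."""
    modify: Callable[[int], int]

@dataclass
class Close:
    """Marks the end of the most recently opened scope."""

def program():
    """Flat encoding of: a <- ask; b <- local (+1) ask; c <- ask; return (a, b, c)."""
    a = yield Ask()
    yield Open(lambda env: env + 1)   # replaces the higher-order `local (+1) ...`
    b = yield Ask()
    yield Close()
    c = yield Ask()
    return (a, b, c)

def run_reader(computation, env):
    """Hypothetical handler: interprets Ask/Open/Close with a stack of environments."""
    stack = [env]
    gen = computation()
    reply = None
    try:
        while True:
            op = gen.send(reply)      # resume the program with the last reply
            if isinstance(op, Ask):
                reply = stack[-1]
            elif isinstance(op, Open):
                stack.append(op.modify(stack[-1]))
                reply = None
            elif isinstance(op, Close):
                stack.pop()
                reply = None
            else:
                raise TypeError(f"unknown operation: {op!r}")
    except StopIteration as done:
        return done.value
```

In this encoding the program never passes a sub-computation as an argument to an operation, which is what keeps every operation first-order. For example, `run_reader(program, 10)` returns `(10, 11, 10)`.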

    59,698 full texts
    122,212 metadata records
    Updated in last 30 days.