Portail HAL de Télécom Paris
    14191 research outputs found

    Decentralized Ranking Aggregation: Gossip Algorithms for Borda and Copeland Consensus

    No full text
    The concept of ranking aggregation plays a central role in preference analysis, and numerous algorithms for calculating median rankings, often originating in social choice theory, have been documented in the literature, offering theoretical guarantees in a centralized setting, i.e., when all the ranking data to be aggregated can be brought together in a single computing unit. For many technologies (e.g. peer-to-peer networks, IoT, multi-agent systems), extending the ability to calculate consensus rankings with guarantees in a decentralized setting, i.e., when preference data is initially distributed across a communicating network, remains a major methodological challenge. Indeed, in recent years, the literature on decentralized computation has mainly focused on computing or optimizing statistics such as arithmetic means using gossip algorithms. The purpose of this article is precisely to study how to achieve reliable consensus on collective rankings using classical rules (e.g. Borda, Copeland) in a decentralized setting, thereby raising new questions, in particular regarding robustness to corrupted nodes and scalability through reduced communication costs. The approach proposed and analyzed here relies on random gossip communication, allowing autonomous agents to compute global ranking consensus using only local interactions, without coordination or central authority. We provide rigorous convergence guarantees, including explicit rate bounds, for the Borda and Copeland consensus methods. Beyond these rules, we also provide a decentralized implementation of consensus according to the median rank rule and local Kemenization. Extensive empirical evaluations on various network topologies and real and synthetic ranking datasets demonstrate that our algorithms converge quickly and reliably to the correct ranking aggregation. This work paves the way for principled collective decision-making in fully decentralized systems.
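
    The idea of reaching a Borda consensus through local interactions can be illustrated with a minimal sketch. It assumes plain randomized pairwise averaging of Borda score vectors over the edges of a connected graph; this is a generic gossip-averaging scheme, not necessarily the algorithm analyzed in the paper, and it has no protection against corrupted nodes. The names `borda_scores` and `gossip_borda`, the round count, and the toy network are all hypothetical.

```python
import random


def borda_scores(ranking, n_items):
    """Borda score per item; `ranking` lists item ids from most to least preferred."""
    scores = [0.0] * n_items
    for position, item in enumerate(ranking):
        scores[item] = float(n_items - 1 - position)
    return scores


def gossip_borda(rankings, edges, n_rounds=2000, seed=0):
    """Each node starts from its own Borda scores; random neighbor pairs average them."""
    rng = random.Random(seed)
    n_items = len(rankings[0])
    estimates = [borda_scores(r, n_items) for r in rankings]
    for _ in range(n_rounds):
        i, j = rng.choice(edges)  # pick one communication link at random
        avg = [(a + b) / 2.0 for a, b in zip(estimates[i], estimates[j])]
        estimates[i] = avg
        estimates[j] = list(avg)
    # Every node ranks items by its local estimate of the mean Borda score.
    return [sorted(range(n_items), key=lambda k: -est[k]) for est in estimates]


# Toy example (assumed data): 4 nodes on a ring, 3 items.
rankings = [[0, 1, 2], [0, 2, 1], [1, 0, 2], [0, 1, 2]]
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
local_rankings = gossip_borda(rankings, ring)
```

    Pairwise averaging preserves the network-wide sum of scores, so on a connected graph each local estimate approaches the mean Borda score and all nodes end up sorting items in the same order.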

    The Hi-Audio Online Platform for Recording and Distributing Multi-Track Music Datasets

    No full text
    This paper introduces the Hi-Audio online platform, an open-source tool designed to support musicians and researchers in the field of Music Information Retrieval (MIR). The platform enables the recording, uploading, and sharing of multitrack musical compositions, aiming to build an open-access audio database to advance research in music technology. Uploaded audio files are automatically analyzed upon synchronization with the server, leveraging signal processing techniques and machine learning models to generate rich metadata. The platform facilitates remote and asynchronous collaboration via a web-based interface accessible at hiaudio.fr. Furthermore, a novel built-in method for accurate and robust round-trip latency estimation in the browser is proposed and integrated into the platform, demonstrating its applicability in real-world distributed recording scenarios. Finally, an initial user evaluation with musicians was conducted to assess usability and practical relevance under realistic usage conditions. The evaluation combined task-based performance analysis with standardized usability and workload measures. The results indicate high task completion rates for core recording functions and show that the platform can be used effectively by musicians with minimal prior training.

    It’s All About the Confidence: An Unsupervised Approach for Multilingual Historical Entity Linking using Large Language Models

    No full text
    Despite the recent advancements in NLP with the advent of Large Language Models (LLMs), Entity Linking (EL) for historical texts remains challenging due to linguistic variation, noisy inputs, and evolving semantic conventions. Existing solutions either require substantial training data or rely on domain-specific rules that limit scalability. In this paper, we present MHEL-LLaMo (Multilingual Historical Entity Linking with Large Language MOdels), an unsupervised ensemble approach combining a Small Language Model (SLM) and an LLM. MHEL-LLaMo leverages a multilingual bi-encoder (BELA) for candidate retrieval and an instruction-tuned LLM for NIL prediction and candidate selection via prompt chaining. Our system uses the SLM's confidence scores to discriminate between easy and hard samples, applying an LLM only for hard cases. This strategy reduces computational costs while preventing hallucinations on straightforward cases. We evaluate MHEL-LLaMo on four established benchmarks in six European languages (English, Finnish, French, German, Italian and Swedish) from the 19th and 20th centuries. Results demonstrate that MHEL-LLaMo outperforms state-of-the-art models without requiring fine-tuning, offering a scalable solution for low-resource historical EL. Our error analysis reveals that 41% of false predictions exhibit semantic proximity to ground truth entities, highlighting the LLM's accurate disambiguation of historical references.
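
    The confidence-based split between easy and hard samples can be sketched as a simple routing function. This is only an illustration under stated assumptions: `slm_link`, `llm_link`, and the 0.8 threshold are hypothetical placeholders, and the abstract does not say how MHEL-LLaMo sets its threshold or structures its prompts.

```python
def link_entity(mention, slm_link, llm_link, threshold=0.8):
    """Route a mention: keep the SLM answer when it is confident, else ask the LLM.

    `slm_link(mention)` is assumed to return (candidate, confidence in [0, 1]);
    `llm_link(mention)` is assumed to return a candidate (or None for NIL).
    """
    candidate, confidence = slm_link(mention)
    if confidence >= threshold:
        return candidate, "slm"  # easy case: no LLM call needed
    return llm_link(mention), "llm"  # hard case: defer to the LLM


# Stub models (assumed) to show the routing behavior.
def stub_slm(mention):
    return ("Q90", 0.95) if mention == "Paris" else ("Q0", 0.30)


def stub_llm(mention):
    return None  # the stub always predicts NIL


easy = link_entity("Paris", stub_slm, stub_llm)
hard = link_entity("Lutetia", stub_slm, stub_llm)
```

    With this design the expensive model runs only on the low-confidence mentions, which is where the abstract locates the computational savings.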

    Promises, Perils, and (Timely) Heuristics for Mining Coding Agent Activity

    No full text
    In 2025, coding agents have seen very rapid adoption. Coding agents leverage Large Language Models (LLMs) in ways that are markedly different from LLM-based code completion, making their study critical. Moreover, unlike LLM-based completion, coding agents leave visible traces in software repositories, enabling the use of Mining Software Repositories (MSR) techniques to study their impact on software engineering (SE) practices. This paper documents the promises, perils, and heuristics that we have gathered from studying coding agent activity on GitHub.

    The NPA hierarchy does not always attain the commuting operator value

    No full text
    We show that it is undecidable to determine whether the commuting operator value of a nonlocal game is strictly greater than 1/2. Specifically, there is a computable mapping from Turing machines to boolean constraint system (BCS) nonlocal games in which the halting property of the machine is encoded as a decision problem for the commuting operator value of the game. As a corollary, there is a BCS game for which the value of the Navascués-Pironio-Acín (NPA) hierarchy does not attain the commuting operator value at any finite level.

    Receiver Noise Calibration in CV-QKD accounting for Noise Dynamics

    No full text
    Continuous-Variable Quantum Key Distribution (CV-QKD) relies on accurate noise calibration at the receiver to ensure the security of quantum communication. Traditional calibration methods often oversimplify noise characteristics, neglecting the impact of local oscillator (LO) noise and the critical role of noise spectral properties, which can lead to imprecise Shot Noise Calibration (SNC). Our contributions are threefold: 1) we propose an operational framework for calibration, relying on the notion of stationarity; 2) in this framework, we give a method allowing us to derive the optimal calibration duration for a given experiment; 3) leveraging our knowledge of noise spectral properties, we introduce a novel SNC method. This work also formalizes the calibration procedures, addressing implicit assumptions and providing a better foundation for the certification of CV-QKD protocols, of which calibration is a fundamental part. We demonstrate that our improved calibration technique offers higher performance and higher tolerance to receiver imperfections, which can enhance the performance and cost-effectiveness of CV-QKD systems.

    Soft-Di[M]O: Improving One-Step Discrete Image Generation with Soft Embeddings

    No full text
    One-step generators distilled from Masked Diffusion Models (MDMs) compress multiple sampling steps into a single forward pass, enabling efficient text and image synthesis. However, they suffer from two key limitations: they inherit modeling bias from the teacher, and their discrete token outputs block gradient flow, preventing post-distillation refinements such as adversarial training, reward-based fine-tuning, and Test-Time Embedding Optimization (TTEO). In this work, we introduce soft embeddings, a simple relaxation that replaces discrete tokens with the expected embeddings under the generator's output distribution. Soft embeddings preserve representation fidelity for one-step discrete generators while providing a fully differentiable continuous surrogate that is compatible with teacher backbones and tokenizer decoders. Integrating soft embeddings into the Di[M]O distillation framework (denoted Soft-Di[M]O) makes one-step generators end-to-end trainable and enables straightforward application of GAN-based refinement, differentiable reward fine-tuning, and TTEO. Empirically, across multiple MDM teachers (e.g., MaskBit, MaskGen), Soft-Di[M]O achieves state-of-the-art one-step results: improved class-to-image performance, a one-step FID of 1.56 on ImageNet-256 with GAN-based refinement, along with higher GenEval and HPS scores on text-to-image with reward fine-tuning, and further gains from TTEO.
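
    The relaxation described above, replacing a discrete token with the expected embedding under the output distribution, can be written out for a single token position. The sketch below uses plain Python lists and a toy two-entry codebook; `softmax`, `soft_embedding`, and the codebook values are illustrative assumptions, and the actual Soft-Di[M]O implementation is not described in the abstract.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one position's token logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_embedding(logits, codebook):
    """Expected embedding: sum over tokens k of p(k) * codebook[k]."""
    probs = softmax(logits)
    dim = len(codebook[0])
    return [
        sum(p * codebook[k][d] for k, p in enumerate(probs))
        for d in range(dim)
    ]


# Toy codebook (assumed): 2 tokens with 2-dimensional embeddings.
codebook = [[1.0, 0.0], [0.0, 1.0]]
uniform = soft_embedding([0.0, 0.0], codebook)    # equal mix of both embeddings
peaked = soft_embedding([50.0, -50.0], codebook)  # close to the hard lookup of token 0
```

    Unlike an argmax lookup, this weighted sum is a smooth function of the logits, which is what allows gradients from a downstream loss to reach the generator. As the distribution sharpens, the soft embedding approaches the ordinary discrete embedding.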

    Carbon Footprint of Urban 5G Traffic in Lyon Based on Real-World Data and Analytical Modeling

    No full text
    The deployment of 5G networks has generated significant debate regarding its environmental implications, particularly concerning carbon emissions. Although 5G technology offers improved energy efficiency per bit transmitted, concerns persist due to potential rebound effects and the significant carbon footprint associated with infrastructure deployment and baseline energy consumption. This paper presents a bottom-up approach combining a detailed radio load model and spatial distribution of users to precisely estimate the energy usage and carbon emissions of 5G networks using the 3.5 GHz band. Although the model is valid for other regions, we focus specifically on the city of Lyon in France, providing a detailed assessment of current emissions (2021-2024) and projections up to 2050, incorporating traffic growth and national energy decarbonization scenarios. At the base station scale, our results show that emissions generated by hardware manufacturing and baseline energy consumption constitute the dominant contributors to the overall carbon footprint, compared with emissions induced by traffic load. As a result, our projections indicate that an increase in traffic demand does not significantly impact the carbon footprint unless it necessitates the deployment of additional base stations. By 2050, infrastructure-related emissions could constitute up to 70% of total network emissions, highlighting a major challenge in managing network growth to avoid rebound effects. The study demonstrates that decarbonizing electricity and enhancing energy efficiency alone are insufficient.

    Revealing the Power Dynamics of Collaborative Sense-Making Supported by Participatory Data Physicalization

    No full text
    While it is proven that the individual construction of a data physicalization aids personal sense-making, little is known about how sense-making is negotiated when it is shared by multiple, co-located participants. Since participatory data physicalization can inadvertently prioritize dominant views, we interpreted data feminism principles to design a collaborative physicalization construction process that empowers stakeholders and participants to co-determine how meanings are represented. This process revealed how the interplay of physical and non-physical actions during construction negotiations supported collaborative sense-making among 14 groups of 55 participants during 4 workshops, enabling us to articulate how explicit power is embodied by the physicalization artifact and negotiated between authoring and collaborating participants, and facilitators; whereas tacit power operates through artifact meanings, participant identity and design decisions. By providing one operationalization of data-feminist critique into the form of design requirements, our contributions support the design of more equitable physicalization and visualization construction methods.

    0 full texts, 14,191 metadata records. Updated in last 30 days.