Centrum Wiskunde & Informatica

CWI's Institutional Repository

Not a member yet

26838 research outputs found

Sort by

Accurate score prediction for Dual-Sieve Attacks

Author: Ducas L. (Léo)
Pulles L.N. (Ludo)
Publication venue
Publication date: 01/01/2026
Field of study

Guo and Johansson (ASIACRYPT 2021), and MATZOV (tech. report 2022) have independently claimed improved attacks against various NIST lattice candidates by using a Fast Fourier Transform (FFT) on top of the so-called Dual-Sieve attack. However, we will show that a heuristic used in above works not only theoretically contradicts with both formal theorems and well-tested heuristics in certain regimes, but also provides incorrect predictions experimentally. We conclude that this heuristic significantly overestimates the success probability of the Dual-Sieve attack. Alternatively, we propose a seemingly weaker heuristic for the output of a lattice sieve. When determining part of the secret in the Dual-Sieve attack, we derive predictions for the score distribution associated to candidates using this heuristic: for correct candidates with noise drawn from any radial distribution, we derive score predictions using a central limit heuristic; for incorrect candidates, we derive score predictions by approximating the Voronoi cell by a ball. In the process, we show that the use of the FFT is not specific to Learning with Errors (LWE) but is more generally useful against the Bounded Distance Decoding problem (BDD). Ultimately, we compare the predicted score distributions with extensive experiments, and observe these predictions to be qualitatively and quantitatively quite accurate. This makes it possible to accurately estimate the number of false positives and false negatives, opening the door for a sound analysis of the Dual-Sieve attack. In particular, one may consider exploring the opportunities to mitigate a large number of false positives

Selfish, local and online scheduling via vector fitting

Author: Kashaev D. (Danish)
Publication venue
Publication date: 12/01/2026
Field of study

We provide a dual fitting technique on a semidefinite program yielding simple proofs of tightbounds for the robust price of anarchy of several congestion and scheduling games under the sum of weighted completion times objective. The same approach also allows to bound the approximation ratio of local search algorithms and the competitive ratio of online algorithms for the scheduling problem

R‖\sum w_jC_j

. All of our results are obtained through a simple unified dual fitting argument on the same semidefinite programming relaxation, which can essentially be obtained through the first round of the Lasserre/Sum of Squares hierarchy. As our main application, we show that the known coordination ratio bounds of respectively

4, (3 + √5)/2 ≈2.618

, and

32/15 ≈ 2.133

for the scheduling game

R‖\sum w_jC_j

under the coordination mechanisms Smith’s Rule, Proportional Sharing and Rand (STOC 2011) can be extended to congestion games and obtained through this approach. For the natural restriction where the weight of each player is proportional to its processing time on every resource, we show that the last bound can be improved from

2.133

2

. This improvement can also be made for general instances when considering the price of anarchy of the game, rather than the coordination ratio. As a further application of this technique in a game theoretic setting, we show that it recovers the tight bound of

(3 + √5)/2

for the price of anarchy of weighted affine congestion games and the Kawaguchi-Kyan bound of

(1 + √2)/2

for the pure price of anarchy of

P‖\sum w_jC_j

. Moreover, we show that this approach recovers the known tight approximation ratio of

(3 + √5)/2

for a natural local search algorithm for

R‖\sum w_jC_j

, as well as the best currently known combinatorial approximation algorithm for this problem achieving an approximation ratio of

(5 + √5)/4 + ε ≈ 1.809 + ε

, and provide an almost matching lower bound. Finally, we show that this technique also extends to online algorithms by analyzing a randomized algorithm for

R‖\sum w_jC_j

achieving a competitive ratio of

4

in an online setting where the arrival order of the jobs is adversarial and the ordering on each machine is optimal

Clinicians’ voice: Fundamental considerations for XAI in healthcare

Author: Röber T.E. (Tabea E.)
Goedhart R. (Rob)
Birbil I (Ilker)
Publication venue
Publication date: 06/01/2026
Field of study

Explainable AI (XAI) holds the promise of advancing the implementation and adoption of AI-based tools in practice, especially in high-stakes environments like healthcare. However, most of the current research lacks input from end users, and therefore their practical value is limited. To address this, we conducted semi-structured interviews with clinicians to discuss their thoughts, hopes, and concerns. Clinicians from our sample generally think positively about developing AI-based tools for clinical practice, but they have concerns about how these will fit into their workflow and how it will impact clinician-patient relations. We further identify training of clinicians on AI as a crucial factor for the success of AI in healthcare and highlight aspects clinicians are looking for in (X)AI-based tools. In contrast to other studies, we take on a holistic and exploratory perspective to identify general requirements for (X)AI products for healthcare before moving on to testing specific tools

A new data-driven energy-stable evolve-filter-relax model for turbulent flow simulation

Author: Ivagnes A. (Anna)
Gastelen T. (Toby) van
Agdestein S.D. (Syver)
Sanderse B. (Benjamin)
Stabile G. (Giovanni)
Rozza G. (Gianluigi)
Publication venue
Publication date: 01/03/2026
Field of study

We present a novel approach to define the filter and relax steps in the evolve-filter-relax (EFR) framework for simulating turbulent flows. The EFR main advantages are its ease of implementation and computational efficiency. However, as it only contains two parameters (one for the filter step and one for the relax step) its flexibility is rather limited. In this work, we propose a data-driven approach in which the optimal filter is found based on DNS data in the frequency domain. The optimization step is computationally efficient and only involves one-dimensional least-squares problems for each wavenumber. Across both decaying turbulence and Kolmogorov flow, our learned filter decisively outperforms the standard differential filter and the Smagorinsky model, yielding significantly improved accuracy in energy spectra and in the temporal evolution of both energy and enstrophy. In addition, the relax parameter is determined by requiring energy and/or enstrophy conservation, which enforces stability of the method and reduces the appearance of numerical wiggles, especially when the filter is built in scarce data regimes. Applying the learned filter is also more computationally efficient compared to traditional differential filters, as it circumvents solving a linear system

Modeling advection-dominated flows with space-local reduced-order models

Author: Gastelen T. (Toby) van
Edeling W.N. (Wouter)
Sanderse B. (Benjamin)
Publication venue
Publication date: 30/01/2026
Field of study

Reduced-order models (ROMs) are often used to accelerate the simulation of large physical systems. However, traditional ROM techniques, such as proper orthogonal decomposition (POD)-based methods, often struggle with advection-dominated flows due to the slow decay of singular values. This results in high computational costs and potential instabilities. This paper proposes a novel approach using space-local POD to address the challenges arising from the slow singular value decay. Instead of global basis functions, our method employs local basis functions that are applied across the domain, analogous to the finite element method, but with a data-driven basis. By dividing the domain into subdomains and applying the space-local POD, we obtain a sparse representation that generalizes better outside the training regime. This allows the use of a larger number of basis functions compared to standard POD, without prohibitive computational costs. To ensure smoothness across subdomain boundaries, we introduce overlapping subdomains inspired by the partition of unity method. Our approach is validated through simulations of the 1D and 2D advection equation. We demonstrate that using our space-local approach, we obtain a ROM that generalizes better to flow conditions not included in the training data. In addition, we show that the constructed ROM inherits the energy conservation and non-linear stability properties from the full-order model. Finally, we find that using a space-local ROM allows for larger time steps

From fine-tuning to prompting : a paradigm shift in knowledge graph construction

Author: Li X. (Xue)
Publication venue
Publication date: 04/02/2026
Field of study

Fastlanes : a next-gen file format

Author: Afroozeh A. (Azim)
Publication venue
Publication date: 09/01/2026
Field of study

This thesis presents the design and implementation of FastLanes, a next-generation file format for OLAP workloads on modern CPUs and GPUs. It redesigns lightweight encodings to be more data-parallel, fully exploiting SIMD and GPU parallelism, achieving decoding speeds of over 100 billion integers per second in scalar execution. It also introduces ALP (Adaptive Lossless Floating-Point Compression)—a novel, adaptive, and SIMD-friendly floating-point compressor that surpasses state-of-the-art methods such as ZSTD. GPU extensions of FastLanes adapt these techniques to thread-level parallelism, addressing shared-memory and warp-divergence bottlenecks to accelerate analytical workloads on GPU-based engines such as Crystal. The dissertation integrates these innovative encodings into a fully functional file format, also called FastLanes, which combines multiple lightweight codecs through Expression Encoding—a composable representation that merges encoding strategies (e.g., FOR, RLE, DICT) to achieve compression ratios comparable to heavyweight compressors while maintaining exceptional decompression speed. Furthermore, FastLanes is released as open-source software, ensuring reproducibility and enabling future research in hardware-optimized data storage and analytics

Optimizing gas entry-exit capacity utilization under uncertainty

Author: Markhorst B.T. (Berend)
Buijs R.J.W. (Ruurd)
Egging-Bratseth R. (Ruud)
Mei R.D. (Rob) van der
Publication venue
Publication date: 19/01/2026
Field of study

Natural gas is vital to Europe’s energy system, with Norway supplying 30% of European gas demand. Effective management of entry–exit capacity in the Norwegian network can enhance market efficiency and energy security, but is far from trivial due to uncertain demand and prices. This study develops a stochastic programming model to determine optimal capacity allocation under uncertainty, with a focus on scalability. Concerned about network stability, operators tend to be risk averse in deviating from their initial decisions when allocating bookable capacities. We use our model in a case study on Norway’s gas pipeline network and find that moderating risk aversion can yield considerable system welfare gains. Additionally, we give insights into the system bottlenecks for policymakers and industry stakeholders and show the value of flexibility in this context. Finally, we provide a comprehensive dataset to advance future research

A little clairvoyance is all you need

Author: Gupta A. (Anupam)
Kaplan H. (Haim)
Lindermayr A. (Alexander)
Schlöter J (Jens)
Yingchareonthawornchai S. (Sorrachai)
Publication venue
Publication date: 07/01/2026
Field of study

We revisit the classical problem of minimizing the total flow time of jobs on a single machine in the online setting where jobs arrive over time. It has long been known that the Shortest Remaining Processing Time (SRPT) algorithm is optimal (i.e., 1-competitive) when the job sizes are known up-front [Schrage, 1968]. But in the non-clairvoyant setting where job sizes are revealed only when the job finishes, no algorithm can be constant-competitive [Motwani, Phillips, and Torng, 1994]. We consider the ε-clairvoyant setting, where ε ∈ [0, 1], and each job’s processing time becomes known once its remaining processing time equals an ε fraction of its processing time. This captures settings where the system user uses the initial (1 − ε) fraction of a job’s processing time to learn its true length, which it can then reveal to the algorithm. The model was proposed by Yingchareonthawornchai and Torng (2017), and it smoothly interpolates between the clairvoyant setting (when ε = 1) and the non-clairvoyant setting (when ε = 0). In a concrete sense, we are asking: how much knowledge is required to circumvent the hardness of this problem? We show that a little knowledge is enough, and that a constant competitive algorithm exists for every constant ε > 0. More precisely, for all ε ∈ (0, 1), we present a deterministic ⌈1/ε⌉-competitive algorithm, which is optimal for deterministic algorithms. We also present a matching lower bound (up to a constant factor) for randomized algorithms. Our algorithm to achieve this bound is remarkably simple and applies the “optimism in the face of uncertainty” principle. For each job, we form an optimistic estimate of its length, based on the information revealed thus far and run SRPT on these optimistic estimates. The proof relies on maintaining a matching between the jobs in OPT’s queue and the algorithm’s queue, with small prefix expansion. We achieve this by carefully choosing a set of jobs to arrive earlier than their release times without changing the algorithm, and possibly helping the adversary. These early arrivals allow us to maintain structural properties inductively, giving us the tight guarantee

Exploring indirected relations between topics in neuroscience literature using augmented reality to inform experimental design

Author: Xu B. (Boyu)
Publication venue
Publication date: 14/01/2026
Field of study

Before conducting a costly experiment, neuroscientists need to identify potentially useful hypotheses. Given the vast amount of neuroscience literature, it is useful to gain a bird’s-eye view of the field to understand what information is established and what may inform future experiments. To achieve this, topic-based literature exploration in Augmented Reality (AR) has been used to investigate relations between neuroscience topics, such as brain regions and brain diseases. The DatAR team at Utrecht University has developed a 3D-based AR prototype that provides a visual representation of direct relations between brain regions and brain diseases. A co-occurrence of two topics, such as a brain region and a brain disease, in the same sentence of the title or abstract of a publication implies a direct relation between them. These direct relations can help neuroscientists intuitively understand which brain regions are affected by specific brain diseases. A brain region may also be connected indirectly to a brain disease through an intermediate topic, such as a gene or a mental process. We define an indirect relation between two topics when there is no direct relation between them, but each co-occurs with at least one other intermediate topic. Neuroscientists have proposed that identifying such indirect relations could provide additional insights for experimental design. These indirect relations may reflect weak evidence of a potential link between two topics that has not yet been confirmed by any single publication. This motivates Study 1 on exploring indirect relations between topics. I follow a user-centred design approach: defining functional requirements for exploring indirect relations, designing interactive AR visualisations for the specified functionalities, and engaging neuroscientists in evaluating the usefulness of exploring indirect relations. Six of the participating neuroscientists noted that the identified indirect relations primarily serve as inspiration for further literature review, rather than as definitive evidence for designing an experiment. They suggested that tracking trends in indirect relations over time and identifying when these indirect relations become established as direct relations in the literature could provide evidence for the usefulness of current indirect relations. This insight motivated Study 2 on exploring the evolution of indirect to direct relations using a timeline. One neuroscientist also suggested exploring the specific intermediate topic, such as a gene, responsible for the indirect relation. Identifying the specific intermediate topic could help neuroscientists understand the mechanisms underlying the indirect relation and better assess its usefulness. This suggestion motivated Study 3 on exploring indirect relations via a specific intermediate topic. My research provides an interactive 3D AR approach to assist neuroscientists in identifying indirect relations between neuroscience topics. It also offers additional information, such as the evolution of indirect to direct relations and the specific intermediate topics involved, helping researchers assess the usefulness of the indirect relations. By enabling the exploration of hundreds of thousands of publications simultaneously and supporting the identification of useful hypotheses, the DatAR prototype serves as a complementary step in the early phase of designing a potentially useful experiment

13,690

full texts

26,838

metadata records

Updated in last 30 days.

CWI's Institutional Repository

Access Repository Dashboard

Do you manage Open Research Online? Become a CORE Member to access insider analytics, issue reports and manage access to outputs from your repository in the CORE Repository Dashboard! 👇