Schloss Dagstuhl – Leibniz Center for Informatics
DROPS Dagstuhl Research Online Publication Server
Infinitely Divisible Noise for Differential Privacy: Nearly Optimal Error in the High ε Regime
Differential privacy (DP) can be achieved in a distributed manner, where multiple parties add independent noise such that their sum protects the overall dataset with DP. A common technique here is for each party to sample their noise from the decomposition of an infinitely divisible distribution. We analyze two mechanisms in this setting: 1) the generalized discrete Laplace (GDL) mechanism, whose distribution (which is closed under summation) follows from differences of i.i.d. negative binomial shares, and 2) the multi-scale discrete Laplace (MSDLap) mechanism, a novel mechanism following the sum of multiple i.i.d. discrete Laplace shares at different scales. For ε ≥ 1, our mechanisms can be parameterized to have O(Δ³ e^{-ε}) and O(min(Δ³ e^{-ε}, Δ² e^{-2ε/3})) MSE, respectively, where Δ denotes the sensitivity; the latter bound matches known optimality results. Furthermore, the MSDLap mechanism has the optimal MSE including constants as ε → ∞. We also show a transformation from the discrete setting to the continuous setting, which allows us to transform both mechanisms to the continuous setting and thereby achieve the optimal O(Δ² e^{-2ε/3}) MSE. To our knowledge, these are the first infinitely divisible additive noise mechanisms that achieve order-optimal MSE under pure DP for either the discrete or continuous setting, so our work shows formally that there is no separation in utility when query-independent noise-adding mechanisms are restricted to infinitely divisible noise. For the continuous setting, our result improves upon Pagh and Stausholm’s Arete distribution, which gives an MSE of O(Δ² e^{-ε/4}) [Pagh and Stausholm, 2022]. Furthermore, we give an exact sampler tuned to efficiently implement the MSDLap mechanism, and we apply our results to improve a state-of-the-art multi-message shuffle DP protocol from [Balle et al., 2020] in the high ε regime.
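The idea of splitting noise across parties can be illustrated with the classical discrete Laplace case. The sketch below is not the GDL or MSDLap parameterization from the paper. It only assumes the standard facts that a discrete Laplace variable with ratio a = e^{-ε/Δ} is the difference of two i.i.d. geometric variables, and that a geometric variable is the sum of n i.i.d. negative binomial NegBin(1/n, 1 - a) shares. All function names are hypothetical, and the Gamma-Poisson sampler uses floating point, so it is not an exact sampler.

```python
import math
import random

def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def sample_negbin(r, p, rng):
    """NegBin(r, p) (number of failures), real r > 0, as a Gamma-Poisson mixture."""
    lam = rng.gammavariate(r, (1.0 - p) / p)
    return sample_poisson(lam, rng)

def party_share(n_parties, epsilon, sensitivity, rng):
    """One party's noise: difference of two i.i.d. NegBin(1/n, 1 - a) draws."""
    a = math.exp(-epsilon / sensitivity)
    r = 1.0 / n_parties
    return sample_negbin(r, 1.0 - a, rng) - sample_negbin(r, 1.0 - a, rng)

def aggregate_noise(n_parties, epsilon, sensitivity, rng):
    """Sum of all parties' shares; distributed as discrete Laplace with ratio a."""
    return sum(party_share(n_parties, epsilon, sensitivity, rng)
               for _ in range(n_parties))

if __name__ == "__main__":
    rng = random.Random(0)
    print([aggregate_noise(5, 1.0, 1, rng) for _ in range(10)])
```

No single party's share is discrete Laplace on its own; only the sum of all n shares has that distribution, with variance 2a/(1 - a)².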
Temporal Connectivity Augmentation
Connectivity in temporal graphs relies on the notion of temporal paths, in which edges follow a chronological order (either strict or non-strict). In this work, we investigate the question of how to make a temporal graph connected. More precisely, we tackle the problem of finding, among a set of proposed temporal edges, the smallest subset whose addition makes the graph temporally connected (TCA). We study the complexity of this problem and its variants under a restricted lifespan of the graph, i.e., the maximum time step in the graph. Our main result on TCA is that for any fixed lifespan of at least 2, it is NP-complete in both the strict and non-strict settings. We additionally provide a set of restrictions in the non-strict setting which makes the problem solvable in polynomial time, and design an algorithm achieving this complexity. Interestingly, we prove that the source variant (making a given vertex a source in the augmented graph) is as difficult as TCA. In contrast, we prove that the version where a list of connectivity demands has to be satisfied is solvable in polynomial time when the size of the list is fixed. Finally, we highlight a variant of the previous case for which the problem is already NP-hard even with two pairs.
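To make the strict versus non-strict distinction concrete, here is a minimal sketch of a temporal connectivity check. It is not the paper's algorithm. It assumes undirected temporal edges given as hypothetical (u, v, t) triples and computes earliest arrival times from each source.

```python
from collections import defaultdict
from math import inf

def reachable_from(source, vertices, temporal_edges, strict=True):
    """Vertices reachable from `source` via temporal paths.

    strict=True requires strictly increasing times along a path;
    strict=False allows equal consecutive times.
    """
    by_time = defaultdict(list)
    for u, v, t in temporal_edges:
        by_time[t].append((u, v))
    arrival = {x: inf for x in vertices}
    arrival[source] = -inf
    for t in sorted(by_time):
        changed = True
        while changed:  # repeat so non-strict paths can chain edges at time t
            changed = False
            for u, v in by_time[t]:
                for a, b in ((u, v), (v, u)):
                    usable = arrival[a] < t if strict else arrival[a] <= t
                    if usable and arrival[b] > t:
                        arrival[b] = t
                        changed = True
    return {x for x in vertices if arrival[x] < inf}

def is_temporally_connected(vertices, temporal_edges, strict=True):
    """True if every vertex reaches every other vertex by a temporal path."""
    vs = set(vertices)
    return all(reachable_from(s, vs, temporal_edges, strict) == vs for s in vs)

if __name__ == "__main__":
    V = {"a", "b", "c"}
    E = [("a", "b", 1), ("b", "c", 1)]
    print(is_temporally_connected(V, E, strict=True))
    print(is_temporally_connected(V, E, strict=False))
    print(is_temporally_connected(V, E + [("a", "c", 2)], strict=True))
```

In this toy instance the two edges at time 1 connect the graph only in the non-strict setting. Adding the single proposed edge (a, c, 2) makes it connected in the strict setting as well, which is the kind of augmentation TCA asks to minimize.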
Dismountability in Temporal Cliques Revisited
A temporal graph is a graph whose edges are available only at certain points in time. It is temporally connected if the nodes can reach each other by paths that traverse the edges chronologically (temporal paths). Unlike static graphs, temporal graphs do not always admit small subsets of edges that preserve connectivity (temporal spanners): there exist temporal graphs with Θ(n²) edges, all of which are critical. In the case of temporal cliques (the underlying graph is complete), spanners of size O(n log n) are guaranteed. The original proof of this result by Casteigts et al. [ICALP 2019] combines a number of techniques, one of which is called dismountability. In a recent work, Angrick et al. [ESA 2024] simplified the proof and showed, among other things, that a one-sided version of dismountability can elegantly replace the second part of the proof.
In this paper, we methodically revisit the dismountability principle. We start by characterizing the structure that a temporal clique must have if it is non 1-hop dismountable, then neither 1-hop nor 2-hop (i.e., non {1,2}-hop) dismountable, and finally non {1,2,3}-hop dismountable. It turns out that if a clique is k-hop dismountable for any other k, then it must also be {1,2,3}-hop dismountable, so no additional structure can be obtained beyond this point. Interestingly, excluding 1-hop and 2-hop dismountability is already sufficient for reducing the spanner problem from cliques to extremally matched bicliques, where the O(n log n) result is subsequently obtained. Put together with the strategy of Angrick et al., this entire result can now be recovered using only dismountability. An interesting by-product of our analysis is that any minimal counter-example to the existence of 4n spanners must satisfy the properties of non {1,2,3}-hop dismountable cliques.
In the second part, we discuss further connections between dismountability and another technique called pivotability. In particular, we show that if a temporal clique is recursively k-hop dismountable, then it is also pivotable (and thus admits a 2n spanner, for any k). We also study a family of labelings called full-range that forces both dismountability and pivotability. The latter gives some evidence that large lifetimes could be exploited more generally for the construction of spanners.
Brief Announcement: Anonymous Distributed Localisation via Spatial Population Protocols
In the distributed localization problem (DLP), n anonymous robots (agents) A₀, …, A_{n-1} begin at arbitrary positions p₀, …, p_{n-1} ∈ S, where S is a Euclidean space. Initially, each agent A_i operates within its own coordinate system in S, which may be inconsistent with those of other agents. The primary goal in DLP is for agents to reach a consensus on a unified coordinate system that accurately reflects the relative positions of all points, p₀, …, p_{n-1}, in S. Extensive research on DLP has primarily focused on the feasibility and complexity of achieving consensus when agents have limited access to inter-agent distances, often due to missing or imprecise data. In this paper, however, we examine a minimalist, computationally efficient model of distributed computing in which agents have access to all pairwise distances, if needed. Specifically, we introduce a novel variant of population protocols, referred to as the spatial population protocols model. In this variant each agent can memorise one or a fixed number of coordinates, and when agents A_i and A_j interact, they can not only exchange their current knowledge but also either determine the distance d_{ij} between them in S (distance query model) or obtain the vector \vec{v}_{ij} spanning points p_i and p_j (vector query model). We present here a leader-based localisation protocol with distance queries.
On Palindromic Periodicities
We say a finite word x is a palindromic periodicity if there exist two palindromes p and s such that |x| ≥ |ps| and x is a prefix of the infinite periodic word (ps)^ω = pspsps⋯. In this paper we examine the palindromic periodicities occurring in some classical infinite words, such as Sturmian words, episturmian words, the Thue-Morse word, the period-doubling word, the Rudin-Shapiro word, the paperfolding word, and the Tribonacci word, and prove a number of results about them. We also prove results about words with the smallest number of distinct palindromic periodicities.
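The definition can be checked by brute force. In the sketch below the function names are hypothetical, and whether p or s may be empty is an assumption exposed as the `allow_empty` flag. Since |x| ≥ |ps|, the block ps must equal a prefix of x, so it suffices to try every period of x and every way of cutting the corresponding prefix into two palindromes.

```python
def is_palindrome(w):
    return w == w[::-1]

def palindromic_periodicity_witness(x, allow_empty=True):
    """Return palindromes (p, s) witnessing the definition for x, or None.

    allow_empty is an assumption: it controls whether p or s may be empty.
    """
    n = len(x)
    for period in range(1, n + 1):
        # x is a prefix of (x[:period])^ω iff x[i] == x[i - period] for i >= period
        if any(x[i] != x[i - period] for i in range(period, n)):
            continue
        block = x[:period]
        first = 0 if allow_empty else 1
        last = period if allow_empty else period - 1
        for cut in range(first, last + 1):
            p, s = block[:cut], block[cut:]
            if is_palindrome(p) and is_palindrome(s):
                return p, s
    return None

if __name__ == "__main__":
    for word in ("abab", "abaab", "abc"):
        print(word, palindromic_periodicity_witness(word))
```

For example, "abab" is a prefix of (ab)^ω with p = "a" and s = "b", while "abc" admits no such pair.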
Net Occurrences in Fibonacci and Thue-Morse Words
A net occurrence of a repeated string in a text is an occurrence with unique left and right extensions, and the net frequency of the string is the number of its net occurrences in the text. Originally introduced for applications in Natural Language Processing, net frequency has recently gained attention for its algorithmic aspects. Guo et al. [CPM 2024] and Ohlebusch et al. [SPIRE 2024] focus on its computation in the offline setting, while Guo et al. [SPIRE 2024], Inenaga [arXiv 2024], and Mieno and Inenaga [CPM 2025] tackle the online counterpart. Mieno and Inenaga also characterize net occurrences in terms of the minimal unique substrings of the text. Additionally, Guo et al. [CPM 2024] initiate the study of net occurrences in Fibonacci words to establish a lower bound on the asymptotic running time of algorithms. Although there has been notable progress in algorithmic developments and some initial combinatorial insights, the combinatorial aspects of net occurrences have yet to be thoroughly examined. In this work, we make two key contributions. First, we confirm the conjecture that each Fibonacci word contains exactly three net occurrences. Second, we show that each Thue-Morse word contains exactly nine net occurrences. To achieve these results, we introduce the notion of overlapping net occurrence cover, which narrows down the candidate net occurrences in any text. Furthermore, we provide a precise characterization of occurrences of Fibonacci and Thue-Morse words of smaller order, offering structural insights that may have independent interest and potential applications in algorithm analysis and combinatorial properties of these words.
Space-Efficient Online Computation of String Net Occurrences
A substring u of a string T is said to be a repeat if u occurs at least twice in T. An occurrence [i..j] of a repeat u in T is said to be a net occurrence if each of the substrings aub = T[i-1..j+1], au = T[i-1..j], and ub = T[i..j+1] occurs exactly once in T. The occurrence [i-1..j+1] of aub is said to be an extended net occurrence of u. Let T be an input string of length n over an alphabet of size σ, and let ENO(T) denote the set of extended net occurrences of repeats in T. Guo et al. [SPIRE 2024] presented an online algorithm which can report ENO(T[1..i]) in T[1..i] in O(nσ²) time, for each prefix T[1..i] of T. Very recently, Inenaga [arXiv 2024] gave a faster online algorithm that can report ENO(T[1..i]) in optimal O(#ENO(T[1..i])) time for each prefix T[1..i] of T, where #S denotes the cardinality of a set S. Both of the aforementioned data structures can be maintained in O(n log σ) time and occupy O(n) space, where the O(n)-space requirement comes from the suffix tree data structure. In particular, Inenaga’s recent algorithm is based on Weiner’s right-to-left online suffix tree construction. In this paper, we show that one can modify Ukkonen’s left-to-right online suffix tree construction algorithm in O(n) space, so that ENO(T[1..i]) can be reported in optimal O(#ENO(T[1..i])) time for each prefix T[1..i] of T. This is an improvement over Guo et al.’s method that is also based on Ukkonen’s algorithm. Further, this leads us to the two following space-efficient alternatives:
- A sliding-window algorithm of O(d) working space that can report ENO(T[i-d+1..i]) in optimal O(#ENO(T[i-d+1..i])) time for each sliding window T[i-d+1..i] of size d in T.
- A CDAWG-based online algorithm of O(e) working space that can report ENO(T[1..i]) in optimal O(#ENO(T[1..i])) time for each prefix T[1..i] of T, where e < 2n is the number of edges in the CDAWG for T.
All of our proposed data structures can be maintained in O(n log σ) time for the input online string T. We also discuss how the extended net occurrences of repeats in T can be fully characterized in terms of the minimal unique substrings (MUSs) in T.
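The definition of a net occurrence given in this abstract can be turned into a brute-force reference check, useful only for testing on short strings. It is not one of the suffix-tree or CDAWG algorithms described here. The function names are hypothetical, and the sentinel characters at both ends are an assumption so that the extensions T[i-1] and T[j+1] always exist.

```python
def count_occurrences(text, pattern):
    """Number of (possibly overlapping) occurrences of pattern in text."""
    return sum(1 for k in range(len(text) - len(pattern) + 1)
               if text.startswith(pattern, k))

def net_occurrences(text, left="#", right="$"):
    """Brute-force list of net occurrences as (i, j, u) with 1-indexed T[i..j] = u.

    Assumes the sentinels `left` and `right` do not appear in `text`.
    """
    padded = left + text + right  # text occupies padded[1..n]
    n = len(text)
    result = []
    for i in range(1, n + 1):
        for j in range(i, n + 1):
            u = padded[i:j + 1]
            if count_occurrences(padded, u) < 2:
                continue  # u is not a repeat
            au = padded[i - 1:j + 1]
            ub = padded[i:j + 2]
            aub = padded[i - 1:j + 2]
            if all(count_occurrences(padded, w) == 1 for w in (au, ub, aub)):
                result.append((i, j, u))
    return result

if __name__ == "__main__":
    print(net_occurrences("abab"))
```

On T = abab the repeat "ab" has net occurrences at [1..2] and [3..4], while the repeats "a" and "b" have none, because extending them on one side yields "ab", which is itself repeated.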
Extremal Betti Numbers and Persistence in Flag Complexes
We investigate several problems concerning extremal Betti numbers and persistence in filtrations of flag complexes. For graphs on n vertices, we show that β_k(X(G)) is maximal when G = T_{n,k+1}, the Turán graph on k+1 partition classes, where X(G) denotes the flag complex of G. Building on this, we construct an edgewise (one edge at a time) filtration G₁ ⊆ ⋯ ⊆ T_{n,k+1} for which β_k(X(G_i)) is maximal for all graphs on n vertices and i edges. Moreover, the persistence barcode ℬ_k(X(G)) achieves a maximal number of intervals, and total persistence, among all edgewise filtrations with |E(T_{n,k+1})| edges.
For k = 1, we consider edgewise filtrations of the complete graph K_n. We show that the maximal number of intervals in the persistence barcode is obtained precisely when G_{⌈n/2⌉ ⋅ ⌊n/2⌋} = T_{n,2}. Among such filtrations, we characterize those achieving maximal total persistence. We further show that no filtration can optimize β₁(X(G_i)) for all i, and conjecture that our filtrations maximize the total persistence over all edgewise filtrations of K_n.
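The index ⌈n/2⌉ ⋅ ⌊n/2⌋ appearing above is the edge count of the Turán graph on two partition classes. A small sketch (hypothetical function names) builds the Turán graph and evaluates β₁ for the two-class case. It relies on the standard facts that a bipartite graph has no triangles, so its flag complex is the graph itself, and that a connected graph has β₁ = |E| − |V| + 1.

```python
def turan_graph_edges(n, r):
    """Edges of the complete r-partite graph on n vertices with balanced parts."""
    part = [v % r for v in range(n)]  # part sizes differ by at most one
    return [(u, v) for u in range(n) for v in range(u + 1, n)
            if part[u] != part[v]]

def beta1_two_class_turan(n):
    """First Betti number of the flag complex of the two-class Turán graph, n >= 2.

    Valid because the graph is connected and triangle-free, so the flag
    complex is one-dimensional and beta_1 = |E| - |V| + 1.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    return len(turan_graph_edges(n, 2)) - n + 1

if __name__ == "__main__":
    for n in range(2, 8):
        print(n, len(turan_graph_edges(n, 2)), beta1_two_class_turan(n))
```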
Simplification of Trajectory Streams
While there are software systems that simplify trajectory streams on the fly, few curve simplification algorithms with quality guarantees fit the streaming requirements. We present streaming algorithms for two such problems under the Fréchet distance d_F in ℝ^d for some constant d ≥ 2.
Consider a polygonal curve τ in ℝ^d in a stream. We present a streaming algorithm that, for any ε ∈ (0,1) and δ > 0, produces a curve σ such that d_F(σ,τ[v₁,v_i]) ≤ (1+ε)δ and |σ| ≤ 2·opt-2, where τ[v₁,v_i] is the prefix in the stream so far, and opt = min{|σ'|: d_F(σ',τ[v₁,v_i]) ≤ δ}. Let α = 2(d-1)⌊d/2⌋² + d. The working storage is O(ε^{-α}). Each vertex is processed in O(ε^{-α} log(1/ε)) time for d ∈ {2,3} and O(ε^{-α}) time for d ≥ 4. Thus, the whole τ can be simplified in O(ε^{-α}|τ| log(1/ε)) time. Ignoring polynomial factors in 1/ε, this running time is a factor |τ| faster than the best static algorithm that offers the same guarantees.
We present another streaming algorithm that, for any integer k ≥ 2 and any ε ∈ (0,1/17), maintains a curve σ such that |σ| ≤ 2k-2 and d_F(σ,τ[v₁,v_i]) ≤ (1+ε) ⋅ min{d_F(σ',τ[v₁,v_i]): |σ'| ≤ k}, where τ[v₁,v_i] is the prefix in the stream so far. The working storage is O((kε^{-1}+ε^{-(α+1)}) log(1/ε)). Each vertex is processed in O(kε^{-(α+1)} log²(1/ε)) time for d ∈ {2,3} and O(kε^{-(α+1)} log(1/ε)) time for d ≥ 4.
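To get a feel for how the exponent α = 2(d-1)⌊d/2⌋² + d in these bounds grows with the dimension, the formula can be evaluated directly. The function name below is hypothetical.

```python
def storage_exponent(d):
    """alpha = 2(d-1) * floor(d/2)^2 + d, the exponent in the O(eps^-alpha) bounds."""
    if d < 2:
        raise ValueError("the abstract assumes a constant d >= 2")
    return 2 * (d - 1) * (d // 2) ** 2 + d

if __name__ == "__main__":
    for d in (2, 3, 4):
        print(d, storage_exponent(d))
```

This gives α = 4 for d = 2, α = 7 for d = 3, and α = 28 for d = 4, so the dependence on 1/ε is mild in the plane and steep in higher dimensions.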
A Theory of Sub-Barcodes
The primary tool in topological data analysis (TDA) is persistent homology, which involves computing a barcode (often from point-cloud or scalar field data) that serves as a topological signature for the underlying function. In this work, we introduce sub-barcodes and show how they arise naturally from factorizations of persistence module homomorphisms. We show that, as a partial order induced by factorizations, the relation of being a sub-barcode is strictly stronger than the rank invariant, and we apply sub-barcode theory to the problem of inferring information about the barcode of an unknown Lipschitz function from samples. The advantage of this approach is that it permits strong guarantees, with no noise, while requiring no sampling assumptions, and the resulting barcode is guaranteed to be a sub-barcode of every Lipschitz function that agrees with the data. We also present an algorithmic theory that allows for the efficient approximation of sub-barcodes using filtered Delaunay triangulations for Euclidean inputs.