Helmholtz Center for Information Security
CISPA – Helmholtz-Zentrum für InformationssicherheitNot a member yet
3406 research outputs found
Sort by
Fuzztruction: Using Fault Injection-based Fuzzing to Leverage Implicit Domain Knowledge
Today’s digital communication relies on complex protocols and specifications for exchanging structured messages and data. Communication naturally involves two endpoints: One generating data and one consuming it. Traditional fuzz testing approaches replace one endpoint, the generator, with a fuzzer and rapidly test many mutated inputs on the target program under test. While this fully automated approach works well for loosely structured formats, this does not hold for highly structured formats, especially those that go through complex transformations such as compression or encryption.
In this work, we propose a novel perspective on generating inputs in highly complex formats without relying on heavy-weight program analysis techniques, coarse-grained grammar approximation, or a human domain expert. Instead of mutating the inputs for a target program, we inject faults into the data generation program so that this data is almost of the expected format. Such data bypasses the initial parsing stages in the consumer program and exercises deeper program states, where it triggers more interesting program behavior. To realize this concept, we propose a set of compile-time and run-time analyses to mutate the generator in a targeted manner, so that it remains intact and produces semi-valid outputs that satisfy the constraints of the complex format. We have implemented this approach in a prototype called Fuzztruction and show that it outperforms the state-of-the-art fuzzers AFL++, SymCC, and Weizz. Fuzztruction finds significantly more coverage than existing methods, especially on targets that use cryptographic primitives. During our evaluation, Fuzztruction uncovered 151 unique crashes (after automated deduplication). So far, we manually triaged and reported 27 bugs that have been acknowledged and 4 CVEs were assigne
FIDO2, CTAP 2.1, and WebAuthn 2: Provable Security and Post-Quantum Instantiation
The FIDO2 protocol is a globally used standard for passwordless authentication, building on an alliance between major players in the online authentication space. While already widely deployed, the standard is still under active development. Since version 2.1 of its CTAP sub-protocol, FIDO2 can potentially be instantiated with post-quantum secure primitives.
We provide the first formal security analysis of FIDO2 with the CTAP 2.1 and WebAuthn 2 sub-protocols. Our security models build on work by Barbosa et al. for their analysis of FIDO2 with CTAP 2.0 and WebAuthn 1, which we extend in several ways. First, we provide a more fine-grained security model that allows us to prove more relevant protocol properties, such as guarantees about token binding agreement, the None attestation mode, and user verification. Second, we can prove post-quantum security for FIDO2 under certain conditions and minor protocol extensions. Finally, we show that for some threat models, the downgrade resilience of FIDO2 can be improved, and show how to achieve this with a simple modification
On the effectiveness of partial variance reduction in federated learning with heterogeneous data
Data heterogeneity across clients is a key challenge in federated learning. Prior works address this by either aligning client and server models or using control variates to correct client model drift. Although these methods achieve fast convergence in convex or simple non-convex problems, the performance in over-parameterized models such as deep neural networks is lacking. In this paper, we first revisit the widely used FedAvg algorithm in a deep neural network to understand how data heterogeneity influences the gradient updates across the neural network layers. We observe that while the feature extraction layers are learned efficiently by FedAvg, the substantial diversity of the final classification layers across clients impedes the performance. Motivated by this, we propose to correct model drift by variance reduction only on the final layers. We demonstrate that this significantly outperforms existing benchmarks at a similar or lower communication cost. We furthermore provide proof for the convergence rate of our algorithm
Iterative Circuit Repair Against Formal Specifications
We present a deep learning approach for repairing sequential circuits against formal specifications given in linear-time temporal logic (LTL). Given a defective circuit and its formal specification, we train Transformer models to output circuits that satisfy the corresponding specification. We propose a separated hierarchical Transformer for multimodal representation learning of the formal specification and the circuit. We introduce a data generation algorithm that enables generalization to more complex specifications and out-of-distribution datasets. In addition, our proposed repair mechanism significantly improves the automated synthesis of circuits from LTL specifications with Transformers. It improves the state-of-the-art by 6.8 percentage points on held-out instances and 11.8 percentage points on an out-of-distribution dataset from the annual reactive synthesis competition
Adaptive SGD with Polyak stepsize and Line-search: Robust Convergence and Variance Reduction
The recently proposed stochastic Polyak stepsize (SPS) and stochastic linesearch (SLS) for SGD have shown remarkable effectiveness when training overparameterized models. However, two issues remain unsolved in this line of work. First, in non-interpolation settings, both algorithms only guarantee convergence to a neighborhood of a solution which may result in a worse output than the initial guess. While artificially decreasing the adaptive stepsize has been proposed to address this issue (Orvieto et al. 2022), this approach results in slower convergence rates under interpolation. Second, intuitive line-search methods equipped with variance-reduction (VR) fail to converge (Dubois-Taine et al. 2022). So far, no VR methods successfully accelerate these two stepsizes with a convergence guarantee. In this work, we make two contributions: Firstly, we propose two new robust variants of SPS and SLS, called AdaSPS and AdaSLS, which achieve optimal asymptotic rates in both strongly-convex or convex and interpolation or noninterpolation settings, except for the case when we have both strong convexity and non-interpolation. AdaSLS requires no knowledge of problem-dependent parameters, and AdaSPS requires only a lower bound of the optimal function value as input. Secondly, we propose a novel VR method that can use Polyak stepsizes or line-search to achieve acceleration. When it is equipped with AdaSPS or AdaSLS, the resulting algorithms obtain the optimal rate for optimizing convex smooth
functions. Finally, numerical experiments on synthetic and real datasets validate our theory and demonstrate the effectiveness and robustness of our algorithms
EdgeTDC: On the Security of Time Difference of Arrival Measurements in CAN Bus Systems
A Controller Area Network (CAN bus) is a message- based protocol for intra-vehicle communication designed mainly with robustness and safety in mind. In real-world deployments, CAN bus does not offer common security features such as message authentication. Due to the fact that automotive suppliers need to guarantee interoperability, most manufacturers rely on a decade- old standard (ISO 11898) and changing the format by introducing MACs is impractical. Research has therefore suggested to address this lack of authentication with CAN bus Intrusion Detection Systems (IDSs) that augment the bus with separate modules. IDSs attribute messages to the respective sender by measuring physical- layer features of the transmitted frame. Those features are based on timings, voltage levels, transients—and, as of recently, Time Difference of Arrival (TDoA) measurements. In this work, we show that TDoA-based approaches presented in prior art are vulnerable to novel spoofing and poisoning attacks. We describe how those proposals can be fixed and present our own method called EdgeTDC. Unlike existing methods, EdgeTDC does not rely on Analog-to-digital converters (ADCs) with high sampling rate and high dynamic range to capture the signals at sample level granularity. Our method uses time-to-digital converters (TDCs) to detect the edges and measure their timings. Despite being inexpensive to implement, TDCs offer low latency, high location precision and the ability to measure every single edge (rising and falling) in a frame. Measuring each edge makes analog sampling redundant and allows the calculation of statistics that can even detect tampering with parts of a message. Through extensive experimentation, we show that EdgeTDC can successfully thwart masquerading attacks in the CAN system of modern vehicles
Fuzzing Embedded Systems Using Debug Interfaces
Fuzzing embedded systems is hard. Their key components - microcontrollers - are highly diverse and cannot be easily virtualized; their software may not be changed or instrumented. However, we observe that many, if not most, microcontrollers feature a debug interface through which a debug probe (typically controllable via GDB, the GNU debugger) can set a limited number of hardware breakpoints. Using these, we extract partial coverage feedback even for uninstrumented binary code; and thus enable effective fuzzing for embedded systems through a generic, widespread mechanism. In its evaluation on four different microcontroller boards, our prototypical implementation GDBFuzz quickly reaches high code coverage and detects known and new vulnerabilities. As it can be applied to any program and system that GDB can debug, GDBFuzz is one of the least demanding and most versatile coverage-guided fuzzers
“Would You Give the Same Priority to the Bank and a Game? I Do Not!” Exploring Credential Management Strategies and Obstacles during Password Manager Setup
Password managers allow users to improve password security by handling large numbers of strong and unique passwords without the burden of memorizing them. While users are encouraged to add all credentials to their password manager and update weak credentials, this task can require significant effort and thus jeopardize security benefits if not completed thoroughly. However, user strategies to add credentials, related obstacles, and their security implications are not well understood. To address this gap in security research, we performed a mixed-methods study, including expert reviews of 14 popular password managers and an online survey with 279 users of built-in and third-party password managers. We extend previous work by examining the status quo of password manager setup features and investigating password manager users’ setup strategies. We confirm previous research and find that many participants utilize password managers for convenience, not as a security tool. They most commonly add credentials whenever a website is visited, and prioritize what they add. Similarly, passwords are often only updated when they are considered insecure. Additionally, we observe a severe distrust towards password managers, leading to users not adding important passwords. We conclude our work by giving recommendations for password manager developers to help users overcome the obstacles we identified
Inferring Symbolic Automata
We study the learnability of symbolic finite state automata (SFA), a model shown useful in many applications in software verification. The state-of-the-art literature on this topic follows the query learning paradigm, and so far all obtained results are positive. We provide a necessary condition for efficient learnability of SFAs in this paradigm, from which we obtain the first negative result. The main focus of our work lies in the learnability of SFAs under the paradigm of identification in the limit using polynomial time and data, and its strengthening efficient identifiability, which are concerned with the existence of a systematic set of characteristic samples from which a learner can correctly infer the target language. We provide a necessary condition for identification of SFAs in the limit using polynomial time and data, and a sufficient condition for efficient learnability of SFAs. From these conditions we derive a positive and a negative result. The performance of a learning algorithm is typically bounded as a function of the size of the representation of the target language. Since SFAs, in general, do not have a canonical form, and there are trade-offs between the complexity of the predicates on the transitions and the number of transitions, we start by defining size measures for SFAs. We revisit the complexity of procedures on SFAs and analyze them according to these measures, paying attention to the special forms of SFAs: normalized SFAs and neat SFAs, as well as to SFAs over a monotonic effective Boolean algebra. This is an extended version of the paper with the same title published in CSL’22
Pakistani Teens and Privacy - How Gender Disparities, Religion and Family Values Impact the Privacy Design Space
The understanding of how teenagers perceive, manage and perform privacy is less well-understood in spaces outside of Western, educated, industrialised, rich and democratic countries.
To fill this gap we interviewed 30 teens to investigate the privacy perceptions, practices, and experienced digital harms of young people in Pakistan, a particularly interesting context as privacy in this context is not seen as an individual right or performed within an individualistic framework but instead is influenced by a combination of factors including social norms, family dynamics and religious beliefs.
Based on our findings, we developed four personas to systematize the needs and values of this specific population and then conducted focus groups with co-design activities to further explore privacy conflicts. Among other things that confirm and extend existing theories on teen’s privacy practices and perceptions, our findings suggest that young women are disproportionately impacted by privacy violations and the harms extend beyond themselves to include their families