Helmholtz Center for Information Security
CISPA – Helmholtz-Zentrum für Informationssicherheit
3406 research outputs found
Formal Analysis of Session-Handling in Secure Messaging: Lifting Security from Sessions to Conversations
The building blocks for secure messaging apps, such as Signal’s X3DH and Double Ratchet (DR) protocols, have received a lot of attention from the research community. They have notably been proved to meet strong security properties, such as Forward Secrecy (FS) and Post-Compromise Security (PCS), even in the case of compromise. However, there is a lack of formal study of these properties at the application level. Whereas prior work has studied such properties in the context of a single ratcheting chain, a conversation between two persons in a messaging application can in fact be the result of merging multiple ratcheting chains.
In this work, we initiate the formal analysis of secure messaging taking the session-handling layer into account, and apply our approach to Sesame, Signal’s session management. We first experimentally show practical scenarios in which PCS can be violated in Signal by a clone attacker, despite its use of the Double Ratchet. We identify how this is enabled by Signal’s session-handling layer. We then design a formal model of the session-handling layer of Signal that is tractable for automated verification with the Tamarin prover, and use this model to rediscover the PCS violation and propose two provably secure mechanisms to offer stronger guarantees.
The Network Zoo: a multilingual package for the inference and analysis of gene regulatory networks
Inference and analysis of gene regulatory networks (GRNs) require software that integrates multi-omic data from various sources. The Network Zoo (netZoo; netzoo.github.io) is a collection of open-source methods to infer GRNs, conduct differential network analyses, estimate community structure, and explore the transitions between biological states. The netZoo builds on our ongoing development of network methods, harmonizing the implementations in various computing languages and between methods to allow better integration of these tools into analytical pipelines. We demonstrate the utility using multi-omic data from the Cancer Cell Line Encyclopedia. We will continue to expand the netZoo to incorporate additional methods
From Attachments to SEO: Click Here to Learn More about Clickbait PDFs!
Clickbait PDFs are PDF documents that do not embed malware but trick victims into visiting malicious web pages leading to attacks like password theft or drive-by download. While recent reports indicate a surge of clickbait PDFs, prior works have largely neglected this new threat, considering PDFs only as accessories of email phishing campaigns.
This paper investigates the landscape of clickbait PDFs and presents the first systematic and comprehensive study of this phenomenon. Starting from a real-world dataset, we identify 44 clickbait PDF clusters via clustering and characterize them by looking at their volumetric, temporal, and visual features. Among these, we identify three large clusters covering 89% of the dataset, exhibiting significantly different volumetric and temporal properties compared to classical email phishing, and relying on web UI elements as visual baits.
Finally, we look at the distribution vectors and show that clickbait PDFs are not only distributed via attachments but also via Search Engine Optimization attacks, placing clickbait PDFs outside the email distribution ecosystem.
Clickbait PDFs seem to be a lurking threat, not subjected to any form of content-based filtering or detection: AV scoring systems, like VirusTotal, rank them considerably low, creating a blind spot for organizations. While URL blocklists can help to prevent victims from visiting the attack web pages, we observe that they have limited coverage.
Semantic Debugging
Why does my program fail? We present a novel and general technique to automatically determine failure causes and conditions, using logical properties over input elements: "The program fails if and only if int(⟨length⟩) > len(⟨payload⟩) holds - that is, the given ⟨length⟩ is larger than the ⟨payload⟩ length." Our AVICENNA prototype uses modern techniques for inferring properties of passing and failing inputs and validating and refining hypotheses by having a constraint solver generate supporting test cases to obtain such diagnoses. As a result, AVICENNA produces crisp and expressive diagnoses even for complex failure conditions, considerably improving over the state of the art with diagnoses close to those of human experts
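The kind of diagnosis quoted above can be illustrated with a minimal sketch. It is not the AVICENNA implementation: the `Request` type, the toy `run_program`, the sample inputs, and the helper `explains` are all hypothetical names introduced here. The sketch only checks whether a candidate property over input elements holds exactly on the failing inputs ("if and only if"), which is the condition a diagnosis has to satisfy.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Request:
    """Hypothetical parsed input with a <length> and a <payload> element."""
    length: str
    payload: str


def run_program(req: Request) -> str:
    """Toy program under test (an assumption, not from the source).

    It echoes int(<length>) characters of the payload and over-reads
    when the declared length exceeds the real payload length.
    """
    n = int(req.length)
    buf = list(req.payload)
    return "".join(buf[i] for i in range(n))  # IndexError on over-read


def fails(req: Request) -> bool:
    """Oracle: the program fails if it raises IndexError."""
    try:
        run_program(req)
        return False
    except IndexError:
        return True


def explains(hypothesis: Callable[[Request], bool],
             inputs: Iterable[Request]) -> bool:
    """True if the hypothesis holds on exactly the failing inputs."""
    return all(hypothesis(r) == fails(r) for r in inputs)


# Two candidate diagnoses expressed as properties over input elements.
def length_exceeds_payload(r: Request) -> bool:
    return int(r.length) > len(r.payload)


def length_above_four(r: Request) -> bool:
    return int(r.length) > 4


INPUTS = [
    Request("3", "abc"),
    Request("5", "abc"),
    Request("0", ""),
    Request("8", "abcdefghij"),
    Request("2", "a"),
]

if __name__ == "__main__":
    print(explains(length_exceeds_payload, INPUTS))
    print(explains(length_above_four, INPUTS))
```

On these inputs the first hypothesis agrees with the oracle everywhere, while the second is refuted by `Request("8", "abcdefghij")`, which satisfies it but passes. A refuting input of this kind is what a constraint solver would be asked to generate when refining a hypothesis; here the inputs are simply listed by hand.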
Revisiting Neural Program Smoothing for Fuzzing
Testing with randomly generated inputs (fuzzing) has gained significant traction due to its capacity to expose program vulnerabilities automatically. Fuzz testing campaigns generate large amounts of data, making them ideal for the application of machine learning (ML). Neural program smoothing, a specific family of ML-guided fuzzers, aims to use a neural network as a smooth approximation of the program target for new test case generation. In this paper, we conduct the most extensive evaluation of neural program smoothing (NPS) fuzzers against standard gray-box fuzzers (>11 CPU years and >5.5 GPU years), and make the following contributions: (1) We find that the original performance claims for NPS fuzzers do not hold; a gap we relate to fundamental, implementation, and experimental limitations of prior works. (2) We contribute the first in-depth analysis of the contribution of machine learning and gradient-based mutations in NPS. (3) We implement Neuzz++, which shows that addressing the practical limitations of NPS fuzzers improves performance, but standard gray-box fuzzers almost always surpass NPS-based fuzzers. (4) As a consequence, we propose new guidelines targeted at benchmarking fuzzing based on machine learning, and present a platform, MLFuzz, with GPU access for easy and reproducible evaluation of ML-based fuzzers. Neuzz++, MLFuzz, and all our data are public.
Cloud Watching: Understanding Attacks Against Cloud-Hosted Services
Cloud computing has dramatically changed service deployment patterns. In this work, we analyze how attackers identify and target cloud services in contrast to traditional enterprise networks and network telescopes. Using a diverse set of cloud honeypots in 5 providers and 23 countries as well as 2 educational networks and 1 network telescope, we analyze how IP address assignment, geography, network, and service-port selection influence what services are targeted in the cloud. We find that scanners that target cloud compute are selective: they avoid scanning networks without legitimate services and they discriminate between geographic regions. Further, attackers mine Internet-service search engines to find exploitable services and, in some cases, they avoid targeting IANA-assigned protocols, causing researchers to misclassify at least 15% of traffic on select ports. Based on our results, we derive recommendations for researchers and operators.
"Always Contribute Back": A Qualitative Study on Security Challenges of the Open Source Supply Chain
Open source components are ubiquitous in companies’ setups, processes, and software. Utilizing these external components as building blocks enables companies to leverage the benefits of open source software, allowing them to focus their efforts on features and faster delivery instead of writing their own components. But by introducing these components into their software stack, companies inherit unique security challenges and attack surfaces: including code from potentially unvetted contributors, as well as the obligation to assess and mitigate the impact of vulnerabilities in external components.
In 25 in-depth, semi-structured interviews with software developers, architects, and engineers from industry projects, we investigate their projects’ processes, decisions, and considerations in the context of external open source code. We find that open source components play an important role in many of our participants’ projects, that most projects have some form of company policy or at least best practice for including external code, and that many developers wish for more developer-hours, dedicated teams, or tools to better audit included components. Based on our findings, we discuss implications for company stakeholders and the open source software ecosystem. Overall, we appeal to companies to not treat the open source ecosystem as a free (software) supply chain and instead to contribute towards the health and security of the overall software ecosystem they benefit from and are part of
MobileAtlas: Geographically Decoupled Measurements in Cellular Networks for Security and Privacy Research
Cellular networks are not merely data access networks to the Internet. Their distinct services and ability to form large complex compounds for roaming purposes make them an attractive research target in their own right. Their promise of providing a consistent service with comparable privacy and security across roaming partners falls apart at close inspection.
Thus, there is a need for controlled testbeds and measurement tools for cellular access networks doing justice to the technology’s unique structure and global scope. Particularly, such measurements suffer from a combinatorial explosion of operators, mobile plans, and services. To cope with these challenges, we built a framework that geographically decouples the SIM from the cellular modem by selectively connecting both remotely. This allows testing any subscriber with any operator at any modem location within minutes without moving parts. The resulting GSM/UMTS/LTE measurement and testbed platform offers a controlled experimentation environment, which is scalable and cost-effective. The platform is extensible and fully open-sourced, allowing other researchers to contribute locations, SIM cards, and measurement scripts.
Using the above framework, our international experiments in commercial networks revealed exploitable inconsistencies in traffic metering, leading to multiple phreaking opportunities, i.e., fare-dodging. We also expose problematic IPv6 firewall configurations, hidden SIM card communication to the home network, and fingerprint dial progress tones to track victims across different roaming networks and countries with voice calls
Stochastic distributed learning with gradient quantization and double-variance reduction
We consider distributed optimization over several devices, each sending incremental model updates to a central server. This setting is considered, for instance, in federated learning. Various schemes have been designed to compress the model updates in order to reduce the overall communication cost. However, existing methods suffer from a significant slowdown due to additional variance ω>0 coming from the compression operator and as a result, only converge sublinearly. What is needed is a variance reduction technique for taming the variance introduced by compression. We propose the first methods that achieve linear convergence for arbitrary compression operators. For strongly convex functions with condition number κ, distributed among n machines with a finite-sum structure, each worker having fewer than m components, we also (i) give analysis for the weakly convex and the non-convex cases and (ii) verify in experiments that our novel variance reduced schemes are more efficient than the baselines. Moreover, we show theoretically that as the number of devices increases, higher compression levels are possible without this affecting the overall number of communications in comparison with methods that do not perform any compression. This leads to a significant reduction in communication cost. Our general analysis allows picking the most suitable compression for each problem, finding the right balance between additional variance and communication savings. Finally, we also (iii) give analysis for arbitrary quantized updates.
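To make the "additional variance ω" of a compression operator concrete, here is a minimal sketch using random-k sparsification. The abstract does not name this operator; it is a standard textbook example chosen here as an assumption. It keeps k of d coordinates uniformly at random and rescales them by d/k, which makes it unbiased with variance (d/k − 1)·‖x‖², so ω = d/k − 1. The function names `rand_k` and `exact_moments` are hypothetical.

```python
import itertools
import random


def rand_k(x, k, rng):
    """Random-k sparsification: keep k random coordinates, scaled by d/k.

    Illustrative operator only; not taken from the abstract above.
    """
    d = len(x)
    keep = set(rng.sample(range(d), k))
    return [(d / k) * v if i in keep else 0.0 for i, v in enumerate(x)]


def exact_moments(x, k):
    """Exact mean and variance of rand_k by enumerating all k-subsets.

    Returns (mean vector, E||C(x) - x||^2). Feasible only for small d.
    """
    d = len(x)
    subsets = list(itertools.combinations(range(d), k))
    mean = [0.0] * d
    var = 0.0
    for keep in subsets:
        c = [(d / k) * v if i in keep else 0.0 for i, v in enumerate(x)]
        mean = [m + ci / len(subsets) for m, ci in zip(mean, c)]
        var += sum((ci - v) ** 2 for ci, v in zip(c, x)) / len(subsets)
    return mean, var


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    k = 2
    omega = len(x) / k - 1                    # variance parameter of rand-k
    mean, var = exact_moments(x, k)
    print("one sample:", rand_k(x, k, random.Random(0)))
    print("mean:", mean)                      # should match x (unbiasedness)
    print("variance:", var, "vs omega * ||x||^2 =", omega * sum(v * v for v in x))
```

Smaller k means fewer coordinates sent per update but a larger ω, which is the trade-off between communication savings and additional variance that the abstract refers to.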
UnGANable: Defending Against GAN-based Face Manipulation
Deepfakes pose severe threats of visual misinformation to our society. One representative deepfake application is face manipulation that modifies a victim’s facial attributes in an image, e.g., changing her age or hair color. The state-of-the-art face manipulation techniques rely on Generative Adversarial Networks (GANs). In this paper, we propose the first defense system, namely UnGANable, against GAN-inversion-based face manipulation. Specifically, UnGANable focuses on defending against GAN inversion, an essential step for face manipulation. Its core technique is to search for alternative images (called cloaked images) around the original images (called target images) in image space. When posted online, these cloaked images can jeopardize the GAN inversion process. We consider two state-of-the-art inversion techniques including optimization-based inversion and hybrid inversion, and design five different defenses under five scenarios depending on the defender’s background knowledge. Extensive experiments on four popular GAN models trained on two benchmark face datasets show that UnGANable achieves remarkable effectiveness and utility performance, and outperforms multiple baseline methods. We further investigate four adaptive adversaries to bypass UnGANable and show that some of them are slightly effective.