The integration of multi-color taint-analysis with dynamic symbolic execution for Java web application security analysis
The view of IT security in today’s software development processes is changing. While IT
security used to be seen mainly as a risk that had to be managed during the operation
of IT systems, a class of security weaknesses is seen today as measurable quality aspects
of IT system implementations, e.g., the number of paths allowing SQL injection attacks.
Current trends, such as DevSecOps pipelines, therefore establish security testing in the
development process aiming to catch these security weaknesses before they make their
way into production systems. At the same time, the analysis works differently than in
functional testing, as security requirements are mostly universal and not project specific.
Further, they measure the quality of the source code and not the function of the system.
As a consequence, established testing strategies such as unit testing or integration testing
are not applicable for security testing. Instead, a new category of tools is required in
the software development process: IT security weakness analyzers. These tools scan
the source code for security weaknesses independent of the functional aspects of the
implementation. In general, such analyzers give stronger guarantees for the presence
or absence of security weaknesses than functional testing strategies. In this thesis, I
present a combination of dynamic symbolic execution and explicit dynamic multi-color
taint analysis for the security analysis of Java web applications. Explicit dynamic
taint analysis is an established monitoring technique that allows the precise detection of
security weaknesses along a single program execution path, if any are present. Multi-color
taint analysis implies that different properties defining diverse security weaknesses can
be expressed at the same time in different taint colors and are analyzed in parallel during
the execution of a program path. Each taint color analyzes its own security weakness
and taint propagation can be tailored in specific sanitization points for this color. The
downside of dynamic taint analysis is the single exploration of one path. Therefore, this
technique requires a path generator component as counterpart that ensures all relevant
paths are explored. Dynamic symbolic execution is appropriate here, as enumerating all
reachable execution paths in a program is its established strength. The Jaint framework
presented here combines these two techniques in a single tool. More specifically, the
thesis looks into SMT meta-solving, extending dynamic symbolic execution on Java
programs with string operations, and the configuration problem of multi-color taint
analysis in greater detail to enable Jaint for the analysis of Java web applications. The
evaluation demonstrates that the resulting framework is the best research tool on the
OWASP Benchmark. One of the two dynamic symbolic execution engines that I worked
on as part of the thesis won gold in the Java track of SV-COMP 2022. The other
demonstrates that it is possible to lift the implementation design from a research-specific
JVM to an industry-grade JVM, paving the way for the future scaling of Jaint.
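The idea of multi-color taint analysis described above can be illustrated with a minimal sketch. It is not the Jaint implementation: the `Tainted` class, the helper functions, and the color names `sqli` and `xss` are hypothetical and chosen only for illustration, and Python is used for brevity although the thesis targets Java. Each value carries a set of taint colors, propagation takes the union of the colors, and a sanitizer removes exactly one color.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tainted:
    """A string value plus the set of taint colors attached to it (hypothetical helper)."""
    value: str
    colors: frozenset = frozenset()


def source(value, *colors):
    """Mark a value coming from an untrusted source with one or more colors."""
    return Tainted(value, frozenset(colors))


def concat(*parts):
    """Concatenate plain strings and tainted values; the result carries the union of all colors."""
    value = "".join(p.value if isinstance(p, Tainted) else p for p in parts)
    colors = frozenset().union(*(p.colors for p in parts if isinstance(p, Tainted)))
    return Tainted(value, colors)


def sanitize(tainted, color, fn):
    """Apply a sanitizer for one color only; all other colors stay attached."""
    return Tainted(fn(tainted.value), tainted.colors - {color})


def reaches_sink(tainted, color):
    """Report a weakness if a value still carrying this color reaches the matching sink."""
    return color in tainted.colors


# One execution path, two weaknesses tracked in parallel.
user_input = source("x' OR '1'='1", "sqli", "xss")
query = concat("SELECT * FROM users WHERE name = '", user_input, "'")
escaped = sanitize(user_input, "sqli", lambda s: s.replace("'", "''"))
safe_query = concat("SELECT * FROM users WHERE name = '", escaped, "'")
```

In this sketch `query` would be flagged at an SQL sink, while `safe_query` would not. Because the SQL escaping removed only the `sqli` color, `safe_query` would still be flagged at an HTML output sink. Like the technique it illustrates, the sketch observes a single path, so a path generator such as dynamic symbolic execution is still needed to cover the others.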
Design principles for data quality tools
Data quality is an essential aspect of organizational data management and can facilitate accurate decision-making and building competitive advantages. Numerous data quality tools aim to support data quality work by offering automation for different activities, such as data profiling or validation. However, despite a long history of tools and research, a lack of data quality remains an issue for many organizations. Data quality tools face changes in the organizational (e.g., evolving data architectures) and technical (e.g., big data) environment. Established tools cannot fully comprehend these changes, and limited prescriptive design knowledge on creating adequate tools is available. In this cumulative dissertation, we summarize the findings of nine individual studies on the objectives and design of data quality tools. Most importantly, we conducted four case studies on implementing data quality tools in real-world scenarios. In each case, we designed and implemented a separate data quality tool and abstracted the essential design elements. A subsequent cross-case analysis helped us accumulate the available design knowledge, resulting in the proposal of 13 generalized design principles. With the proposal of empirically grounded design knowledge, the dissertation contributes to the managerial and scientific communities. Managers can use our results to create customized data quality tools and assess offerings at the market. Scientifically, we address the lack of prescriptive design knowledge for data quality tools and offer many opportunities to extend our research in multiple directions. The continuous work on data quality tools will help them become more successful in ensuring data fulfills high-quality standards for the benefit of businesses and society.
Systematic approach towards safety of the intended functionality
With the transition from advanced driver assistance systems to automated vehicles, safety is becoming a key goal for broad market introduction. Functional safety for low-level automated driving (L0-L2 by SAE standard) is well-measurable and manageable based on the methods described by the standard ISO 26262. However, since the fallback of the human driver is gradually taken out of the loop for automated driving systems (ADS), ISO 26262 is insufficient to cover the analysis of certain critical situations. In these situations, failures are due not only to the vehicle’s E/E system but also to difficult environmental situations. These situations are deemed difficult for ADS because they could be handled improperly as a result of specification or design insufficiencies. Such conditions are crucial to safety verification: organizing scenario-based testing around them is more efficient and feasible than exhaustively exploring the scenario space. This, in turn, requires systematically identifying these conditions within a given Operational Design Domain (ODD) and developing a corresponding test strategy. Thus, this thesis elaborates on a systematic approach to tackle the challenges around difficult environmental conditions for ADS. Firstly, we interpret the nature of difficult conditions based on the state-of-the-art literature. Next, we summarize three types of difficult conditions, namely the presence/absence of specific environmental factors within the given ODD, specific behaviors of environmental factors, and specific interactions among environmental factors. Correspondingly, we propose formal, machine-readable formulations for each type. Consequently, the difficult conditions can be described uniformly, which facilitates evaluating these conditions against certain criteria, creating test cases, and tracing test results.
After that, we design both analytical and data-driven approaches to systematically identify difficult environmental conditions from the given ODD. On the one hand, we design an analytical method called Scenario-based Hazard and Fault Analysis (SHFA), which supports domain experts in eliciting difficult environmental conditions by analyzing potential hazards in driving scenarios with their domain experience. On the other hand, we aim at finding critical scenarios containing difficult environmental conditions from driving data. To that end, we develop a fully automatic pipeline for reconstructing automated vehicle disengagement scenarios from real test drives. Finally, we present an overall test strategy and a test case generation method to integrate difficult conditions into scenario-based testing. This thesis has been developed in close collaboration with industrial automated vehicle production, and therefore, the presented concept and methods target conformance and compliance with the state-of-the-art automotive safety standards like ISO 21448 and regulations like EU 2022/1426. To the best of our knowledge, this thesis provides the first coherent framework for identifying, managing, and testing difficult environmental conditions for verifying ADS on the system level. The empirical findings suggest that concepts and methods around difficult environmental conditions can significantly contribute to identifying and constructing critical test cases, thereby advancing scenario-based verification for automated vehicles.
Adaptive Learning for Learn-Based Regression Testing
Regression testing is an important activity to prevent the introduction of regressions into software updates. Learn-based testing can be used to automatically check new versions of a system for regressions on a system level. This is done by learning a model of the system and model checking this model for system property violations. However, learning the model of a large system can take an impractical amount of time. In this work we investigate whether the concept of adaptive learning can improve the learning speed of a model in a regression testing scenario. We have performed several experiments with this technique on two systems: ToDoMVC and SSH. We find that there can be a large benefit to using adaptive learning. In addition, we find three main factors that influence the benefit of adaptive learning. There are, however, also some shortcomings to adaptive learning that should be investigated further.
Semantische Klassifikation von urbanen Verkehrszenarien für die Absicherung des automatischen Fahrens (Semantic Classification of Urban Traffic Scenarios for the Safety Assurance of Automated Driving)
The safety argumentation of an automated vehicle is an essential condition before its use on public roads. For this reason, a thorough Verification and Validation (V&V) process is a fundamental aspect of the development and commercial release for every automated vehicle. In recent years, a number of scientific publications have reasoned that a distance-based V&V approach, aimed at providing a stochastic safety argument by achieving a desired failure-rate over a defined testing distance, will not be feasible to implement for automated vehicles. The main reason for this is the fact that the distance required to be driven with development vehicles exceeds economical and practical limits by far. To overcome this challenge, scenario-based V&V approaches are currently a subject of many research activities. These methodologies aim to evaluate the safety of an automated vehicle by testing it in a variety of different traffic scenarios. This allows the safety V&V to be decomposed into smaller units in the form of scenarios instead of having to achieve one large statistical argument for the safety of the system. However, urban traffic scenarios themselves form a complex, high-dimensional state space, which makes it challenging to argue for the completeness of scenario-based V&V approaches. This work aims to address this challenge through a semantic classification of urban traffic scenarios. By extracting them in a structured manner from large volumes of recorded driving data, it is possible to analyze them statistically and to make empirical, data-driven contributions to the V&V process of automated vehicles. To this end, a catalog of driving maneuvers is introduced to describe the behavior of vehicles in urban traffic on a semantic level. Next, algorithms for the automated classification of these maneuvers are implemented and evaluated with respect to their detection accuracy.
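Why the distance-based argument becomes infeasible can be made concrete with a standard back-of-the-envelope bound. It is a sketch under assumptions that the abstract does not state: failures are modeled as a Poisson process with a constant rate per kilometer, and the symbols $\lambda$ (target failure rate), $\alpha$ (accepted error probability), and $d$ (failure-free test distance) are introduced here for illustration only.

```latex
P(\text{no failure over distance } d) = e^{-\lambda d} \le \alpha
\quad\Longrightarrow\quad
d \ge \frac{-\ln \alpha}{\lambda}
```

At a confidence level of 95% ($\alpha = 0.05$) this gives roughly $d \ge 3/\lambda$, so the required failure-free distance grows inversely with the target failure rate.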
Based on this automated maneuver classification, an empirical analysis of urban traffic scenario diversity is conducted. Here, a special focus is put on saturation effects during data collection as well as the observed exposure of various semantic scenario elements. The results of the investigations provide some of the first quantitative insights into the long-tail problem of automated driving V&V, which is often mentioned in current literature in this field. From the empirical findings it is further concluded that a semantic scenario classification has the potential to contribute substantially to a data-driven, scenario-based safety argumentation for automated vehicles.
The safety case for an automated vehicle is a mandatory step that must be completed before its market introduction on public roads. For this reason, a sound verification and validation (V&V) process is an elementary part of the development work for every automated driving system. In recent years, various scientific publications have concluded that a distance-based V&V approach, in which safety is demonstrated statistically by achieving a target failure rate over a previously defined reference distance, will in all likelihood not be applicable to automated vehicles. This is mainly because the required test distances would far exceed any economically justifiable level. For this reason, scenario-based V&V approaches are currently the subject of many research activities, in which the safety of the automated vehicle is to be demonstrated in a large number of traffic scenarios. This work deals with the semantic classification of urban traffic scenarios so that they can be captured in a structured way in large volumes of data and analyzed statistically for the V&V process of automated vehicles.
To this end, a catalog of driving maneuvers is first developed for the semantic description of urban driving behavior. Algorithms for the automatic classification of these maneuvers in recorded vehicle measurement data are then implemented and evaluated with respect to their detection quality. Based on this automatic classification, an empirical analysis of urban traffic scenarios is carried out, with particular attention to saturation effects during data collection and the exposure of individual semantic scenario elements. The investigations provide first quantitative evidence of the long-tail problem in the validation of automated driving functions, which is widely discussed in the scientific literature. From the empirical findings it is concluded that semantic scenario classification has the potential to make an important contribution to a data-driven, scenario-based safety argumentation for automated vehicles.
Modelling and Analysing ERTMS Hybrid Level 3 with the mCRL2 Toolset
ERTMS Hybrid Level 3 is a recent proposal for a train control system specification that serves to increase the capacity of the railway network by allowing multiple trains with an integrity monitoring system and a GSM-R connection to the trackside on a single section. In this paper we model the principles of ERTMS Hybrid Level 3 in the mCRL2 process algebra and perform an analysis with its associated toolset. Our analysis has resulted in suggestions for improvement of the principles that will be taken into account in the next version of the specification.
Software fault injection and localization in embedded systems
Injection and localization of software faults have been extensively researched, but the results are not directly transferable to embedded systems. The domain-specific constraints applying to these systems, such as limited resources and the predominant C/C++ programming languages, require a specific set of injection and localization techniques. In this thesis, we have assessed existing approaches and have contributed a set of novel methods for software fault injection and localization in embedded systems.
We have developed a method based on AspectC++ for the injection of errors at interfaces and a method based on Clang for the accurate injection of software faults directly into source code. Both approaches work particularly well in the context of embedded systems, because they do not require runtime support and modify binaries only when necessary. Nevertheless, they are suitable to inject software faults and errors into the software of other domains.
These contributions required a thorough assessment of fault injection techniques and fault models presented in literature over the years, which raised multiple questions regarding their validity in the context of C/C++. We found that macros (particularly header files), compile-time language constructs, and the commonly used optimization levels introduce a non-negligible bias to experimental results achieved by injection methods operating on any other layer than the source code. Additionally, we found that the textual specification of fault models is prone to ambiguities and misunderstandings. We have conceived an automatic fault classifier to solve this problem in a field study.
Regarding software fault localization, we have combined existing methods making use of program spectra and assertions, and have contributed a new oracle type for autonomous localization of software faults in the field. Our evaluation shows that this approach works particularly well in the context of embedded systems because the generated information can be processed in real-time and, therefore, it can run in an unsupervised manner.
In conclusion, we assessed a variety of injection and localization approaches in the context of embedded systems and contributed novel methods where applicable, improving the current state of the art. Our results also point out weaknesses regarding the general validity of the majority of previous injection experiments in C/C++.
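Fault localization based on program spectra, as mentioned above, ranks program statements by how strongly their execution correlates with failing runs. The following minimal sketch uses the Ochiai metric, a commonly used suspiciousness formula. The abstract does not say which metric the thesis uses, so the metric, the function name `ochiai_scores`, and the sample data are assumptions for illustration only.

```python
import math


def ochiai_scores(coverage, failed):
    """Rank statements by Ochiai suspiciousness.

    coverage: dict mapping a test name to the set of statement ids it executed.
    failed:   set of names of the failing tests.
    Returns a dict mapping each statement id to a score in [0, 1].
    """
    total_failed = len(failed)
    statements = set().union(*coverage.values()) if coverage else set()
    scores = {}
    for stmt in statements:
        # ef / ep: failing / passing tests that executed this statement.
        ef = sum(1 for t, cov in coverage.items() if stmt in cov and t in failed)
        ep = sum(1 for t, cov in coverage.items() if stmt in cov and t not in failed)
        denom = math.sqrt(total_failed * (ef + ep))
        scores[stmt] = ef / denom if denom else 0.0
    return scores


# Hypothetical spectrum: three tests, three statements, test "t2" fails.
spectrum = {"t1": {1, 2}, "t2": {1, 3}, "t3": {1, 2, 3}}
scores = ochiai_scores(spectrum, failed={"t2"})
most_suspicious = max(scores, key=scores.get)
```

In this sample, statement 3 gets the highest score because it is executed by the failing test and by only one passing test, whereas statement 1 is executed by every test and statement 2 by no failing test.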
Engineering of Safe Autonomous Vehicles through Seamless Integration of System Development and System Operation
One of the significant open challenges is the lack of verification and validation approaches for assuring the safety of autonomous vehicles. The vast number of real-world traffic situations has to be considered in the verification and validation. Today's conventional engineering methods are not adequate for providing such guarantees for autonomous vehicles in a cost-efficient way. One strategy for reducing the costs of quality assurance is transferring a significant part of the verification and validation from road tests to (system-level) simulations. Extensive coverage of real-world situations in simulations requires the integration of development and operation. This thesis presents an engineering approach that integrates the development and operation of autonomous vehicles seamlessly using runtime monitoring. The runtime monitoring verifies whether autonomous vehicles satisfy their requirements and operate within safe limits that have been verified in the simulations. Systematic and comprehensive simulations support the improvement of autonomous vehicles and coverage of traffic situations. Results of the runtime monitoring during operation are transferred to the development for the verification of autonomous vehicles and their safe limits in simulations with additional traffic situations. The incomplete verification of autonomous vehicles for the vast number of real-world traffic situations in simulations requires the validation of simulation results and additional monitoring in the real world. Results from simulations are transferred to the runtime monitoring during operation in the real world. Vehicle data and real-world situations are highly complex and therefore affect the complexity and efficiency of the verification in simulations. The runtime monitoring abstracts from internal data of autonomous vehicles and real-world situations in the evaluation.
Programmierkonzepte für die Umsetzung von Nutzungsrichtlinien in industriellen Datenräumen (Programming Concepts for Implementing Usage Policies in Industrial Data Spaces)
Over time, data has increasingly become a valuable asset. For this reason, control over their own data is of central importance to rights holders. The ability of rights holders to decide autonomously how their data is used is referred to as data sovereignty. This thesis addresses the question of how attaining and maintaining data sovereignty can be supported technically through usage control mechanisms.
This thesis develops a flexible and extensible programming language with integrated usage control mechanisms, named D°. By implementing the paradigm of policy-agnostic programming, the complexity of the usage control mechanisms is encapsulated and can be addressed by experts. Part of this complexity has been moved into the compiler and resolved there, so users of the language no longer need to deal with it. This relieves the application developer and simplifies the correct use of usage control mechanisms.
Furthermore, the thesis presents how the remote evaluation paradigm can be implemented for D°. The paradigm targets scenarios of cooperative data use and avoids sending data to third parties who wish to use it. Instead, the data-processing applications and their computation results are sent back and forth. As a result, the data always remains on the systems of the rights holder, which can at the same time draw on the advantages of the usage control mechanisms in D°. This enables cooperative data use in scenarios where passing on data is ruled out and technical measures for data usage control are necessary.
The results are presented and validated by means of a larger demonstrator. The individual aspects of D° are illustrated in practice using examples. In addition, the solution is positioned within the International Data Spaces, which substantially motivated and shaped this thesis. This positioning shows that the expressive power of the usage control mechanisms of D° is equal to or better than that of other usage control mechanisms used in the International Data Spaces.
