
    A federated society of bots for smart contract testing

    Smart contracts are a new type of software that allows its users to perform irreversible transactions on a distributed persistent data storage called the blockchain. The nature of such contracts and the technical details of the blockchain architecture give rise to new kinds of faults, which require specific test behaviours to be exposed. In this paper we present SoCRATES, a generic and extensible framework to test smart contracts running in a blockchain. The key properties of SoCRATES are: (1) it comprises bots that interact with the blockchain according to a set of composable behaviours; (2) it can instantiate a society of bots, which can trigger faults due to multi-user interactions that are impossible to expose with a single bot. Our experimental results show that SoCRATES can expose known faults and detect previously unknown faults in contracts currently published in the Ethereum blockchain. They also show that a society of bots is often more effective than a single bot in fault exposure. (C) 2020 Elsevier Inc. All rights reserved.

    Boundary State Generation for Testing and Improvement of Autonomous Driving Systems

    Recent advances in Deep Neural Networks (DNNs) and sensor technologies are enabling autonomous driving systems (ADSs) with an ever-increasing level of autonomy. However, assessing their dependability remains a critical concern. State-of-the-art ADS testing approaches modify the controllable attributes of a simulated driving environment until the ADS misbehaves. In such approaches, environment instances in which the ADS is successful are discarded, despite the possibility that they could contain hidden driving conditions in which the ADS may misbehave. In this paper, we present GENBO (GENerator of BOundary state pairs), a novel test generator for ADS testing. GENBO mutates the driving conditions of the ego vehicle (position, velocity and orientation), collected in a failure-free environment instance, and efficiently generates challenging driving conditions at the behavior boundary (i.e., where the model starts to misbehave) in the same environment instance. We use such boundary conditions to augment the initial training dataset and retrain the DNN model under test. Our evaluation results show that the retrained model has, on average, up to 3x higher success rate on a separate set of evaluation tracks with respect to the original DNN model

    Deep Reinforcement Learning for Black-box Testing of Android Apps

    The state space of Android apps is huge, and its thorough exploration during testing remains a significant challenge. The best exploration strategy is highly dependent on the features of the app under test. Reinforcement Learning (RL) is a machine learning technique that learns the optimal strategy to solve a task by trial and error, guided by positive or negative reward, rather than explicit supervision. Deep RL is a recent extension of RL that takes advantage of the learning capabilities of neural networks. Such capabilities make Deep RL suitable for complex exploration spaces such as that of Android apps. However, state-of-the-art, publicly available tools only support basic, Tabular RL. We have developed ARES, a Deep RL approach for black-box testing of Android apps. Experimental results show that it achieves higher coverage and fault revelation than the baselines, including state-of-the-art tools such as TimeMachine and Q-Testing. We also investigated the reasons behind such performance qualitatively, and we have identified the presence of chained and blocking activities as the key feature of Android apps that makes Deep RL particularly effective on them. Moreover, we have developed FATE to fine-tune the hyperparameters of Deep RL algorithms on simulated apps, since fine-tuning is computationally expensive to carry out on real apps.

    STILE: A tool for optimizing E2E web test scripts parallelization

    Web application quality is commonly assessed by executing End-to-End (E2E) test scripts that interact with those systems as a human tester would. To avoid setting up the web application state for each test script, testers usually create test scripts that may depend on others previously executed. However, the presence of dependencies prevents parallelization, a fundamental technique for speeding up the execution of large test suites. In this paper, we present STILE, a tool for parallelizing the execution of E2E web test scripts that generates and executes a set of test schedules satisfying two important constraints: (1) every schedule respects existing test dependencies, and (2) all test scripts in the test suite are executed at least once. Moreover, STILE optimizes the execution by running only once the test scripts that are shared among the schedules. We empirically evaluated STILE on eight E2E test suites by comparing the execution time of STILE both with the sequential execution and with the parallel execution based on Selenium Grid. Our results show that STILE can reduce the execution time by up to 80% w.r.t. the sequential execution and by up to 50% w.r.t. Grid. Moreover, STILE provides a reduction in CPU usage (i.e., overall CPU-time) of up to 75%.
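
    The two schedule constraints in the abstract above can be illustrated with a minimal Python sketch. It is not STILE's algorithm, which the abstract does not describe. It assumes the dependencies form an acyclic graph and builds one schedule per test that no other test depends on. The function name `build_schedules` and the test names are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def build_schedules(deps):
    """Build one schedule per 'leaf' test (a test nothing else depends on).

    deps maps each test name to the set of tests it depends on.
    Illustrative assumption only, not the strategy used by STILE.
    """
    # Global order in which every test comes after its prerequisites.
    order = list(TopologicalSorter(deps).static_order())
    position = {test: i for i, test in enumerate(order)}

    def closure(test):
        # The test itself plus all of its transitive prerequisites.
        seen, stack = set(), [test]
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(deps.get(current, ()))
        return seen

    required = {d for prerequisites in deps.values() for d in prerequisites}
    leaves = sorted((set(deps) | required) - required)
    # Each schedule lists a leaf's prerequisites in dependency order.
    return [sorted(closure(leaf), key=position.__getitem__) for leaf in leaves]


# Hypothetical test suite: three tests need the state created by "login".
deps = {
    "login": set(),
    "add_item": {"login"},
    "checkout": {"add_item"},
    "edit_profile": {"login"},
}
schedules = build_schedules(deps)
```

    Every test appears in at least one schedule, and each schedule lists prerequisites before the tests that need them. In this example both schedules start with `login`, which is the kind of shared test script the abstract says is run only once.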

    STILE: A Tool for Parallel Execution of E2E Web Test Scripts

    Automated end-to-end (E2E) Web testing relying on frameworks such as Selenium WebDriver is commonly used to assess the quality of web applications. However, the resulting test scripts may require long execution times, due to their interaction with the browser GUI and backend services. To avoid repeated and costly setup of the Web application state, testers tend to build test suites whose test scripts depend on each other (i.e., one test case sets up the application state expected by another test case). In this paper we present Stile, a tool for the parallel execution of Web test scripts that ensures the compliance of all execution schedules with the dependencies among the involved test scripts, while at the same time minimizing the execution time and the computation time required for such parallel execution. Experimental results show that execution times can be approximately halved thanks to Stile.

    Diversity-based web test generation

    Existing web test generators derive test paths from a navigational model of the web application, completed with either manually or randomly generated input values. However, manual test data selection is costly, while random generation often results in infeasible input sequences, which are rejected by the application under test. Random and search-based generation can achieve the desired level of model coverage only after a large number of test execution attempts, each slowed down by the need to interact with the browser during test execution. In this work, we present a novel web test generation algorithm that pre-selects the most promising candidate test cases based on their diversity from previously generated tests. As such, only the test cases that explore diverse behaviours of the application are considered for in-browser execution. We have implemented our approach in a tool called DIG. Our empirical evaluation on six real-world web applications shows that DIG achieves higher coverage and fault detection rates significantly earlier than crawling-based and search-based web test generators.
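
    The idea of pre-selecting candidates by their diversity from earlier tests can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which distance DIG uses, so the sketch represents each test as a list of action names and uses Jaccard distance. The names `jaccard_distance` and `select_most_diverse` and the sample tests are hypothetical.

```python
def jaccard_distance(a, b):
    """Distance between two tests, each viewed as a set of action names."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def select_most_diverse(candidates, executed):
    """Return the candidate whose nearest executed test is farthest away.

    Assumed selection rule for illustration, not the one used by DIG.
    """
    def novelty(candidate):
        if not executed:
            return 1.0  # nothing executed yet: every candidate is novel
        return min(jaccard_distance(candidate, prev) for prev in executed)

    return max(candidates, key=novelty)


# Hypothetical tests, written as sequences of page actions.
executed = [["home", "login"]]
candidates = [
    ["home", "login", "logout"],  # distance 1/3 from the executed test
    ["home", "search", "cart"],   # distance 3/4 from the executed test
]
chosen = select_most_diverse(candidates, executed)
```

    Only the chosen candidate would then be run in the browser, so the expensive in-browser step is spent on tests that differ most from what has already been tried.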

    Model-based Exploration of the Frontier of Behaviours for Deep Learning System Testing

    Replication package for the paper "Model-based Exploration of the Frontier of Behaviours for Deep Learning System Testing".

    An industrial experience report on applying search-based boundary input generation to cyber-physical systems

    Testing Cyber-Physical Systems (CPS) is crucial, as they play a central role in modern society. In the complex input space of these systems, boundary test inputs provide a valuable asset for test engineers as they identify slight input modifications that dramatically impact Quality of Service. In this experience paper, we propose LiftJanus, the first search-based test generator for CPS that integrates test input minimization, boundary value detection, and automated system repair. We performed an empirical study involving two real-world elevator systems provided by our industrial collaborator, Orona. Our results show that LiftJanus generated boundary inputs twice as effective as those of the baselines, with the repair algorithm successfully enhancing the system’s configuration in 76.25% of the cases. Interviews with domain experts confirmed that LiftJanus is a comprehensive solution for enhancing the quality of elevator systems.

    Going Beyond Counting First Authors in Author Co-citation Analysis

    The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed
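
    The difference between the two counting schemes can be made concrete with a small Python sketch. It is a simplified illustration, not the study's procedure: it assumes each citing paper is given as a list of cited works, each cited work is an ordered author list, and a pair of authors is counted once per citing paper in which both appear. The function name `cocitation_counts` and the author labels are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count author pairs cited together, once per citing paper.

    max_authors=1 gives first-author counting; max_authors=5 considers
    the first five authors of every cited work (assumed simplification).
    """
    counts = Counter()
    for references in citing_papers:
        cited_authors = set()
        for authors in references:
            cited_authors.update(authors[:max_authors])
        # Sort so each pair has a single canonical key.
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts


# Two hypothetical citing papers; each cites a work by A and B plus one other work.
citing_papers = [
    [["A", "B"], ["C"]],
    [["A", "B"], ["D"]],
]
first_author = cocitation_counts(citing_papers, max_authors=1)
first_five = cocitation_counts(citing_papers, max_authors=5)
```

    With first-author counting, author B never appears in the data. With the first five authors, B is linked to A, C and D. Note that this simplified rule also links co-authors of the same cited work, which an actual study may treat differently.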