1,721,037 research outputs found

    The Impact of Dormant Defects on Defect Prediction: A Study of 19 Apache Projects

    Full text link
    Defect prediction models can be beneficial to prioritize testing, analysis, or code review activities, and has been the subject of a substantial effort in academia, and some applications in industrial contexts. A necessary precondition when creating a defect prediction model is the availability of defect data from the history of projects. If this data is noisy, the resulting defect prediction model could result to be unreliable. One of the causes of noise for defect datasets is the presence of "dormant defects," i.e., of defects discovered several releases after their introduction. This can cause a class to be labeled as defect-free while it is not, and is, therefore "snoring." In this article, we investigate the impact of snoring on classifiers' accuracy and the effectiveness of a possible countermeasure, i.e., dropping too recent data from a training set. We analyze the accuracy of 15 machine learning defect prediction classifiers, on data from more than 4,000 defects and 600 releases of 19 open source projects from the Apache ecosystem. Our results show that on average across projects (i) the presence of dormant defects decreases the recall of defect prediction classifiers, and (ii) removing from the training set the classes that in the last release are labeled as not defective significantly improves the accuracy of the classifiers. In summary, this article provides insights on how to create defects datasets by mitigating the negative effect of dormant defects on defect prediction

    Snoring: A noise in defect prediction datasets

    No full text
    In order to develop and train defect prediction models, researchers rely on datasets in which a defect is often attributed to a release where the defect itself is discovered. However, in many circumstances, it can happen that a defect is only discovered several releases after its introduction. This might introduce a bias in the dataset, i.e., treating the intermediate releases as defect-free and the latter as defect-prone. We call this phenomenon as 'sleeping defects'. We call 'snoring' the phenomenon where classes are affected by sleeping defects only, that would be treated as defect-free until the defect is discovered. In this paper we analyze, on data from 282 releases of six open source projects from the Apache ecosystem, the magnitude of the sleeping defects and of the snoring classes. Our results indicate that 1) on all projects, most of the defects in a project slept for more than 20% of the existing releases, and 2) in the majority of the projects the missing rate is more than 25% even if we remove 50% of releases

    Relationship between design patterns defects and crosscutting concern scattering degree: an empirical study

    No full text
    Design patterns are solutions to recurring design problems, aimed at increasing reuse, code quality and, above all, maintainability and resilience to changes. Despite such advantages, the usage of design patterns implies the presence of crosscutting code implementing the pattern usage and access from other system components. When the system evolves, the presence of crosscutting code can cause repeated changes, possibly introducing defects. This study reports an empirical study, in which it is showed that, for three open source projects, the number of defects in design-pattern classes is in several cases correlated with the scattering degree of their induced crosscutting concerns, and also varies among different kinds of pattern

    3rd International Workshop on Designing Empirical Studies: Assessing the Effectiveness of Agile Methods (IWDES 2009)

    No full text
    Assessing the effectiveness of a development methodology is difficult and requires an extensive empirical investigation. Moreover, the design of such investigations is complex since they involve several stakeholders and their validity can be questioned if not replicated in similar and different contexts. Agilists are aware that data collection is important and the problem of designing and execute meaningful experiments is common. This workshop aims at creating a critical mass for the development of new and extensive investigations in the Agile world

    An Empirical Study on the Maintenance of Source Code Clones

    No full text
    Code cloning has been very often indicated as a bad software development practice. However, many studies appearing in the literature indicate that this is not always the case. In fact, either changes occurring in cloned code are consistently propagated, or cloning is used as a sort of templating strategy, where cloned source code fragments evolve independently. This paper (a) proposes an automatic approach to classify the evolution of source code clone fragments, and (b) reports a fine-grained analysis of clone evolution in four different Java and C software systems, aimed at investigating to what extent clones are consistently propagated or they evolve independently. Also, the paper investigates the relationship between the presence of clone evolution patterns and other characteristics such as clone radius, clone size and the kind of change the clones underwent, i.e., corrective maintenance or enhancement

    Self-Admitted Technical Debt Removal and Refactoring Actions: Co-Occurrence or More?

    No full text
    Technical Debt (TD) concerns the lack of an adequate solution in a software project, from its design to the source code. Its admittance through comments or commit messages is referred to as Self-Admitted Technical Debt (SATD). Previous research has studied SATD from different perspectives, including its distribution, impact on software quality, and removal. In this paper, we investigate the relationship between refactorings and SATD removal. By leveraging a dataset of SATD and their removals in four open-source projects and by using an automated refactoring detection tool, we study the co-occurrence of refactorings and SATD removals. Results of the study indicate that refactorings are more likely to co-occur with SATD removals than with other commits, however, in most cases, they belong to different quality improvement activities performed at the same time

    An empirical study on the co-occurrence between refactoring actions and Self-Admitted Technical Debt removal

    No full text
    Technical Debt (TD) concerns the lack of an adequate solution in a software project, from its design to the source code. Its admittance through source code comments, issues, or commit messages is referred to as Self-Admitted Technical Debt (SATD). Previous research has studied SATD from different perspectives, including its distribution, impact on software quality, and removal. In this paper, we investigate the relationship between refactoring and SATD removal. By leveraging a dataset of SATD and their removals in four open-source projects and by using an automated refactoring detection tool, we study the co-occurrence of refactoring and SATD removals. Results of the study indicate that refactoring is more likely to co-occur with SATD removals than with other commits, however, in most cases, they belong to different quality improvement activities performed at the same time. Moreover, if looking closely at refactoring actions co-occurring with SATD removal in the same code entities, a relationship between these activities can be found. Finally, we found how both source code quality metrics and SATD removals play a statistically significant role in the likelihood that the commit applies a refactoring action

    The life and death of statically detected vulnerabilities: An empirical study

    No full text
    Vulnerable statements constitute a major problem for developers and maintainers of networking systems. Their presence can ease the success of security attacks, aimed at gaining unauthorized access to data and functionality, or at causing system crashes and data loss. Examples of attacks caused by source code vulnerabilities are buffer overflows, command injections, and cross-site scripting. This paper reports on an empirical study, conducted across three networking systems, aimed at observing the evolution and decay of vulnerabilities detected by three freely available static analysis tools. In particular, the study compares the decay of different kinds of vulnerabilities, characterizes the decay likelihood through probability density functions, and reports a quantitative and qualitative analysis of the reasons for vulnerability removals. The study is performed by using a framework that traces the evolution of source code fragments across subsequent commits

    The Evolution and Decay of Statically Detected Source Code Vulnerabilities

    No full text
    The presence of vulnerable statements in the source code is a crucial problem for maintainers: properly monitoring and, if necessary, removing them is highly desirable to ensure high security and reliability. To this aim, a number of static analysis tools have been developed to detect the presence of instructions that can be subject to vulnerability attacks, ranging from buffer overflow exploitations to command injection and cross-site scripting. Based on the availability of existing tools and of data extracted from software repositories, this paper reports an empirical study on the evolution of vulnerable statements detected in three software systems with different static analysis tools. Specifically, the study investigates on vulnerability evolution trends and on the decay time exhibited by different kinds of vulnerabilities. © 2008 IEE
    corecore