Dataset of the study "Exploring the Notion of Risk in Reviewer Recommendation"
Note: The dockerized version of this replication package is available at the following link:
https://figshare.com/articles/dataset/Replication_Package_of_the_study_Exploring_the_Notion_of_Risk_in_Reviewer_Recommendation_/20673255
This repository contains the data needed to replicate the study "Exploring the Notion of Risk in Reviewer Recommendation." The code extends the RelationalGit package (https://github.com/CESEL/RelationalGit) from the study of E. Mirsaeedi and P. C. Rigby [1] and adds the functionality needed to incorporate the concept of the fix-inducing likelihood of a project.
In addition to our dataset, this repository also has the supporting materials for our study. The supporting materials are in "ICSME_online_materials_ICSME.pdf" and contain the following items:
Table 1 contains the details of the dataset and some related statistics for each of the studied projects.
Table 2 lists the risk measures that were used in our defect prediction model. We use the Commit Guru tool to extract the data from the GitHub repositories and then use this data to train our defect prediction model.
Figure 1 illustrates the distribution of predicted defect probability for the different projects. This distribution shows how the defect probability of each period is similar to that of the adjacent periods.
References:
[1] E. Mirsaeedi and P. C. Rigby, ‘Mitigating turnover with code review recommendation: Balancing expertise, workload, and knowledge distribution’, in Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, 2020.
Leveraging Fault Localisation to Enhance Defect Prediction
Software Quality Assurance (SQA) is a resource constrained activity. Research has explored various means of supporting that activity. For example, to aid in resource investment decisions, defect prediction identifies modules or changes that are likely to be defective in the future. To support repair activities, fault localisation identifies areas of code that are likely to require change to address known defects. Although the identification and localisation of defects are interdependent tasks, the synergy between defect prediction and fault localisation remains largely underexplored. We hypothesise that modifying code that was suspicious in the past is riskier than modifying code that was not. To validate our hypothesis, in this paper, we employ fault localisation, which localises the root cause of a program failure. We compute the past suspiciousness score of code changes to each fault, and use those scores to (1) define new features for training defect prediction models; and (2) guide the next actions of developers for a commit labelled as fix-inducing. An empirical study of three open-source projects confirms our hypothesis. The new suspiciousness features improve F1 score and balanced accuracy of Just-In-Time (JIT) defect prediction models by 4.2% to 92.2% and by 1.2% to 3.7%, respectively. When guiding developer actions, past code suspiciousness successfully guides developers to a defective file, inspecting two to nine fewer files on average, compared to the baselines based on previous findings on past faults. These results demonstrate the potential of synergies of fault localisation and defect prediction, and lay the groundwork for explorations of that combined space
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting: the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
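The difference between the two counting rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's procedure: the function name `cocitation_counts`, the parameter `max_authors`, and the toy reference lists are all hypothetical, and the sketch assumes that co-authors of the same cited work also count as co-cited, which is a design choice the abstract does not specify.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count author co-citations (hypothetical helper, for illustration only).

    citing_papers: list of reference lists; each reference is an ordered
                   list of author names for one cited work.
    max_authors:   how many leading authors of each cited work are counted
                   (1 = traditional first-author counting).
    """
    counts = Counter()
    for references in citing_papers:
        # Collect the authors this citing paper is considered to cite.
        cited = set()
        for authors in references:
            cited.update(authors[:max_authors])
        # Every unordered pair of cited authors is co-cited once per citing paper.
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts


# Toy data: two citing papers, each with two cited works.
papers = [
    [["Smith", "Jones"], ["Lee", "Kim", "Park"]],
    [["Smith"], ["Kim", "Lee"]],
]

first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

With first-author counting, Kim and Lee are never co-cited in this toy data, because each of them is a first author in only one of the two papers; counting the first five authors links them in both papers. Pairs that first-author counting misses entirely can therefore appear, and existing pairs can gain weight, once more authors per cited work are taken into account.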
Variations on the Author
“Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourse; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.
Appropriate Similarity Measures for Author Cocitation Analysis
We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance.
Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
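To make the comparison between the two named measures concrete, the following minimal sketch computes the Pearson correlation and the cosine for two cocitation profiles. The profile values are invented, and the property it demonstrates (sensitivity to appended zeros) is one known difference between the measures; it is not necessarily the argument made in the paper, and the two previously unused measures are not shown because the abstract does not name them.

```python
import math


def cosine(x, y):
    """Cosine of the angle between two equal-length profiles."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson(x, y):
    """Pearson correlation: the cosine of the mean-centred profiles."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])


# Invented cocitation profiles of two authors with four other authors.
x = [3, 1, 0, 2]
y = [2, 0, 1, 3]

# The same profiles after four more authors are added to the analysis,
# none of whom is ever co-cited with either of the two authors.
x_padded = x + [0, 0, 0, 0]
y_padded = y + [0, 0, 0, 0]

cosine_before, cosine_after = cosine(x, y), cosine(x_padded, y_padded)
pearson_before, pearson_after = pearson(x, y), pearson(x_padded, y_padded)
```

In this example the cosine is identical before and after padding, while the Pearson correlation rises, because the added zeros shift the means used for centring. A similarity value that depends on how many unrelated authors happen to be in the data set is one reason to look at alternatives.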
Trade-Off Exploration for Acceleration of Continuous Integration
Continuous Integration (CI) is a popular software development practice that allows developers to quickly verify modifications to their projects. To cope with the ever-increasing demand for faster software releases, CI acceleration approaches have been proposed to expedite the feedback that CI provides.
However, adoption of CI acceleration is not without cost. The trade-off in duration and trustworthiness of a CI acceleration approach determines the practicality of the CI acceleration process. Indeed, if a CI acceleration approach takes longer to prime than to run the accelerated build, the benefits of acceleration are unlikely to outweigh the costs. Moreover, CI acceleration techniques may mislabel change sets (e.g., a build labelled as failing that passes in an unaccelerated setting or vice versa) or produce results that are inconsistent with an unaccelerated build (e.g., the underlying reason for failure does not match with the unaccelerated build). These inconsistencies call into question the trustworthiness of CI acceleration products.
We first evaluate the time trade-off of two CI acceleration products — one based on program analysis (PA) and the other on machine learning (ML). After replaying the CI process of 100,000 builds spanning ten open-source projects, we find that the priming costs (i.e., the extra time spent preparing for acceleration) of the program analysis product are substantially less than that of the machine learning product (e.g., average project-wise median cost difference of 148.25 percentage points). Furthermore, the program analysis product generally provides more time savings than the machine learning product (e.g., average project-wise median savings improvement of 5.03 percentage points). Given their deterministic nature, and our observations about priming costs and benefits, we recommend that organizations consider the adoption of program analysis based acceleration.
Next, we study the trustworthiness of the same PA and ML CI acceleration products. We re-execute 50 failing builds from ten open-source projects in non-accelerated (baseline), program analysis accelerated, and machine learning accelerated settings. We find that when applied to known failing builds, program analysis accelerated builds more often (43.83 percentage point difference across ten projects) align with the non-accelerated build results. Accordingly, we conclude that while there is still room for improvement for both CI acceleration products, the selected program analysis product currently provides a more trustworthy signal of build outcomes than the machine learning product.
Finally, we propose a mutation testing approach to systematically evaluate the trustworthiness of CI acceleration. We apply our approach to the deterministic PA-based CI acceleration product and uncover issues that hinder its trustworthiness. Our analysis consists of three parts. We first study how often the same build produces different mutation testing outcomes in accelerated and unaccelerated CI settings. We call mutants with different outcomes in the two settings “gap mutants”. Next, we study the code locations where gap mutants appear. Finally, we inspect gap mutants to understand why acceleration causes them to survive. Our analysis of ten thriving open-source projects uncovers 2,237 gap mutants. We find that: (1) the gap in mutation outcomes between accelerated and unaccelerated settings ranges from 0.11% to 23.50%; (2) 88.95% of gap mutants can be mapped to specific source code functions and classes using the dependency representation of the studied CI acceleration product; (3) 69% of gap mutants survive CI acceleration due to deterministic reasons that can be classified into six fault patterns. Our results show that deterministic CI acceleration suffers from trustworthiness limitations, and highlight the ways in which trustworthiness could be improved in a pragmatic manner.
This thesis demonstrates that CI acceleration techniques, whether PA-based or ML-based, present time trade-offs and can reduce software build trustworthiness. Our findings lead us to encourage users of CI acceleration to carefully weigh both the time costs and trustworthiness of their chosen acceleration technique. This study also demonstrates that the following improvements would make PA-based CI acceleration approaches more trustworthy: (1) depending on the size and complexity of the codebase, it may be necessary to manually refine the dependency graph, especially by concentrating on class properties, global variables, and constructor components; and (2) solutions should be added to detect and bypass flaky tests during CI acceleration to minimize the impact of flakiness.
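The notion of a gap mutant described above (a mutant whose outcome differs between the accelerated and unaccelerated settings) can be sketched as a simple comparison of two result maps. This is a rough illustration under stated assumptions: the function names, the "killed"/"survived" labels, and the mutant identifiers are hypothetical, and the thesis may compute its gap percentages differently.

```python
def gap_mutants(unaccelerated, accelerated):
    """Return the ids of mutants whose outcome differs between settings.

    Both arguments map a mutant id to its outcome label
    (hypothetical labels: "killed" or "survived").
    """
    shared = unaccelerated.keys() & accelerated.keys()
    return sorted(m for m in shared if unaccelerated[m] != accelerated[m])


def gap_rate(unaccelerated, accelerated):
    """Fraction of mutants evaluated in both settings that are gap mutants."""
    shared = unaccelerated.keys() & accelerated.keys()
    if not shared:
        return 0.0
    return len(gap_mutants(unaccelerated, accelerated)) / len(shared)


# Invented outcomes for four mutants of one build.
baseline_run = {"m1": "killed", "m2": "killed", "m3": "survived", "m4": "killed"}
accelerated_run = {"m1": "killed", "m2": "survived", "m3": "survived", "m4": "killed"}

gaps = gap_mutants(baseline_run, accelerated_run)
rate = gap_rate(baseline_run, accelerated_run)
```

Here mutant `m2` is killed in the unaccelerated build but survives in the accelerated one, which would happen if, for instance, acceleration skipped the test that detects it. Mapping such mutants back to source locations and root causes requires the product's dependency representation and is not covered by this sketch.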
Mitigating the Uncertainty and Imprecision of Log-Based Code Coverage Without Requiring Additional Logging Statements
Understanding code coverage is an important precursor to software maintenance activities (e.g., better testing). Although modern code coverage tools provide key insights, they typically rely on code instrumentation, resulting in significant performance overhead. An alternative approach to code instrumentation is to process an application’s source code and the associated log traces in tandem. This so-called “log-based code coverage” approach does not impose the same performance overhead as code instrumentation. Previous work has introduced LogCoCo — a tool that implements log-based code coverage for Java. While LogCoCo breaks important new ground, it has fundamental limitations, namely: uncertainty due to the lack of logging statements in conditional branches, and imprecision caused by dependency injection. In this thesis, we propose Log2Cov, a tool that generates log-based code coverage for programs written in Python and addresses uncertainty and imprecision issues. We evaluate Log2Cov on three large and active open-source systems. More specifically, we compare the performance of Log2Cov to that of Coverage.py, an instrumentation-based coverage tool for Python. Our results indicate that 1) Log2Cov achieves high precision without introducing runtime overhead; and 2) uncertainty and imprecision can be reduced by up to 11% by statically analyzing the program’s source code and execution logs, without requiring additional logging instrumentation from developers. While our enhancements make substantial improvements, we find that future work is needed to handle conditional statements and exception handling blocks to achieve parity with instrumentation-based approaches. We conclude the thesis by drawing attention to these promising directions for future work
The Cost of Build Tool Downgrades: An Empirical Study of the Kubernetes Project
Software build tools automate the transformation of source code into deliverables. Since developers invoke builds multiple times per day, the performance of a build tool directly impacts developer productivity. Motivated by the potential of productivity improvements, artifact-based build tools have emerged to accelerate builds. Despite their advantages, recent work shows that a considerable proportion of projects that adopt these tools later downgrade to others. While prior work has explained the rationale of build tool downgrades, the cost of these downgrades in terms of build duration and computational resource usage is not well understood. Without such an understanding, software teams may make under-informed decisions about adopting or abandoning build tools.
In this thesis, we conduct an empirical study of the performance penalties associated with the downgrade of build tools in the Kubernetes project, focusing on its downgrade from an artifact-based build tool (Bazel) to a language-specific solution (Go Build). We reproduce and analyze full and incremental builds of change sets during the period leading up to the downgrade event. Our results show that, on the one hand, Bazel builds are significantly shorter than Go Build ones, specifically 23.06–38.66 incremental builds.
On the other hand, Bazel builds impose a larger memory footprint than Go Build (81.42–351.07 builds) and greater CPU load at parallelism settings beyond eight for full builds and beyond one for incremental builds. To understand the financial impact of abandoning Bazel, we further analyze the costs of resource consumption during continuous integration builds. We find that Bazel tends to be less costly than Go Build, with the estimated additional financial costs of downgrading ranging from 22.62–39.14 reaching up to 75.92 our findings by replicating our Kubernetes study on smaller projects, we observe that build tool downgrades have a smaller impact on build durations for codebases smaller than Kubernetes; however, artifact-based build tools consistently impose a larger memory footprint than their replacements.
We conclude that abandoning artifact-based build tools, despite perceived maintainability benefits, tends to incur considerable performance costs for large projects. This trade-off between tool complexity and efficiency highlights the need for informed decision-making and lays the groundwork for future research to develop tools that support balancing these trade-offs and more pragmatic decision-making about build tool adoption
