Performance Engineering in Agile/DevOps Development Processes: Ensuring Software Performance While Moving Fast
Agile principles and DevOps practices play a pivotal role in modern software development. These methodologies aim to improve the productivity of software organizations while preserving the quality of the software they produce. Unfortunately, assessing important non-functional software properties, such as performance, can be challenging in these contexts. Frequent code changes and software releases make classical performance assurance approaches impractical. Moreover, many performance issues can be detected only under highly specific conditions, which may be difficult to replicate in a testing environment. This thesis investigates and tackles problems related to the performance assessment of software systems in the context of Agile/DevOps development processes. Specifically, it focuses on three aspects.
The first aspect concerns practical and management problems in handling performance requirements. These problems were investigated through a six-month industry collaboration with a large software organization that adopts an Agile software development process. The research followed an ethnographic approach, building knowledge from participatory observation, unstructured interviews, and documentation reviews. The study identified a set of management and practical challenges that arise from the adoption of Agile methodologies.
The second aspect concerns the impact of refactoring activities on software performance. Refactoring is a fundamental activity in modern software development, and it is a core development phase of many Agile methodologies (e.g., Test-Driven Development and Extreme Programming). Nevertheless, little is known about the impact of refactoring operations on software performance. This thesis aims to fill this gap by presenting the largest study to date on the impact of refactoring on software performance, in terms of execution time. The change history of 20 Java open-source systems was mined to identify commits in which developers implemented refactoring operations affecting code components that are exercised by performance benchmarks. A quantitative and qualitative analysis revealed the impact of different types of refactoring on execution time. The results showed that this impact varies depending on the refactoring type, with none of the types being 100% "safe" in ensuring that there is no performance regression. Some refactoring types can result in substantial performance regression and, as such, should be carefully considered when refactoring performance-critical parts of a system.
The third aspect concerns the introduction of techniques for performance assessment in the context of DevOps processes. Due to the fast-paced release cycle and the inherently non-deterministic nature of software performance, it is often infeasible to proactively detect performance issues. For these reasons, the diagnosis of performance issues in production is today a fundamental activity for maintaining high-quality software systems. This activity can be time-consuming, since it may require thorough inspection of large volumes of traces and performance indices. This thesis introduces two novel techniques for the automated diagnosis of performance issues in service-based systems, which can be easily integrated into DevOps processes. These techniques are evaluated, in terms of effectiveness and efficiency, on a large number of datasets generated for two case study systems, and they are compared to two state-of-the-art techniques and three general-purpose clustering algorithms. The results showed that the presented techniques outperformed the baselines, with better and more stable effectiveness. Moreover, they proved more efficient on large datasets than the most effective baseline
Apocalissi sublimi: gli scenari dei videogame e l'epica della pittura romantica [Sublime Apocalypses: Video Game Settings and the Epic of Romantic Painting]
Through an iconographic analysis of several video games, this work seeks to define, in critical and theoretical terms, the closeness of this form of visual communication to the artistic tradition, and to Romantic art in particular
Vero come in un gioco. Problemi estetici attorno al videogame [As True as in a Game: Aesthetic Problems Surrounding the Video Game]
The essay reflects on the aesthetic evolution of simulation video games, drawing a parallel with that of hyperrealism in art. An aesthetic rooted in technology tends to follow a technological line of development more than a cultural one
Towards effective assessment of steady state performance in Java software: are we there yet?
Microbenchmarking is a widely used form of performance testing in Java software. A microbenchmark repeatedly executes a small chunk of code while collecting measurements related to its performance. Due to Java Virtual Machine optimizations, microbenchmarks are usually subject to severe performance fluctuations in the first phase of their execution (also known as warmup). For this reason, software developers typically discard measurements from this phase and focus their analysis on the period after benchmarks reach a steady state of performance. Developers estimate the end of the warmup phase based on their expertise, and configure their benchmarks accordingly. Unfortunately, this approach is based on two strong assumptions: (i) benchmarks always reach a steady state of performance and (ii) developers accurately estimate warmup. In this paper, we show that Java microbenchmarks do not always reach a steady state, and that developers often fail to accurately estimate the end of the warmup phase. We found that a considerable portion of the studied benchmarks do not reach a steady state, and that warmup estimates provided by software developers are often inaccurate (with a large error). This has significant implications in terms of both result quality and time effort. Furthermore, we found that dynamic reconfiguration significantly improves warmup estimation accuracy, but it still produces suboptimal warmup estimates and relevant side effects. We envision this paper as a starting point for supporting the introduction of more sophisticated automated techniques that can ensure result quality in a timely fashion
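The practice described in this abstract, discarding warmup measurements and analysing only the steady state, can be illustrated with a minimal Python sketch. It is not the method used in the paper: the function name `steady_state_mean`, the `warmup_iterations` parameter, and the sample timings are all hypothetical, and the warmup length is simply supplied by the caller, as a developer would do by hand.

```python
from statistics import mean


def steady_state_mean(measurements, warmup_iterations):
    """Mean execution time after dropping the first `warmup_iterations` values.

    `measurements` holds per-iteration execution times in run order.
    The warmup length is a caller-supplied estimate (an assumption of this
    sketch); nothing here detects the steady state automatically.
    """
    if not 0 <= warmup_iterations < len(measurements):
        raise ValueError("warmup_iterations must leave at least one measurement")
    return mean(measurements[warmup_iterations:])


# Hypothetical timings (ms): the first three iterations are still warming up.
times = [9.0, 6.0, 4.0, 2.0, 2.0, 2.0, 2.0]

accurate = steady_state_mean(times, warmup_iterations=3)
underestimated = steady_state_mean(times, warmup_iterations=1)
```

With these sample values, underestimating the warmup phase lets slow early iterations leak into the result, while overestimating it would only cost extra testing time. This is the trade-off between result quality and time effort that the abstract refers to.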
AI-driven Java Performance Testing: Balancing Result Quality with Testing Time
Performance testing aims at uncovering efficiency issues of software systems. In order to be both effective and practical, the design of a performance test must achieve a reasonable trade-off between result quality and testing time. This becomes particularly challenging in the Java context, where the software undergoes a warm-up phase of execution due to just-in-time compilation. During this phase, performance measurements are subject to severe fluctuations, which may adversely affect the quality of performance test results. Both practitioners and researchers have proposed approaches to mitigate this issue. Practitioners typically rely on a fixed number of iterated executions that are used to warm up the software before starting to collect performance measurements (state-of-practice). Researchers have developed techniques that can dynamically stop warm-up iterations at runtime (state-of-the-art). However, these approaches often provide suboptimal estimates of the warm-up phase, resulting in either insufficient or excessive warm-up iterations, which may degrade result quality or increase testing time. There is still a lack of consensus on how to properly address this problem. Here, we propose and study an AI-based framework to dynamically halt warm-up iterations at runtime. Specifically, our framework leverages recent advances in AI for Time Series Classification (TSC) to predict the end of the warm-up phase during test execution. We conduct experiments by training three different TSC models on half a million measurement segments obtained from JMH microbenchmark executions. We find that our framework significantly improves the accuracy of the warm-up estimates provided by state-of-practice and state-of-the-art methods. This higher estimation accuracy results in a net improvement in either result quality or testing time for up to 35.3% of the microbenchmarks.
Our study highlights that integrating AI to dynamically estimate the end of the warm-up phase can enhance the cost-effectiveness of Java performance testing
Time Series Forecasting of Runtime Software Metrics: An Empirical Study
Software applications can produce a wide range of runtime software metrics (e.g., number of crashes, response times), which can be closely monitored to ensure operational efficiency and prevent significant software failures. These metrics are typically recorded as time series data. However, runtime software monitoring has become a high-effort task due to the growing complexity of today's software systems. In this context, time series forecasting (TSF) offers unique opportunities to enhance software monitoring and facilitate proactive issue resolution. While TSF methods have been widely studied in areas like economics and weather forecasting, our understanding of their effectiveness for software runtime metrics remains limited. In this paper, we investigate the effectiveness of four TSF methods on 25 real-world runtime software metrics recorded over a period of one and a half years. These methods comprise three recurrent neural network (RNN) models and one traditional time series analysis technique (i.e., SARIMA). The metrics are gathered from a large-scale IT infrastructure involving tens of thousands of digital devices. Our results indicate that, in general, RNN models are very effective at predicting runtime software metrics, although in some scenarios and for certain specific metrics (e.g., waiting times) SARIMA outperforms RNN models. Additionally, our findings suggest that the advantages of using RNN models vanish when the prediction horizon becomes too wide, in our case when it exceeds one week
A smart city run-time planner for multi-site congestion management
The twenty most visited museums in Italy (out of a total of 4,976) attract one third of the entire visitor population. This results in a small number of very famous, overcrowded museums (with an average waiting time of two hours) and a myriad of still beautiful but empty museums. In order to reduce the queuing time outside one of those famous museums, we were asked to produce a mobile application that, based on statistical models, returns a timed reservation to the tourist, so as to avoid time spent waiting in a physical queue. Based on these preconditions, this paper presents a smart city run-time planner approach and app that, taking as input some minimal information (location, congestion, and scheduled timed reservation), recommends nearby museums to visit while waiting for the scheduled visit to the busy one. The mobile app has been designed to work in the absence of an internet connection. The approach and its implementation are presented, as well as some initial conceptual applications to an explanatory example
On the Compression of Language Models for Code: An Empirical Study on CodeBERT
Language models have proven successful across a wide range of software engineering tasks, but their significant computational costs often hinder their practical adoption. To address this challenge, researchers have begun applying various compression strategies to improve the efficiency of language models for code. These strategies aim to optimize inference latency and memory usage, though often at the cost of reduced model effectiveness. However, there is still a significant gap in understanding how these strategies influence the efficiency and effectiveness of language models for code. Here, we empirically investigate the impact of three well-known compression strategies - knowledge distillation, quantization, and pruning - across three different classes of software engineering tasks: vulnerability detection, code summarization, and code search. Our findings reveal that the impact of these strategies varies greatly depending on the task and the specific compression method employed. Practitioners and researchers can use these insights to make informed decisions when selecting the most appropriate compression strategy, balancing both efficiency and effectiveness based on their specific needs
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed
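The difference between the two counting schemes compared in this abstract can be sketched in a few lines of Python. This is an illustrative reading of the abstract, not the study's actual procedure: the function `cocitation_counts`, the `max_authors` parameter, and the author names are hypothetical. The sketch also assumes that co-authors of the same cited work count as co-cited, which the abstract does not specify.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count how many citing papers cite each pair of authors together.

    `citing_papers` is a list of reference lists; each reference is a list of
    author names in byline order. Only the first `max_authors` authors of each
    cited work are considered (1 = traditional first-author counting).
    """
    counts = Counter()
    for references in citing_papers:
        cited_authors = set()
        for authors in references:
            cited_authors.update(authors[:max_authors])
        # Each unordered author pair is counted once per citing paper.
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts


# Hypothetical data: two citing papers, each with two cited works.
papers = [
    [["Adams", "Baker"], ["Chen"]],
    [["Chen", "Diaz"], ["Baker"]],
]

first_author_only = cocitation_counts(papers, max_authors=1)
first_five_authors = cocitation_counts(papers, max_authors=5)
```

In this toy data, first-author counting never pairs Baker with Chen in the first citing paper because Baker is a second author there. Counting the first five authors picks up that pair in both citing papers, which illustrates how the non-traditional scheme links more authors together.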
