Search CORE

1,721,051 research outputs found

A toolbox for software architecture design

Author: Falessi D
FALESSI DAVIDE
Publication venue
Publication date: 07/01/2008
Field of study

L'architettura software è emersa come una importante tematica, nel campo del software engineering, per gestire la complessità relativa allo sviluppo e manutenzione di grandi sistemi software. L' idea chiave di questa tesi è che non c'è alcuna soluzione miracolosa nell' ambito del software engineering ma ogni tecnica ha i suoi vantaggi e svantaggi; di conseguenza, la bontà di un determinato metodo varia in base alle caratteristiche del contesto di applicazione. Un famoso proverbio dice: 1)un cattivo artigiano incolpa i suoi strumenti, 2) il peggior artigiano sceglie lo strumento sbagliato e poi da colpa allo strumento, 3) un buon artigiano sceglie ed usa gli strumenti in maniera opportuna. Mentre la comunità del software engineering si è occupata prevalentemente di offrire sempre nuovi metodi il cui utilizzo è supposto dare dei risultati migliori dei precedenti, c'è un vuoto nell' assistere l'architetto del sistema software nel selezionare o mettere a punto il giusto metodo. In questo contesto, la presente tesi offre un insieme di strumenti per la progettazione dell' architettura software, dal quale l' architetto di sistema è agevolato a selezione (e mettere appunto) il miglior strumento da utilizzare in base alle peculiarità del contesto di applicazione. Progettare una architettura software è un' attività molto complicata e involve differente tipologie di attività; la presente tesi riguarda le seguenti attività: metodi di progettazione dell' architettura software; tecniche decisionali per giungere a compromessi durane la progettazione dell' architettura software; strategie per applicare studi empirici sui metodi di progettazione architetturale; strategie di documentazioni delle decisioni progettuali.Software architecture has emerged as an important field of software engineering for managing the realm of large-system development and maintenance. The main intent of software architecture is to provide intellectual control over a sophisticated system enormous complexity. The key idea of this dissertation is that there is no silver bullet in software engineering, each method has pros and cons; the goodness of a method (tool, technique, etc.) varies based on the peculiarities of the application context. According to a famous idiom: i) a poor craftsman blames his tool, ii) a really bad craftsman chooses the wrong tool for the job and then he blames the tool, iii) a good craftsman chooses and uses the tool well. While the software engineering community has been mainly focused on providing new methods, which usage are aimed/supposed to provide better results than older methods, there is a lack in helping the software practitioners in selecting/adapting the available tools. Hence, in this view, the contribution of the present dissertation, instead of being a new method for architectural design, which would have been easily forgotten in a bookcase, is a characterization of the existing methods. In other words, this dissertation provides a toolbox for software architecture design, from which software architects can select the best method to apply, according to the application context. In this dissertation, the available architectural methods have been characterized by means of empirical studies. Unfortunately, the application of empirical methods on software architecture includes some troubles. A contribution of the present dissertation is a characterization of the available empirical methods by exposing their levels of challenges that researchers have to face when applying empiricism to software architecture. Such a proposed characterization should help to increase both the number and validity of software architecture empirical studies by allowing researchers to select the most suitable empirical method(s) to apply (i.e. the one with minor challenges), based on the application contexts (e.g. available software applications, architects, reviewers). However, in our view, in order to provide high levels of conclusion and internal validity, empirical methods for software architecture should be oriented to take advantage of both quantitative and qualitative data. Moreover, based on the results from two experiments, the challenges, in conducting evidence-based software architecture investigations, might 1) highly influence the results of the empirical studies, and 2) be faced by empiricistsâ cleverness. Architecting software system is a complex job and it encompasses several activities; this dissertation focuses on the following families of activities: software architecture design, resolving architectural tradeoffs, documenting design decisions, and enacting empirical studies on software architecture (as just described). Regarding the resolution of architectural tradeoffs, based on our review of already proposed decision making techniques, we realized that no one of the available decision-making technique can be considered in general better than another; each technique has intrinsically its own level of complexity and proneness to specific problems. Since we cannot decide in advance what degree of complexity of modeling is sufficient, instead of proceeding by trial and error, we offered guidelines on which complexity to emphasize for avoiding specific problem(s). Our key idea is to expose and organize in a useful way, namely by a characterization schema, in what extent each decision-making technique is prone to specific problems. In this way, the level of proneness of specific technique to specific problems becomes a quality attribute of the decision-making technique. Furthermore, we situated in the proposed characterization schema eighteen different decision-making techniques already proposed by the literature in the domains of architecture design, COTS selection, and release planning. Such localization validates the completeness of the proposed characterization schema, and it provides a useful reference for analyzing the state of the art Regarding software architecture design, this dissertation tried to answer to following question: " Do actual software architecture design methods meet architects needs?" To do so, we provide a characterization of the available methods by defining nine categories of software architectsâ needs, proposing an ordinal scale for evaluating the degree to which a given software architecture design method meets the needs, and then applying this to a set of software architecture design methods. Based on results from the present study, we argue that there are two opposite but valid answers to the aforementioned question: a) Yes, they do. In fact, we showed that one or more software architecture design methods are able to meet each individual architect needs that we considered. b) No, they do not. In fact, we showed that there is no software architecture design method that is able to meet any tuple of seven or more needs, which means that there is still some work to do to improve software architecture design methods to actually help architects. In order to provide directions for software architecture design method improvement, we presented couples of needs, and triplets of needs that actual software architecture design methods are unable to meet. Moreover, an architect can use such characterization to choose the software architecture design method which better meets his needs. Regarding design decision documentation, we conducted a controlled experiment for analyzing the impact of documenting design decisions rationale on effectiveness and efficiency of individual/team decision-making in presence of requirement changes. Main results show that, for both individual and team-based decision-making, effectiveness significantly improves, while efficiency remains unaltered, when decision-makers are allowed to use, rather not use, the proposed design rationale documentation technique. Being sure that documenting design-decisions rationale does help, we argued why it is not used in practice and what we can do to facilitate its usage. Older design decisions rationale documentation methods aimed at maximizing the consumer (documentation reader) benefits by forcing the producer (documentation writer) to document all the potential useful information; they eventually ran into too many inhibitors to be used in practice. In this dissertation we propose a value-based approach for documenting the reasons behind design decision, which is based on a priori understanding of who will benefit later on, from what set of information, and in which amount. Such a value-based approach for documenting the reasons behind design decision offers means to mitigate all the known inhibitors and it is based on the hypothesis that the set of required information depends on the purpose (use case) of the documentation. In order to validate such a hypothesis we ran an experiment in a controlled environment, employing fifty subjects, twenty-five decisions, and five different purposes (documentation use case) of the documentation. Each subjects practically used the documentation to enact all the five documentation use case(s) by providing an answer and a level of utility for each category of information in the provided documentation. Both descriptive and statistical results confirm our expectancies that the level of utility, related to the same category of information in the design decision rationale documentation, significantly changes according to the purpose of the documentation. Such result is novel and implies that the consumer of the rationale documentation requires, or not, a specific category of information according the specific purpose of the documentation. Consequently, empirical results suggest that the producer can tailor the design decision rationale documentation by including only the information required for the expected purposes of the documentation. This result demonstrates the feasibility of our proposed value-based approach for documenting the reasons behind design decision

ART

Validating and prioritizing quality rules for managing technical debt: An industrial case study

Author: Voegele A.
Falessi D.
Publication venue
Publication date: 01/01/2015
Field of study

One major problem in using static analyzers to manage, monitor, control, and reason about technical debt is that industrial projects have a huge amount of technical debt which reflects hundreds of quality rule violations (e.g., high complex module or low comment density). Moreover the negative impact of violating quality rules (i.e., technical debt interest) may vary across rules or even across contexts. Thus, without a context-specific validation and prioritization of quality rules, developers cannot effectively manage technical debt. This paper reports on a case study aimed at exploring the interest associated with violating quality rules; i.e., we investigate if and which quality rules are important for software developers. Our empirical method consists of a survey and a quantitative analysis of the historical data of a CMMI Level 5 software company. The main result of the quantitative analysis is that classes violating several quality rules are five times more defect prone than classes not violating any rule. The main result of the survey is that some rules are perceived by developers as more important than others; however, there is no false positive (i.e., incorrect rule or null interest). These results pave the way to a better practical use of quality rules to manage technical debt and describe new research directions for building a scientific foundation to the technical debt metaphor

ART

The Effort Savings from Using NLP to Classify Equivalent Requirements

Author: Cantone G
Falessi D
Publication venue
Publication date: 08/01/2019
Field of study

When Considering What and how to reuse, one must understand the differences and similarities of the systems being developed; this activity is part of the domain analysis. Among the several ways to perform domain analysis, identifying equivalent requirements (that is, the common elements in the domain) is scalable and noninvasive, and it supports the consolidation of requirements

ART

Towards an open-source tool for measuring and visualizing the interest of technical debt

Author: Reichel A.
Falessi D.
Publication venue
Publication date: 01/01/2015
Field of study

Current tools for managing technical debt are able to report the principal of the debt, i.e., the amount of effort required to fix all the quality rules violated in a project. However, they do not report the interest, i.e., the disadvantages the project had or will have due to quality rules violations. As a consequence, the user lacks support in understanding how much the principal should be reduced and why. We claim that information about the interest is, at least, as important as the information about the principal; the interest should be quantified and treated as a first-class entity like the principal. In this paper we aim to advance the state of the art of how the interest is measured and visualized. The goal of the paper is to describe MIND, an open-source tool which is, to the best of our knowledge, the first tool supporting the quantification and visualization of the interest. MIND, by analyzing historical data coming from Redmine and Git repositories, reports the interest incurring in a software project in terms of how many extra defects occurred, or will occur, due to quality rules violations. We evaluated MIND by using it to analyze a software project stored in a dataset of more than a million lines of code. Results suggest that MIND accurately measures the interest of technical debt

ART

Facilitating feasibility analysis: the pilot defects prediction dataset maker

Author: Max Jason Moede
Davide Falessi
Falessi D.
Moede M. J.
Publication venue
Publication date: 01/01/2018
Field of study

Our industrial experience in institutionalizing defect prediction models in the software industry shows that the first step is to measure prediction metrics and defects to assess the feasibility of the tool, i.e., if the accuracy of the defect prediction tool is higher than of a random predictor. However, computing prediction metrics is time consuming and error prone. Thus, the feasibility analysis has a cost which needs some initial investment by the potential clients. This initial investment acts as a barrier for convincing potential clients of the benefits of institutionalizing a software prediction model. To reduce this barrier, in this paper we present the Pilot Defects Prediction Dataset Maker (PDPDM), a desktop application for measuring metrics to use for defect prediction. PDPDM receives as input the repository's information of a software project, and it provides as output, in an easy and replicable way, a dataset containing a set of 17 well-defined product and process metrics, that have been shown to be useful for defect prediction, such as size and smells. PDPDM avoids the use of outdated datasets and it allows researchers and practitioners to create defect datasets without the need to write any lines of code

Crossref

ART

Issue tracking systems: What developers want and use

Author: Khosmood F.
Falessi D.
Hernandez F.
Publication venue
Publication date: 01/01/2019
Field of study

An Issue Tracking System (ITS) allows a developer to keep track of, prioritize, and assign multitudes of bugs, feature requests, and other development tasks such as testing. Despite ITSs play a significant role in day-to-day developers' activities, no previous study investigated what developers want and use in an ITS. The aim of this paper is twofold. First, we provide a feature matrix that maps six of the most used ITS to features, and second, we measure the developers' level of use and perceived importance of each feature. This knowledge has multiple benefits such as supporting the decision of the ITS to use and revealing promising areas of research and development. Specifically, quality improvement effort should target improving functionality in use, and development effort should target supporting functionalities needed. In this paper, we define and extract ten core ITS features and asked more than a hundred developers to rate their importance and use. Our results show that Advanced Search and Flexible Notifications are the most important features. Moreover, results show that no feature has been used by more than 90% of the respondents. Another interesting finding is that 27% of respondents rate Workflow Automation as a useful or required feature, despite having never used it themselves; this suggests the need to better training, exposure or of availability of ITS features. In conclusion, our results pave the way to significant research and development effort on ITS

ART

What if i Had No Smells?

Author: Falessi D.
Russo B.
Mullen K.
Publication venue
Publication date: 01/01/2017
Field of study

What would have happened if I did not have any code smell? This is an interesting question that no previous study, to the best of our knowledge, has tried to answer. In this paper, we present a method for implementing a what-if scenario analysis estimating the number of defective files in the absence of smells. Our industrial case study shows that 20% of the total defective files were likely avoidable by avoiding smells. Such estimation needs to be used with the due care though as it is based on a hypothetical history (i.e., zero number of smells and same process and product change characteristics). Specifically, the number of defective files could even increase for some types of smells. In addition, we note that in some circumstances, accepting code with smells might still be a good option for a company

ART

Snoring: A noise in defect prediction datasets

Author: Di Penta M.
Falessi D.
Ahluwalia A.
Publication venue
Publication date: 01/01/2019
Field of study

In order to develop and train defect prediction models, researchers rely on datasets in which a defect is often attributed to a release where the defect itself is discovered. However, in many circumstances, it can happen that a defect is only discovered several releases after its introduction. This might introduce a bias in the dataset, i.e., treating the intermediate releases as defect-free and the latter as defect-prone. We call this phenomenon as 'sleeping defects'. We call 'snoring' the phenomenon where classes are affected by sleeping defects only, that would be treated as defect-free until the defect is discovered. In this paper we analyze, on data from 282 releases of six open source projects from the Apache ecosystem, the magnitude of the sleeping defects and of the snoring classes. Our results indicate that 1) on all projects, most of the defects in a project slept for more than 20% of the existing releases, and 2) in the majority of the projects the missing rate is more than 25% even if we remove 50% of releases

ART

The Impact of Dormant Defects on Defect Prediction: A Study of 19 Apache Projects

Author: Falessi D
Di Penta M
Ahluwalia A
Publication venue
Publication date: 01/01/2022
Field of study

Defect prediction models can be beneficial to prioritize testing, analysis, or code review activities, and has been the subject of a substantial effort in academia, and some applications in industrial contexts. A necessary precondition when creating a defect prediction model is the availability of defect data from the history of projects. If this data is noisy, the resulting defect prediction model could result to be unreliable. One of the causes of noise for defect datasets is the presence of "dormant defects," i.e., of defects discovered several releases after their introduction. This can cause a class to be labeled as defect-free while it is not, and is, therefore "snoring." In this article, we investigate the impact of snoring on classifiers' accuracy and the effectiveness of a possible countermeasure, i.e., dropping too recent data from a training set. We analyze the accuracy of 15 machine learning defect prediction classifiers, on data from more than 4,000 defects and 600 releases of 19 open source projects from the Apache ecosystem. Our results show that on average across projects (i) the presence of dormant defects decreases the recall of defect prediction classifiers, and (ii) removing from the training set the classes that in the last release are labeled as not defective significantly improves the accuracy of the classifiers. In summary, this article provides insights on how to create defects datasets by mitigating the negative effect of dormant defects on defect prediction

ART

STRESS: A Semi-Automated, Fully Replicable Approach for Project Selection

Author: Serebrenik A.
Falessi D.
Smith W.
Publication venue
Publication date: 01/01/2017
Field of study

The mining of software repositories has provided significant advances in a multitude of software engineering fields, including defect prediction. Several studies show that the performance of a software engineering technology (e.g., prediction model) differs across different project repositories. Thus, it is important that the project selection is replicable. The aim of this paper is to present STRESS, a semi-automated and fully replicable approach that allows researchers to select projects by configuring the desired level of diversity, fit, and quality. STRESS records the rationale behind the researcher decisions and allows different users to re-run or modify such decisions. STRESS is open-source and it can be used used locally or even online (www.falessi.com/STRESS/). We perform a systematic mapping study that considers studies that analyzed projects managed with JIRA and Git to asses the project selection replicability of past studies. We validate the feasible application of STRESS in realistic research scenarios by applying STRESS to select projects among the 211 Apache Software Foundation projects. Our systematic mapping study results show that none of the 68 analyzed studies is completely replicable. Regarding STRESS, it successfully supported the project selection among all 211 ASF projects. It also supported the measurement of 100 projects characteristics, including the 32 criteria of the studies analyzed in our mapping study. The mapping study and STRESS are, to our best knowledge, the first attempt to investigate and support the replicability of project selection. We plan to extend them to other technologies such as GitHub

ART