
    Learning from bug-introducing changes to prevent fault prone code

    A version control system, such as CVS/SVN, can provide the history of software changes performed during the evolution of a software project. Some of these changes introduce bugs, which are often resolved later by other changes. In this paper we use a technique that identifies bug-introducing changes to train a model that predicts whether a new change may introduce a bug. We represent software changes as elements of an n-dimensional vector space whose coordinates are terms extracted from source code snapshots. The evaluation of various learning algorithms on a set of open source projects looks very promising, in particular for KNN (the K-Nearest Neighbor algorithm), where a significant trade-off between precision and recall has been obtained. © 2007 AC
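
    The representation described in this abstract can be illustrated with a minimal sketch: each change becomes a bag-of-terms vector, and a new change is labeled by a majority vote of its k most similar training changes. The tokenizer, the cosine similarity measure, the value of k, and the names `term_vector`, `cosine`, and `knn_predict` are all assumptions made for illustration; the abstract does not say which term weighting or distance the authors used.

```python
import math
import re
from collections import Counter


def term_vector(snippet):
    """Map a source snippet to a sparse term-count vector (assumed tokenizer)."""
    tokens = re.findall(r"[A-Za-z_]\w*", snippet)
    return Counter(token.lower() for token in tokens)


def cosine(a, b):
    """Cosine similarity between two sparse term vectors (assumed metric)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(train, change, k=3):
    """Return True if most of the k nearest training changes introduced a bug.

    `train` is a list of (snippet, label) pairs; label True marks a
    bug-introducing change.
    """
    query = term_vector(change)
    scored = sorted(
        ((cosine(term_vector(snippet), query), label) for snippet, label in train),
        key=lambda pair: pair[0],
        reverse=True,
    )
    nearest = scored[:k]
    votes = sum(1 for _, label in nearest if label)
    return votes * 2 > len(nearest)


# Toy, hand-made training set; not data from the paper.
train = [
    ("strcpy(buf, input); free(buf);", True),
    ("strcpy(dest, src); len = strlen(src);", True),
    ("log.info(message); return status;", False),
    ("return status; log.debug(message);", False),
    ("counter += 1; return counter;", False),
]
```

    Calling `knn_predict(train, "strcpy(buf, src);")` compares the new change against every labeled change and votes over the three closest ones. A real study would need a far larger labeled history and a tuned k.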

    Relationship between design patterns defects and crosscutting concern scattering degree: an empirical study

    Design patterns are solutions to recurring design problems, aimed at increasing reuse, code quality and, above all, maintainability and resilience to changes. Despite such advantages, the usage of design patterns implies the presence of crosscutting code that implements the pattern usage and access from other system components. When the system evolves, the presence of crosscutting code can cause repeated changes, possibly introducing defects. This paper reports an empirical study showing that, for three open source projects, the number of defects in design-pattern classes is in several cases correlated with the scattering degree of their induced crosscutting concerns, and also varies among different kinds of patterns.

    Labeling negative examples in supervised learning of new gene regulatory connections

    Supervised learning methods have recently been exploited to learn gene regulatory networks from gene expression data. The basic approach consists of building a binary classifier from feature vectors composed of the expression levels of a set of known regulatory connections, available in public databases or known in the literature. Such a classifier is then used to predict new unknown connections. The quality of the training set plays a crucial role in such an inference scheme. In binary classification the training set should be composed of positive and negative examples, but in the biology literature the only collected information is whether two genes interact. The counterpart information is usually not reported, as biologists are rarely in a position to state that two genes do not interact. The overrepresentation of topology motifs in currently known gene regulatory networks, such as feed-forward loops, bi-fan clusters, and single input modules, could drive the selection of reliable negative examples. We introduce, discuss, and evaluate a number of negative selection heuristics that exploit the known gene network topology of Escherichia coli and Saccharomyces cerevisiae. © 2011 Springer-Verlag Berlin Heidelberg

    A negative selection heuristic to predict new transcriptional targets

    Background: Supervised machine learning approaches have recently been adopted in the inference of transcriptional targets from high-throughput transcriptomic and proteomic data, showing major improvements with respect to the state of the art of reverse gene regulatory network methods. Unlike traditional unsupervised techniques, a supervised classifier learns, from known examples, a function that is able to recognize new relationships in new data. In the context of gene regulatory inference, a supervised classifier is forced to learn from positive and unlabeled examples, as the counter negative examples are unavailable or hard to collect. Such a condition could limit the performance of the classifier, especially when the number of training examples is low. Results: In this paper we improve the supervised identification of transcriptional targets by selecting reliable counter negative examples from the unlabeled set. We introduce a heuristic based on the known topology of transcriptional networks that in effect restores the conventional positive/negative training condition and shows a significant improvement in classification performance. We empirically evaluate the proposed heuristic with the experimental datasets of Escherichia coli and show an example of application in the prediction of BCL6 direct core targets in normal germinal center human B cells, obtaining a precision of 60%. Conclusions: The availability of only positive examples in learning transcriptional relationships negatively affects the performance of supervised classifiers. We show that the selection of reliable negative examples, a practice adopted in text mining approaches, improves the performance of such classifiers, opening new perspectives in the identification of new transcriptional targets. © 2013 Cerulo et al.; licensee BioMed Central Ltd

    The Evolution and Decay of Statically Detected Source Code Vulnerabilities

    The presence of vulnerable statements in the source code is a crucial problem for maintainers: properly monitoring and, if necessary, removing them is highly desirable to ensure high security and reliability. To this end, a number of static analysis tools have been developed to detect the presence of instructions that can be subject to vulnerability attacks, ranging from buffer overflow exploitations to command injection and cross-site scripting. Based on the availability of existing tools and of data extracted from software repositories, this paper reports an empirical study on the evolution of vulnerable statements detected in three software systems with different static analysis tools. Specifically, the study investigates vulnerability evolution trends and the decay time exhibited by different kinds of vulnerabilities. © 2008 IEE

    The life and death of statically detected vulnerabilities: An empirical study

    Vulnerable statements constitute a major problem for developers and maintainers of networking systems. Their presence can make security attacks more likely to succeed, whether the attacks aim at gaining unauthorized access to data and functionality or at causing system crashes and data loss. Examples of attacks enabled by source code vulnerabilities are buffer overflows, command injections, and cross-site scripting. This paper reports on an empirical study, conducted across three networking systems, aimed at observing the evolution and decay of vulnerabilities detected by three freely available static analysis tools. In particular, the study compares the decay of different kinds of vulnerabilities, characterizes the decay likelihood through probability density functions, and reports a quantitative and qualitative analysis of the reasons for vulnerability removals. The study is performed by using a framework that traces the evolution of source code fragments across subsequent commits.

    Going Beyond Counting First Authors in Author Co-citation Analysis

    The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based. We compare ACA results based on two different types of co-citation counting: the traditional type, which counts only the first of a cited work's authors, and a non-traditional type, which takes into account the first five authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
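
    The difference between the two counting schemes can be sketched in a few lines. The sketch below assumes that two authors are co-cited once per citing paper whenever both appear among the counted authors of that paper's references, including co-authors of the same cited work; the abstract does not spell out these details, so the rule, the function name `cocitation_counts`, and the placeholder author names are all illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count author co-citations over a collection of citing papers.

    `citing_papers` is a list of reference lists; each reference is the
    ordered author list of one cited work. Only the first `max_authors`
    authors of each cited work are counted (1 = first-author counting).
    """
    counts = Counter()
    for references in citing_papers:
        counted = set()
        for authors in references:
            counted.update(authors[:max_authors])
        # Each unordered author pair is counted once per citing paper.
        for pair in combinations(sorted(counted), 2):
            counts[pair] += 1
    return counts


# Placeholder data: two citing papers, each citing two works.
papers = [
    [["Adams", "Baker"], ["Clark"]],
    [["Adams", "Davis"], ["Clark", "Evans"]],
]

first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

    With first-author counting only the pair Adams/Clark is recorded, while counting up to five authors also links Baker, Davis, and Evans, which is the kind of denser author picture the abstract compares against the traditional one.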