1,721,012 research outputs found

    Protein function prediction from interaction networks using a random walk ranking algorithm

    No full text
    Predicting protein function at the proteomic-scale is a key task in computational systems biology. High-throughput experimental methods have recently made available many protein interaction networks that need to be analyzed in order to provide insight into the functional role of proteins in the organization of the cell. We propose here a new approach to computational function annotation of protein interaction maps based on a random walk algorithm. Our method exploits the whole topology of the network according to the basic principles of a ranking algorithm for link analysis. We apply the proposed algorithm to analyze the yeast protein interaction network and show that it represents a valid alternative to other annotation techniques based on network analysis by comparing it with the effective majority vote algorith

    A graph-based semi-supervised algorithm for protein function prediction from interaction maps.

    No full text
    Protein function prediction represents a fundamental challenge in bioinformatics. The increasing availability of proteomics network data has enabled the development of several approaches that exploit the information encoded in networks in order to infer protein function. In this paper we introduce a new algorithm based on the concept of topological overlap between nodes of the graph, which addresses the problem of the classification of partially labeled protein interaction networks. The proposed approach is tested on the yeast interaction map and compared with two current state-of-the-art algorithms. Cross-validation experiments provide evidence that the proposed method represents a competitive alternative in a wide range of experimental conditions and also that, in many cases, it provides enhanced predictive accuracy

    Improved Biological Network Reconstruction Using Graph Laplacian Regularization

    No full text
    Biological networks reconstruction is a crucial step towards the functional characterization and elucidation of living cells. Computational methods for inferring the structure of these networks are of paramount importance since they provide valuable information regarding organization and behavior of the cell at a system level and also enable careful design of wet-lab experiments. Despite many recent advances, according to the scientific literature, there is room for improvements from both the efficiency and the accuracy point of view in link prediction algorithms. In this article, we propose a new method for the inference of biological networks that makes use of a notion of similarity between graph vertices within the framework of graph regularization for ranking the links to be predicted. The proposed approach results in more accurate classification rates in a wide range of experiments, while the computational complexity is reduced by two orders of magnitude with respect to many current state-of-the-art algorithms

    A lossy compression technique enabling duplication-aware sequence alignment

    No full text
    In spite of the recognized importance of tandem duplications in genome evolution, commonly adopted sequence comparison algorithms do not take into account complex mutation events involving more than one residue at the time, since they are not compliant with the underlying assumption of statistical independence of adjacent residues. As a consequence, the presence of tandem repeats in sequences under comparison may impair the biological significance of the resulting alignment. Although solutions have been proposed, repeat-aware sequence alignment is still considered to be an open problem and new efficient and effective methods have been advocated. The present paper describes an alternative lossy compression scheme for genomic sequences which iteratively collapses repeats of increasing length. The resulting approximate representations do not contain tandem duplications, while retaining enough information for making their comparison even more significant than the edit distance between the original sequences. This allows us to exploit traditional alignment algorithms directly on the compressed sequences. Results confirm the validity of the proposed approach for the problem of duplication-aware sequence alignment

    Longest Common Subsequence between Run-Length-Encoded Strings: a New Algorithm with Improved Parallelism

    No full text
    Data compression can be used to simultaneously reduce memory, communication and computation requirements of string comparison. In this paper we address the problem of computing the length of the longest common subsequence (LCS) between run-length-encoded (RLE) strings. We exploit RLE both to reduce the complexity of LCS computation from O(M×N) to O(mN+Mn−mn), where M and N are the lengths of the original strings and m and n the number of runs in their RLE representation, and to improve the inherent parallelism of the proposed algorithm, so that it may execute in O(m+n) steps on a systolic array of M+N units. We also discuss the application of the proposed algorithm to the related problem of edit distance (ED) computation

    Parallel Comparison of Run-Length-Encoded Strings on a Linear Systolic Array

    No full text
    The length of the longest common subsequence (LCS) between two strings of M and N characters can be computed by an O(M × N) dynamic programming algorithm, that can be executed in O(M + N) steps by a linear systolic array. It has been recently shown that the LCS between run-length-encoded (RLE) strings of m and n runs can be computed by an O(nM + Nm − nm) algorithm that could be executed in O(m + n) steps by a parallel hardware. However, the algorithm cannot be directly mapped on a linear systolic array because of its irregular structure. In this paper, we propose a modified algorithm that exhibits a more regular structure at the cost of a marginal reduction of the efficiency of RLE. We outline the algorithm and we discuss its mapping on a linear systolic array

    A faster algorithm for the computation of string convolutions using LZ78 parsing

    No full text
    String convolution between vectors of integers representing a pattern and a text is a widely used computational primitive in string processing. In this paper, we investigate the use of an algorithmic framework which exploits sequence repetitions (identified according to the Lempel-Ziv parsing technique, i.e., the LZ78 algorithm) to speed up conventional algorithms (based on Fast Fourier Transform) for the computation of convolution between a pattern and a text, when the text is long enough and the pattern is sufficiently small. In particular, we present a deterministic algorithm which, given a text T of length n (drawn from a constant size alphabet |SigmaT|) and a pattern P of length m (drawn from a constant size alphabet |SigmaP|), computes the convolution between P and T with time and space complexity O(n + (nm/logn)h), where h is the entropy of text T

    A Study on the Impact of Packet Length on Communication in Low Power Wireless Sensor Networks under Interference

    Full text link
    Reliability is nowadays considered a key requirement in wireless sensor networks for their increasing diffusion in various Internet of Things applications. However, radio interference from various sources may heavily affect the performance of wirelessly connected embedded devices, resulting into increased packet collisions and congestions. There is therefore a widely recognised need of both theoretical and practical investigations capable of shedding light on the factors affecting the operativeness of sensor networks subject to interference. In this article we investigate the role of packet length into the reliability and energy efficiency of low-power medium access protocols. Specifically, we propose a mathematical model to explore the functional dependence of the reliability of a pair of sensor nodes under interference from packet length. We also present a wide range of experimental evaluations aimed at validating the model and at providing novel insights on dependability issues. In particular, we assess the performance of Contiki’s default MAC layer (together with that of an always listening receiver, as a baseline) in terms of packet loss rate and energy efficiency for varying payload lengths. Experimental results highlight the interplay between packet size and interference and in particular the trade-off between the robustness against interference and the overhead imposed to communication as a function of the length of data packets. The Pareto curve describing the energy efficiency as a function of the packet loss rate, demonstrates the existence of intermediate packet size representing an optimal choice for balancing energy consumption and communication reliability, enabling adequate system dimensioning at design-level

    Randomized Gossip with Power of Two Choices for Energy Aware Distributed Averaging

    No full text
    Distributed computation of average values held by nodes belonging to a self-organized network is a key task in many application areas, ranging from sensor and ad-hoc networks to networked control systems. Severe computational, communication, and energy constraints typical of these environments prompt for the design of specific solutions addressing these issues. In this context, gossip algorithms represent valuable approaches because of their simple local communication patterns, resulting into robustness to dynamic topology changes. Several variants of gossip-based techniques have been proposed, mainly focused on improvements of the convergence time, which directly impacts energy expenditure. Energy efficiency remains however a challenging issue to be addressed. In this paper we introduce a novel energy-aware distributed averaging algorithm which combines the standard randomized gossip protocol with a probabilistic load balancing technique, the power of two choices. Experimental results show that the proposed solution achieves better load balancing with respect to standard pairwise averaging, enabling considerable improvements in the network lifetime without impairing convergence time
    corecore