1,721,256 research outputs found

    A Unifying Taxonomy of Pattern Matching in Degenerate Strings and Founder Graphs

    Elastic Degenerate (ED) strings and Elastic Founder (EF) graphs are two versions of acyclic components of pangenomes. Both ED strings and EF graphs (which we collectively name variable strings) extend the well-known notion of indeterminate string. Recent work has extensively investigated algorithmic tasks over these structures, and over several other variable string notions that they generalise. Among such tasks, the basic operation of matching a pattern against a text, which can serve as a building block for many pangenomic data analyses using these data structures, deserves special attention. In this paper we: (1) highlight a clear taxonomy within both ED strings and EF graphs ranging through variable strings of all types, from the linear string up to the most general one; (2) investigate the problem PvarT(X,Y) of matching a solid or variable pattern of type X against a variable text of type Y; (3) using as a reference the quadratic conditional lower bounds that are known for PvarT(solid,ED) and PvarT(solid,EF), for all possible types of variable strings X and Y we either prove the quadratic conditional lower bound for PvarT(X,Y), or provide non-trivial, often sub-quadratic, upper bounds, also exploiting the above-mentioned taxonomy.
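
    As a rough illustration of the simplest case the abstract mentions, the sketch below matches a solid pattern against an indeterminate string, modelled here as a list of character sets. This is a naive quadratic baseline, not the paper's method, and it does not cover ED strings or EF graphs; the function name and the list-of-sets representation are assumptions made for the example.

```python
def match_solid_in_indeterminate(pattern, text):
    """Return the start positions where a solid pattern matches an
    indeterminate text (hypothetical helper, naive O(n*m) scan).

    pattern: an ordinary string.
    text: a list of sets; text[i] holds the characters allowed at position i.
    """
    m, n = len(pattern), len(text)
    hits = []
    for i in range(n - m + 1):
        # The pattern matches at i if each of its characters is one of the
        # alternatives allowed at the corresponding text position.
        if all(pattern[j] in text[i + j] for j in range(m)):
            hits.append(i)
    return hits


# Example text: A, then C or G, then T, then A or C, then G.
text = [{"A"}, {"C", "G"}, {"T"}, {"A", "C"}, {"G"}]
print(match_solid_in_indeterminate("GT", text))  # [1]
print(match_solid_in_indeterminate("TC", text))  # [2]
```

    Each text position is checked independently, which is what makes the indeterminate case simple; elastic variants allow alternatives of different lengths and need more machinery.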

    A PAN-CANCER ANALYSIS OF ALTERNATIVE PROMOTERS USING RNA-SEQ DATA

    Ph.D. DOCTOR OF PHILOSOPHY (SOC

    Front Matter, Table of Contents, Preface, Conference Organization


    Minimal Phylogenetic Supertrees and Local Consensus Trees

    The problem of constructing a minimally resolved phylogenetic supertree (i.e., having the smallest possible number of internal nodes) that contains all of the rooted triplets from a consistent set R is known to be NP-hard. In this paper, we prove that constructing a phylogenetic tree consistent with R that contains the minimum number of additional rooted triplets is also NP-hard, and develop exact, exponential-time algorithms for both problems. The new algorithms are applied to construct two variants of the local consensus tree; for any set S of phylogenetic trees over some leaf label set L, this gives a minimal phylogenetic tree over L that contains every rooted triplet present in all trees in S, where "minimal" means either having the smallest possible number of internal nodes or the smallest possible number of rooted triplets. The second variant generalizes the RV-II tree, introduced by Kannan, Warnow, and Yooseph in 1998.
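
    To make the notion of "rooted triplets present in all trees" concrete, the sketch below enumerates the rooted triplets of a tree and intersects them across a set of trees. It only computes the triplet set that a local consensus tree must contain; it does not build the minimal tree, which the abstract states is NP-hard. The nested-tuple tree encoding and both function names are assumptions made for the example.

```python
from itertools import combinations


def leaves_and_triplets(tree):
    """Return (leaf set, rooted triplets) for a nested-tuple tree.

    A leaf is a string; an internal node is a tuple of subtrees.
    A triplet xy|z is stored as (frozenset({x, y}), z).
    """
    if isinstance(tree, str):
        return frozenset([tree]), set()
    child_leaves = []
    triplets = set()
    for child in tree:
        lv, tr = leaves_and_triplets(child)
        child_leaves.append(lv)
        triplets |= tr
    # At this node, two leaves from the same child and one leaf from a
    # different child form a triplet xy|z: lca(x, y) lies strictly below
    # this node, which is lca(x, z).
    for i, inside in enumerate(child_leaves):
        outside = [z for j, lv in enumerate(child_leaves) if j != i for z in lv]
        for x, y in combinations(sorted(inside), 2):
            for z in outside:
                triplets.add((frozenset((x, y)), z))
    return frozenset().union(*child_leaves), triplets


def common_triplets(trees):
    """Rooted triplets present in every tree of a non-empty list."""
    return set.intersection(*(leaves_and_triplets(t)[1] for t in trees))


t1 = (("a", "b"), ("c", "d"))
t2 = ((("a", "b"), "c"), "d")
for pair, z in sorted(common_triplets([t1, t2]), key=lambda t: t[1]):
    print("".join(sorted(pair)) + "|" + z)  # prints ab|c then ab|d
```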

    LIPIcs, Volume 312, WABI 2024, Complete Volume


    Calling large indels in 1047 Arabidopsis with IndelEnsembler

    10.1093/nar/gkab904, Nucleic Acids Research 49(19), 10879-1089

    Fixed parameter polynomial time algorithms for maximum agreement and compatible supertrees

    Proceedings of the 25th International Symposium on Theoretical Aspects of Computer Science, STACS 2008, 361-37

    MACHINE LEARNING ALGORITHMS FOR THE IDENTIFICATION OF CANCER CELLS USING GENE EXPRESSION DATA

    Ph.D. DOCTOR OF PHILOSOPHY (SOC

    On Finding the Adams Consensus Tree

    This paper presents a fast algorithm for finding the Adams consensus tree of a set of conflicting phylogenetic trees with identical leaf labels, for the first time improving the time complexity of a widely used algorithm invented by Adams in 1972 [1]. Our algorithm applies the centroid path decomposition technique [9] in a new way to traverse the input trees' centroid paths in unison, and runs in $O(k n \log n)$ time, where $k$ is the number of input trees and $n$ is the size of the leaf label set. (In comparison, the old algorithm from 1972 has a worst-case running time of $O(k n^2)$.) For the special case of $k = 2$, an even faster algorithm running in $O(n \cdot \frac{\log n}{\log\log n})$ time is provided, which relies on an extension of the wavelet tree-based technique by Bose et al. [6] for orthogonal range counting on a grid. Our extended wavelet tree data structure also supports truncated range maximum queries efficiently and may be of independent interest to algorithm designers.

    ASMdb: A comprehensive database for allele-specific DNA methylation in diverse organisms

    10.1093/nar/gkab937, Nucleic Acids Research 50(D1), D60-D7