A Unifying Taxonomy of Pattern Matching in Degenerate Strings and Founder Graphs
Elastic Degenerate (ED) strings and Elastic Founder (EF) graphs are two versions of acyclic components of pangenomes. Both ED strings and EF graphs (which we collectively name variable strings) extend the well-known notion of indeterminate string. Recent work has extensively investigated algorithmic tasks over these structures, and over several other variable string notions that they generalise. Among such tasks, the basic operation of matching a pattern against a text, which can serve as a toolkit for many pangenomic data analyses using these data structures, deserves special attention. In this paper we: (1) highlight a clear taxonomy within both ED strings and EF graphs ranging through variable strings of all types, from the linear string up to the most general one; (2) investigate the problem PvarT(X,Y) of matching a solid or variable pattern of type X against a variable text of type Y; (3) using as a reference the quadratic conditional lower bounds that are known for PvarT(solid,ED) and PvarT(solid,EF), for all possible types of variable strings X and Y we either prove the quadratic conditional lower bound for PvarT(X,Y), or provide non-trivial, often sub-quadratic, upper bounds, also exploiting the above-mentioned taxonomy.
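As a rough illustration of the simplest case in this taxonomy, the sketch below matches a solid pattern against an indeterminate string, modelled here as a list of character sets. This is a naive quadratic-time baseline written for this listing, not the paper's algorithm; it does not handle ED strings or EF graphs, and the function name and input encoding are assumptions.

```python
from typing import List, Set


def match_solid_in_indeterminate(pattern: str, text: List[Set[str]]) -> List[int]:
    """Return every start position where the solid pattern matches.

    Assumption: an indeterminate string is encoded as a list of sets,
    one set of allowed characters per position.
    """
    m, n = len(pattern), len(text)
    hits = []
    # Try each alignment; O(n * m) character-in-set checks overall.
    for start in range(n - m + 1):
        if all(pattern[j] in text[start + j] for j in range(m)):
            hits.append(start)
    return hits


# Hypothetical example text: position 1 may be C or G, position 3 may be A or C.
text = [{"A"}, {"C", "G"}, {"T"}, {"A", "C"}, {"G"}]
print(match_solid_in_indeterminate("GT", text))
```

A pattern occurs at a position if each of its characters belongs to the set at the aligned text position; the richer variable-string types in the abstract replace these per-position sets with sets of strings or graph blocks.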
A PAN-CANCER ANALYSIS OF ALTERNATIVE PROMOTERS USING RNA-SEQ DATA
Ph.D, DOCTOR OF PHILOSOPHY (SOC
Front Matter, Table of Contents, Preface, Conference Organization
Minimal Phylogenetic Supertrees and Local Consensus Trees
The problem of constructing a minimally resolved phylogenetic supertree (i.e., having the smallest possible number of internal nodes) that contains all of the rooted triplets from a consistent set R is known to be NP-hard. In this paper, we prove that constructing a phylogenetic tree consistent with R that contains the minimum number of additional rooted triplets is also NP-hard, and develop exact, exponential-time algorithms for both problems. The new algorithms are applied to construct two variants of the local consensus tree; for any set S of phylogenetic trees over some leaf label set L, this gives a minimal phylogenetic tree over L that contains every rooted triplet present in all trees in S, where "minimal" means either having the smallest possible number of internal nodes or the smallest possible number of rooted triplets. The second variant generalizes the RV-II tree, introduced by Kannan, Warnow, and Yooseph in 1998.
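To make the notion of "every rooted triplet present in all trees in S" concrete, the following sketch enumerates the rooted triplets of small trees and intersects them. It only computes the triplet set that a local consensus tree must contain; it does not build the minimal tree, which the abstract states is NP-hard. The nested-tuple tree encoding and all function names are assumptions made for illustration.

```python
from itertools import combinations


def leaf_paths(tree, prefix=()):
    """Map each leaf label to its root-to-leaf path of child indices.

    Assumption: a tree is a nested tuple whose leaves are strings.
    """
    if isinstance(tree, str):
        return {tree: prefix}
    paths = {}
    for i, child in enumerate(tree):
        paths.update(leaf_paths(child, prefix + (i,)))
    return paths


def lca_depth(p, q):
    """Depth of the lowest common ancestor = length of the common path prefix."""
    depth = 0
    for a, b in zip(p, q):
        if a != b:
            break
        depth += 1
    return depth


def rooted_triplets(tree):
    """Return the set of triplets (x, y, z), meaning xy|z, induced by the tree."""
    paths = leaf_paths(tree)
    triplets = set()
    for a, b, c in combinations(sorted(paths), 3):
        dab = lca_depth(paths[a], paths[b])
        dac = lca_depth(paths[a], paths[c])
        dbc = lca_depth(paths[b], paths[c])
        if dab > dac:        # a and b are joined below c
            triplets.add((a, b, c))
        elif dac > dab:      # a and c are joined below b
            triplets.add((a, c, b))
        elif dbc > dab:      # b and c are joined below a
            triplets.add((b, c, a))
        # otherwise the three leaves are unresolved (a fan), no triplet
    return triplets


# Two hypothetical input trees over the leaf set {a, b, c, d}.
t1 = ((("a", "b"), "c"), "d")
t2 = (("a", "b"), ("c", "d"))
common = rooted_triplets(t1) & rooted_triplets(t2)
print(sorted(common))
```

Here the two trees agree only on the triplets ab|c and ab|d, so any local consensus tree for this pair must display exactly those shared triplets at a minimum.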
LIPIcs, Volume 312, WABI 2024, Complete Volume
Calling large indels in 1047 Arabidopsis with IndelEnsembler
10.1093/nar/gkab904, Nucleic Acids Research 49(19), 10879-1089
Fixed parameter polynomial time algorithms for maximum agreement and compatible supertrees
Proceedings of the 25th International Symposium on Theoretical Aspects of Computer Science, STACS 2008, 361-37
MACHINE LEARNING ALGORITHMS FOR THE IDENTIFICATION OF CANCER CELLS USING GENE EXPRESSION DATA
Ph.D, DOCTOR OF PHILOSOPHY (SOC
On Finding the Adams Consensus Tree
This paper presents a fast algorithm for finding the Adams consensus tree of a set of conflicting phylogenetic trees with identical leaf labels, for the first time improving the time complexity of a widely used algorithm invented by Adams in 1972 [1]. Our algorithm applies the centroid path decomposition technique [9] in a new way to traverse the input trees' centroid paths in unison, and runs in O(k n \log n) time, where k is the number of input trees and n is the size of the leaf label set. (In comparison, the old algorithm from 1972 has a worst-case running time of O(k n^2).) For the special case of k = 2, an even faster algorithm running in O(n \cdot \frac{\log n}{\log\log n}) time is provided, which relies on an extension of the wavelet tree-based technique by Bose et al. [6] for orthogonal range counting on a grid. Our extended wavelet tree data structure also supports truncated range maximum queries efficiently and may be of independent interest to algorithm designers.
ASMdb: A comprehensive database for allele-specific DNA methylation in diverse organisms
10.1093/nar/gkab937, Nucleic Acids Research 50(D1), D60-D7
