1,721,054 research outputs found
Set Graphs. III. Proof Pearl: Claw-Free Graphs Mirrored into Transitive Hereditarily Finite Sets
We report on the formalization of two classical results about claw-free graphs, which have been verified correct by Jacob T. Schwartz’s proof-checker Referee. We have proved formally that every connected claw-free graph admits (1) a near-perfect matching, (2) Hamiltonian cycles in its square. To take advantage of the set-theoretic foundation of Referee, we exploited set equivalents of the graph-theoretic notions involved in our experiment: edge, source, square, etc. To ease some
proofs, we have often resorted to weak counterparts of well-established notions such as cycle, claw-freeness, longest directed path, etc
Ranking, unranking and random generation of extensional acyclic digraphs
Extensional acyclic digraphs are acyclic digraphs whose vertices have pairwise different sets of out-neighbors; they represent hereditarily finite sets, which stand at the basis of some computer languages. In this paper we give an O(n^3) algorithm for generating uniformly at random an extensional acyclic digraph on n vertices. This is done by first proposing a linear-time algorithm for encoding such digraphs by particular (n−1)-tuples of subsets of {0,...,n−2}. We then give a new counting recurrence for such tuples, which we exploit in ranking/unranking algorithms. These are also useful for indexing data structures by hereditarily finite sets
Using aetnanova to formally prove that the Davis-Putnam satisfiability test is correct
This paper reports on using the ÆtnaNova/Referee proof-verification system to formalize issues regarding the satisfiability of CNF-formulae of propositional logic. We specify an “archetype” version of the Davis-Putnam-Logemann-Loveland algorithm through the THEORY of recursive functions based on a well-founded relation, and prove it to be correct.Within the same framework, and by resorting to the Zorn lemma, we develop a straightforward proof of the compactness theorem.</p
A Novel Combinatorial Method for Estimating Transcript Expression with RNA-Seq: Bounding the Number of PathsAlgorithms in Bioinformatics
RNA-Seq technology offers new high-throughput ways for transcript identification and quantification based on short reads, and has recently attracted great interest. The problem is usually modeled by a weighted splicing graph whose nodes stand for exons and whose edges stand for split alignments to the exons. The task consists of finding a number of paths, together with their expression levels, which optimally explain the coverages of the graph under various fitness functions, such least sum of squares. In (Tomescu et al. RECOMB-seq 2013) we showed that under general fitness functions, if we allow a polynomially bounded number of paths in an optimal solution, this problem can be solved in polynomial time by a reduction to a min-cost flow program. In this paper we further refine this problem by asking for a bounded number k of paths that optimally explain the splicing graph. This problem becomes NP-hard in the strong sense, but we give a fast combinatorial algorithm based on dynamic programming for it. In order to obtain a practical tool, we implement three optimizations and heuristics, which achieve better performance on real data, and similar or better performance on simulated data, than state-of-the-art tools Cufflinks, IsoLasso and SLIDE. Our tool, called Traph, is available at http://www.cs.helsinki.fi/gsa/traph/
Indexes for Jumbled Pattern Matching in Strings, Trees and Graphs
We consider how to index strings, trees and graphs for jum- bled pattern matching when we are asked to return a match if one exists. For example, we show how, given a tree containing two colours, we can build a quadratic-space index with which we can find a match in time proportional to the size of the match. We also show how we need only linear space if we are content with approximate matches
Optimal Omnitig Listing for Safe and Complete Contig Assembly
Genome assembly is the problem of reconstructing a genome sequence from a set of reads from a sequencing experiment. Typical formulations of the assembly problem admit in practice many genomic reconstructions, and actual genome assemblers usually output contigs, namely substrings that are promised to occur in the genome. To bridge the theory and practice, Tomescu and Medvedev [RECOMB 2016] reformulated contig assembly as finding all substrings common to all genomic reconstructions. They also gave a characterization of those walks (omnitigs) that are common to all closed edge-covering walks of a (directed) graph, a typical notion of genomic reconstruction. An algorithm for listing all maximal omnitigs was also proposed, by launching an exhaustive visit from every edge.
In this paper, we prove new insights about the structure of omnitigs and solve several open questions about them. We combine these to achieve an O(nm)-time algorithm for outputting all the maximal omnitigs of a graph (with n nodes and m edges). This is also optimal, as we show families of graphs whose total omnitig length is Omega(nm). We implement this algorithm and show that it is 9-12 times faster in practice than the one of Tomescu and Medvedev [RECOMB 2016]
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation
counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings
are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that
only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into
account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed
- …
