22 research outputs found
On the Relative Efficiency of Dynamic and Static Top-Down Compilation to Decision-DNNF
Top-down compilers of CNF formulas to circuits in decision-DNNF (Decomposable Negation Normal Form) have proved to be useful for model counting. These compilers rely on a common set of techniques including DPLL-style exploration of the set of models, caching of residual formulas, and connected components detection. Differences between compilers lie in the variable selection heuristics and in the additional processing techniques they may use. We investigate, from a theoretical perspective, the ability of top-down compilation algorithms to find small decision-DNNF circuits for two different variable selection strategies. Both strategies are guided by a graph of the CNF formula and are inspired by what is done in practice. The first uses a dynamic graph-partitioning approach while the second works with a static tree decomposition. We show that the dynamic approach performs significantly better than the static approach for some formulas, and that the opposite also holds for other formulas. Our lower bounds are proved despite loose settings where the compilation algorithm is only forced to follow its designed variable selection strategy and where everything else, including the many opportunities for tie-breaking, can be handled non-deterministically
Separating Incremental and Non-Incremental Bottom-Up Compilation
The aim of a compiler is, given a function represented in some language, to generate an equivalent representation in a target language L. In bottom-up (BU) compilation of functions given as CNF formulas, constructing the new representation requires compiling several subformulas in L. The compiler starts by compiling the clauses in L and iteratively constructs representations for new subformulas using an "Apply" operator that performs conjunction in L, until all clauses are combined into one representation. In principle, BU compilation can generate representations for any subformulas and conjoin them in any way. But an attractive strategy from a practical point of view is to augment one main representation - which we call the core - by conjoining to it the clauses one at a time. We refer to this strategy as incremental BU compilation. We prove that, for known relevant languages L for BU compilation, there is a class of CNF formulas that admit BU compilations to L that generate only polynomial-size intermediate representations, while their incremental BU compilations all generate an exponential-size core
An FPRAS for Model Counting for Non-Deterministic Read-Once Branching Programs
Non-deterministic read-once branching programs, also known as non-deterministic free binary decision diagrams (nFBDD), are a fundamental data structure in computer science for representing Boolean functions. In this paper, we focus on #nFBDD, the problem of model counting for non-deterministic read-once branching programs. The #nFBDD problem is #P-hard, and it is known that there exists a quasi-polynomial randomized approximation scheme for #nFBDD. In this paper, we provide the first FPRAS for #nFBDD. Our result relies on the introduction of new analysis techniques that focus on bounding the dependence of samples
Dataset of Random Reordered Encodings of Parity Problems
<p>Dataset created and used for the paper:</p>
<p><em>Hardness of Random Reordered Encodings of Parity for Resolution and CDCL</em>, Leroy Chew, Alexis de Colnet, Friedrich Slivovsky and Stefan Szeider, AAAI 2023</p>
<p>report.rmd is an R Markdown file that can be opened in RStudio and used to generate the figures in the submission</p>
<p>_stats file: lists the permutations used<br>.cnf file: DIMACS representation of unsat formulas<br>.gr : DIMACS representation of the Tseitin graph<br>.edges: alternative representation of the Tseitin graph<br>_half: DIMACS representation of the Tseitin graph for G*</p>
<p>n: number of input variables<br>p: probability of selection for each variable<br>s: random seed used<br>m: mode number used<br> 0 : uniformly random<br> 1: Mallows(n,q), q= (1/n)^(1/(k*k))<br> 3: performs k number of random adjacent swaps to make the permutation<br> 4: performs 10000000 potential random adjacent swaps but forbids the distance being >k<br> 5: uniformly but rerolls until the maximum permutation distance is k<br> 6: 3 parity constraints<br>k: integer parameter for mode</p>
A Lower Bound on DNNF Encodings of Pseudo-Boolean Constraints
International audienceTwo major considerations when encoding pseudo-Boolean (PB) constraints into SAT are the size of the encoding and its propagation strength, that is, the guarantee that it has a good behaviour under unit propagation. Several encodings with propagation strength guarantees rely upon prior compilation of the constraints into DNNF (decomposable negation normal form), BDD (binary decision diagram), or some other sub-variants. However it has been shown that there exist PB-constraints whose ordered BDD (OBDD) representations, and thus the inferred CNF encodings, all have exponential size. Since DNNFs are more succinct than OBDDs, preferring encodings via DNNF to avoid size explosion seems a legitimate choice. Yet in this paper, we prove the existence of PB-constraints whose DNNFs all require exponential size
Hard Functions in Knowledge Compilation : from Lower Bounds to Applications
Le thème de la thèse est la compilation de connaissances, une approche pour la résolution de problèmes difficiles à résoudre du point de vue du calcul et qui vise à réduire cette complexité en pré-traitant (en compilant) une partie du problème connue à l'avance et modélisée par une fonction. Il s'agit de trouver une représentation de la fonction dans une classe de représentations appelée langage de compilation. Pour un langage L donné, la taille d'une fonction dans L désigne la taille de sa plus petite représentation dans L. Dans cette thèse, nous étudions différents aspects de la compilation dans le langage DNNF, le langage où les fonctions Booléennes sont représentées par des circuits sous forme normale négative décomposable (DNNF).Dans la première partie de la thèse, nous montrons des bornes inférieures sur la taille de fonctions particulières dans le langage DNNF. Pour certaines fonctions, notamment des contraintes pseudo-Booléennes, nous obtenons des bornes inférieures par l'application de techniques usuelles pour l'analyse de circuits en DNNF. Pour d'autres fonctions, ces mêmes techniques s'avèrent insuffisantes. En particulier, pour les formules de Tseitin, qui sont des formules CNF représentant des systèmes d'équations linéaires structurés par des graphes, nous avons amélioré les techniques existantes pour montrer une borne exponentielle sur leur taille dans le langage DNNF. Quand la taille d'une fonction dans un langage est trop grande pour envisager sa compilation, on peut tenter de compiler une approximation de la fonction, on parle alors de compilation de connaissances approchée. Nous avons étudié deux notions d'approximation qui offrent des garanties sur l'erreur d'approximation. Pour chacune de ces notions, nous avons trouvé des exemples de fonctions dont toutes les approximations ont une taille trop grande dans de nombreux langages de compilation.Dans la seconde partie de la thèse, nous donnons des applications des nos bornes inférieures. Ces applications font le lien entre la compilation de connaissances et le domaine de la complexité des preuves. Notamment, nous utilisons notre borne inférieure sur la taille des formules de Tseitin dans le langage DNNF pour obtenir une caractérisation des formules de Tseitin non-satisfiables (dont les graphes sont de degré borné par une constante) pour lesquelles il existe des réfutations par résolution dites régulières de taille polynomiale en le nombre de variables. Une seconde application concerne à la fois certains systèmes de preuve et les compilateurs utilisant la compilation dite ascendante. La compilation ascendante a pour particularité qu'elle construit le circuit représentant la fonction initiale en combinant des circuits intermédiaires. Dans le cas de formules non-satisfiables, la séquence de circuits intermédiaires générés lors d'une compilation ascendante peut être vue comme une réfutation de la formule. Nous analysons des cas de compilations ascendantes de formules (pour certaines non-satisfiables) ayant de petites représentations dans le langage cible, mais pour lesquelles des circuits intermédiaires de taille exponentielle sont inévitables.Dans les derniers chapitres de la thèse, nous tentons d'élargir le champ de recherche en compilation de connaissances tout en restant dans l'esprit de la « carte de compilation de connaissances », un modèle proposé par Darwiche et Marquis pour comparer les langages de compilation en termes de compacité (quels sont les langages permettant d'obtenir les plus petites représentations ?) et en termes d'efficacité calculatoire (pour quels langages existe-t-il des algorithmes en temps polynomial pour répondre aux requêtes ?). Nous introduisons de nouvelles requêtes pour les langages Booléens, plus précisément des requêtes d'énumération, et nous initions des « cartes de compacité » pour des langages pour des fonctions non-Booléennes.The subject of the thesis is knowledge compilation, an approach for computationally hard problems that aims to work around general lower bounds results by preprocessing (or compiling) an input function. The aim is to generate an equivalent representation of the function in a class of representations called the compilation language. Given a compilation language 'L', we call "L size of a function" the size of the smallest representation in L for that function. In this thesis, we study different aspects of the compilation language DNNF of Boolean circuits in decomposable negation normal form (DNNF).In the first part of the thesis, we prove exponential lower bounds on the DNNF size of particular functions and even on the DNNF size of their approximations. For some functions, especially some pseudo-Boolean constraints, we prove lower bounds using classical techniques to analyse circuits in DNNF using tools from communication complexity. But for other functions, the same techniques do not yield good lower bounds. In particular for Tseitin formulas, which are CNF formulas representing systems of parity constraints structured by graphs, we had to improve existing techniques to prove a lower bound on their DNNF sizes that is exponential in the treewidth of the underlying graphs. Since the DNNF size of functions is sometimes an impediment for compilation in practice, we study the complexity of representing approximations of functions in DNNF. We use and compare two notions of approximations that guarantee a bounded error of approximation. For both notions, we give examples of functions that are not only hard to represent exactly in many compilation languages, but whose approximations are also all hard to represent in the same languages.In the second part of the thesis, we give applications of our lower bounds that establish connections between knowledge compilation and proof complexity theory. First, using our lower bound on the DNNF size of Tseitin formulas, we give a characterization of unsatisfiable Tseitin formulas whose underlying graphs have maximal degree bounded by a constant that have small refutations in the regular resolution proof system. A second application deals with the analysis of compilers that respect the "bottom-up" paradigm. A particularity of bottom-up compilers is to generate intermediate representations or circuits and, for some compilation languages, the sequence of intermediate circuits generated in a bottom-up compilation of an unsatisfiable formula can be seen as a refutation of the formula. We show that bottom-up compilers in a language L can be very inefficient even when the functions to compile have a small representation in L since large intermediate circuits are sometimes unavoidable.In the last part of the thesis, we explore new horizons for knowledge compilation. We follow the spirit of the "knowledge compilation map", a framework created by Darwiche and Marquis to compare compilation languages in term of succinctness (which languages yields the smaller representations?) and in terms of efficiency for reasoning on function (for which languages are there polynomial-time algorithms for given queries?). We introduce new queries for Boolean languages, more precisely enumeration queries, and we initiate a "succinctness map" for languages for non-Boolean functions
Lower Bounds on Intermediate Results in Bottom-Up Knowledge Compilation
International audienceBottom-up knowledge compilation is a paradigm for generating representations of functions by iteratively conjoining constraints using a so-called apply function. When the input is not efficiently compilable into a language - generally a class of circuits - because optimal compiled representations are provably large, the problem is not the compilation algorithm as much as the choice of a language too restrictive for the input. In contrast, in this paper, we look at CNF formulas for which very small circuits exists and look at the efficiency of their bottom-up compilation in one of the most general languages, namely that of structured decomposable negation normal forms (str-DNNF). We prove that, while the inputs have constant size representations as str-DNNF, any bottom-up compilation in the general setting where conjunction and structure modification are allowed takes exponential time and space, since large intermediate results have to be produced. This unconditionally proves that the inefficiency of bottom-up compilation resides in the bottom-up paradigm itself
On Translations between ML Models for XAI Purposes
International audienceIn this paper, the succinctness of various ML models is studied. To be more precise, the existence of polynomial-time and polynomial-space translations between representation languages for classifiers is investigated. The languages that are considered include decision trees, random forests, several types of boosted trees, binary neural networks, Boolean multilayer perceptrons, and various logical representations of binary classifiers. We provide a complete map indicating for every pair of languages C, C' whether or not a polynomial-time / polynomial-space translation exists from C to C'. We also explain how to take advantage of the resulting map for XAI purposes
Characterizing Tseitin-Formulas with Short Regular Resolution Refutations
International audienc
Lower Bounds for Approximate Knowledge Compilation
International audienceKnowledge compilation studies the trade-off between succinctness and efficiency of different representation languages. For many languages, there are known strong lower bounds on the representation size, but recent work shows that, for some languages, one can bypass these bounds using approximate compilation. The idea is to compile an approximation of the knowledge for which the number of errors can be controlled. We focus on circuits in deterministic decomposable negation normal form (d-DNNF), a compilation language suitable in contexts such as probabilistic reasoning, as it supports efficient model counting and probabilistic inference. Moreover, there are known size lower bounds for d-DNNF which by relaxing to approximation one might be able to avoid. In this paper we formalize two notions of approximation: weak approximation which has been studied before in the decision diagram literature and strong approximation which has been used in recent algorithmic results. We then show lower bounds for approximation by d-DNNF, complementing the positive results from the literature
