1,720,972 research outputs found
On the Relative Efficiency of Dynamic and Static Top-Down Compilation to Decision-DNNF
Top-down compilers of CNF formulas to circuits in decision-DNNF (Decomposable Negation Normal Form) have proved to be useful for model counting. These compilers rely on a common set of techniques including DPLL-style exploration of the set of models, caching of residual formulas, and connected components detection. Differences between compilers lie in the variable selection heuristics and in the additional processing techniques they may use. We investigate, from a theoretical perspective, the ability of top-down compilation algorithms to find small decision-DNNF circuits for two different variable selection strategies. Both strategies are guided by a graph of the CNF formula and are inspired by what is done in practice. The first uses a dynamic graph-partitioning approach while the second works with a static tree decomposition. We show that the dynamic approach performs significantly better than the static approach for some formulas, and that the opposite also holds for other formulas. Our lower bounds are proved despite loose settings where the compilation algorithm is only forced to follow its designed variable selection strategy and where everything else, including the many opportunities for tie-breaking, can be handled non-deterministically
On Translations between ML Models for XAI Purposes
International audienceIn this paper, the succinctness of various ML models is studied. To be more precise, the existence of polynomial-time and polynomial-space translations between representation languages for classifiers is investigated. The languages that are considered include decision trees, random forests, several types of boosted trees, binary neural networks, Boolean multilayer perceptrons, and various logical representations of binary classifiers. We provide a complete map indicating for every pair of languages C, C' whether or not a polynomial-time / polynomial-space translation exists from C to C'. We also explain how to take advantage of the resulting map for XAI purposes
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation
counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings
are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that
only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into
account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed
Hard Functions in Knowledge Compilation : from Lower Bounds to Applications
Le thème de la thèse est la compilation de connaissances, une approche pour la résolution de problèmes difficiles à résoudre du point de vue du calcul et qui vise à réduire cette complexité en pré-traitant (en compilant) une partie du problème connue à l'avance et modélisée par une fonction. Il s'agit de trouver une représentation de la fonction dans une classe de représentations appelée langage de compilation. Pour un langage L donné, la taille d'une fonction dans L désigne la taille de sa plus petite représentation dans L. Dans cette thèse, nous étudions différents aspects de la compilation dans le langage DNNF, le langage où les fonctions Booléennes sont représentées par des circuits sous forme normale négative décomposable (DNNF).Dans la première partie de la thèse, nous montrons des bornes inférieures sur la taille de fonctions particulières dans le langage DNNF. Pour certaines fonctions, notamment des contraintes pseudo-Booléennes, nous obtenons des bornes inférieures par l'application de techniques usuelles pour l'analyse de circuits en DNNF. Pour d'autres fonctions, ces mêmes techniques s'avèrent insuffisantes. En particulier, pour les formules de Tseitin, qui sont des formules CNF représentant des systèmes d'équations linéaires structurés par des graphes, nous avons amélioré les techniques existantes pour montrer une borne exponentielle sur leur taille dans le langage DNNF. Quand la taille d'une fonction dans un langage est trop grande pour envisager sa compilation, on peut tenter de compiler une approximation de la fonction, on parle alors de compilation de connaissances approchée. Nous avons étudié deux notions d'approximation qui offrent des garanties sur l'erreur d'approximation. Pour chacune de ces notions, nous avons trouvé des exemples de fonctions dont toutes les approximations ont une taille trop grande dans de nombreux langages de compilation.Dans la seconde partie de la thèse, nous donnons des applications des nos bornes inférieures. Ces applications font le lien entre la compilation de connaissances et le domaine de la complexité des preuves. Notamment, nous utilisons notre borne inférieure sur la taille des formules de Tseitin dans le langage DNNF pour obtenir une caractérisation des formules de Tseitin non-satisfiables (dont les graphes sont de degré borné par une constante) pour lesquelles il existe des réfutations par résolution dites régulières de taille polynomiale en le nombre de variables. Une seconde application concerne à la fois certains systèmes de preuve et les compilateurs utilisant la compilation dite ascendante. La compilation ascendante a pour particularité qu'elle construit le circuit représentant la fonction initiale en combinant des circuits intermédiaires. Dans le cas de formules non-satisfiables, la séquence de circuits intermédiaires générés lors d'une compilation ascendante peut être vue comme une réfutation de la formule. Nous analysons des cas de compilations ascendantes de formules (pour certaines non-satisfiables) ayant de petites représentations dans le langage cible, mais pour lesquelles des circuits intermédiaires de taille exponentielle sont inévitables.Dans les derniers chapitres de la thèse, nous tentons d'élargir le champ de recherche en compilation de connaissances tout en restant dans l'esprit de la « carte de compilation de connaissances », un modèle proposé par Darwiche et Marquis pour comparer les langages de compilation en termes de compacité (quels sont les langages permettant d'obtenir les plus petites représentations ?) et en termes d'efficacité calculatoire (pour quels langages existe-t-il des algorithmes en temps polynomial pour répondre aux requêtes ?). Nous introduisons de nouvelles requêtes pour les langages Booléens, plus précisément des requêtes d'énumération, et nous initions des « cartes de compacité » pour des langages pour des fonctions non-Booléennes.The subject of the thesis is knowledge compilation, an approach for computationally hard problems that aims to work around general lower bounds results by preprocessing (or compiling) an input function. The aim is to generate an equivalent representation of the function in a class of representations called the compilation language. Given a compilation language 'L', we call "L size of a function" the size of the smallest representation in L for that function. In this thesis, we study different aspects of the compilation language DNNF of Boolean circuits in decomposable negation normal form (DNNF).In the first part of the thesis, we prove exponential lower bounds on the DNNF size of particular functions and even on the DNNF size of their approximations. For some functions, especially some pseudo-Boolean constraints, we prove lower bounds using classical techniques to analyse circuits in DNNF using tools from communication complexity. But for other functions, the same techniques do not yield good lower bounds. In particular for Tseitin formulas, which are CNF formulas representing systems of parity constraints structured by graphs, we had to improve existing techniques to prove a lower bound on their DNNF sizes that is exponential in the treewidth of the underlying graphs. Since the DNNF size of functions is sometimes an impediment for compilation in practice, we study the complexity of representing approximations of functions in DNNF. We use and compare two notions of approximations that guarantee a bounded error of approximation. For both notions, we give examples of functions that are not only hard to represent exactly in many compilation languages, but whose approximations are also all hard to represent in the same languages.In the second part of the thesis, we give applications of our lower bounds that establish connections between knowledge compilation and proof complexity theory. First, using our lower bound on the DNNF size of Tseitin formulas, we give a characterization of unsatisfiable Tseitin formulas whose underlying graphs have maximal degree bounded by a constant that have small refutations in the regular resolution proof system. A second application deals with the analysis of compilers that respect the "bottom-up" paradigm. A particularity of bottom-up compilers is to generate intermediate representations or circuits and, for some compilation languages, the sequence of intermediate circuits generated in a bottom-up compilation of an unsatisfiable formula can be seen as a refutation of the formula. We show that bottom-up compilers in a language L can be very inefficient even when the functions to compile have a small representation in L since large intermediate circuits are sometimes unavoidable.In the last part of the thesis, we explore new horizons for knowledge compilation. We follow the spirit of the "knowledge compilation map", a framework created by Darwiche and Marquis to compare compilation languages in term of succinctness (which languages yields the smaller representations?) and in terms of efficiency for reasoning on function (for which languages are there polynomial-time algorithms for given queries?). We introduce new queries for Boolean languages, more precisely enumeration queries, and we initiate a "succinctness map" for languages for non-Boolean functions
Separating Incremental and Non-Incremental Bottom-Up Compilation
The aim of a compiler is, given a function represented in some language, to generate an equivalent representation in a target language L. In bottom-up (BU) compilation of functions given as CNF formulas, constructing the new representation requires compiling several subformulas in L. The compiler starts by compiling the clauses in L and iteratively constructs representations for new subformulas using an "Apply" operator that performs conjunction in L, until all clauses are combined into one representation. In principle, BU compilation can generate representations for any subformulas and conjoin them in any way. But an attractive strategy from a practical point of view is to augment one main representation - which we call the core - by conjoining to it the clauses one at a time. We refer to this strategy as incremental BU compilation. We prove that, for known relevant languages L for BU compilation, there is a class of CNF formulas that admit BU compilations to L that generate only polynomial-size intermediate representations, while their incremental BU compilations all generate an exponential-size core
Separating Incremental and Non-Incremental Bottom-Up Compilation
The aim of a compiler is, given a function represented in some language, to generate an equivalent representation in a target language L. In bottom-up (BU) compilation of functions given as CNF formulas, constructing the new representation requires compiling several subformulas in L. The compiler starts by compiling the clauses in L and iteratively constructs representations for new subformulas using an "Apply" operator that performs conjunction in L, until all clauses are combined into one representation. In principle, BU compilation can generate representations for any subformulas and conjoin them in any way. But an attractive strategy from a practical point of view is to augment one main representation - which we call the core - by conjoining to it the clauses one at a time. We refer to this strategy as incremental BU compilation. We prove that, for known relevant languages L for BU compilation, there is a class of CNF formulas that admit BU compilations to L that generate only polynomial-size intermediate representations, while their incremental BU compilations all generate an exponential-size core
Variations on the Author
“Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer that rather than express opinions propels discourses; an interviewer that is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent form the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship
Appropriate Similarity Measures for Author Cocitation Analysis
We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance.information science;Pearson correlation;cosine;similarity measure;author cocitation analysis
Dispelling the Myths Behind First-author Citation Counts
We conducted a full-scale evaluative citation analysis study of scholars in the XML research field to explore just how different from each other author rankings resulting from different citation counting methods actually are, and to demonstrate the capability of emerging data and tools on the Web in supporting more realistic citation counting methods. Our results contest some common arguments for the continued
use of first-author citation counts in the evaluation of scholars, such as high correlations between author rankings by first-author citation counts and other citation
counting methods, and high costs of using more realistic citation counting methods that are not well-supported by the ISI databases. It is argued that increasingly available digital full text research papers make it possible for citation analysis studies to go beyond what the ISI databases have directly supported and to employ more
sophisticated methods
- …
