Inductive queries for a drug designing robot scientist

Author: Amanda C. Schierz
Andrew Sparkes
Amanda Clare
Jan Ramon
Ross D. King
Siegfried Nijssen
Jem J. Rowland
Publication date: 01/01/2010

It is increasingly clear that machine learning algorithms need to be integrated in an iterative scientific discovery loop, in which data is queried repeatedly by means of inductive queries and where the computer provides guidance to the experiments that are being performed. In this chapter, we summarise several key challenges in achieving this integration of machine learning and data mining algorithms in methods for the discovery of Quantitative Structure-Activity Relationships (QSARs). We introduce the concept of a robot scientist, in which all steps of the discovery process are automated; we discuss the representation of molecular data such that knowledge discovery tools can analyse it, and we discuss the adaptation of machine learning and data mining algorithms to guide QSAR experiments.

Lirias

Crossref

Bournemouth University Research Online

The University of Manchester - Institutional Repository

DIAL UCLouvain

Constraint Based Mining of First Order Sequences in SeqLog (Extended Abstract)

Author: Sau Dan Lee
Luc De Raedt
Publication date: 01/01/2004

Sau Dan Lee and Luc De Raedt, Institut für Informatik, Albert-Ludwigs-Universität Freiburg, Germany, {danlee,deraedt}@informatik.uni-freiburg.de. Abstract: A logical language, SeqLog, for mining and querying sequential data and databases is presented. In SeqLog, data takes the form of a sequence of logical atoms, background knowledge can be specified using DataLog-style clauses, and sequential queries or patterns correspond to subsequences of logical atoms.
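
To make the idea of a pattern as a subsequence of logical atoms concrete, here is a minimal Python sketch. It is not SeqLog itself: the tuple encoding of atoms, the uppercase-variable convention, and the function names are assumptions made for illustration, and the sketch ignores background-knowledge clauses and any further operators the full language may define.

```python
def is_variable(term):
    # Assumption: variables are strings starting with an uppercase letter (Prolog style).
    return isinstance(term, str) and term[:1].isupper()


def unify_atom(pattern_atom, ground_atom, bindings):
    """Match one pattern atom against one ground atom; return extended bindings or None."""
    if pattern_atom[0] != ground_atom[0] or len(pattern_atom) != len(ground_atom):
        return None
    new_bindings = dict(bindings)
    for p, g in zip(pattern_atom[1:], ground_atom[1:]):
        if is_variable(p):
            # A variable must be bound to the same constant everywhere in the pattern.
            if new_bindings.setdefault(p, g) != g:
                return None
        elif p != g:
            return None
    return new_bindings


def matches_subsequence(pattern, sequence, bindings=None, start=0):
    """Return variable bindings if `pattern` occurs in order (gaps allowed), else None."""
    bindings = {} if bindings is None else bindings
    if not pattern:
        return bindings
    for i in range(start, len(sequence)):
        extended = unify_atom(pattern[0], sequence[i], bindings)
        if extended is not None:
            result = matches_subsequence(pattern[1:], sequence, extended, i + 1)
            if result is not None:
                return result
    return None


# Hypothetical data: a sequence of ground atoms, each written as (predicate, args...).
log = [
    ("login", "alice"),
    ("view", "alice", "home"),
    ("login", "bob"),
    ("buy", "alice", "book"),
]
pattern = [("login", "U"), ("buy", "U", "Item")]
result = matches_subsequence(pattern, log)
reversed_result = matches_subsequence(list(reversed(pattern)), log)
```

Here `result` binds `U` to `alice` and `Item` to `book`, while the reversed pattern finds no match because order matters in a sequence.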

CiteSeerX

Relational random forests based on random relational rules

Author: Bernhard Pfahringer
Grant Anderson
Publication date: 01/01/2009

Random Forests have been shown to perform very well in propositional learning. FORF is an upgrade of Random Forests for relational data. In this paper we investigate shortcomings of FORF and propose an alternative algorithm, R⁴F, for generating Random Forests over relational data. R⁴F employs randomly generated relational rules as fully self-contained Boolean tests inside each node in a tree and thus can be viewed as an instance of dynamic propositionalization. The implementation of R⁴F allows for the simultaneous or parallel growth of all the branches of all the trees in the ensemble in an efficient, shared, but still single-threaded way. Experiments favorably compare R⁴F to both FORF and the combination of static propositionalization together with standard Random Forests. Various strategies for tree initialization and splitting of nodes, as well as resulting ensemble size, diversity, and computational complexity of R⁴F, are also investigated.

Research Commons@Waikato

Abstraction Refinement Guided by a Learnt Probabilistic Model

Author: Radu Grigore
Hongseok Yang
Publication date: 01/01/2016

The core challenge in designing an effective static program analysis is to find a good program abstraction, one that retains only details relevant to a given query. In this paper, we present a new approach for automatically finding such an abstraction. Our approach uses a pessimistic strategy, which can optionally use guidance from a probabilistic model. Our approach applies to parametric static analyses implemented in Datalog, and is based on counterexample-guided abstraction refinement. For each untried abstraction, our probabilistic model provides a probability of success, while the size of the abstraction provides an estimate of its cost in terms of analysis time. Combining these two metrics, probability and cost, our refinement algorithm picks an optimal abstraction. Our probabilistic model is a variant of the Erdős-Rényi random graph model, and it is tunable by what we call hyperparameters. We present a method to learn good values for these hyperparameters, by observing past runs of the analysis on an existing codebase. We evaluate our approach on an object-sensitive pointer analysis for Java programs, with two client analyses (PolySite and Downcast).

KAIST Institutional Repository

Crossref

Oxford University Research Archive

Kent Academic Repository

Kernels on Prolog Proof Trees: Statistical Learning in the ILP Setting

Author: Andrea Passerini
Luc De Raedt
Paolo Frasconi
Publication date: 01/01/2006

We develop kernels for measuring the similarity between relational instances using background knowledge expressed in first-order logic. The method allows us to bridge the gap between traditional inductive logic programming (ILP) representations and statistical approaches to supervised learning. Logic programs are first used to generate proofs of given visitor programs that use predicates declared in the available background knowledge. A kernel is then defined over pairs of proof trees. The method can be used for supervised learning tasks and is suitable for classification as well as regression. We report positive empirical results on Bongard-like and M-of-N problems that are difficult or impossible to solve with traditional ILP techniques, as well as on real bioinformatics and chemoinformatics data sets.

CiteSeerX

DROPS Dagstuhl Research Online Publication Server

Semiring programming: A semantic framework for generalized sum product problems

Author: Vaishak Belle
Luc De Raedt
Publication date: 01/01/2020

Sponsorship: Vaishak Belle was supported by a Royal Society University Research Fellowship. Luc De Raedt was supported by the European Research Council (ERC) Advanced Grant 694980 "SYNTH: Synthesising Inductive Data Models" and the Research Foundation Flanders.
Status: Published

Lirias

Crossref

Edinburgh Research Explorer

Swepub

The twokey plot for multiple association rules control

Author: Klaus Bernt
Publication date: 01/01/2001

The twokey plot for multiple association rules control / A. R. Unwin, H. Hofmann, K. Bernt. In: Principles of Data Mining and Knowledge Discovery / Luc De Raedt ... (ed.). Berlin [etc.]: Springer, 2001, pp. 472-483. (Lecture Notes in Computer Science; 2168: Lecture Notes in Artificial Intelligence)

Author Instructions

Author: Instructions Author
Publication date: 04/11/2013

Crossref

Cartographic Perspectives (E-Journal - North American Cartographic Information Society, NACIS)

Going Beyond Counting First Authors in Author Co-citation Analysis

Author: Dangzhi Zhao
Publication date: 01/01/2005

The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting: the traditional type, which counts only the first of a cited work's authors, and a non-traditional type, which takes into account the first 5 authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
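
The difference between the two counting schemes can be sketched in a few lines of Python. This is an illustrative simplification, not the study's procedure: the data layout, the function name `cocitation_counts`, and the rule that a pair is counted at most once per citing paper are all assumptions, and the author names are made up.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count author pairs cited together in the same citing paper.

    citing_papers: list of reference lists; each reference is an ordered
    list of author names. Only the first `max_authors` authors of each
    cited work are considered (1 = traditional first-author counting).
    """
    counts = Counter()
    for reference_list in citing_papers:
        cited_authors = set()
        for authors in reference_list:
            cited_authors.update(authors[:max_authors])
        # Assumption: each pair is counted once per citing paper.
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts


# Hypothetical citing papers, each given as its list of cited works.
papers = [
    [["Adams", "Baker"], ["Chen"]],
    [["Adams", "Baker"], ["Diaz", "Chen"]],
]
first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

With first-author counting, Baker never appears in any pair; with the first five authors included, Baker is co-cited with Adams and Chen in both papers. Note that this sketch also pairs co-authors of the same cited work, which a real analysis might treat differently.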

E-LIS

Variations on the Author

Author: Cecilia Sayad
Publication date: 01/01/2016

“Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourses; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.

Crossref

Kent Academic Repository