MIDAS 2016: The 1st Workshop on MIning DAta for financial applicationS

Author: Caldarelli G.
Bordino I.
Gullo F.
Squartini T.
Fumarola F.
Publication date: 01/01/2016

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

A KDD Platform based on the Application Service Provider Paradigm

Author: SALVEMINI E
MALERBA Donato
FUMAROLA F
Publication date: 01/01/2008

Nowadays, Small and Medium Enterprises (SMEs) are forced to compete on a global market and to make strategic decisions in short periods of time. To give SMEs access to information technologies that can support their competition on a global scale, public administrations are fostering the setting up of Digital Districts. In this paper, we describe a distributed collaborative data mining platform, named KD-ASP, developed for a Digital District. It is based on the application service provider (ASP) paradigm, which allows SMEs to access data mining services over a network and to cut down the costs of their acquisition, upgrading and maintenance. KD-ASP allows users to collaborate on the design of a knowledge discovery process whose execution is then delegated to a workflow engine. Tasks involved in a process are classified as data selection, pre-processing, data transformation, data mining and data visualization, and are made available as Web services.

Archivio istituzionale della ricerca - Università di Bari

A Parallel Algorithm for Approximate Frequent Itemset Mining using MapReduce

Author: MALERBA Donato
FUMAROLA Fabio
Publication date: 01/01/2014

Recently, several algorithms based on the MapReduce framework have been proposed for frequent pattern mining in Big Data. However, the proposed solutions come with their own technical challenges, such as inter-communication costs, in-process synchronizations, balanced data distribution and input parameter tuning, which negatively affect the computation time. In this paper we present MrAdam, a novel parallel, distributed algorithm which addresses these problems. The key principle underlying the design of MrAdam is that one can make reasonable decisions in the absence of perfect answers. Indeed, given the classical threshold for minimum support and a user-specified error bound, MrAdam exploits the Chernoff bound to mine "approximate" frequent itemsets with statistical error guarantees on their actual supports. These itemsets are generated in parallel and independently from subsets of the input dataset, by exploiting the MapReduce parallel computation framework. The resulting collections of frequent itemsets from each subset are aggregated and filtered by using a novel technique to produce a single output collection. MrAdam can scale well on gigabytes of data and tens of machines, as experimentally proven on real datasets. In the experiments we also show that the proposed algorithm returns a good statistically bounded approximation of the exact results.
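
The abstract above does not give MrAdam's actual procedure, so the following is only a minimal sketch of the general idea it describes: mine itemsets independently on subsets of the data with a relaxed support threshold, then merge the partial results. It assumes the additive Chernoff-Hoeffding bound for the error estimate, a naive enumeration of itemsets up to a fixed length, and a simple averaging step in place of the paper's own aggregation technique. All function names and parameters (sample_size, local_frequent_itemsets, aggregate, max_len) are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def sample_size(epsilon, delta):
    """Transactions needed so that a single itemset's sample support is
    within epsilon of its true support with probability >= 1 - delta
    (additive Chernoff-Hoeffding bound: n >= ln(2/delta) / (2 * epsilon^2))."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def local_frequent_itemsets(transactions, min_support, epsilon, max_len=2):
    """'Map' step (illustrative): count itemsets of size <= max_len in one
    partition and keep those whose relative support reaches the relaxed
    threshold min_support - epsilon."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for k in range(1, max_len + 1):
            counts.update(combinations(items, k))
    n = len(transactions)
    threshold = min_support - epsilon
    return {iset: c / n for iset, c in counts.items() if c / n >= threshold}


def aggregate(partials, min_support, epsilon):
    """'Reduce' step (an assumption, not the paper's technique): average the
    reported supports over all partitions, treating an unreported itemset as
    support 0, and filter again with the relaxed threshold."""
    totals = Counter()
    for partial in partials:
        for iset, support in partial.items():
            totals[iset] += support
    threshold = min_support - epsilon
    merged = {iset: s / len(partials) for iset, s in totals.items()}
    return {iset: s for iset, s in merged.items() if s >= threshold}


if __name__ == "__main__":
    partitions = [
        [["a", "b"], ["a", "b"], ["a", "c"], ["b"]],
        [["a", "b"], ["a"], ["a", "b", "c"], ["c"]],
    ]
    partials = [local_frequent_itemsets(p, 0.5, 0.1) for p in partitions]
    print(aggregate(partials, 0.5, 0.1))
```

Treating a missing itemset as support 0 makes the merged value a conservative estimate; equal-sized partitions are assumed so that a plain average is meaningful.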

Archivio istituzionale della ricerca - Università di Bari

Proceedings of the Second Workshop on MIning DAta for financial applicationS (MIDAS '17), co-located with the 2017 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery (ECML-PKDD '17)

Author: Caldarelli G
Gullo F
Squartini T
Fumarola F
Bordino I
Publication date: 01/01/2017

IRIS Università degli Studi dell'Aquila

Screening tests for selecting anticancer metal compounds

Author: Fumarola F
Coluccia M.
BOCCARELLI Angelina
PANNUNZIO Alessandra
Vicenti C
Publication date: 01/01/2011

Archivio istituzionale della ricerca - Università di Bari

KD-ASP - A distributed collaborative data mining platform

Author: SALVEMINI E
MALERBA Donato
FUMAROLA F
Publication date: 01/01/2008

Archivio istituzionale della ricerca - Università di Bari

Proceedings of the First Workshop on MIning DAta for financial applicationS (MIDAS 2016) co-located with the 2016 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2016), Riva del Garda, Italy, September 19-23, 2016

Author: Caldarelli G
Gullo F
Squartini T
Fumarola F
Bordino I
Publication date: 01/01/2016

IRIS Università degli Studi dell'Aquila

HyLiEn: a hybrid approach to general list extraction on the web

Author: MALERBA Donato
HAN J.
FUMAROLA F
BARBER R
WENINGER T
Publication date: 01/01/2011

Archivio istituzionale della ricerca - Università di Bari

Unexpected results in automatic list extraction on the web

Author: HAN J
MALERBA Donato
FUMAROLA F
BARBER R
WENINGER T
Publication date: 01/01/2010

The discovery and extraction of general lists on the Web continues to be an important problem facing the Web mining community. There have been numerous studies that claim to automatically extract structured data (e.g., lists, record sets, tables) from the Web for various purposes. Our own recent experiences have shown that the list-finding methods used as part of these larger frameworks do not generalize well and therefore ought to be reevaluated. This paper briefly describes some of the current approaches and tests them on various list pages. Based on our findings, we conclude that analyzing a Web page's DOM structure is not sufficient for the general list-finding task.

Archivio istituzionale della ricerca - Università di Bari

Discovering Novelty Patterns from the Ancient Christian Inscriptions of Rome

Author: PIO GIANVITO
Fumarola F
CECI MICHELANGELO
MALERBA Donato
FELLE Antonio
Publication date: 01/01/2014

Studying Greek and Latin cultural heritage has always been considered essential to the understanding of important aspects of the roots of current European societies. However, only a small fraction of the total production of texts from ancient Greece and Rome has survived up to the present, leaving many gaps in the historiographic records. Epigraphy, which is the study of inscriptions (epigraphs), helps to fill these gaps. In particular, the goal of epigraphy is to clarify the meanings of epigraphs; to classify their uses according to their dating and cultural contexts; and to study aspects of the writing, the writers, and their "consumers." Although several research projects have recently been promoted for digitally storing and retrieving data and metadata about epigraphs, there has been no attempt to apply data mining technologies to discover previously unknown cultural aspects. In this context, we propose to exploit the temporal dimension associated with epigraphs (dating) by applying a data mining method for novelty detection. The main goal is to discover relational novelty patterns, that is, patterns expressed as logical clauses describing significant variations (in frequency) over the different epochs, in terms of relevant features such as language, writing style, and material. As a case study, we considered the set of Inscriptiones Christianae Vrbis Romae stored in Epigraphic Database Bari, an epigraphic repository. Some patterns discovered by the data mining method were easily deciphered by experts since they captured relevant cultural changes, whereas others disclosed unexpected variations, which might be used to formulate new questions, thus expanding the research opportunities in the field of epigraphy.

Archivio istituzionale della ricerca - Università di Bari