DataPACT: compliance by design of data/AI operations and pipelines

Author: Palmonari Matteo
Roman Dumitru
Prodan Radu
Musidlowska Marta
Konstantinidis George
Publication date: 2025

DataPACT is a key initiative that develops novel tools and methodologies for efficient, compliant, ethical, and sustainable data/AI operations and pipelines. It contributes to their design, implementation, and management by embedding compliance, privacy, and environmental sustainability at the core of their design. It delivers compliance-by-design for data/AI operations and pipelines through innovative technical tools (Compliance Toolbox) and supporting methodologies (Compliance Framework) for assessing and realizing compliance in data/AI pipelines that are designed, deployed, and executed with a set of management tools and techniques (Compliance-aware Data/AI Pipeline Toolbox). This paper presents an overview of DataPACT, focusing on motivation, methodology, and use cases.

Southampton (e-Prints Soton)

LearningToAdapt with word embeddings: domain adaptation of named entity recognition systems

Author: Nozza Debora
Fersini Elisabetta
Palmonari Matteo
Messina Enza
Manchanda Pikakshi
Publication date: 01/01/2021

Named Entity Recognition (NER) aims to identify named entities in a given text and classify them into pre-defined domain entity types such as persons, organizations, and locations. Most existing NER systems use generic entity type classification schemas; however, comparing and integrating the (more or less) different entity types used by different NER systems is a complex problem even for human experts. In this paper, we propose a supervised approach called L2AWE (Learning To Adapt with Word Embeddings), which aims at adapting a NER system trained on a source classification schema to a given target one. In particular, we validate the hypothesis that the embedding representation of named entities can improve the semantic meaning of the feature space used to perform the adaptation from a source to a target domain. The results obtained on benchmark datasets of informal text show that L2AWE not only outperforms several state-of-the-art models, but is also able to tackle errors and uncertainties introduced by NER systems.

Archivio istituzionale della Ricerca - Bocconi

Understanding the structure of knowledge graphs with ABSTAT profiles

Author: Palmonari Matteo
Alva Principe Renzo Arturo
Rula Anisa
Spahiu Blerina
Publication date: 01/01/2024

Archivio istituzionale della ricerca - Università di Brescia

Cross-lingual link discovery with TR-ESA

Author: Palmonari Matteo
Semeraro Giovanni
Narducci Fedelucio
Publication date: 01/01/2017

Cross-lingual data linking is the problem of establishing links between resources, such as places, services, or movies, which are described in different languages. In cross-lingual data linking, very short descriptions often have to be matched, which makes the problem even more challenging. This work presents a method named TRanslation-based Explicit Semantic Analysis (TR-ESA) to represent and match short textual descriptions available in different languages. TR-ESA translates short descriptions in any given language into a pivot language by exploiting a machine translation tool. Then, it generates a Wikipedia-based representation of the translated text by using the Explicit Semantic Analysis technique. The resulting representations are used to match short descriptions in different languages. The method is incorporated in CroSeR (Cross-lingual Service Retrieval), an interactive data linking tool that recommends potential matches to users. We compared the results of an in-vitro evaluation on a gold standard consisting of five datasets in different languages with an in-vivo experiment that involved human experts supported by CroSeR. The in-vivo evaluation confirmed the results of the in-vitro evaluation and the overall effectiveness of the proposed method.

Crossref

Politecnico di Bari - Catalogo di prodotti della Ricerca

Archivio istituzionale della ricerca - Università di Bari

Capturing the Age of Linked Open Data: Towards a Dataset-Independent Framework

Author: Palmonari Matteo
Maurino Andrea
Rula Anisa
Publication date: 01/01/2012

An increasing amount of data is published and consumed on the Web according to the Linked Data paradigm. In such a scenario, understanding whether the data consumed are up-to-date is crucial. Outdated data are usually considered inappropriate for many crucial tasks, such as making the consumer confident that answers returned to a query are still valid at the time the query is formulated. In this paper we present a first dataset-independent framework for assessing the currency of Linked Open Data (LOD) graphs. Starting from the analysis of the 8,713,282 triples containing temporal metadata in the Billion Triple Challenge 2011 dataset, we investigate which vocabularies are used to represent versioning metadata and define Onto Currency, an ontology that integrates the most frequent properties used in this domain and supports the collection of metadata from datasets that use different vocabularies. The proposed framework uses this ontology to assess the currency of an RDF graph (or statement) by extrapolating it from the currency of the documents that describe the resources occurring in the graph (or statement). The approach has been implemented and evaluated in two different scenarios. © 2012 IEEE

Crossref

Archivio istituzionale della ricerca - Università di Brescia

On the Diversity and Availability of Temporal Information in Linked Open Data

Author: Stadtmüller Steffen
Maurino Andrea
Palmonari Matteo
Harth Andreas
Rula Anisa
Publication date: 01/01/2012

An increasing amount of data is published and consumed on the Web according to the Linked Data paradigm. In consideration of both publishers and consumers, the temporal dimension of data is important. In this paper we investigate the characterisation and availability of temporal information in Linked Data at large scale. Based on an abstract definition of temporal information we conduct experiments to evaluate the availability of such information using the data from the 2011 Billion Triple Challenge (BTC) dataset. Focusing in particular on the representation of temporal meta-information, i.e., temporal information associated with RDF statements and graphs, we investigate the approaches proposed in the literature, performing both a quantitative and a qualitative analysis and proposing guidelines for data consumers and publishers. Our experiments show that the amount of temporal information available in the LOD cloud is still very small; several different models have been used on different datasets, with a prevalence of approaches based on the annotation of RDF documents. © 2012 Springer-Verlag Berlin Heidelberg

Crossref

Repository KITopen

Archivio istituzionale della ricerca - Università di Brescia