SASWeb 2012: Semantic and Adaptive Social Web

Author: Lora Aroyo
Federica Cena
Antonina Dattolo
Pasquale Lops
Julita Vassileva
Publication date: 01/01/2012

SASWeb 2012: Semantic and Adaptive Social Web, organized by Lora Aroyo, Federica Cena, Antonina Dattolo, Pasquale Lops, Julita Vassileva
(1) Building multi-layer social knowledge maps with Google Maps API. MinEr Liang, Julio Guerra, Peter Brusilovsky
(2) Learning from a network of peers via peer-driven adjustment of a corpus. John Champaign, Robin Cohen
Invited Talks
(4) Culture in User Modeling 3.0. Jacqueline Bourdeau
(5) Leveraging social and semantic components in adaptive environments. Cristina Gena
(6) Meaning is its use: towards the use of distributional semantics for content-based recommender systems. Cataldo Musto
(7) Exploring folksonomies for adaptive query expansion. Fabio Gasparett

Archivio istituzionale della ricerca - Università degli Studi di Udine

30 sentences annotated by 15 crowd workers (3/3)

Author: Lora Aroyo (401491)
Publication date: 2013

These are 30 sentences annotated by 15 crowd workers each, within the context of the project Crowd Watson (http://crowd-watson.nl) for medical relation extraction. Project members: Chris Welty (IBM Research), Lora Aroyo (VU University Amsterdam),

The Francis Crick Institute

30 sentences annotated by 15 crowd workers (2/3)

Author: Lora Aroyo (401491)
Publication date: 2013

The Francis Crick Institute

30 sentences annotated by 15 crowd workers (1/3)

Author: Lora Aroyo (401491)
Publication date: 2013

These are 30 sentences annotated by 15 crowd workers each, within the context of the project Crowd Watson (http://crowd-watson.nl) for medical relation extraction. Project members: Chris Welty (IBM Research), Lora Aroyo (VU University Amsterdam),

The Francis Crick Institute

First International Workshop on User Interfaces for Crowdsourcing and Human Computation

Author: Paolo Cremonesi
Alessandro Bozzon
Lora Aroyo
Publication date: 01/01/2014

Recent years have witnessed an explosion in the number and variety of data crowdsourcing initiatives. From OpenStreetMap to Amazon Mechanical Turk, developers and practitioners have been striving to create user interfaces able to effectively and efficiently support the creation, exploration, and analysis of crowdsourced information. The extensive use of crowdsourcing techniques brings a major change of paradigm with respect to traditional user interfaces for data collection and exploration, as effectiveness, speed, and interaction quality concerns play a central role in supporting very demanding incentives, including monetary ones. The First International Workshop on User Interfaces for Crowdsourcing and Human Computation (CrowdUI 2014), co-located with the AVI 2014 conference, brought together researchers and practitioners from a wide range of areas interested in discussing the user interaction challenges posed by crowdsourcing systems. © 2014 ACM

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref

FrameNet Semantic Frame Disambiguation with CrowdTruth

Author: Lora Aroyo (6903701)
Chris Welty (430979)
Anca Dumitrache (5283037)
Publication date: 26/10/2018

This repository contains a ground truth corpus for semantic frame disambiguation, acquired with crowdsourcing and processed with CrowdTruth (http://crowdtruth.org/) metrics that capture ambiguity in annotations by measuring inter-annotator disagreement. The dataset contains annotations for 433 sentence-word pairs from the FrameNet corpus v.1.7 (https://framenet.icsi.berkeley.edu/), with each sentence-word pair annotated for frame disambiguation by 15 workers. The crowdsourced data was collected from Amazon Mechanical Turk (https://www.mturk.com/).

The corpus has been referenced in the following paper:
- Anca Dumitrache, Lora Aroyo and Chris Welty: Capturing and Interpreting Ambiguity in Crowdsourcing Frame Disambiguation (https://arxiv.org/abs/1805.00270). HCOMP 2018 (https://www.humancomputation.com/2018/).

To replicate the data processing from the paper, use the Jupyter Notebook file "CrowdTruth metrics.ipynb". It requires the installation of the CrowdTruth metrics Python package (https://github.com/CrowdTruth/CrowdTruth-core), v >= 2.0. The data aggregated with CrowdTruth metrics is available in folder data/output/. The raw crowdsourcing data is available in folder data/input/.

If you find this data useful in your research, please consider citing:

    @inproceedings{dumitrache2018frames,
      Author = {Anca Dumitrache and Lora Aroyo and Chris Welty},
      Title = {Capturing Ambiguity in Crowdsourcing Frame Disambiguation},
      Booktitle = {The sixth AAAI Conference on Human Computation and Crowdsourcing},
      Year = {2018}
    }

ZENODO

The Francis Crick Institute

CrowdTruth 2.0: Quality metrics for crowdsourcing with disagreement

Author: Dumitrache Anca
Timmermans Benjamin
Welty Chris
Aroyo Lora
Inel Oana
Publication date: 01/01/2018

Typically, crowdsourcing-based approaches to gathering annotated data use inter-annotator agreement as a measure of quality. However, in many domains, there is ambiguity in the data, as well as a multitude of perspectives on the information examples. In this paper, we present ongoing work on the CrowdTruth metrics, which capture and interpret inter-annotator disagreement in crowdsourcing. The CrowdTruth metrics model the inter-dependency between the three main components of a crowdsourcing system: worker, input data, and annotation. The goal of the metrics is to capture the degree of ambiguity in each of these three components. The metrics are available online at https://github.com/CrowdTruth/CrowdTruth-core.

VU Research Portal

Validation Methodology for Expert-Annotated Datasets: Event Annotation Case Study

Author: Aroyo Lora
Inel Oana
Publication date: 01/01/2019

Event detection is still a difficult task due to the complexity and the ambiguity of such entities. On the one hand, we observe a low inter-annotator agreement among experts when annotating events, regardless of the multitude of existing annotation guidelines and their numerous revisions. On the other hand, event extraction systems have a lower measured performance in terms of F1-score compared to other types of entities such as people or locations. In this paper, we study the consistency and completeness of expert-annotated datasets for events and time expressions. We propose a data-agnostic validation methodology for such datasets in terms of consistency and completeness. Furthermore, we combine the power of crowds and machines to correct and extend expert-annotated datasets of events. We show the benefit of using crowd-annotated events to train and evaluate a state-of-the-art event extraction system. Our results show that the crowd-annotated events increase the performance of the system by at least 5.3%.

VU Research Portal

TU Delft Repository

DROPS Dagstuhl Research Online Publication Server

Nichesourcing for Improving Access to Linked Cultural Heritage Datasets

Author: Dijkshoorn C.R.
Publication date: 01/01/2019

Schreiber, A.T. [Promotor]; Aroyo, L.M. [Promotor]; Boer, V. de [Copromotor]

VU Research Portal

CrowdTruth Corpus for Open Domain Relation Extraction from Sentences

Author: Lora Aroyo (6903701)
Chris Welty (430979)
Anca Dumitrache (5283037)
Publication date: 26/10/2018

This repository contains a ground truth corpus for open domain relation extraction from sentences, acquired with crowdsourcing and processed with CrowdTruth (http://crowdtruth.org/) metrics that capture ambiguity in annotations by measuring inter-annotator disagreement. The dataset contains annotations for 4,100 sentences sampled from Angeli et al. (1) and Riedel et al. (2), over 16 relations, with each sentence annotated by 15 workers. The sentences have been pre-processed with Distant Supervision (3) using the Freebase knowledge base, in order to identify the term pairs in each sentence that are likely to express a relation. The crowdsourced data was collected from Figure Eight (http://figure-eight.com/) and Amazon Mechanical Turk (https://www.mturk.com/).

This corpus has been discussed in the following papers:
- Anca Dumitrache, Lora Aroyo and Chris Welty: Crowdsourcing Semantic Label Propagation in Relation Classification (https://arxiv.org/abs/1809.00537). FEVER (http://fever.ai/) Workshop at EMNLP 2018 (http://emnlp2018.org/).
- Anca Dumitrache, Lora Aroyo and Chris Welty: False Positive and Cross-relation Signals in Distant Supervision Data (https://arxiv.org/abs/1711.05186). AKBC (http://www.akbc.ws/) Workshop at NIPS 2017 (http://nips.cc/).
- Anca Dumitrache, Lora Aroyo and Chris Welty: Disagreement in Crowdsourcing and Active Learning for Better Distant Supervision Quality (http://crowdtruth.org/wp-content/uploads/2017/03/collint17-open-domain.pdf). Collective Intelligence 2017 (http://collectiveintelligenceconference.org/).

Sentence-level data is available in file: data/output/aggregated_sentences.csv
Worker-level data is available in file: data/output/aggregated_workers.csv
Raw crowdsourcing data is available in folder: data/input/
Results of the relation classification model are available in folder: data/model_results/

References
(1) Angeli, Gabor, et al. "Combining distant and partial supervision for relation extraction." Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2014.
(2) Riedel, Sebastian, et al. "Relation extraction with matrix factorization and universal schemas." Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL). 2013.
(3) Mintz, Mike, et al. "Distant supervision for relation extraction without labeled data." Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2. Association for Computational Linguistics, 2009.

ZENODO

The Francis Crick Institute