Search CORE

1,721,622 research outputs found

Victoria Stodden: Scholarly Communication in the Era of Big Data and Big Computation

Author: Stodden Victoria
Publication date: 02/11/2015

Victoria Stodden gave the keynote address for Open Access Week 2015. "Scholarly communication in the era of big data and big computation" was sponsored by the University Libraries, Computational Modeling and Data Analytics, the Department of Computer Science, the Department of Statistics, the Laboratory for Interdisciplinary Statistical Analysis (LISA), and the Virginia Bioinformatics Institute. Victoria Stodden is an associate professor in the Graduate School of Library and Information Science at the University of Illinois at Urbana-Champaign. She completed both her PhD in statistics and her law degree at Stanford University. Her research centers on the multifaceted problem of enabling reproducibility in computational science. This includes studying adequacy and robustness in replicated results, designing and implementing validation systems, developing standards of openness for data and code sharing, and resolving legal and policy barriers to disseminating reproducible research.

Sponsors: Virginia Tech. University Libraries; Virginia Tech. Division of Computational Modeling and Data Analytics; Virginia Tech. Department of Computer Science; Virginia Tech. Department of Statistics; Virginia Tech. Laboratory for Interdisciplinary Statistical Analysis (LISA); Virginia Bioinformatics Institute

VTech Works (Virginia Tech)

DEplain-APA

Author: Stodden Regina ; Momen Omar ; Kallmeyer Laura
Publication date: 01/01/2023

DEplain: A corpus for German Text Simplification

This repository contains the corpus called DEplain-APA for German text simplification (document and sentence simplification). The corpus contains Austrian news text provided by the APA - Austria Presse Agentur eG. All of the sentence-wise aligned pairs (complex-simple) are manually aligned. The following table summarizes the most important meta data of the corpus.

meta data                                    value
language                                     DE-AT (Austrian German)
domain                                       news
source language level                        B1
target language level                        A2
# document pairs (total, train/dev/test)     483 (387/48/48)
# sentence pairs (total, train/dev/test)     13,122 (10,660/1,231/1,231)
# complex sentences                          25,607
# simple sentences                           26,471

For more information, please have a look at our paper. If you use this corpus, please also cite our paper and name APA - Austria Presse Agentur eG as data provider:

Regina Stodden, Omar Momen, and Laura Kallmeyer. 2023. DEplain: A German Parallel Corpus with Intralingual Translations into Plain Language for Sentence and Document Simplification. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 16441–16463, Toronto, Canada. Association for Computational Linguistics.

ZENODO

Publications at Bielefeld University

Trust Your Science? Open Your Data and Code

Author: Stodden Victoria C.
Publication date: 01/01/2011

A view on the reproducibility of computational science by Victoria Stodden. It covers the reproducibility, replicability, and repeatability of code created in other scientific fields. Stodden also discusses the rising prominence of computational science in the digital age and what that means for the future of science and data collection.

Columbia University Academic Commons

MASSIVE DATA, THE DIGITIZATION OF SCIENCE, AND REPRODUCIBILITY OF RESULTS

Author: Stodden Victoria
Publication date: 01/01/2010

As the scientific enterprise becomes increasingly computational and data-driven, the nature of the information communicated must change. Without inclusion of the code and data with published computational results, we are engendering a credibility crisis in science. Controversies such as ClimateGate, the microarray-based drug sensitivity clinical trials under investigation at Duke University, and retractions from prominent journals due to unverified code suggest the need for greater transparency in our computational science. In this talk I argue that the scientific method be restored to (1) a focus on error control as central to scientific communication and (2) complete communication of the underlying methodology producing the results, i.e., reproducibility. I outline barriers to these goals based on recent survey work (Stodden 2010), and suggest solutions such as the “Reproducible Research Standard” (Stodden 2009), giving open licensing options designed to create an intellectual property framework for scientists consonant with longstanding scientific norms.

CERN Document Server

How Technology Is (Rapidly) Expanding the Scope of the Law in Statistics

Author: Stodden Victoria C.
Publication date: 01/01/2011

PowerPoint presentation on how technology is expanding the scope of the law in statistics. Stodden discusses policy for the growing enterprise of computational science, updating the scientific method, methods for code sharing and licensing (such as Creative Commons), and how an updated scientific method would influence reproducibility.

Columbia University Academic Commons

Reproducibility in Computational Science: Framing the Concept

Author: Stodden Victoria C.
Publication date: 01/01/2011

PowerPoint presentation by Victoria Stodden on “Reproducibility in Computational Science”. It covers definitions of reproducibility, implementations of the scientific method in different fields, and how these apply to policy makers, journal editors, and grant-awarding agencies such as the NSF.

Columbia University Academic Commons

Data Management and Sharing Policies in the NSF and the NIH

Author: Stodden Victoria C.
Publication date: 01/01/2011

A PowerPoint presentation on data management and sharing policies at the NSF and the NIH. Victoria Stodden explains the impact of computational methods as a central part of the scientific enterprise, how the scientific method should be updated, the role policy plays in the NSF guidelines, and how data should be shared and protected under congressional policy.

Columbia University Academic Commons

The Credibility Crisis and Computational Science: Accountability and Public Health

Author: Stodden Victoria C.
Publication date: 01/01/2011

PowerPoint presentation on “The Credibility Crisis and Computational Science: Accountability and Public Health”. Victoria Stodden discusses policy for the growing enterprise of computational science, updating the scientific method, methods for code sharing and licensing (such as Creative Commons), and how an updated scientific method would influence reproducibility.

Columbia University Academic Commons

Innovation and Growth through Open Access to Scientific Research: Three Ideas for High-Impact Rule Changes

Author: Stodden Victoria C.
Publication date: 01/01/2011

A paper on data policies in which Victoria Stodden explores the framing principles that should apply to reproducing computational research and results, and how those principles should guide scientific policy in the digital age.

Columbia University Academic Commons

Scientific Practice Today and the Scientific Method: Responding to the Credibility Crisis

Author: Stodden Victoria C.
Publication date: 01/01/2011

PowerPoint presentation on current scientific practice and the scientific method in computational science. Stodden discusses policy for the growing enterprise of computational science, updating the scientific method, methods for code sharing and licensing (such as Creative Commons), and how an updated scientific method would influence reproducibility.

Columbia University Academic Commons