International Journal of Digital Curation

605 research outputs found

Leveraging Existing Technology: Developing a Trusted Digital Repository for the U.S. Geological Survey

Author: Hutchison Vivian B.
Norkin Tamar
Langseth Madison L.
Ignizio Drew A.
Zolly Lisa S.
McClees-Funinan Ricardo
Liford Amanda
Publication venue: University of Edinburgh
Publication date: 11/07/2021

As Federal Government agencies in the United States pivot to increase access to scientific data (Sheehan, 2016), the U.S. Geological Survey (USGS) has made substantial progress (Kriesberg et al., 2017). USGS authors are required to make federally funded data publicly available in an approved data repository (USGS, 2016b). This type of public data product, known as a USGS data release, serves as a method for publishing reviewed and approved data. In this paper, we present major milestones in the approach the USGS took to transition an existing technology platform to a Trusted Digital Repository. We describe both the technical and the non-technical actions that contributed to a successful outcome. We highlight how initial workflows revealed patterns that were later automated, and the ways in which assessments and user feedback influenced design and implementation. The paper concludes with lessons learned, such as the importance of a community of practice, application programming interface (API)-driven technologies, iterative development, and user-centered design. This paper is intended to offer a potential roadmap for organizations pursuing similar goals.

“You say potato, I say potato” Mapping Digital Preservation and Research Data Management Concepts towards Collective Curation and Preservation Strategies

Author: Lindlar Michelle
Rudnik Pia
Jones Sarah
Horton Laurence
Publication venue: University of Edinburgh
Publication date: 09/08/2020

This paper explores models, concepts and terminology used in the Research Data Management and Digital Preservation communities. In doing so, we identify several overlaps and mutual concerns where the advancements of one professional field can apply to and assist another. By focusing on what unites rather than divides us, and by adopting a more holistic approach, we advance towards collective curation and preservation strategies.

Assessing Metadata and Curation Quality: a Case Study from the Development of a Third-Party Curation Service at Springer Nature

Author: Grant Rebecca
Smith Graham
Hrynaszkiewicz Iain
Publication venue: University of Edinburgh
Publication date: 02/01/2020

Since 2017, the publisher Springer Nature has provided an optional Research Data Support service to help researchers deposit and curate data that support their peer-reviewed publications. This service builds on a Research Data Helpdesk, which since 2016 has provided support to authors and editors who need advice on the options available for sharing their research data. In this paper, we describe a short project which aimed to facilitate an objective assessment of metadata quality, undertaken during the development of a third-party curation service for researchers (Research Data Support). We provide details on the single-blind user testing that was undertaken, and the results gathered during this experiment. We also briefly describe the curation services which have been developed and introduced following an initial period of testing and piloting.

Towards a Risk Catalogue for Data Management Plans

Author: Weng Franziska
Thoben Stella
Publication venue: University of Edinburgh
Publication date: 31/12/2020

Although data management and its careful planning are not new topics, there is little literature on risk mitigation in data management plans (DMPs). We consider it a problem that DMPs do not include a structured approach for the identification or mitigation of risks, because such an approach would instil confidence and trust in the data and its stewards, and foster the successful conduct of data-generating projects, which are often funded research projects. In this paper, we present a lightweight approach for identifying general risks in DMPs. We introduce an initial version of a generic risk catalogue for funded research and similar projects. By analysing a selection of 13 DMPs for projects from multiple disciplines published by the Research Ideas and Outcomes (RIO) journal, we demonstrate that our approach is applicable to DMPs and transferable to multiple institutional constellations. As a result, the effort for integrating risk management in data management planning can be reduced.

Archivists Managing Research Data? a Survey of Irish Organisations

Author: Grant Rebecca
Publication venue: University of Edinburgh
Publication date: 31/12/2020

This paper describes a survey undertaken in 2017 to establish which research data management policies and practices were in place at Irish organisations; the extent to which archivists and records managers were employed to manage research data at those organisations; and the impact that archival skills have on research data management at an organisation. The paper describes the survey methods and data analysis, and presents findings including the presence of archivists and records managers at more than half of the surveyed organisations. Next steps for the research are also outlined.

Updating the DCC Curation Lifecycle Model

Author: Choudhury Sayeed
Huang Caihong
Palmer Carole L.
Publication venue: University of Edinburgh
Publication date: 31/12/2020

The DCC Curation Lifecycle Model has played a vital role in the field of data curation for over a decade. During that time, the scale and complexity of data have changed dramatically, along with the contexts of data production and use. This paper reports on a study examining factors impacting data curation practices and presents recommendations for updating the DCC Curation Lifecycle Model. The study was grounded in a review of other lifecycle models and informed by a site visit to the Digital Curation Centre and consultation with expert practitioners and researchers. Framed by contemporary conditions impacting the conduct of research and provision of data services, the analysis and proposed recommendations account for the prominence of machine-actionable data, the importance of machine learning for data processing and analytics, growth of integrated research workflows, and escalating concerns with fairness, accountability, and transparency of data and algorithms.

Complementary Data as Metadata: Building Context for the Reuse of Video Records of Practice

Author: Tyler Allison Rae Bobyak
Suzuka Kara
Yakel Elizabeth
Publication venue: University of Edinburgh
Publication date: 01/11/2020

Data reuse is often dependent on context external to the data. At times, this context is actually additional data that helps data reusers better assess and/or understand the target data upon which they are focused. We refer to these data as complementary data and define these as data external to the target data which could be used as evidence in their own right. In this paper, we focus specifically on video records of practice in education. Records of practice are a type of data that more broadly document events surrounding teaching and learning. Video records of practice are an interesting case of data reuse as they can be extensive (e.g., days or weeks of video of a classroom), result in large file sizes, and require both metadata and other complementary data in order for reusers to understand the events depicted in the video. Through our mixed methods study, consisting of a survey of data reusers in 4 repositories and 44 in-depth interviews, we identified the types of complementary data that assist reusers of video records of practice for teaching and/or research. While there were similarities in the types of complementary data identified as important to have when reusing video records of practice (VROP), the rationales and motivations for seeking out particular complementary data differed depending on whether the intended use was for teaching or research. While metadata is an important and valuable means of describing data for reuse, data’s meaning is often constructed through comparison, verification, or elucidation in reference to other data.

Cross-tier Web Programming for Curated Databases: a Case Study

Author: Fowler Simon
Harding Simon
Sharman Joanna
Cheney James
Publication venue: University of Edinburgh
Publication date: 30/07/2020

Curated databases have become important sources of information across several scientific disciplines, and as the result of manual work of experts, often become important reference works. Features such as provenance tracking, archiving, and data citation are widely regarded as important features for the curated databases, but implementing such features is challenging, and small database projects often lack the resources to do so. A scientific database application is not just the relational database itself, but also an ecosystem of web applications to display the data, and applications which allow data curation. Supporting advanced curation features requires changing all of these components, and there is currently no way to provide such capabilities in a reusable way. Cross-tier programming languages have been proposed to simplify the creation of web applications, where developers can write an application in a single, uniform language. Consequently, database queries and updates can be written in the same language as the rest of the program, and at least in principle, it should be possible to provide curation features reusably via program transformations. As a first step towards this goal, it is important to establish that realistic curated databases can be implemented in a cross-tier programming language. In this paper, we describe such a case study: reimplementing the web front end of a real world scientific database, the IUPHAR/BPS Guide to Pharmacology (GtoPdb), in the Links cross-tier programming language. We show how programming language features such as language-integrated query simplify the development process, and rule out common errors. Through a comparative performance evaluation, we show that the Links implementation performs fewer database queries, while the time needed to handle the queries is comparable to the Java version. 
Furthermore, while there is some overhead to using Links because of its relative immaturity compared to Java, the Links version is usable as a proof-of-concept case study of cross-tier programming for curated databases. [This paper is a conference pre-print presented at IDCC 2020 after lightweight peer review. The most up-to-date version of the paper can be found on arXiv: https://arxiv.org/abs/2003.03845]

Facilitating Access to Restricted Data: Operationalizing Trust in Data Users

Author: Tyler Allison Rae Bobyak
Publication venue: University of Edinburgh
Publication date: 22/07/2020

The decision to allow users access to restricted and protected data is based on the development of trust in the user by data repositories. In this article, I propose a model of the process of trust development at restricted data repositories, a model which emphasizes the increasing levels of trust dependent on prior interactions between repositories and users. I find that repositories develop trust in their users through the interactions of four dimensions – promissory, experience, competence, and goodwill – that consider distinct types of researcher expertise and the role of a researcher’s reputation in the trust process. However, the processes used by repositories to determine a level of trust corresponding to data access are inconsistent and do not support the sharing of trusted users between repositories to maximize efficient yet secure access to restricted research data. I highlight the role of a researcher’s reputation as an important factor in trust development and trust transference, and discuss the implications of modelling the restricted data access process as a process of trust development.

Access Some Areas: Reforming Access Categories for Data in a Social Science Data Archive

Author: Horton Laurence
Perry Anja
Publication venue: University of Edinburgh
Publication date: 31/12/2020

In this paper we outline the process of revising data access categories for research data sets in GESIS – a large European social science data archive based in Germany. The challenge is to create a minimal set of workable access conditions that a) facilitate “as open as possible, closed as necessary” expectations for data reuse; and b) map onto existing legacy access categories and conditions in a data archive. The paper covers the work done in gathering data on data access categories used by data archives in their existing data catalogues, the choices offered to depositors of data in their user agreements, and work done by other data reuse platforms in categorising access to their data. Finally, we talk through the process of refining a minimal set of data access conditions for the GESIS data archive.

522 full texts; 605 metadata records. Updated in the last 30 days.
