International Journal of Digital Curation
    605 research outputs found

    Updating the Data Curation Continuum: not just Data, still focussed on Curation, more about Domains

    The Data Curation Continuum was developed as a way of thinking about data repository infrastructure. Since its original development over a decade ago, a number of things have changed in the data infrastructure domain. This paper revisits the thinking behind the original data curation continuum and updates it to respond to changes in research objects, storage models, and the repository landscape in general.

    Getting to Beta: Building a Model Collection in a World of Digital One-Offs

    Libraries and archives are increasingly producing subject-based digital collections alongside, but separate from, their main digital collections. These smaller projects are often treated as digital one-offs; they are created, launched, promoted, and then largely forgotten. The authors of this study argue that small-scale digital collections instead be treated as test cases for their institutions’ main digitization programs. Because they are lightweight and have relatively low stakes, these collections get pushed through the system quickly and can illuminate its workings and shortcomings in a snapshot form. The authors treat their own experience in developing the Animal Welfare Act History Digital Collection at the National Agricultural Library as a case study in using a digital collection to test and revise an institution’s digitization program. In so doing, this study suggests how agile projects like the AWAHDC can be core components in digital curation policies and their implementation.

    From Passive to Active, From Generic to Focussed: How Can an Institutional Data Archive Remain Relevant in a Rapidly Evolving Landscape?

    Founded in 2008 as an initiative of the libraries of three of the four technical universities in the Netherlands, the 4TU.Centre for Research Data (4TU.Research Data) has provided a fully operational, cross-institutional, long-term archive since 2010, storing data from all subjects in applied sciences and engineering. Presently, over 90% of the data in the archive is geoscientific data coded in netCDF (Network Common Data Form) – a data format and data model that, although generic, is mostly used in climate, ocean and atmospheric sciences. In this practice paper, we explore the question of how 4TU.Research Data can stay relevant and forward-looking in a rapidly evolving research data management landscape. In particular, we describe the motivation behind this question and how we propose to address it.

    Experimental Data Curation at Large Instrument Facilities with Open Source Software

    The National Synchrotron Light Source II (NSLS-II), operating at Brookhaven National Laboratory since 2014 for the US Department of Energy, is one of the newest and brightest storage-ring synchrotron facilities in the world. NSLS-II, like other facilities, provides pre-processing of the raw data and some analysis capabilities to its users. We describe the research collaborations and open source infrastructure developed at large instrument facilities such as NSLS-II for the purpose of curating high-value scientific data along the early stages of the data lifecycle. Data acquisition and curation tasks include storing experiment configuration and detector metadata, and acquiring raw data with infrastructure that converts proprietary instrument formats to industry standards. In addition, we describe a specific effort for discovering sample information at NSLS-II and tracing the provenance of analysis performed on acquired images. We show that curation tasks must be embedded into software along the data lifecycle for effectiveness and ease of use, and that loosely defined collaborations evolve around shared open source tools. Finally, we discuss best practices for experimental metadata capture in such facilities, data access, and the new challenges of scale and complexity posed by AI-based discovery for the synthesis of new materials.

    Are Research Datasets FAIR in the Long Run?

    Currently, initiatives in Germany are developing infrastructure to accept and preserve dissertation data together with the dissertation texts (on state level – bwDATA Diss [1], on federal level – eDissPlus [2]). In contrast to specialized data repositories, these services will accept data from all kinds of research disciplines. To ensure FAIR data principles (Wilkinson et al., 2016), preservation plans are required, because ensuring accessibility, interoperability and re-usability even for a minimum ten year data redemption period can become a major challenge. Both for longevity and re-usability, file formats matter. In order to ensure access to data, the data’s encoding, i.e. their technical and structural representation in the form of file formats, needs to be understood. Hence, due to a fast technical lifecycle, interoperability, re-use and in some cases even accessibility depend on the data’s format and our future ability to parse or render it. This leads to several practical questions regarding quality assurance, potential access options and necessary future preservation steps. In this paper, we analyze datasets from public repositories and apply a file format based long-term preservation risk model to support workflows and services for non-domain specific data repositories.
    [1] BwDATADiss, bw Data for Dissertations: https://www.alwr-bw.de/kooperationen/bwdatadiss/
    [2] EDissPlus, DFG-Project – Electronic Dissertations Plus: https://www2.hu-berlin.de/edissplus

    Remediation Data Management Plans: A Tool for Recovering Research Data from Messy, Messy Projects

    Data Management Plans (DMPs) have been used in the last decade to encourage good data management practices among researchers. DMPs are widely used, preventive tools that encourage good data management practices. DMPs are traditionally used to manage data during the planning stage of the project, often required for grant proposals, and prior to data collection. In this paper we will use a case study to argue that Data Management Plans can be useful in improving the management of the data of research projects that have moved beyond the planning stage of the research life cycle. In particular, we focus on the case of active projects where data has already been collected and is still being analyzed. We discuss the differences and commonalities in structure between preventive Data Management Plans and remedial Data Management Plans, and describe in detail the additional considerations that are needed when writing remedial Data Management Plans: the goals and audience of the document, the data inventory, and an implementation plan.

    Progress in Research Data Services: An international survey of university libraries

    University libraries have played an important role in constructing an infrastructure of support for Research Data Management at an institutional level. This paper presents a comparative analysis of two international surveys of libraries about their involvement in Research Data Services conducted in 2014 and 2018. The aim was to explore how services had developed over this time period, and to explore the drivers and barriers to change. In particular, there was an interest in how far the FAIR data principles had been adopted. Services in nearly every area were more developed in 2018 than before, but technical services remained less developed than advisory. Progress on institutional policy was also evident. However, priorities did not seem to have shifted significantly. Open-ended answers suggested that funder policy, rather than researcher demand, remained the main driver of service development and that resources and skills gaps remained issues. While widely understood as an important reference point and standard, because of their relatively recent publication date, FAIR principles had not been widely adopted explicitly in policy.

    Curating Scientific Workflows for Biomolecular Nuclear Magnetic Resonance Spectroscopy

    This paper describes our recent and ongoing efforts to enhance the curation of scientific workflows to improve reproducibility and reusability of biomolecular nuclear magnetic resonance (bioNMR) data. Our efforts have focused both on developing a workflow management system, called CONNJUR Workflow Builder (CWB), and on refactoring our workflow data model to make use of the PREMIS model for digital preservation. This revised workflow management system will be available through the NMRbox cloud-computing platform for bioNMR. In addition, we are implementing a new file structure which bundles the original binary data files along with PREMIS XML records describing the provenance of the data. These are packaged together using a standardized file archive utility. In this manner, the provenance and data curation information is maintained together with the scientific data. The benefits and limitations of these approaches, as well as future directions, are discussed in this paper.

    Emerging Roles for Optimising Re-Use of Open Government Data

    Secure Data for the Future: A Risk Assessment

    The guarantee of secure and authentic future access to any digital data is a major concern for those who work with data now and those who are responsible for keeping it accessible in the future. There is a wide range of threats to digital data that these people need to take into consideration. The PreservIA project aimed to assess the risks of using analogue 35mm film to store and preserve digital information and to define its strengths and weaknesses for long-term secure preservation of all kinds of digital data. The research project examined the application of the Piql technology to ensure the security, integrity and authenticity of the information stored on a unique storage medium. PiqlFilm has been designed for a life span of 500 years or more, and the research tries to assess how well this solution could maintain the authenticity and availability of the information, independently of internal and external changes in the surrounding environment over time. The research project was designed using a scenario-based approach, and the morphological method of scenario development was used to define a set of scenarios covering the risks to the service. The scenario classes used were accident, technical error, natural disaster, crime, sabotage, espionage, terrorism, armed conflict and nuclear war. A scenario template has been included for the purpose of describing current and future scenarios. The final scenario analysis identified potential vulnerabilities. The paper shows briefly how Piql Preservation Services' holistic preservation approach performs the work, defines a methodology to select the scenarios for the assessment and then studies the vulnerabilities and security challenges of the solution in those scenarios. The project also includes a comparison with other existing storage media to evaluate their robustness to the addressed scenarios in relation to Piql technology.

    522 full texts, 605 metadata records
    Updated in last 30 days.