Centre for Environmental Data Analysis

Centre for Environmental Data Analysis Digital Repository
    1250 research outputs found

    Data journals: building partnerships between publishers and data centres

    This presentation discusses the work done by CEDA and the other NERC data centres to interact with journal publishers for data publication

    The Mind-Map of a Data Scientist

    Data science focuses on issues such as the description of data (metadata); the curation, archiving and management of data; their publication on the internet; their usability; and the legal issues associated with their preservation and use. Data science is also seen as a branch of statistics and artificial intelligence that tries to meet the challenges of processing large data in order to derive valuable insights hidden in the data. This poster encapsulates what we effectively do as data scientists at BADC/CEDA and presents it from the point of view of someone who is new to our work. It was produced by two 15-year-old students who researched what we do by interviewing some of us during their work experience week with us

    Data publication: policies and procedures from the PREPARDE project

    Data are widely acknowledged as a first class scientific output. Increases in researchers’ abilities to create data need to be matched by corresponding infrastructures for them to manage and share their data. At the same time, the quality and persistence of the datasets need to be ensured, providing the dataset creators with the recognition they deserve for their efforts. Formal publication of data takes advantage of the processes and procedures already in place to publish academic articles about scientific results, enabling data to be reviewed and more broadly disseminated. Data are vastly more varied in format than papers, and so the policies required to manage and publish data must take into account the complexities associated with different data types, scientific fields, licensing rules etc. The Peer REview for Publication & Accreditation of Research Data in the Earth sciences (PREPARDE) project is JISC- and NERC-funded, and aims to investigate the policies and procedures required for the formal publication of research data. The project is investigating the whole workflow of data publication, from ingestion into a data repository, through to formal publication in a data journal. To limit the scope of the project, the focus is primarily on the policies required for the Royal Meteorological Society and Wiley’s Geoscience Data Journal, though members of the project team include representatives from the life sciences (F1000Research), and will generalise the policies to other disciplines. PREPARDE addresses key issues arising in the data publication paradigm, such as: what criteria are needed for a repository to be considered objectively trustworthy; how does one peer-review a dataset; and how can datasets and journal publications be effectively cross-linked for the benefit of the wider research community and the completeness of the scientific record? 
To answer these questions, the project is hosting workshops addressing these issues, with interactions from key stakeholders, including data and repository managers, researchers, funders and publishers. The results of these workshops will be presented and further comment and interaction sought from interested parties

    A Simple Method for Eliminating Double Counting in Multi-Model Ensemble Forecasts

    We describe a method for eliminating double counting in multi-model ensemble forecasts. The method involves applying weights to the individual model predictions. The weights are derived from empirically estimated correlations between the outputs from the models in the ensemble, without reference to observations. The weighted ensemble mean can then be used as an improved best-estimate forecast, and the weighted ensemble spread as an improved estimate of the model uncertainty. Additional weights could subsequently be added to reflect agreement (or not) between the ensemble members and observations
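As a rough illustration of the idea in this abstract, the sketch below down-weights models whose outputs are strongly correlated with other ensemble members, using no observations. The abstract does not give the weighting formula, so the rule used here (weight inversely proportional to the sum of absolute correlations with all members) is an assumption chosen for simplicity, and the function names and toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def independence_weights(forecasts):
    """Assumed rule: weight_i is proportional to 1 / sum_j |corr(i, j)|.

    The sum includes the self-correlation (1.0), so a fully independent
    model keeps a raw weight of 1 and duplicated models share theirs.
    """
    raw = []
    for f_i in forecasts:
        overlap = sum(abs(pearson(f_i, f_j)) for f_j in forecasts)
        raw.append(1.0 / overlap)
    total = sum(raw)
    return [w / total for w in raw]


def weighted_ensemble_mean(forecasts, weights):
    """Weighted mean across models at each forecast time."""
    n_times = len(forecasts[0])
    return [
        sum(w * f[t] for w, f in zip(weights, forecasts))
        for t in range(n_times)
    ]


# Toy ensemble: model_b duplicates model_a; model_c is uncorrelated with both.
model_a = [1.0, 2.0, 3.0, 4.0]
model_b = [1.0, 2.0, 3.0, 4.0]
model_c = [1.0, -1.0, -1.0, 1.0]
ensemble = [model_a, model_b, model_c]

weights = independence_weights(ensemble)
ensemble_mean = weighted_ensemble_mean(ensemble, weights)
```

In this toy case the two identical models together receive the same total weight as the single independent model, which is the sense in which double counting is removed. A weighted ensemble spread could be computed in the same way from the weighted squared deviations about `ensemble_mean`.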

    ESA Globsnow: Algorithm Theoretical Basis Document - SWE-algorithm

    Snow Water Equivalent (SWE) Algorithm Theoretical Basis Document; deliverable 06 for the European Space Agency Global Snow Monitoring for Climate Research (GlobSnow) project. The purpose of the document is to give a detailed description of the algorithms used for generating the GlobSnow Snow Water Equivalent (SWE) product. This document presents the algorithm used for producing the diagnostic data set (DDS) of SWE for the GlobSnow-2 project

    HPFELD: Hosted Processing Facility for the Exploitation of Large Datasets

    The era of 'big data' means that data centres are under increasing pressure to hold and support datasets which are much larger than before. The sheer volume of such datasets means that it is becoming impractical for users to be expected to download and store them on their local systems. Even if they could do this, they are then faced with the problem of finding enough local computing resource to process the data in a timely fashion. This is particularly true for Earth Observation (EO) data from satellites. Consequently, these data are effectively unusable by a significant proportion of the user community. A more efficient approach would be to allow the data centre archives themselves to be coupled to processing capability, and made available to users over the internet. In this way, remote users could select a pre-configured algorithm (or upload their own) to run on the dataset. The actual processing would be run on a host system which was 'close' to the data archive, and the results of the processing would be made available to view on-line or download to their local systems. In this system, other complementary datasets for the data centre could be easily incorporated into the processing, such as comparison with different model datasets. The host system would also be able to leverage the power of 'cloud' technologies, with the HPFELD system itself providing the environment in which the processing is performed. The HPFELD project was an attempt to see if existing technologies (such as G-POD, OPeNDAP and OpenID) could be combined to rapidly produce a demonstration system. It was part-funded by the TSB, and was a collaboration between STFC (CEDA) and the commercial companies Magellium and Terradue. The demonstrator system was set up to process METOP IASI L1C and ECMWF data to derive methane, with the aim of making the processing as flexible and easy to use as possible. Both of these datasets are held in the BADC archive (http://badc.nerc.ac.uk).
This system has been used to show the benefits of using this approach when processing very large datasets

    Growing a community analysis platform with JASMIN and the Community Inter-comparison Suite

    CEDA is developing the JASMIN Analysis Platform, a set of software which enables researchers to use a consistent set of tools whether running their analyses at local research institutions or on JASMIN hardware. A central component of the JASMIN Analysis Platform is the Community Inter-comparison Suite (CIS): a high-level analysis tool enabling inter-comparison of diverse atmospheric and EO datasets through the command-line and a Python interface

    Getting credit for your data: data citation and publication

    A poster describing data citation and publication and how to get credit for it. Why should I publish and cite data? It’s good for science: Allows research to be reproduced and verified. Allows data to be found and understood by other researchers. Encourages collaboration and cross-disciplinary work. Maintains the scientific record. It’s good for researchers: Provides attribution and credit for the hard work of creating a dataset. Makes it easier to build in good data management practices. Demonstrates the quality and usefulness of the data (especially for researchers outside the immediate field). Captures information about the data’s impact. It’s good for funders: Reduces money and effort needed to recreate important datasets. Reassures funders that the data resulting from their funding is useful and archived safely. Captures information about the funder’s impact. It’s good for data repositories: Encourages researchers to deposit data where it can be archived and curated. Papers linked to the data are an excellent source of metadata. Captures information about the repository’s impact

    Minutes of the 43rd Natural Environment Research Council (NERC) Mesosphere-Stratosphere-Troposphere (MST) Radar Facility Experimenters' Meeting

    Meeting date: Thursday 25th June 2009; Meeting location: The Cosener's House, Abingdon, UK; Meeting agenda: 1) Minutes of the previous meeting 2) Matters arising 3) Facility Report 4) NERC Instrument Report 5) Guest Instrument Report 6) Science and Technical Presentations 7) Any Other Business

    Guidelines for Data Publication: Outputs from the PREPARDE project (public) SPM1.38, Fri 12 April, 12:15–13:15, R3

    At this meeting, the PREPARDE project (http://www2.le.ac.uk/projects/preparde) will present draft guidelines for data publication processes and will solicit feedback on them from members of the community. These processes include peer-review of data, data repository accreditation and cross-linking and workflows for data publication

    1,084 full texts; 1,250 metadata records