CEDA Repository


1250 research outputs found


Environmental Data Archival: Practices and Benefits

Authors: Parton Graham A; Callaghan Sarah
Publication date: 03/07/2013

Presentation on environmental data management given at the Problems in Information Sciences: Scholarly E-Publishing summer school. The presentation demonstrates the importance of active archiving of data to ensure that published science remains underpinned by its data, permitting transparency and reproducibility of the work. It also shows the value-added aspect of repositories and highlights common problems encountered by data scientists.

A Simple Method for Eliminating Double Counting in Multi-Model Ensemble Forecasts

Author: Jewson Stephen
Publication date: 05/07/2013

We describe a method for eliminating double counting in multi-model ensemble forecasts. The method involves applying weights to the individual model predictions. The weights are derived from empirically estimated correlations between the outputs from the models in the ensemble, without reference to observations. The weighted ensemble mean can then be used as an improved best-estimate forecast, and the weighted ensemble spread as an improved estimate of the model uncertainty. Additional weights could subsequently be added to reflect agreement (or not) between the ensemble members and observations
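The abstract above does not give the weighting formula, so the following is only a minimal sketch of the general idea under stated assumptions: each model is down-weighted by the sum of its (non-negative) correlations with every ensemble member, itself included, and the weights are normalised to sum to one. The function names `pearson`, `redundancy_weights` and `weighted_mean_and_spread`, the clamping of negative correlations to zero, and the inverse-sum rule itself are illustrative assumptions, not the author's method.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series (neither may be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def redundancy_weights(outputs):
    """Hypothetical rule: weight_i is proportional to 1 / sum_j max(0, corr_ij).

    `outputs` is a list of per-model series of equal length. No observations
    are used; only correlations between the model outputs themselves.
    """
    m = len(outputs)
    raw = []
    for i in range(m):
        # The self-correlation (1.0) is included, so the sum is at least 1.
        overlap = sum(
            max(0.0, pearson(outputs[i], outputs[j])) for j in range(m)
        )
        raw.append(1.0 / overlap)
    total = sum(raw)
    return [r / total for r in raw]


def weighted_mean_and_spread(outputs, weights):
    """Weighted ensemble mean and weighted standard deviation at each step."""
    means, spreads = [], []
    for t in range(len(outputs[0])):
        values = [series[t] for series in outputs]
        mu = sum(w * v for w, v in zip(weights, values))
        var = sum(w * (v - mu) ** 2 for w, v in zip(weights, values))
        means.append(mu)
        spreads.append(sqrt(var))
    return means, spreads


# Two identical models plus one model uncorrelated with them (made-up numbers).
demo = [
    [1.0, 2.0, 3.0, 4.0],
    [1.0, 2.0, 3.0, 4.0],
    [1.0, -1.0, -1.0, 1.0],
]
print(redundancy_weights(demo))
```

For this made-up input the two identical models share the weight a single model would receive (0.25 each), while the uncorrelated model gets 0.5, which is the sense in which double counting is removed.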

Getting credit for your data: data citation and publication

Author: Callaghan Sarah
Publication date: 2013

A question-and-answer poster discussing data citation and publication.

The CHARMe Project: Commentary Metadata for EO Datasets

Author: Marsh Kevin
Publication date: 12/09/2013

A major impediment to the wider use of EO and climate data is the difficulty users face in judging whether these data are 'fit for purpose', particularly as the data are now being used for increasingly diverse applications. Different users require different kinds of supporting information; we term this 'Commentary' metadata, which includes both quantitative and non-quantitative metadata. This can include: 1. Post-fact annotations, e.g. citations, ad-hoc comments and notes; 2. Results of assessments, e.g. validation campaigns, intercomparisons with models or other observations, reanalysis, quantitative error assessments; 3. Provenance, e.g. dependencies on other datasets, processing algorithms and chain, data source; 4. Properties of data distribution, e.g. data policy and licensing, timeliness (is the data delivered in real time?), reliability; 5. External events that may affect the data, e.g. volcanic eruptions, El Niño index, satellite or instrument failure, operational changes to the orbit calculations. As yet, there is no robust and consistent mechanism to link Commentary metadata to the datasets themselves. CHARMe ("Characterization of metadata to allow high-quality climate applications and services") will provide these essential links by creating a repository of Commentary metadata plus a set of interfaces through which users can interrogate the information over the Internet. This will provide robust and reusable frameworks for linking datasets with Commentary metadata, reusable software tools that allow climate scientists and users to exploit this information in their own applications, and improved search, intercomparison and time-series analysis tools for large and diverse datasets. The project consortium encompasses data providers, scientists, and developers of future climate services, who participate in major European investments such as GMES, ERA-Clim, ESA's Climate Change Initiative, the Climate Satellite Applications Facility and EURO4M. This will ensure that the CHARMe system is suited to the needs of diverse EO user groups.

The Mind-Map of a Data Scientist

Authors: Valdivieso Carlota; Perry Rebecca; da Costa Eduardo D.
Publication date: 2013

Data science focuses on issues such as the description of data (metadata), the curation, archiving and management of data, their publication on the internet, their usability, and the legal issues associated with their preservation and use. Data science is also seen as a branch of statistics and artificial intelligence that tries to meet the challenges of processing large data in order to derive valuable insights hidden in the data. This poster encapsulates what we actually do as data scientists at BADC/CEDA and presents it from the point of view of someone who is new to our work. It was produced by two 15-year-old students who researched what we do by interviewing some of us during their work experience week with us.

The Receiver Independent Exchange Format (RINEX) Version 3.02

Author: Unknown A
Publication date: 03/04/2013

Documentation about RINEX formatting

Getting credit for your data: data citation and publication

Author: Callaghan Sarah
Publication date: 12/07/2013

A poster describing data citation and publication and how to get credit for it. Why should I publish and cite data? It’s good for science: Allows research to be reproduced and verified. Allows data to be found and understood by other researchers. Encourages collaboration and cross-disciplinary work. Maintains the scientific record. It’s good for researchers: Provides attribution and credit for the hard work of creating a dataset. Makes it easier to build in good data management practices. Demonstrates the quality and usefulness of the data (especially for researchers outside the immediate field). Captures information about the data’s impact. It’s good for funders: Reduces money and effort needed to recreate important datasets. Reassures funders that the data resulting from their funding is useful and archived safely. Captures information about the funder’s impact. It’s good for data repositories: Encourages researchers to deposit data where it can be archived and curated. Papers linked to the data are an excellent source of metadata. Captures information about the repository’s impact

The CEDA archive: Data, Services and Infrastructure

Author: Marsh Kevin
Publication date: 05/09/2013

The purpose of the Centre for Environmental Data Archival (CEDA) is to deliver long-term curation of scientifically important environmental data while facilitating the use of those data by the environmental science community. Nearly 2 PB of data are available from the archive. These data come from a number of sources, including satellite, ground-based and in-situ observations and numerical models, and CEDA holds data from a number of major research projects including CMIP5, RAPID-WATCH and OCEANS2025. CEDA is also providing a range of services via the JASMIN and CEMS systems in order to make these data more easily accessible to users, and to help them face the 'big data' challenges which are now here. CEDA is also deeply involved in a number of EU data infrastructure projects (such as IS-ENES), the development of metadata standards (such as the CF conventions), and quality control for EU data (via the CHARMe project).

STFC Centre for Environmental Data Archival (CEDA) Annual Report 2012 (April 2011-March 2012)

Authors: Lawrence Bryan N; Callaghan Sarah
Publication venue: Centre for Environmental Data Archival
Publication date: 11/01/2013

The mission of the Centre for Environmental Data Archival (CEDA) is to deliver long-term curation of scientifically important environmental data while facilitating the use of those data by the environmental science community. CEDA was formed to host two of the Natural Environment Research Council (NERC) designated data centres, the British Atmospheric Data Centre and the NERC Earth Observation Data Centre, as well as the UK arm of the IPCC Data Distribution Centre. In 2011, the UK Solar System Data Centre joined CEDA. Here we present the fourth annual report, covering joint activities from April 2011 to March 2012 (previously the constituent centres reported independently). The report is in two sections: the first broadly provides a summary of activities and some statistics, and the second is a selection of short reports on some specific activities beginning, under way, or completed. The second section is intended to provide a taster of the range of activities that CEDA undertakes, rather than a complete report of activities, since CEDA staff are involved in a huge range of scientific and informatics projects, not all of which are appropriate for reporting here. CEDA continues to engage in informatics projects to help improve the provision of: (1) suitable tools to document and manage both high-volume and highly heterogeneous data; (2) tooling and services to enable the community to exploit CEDA data holdings; and (3) fundamental standards. The last of these is intended both to improve the likelihood that others can build standards-compliant software we can deploy, and to support interdisciplinary science. As in the previous year, the 2011/2012 year was dominated by the two major challenges of dealing with CMIP5 (e.g. see page 36) and the establishment of new services under the banner of the International Space Innovation Centre (discussed in the articles on CEMS on pages 27 and 28).
However, while those were high-profile external activities, issues of scale became dominant internally. The funding report on page 14 summarises some of the issues: of the order of 10^8 files (o(10^8)) using o(petabytes) of disk, on o(300) different computers, split into o(600) datasets on o(100) disk partitions, without a consistent metadata standard or file format across the archive. Despite a decade of effort on metadata systems, and what had been a very efficient computing environment, CEDA was beginning to creak at the seams, with disk failures, insufficient documentation, and complex network issues becoming more and more prevalent. Ongoing growth using the same technical environment would have been a problem. Fortunately, in late 2011, CEDA received significant capital investment, culminating in the delivery in March 2012 of a new computing system, JASMIN/CEMS, consisting of storage and compute funded by both NERC and UKSA and delivered by CEDA in what was then the e-Science department in STFC (now part of the Scientific Computing Department). JASMIN is discussed on page 26 and CEMS on pages 27 and 28. JASMIN/CEMS are not just about supporting the traditional archival services of CEDA, though: they are also intended to support high-performance analysis of high-volume data by the greater NERC scientific community. The physical delivery of these systems is of course just part of the story; in next year's annual report we will be discussing the difficulty of migrating data to the new environment, and some of the new services which their advent has engendered. While we expect the physical system issues to be resolved with the new hardware, issues of documentation still exist, both in terms of the content and how it is organised. CEDA continues to invest, with both core and project funding, in new metadata developments, aiming to address both issues.
Work on data publication and citation is intended both to improve the integrity of the scholarly record and to provide incentives for the production of good documentation, while work on metadata standards aims to ensure that the information is organised well enough to automate our environment and to support scientific use. Many of the one-page reports discuss projects in this arena.

Data journals: building partnerships between publishers and data centres

Authors: Callaghan Sarah; Parton Graham A
Publication date: 07/05/2013

This presentation discusses the work done by CEDA and the other NERC data centres to interact with journal publishers for data publication.
