How to assess your data quality

Evaluation and Quality Control of the Copernicus Climate Change Service

Summary


The Copernicus Climate Change Service (C3S) aims to provide high-quality information about the past, present and future climate in Europe and the rest of the world. This would not be possible without the Evaluation and Quality Control (EQC) function, which ensures that these data are delivered to a broad range of users while shaping the research agenda to address the most important challenges detected. Thanks to the EQC, the services provided by C3S are kept up to date and continue to evolve. This project is part of the EQC's activities to ensure the quality of the datasets within the Climate Data Store (CDS).

In particular, this project aims to develop a solution for the EQC function that responds to the needs identified in previous contracts through a regular user-engagement process. The project is led by the Barcelona Supercomputing Center and involves several institutions (and we are happy to be one of them!).

Main Features

  • Climate Data Evaluation and Quality Control
  • Big data technologies

    Contributions

    • KPI definition: Defining the indicators that signal good performance of the service is key. Such metrics inform C3S about the overall quality of the service, enabling it to take strategic decisions in the future. We have been strongly involved in the discussions to select the technical Key Performance Indicators (KPIs) that best fit the service's needs.
    • User interaction: We have developed the rating widgets within the CDS to interact directly with users at key points of their use of the platform. This gives C3S valuable insights into how users perceive the platform, so that it can take decisions accordingly.
    • Dashboard implementation: In order to showcase the KPIs, we have implemented a dashboard that shows the most useful metrics, both for C3S users and C3S staff. It enables users to compare current performance with past performance, and download historical results.
    • Quality control: We have taken part in the development of QARs (Quality Assessment Reports), an internal tool that C3S uses to ensure the CDS's quality.
    • CMS development: We have created a Content Management System that enables the project team and C3S to create QARs more easily and in a standardised manner.
    • Automation of the KPIs: We developed reKPIlator, a software tool capable of generating and updating over 80 relevant KPIs automatically by sorting through the millions of events that the CDS collects from its users. We also developed rendermeter, an additional tool that uses Selenium to assess the performance of the CDS webpages. This way we are able to distil the most relevant information, so C3S can make data-based decisions.
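As an illustration of the kind of aggregation that KPI automation involves, the following is a minimal sketch that reduces a stream of usage events to a handful of indicators. It is not the reKPIlator code: the Event fields, the event kinds ("download_requested", "download_failed") and the three KPIs are all assumptions made up for this example.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    """One usage event. All field names are hypothetical."""
    day: date
    user_id: str
    kind: str  # assumed values, e.g. "download_requested", "download_failed"


def compute_kpis(events):
    """Reduce an iterable of events to a few illustrative KPIs.

    The events are read in a single pass, so a generator that streams
    millions of records works as well as a list.
    """
    counts = Counter()
    users = set()
    for event in events:
        counts[event.kind] += 1
        users.add(event.user_id)

    requested = counts["download_requested"]
    failed = counts["download_failed"]
    return {
        "active_users": len(users),
        "download_requests": requested,
        # Undefined when nothing was requested, so report None, not 0.
        "download_success_rate": (
            (requested - failed) / requested if requested else None
        ),
    }


if __name__ == "__main__":
    sample = [
        Event(date(2021, 1, 1), "alice", "download_requested"),
        Event(date(2021, 1, 1), "alice", "download_failed"),
        Event(date(2021, 1, 2), "bob", "download_requested"),
    ]
    print(compute_kpis(sample))
```

Reading the events in one pass keeps memory use proportional to the number of distinct users rather than to the number of events, which matters once the event log grows large.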

    For further details of the project, the tasks assigned to each Work Package (WP) are described below:

    WP1: Suitability of the CDS data

    The Climate Data Store (CDS) is the cornerstone of the C3S infrastructure, contributing to the provision of existing datasets on Essential Climate Variables (ECVs), climate analyses, projections and indicators. The CDS offers a web-based and API-based search to access all this climate data.

    The CDS datasets must be compliant with certain data models, conventions and standards recommended by ECMWF, that is, the Common Data Model (CDM). We are involved in the planning and development of the Data Quality Checker: a product that checks if those datasets are in fact compliant with the CDM.
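A compliance check of this kind can be pictured as a set of rules applied to a dataset's metadata. The sketch below is only an illustration and not the Data Quality Checker itself: the required attribute names are assumptions chosen for the example, since the actual CDM requirements are not listed here, and the metadata is passed in as plain dictionaries rather than read from a file.

```python
# Hypothetical rules: the real CDM requirements are not given in this text.
REQUIRED_GLOBAL_ATTRS = ("Conventions", "title", "institution")
REQUIRED_VARIABLE_ATTRS = ("units", "standard_name")


def check_dataset(global_attrs, variables):
    """Return a list of human-readable problems.

    global_attrs: dict of dataset-level attributes.
    variables: dict mapping variable name -> dict of its attributes.
    An empty list means the metadata passed every rule. Attributes that
    are present but empty are treated as missing.
    """
    problems = []
    for name in REQUIRED_GLOBAL_ATTRS:
        if not global_attrs.get(name):
            problems.append(f"missing global attribute: {name}")
    # Sort variables so the report is stable between runs.
    for var_name, attrs in sorted(variables.items()):
        for name in REQUIRED_VARIABLE_ATTRS:
            if not attrs.get(name):
                problems.append(f"{var_name}: missing attribute {name}")
    return problems


if __name__ == "__main__":
    report = check_dataset(
        {"Conventions": "CF-1.7", "title": "Example dataset"},
        {"t2m": {"units": "K"}},
    )
    for line in report:
        print(line)
```

Returning a list of problems instead of stopping at the first failure lets a single run report every issue in a dataset, which suits a checker whose output feeds a quality report.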

    CDS quality assessment relies on tools called Quality Assessment Reports (QARs). Each QAR gathers the main information on all aspects of the object being quality controlled (e.g. an ECV) in a standardised and concise form, so that data users can make informed decisions when choosing between several similar quality-controlled objects for their particular application. We have created a Content Management System that helps develop QAR content.

    WP2: Suitability of the CDS toolbox

    The CDS Toolbox is a comprehensive set of software tools that enables users to develop custom-made applications, helping them discover and process the data and products provided through the distributed data repositories.

    Within this WP, we are engaged in the definition of shared vocabularies and common practices to ensure consistency of quality assurance information and user guidance for all elements of the CDS Toolbox.

    WP3: Suitability of overall service

    To check the fitness of the overall service, a set of KPIs was defined. We established a workflow and a system to review and update these KPIs automatically. We have also developed a web dashboard and a system for interacting with C3S users to gather their feedback, and we report on the overall quality of the KPIs.
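Comparing current KPI values with past ones, as a dashboard of this kind does, comes down to a per-indicator difference between two reporting periods. The following is a minimal sketch under stated assumptions: the function name, the KPI names and the shape of the report are hypothetical and do not describe the actual dashboard.

```python
def compare_periods(previous, current):
    """Compare two {kpi_name: value} snapshots.

    Only KPIs present in both periods are compared. The relative change
    is None when the previous value is zero, because the ratio is
    undefined in that case.
    """
    report = {}
    for name in sorted(previous.keys() & current.keys()):
        before, after = previous[name], current[name]
        delta = after - before
        report[name] = {
            "previous": before,
            "current": after,
            "delta": delta,
            "relative": delta / before if before else None,
        }
    return report


if __name__ == "__main__":
    last_month = {"active_users": 200, "download_requests": 1000}
    this_month = {"active_users": 250, "download_requests": 900}
    for name, row in compare_periods(last_month, this_month).items():
        print(name, row)
```

Restricting the comparison to KPIs that exist in both snapshots avoids reporting misleading changes when an indicator is added or retired between periods.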


