Summary and Schedule
This tutorial helps you to use ESMValTool.
The Earth System Model Evaluation Tool (ESMValTool) is a community-developed software toolkit that aims to facilitate the diagnosis and evaluation of the causes and effects of model biases and inter-model spread within the CMIP model ensemble.
This tutorial is structured into basic and advanced topics such that episodes starting from the [Introduction][lesson-introduction] up to the episode on Conclusion of the basic tutorial all cover basic topics and can be done in one sitting.
The remaining episodes cover the advanced topics and each episode is a mini-tutorial covering an advanced aspect of working with ESMValTool. These mini-tutorials can be appended to the main tutorial or worked through independently.
What will you learn in this course
- What is ESMValTool
- How to install ESMValTool
- How to configure ESMValTool for your local system
- How to run ESMValTool
- How to work with ESMValTool’s suite of preprocessors
- How to debug your recipes
- How to access and deploy recipes from the ESMValTool gallery (Advanced)
- How to develop your own diagnostics and recipes (Advanced)
- How to contribute your recipes and diagnostics back into ESMValTool (Advanced)
- How to include new observational datasets (Advanced)
Prerequisites
The prerequisites for the tutorial are listed on the tutorial setup page.
Main things you need to know before starting this course
This tutorial can be taken online independently or taught by one of our instructors.
Don’t be alarmed if you can’t work through the entire tutorial in one sitting. It may take some time to get used to working with ESMValTool.
If you get stuck, help is always available from the tutors, from ESMValTool developers via the GitHub issues page, or via the ESMValTool email list. Please see the information on how to subscribe to the user mailing list.
This tutorial includes several advanced lessons after the conclusion. These advanced lessons should be treated like “mini-tutorials”, and include aspects like “developing your own diagnostic” or “how to include observations”.
How to cite the Tutorial
Please use citation information available at https://doi.org/10.5281/zenodo.3974591.
Setup Instructions | Download files required for the lesson | |
Duration: 00h 00m | 1. Introduction | What is ESMValTool? Who are the people behind ESMValTool? |
Duration: 00h 15m | 2. Quickstart guide | What is the purpose of the quickstart guide? How do I load and check the ESMValTool environment? How do I configure ESMValTool? How do I run a recipe? |
Duration: 00h 25m | 3. Installation | What are the prerequisites for installing ESMValTool? How do I confirm that the installation was successful? |
Duration: 00h 45m | 4. Configuration | What is the user configuration file and how should I use it? |
Duration: 01h 05m | 5. Running your first recipe | How to run a recipe? What happens when I run a recipe? |
Duration: 01h 35m | 6. Conclusion of the basic tutorial | What do I do now? Where can I get help? What if I find a bug? Where can I find more information about ESMValTool? How can I cite ESMValTool? |
Duration: 01h 45m | 7. Writing your own recipe | How do I create a new recipe? Can I use different preprocessors for different variables? Can I use different datasets for different variables? How can I combine different preprocessor functions? Can I run the same recipe for multiple ensemble members? |
Duration: 02h 30m | 8. Development and contribution | What is a development installation? How can I test new or improved code? How can I incorporate my contributions into ESMValTool? |
Duration: 03h 00m | 9. Writing your own diagnostic script | How do I write a new diagnostic in ESMValTool? How do I use the preprocessor output in a Python diagnostic? |
Duration: 03h 50m | 10. CMORization: adding new datasets to ESMValTool | CMORization: what is it and why do we need it? How to use the existing CMORizer scripts shipped with ESMValTool? How to add support for new (observational) datasets? |
Duration: 04h 50m | 11. Debugging | How can I handle errors/warnings? |
Duration: 05h 35m | Finish |
The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.
This page includes some information on how to prepare for participating in this tutorial.
Prerequisites
Minimal requirements:
- Basic understanding of your preferred command line interface (e.g. a bash terminal)
- Access to CMIP data
Optional, but useful:
- Basic understanding of git
- Access to a suitable computing system (e.g. CEDA-Jasmin, DKRZ-Mistral)
- GitHub account
Command line & git tutorials
We typically use the command line to interact with ESMValTool. While most of us are likely to have experience with the command line, novices may want to work through this Software Carpentry Unix shell course.
- Command line: https://swcarpentry.github.io/shell-novice/
Git is a distributed version-control system for tracking changes in source code during software development. It’s how we distribute, share, and manage the ESMValTool code.
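Before the tutorial, it can be handy to confirm that the command-line tools used on this page are installed. The following is a minimal sketch, assuming a POSIX shell. The helper name `missing_tools` is hypothetical, and the list of tools (git, ssh, wget) is an assumption based on the commands that appear on this page.

```shell
# Print the name of every requested tool that cannot be found on PATH.
# (Hypothetical helper, not part of ESMValTool.)
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only if the shell can locate the tool.
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
    return 0
}

# Tools used on this page; adjust the list to your needs.
missing=$(missing_tools git ssh wget)
if [ -z "$missing" ]; then
    echo "All tools found."
else
    echo "Missing tools:" $missing
fi
```

If a tool is reported as missing, install it with your system's package manager before continuing.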
Access to CMIP and Observational data and a suitable compute cluster
To complete this tutorial and use ESMValTool, you will need access to data in a reasonable format. Some data will be provided, but there is simply too much data available for your tutors to make it all available directly.
ESMValTool may be run on multiple platforms, from your local machine to large computing clusters. The best option is to use a computing cluster with an Earth System Grid Federation (ESGF) node. The benefit of using a compute cluster with an ESGF node is that data from the Coupled Model Intercomparison Project (CMIP) are stored locally on disk and are directly accessible by the tool. Similarly, observational data would also be available at these sites. The ESGF also hosts observations for Model Intercomparison Projects (obs4MIPs) and reanalysis data (ana4MIPs).
Here are a few options for compute clusters with ESGF nodes:
For more information see:
CMIP5 and CMIP6 data obey the CF-conventions. Available variables can be found under the CMIP5 data request and the CMIP6 Data Request.
List of all CMIP named variables.
List of all ESGF nodes.
A good tutorial on how to search and download CMIP data from ESGF nodes.
Exploring climate model data on infrastructure for the European network for Earth system modelling.
CEDA-Jasmin
Please skip this section if you are not going to use JASMIN and go here.
If you do not already have an account on JASMIN, then request an account as soon as possible. Please follow these instructions on how to create a JASMIN account.
During the account creation, you will need an SSH key, which can be generated following these instructions.
Here are some more instructions on how to get started with JASMIN.
Also note that if you are working from home, JASMIN may not be directly accessible from your home network. You may need to use ssh to connect to a machine in your institute first and then connect from there to JASMIN. Please test your connection before the tutorial.
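Connecting through an intermediate machine can be done in a single step with OpenSSH's `-J` (ProxyJump) option, available in OpenSSH 7.3 and later. The sketch below only prints the command so that you can inspect it first. The helper name `jump_command` and all user and host names are placeholders (hypothetical), and it assumes the same user name on both machines.

```shell
# Build an ssh command that hops through an intermediate host.
# (Hypothetical helper.) $1 = user name, $2 = institute gateway, $3 = final login host
jump_command() {
    printf 'ssh -J %s@%s %s@%s\n' "$1" "$2" "$1" "$3"
}

# Placeholder names only; substitute your own user name and hosts.
jump_command alice gateway.example.org login.example.org
```

Running the printed command opens a session on the final host, with the connection routed through the gateway.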
Jasmin-login
Note that you have only created an account for the web interface. To log into the JASMIN machine and do work, you’ll need to create a login account too using this page.
Access to data on JASMIN
Please request access to the working groups:
Once you have access to the data archive on CEDA, make sure to link your CEDA and JASMIN accounts. This can be done by checking the link to CEDA box on your JASMIN profile page.
The linking may take a few hours to take effect and is necessary for you to access the BADC archives via JASMIN. Some CMIP5 data sets such as MIROC are not accessible by default and special permission has to be requested to access them via the CEDA catalogue page.
Test your Setup
Log into jasmin-login:
Then log into the sci1 machine:
Can you see the following locations:
BASH
ls /gws/nopw/j04/esmeval/obsdata-v2/
ls /badc/cmip5/data/cmip5/output1/MOHC/HadGEM2-ES
ls /badc/cmip6/data/CMIP6/CMIP/*/*/historical/r1i1p1f?/Omon/[ts]os/gn/latest/*.nc
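The three checks above can also be wrapped in a small loop that reports each location as visible or missing, which is easier to read than long `ls` listings. This is a minimal sketch, assuming a POSIX shell and paths without spaces. The helper name `location_visible` is hypothetical; the paths are the ones listed above.

```shell
# Succeed if a location (which may contain wildcards) matches something.
# (Hypothetical helper.)
location_visible() {
    # Unquoted on purpose: the shell expands any wildcards into matching paths.
    set -- $1
    # If nothing matched, the pattern is left unexpanded and this test fails.
    [ -e "$1" ]
}

for location in \
    "/gws/nopw/j04/esmeval/obsdata-v2/" \
    "/badc/cmip5/data/cmip5/output1/MOHC/HadGEM2-ES" \
    "/badc/cmip6/data/CMIP6/CMIP/*/*/historical/r1i1p1f?/Omon/[ts]os/gn/latest/*.nc"
do
    if location_visible "$location"; then
        echo "OK       $location"
    else
        echo "MISSING  $location"
    fi
done
```

A `MISSING` line usually means that the corresponding access request has not been granted yet or that the account linking has not taken effect.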
Note that JASMIN is only accessible from certain locations (mostly universities and research centres). You may need a VPN if you wish to connect from your home network.
Congratulations! Please go here next.
DKRZ
Please skip this section if you are not going to use DKRZ and go here.
If you do not already have an account at the DKRZ, then register as soon as possible. You can find a short introduction on how to get started at DKRZ here.
There is also a user manual for Levante, which is DKRZ’s current supercomputer.
Join a project
To use the resources on DKRZ you have to join a project. One option is to join an existing project by logging into https://luv.dkrz.de/ with your account and selecting ‘Join existing project’. Once you are accepted by the manager of your chosen project, your web account will be turned into a full LDAP account, which will allow you to log into and use DKRZ’s resources. If you do not have access to an existing project, another option is to apply for resources at DKRZ. Here are some instructions on how to apply for resources.
Access to data on DKRZ
CMIP5 and CMIP6 data are available in these directories:
- CMIP5: /work/kd0956/CMIP5/data/cmip5/output1/
- CMIP6: /work/ik1017/CMIP6/data/CMIP6/CMIP/
Additional information
Login nodes are for compiling and job submission only! For all other tasks, you can use the [interactive](https://docs.dkrz.de/doc/levante/running-jobs/partitions-and-limits.html#interactive) partition or start an interactive session.
Data storage:
- Personal data: home directory (30GiB)
- Project data: /work/project_id/username
- Temporary data: scratch directory at /scratch/*/username (15TiB); files are automatically deleted after 14 days. Please use this directory for all your testing and do not use the work directory for tests (see also this).
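Because scratch data are removed automatically, it can help to list files that are approaching the 14-day limit. The sketch below assumes that file age is judged by modification time, which may differ from the criterion DKRZ actually applies. The helper name `files_older_than` and the `SCRATCH_DIR` variable are hypothetical; set `SCRATCH_DIR` to your own /scratch/*/username directory.

```shell
# List regular files under a directory that were last modified
# more than N full days ago. (Hypothetical helper.)
# $1 = directory, $2 = age in days
files_older_than() {
    find "$1" -type f -mtime +"$2" -print
}

# Example: show files in your scratch directory older than 10 days.
# SCRATCH_DIR is an assumed variable; nothing happens if it is unset.
scratch="${SCRATCH_DIR:-}"
if [ -n "$scratch" ] && [ -d "$scratch" ]; then
    files_older_than "$scratch" 10
fi
```

Copy anything you still need from the listed files to your project directory before it is deleted.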
Running batch jobs: Information and examples on the SLURM job scheduling system at DKRZ can be found here.
Congratulations! Please go here next.
Using your own machine
Please skip this section if you are not going to use ESMValTool on your local machine and go here.
If you are planning on running ESMValTool on your own machine, please make sure that you are able to download CMIP data and that you have a few GB of space available to install conda and ESMValTool, but also enough to make a copy of some data (~125MB) needed for this tutorial.
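A quick way to check the available space is to query the file system that holds your home directory. This is a minimal sketch, assuming a POSIX `df`. The helper name `free_megabytes` is hypothetical, and the 125 MB threshold covers only the tutorial data, not the conda and ESMValTool installation.

```shell
# Print the free space, in megabytes, on the file system holding a directory.
# (Hypothetical helper.) Relies on the POSIX output format of `df -Pk`,
# where the fourth column of the second line is the available space in KiB.
free_megabytes() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

target="${HOME:-.}"
needed=125   # approximate size of the tutorial data, in MB
available=$(free_megabytes "$target")
if [ "${available:-0}" -ge "$needed" ]; then
    echo "Enough space for the data: ${available:-0} MB free in $target"
else
    echo "Only ${available:-0} MB free in $target; about $needed MB needed"
fi
```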
You can use ESMValTool to automatically download data needed for test recipes. Please see the [Configuration][lesson-configuration] episode or the [configuration file documentation][config-file] for more information. This is the recommended option as it has the advantage that data is stored in subdirectories, and features such as wildcards and recording the version of the data will work automatically.
Alternatively, you can run the following command using wget:
wget --no-clobber --input-file \
https://github.com/ESMValGroup/ESMValTool_Tutorial/raw/main/data/dataset.urls \
--directory-prefix $HOME/esmvaltool_tutorial/data/
GitHub account (Advanced)
You don’t need a GitHub account to participate in the tutorial. However, if you want to raise an issue, contribute to the discussions, or share your code, please create a GitHub account.
To learn how to use GitHub, please have a look at this introduction to GitHub.
You may hear a few of the following phrases during the tutorial. Don’t be alarmed, they will make sense eventually.
GitHub issues
Issues are GitHub’s ticketing system. They allow users and developers to discuss problems, identify bugs, or make suggestions. Each issue is assigned a number and will have its own page on GitHub.
Here’s an explanation of the GitHub issues.
Raising an issue is the act of creating a new issue. If you are asked to raise an issue, please follow any instructions that you are given, and also make sure that you read the default issue text.