MDTF-diagnostics

Analysis framework and collection of process-oriented diagnostics for weather and climate simulations.
https://github.com/NOAA-GFDL/MDTF-diagnostics

Category: Climate Change
Sub Category: Earth and Climate Modeling

Keywords from Contributors

climate-model numerical-modeling climate gfdl ocean-circulation ocean-circulation-models feature-toggle feature-flag featured training


README.md

MDTF-diagnostics: A Portable Framework for Weather and Climate Model Data Analysis

The MDTF-diagnostics package is a portable framework for running process-oriented diagnostics (PODs) on weather and
climate model data.

What is a POD?

Each process-oriented diagnostic [POD; Maloney et al. (2019)] targets a specific physical process or
emergent behavior to determine how well one or more models represent the process, ensure that models produce the right
answers for the right reasons, and identify gaps in the understanding of phenomena. Each POD is independent of other
PODs. PODs generate diagnostic figures that can be viewed as an HTML file in a web browser.

Available Diagnostics

The links in the table below show sample output, a brief description,
and a link to the full documentation for each currently-supported POD.

| Diagnostic | Contributor |
| --- | --- |
| Blocking Neale | Rich Neale (NCAR), Dani Coleman (NCAR) |
| Convective Transition Diagnostics | J. David Neelin (UCLA) |
| Diurnal Cycle of Precipitation | Rich Neale (NCAR) |
| Eulerian Storm Track | James Booth (CUNY), Jeyavinoth Jeyaratnam |
| Extratropical Variance (EOF 500hPa Height) | CESM/AMWG (NCAR) |
| Forcing Feedback Diagnostic | Brian Soden (U. Miami), Ryan Kramer |
| Mean Dynamic Sea Level Package | C. Little (AER, Inc.), N. Etige, S. Vannah, M. Zhao |
| Mixed Layer Depth | Cecilia Bitz (U. Washington), Lettie Roach |
| MJO Propagation and Amplitude | Xianan Jiang (UCLA) |
| MJO Spectra and Phasing | CESM/AMWG (NCAR) |
| MJO Teleconnections | Eric Maloney (CSU) |
| Moist Static Energy Diagnostic Package | H. Annamalai (U. Hawaii), Jan Hafner (U. Hawaii) |
| Ocean Surface Flux Diagnostic | Charlotte A. DeMott (Colorado State University), Chia-Weh Hsu (GFDL) |
| Precipitation Buoyancy Diagnostic | J. David Neelin (UCLA), Fiaz Ahmed |
| Rossby Wave Sources Diagnostic Package | H. Annamalai (U. Hawaii), Jan Hafner (U. Hawaii) |
| Sea Ice Suite | Cecilia Bitz (U. Washington), Lettie Roach |
| Soil Moisture-Evapotranspiration coupling | Eric Wood (Princeton) |
| Stratosphere-Troposphere Coupling: Annular Modes | Amy H. Butler (NOAA CSL), Zachary D. Lawrence (CIRES/NOAA PSL) |
| Stratosphere-Troposphere Coupling: Eddy Heat Fluxes | Amy H. Butler (NOAA CSL), Zachary D. Lawrence (CIRES/NOAA PSL) |
| Stratosphere-Troposphere Coupling: QBO and ENSO stratospheric teleconnections | Amy H. Butler (NOAA CSL), Zachary D. Lawrence (CIRES/NOAA PSL), Dillon Elsbury (NOAA) |
| Stratosphere-Troposphere Coupling: Stratospheric Ozone and Circulation | Amy H. Butler (NOAA CSL), Zachary D. Lawrence (CIRES/NOAA PSL) |
| Stratosphere-Troposphere Coupling: Stratospheric Polar Vortex Extremes | Amy H. Butler (NOAA CSL), Zachary D. Lawrence (CIRES/NOAA PSL) |
| Stratosphere-Troposphere Coupling: Vertical Wave Coupling | Amy H. Butler (NOAA CSL), Zachary D. Lawrence (CIRES/NOAA PSL) |
| Surface Albedo Feedback | Cecilia Bitz (U. Washington), Aaron Donahoe (U. Washington), Ed Blanchard, Wei Cheng, Lettie Roach |
| Surface Temperature Extremes and Distribution Shape | J. David Neelin (UCLA), Paul C Loikith (PSU), Arielle Catalano (PSU) |
| TC MSE Variance Budget Analysis | Allison Wing (Florida State University), Jarrett Starr (Florida State University) |
| Top Heaviness Metric | Zhuo Wang (U. Illinois Urbana-Champaign), Jiacheng Ye (U. Illinois Urbana-Champaign) |
| Tropical Cyclone Rain Rate Azimuthal Average | Daehyun Kim (U. Washington), Nelly Emlaw (U. Washington) |
| Tropical Pacific Sea Level | Jianjun Yin (U. Arizona), Chia-Weh Hsu (GFDL) |
| Wavenumber-Frequency Spectra | CESM/AMWG (NCAR) |

Example POD Analysis Results

Quickstart installation instructions

See the documentation site for all other information, including more in-depth installation instructions.

Visit the GFDL YouTube Channel for tutorials on package installation and other MDTF-diagnostics-related topics.

Prerequisites

  • Anaconda3, Miniconda3, or micromamba.

  • Installation instructions are available here.

  • MDTF-diagnostics is developed for macOS and Linux systems. The package has been tested on, but is not fully supported
    for, the Windows Subsystem for Linux.

  • Attention macOS M-series chip users: the MDTF-diagnostics base and python3 conda environments will only build with
    micromamba on machines running Apple M-series chips. The NCL and R environments will NOT build on M-series machines
    because the conda packages do not support them at this time.

Notes

  • $ indicates strings to be substituted, e.g., the string $CODE_ROOT should be substituted by the actual path to the
    MDTF-diagnostics directory.
  • Consult the Getting started section to learn how to run the framework on your own data and configure general
    settings.
  • POD contributors can consult the Developer Cheatsheet for brief instructions and useful tips.

1. Install MDTF-diagnostics

  • Open a terminal and create a directory named mdtf, then change into it: % cd mdtf

  • Clone your fork of the MDTF repo on your machine: git clone https://github.com/[your fork name]/MDTF-diagnostics

  • Check out the latest official release: git checkout tags/[version name]

  • Run % conda info --base to determine the location of your Conda installation. This path will be referred to as
    $CONDA_ROOT.

  • cd $CODE_ROOT, then run

ANACONDA/MINICONDA

% ./src/conda/conda_env_setup.sh --all --conda_root $CONDA_ROOT --env_dir $CONDA_ENV_DIR

MICROMAMBA on machines that do NOT have Apple M-series chips

% ./src/conda/micromamba_env_setup.sh --all --micromamba_root $MICROMAMBA_ROOT --micromamba_exe $MICROMAMBA_EXE --env_dir $CONDA_ENV_DIR

MICROMAMBA on machines with Apple M-series chips

% ./src/conda/micromamba_env_setup.sh -e base --micromamba_root $MICROMAMBA_ROOT --micromamba_exe $MICROMAMBA_EXE --env_dir $CONDA_ENV_DIR

% ./src/conda/micromamba_env_setup.sh -e python3_base --micromamba_root $MICROMAMBA_ROOT --micromamba_exe $MICROMAMBA_EXE --env_dir $CONDA_ENV_DIR

  • Substitute the actual paths for $CODE_ROOT, $CONDA_ROOT, $MICROMAMBA_ROOT, $MICROMAMBA_EXE, and
    $CONDA_ENV_DIR.
  • $MICROMAMBA_ROOT is the path to the micromamba installation on your system
    (e.g., /home/${USER}/micromamba). This is defined by the $MAMBA_ROOT_PREFIX environment variable on your system
    when micromamba is installed.
  • $MICROMAMBA_EXE is the full path to the micromamba executable on your system
    (e.g., /home/${USER}/.local/bin/micromamba). This is defined by the $MAMBA_EXE environment variable on your system.
  • All flags noted for your system above must be supplied for the script to work.

NOTE: The micromamba environments may differ from the conda environments because of package compatibility discrepancies between solvers. The command % ./src/conda/micromamba_env_setup.sh --all --micromamba_root $MICROMAMBA_ROOT --micromamba_exe $MICROMAMBA_EXE --env_dir $CONDA_ENV_DIR builds the base, NCL_base, and python3_base environments.

NOTE: If you are trying to install environments with a Conda package managed by your institution, you will need to set your environment directory to a location with write access, and symbolically link $CONDA_ROOT/envs to this environment directory: % ln -s [ROOT_DIR]/miniconda3/envs [path to your environment directory]
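
The choice between the setup commands above can be summarized in a short shell sketch. It is illustrative only: mdtf_setup_commands is a hypothetical helper, not part of the package, and it assumes that an Apple M-series machine reports Darwin from uname -s and arm64 from uname -m. The $-prefixed placeholders are printed literally so you can substitute your own paths.

```shell
#!/bin/sh
# Hypothetical helper (not part of MDTF-diagnostics): print the environment
# setup command(s) from the quickstart for a given package manager
# ("conda" or "micromamba"), OS name, and CPU architecture.
mdtf_setup_commands() {
    manager=$1   # "conda" or "micromamba"
    os=$2        # e.g. the output of `uname -s`
    arch=$3      # e.g. the output of `uname -m`
    # Single quotes keep the $PLACEHOLDERS literal in the printed commands.
    mm_args='--micromamba_root $MICROMAMBA_ROOT --micromamba_exe $MICROMAMBA_EXE --env_dir $CONDA_ENV_DIR'
    if [ "$os" = "Darwin" ] && [ "$arch" = "arm64" ]; then
        # Assumed signature of an Apple M-series machine.
        if [ "$manager" != "micromamba" ]; then
            echo "Apple M-series: the environments only build with micromamba" >&2
            return 1
        fi
        # Only the base and python3_base environments build on M-series chips.
        echo "./src/conda/micromamba_env_setup.sh -e base $mm_args"
        echo "./src/conda/micromamba_env_setup.sh -e python3_base $mm_args"
    elif [ "$manager" = "micromamba" ]; then
        echo "./src/conda/micromamba_env_setup.sh --all $mm_args"
    else
        echo './src/conda/conda_env_setup.sh --all --conda_root $CONDA_ROOT --env_dir $CONDA_ENV_DIR'
    fi
}

# Show the micromamba command(s) that apply to the current machine.
mdtf_setup_commands micromamba "$(uname -s)" "$(uname -m)"
```

The function only prints commands; run them yourself from $CODE_ROOT after substituting the real paths.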

2. Download the sample data

Supporting observational data and sample model data are available via
Globus.

For tar files transferred over FTP, please note that the above paths are symlinks to the most recent versions of the data and will be reported as zero bytes in an FTP client.
Running tar -xvf [filename].tar will extract the contents in the following hierarchy under the mdtf directory:

mdtf
 ├── MDTF-diagnostics
 ├── inputdata
     ├── model
     │   ├── GFDL.CM4.c96L32.am4g10r8
     │   │   └── day
     │   │       ├── GFDL.CM4.c96L32.am4g10r8.precip.day.nc
     │   │       └── (... other .nc files )
     │   └── QBOi.EXP1.AMIP.001
     │       ├── 1hr
     │       │   ├── QBOi.EXP1.AMIP.001.PRECT.1hr.nc
     │       │   └── (... other .nc files )
     │       ├── 3hr
     │       │   └── QBOi.EXP1.AMIP.001.PRECT.3hr.nc
     │       ├── day
     │       │   ├── QBOi.EXP1.AMIP.001.FLUT.day.nc
     │       │   └── (... other .nc files )
     │       └── mon
     │           ├── QBOi.EXP1.AMIP.001.PS.mon.nc
     │           └── (... other .nc files )
     └── obs_data ( = $OBS_DATA_ROOT)
         ├── (... supporting data for individual PODs )

The default test case uses the QBOi.EXP1.AMIP.001 sample data. The GFDL.CM4.c96L32.am4g10r8 sample data is only
needed to test the MJO Propagation and Amplitude POD.

You can put the observational data and model output in different locations (e.g., for space reasons) by changing the
value of OBS_DATA_ROOT as described below in section 4.
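
After extracting the archives, a quick check can confirm that the default layout is in place. The sketch below is a hypothetical helper, not part of the package; it only tests for the directories needed by the default test case (the QBOi.EXP1.AMIP.001 sample data and obs_data) and assumes you have not relocated the observational data.

```shell
#!/bin/sh
# Hypothetical helper: report which directories of the default layout are
# missing under the mdtf directory given as $1. Returns 0 only if all exist.
check_mdtf_layout() {
    root=$1
    missing=0
    for d in MDTF-diagnostics \
             inputdata/model/QBOi.EXP1.AMIP.001 \
             inputdata/obs_data; do
        if [ ! -d "$root/$d" ]; then
            echo "missing: $root/$d"
            missing=1
        fi
    done
    return "$missing"
}

# Check the current directory, assumed here to be the mdtf directory.
check_mdtf_layout . || echo "layout incomplete; extract the sample data first"
```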

3. Generate a data catalog for the sample input data

The MDTF-diagnostics package provides a basic catalog generator in the tools/catalog_builder directory to assist
users with building data catalogs.

4. Configure framework paths

The MDTF framework supports setting configuration options in a file as well as on the command line. An example of the
configuration file format is provided at templates/runtime_config.[jsonc | yml].
We recommend configuring the following settings by editing a copy of this file.

  • CATALOG_DIR: path to the ESM-intake data catalog.
  • OBS_DATA_ROOT: if you've saved the supporting data in the directory structure described in section 2 and use
    observational input data, the default value (../inputdata/obs_data) will be correct. If you put the data in a
    different location, change the path accordingly.
  • WORK_DIR is used as a scratch location for files generated by the PODs, and should have sufficient quota to
    handle the full set of model variables you plan to analyze. This includes the sample model and observational data
    (approx. 19 GB) plus data required for the POD(s) you are developing. No files are saved here unless you set
    OUTPUT_DIR to the same location as WORK_DIR, so a temporary directory would be a good choice.
  • OUTPUT_DIR should be set to the desired location for output files. OUTPUT_DIR and WORK_DIR are set to the same
    location by default. The output of each run of the framework will be saved in a different subdirectory in this
    location. As with WORK_DIR, ensure that OUTPUT_DIR has sufficient space for all POD output.
  • conda_root should be set to the value of $CONDA_ROOT used in section 1.
  • Likewise, set conda_env_root to the same location as $CONDA_ENV_DIR in section 1.
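
Before settling on a WORK_DIR, it can help to check the available space against the approx. 19 GB estimate above. This is a rough sketch, assuming free space reported by df is an acceptable proxy; on systems with per-user quotas the two can differ. has_free_kb is a hypothetical helper, not part of the package.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the filesystem holding directory $1 has at
# least $2 kilobytes available, as reported by POSIX `df -Pk`.
has_free_kb() {
    dir=$1
    need_kb=$2
    # Line 2 of `df -Pk` output; column 4 is available space in 1024-byte blocks.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge "$need_kb" ]
}

# Approx. 19 GB expressed in kilobytes.
need_kb=$((19 * 1024 * 1024))
work_dir=${WORK_DIR:-.}   # falls back to the current directory if WORK_DIR is unset
if has_free_kb "$work_dir" "$need_kb"; then
    echo "at least ~19 GB free in $work_dir"
else
    echo "less than ~19 GB free in $work_dir"
fi
```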

We recommend using absolute paths in runtime_config.[jsonc | yml], but relative paths are also allowed and should be
relative to $CODE_ROOT.

$CODE_ROOT contains the following subdirectories:

  • diagnostics/: directory containing source code and documentation of individual PODs
  • doc/: directory containing documentation (a local mirror of the documentation site)
  • src/: source code of the framework itself
  • submodules/: location to place 3rd-party submodules to run as part of the MDTF-diagnostics workflow
  • tests/: unit tests for the framework
  • templates/: runtime configuration template files
  • tools/: helper scripts for building ESM-intake catalogs, and other utilities
  • user_scripts/: directory where users can place custom preprocessing scripts
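
The relative-path rule stated before the directory list (relative paths in the configuration file are taken relative to $CODE_ROOT) can be illustrated with a small sketch. resolve_config_path is a hypothetical helper that shows how a setting would be read under that rule; it is not the framework's own code.

```shell
#!/bin/sh
# Hypothetical helper: print how a path setting ($2) would be interpreted
# given the code root ($1). Absolute paths are returned unchanged; relative
# paths are joined onto the code root.
resolve_config_path() {
    code_root=$1
    setting=$2
    case $setting in
        /*) printf '%s\n' "$setting" ;;
        *)  printf '%s\n' "$code_root/$setting" ;;
    esac
}

# The default OBS_DATA_ROOT, read against an example code root.
resolve_config_path /home/user/mdtf/MDTF-diagnostics ../inputdata/obs_data
```

With the example code root, the default OBS_DATA_ROOT points at the inputdata/obs_data directory next to MDTF-diagnostics, which matches the layout in section 2.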

5. Run the framework

The framework runs PODs that analyze one or more model datasets (cases), along with optional observational datasets.
To run the framework on the example_multicase POD, modify the example configuration file and run

cd $CODE_ROOT
./mdtf -f templates/runtime_config.[jsonc | yml]

The above command will execute the PODs included in the pod_list block of runtime_config.[jsonc | yml].

If you re-run the above command, the result will be written to another subdirectory under $OUTPUT_DIR,
i.e., output files saved previously will not be overwritten unless you change overwrite in the configuration file
to true.

The output files for the test case will be written to $OUTPUT_DIR/MDTF_Output/ (_[v(number)] will be appended to
the output directory name if an existing MDTF_Output directory is present in $OUTPUT_DIR).
When the framework is finished, open $OUTPUT_DIR/MDTF_Output/[POD NAME]/index.html in a web browser to view
the output report.
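
Because repeated runs create additional output directories, it can be handy to locate the most recent one. The sketch below assumes every run directory name starts with MDTF_Output and that the most recently modified directory belongs to the latest run; latest_mdtf_output is a hypothetical helper, not part of the package.

```shell
#!/bin/sh
# Hypothetical helper: print the most recently modified directory under $1
# whose name starts with MDTF_Output (prints nothing if none exists).
latest_mdtf_output() {
    out_dir=$1
    # -t sorts newest first; -d lists the directories themselves, not their
    # contents. The trailing slash in the pattern matches directories only.
    ls -td "$out_dir"/MDTF_Output*/ 2>/dev/null | head -n 1
}

# Falls back to the current directory if OUTPUT_DIR is unset.
latest_mdtf_output "${OUTPUT_DIR:-.}"
```

The index.html files for individual PODs sit in subdirectories of the path that is printed.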

You can specify your own datasets in the caselist block of the runtime config file and provide a catalog with
the model data,
or run the example_multicase POD on the synthetic data and associated test catalog specified in the configuration file.
To generate the synthetic CMIP data, run:

mamba env create --force -q -f ./src/conda/_env_synthetic_data.yml
conda activate _MDTF_synthetic_data
pip install mdtf-test-data
mkdir mdtf_test_data && cd mdtf_test_data
mdtf_synthetic.py -c CMIP --startyear 1980 --nyears 5 --freq day
mdtf_synthetic.py -c CMIP --startyear 1985 --nyears 5 --freq day

Then, modify the path entries in diagnostics/example_multicase/esm_catalog_CMIP_synthetic_r1i1p1f1_gr1.csv, and
the "catalog_file": path in diagnostics/example_multicase/esm_catalog_CMIP_synthetic_r1i1p1f1_gr1.json to include the
root directory locations on your file system. Full paths must be specified.
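
Editing the path entries by hand is error-prone when there are many rows, so a literal find-and-replace of the root prefix may help. The sketch below makes no assumption about the catalog's column layout. replace_root is a hypothetical helper, and the old root to replace is whatever prefix the shipped files actually contain, which is not shown here. Output goes to stdout, so the original file is left untouched.

```shell
#!/bin/sh
# Hypothetical helper: print file $3 with every literal occurrence of the
# string $1 replaced by $2. Uses index/substr so the strings are not treated
# as regular expressions.
replace_root() {
    old=$1
    new=$2
    file=$3
    [ -n "$old" ] || return 1   # an empty search string is not meaningful
    awk -v old="$old" -v new="$new" '
    {
        out = ""
        line = $0
        while ((i = index(line, old)) > 0) {
            out = out substr(line, 1, i - 1) new
            line = substr(line, i + length(old))
        }
        print out line
    }' "$file"
}

# Demo on a throwaway file. The rows are made up for illustration and do not
# reflect the real catalog columns.
demo=$(mktemp)
printf '%s\n' 'path' '/old/root/mdtf_test_data/example.nc' > "$demo"
replace_root /old/root /home/user/mdtf "$demo"
rm -f "$demo"
```

Redirect the output to a new file, inspect it, and only then replace the original catalog.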

Depending on the POD(s) you run, the size of your input datasets, and your system hardware, run time may be 10 to 20
minutes.

6. Next steps

For more detailed information, consult the documentation site. Users interested in
contributing a POD should consult the "Developer Information" section.

Acknowledgements

Development of this code framework for process-oriented diagnostics was supported by the
National Oceanic and Atmospheric Administration
(NOAA) Climate Program Office Modeling, Analysis, Predictions and Projections
(MAPP) Program (grant # NA18OAR4310280). Additional support was provided by University of California Los Angeles,
the Geophysical Fluid Dynamics Laboratory, the National Center for Atmospheric Research,
Colorado State University, Lawrence Livermore National Laboratory and the US Department of Energy.

Many of the process-oriented diagnostics modules (PODs) were contributed by members of the NOAA
Model Diagnostics Task Force under MAPP support. Statements, findings or recommendations in these documents do
not necessarily reflect the views of NOAA or the US Department of Commerce.

Citations

Guo, Huan; John, Jasmin G; Blanton, Chris; McHugh, Colleen; Nikonov, Serguei; Radhakrishnan, Aparna; Rand, Kristopher;
Zadeh, Niki T.; Balaji, V; Durachta, Jeff; Dupuis, Christopher; Menzel, Raymond; Robinson, Thomas; Underwood, Seth;
Vahlenkamp, Hans; Bushuk, Mitchell; Dunne, Krista A.; Dussin, Raphael; Gauthier, Paul PG; Ginoux, Paul; Griffies,
Stephen M.; Hallberg, Robert; Harrison, Matthew; Hurlin, William; Lin, Pu; Malyshev, Sergey; Naik, Vaishali;
Paulot, Fabien; Paynter, David J; Ploshay, Jeffrey; Reichl, Brandon G; Schwarzkopf, Daniel M; Seman, Charles J;
Shao, Andrew; Silvers, Levi; Wyman, Bruce; Yan, Xiaoqin; Zeng, Yujin; Adcroft, Alistair; Dunne, John P.;
Held, Isaac M; Krasting, John P.; Horowitz, Larry W.; Milly, P.C.D; Shevliakova, Elena; Winton, Michael; Zhao, Ming;
Zhang, Rong (2018). NOAA-GFDL GFDL-CM4 model output historical. Version YYYYMMDD[1]. Earth System Grid Federation.
https://doi.org/10.22033/ESGF/CMIP6.8594

Krasting, John P.; John, Jasmin G; Blanton, Chris; McHugh, Colleen; Nikonov, Serguei; Radhakrishnan, Aparna;
Rand, Kristopher; Zadeh, Niki T.; Balaji, V; Durachta, Jeff; Dupuis, Christopher; Menzel, Raymond; Robinson, Thomas;
Underwood, Seth; Vahlenkamp, Hans; Dunne, Krista A.; Gauthier, Paul PG; Ginoux, Paul; Griffies, Stephen M.;
Hallberg, Robert; Harrison, Matthew; Hurlin, William; Malyshev, Sergey; Naik, Vaishali;
Paulot, Fabien; Paynter, David J; Ploshay, Jeffrey; Schwarzkopf, Daniel M; Seman, Charles J; Silvers, Levi;
Wyman, Bruce; Zeng, Yujin; Adcroft, Alistair; Dunne, John P.; Dussin, Raphael; Guo, Huan; He, Jian; Held, Isaac M;
Horowitz, Larry W.; Lin, Pu; Milly, P.C.D; Shevliakova, Elena; Stock, Charles; Winton, Michael; Xie, Yuanyu;
Zhao, Ming (2018). NOAA-GFDL GFDL-ESM4 model output prepared for CMIP6 CMIP historical.
Version YYYYMMDD[1]. Earth System Grid Federation. https://doi.org/10.22033/ESGF/CMIP6.8597

E. D. Maloney et al. (2019): Process-Oriented Evaluation of Climate and Weather Forecasting Models. BAMS, 100 (9),
1665–1686, doi:10.1175/BAMS-D-18-0042.1.

Disclaimer

This repository is a scientific product and is not an official communication of the National Oceanic and Atmospheric
Administration, or the United States Department of Commerce. All NOAA GitHub project code is provided on an ‘as is’
basis and the user assumes responsibility for its use. Any claims against the Department of Commerce or
Department of Commerce bureaus stemming from the use of this GitHub project will be governed by all applicable
Federal law. Any reference to specific commercial products, processes, or services by service mark, trademark,
manufacturer, or otherwise, does not constitute or imply their endorsement, recommendation or favoring by the
Department of Commerce. The Department of Commerce seal and logo, or the seal and logo of a DOC bureau, shall not be
used in any manner to imply endorsement of any commercial product or activity by DOC or the United States Government.

Contributors ✨

Thanks go to our code contributors.
Thanks go to these wonderful people (emoji key):

This project follows the all-contributors specification. Contributions of any kind welcome!


Committers metadata

Last synced: 8 days ago

Total Commits: 2,524
Total Committers: 24
Avg Commits per committer: 105.167
Development Distribution Score (DDS): 0.542

Commits in past year: 198
Committers in past year: 5
Avg Commits per committer in past year: 39.6
Development Distribution Score (DDS) in past year: 0.389

| Name | Email | Commits |
| --- | --- | --- |
| tsjackson-noaa | t****n@n****v | 1156 |
| wrongkindofdoctor | 2****r | 658 |
| Thomas Jackson | t****4@g****m | 401 |
| yihungkuo | y****o@a****u | 111 |
| Jacob Mims | 1****s | 76 |
| Jessica.Liptak | j****k@n****v | 25 |
| John Krasting | J****g@n****v | 25 |
| Dani Coleman | b****y@u****u | 24 |
| Aparna Radhakrishnan | a****n@n****v | 14 |
| wrongkindofdoctor | | 11 |
| Wenhao Dong | w****g@n****v | 5 |
| Zac Lawrence | 4****e | 3 |
| ahbutlerwx | 6****x | 2 |
| Wenaho Dong | w****g@n****v | 2 |
| wrongkindofdoctor | J****k@l****v | 2 |
| Cecilia Bitz | b****z@u****u | 1 |
| Chris Blanton | c****n@n****v | 1 |
| Geraldine Nelly Emlaw | 1****w | 1 |
| climate_kid | 6****s | 1 |
| delsbury | 9****y | 1 |
| jcstarr | 1****r | 1 |
| lgtm-com[bot] | 4****] | 1 |
| Jacob Mims | j****s@l****v | 1 |
| nishsilva | 1****a | 1 |

Issue and Pull Request metadata

Last synced: 1 day ago

Total issues: 216
Total pull requests: 559
Average time to close issues: 5 months
Average time to close pull requests: 18 days
Total issue authors: 28
Total pull request authors: 35
Average comments per issue: 2.98
Average comments per pull request: 1.2
Merged pull request: 471
Bot issues: 0
Bot pull requests: 4

Past year issues: 59
Past year pull requests: 212
Past year average time to close issues: 16 days
Past year average time to close pull requests: 1 day
Past year issue authors: 10
Past year pull request authors: 8
Past year average comments per issue: 2.93
Past year average comments per pull request: 0.1
Past year merged pull request: 201
Past year bot issues: 0
Past year bot pull requests: 0

More stats: https://issues.ecosyste.ms/repositories/lookup?url=https://github.com/NOAA-GFDL/MDTF-diagnostics

Top Issue Authors

  • wrongkindofdoctor (41)
  • tsjackson-noaa (40)
  • aradhakrishnanGFDL (36)
  • jkrasting (23)
  • jtmims (10)
  • bitterbark (9)
  • Wen-hao-Dong (9)
  • nishsilva (7)
  • emaroon (6)
  • csyhuang (3)
  • jeyavinoth (3)
  • jcstarr (3)
  • chiaweh2 (3)
  • jfbooth (3)
  • ahmedfiaz (3)

Top Pull Request Authors

  • wrongkindofdoctor (295)
  • jtmims (92)
  • tsjackson-noaa (47)
  • jkrasting (22)
  • jhafner2 (16)
  • aradhakrishnanGFDL (16)
  • bitterbark (12)
  • yihungkuo (12)
  • chiaweh2 (4)
  • briansoden (4)
  • allcontributors[bot] (3)
  • zdlawrence (3)
  • jiacheng-atmos (3)
  • jeyavinoth (3)
  • delsbury (3)

Top Issue Labels

  • framework (91)
  • bug (71)
  • feature-request (37)
  • diagnostic (32)
  • data (19)
  • data catalogs (15)
  • documentation (13)
  • docs-policy (13)
  • site-specific (10)
  • testing-CI (10)
  • question (10)
  • Conda/micromamba (6)
  • help wanted (6)
  • suggestion (5)
  • python notebooks (5)
  • tool (3)
  • fieldlist (2)
  • invalid (1)

Top Pull Request Labels

  • framework (231)
  • diagnostic (76)
  • documentation (60)
  • bug (58)
  • testing-CI (29)
  • tool (22)
  • feature-request (21)
  • site-specific (16)
  • Conda/micromamba (16)
  • data catalogs (16)
  • fieldlist (11)
  • docs-policy (7)
  • data (7)
  • containers (5)

Dependencies

.github/workflows/codeql.yml actions
  • actions/checkout v3 composite
  • github/codeql-action/analyze v2 composite
  • github/codeql-action/autobuild v2 composite
  • github/codeql-action/init v2 composite
.github/workflows/mdtf_tests.yml actions
  • actions/checkout v3 composite
  • conda-incubator/setup-miniconda v2 composite
doc/requirements.txt pypi
  • jinja2 >=3.0
  • mock *
  • recommonmark >=0.7
  • sphinx >=5.2
setup.py pypi

Score: 7.905441649060286