A curated list of open technology projects to sustain a stable climate, energy supply, biodiversity and natural resources.

Recent Releases of ESMF

ESMF - ESMF 8.8.0

Overview

The 8.8.0 release of ESMF is backward compatible with ESMF 8.7.0 for all the interfaces that are marked as backward compatible in the Reference Manual. ESMF 8.8.0 introduces performance improvements, feature enhancements, and two bug fixes in different areas of the framework. Notably, the ESMF_StateReconcile() method has been fundamentally re-implemented in 8.8.0 to improve performance during the initialization phase of a coupled model. A new method, ESMF_FieldEmptyReset(), has been introduced to reset an ESMF Field back to a less complete field status. Extensions were made to the ESMF Geom class and NUOPC State metadata to add more capabilities and options for the user.

Although there are some API changes to add more options for users, in general this release requires no user code changes. However, please note that the following experimental methods that were not marked as backward compatible in previous releases are now removed in this version: ESMF_StateRead(), ESMF_StateWrite(), and ESMF_RouteHandlePrint(); this may require minor modifications to user code that contains the above methods.

In addition, ESMF 8.8.0 may introduce some bit-for-bit changes in second-order conservative weight calculation due to enhancements in the algorithm.
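Roundoff-level differences of this kind typically arise because floating-point addition is not associative, so summing the same terms in a different order can change the last bits of the result. The following is a minimal Python sketch of that effect only; the terms and the helper name sum_in_order are hypothetical and have nothing to do with ESMF's actual weight calculation.

```python
import math


def sum_in_order(terms, order):
    """Accumulate terms in the given index order using plain float addition."""
    total = 0.0
    for index in order:
        total += terms[index]
    return total


# Hypothetical contributions to a single weight (not real ESMF data).
terms = [0.1, 0.2, 0.3]

forward = sum_in_order(terms, [0, 1, 2])   # (0.1 + 0.2) + 0.3
backward = sum_in_order(terms, [2, 1, 0])  # (0.3 + 0.2) + 0.1

# The two sums are not bit-for-bit identical, but agree to roundoff.
bitwise_equal = forward == backward
roundoff_equal = math.isclose(forward, backward, rel_tol=1e-12)
print(bitwise_equal, roundoff_equal)
```

A comparison against results from an earlier release would therefore normally use a tolerance rather than exact equality.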

Lastly, ESMPy has been updated to support Python version 3.8 and above.

The ESMF team is grateful for the many years of support, ideation, feedback, testing, and code contributions that the ESMF user community has provided since its inception. We look forward to continuing this partnership with the community to further improve and add features to ESMF. As always, please use ESMF Discussions (https://github.com/orgs/esmf-org/discussions) or write to esmf_support@ucar.edu with any questions or issues concerning ESMF.

Release Notes

  • The public Fortran API in this release is backward compatible with the previous release, ESMF 8.7.0, for all the interfaces that are marked as backward compatible in the Reference Manual. There were API changes to a few unmarked methods that may require minor modifications to user code that uses these methods. The complete list of API changes is summarized in a table showing interface changes since ESMF 8.7.0. The table includes the rationale and impact for each change.
  • Some bit-for-bit changes are expected for this release compared to ESMF 8.7.0. This is based on test runs with the Intel compiler suite using options “-O2 -fp-model precise”, and is due to the following changes:
    • The second-order conservative weight calculation algorithm used by the ESMF regrid weight generation methods and applications was modified to operate on the incoming destination cells in the order of their ids. This fixes an issue where sometimes weights were calculated slightly differently on different PETs due to changes in destination order. The fix was made to both the 2D Cartesian and 2D spherical weight calculation. Expected bit-for-bit changes:
      • Roundoff level changes to second-order conservative weights due to a change in calculation order.
    • The algorithm for calculating second-order conservative weights on the XGrid has been improved to use more of the information inherent in the XGrid. This removes some issues with unmapped cells and makes the results more precise. Expected bit-for-bit changes:
      • Changes to second-order conservative weights when calculated through the XGrid.
  • ESMPy has been updated to support newer Python versions and drop support for older versions; specifically:
    • Python 3.7 is no longer supported.
    • The LocStream __getitem__ function has been fixed for Python 3.12 and later.
    • Segmentation faults that sometimes occurred with MeshCreateFromFile and RouteHandleCreateFromFile have been fixed.
    • Some unit tests have been fixed to work with recent versions of numpy.
    • Some unused Python code was upgraded to Python 3; despite being unused, this code caused problems with some installations (e.g., via spack).
  • The NUOPC State metadata was extended to include a new option for the FieldTransferPolicy attribute: “transferAllWithNamespace”. This new option mirror transfers all of the Fields from the provider to the acceptor component by automatically placing them into the namespace of the provider component.
  • The ESMF_StateReconcile() method has been fundamentally re-implemented. This method plays a central role during the setup of multi-component coupled systems. Many users, particularly under NUOPC, do not directly interact with this ESMF method. However, the generic NUOPC Connector uses it heavily during the NUOPC initialization process. Prior to ESMF 8.8.0, the implementation of ESMF_StateReconcile() leveraged all-to-all communications, leading to memory usage and execution times that increased quadratically with the number of PETs across which the method was called. The new implementation completely avoids all-to-all communications, showing memory usage and execution times that increase close to the theoretically expected logarithmic scaling with the number of participating PETs.
  • The ESMF_StateRead(), ESMF_StateWrite(), ESMF_StateWriteRestart(), and ESMF_StateReadRestart() methods have been removed from the ESMF API. The implementation of these methods was very problematic and it was decided to remove them to prevent users from employing them in their code. If you do need them, please contact ESMF support (esmf_support@ucar.edu) and we can help find a solution.
  • Added the capability to reset an ESMF Field back to a less complete field status. This new functionality allows a Field to be repeatedly stripped down and then built up again with different features (e.g. with a different typekind). This capability is provided via the new method ESMF_FieldEmptyReset().
  • The ESMF Geom class was further extended to include equality and inequality operators, as well as the ESMF_GeomMatch() method. The latter method outputs different levels of comparison between Geom objects.
  • Added support for creating a RouteHandle representing the transpose of a regrid weight matrix. This new RouteHandle is available through the ESMF_FieldRegridStore() interface.
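To illustrate what the transpose of a regrid weight matrix means, here is a minimal Python sketch that stores a sparse weight matrix as (destination index, source index, weight) triplets and applies either the matrix or its transpose. The triplet layout, the function names, and the sample weights are assumptions made for illustration; they are not the ESMF data structures or API.

```python
def apply_weights(weights, src, n_dst):
    """Compute dst = W @ src for W given as (i_dst, i_src, w) triplets."""
    dst = [0.0] * n_dst
    for i_dst, i_src, w in weights:
        dst[i_dst] += w * src[i_src]
    return dst


def apply_transpose(weights, dst_vals, n_src):
    """Compute src = W^T @ dst_vals by swapping the roles of the indices."""
    src = [0.0] * n_src
    for i_dst, i_src, w in weights:
        src[i_src] += w * dst_vals[i_dst]
    return src


# Hypothetical 2x3 weight matrix: 3 source cells mapped onto 2 destination cells.
weights = [(0, 0, 0.5), (0, 1, 0.5), (1, 1, 0.25), (1, 2, 0.75)]

forward = apply_weights(weights, [1.0, 2.0, 4.0], n_dst=2)
transposed = apply_transpose(weights, [1.0, 1.0], n_src=3)
print(forward, transposed)
```

Note that applying the transpose is generally not the same as regridding in the reverse direction; it only reuses the same weights with source and destination roles exchanged.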

Bug Fixes:

  • Fixed an issue where, with some compilers, Time Manager methods like ESMF_TimeGet() and ESMF_TimePrint() returned garbled time strings.
  • Fixed an issue where some sequences of regrid weight generation calls (e.g. ESMF_FieldRegridStore()) on a Field containing a Mesh can result in mask values from a previous call being used in a following call. For example, doing a bilinear regridding on a Field created on a Mesh on the ESMF_MESHLOC_ELEMENT location with one set of mask values followed by a conservative regridding on the same Mesh with a different set of mask values can lead to some wrong mask values being used during the second regridding.

Known Issues

Platform-specific issues:

  • Running ESMF applications across more than 1000 MPI tasks on InfiniBand-based clusters with Intel MPI has been found to require setting the environment variable FI_MLX_INJECT_LIMIT=0. This recommendation was made by Intel support staff when looking at larger ESMF applications that would hang inside MPI calls on such hardware for no apparent reason.
  • Compiling ESMF with GCC version 8.1.0 triggers an internal compiler error in ESMF_HConfig.F90 due to the use of allocatable character variables. Earlier and later versions of GCC do not have this issue.
  • On Darwin, with version 15 of the clang C compiler, when building under Rosetta, it is sometimes necessary to add “-Wl,-ld_classic” to environment variables ESMF_CXXLINKOPTS, ESMF_CLINKOPTS, and ESMF_F90LINKOPTS to work around link errors. (For more details, see the related GitHub issue.)
  • The Cray (cce) compiler currently has problems running PIO with mpiuni, at least for some versions of this compiler (tested with version 15.0.1).
  • There is an issue with intercepting the MPI calls for profiling on one of the supported platforms (hearhear: Darwin+gfortran+mpich set up via spack). This results in a single FAIL reported for ESMF_TraceMPIUTest.F90.

Documentation

Tables

Climate Change - Earth and Climate Modeling - Fortran
Published by theurich 4 months ago

ESMF - ESMF 8.7.0

Overview

The 8.7.0 release of ESMF is backward compatible. Although there are some API changes to add more options for users, this release requires no user code changes. ESMF 8.7.0 introduces 16 feature enhancements and 6 bug fixes in various areas. Notably, ESMPy now has options to associate a node mask with an ESMPy Mesh and to utilize endFlag when finalizing ESMPy. Several improvements were made in time management, such as support for the ISO 8601 time interval format and a repeat capability for ESMF Clocks. This release also includes performance enhancements, for example to the routeHandle calculation. In addition, ESMF 8.7.0 includes more user options such as a new cubed sphere coordinate calculation method and new methods and documentation to facilitate the use of the new HConfig class in application code. To demonstrate the capability to couple components in languages other than Fortran, new NUOPC application prototypes were added to showcase basic C API support through ESMX and NUOPC. Finally, starting with this release, all pull requests into develop will be automatically tested using a combination of settings.
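For readers unfamiliar with the ISO 8601 time interval format, the sketch below parses a small subset of it (whole days, hours, minutes, and seconds, e.g. "P1DT6H") into a Python timedelta. It only illustrates the string format; which parts of ISO 8601 ESMF accepts is not stated here, and the function name parse_iso8601_duration is hypothetical.

```python
import re
from datetime import timedelta

# Subset of ISO 8601 durations: P[nD][T[nH][nM][nS]] with integer values only.
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_iso8601_duration(text):
    """Parse strings such as 'P1DT6H' or 'PT30M' into a timedelta."""
    match = _DURATION.match(text)
    if match is None or text.endswith("T"):
        raise ValueError(f"unsupported duration: {text!r}")
    parts = {k: int(v) for k, v in match.groupdict().items() if v is not None}
    if not parts:
        # Bare 'P' has no components and is not a valid duration.
        raise ValueError(f"empty duration: {text!r}")
    return timedelta(**parts)


print(parse_iso8601_duration("P1DT6H"))
print(parse_iso8601_duration("PT30M"))
```

Years and months are deliberately left out of this sketch because their length depends on the calendar.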

The ESMF team is grateful for the many years of support, ideation, feedback, testing, and code contributions that the ESMF user community has provided since its inception. We look forward to continuing this partnership with the community to further improve and add features to ESMF.

Release Notes

  • The public Fortran API in this release is backward compatible with the previous release of ESMF, 8.6.0, and patch release 8.6.1. There were some API changes, none of which require user code changes. The complete list of API changes is summarized in a table showing interface changes since ESMF 8.6.0. The table includes the rationale and impact for each change.
  • No bit-for-bit changes are expected for this release compared to ESMF 8.6.0 or 8.6.1.
  • No changes were made to the ESMF regrid weight generation methods and applications. The ESMF tables summarizing the ESMF regridding status are unchanged compared to ESMF 8.6.1.
  • All of the fixes and improvements released with patch 8.6.1 are included in release 8.7.0.
  • Starting with this release all pull requests into develop will be automatically tested using a combination of settings including: BOPT=O/g, COMPILER=gfortran/clang, COMM=mpiuni/mpich/openmpi, and TESTEXHAUSTIVE=ON/OFF. These tests can be manually executed on any fork/branch by going to Development Tests in the Actions tab on GitHub.
  • Added ESMF_StateLog(), ESMF_FieldLog(), ESMF_ArrayBundleLog(), ESMF_ArrayLog(), and ESMF_DistGridLog() methods. These methods allow inspection of ESMF objects from the user level, which can be useful during code development and debugging. ESMF object logging is also leveraged by the standard NUOPC Verbosity attributes of the generic NUOPC components.
  • Added ESMF_HConfigLog() and ESMF_HConfigMatch() methods to facilitate ESMF_HConfig object usage in application code. New sections with code examples were added to the reference manual discussing comparison of ESMF_HConfig objects, and usage of quoted strings and the Norway problem.
  • Added the capability to set an ESMF_TimeInterval object from a string in ISO 8601 time interval format. This allows a user familiar with that format to more intuitively set a time interval in ESMF.
  • Added a new repeat capability to the ESMF Clock. This capability can be enabled by setting the repeatDuration argument when creating a Clock. When enabled, the clock stays within the range from the clock’s startTime to its startTime+repeatDuration. For example, if the current time goes past startTime+repeatDuration while advancing, it loops back to startTime and continues from there. ESMF Alarms also work with this capability. One example of when this capability can be useful is when spinning up a model.
  • An enhancement was made to ESMPy to allow associating a node mask with an ESMPy Mesh. The Mesh.add_nodes method now accepts a keyword, node_mask, through which a user can pass a mask array of length n_nodes. Previously it was only possible to associate a mask with a mesh's elements.
  • Allow setting the endFlag when finalizing ESMPy. In particular, this allows the user to control whether the MPI library is finalized along with ESMPy or is still available afterwards.
  • Builds using mpiuni (the stub, single-processor MPI library bundled with ESMF) now support ESMF_PIO=internal, allowing full I/O with mpiuni. This is particularly useful for ESMPy.
  • The NUOPC Connection Options now support remapMethod “conserve_2nd” and extrapMethod “nearest_d”.
  • New NUOPC application prototypes were added to showcase basic C API support through ESMX and NUOPC. This approach leverages components in dynamically loaded objects.
  • A new NUOPC application prototype was added to demonstrate coupling a component written in Julia, with a cap in C.
  • A more scalable method of calculating cubed sphere coordinates was added. This method can be selected using the new coordCalcFlag argument added to the ESMF_GridCreateCubedSphere interfaces. Setting the argument to ESMF_CUBEDSPHERECALC_LOCAL selects the new method.
  • The routeHandle calculation was improved to be faster when mapping data locations to their eventual PET destinations. This improvement is expected to increase scalability, especially when redistributing data from a small number of PETs to a large number.
  • An internal data member in the ESMF_Array class, with a size that scales with the total number of PETs (i.e. MPI tasks) across which the Array was created, has been eliminated. This improvement dramatically reduces the memory requirement for applications that have a large number of Arrays or Fields and run on large numbers of PETs.
  • During ‘make install’ the FindESMF.cmake file is copied into the ESMF_INSTALL_PREFIX/cmake directory. The default location can be overridden by setting an environment variable, ESMF_INSTALL_CMAKEDIR. This variable can be specified as an absolute path (starting with "/") or relative to ESMF_INSTALL_PREFIX.
  • The ESMF capability to access components through dynamically loaded objects was made more portable by introducing wildcard support for the suffix of specified shared objects.
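The Clock repeat behavior described in the list above can be pictured as wrap-around arithmetic on the time offset from startTime. The following Python sketch uses the standard datetime module rather than ESMF; the function name advance_repeating is hypothetical, and treating a time exactly equal to startTime+repeatDuration as already wrapped is an assumption of this sketch, not a documented ESMF rule.

```python
from datetime import datetime, timedelta


def advance_repeating(current, step, start, repeat_duration):
    """Advance current by step, wrapping into [start, start + repeat_duration)."""
    offset = (current + step - start) % repeat_duration
    return start + offset


start = datetime(2000, 1, 1)
repeat_duration = timedelta(days=1)
step = timedelta(hours=6)

# Step a hypothetical clock six times; it never leaves the first day.
current = start + timedelta(hours=3)
history = []
for _ in range(6):
    current = advance_repeating(current, step, start, repeat_duration)
    history.append(current)
print(history[-1])
```

This is the kind of loop that makes the capability useful for spin-up runs, where the same forcing period is cycled repeatedly.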

Bug Fixes:

  • A fix was made in ESMPy to permit the creation of Grids where the product of the dimensions exceeds approximately 268,000,000.
  • The build of libesmftrace_preload.dylib on some Darwin (Mac) systems with OpenMPI has been fixed. (Incorrect link flags were being used in this build.)
  • A fix was made to ESMPy to allow the creation of 3D Meshes. Previously creating a 3D Mesh in ESMPy would result in an error.
  • The regridding system was fixed so that the patch regrid method (ESMF_REGRIDMETHOD_PATCH) can be used on a source Field built on a Mesh where the PET list of the component that contains the destination Field has some PETs that aren’t in the source Field’s component’s PET list. Previously this condition would have resulted in an error.
  • A fix was made to ESMX to add link_module_paths to the CMAKE_MODULE_PATH list.
  • A fix was made to ESMX to support using variables that are defined by setting link_packages in esmxBuild.yaml files.

Known Issues

Platform-specific issues:

  • Compiling ESMF with GCC version 8.1.0 triggers an internal compiler error in ESMF_HConfig.F90 due to the use of allocatable character variables. Earlier and later versions of GCC do not have this issue.
  • On Darwin, with version 15 of the clang C compiler, when building under Rosetta, it is sometimes necessary to add "-Wl,-ld_classic" to environment variables ESMF_CXXLINKOPTS, ESMF_CLINKOPTS, and ESMF_F90LINKOPTS to work around link errors. (For more details, see the related GitHub issue.)
  • The Cray (cce) compiler currently has problems running PIO with mpiuni, at least for some versions of this compiler (tested with version 15.0.1).

Documentation

Tables

Published by anntsay 7 months ago

ESMF - ESMF 8.6.1

Overview

The 8.6.1 release of ESMF is a patch release to the previously released 8.6.0, introducing 2 minor feature enhancements and 14 fixes in various areas such as ESMF Config, HConfig, and ESMF Field.

Due to user requests, ESMF Config was enhanced to remove the single line limitation of 1024 characters. In addition, the ESMF library target within the FindESMF module has been changed to conform to the imported library standard. Both feature enhancements allow more flexibility and usability.

Several corrections were introduced. They cover the index space information returned by ESMF_FieldGet(), edge-case handling of the start_index attribute in ESMFMesh unstructured mesh files, the interpretation of ESMF_RUNTIME_ environment variable keys by ESMF_Initialize(), the ESMF_NVML=OFF build option, building ESMF with ESMF_NO_GETHOSTID, the equality operators of ESMF_HConfig and ESMF_HConfigIter, and the export of ISO_C_BINDING symbols via the ESMF Fortran module.

Incremental progress was also made in several other areas of the framework. In ESMF_Config, several enhancements and fixes were made to cover more edge cases such as dealing with string length, special characters, and blank values.

Note: ESMF_Config will be deprecated in a future major release for the preferred ESMF_HConfig class, which supports YAML coded configuration files. Deprecation has not been scheduled at this time due to the high impact it will have on existing configuration files.

Release Notes

  • This patch release is backward compatible with ESMF 8.6.0. There were no API changes in this release.
  • No bit-for-bit changes are expected for this release compared to ESMF 8.6.0.
  • No changes were made to the ESMF regrid weight generation methods and applications. The ESMF tables summarizing the ESMF regridding status are unchanged.
  • The index space information (minIndex, maxIndex, elementCount) returned by method ESMF_FieldGet() is now correct for a Field on any ESMF_STAGGERLOC (Grid) or ESMF_MESHLOC (Mesh) location.
  • The ESMF_Config class has been enhanced by removing the 1024 character limit. It is now possible to set strings to values longer than 1024 characters. The maximum size is determined by the ESMF_Config buffer size, which is 256k.
  • Multiple bugs in the implementation of ESMF_Config have been fixed.
    • A bug has been fixed that prevented changing values to a different string length.
    • It is now possible to enclose number sign (#) characters within single or double quotes in order to include this character in string values.
    • It is now possible to enclose apostrophe / single quote (') characters within double quotes in order to include this character in string values.
    • It is now possible to enclose double quote (") characters within single quotes in order to include this character in string values.
    • It is now possible to set values to blank and change blank values to non-blank values.
    • Note: ESMF_Config will be deprecated in a future major release for the preferred ESMF_HConfig class, which supports YAML coded configuration files. Deprecation has not been scheduled at this time due to the high impact it will have on existing configuration files.
  • The ESMF library target within the FindESMF module for CMake has been changed to ‘ESMF::ESMF’. This conforms to the imported library standard as seen in modules provided by Kitware. For backwards compatibility ‘ESMF’ has been added as an alias target.
  • The issue of inadvertently exporting ISO_C_BINDING symbols via the ESMF Fortran module has been addressed.
  • ESMF_RUNTIME_ environment variable keys that are set on the top level of the configuration file ingested by the ESMF_Initialize() method are now correctly interpreted.
  • The ESMF_NVML=OFF build option now works correctly as documented.
  • The build of libesmftrace_preload.dylib on some Darwin (Mac) systems with OpenMPI has been fixed. (Incorrect link flags were being used in this build.)
  • The problem of building ESMF with macro ESMF_NO_GETHOSTID defined has been fixed.
  • The ESMFMESH unstructured mesh file format now correctly begins the connection index calculation at the start_index value when the start_index attribute is attached to the elementConn variable. Previously, beginning in ESMF 8.3.0, this attribute had been incorrectly ignored.
  • The ESMF_HConfig and ESMF_HConfigIter equality operators have been fixed. Previously, on some platforms, identical objects could be detected as not equal which could lead to problems with iterations not terminating correctly, etc.
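The quoting rules listed above for ESMF_Config can be illustrated with a small quote-aware scanner. This Python sketch is not the ESMF_Config parser; it assumes that an unquoted number sign starts a comment, and the function name strip_comment and the sample lines are hypothetical.

```python
def strip_comment(line):
    """Drop a trailing '#' comment, ignoring '#' characters inside quotes."""
    quote = None  # the quote character we are currently inside, if any
    for pos, char in enumerate(line):
        if quote is None:
            if char in ("'", '"'):
                quote = char          # entering a quoted string
            elif char == "#":
                return line[:pos].rstrip()
        elif char == quote:
            quote = None              # leaving the quoted string
    return line.rstrip()


# A '#' inside quotes is kept; the unquoted one ends the value.
print(strip_comment('tag: "run#42"   # trailing note'))
# A single quote inside double quotes does not end the string.
print(strip_comment('title: "model\'s output"'))
# A double quote inside single quotes is kept as well.
print(strip_comment("label: 'the \"fast\" option'"))
```

The key design point is tracking which quote character opened the string, so that the other quote character and the number sign are treated as ordinary text until the matching quote closes it.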

Known Issues

  • Same as 8.6.0.

Documentation

Tables

Published by oehmke 12 months ago

ESMF - ESMF 8.6.0

Overview

The 8.6.0 release of ESMF introduces two major new features: spherical vector regridding and integrated accelerator device management. Both features are provided at an early stage of the implementation, expected to be fleshed out over the course of the next several releases. In addition to these new developments, the release includes a number of improvements of existing functionality. Highlights of the 8.6.0 release are outlined in the following paragraphs, followed by a comprehensive list of release notes.

Support for spherical vector regridding has been requested by a number of groups over the years, and a basic implementation of the feature is now available in ESMF 8.6.0. One of the advantages of mapping vectors through 3D Cartesian space, as opposed to regridding the individual components separately, is improved accuracy of the results, especially in the polar regions. The initial implementation of this feature included with 8.6.0 is limited to 2D tangential vectors (expressed in terms of east and north components) on a spherical geometry (e.g. an ESMF_Grid with ESMF_COORDSYS_SPH_DEG) and requires that the vector components are stored in a single ungridded Field dimension of size 2. We expect these and other restrictions to be relaxed in future releases and welcome feedback from the community to help prioritize.
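To make the idea of mapping vectors through 3D Cartesian space concrete, the sketch below converts a tangential vector given by east and north components at a longitude/latitude point into Cartesian (x, y, z) components and back, using the standard local unit vectors on a sphere. This is textbook geometry offered as an illustration; it is not taken from the ESMF implementation, and the function names are hypothetical.

```python
import math


def _basis(lon_deg, lat_deg):
    """Return the local unit east and north vectors in Cartesian coordinates."""
    lon, lat = math.radians(lon_deg), math.radians(lat_deg)
    e_hat = (-math.sin(lon), math.cos(lon), 0.0)
    n_hat = (-math.sin(lat) * math.cos(lon),
             -math.sin(lat) * math.sin(lon),
             math.cos(lat))
    return e_hat, n_hat


def tangent_to_cartesian(lon_deg, lat_deg, east, north):
    """Express an (east, north) tangential vector as Cartesian (x, y, z)."""
    e_hat, n_hat = _basis(lon_deg, lat_deg)
    return tuple(east * e + north * n for e, n in zip(e_hat, n_hat))


def cartesian_to_tangent(lon_deg, lat_deg, vec):
    """Project a Cartesian vector onto the local east and north directions."""
    e_hat, n_hat = _basis(lon_deg, lat_deg)
    east = sum(v * e for v, e in zip(vec, e_hat))
    north = sum(v * n for v, n in zip(vec, n_hat))
    return east, north


# The same Cartesian vector has very different (east, north) components
# at two points close to the pole on opposite sides of it.
vec = tangent_to_cartesian(0.0, 89.0, 0.0, 1.0)
print(cartesian_to_tangent(0.0, 89.0, vec))
print(cartesian_to_tangent(180.0, 89.0, vec))
```

Because the east and north directions rotate rapidly with longitude near the poles, interpolating the two components as independent scalars mixes values expressed in different local bases, whereas the Cartesian components share one fixed basis.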

Accelerator device solutions that are capable of managing multi-component applications are likely to become critical as coupled systems are run on hardware that provides most of the compute power in the form of GPUs. ESMF 8.6.0 provides novel development in this area, building on the experience gained from CPU resource management for multi-component ESMF applications based on the petList concept, and its recent extension to ESMF-managed threading. The user now has the option to associate an analogous list for accelerator devices (devList) with each component. ESMF implements the required bookkeeping, providing component context specific device information. Paired with standard accelerator programming paradigms, such as OpenACC or OpenMP, this allows component code to target the desired devices for offloading. The current implementation is considered a proof of concept, and refinements of the feature are expected in future releases.

Among the areas of the framework that received incremental improvements is ESMX, where it is now possible to modify ESMF run-time behavior via ESMF_RUNTIME_* options in the ESMX run-time configuration file for convenience. Small functional improvements also went into the Array and Field Create() methods that now allow creation from the slice of an existing object. Handling this situation on the ESMF level can significantly simplify user code.

Further progress was made for multi-tile I/O support for Fields and Arrays. It is now possible to perform I/O for multi-tile Arrays and Fields with layouts other than 1 DE per PET. Related to I/O, the "normalization" attribute in mapping files written by the ESMF regrid weight generation methods and applications was corrected to set the value “N/A” for anything other than the conservative regridding methods.

Incremental progress, bug fixes, portability and performance improvements were made in several other areas of the framework. This includes full support for the NVHPC compiler suite, tracing support for MPI calls on all platforms, resolving crashes observed with the Darshan and Cray Performance Analysis Tools, and finally addition of a public C API for the ESMF trace feature. Please see the release notes below for a detailed list of changes.

Release Notes

  • The public Fortran API in this release is backward compatible with the previous release, ESMF 8.5.0. There were a few API changes, none of which require user code changes. The complete list of API changes is summarized in a table showing interface changes since ESMF 8.5.0. The table includes the rationale and impact for each change.
  • No bit-for-bit changes were observed for this release compared to ESMF 8.5.0. This is based on test runs with the Intel compilers using options "-O2 -fp-model precise". However, changes were made to the implementation of the conservative weight calculation to remove a problem caused by the same calculation being optimized differently in two places in the code. Bit-for-bit changes compared to ESMF 8.5.0 from this change are possible when using regridding methods ESMF_REGRIDMETHOD_CONSERVE or ESMF_REGRIDMETHOD_CONSERVE_2ND and not using strict floating point compiler options.
  • No changes that affect the status of existing regridding methods were made to the ESMF regrid weight generation methods and applications. However, the tables summarizing the ESMF regridding status were extended to cover the spherical vector regridding option added in this release.
  • Keys to match all of the ESMF_RUNTIME_* environment variables were added to the App Options of the ESMX run configuration. This feature provides a convenient way to control the run-time behavior of the ESMX executable from its standard configuration file.
  • Basic accelerator device management for multi-component applications was implemented. This feature allows the user to assign accelerator devices via the “devList” option to individual components. “devList” is available as a new argument to ESMF_GridCompCreate() and ESMF_CplCompCreate(), and is also accessible on the ESMX level as a Component Label Option. Once specified, ESMF handles the device bookkeeping, and provides the user with context specific device information through the ESMF_VMGet() API. This feature can be leveraged in connection with popular accelerator programming paradigms like OpenACC, OpenMP, or standard language approaches to offload component code to the desired devices.
  • Basic support for vector regridding was added to the ESMF_FieldRegridStore() method. The initial implementation of this feature is limited to 2D tangential vectors (expressed in terms of east and north components) on a spherical geometry (e.g. an ESMF_Grid with ESMF_COORDSYS_SPH_DEG) and requires that the vector components are stored in an ungridded Field dimension. The advantage of using this capability, which maps vectors through 3D Cartesian space, instead of regridding both components separately, is that it provides more accurate results, particularly in the polar regions.
  • The "normalization" attribute in mapping files written by the ESMF regrid weight generation methods and applications was changed to now correctly set “N/A” for anything other than the conservative regridding methods. User code that accesses the “normalization” attribute in ESMF mapping files might need to be adjusted accordingly.
  • A new entry point was added to the ESMF_FieldCreate() interface allowing creation from an existing Field object. The interface supports slicing with respect to trailing ungridded dimensions.
  • The ESMF_ArrayCreate() entry point that allows creation from an existing Array object was extended to support slicing with respect to trailing undistributed dimensions.
  • The Read and Write operations for multi-tile Arrays and Fields (e.g., for representing a cubed sphere grid as a six-tile grid) have been extended to permit I/O for Arrays / Fields with layouts other than 1 DE per PET. This change applies to the ESMF_ArrayRead(), ESMF_ArrayWrite(), ESMF_ArrayBundleRead(), ESMF_ArrayBundleWrite(), ESMF_FieldRead(), ESMF_FieldWrite(), ESMF_FieldBundleRead(), and ESMF_FieldBundleWrite() methods. The remaining limitations of the implementation are discussed in the Restrictions and Future Work section of the I/O Capability in the reference manual.
  • A basic ESMF C API was added to provide access to the ESMF tracing and profiling capability from code written in C.
  • Crashes that were observed with ESMF applications during MPI_Finalize() with the Darshan or Cray Performance Analysis Tool active have been resolved. (The issue was traced back to an interaction with the MPI Tools interface that used to be initialized during ESMF_Initialize(). ESMF no longer initializes the MPI Tools interface by default to avoid this problem.)
  • The issues noted in previous releases with regard to using the ESMF tracing and profiling feature for MPI calls have been resolved. This capability is now fully functional on all the supported platforms.
  • NVIDIA's NVHPC compiler suite is now fully supported by ESMF for both ESMF_OS=Linux and ESMF_OS=Unicos when setting ESMF_COMPILER=nvhpc. This is particularly important for users interested in exploring ESMF accelerator device support on systems with NVIDIA GPUs.

Known Issues

  • Using vector regridding (vectorRegrid=.true.) through the ESMF_FieldRegridStore() interface in combination with either conservative regridding method (regridmethod=ESMF_REGRIDMETHOD_CONSERVE or regridmethod=ESMF_REGRIDMETHOD_CONSERVE_2ND) currently results in an error.
  • Attempting to write weight files from the ESMPy Regrid object when using filemode=FileMode.WITHAUX currently crashes.

Platform-specific issues:

  • Compiling ESMF with GCC version 8.1.0 triggers an internal compiler error in ESMF_HConfig.F90 due to the use of allocatable character variables. Earlier and later versions of GCC do not have this issue.
  • On Darwin, with version 15 of the clang C compiler, when building under Rosetta, it is sometimes necessary to add “-Wl,-ld_classic” to environment variables ESMF_CXXLINKOPTS, ESMF_CLINKOPTS, and ESMF_F90LINKOPTS to work around link errors. (For more details, see the related GitHub issue.)

Documentation

Tables

Published by theurich over 1 year ago

ESMF - ESMF 8.5.0

Overview

The 8.5.0 release of ESMF comes with a few big new developments as well as improvements of existing functionality. The highlights of the 8.5.0 release are outlined in the following paragraphs, followed by a detailed list of release notes.

One major new development available with 8.5.0 is the addition of a hierarchical configuration class: ESMF_HConfig. This class provides YAML 1.2 support through the ESMF Fortran API. The ESMF_HConfig class integrates with the existing configuration class ESMF_Config for backward compatibility, and allows ESMF user code to seamlessly access, modify, and create information in YAML format. This new capability is leveraged within ESMF and NUOPC to provide a YAML alternative when specifying ESMF_Initialize() parameters, NUOPC attribute, petList, or run sequence information.

Another area that has seen a lot of new development is the Earth System Model eXecutable (ESMX) layer. ESMX has greatly matured in many areas since it was first introduced in the previous ESMF release and we believe it is now ready to be used by early adopters for “real world” applications. To this end, several new build options were added to the ESMX build configuration, and component building is now an integrated feature. The new ESMX_Builder command line tool eliminates the need for direct user interaction with the CMake system and improves the overall usability.

ESMF 8.5.0 adds a generic geometry class (ESMF_Geom) to support user code that deals generically with the existing geometry classes: ESMF_Grid, ESMF_Mesh, ESMF_LocStream, and ESMF_XGrid. Using the ESMF_Geom type, code will function regardless of the underlying geometry class. In this release only a few ESMF operations, such as Field creation, are overloaded to support the generic Geom type. Additional support will be added in future releases as the need arises.

Progress was also made in the area of multi-tile I/O support for Fields and Arrays, eliminating one of the restrictions of the previous release. Multi-tile Fields with ungridded dimensions, and multi-tile Arrays with undistributed dimensions are now fully supported.

With respect to installation and portability of the library, support for Spack and Docker continues to be an important focus. ESMF core team members are now maintainers of the official ESMF Spack package, and automated daily testing ensures continued support of this build option. Docker images are available and maintained under the ESMF organization on Docker Hub.

Incremental progress, bug fixes, and performance improvements were made in several other areas of the framework, including the ESMF_RegridWeightGen application, Mesh creation, Regridding, RouteHandle re-use, and others. Please see the release notes below for a comprehensive list of changes.

Release Notes

  • The public Fortran API in this release is backward compatible with the last release of ESMF, 8.4.0, and patch releases 8.4.1 and 8.4.2. There were a few minor API changes, most of which do not require user code changes. The exception is a change to the Read() and Write() methods for multi-tile Arrays or Fields, where the tile number placeholder character in the fileName argument was changed from “#” to “*”. A number of new interfaces were added. The complete list of API changes is summarized in a table showing interface changes since ESMF 8.4.0, including the rationale and impact for each change.
  • Some bit-for-bit changes are expected for this release compared to ESMF 8.4.0, and patch releases 8.4.1 and 8.4.2. This is based on test runs with the Intel compiler suite using options “-O2 -fp-model precise”, and due to the following changes:
    • Fixed a problem in the first-order conservative weight calculation that could occur in some situations when a destination cell is a concave quadrilateral. The fix was made to both the 2D Cartesian and 2D spherical conservative weight calculations. Expected bit-for-bit changes:
      • Small changes to first-order conservative weights for destination cells that are concave. These changes can occur for either 2D Cartesian or 2D spherical Grids or Meshes.
    • Updated the ESMF cubed sphere coordinate generation algorithm to use R8 reals. This makes the algorithm more accurate and also aligns it better with the same algorithm used by NASA. Expected bit-for-bit changes:
      • Small changes to the coordinate values produced by ESMF_GridCreateCubedSphere() or ESMF_MeshCreateCubedSphere(), and to anything that depends on those values (e.g. regridding weights to/from an ESMF-generated cubed sphere Grid).
  • No changes that affect regridding status were made to the ESMF regrid weight generation methods and applications. The ESMF tables summarizing the ESMF regridding status are unchanged.
  • All of the fixes and improvements released with patches 8.4.1 and 8.4.2 are included in release 8.5.0.
  • Significant improvements have been made to the Earth System Model eXecutable (ESMX) layer:
    • Component building has been integrated into the ESMX build system.
    • The build configuration has been enhanced through new build options.
    • The ESMX_Builder script has been added to simplify building ESMX applications.
    • The ESMX run configuration file has been updated to YAML formatting.
    • The names of the ESMX build and run configuration files can be customized.
    • The ESMX Data Component, which allows users to define import/export fields and prescribe data, has been added to the ESMX Layer.
  • A hierarchical configuration class (ESMF_HConfig) with Fortran API was implemented. ESMF_HConfig covers the YAML 1.2 standard and integrates with the existing configuration class ESMF_Config for backward compatibility.
  • The ESMF_Initialize() method now supports reading startup configurations from a YAML file.
  • The ESMF_InfoSet() interface was overloaded to allow setting ESMF_Info key value pairs from an ESMF_HConfig object.
  • The NUOPC methods that implement ingestion of PetList, RunSequence and Attribute information have been overloaded to support the YAML format through the ESMF_HConfig class.
  • A new named constant, ESMF_STATEINTENT_INTERNAL, for type(ESMF_StateIntent_Flag) was added. This new option allows a state to be marked exclusively for component internal access.
  • A generic geometry class (ESMF_Geom) was added to ESMF. A Geom object can be created from any object of the existing geometry classes (ESMF_Grid, ESMF_Mesh, ESMF_LocStream, and ESMF_XGrid). Having this class allows a user to pass a generic geometry object through a coupled system without knowing its exact type. To support greater flexibility in coupled systems, some ESMF operations are now supported on Geom (e.g. Field creation). More will be added as the need arises.
  • Creating a Mesh, or adding nodes to an existing Mesh, without specifying the nodeOwners argument now works, even if there are PETs with zero nodes. Previously this condition led to a hang inside the respective ESMF method.
  • Creating a Mesh without specifying node information now works, even if there are PETs with zero elements. Previously this condition led to a hang inside the respective ESMF method.
  • Grids that contain DEs of zero width are now supported in regridding. Previously using a Grid of this type with some regridding methods (e.g. ESMF_REGRIDMETHOD_CONSERVE) would result in an error.
  • The srcTermProcessing argument was added to the version of ESMF_FieldRegridStore() that operates on an XGrid. This fixes the issue where in some cases bit-for-bit reproducibility wasn’t available in that version of ESMF_FieldRegridStore().
  • The RouteHandle reuse optimization in ESMF_FieldBundleRegridStore() was extended to include the Grid-to/from-Mesh and Mesh-to/from-Mesh combinations. This optimization is leveraged by the NUOPC_Connector and results in a significant reduction of cost (both time and memory) during model initialization when Fields on Meshes are present in the import- and/or exportStates.
  • All of the RedistStore() methods that take the optional srcToDstTransposeMap argument now support negative entries. The order of elements along each negative dimension entry is reversed.
  • Two new dataFillScheme options, “nan” and “snan”, have been added to the ESMF_FieldFill() method. These options fill a Field with IEEE quiet and signaling NaN ("Not a Number") values, respectively.
  • An optional logical flag (isESMFAllocated) was added to the ESMF_FieldGet(), ESMF_ArrayGet(), and ESMF_LocalArrayGet() methods to allow the user to query whether the respective data allocation is held by ESMF.
  • A bug in the auto-detection of file type in the ESMF_RegridWeightGen application was fixed. Previously SCRIP grid files could sometimes be incorrectly identified as GRIDSPEC files.
  • The -t, --src_type, and --dst_type arguments of the ESMF_RegridWeightGen application, which allow grid file types to be specified explicitly, have been re-enabled.
  • The Read and Write operations for multi-tile Arrays and Fields (e.g., for representing a cubed sphere grid as a six-tile grid) have been extended to permit I/O for Arrays / Fields with undistributed / ungridded dimensions. This change applies to the ESMF_ArrayRead(), ESMF_ArrayWrite(), ESMF_ArrayBundleRead(), ESMF_ArrayBundleWrite(), ESMF_FieldRead(), ESMF_FieldWrite(), ESMF_FieldBundleRead(), and ESMF_FieldBundleWrite() methods.
  • For Read() and Write() operations for multi-tile Arrays and Fields (e.g., for representing a cubed sphere grid as a six-tile grid), the file name placeholder character in the fileName string that stands for the tile number has been changed from “#” to “*”. This change was needed to support obtaining the fileName string from an ESMF Config file, where the “#” character is interpreted as the start of a comment.
  • Creating a cubed sphere grid under ESMPy with corner stagger by calling ESMF.Grid() with add_corner_stagger=True now works correctly. Previously this condition would lead to a run-time error.
  • ESMF is available through Spack, a popular package manager for High Performance Computing. A user can build ESMF, and its dependencies, in a convenient and standard way using the Spack tools. ESMF core team members are now maintainers of the official ESMF Spack package. To ensure a robust integration, the ESMF build procedure through Spack is automatically tested using GitHub Actions.
  • Intel oneAPI compiler support was improved and additional test combinations were added. A new ESMF User’s Guide section discussing Intel compiler support was added to provide guidance with the classic to oneAPI transition.
  • Fujitsu compiler support was added under ESMF_OS=Linux. Use ESMF_COMPILER=fujitsu to access the option.
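
The motivation for the “#” to “*” placeholder change noted above can be illustrated with a small Python sketch. This is not ESMF code: the helper names strip_config_comment and tile_file_name, the "fileName:" label, and the file name pattern are all hypothetical, and the comment handling is a simplified stand-in for a parser that treats “#” as the start of a comment. The exact rules ESMF uses to construct tile-specific file names are given in its API documentation.

```python
def strip_config_comment(line: str) -> str:
    """Drop everything from the first '#' on, mimicking a config parser
    that treats '#' as the start of a comment (simplified assumption)."""
    return line.split("#", 1)[0].rstrip()


def tile_file_name(pattern: str, tile: int, placeholder: str = "*") -> str:
    """Substitute the tile number for the placeholder (hypothetical helper)."""
    return pattern.replace(placeholder, str(tile))


# With '#' as the placeholder, the pattern is cut off at the comment marker.
old_style = strip_config_comment("fileName: field.tile#.nc")
# With '*' as the placeholder, the pattern survives intact.
new_style = strip_config_comment("fileName: field.tile*.nc")

print(old_style)  # fileName: field.tile
print(new_style)  # fileName: field.tile*.nc

# Expanding the pattern for the six tiles of a cubed sphere.
print([tile_file_name("field.tile*.nc", t) for t in range(1, 7)])
```

The first print shows how a “#” placeholder would be lost before the file name ever reaches the I/O call, which is the situation the new “*” placeholder avoids.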

Known Issues

Platform-specific issues:

  • The GNU and Intel compilers require GCC>=4.8 for C++11 support (Intel uses the GCC headers). By default, ESMF uses the C++11 standard and cannot be downgraded. If you run into build issues due to the C++11 dependency, you must make sure a GCC>=4.8 is loaded.
  • Compiling ESMF with GCC version 8.1.0 triggers an internal compiler error in ESMF_HConfig.F90 due to the use of allocatable character variables. Earlier and later versions of GCC do not have this issue.
  • On Darwin, with the GNU gfortran+gcc combination, when building MPICH3 from source, it is important to specify the "--enable-two-level-namespace" configure option. By default, i.e. without this option, on Darwin, the produced MPICH compiler wrappers include a linker flag (-flat_namespace) that causes issues with C++ exception handling. Building and linking ESMF applications with MPICH compiler wrappers that specify this linker option leads to “mysterious” application aborts during execution.
  • On Darwin, with the Intel Fortran compiler, command line arguments cannot be accessed from ESMF applications when linked against the shared library version of libesmf. There is no issue when linked against the static libesmf.a version. Setting the environment variable ESMF_SHARED_LIB_BUILD=OFF, during the ESMF build, can be used as a workaround for this issue.
  • There is an issue with intercepting the MPI calls for profiling on some of the supported platforms. This results in a single FAIL reported for ESMF_TraceMPIUTest.F90. The affected platforms are:
    • Catania: Darwin+GNU+mpich
    • Green: Darwin+gfortranclang+mpich/openmpi
    • Gaea: Unicos+GNU+cray-mpich
  • There is an issue with loading the libesmftrace_preload.so library on some of the supported platforms. This results in a reported CRASH for ESMF_TraceIOUTest.F90 and ESMF_TraceMPIUTest.F90. The affected platforms are:
    • Discover: Linux+GNU+intelmpi
    • Gaea: Unicos+Intel+cray-mpich
    • Gaea: Unicos+Intel+mpiuni

Published by theurich almost 2 years ago

ESMF - ESMF 8.4.2

Overview

ESMF 8.4.2 is a patch release that fixes a few minor issues that were discovered in 8.4.1 after it was released. Unless a user experiences any of the issues listed below, there is no need to upgrade to this release from patch 8.4.1.

As a patch release, 8.4.2 does not introduce new features beyond 8.4.0 or 8.4.1. Applications that have already moved to ESMF 8.5.0 beta tags should not go to the 8.4.2 patch. For these cases an upgrade to 8.5.0, when officially released, is the recommended next step. All of the fixes that come with patch release 8.4.2 are also available in beta tag v8.5.0b20 and newer.

Release Notes

  • This patch release is backward compatible with ESMF 8.4.0 and 8.4.1.
  • No bit-for-bit changes are expected for this release compared to ESMF 8.4.0 and 8.4.1.
  • No changes were made to the ESMF regrid weight generation methods and applications. The ESMF tables summarizing the ESMF regridding status are unchanged.
  • The ESMF profiler mode (activated by setting environment variable ESMF_RUNTIME_PROFILE=ON) has been made compatible with user code that calls the standard ESMF component methods from within OpenMP threaded regions. This is a rare scenario. However, it can be encountered e.g. under NOAA’s Unified Forecast System (UFS) when enabling the threading option for NASA’s GOCART chemistry component.
  • A linking issue reported for the Darwin.gfortranclang.default build configuration, when building ESMF under the Spack package manager, has been fixed.

Known Issues

  • Same as 8.4.1.

Published by theurich about 2 years ago

ESMF - ESMF 8.4.1

Overview

ESMF 8.4.1 is a patch release that fixes a major issue that was discovered in 8.4.0 after it was released. The identified bug can lead to memory corruption problems, which are often hard to detect and diagnose by their nature. We strongly recommend upgrading applications that are currently using 8.4.0 to patch release 8.4.1.

As a patch release, 8.4.1 does not introduce new features beyond 8.4.0. Besides the major bug fix mentioned above, a number of smaller bug fixes and maintenance changes are included with the patch release. See the detailed release notes below. Applications that have already moved to ESMF 8.5.0 beta tags do not need to upgrade to the 8.4.1 patch. For these cases an upgrade to 8.5.0, when officially released, is the recommended next step. All of the fixes that come with patch release 8.4.1 are also available in beta tag v8.5.0b18 and newer.

Release Notes

  • This patch release is backward compatible with ESMF 8.4.0.
  • No bit-for-bit changes are expected for this release compared to ESMF 8.4.0.
  • No changes were made to the ESMF regrid weight generation methods and applications. The ESMF tables summarizing the ESMF regridding status are unchanged.
  • A bug in the implementation of method ESMF_FieldGet() was fixed. The problematic code was accessing the optional, intent(out) “name” argument without the proper present() check. As a consequence, code not specifying an actual “name” argument when making this call was at risk of memory corruption. Because the ESMF library itself calls ESMF_FieldGet() internally without passing the “name” argument, it must be assumed that all user code is at risk of memory corruption when using ESMF 8.4.0.
  • A problem of incorrect ESMPy project metadata was fixed. This issue was causing failures during installation when using Python’s setuptools module v67.1.0 or later: (“configuration error: project.maintainers[{data__maintainers_x}] must not contain {'author'} properties”). The fix allows ESMPy installation to work with the latest setuptools. We recommend this release for anyone doing a new installation of ESMPy.
  • The NUOPC_Driver code responsible for stepping components through the initialization protocol has been optimized to reduce the amount of synchronization needed between components. This allows for a greater level of concurrency between components during initialization, resulting in a reduction of execution time.
  • A number of code adjustments were made to support Intel’s LLVM-based oneAPI compiler suite under the common ESMF_COMPILER=intel setting. Classic and oneAPI Intel compiler flavors are now supported under the same setting.

Known Issues

  • Same as 8.4.0.

Published by theurich about 2 years ago

ESMF - ESMF 8.4.0

Overview

The 8.4.0 release of ESMF implements a number of exciting new features. Some are incremental in nature while others explore new territory. Highlights of the 8.4.0 release are outlined in the following paragraphs. A detailed list of release notes is provided further down.

ESMX, a new layer built on top of the ESMF and NUOPC APIs, is being introduced in this release. ESMX stands for Earth System Model eXecutable. The goals of ESMX are to (1) simplify standing up new NUOPC-based systems, (2) promote hierarchical model component testing, (3) reduce the cost of maintaining NUOPC-based modeling systems, (4) improve alignment and interoperability between different NUOPC-based systems, and (5) provide a fast and coordinated roll-out strategy for new ESMF/NUOPC features.

Exploration and experimentation with ESMX is strongly encouraged. However, please recognize that it is a new development effort that comes with the typical rough edges. For example, no formal reference manual is available yet. A good starting point to learn about the layer is the ESMX/README.md. We expect development of ESMX to continue throughout the next several release cycles.

Progress was made in the area of I/O. Multi-tile Arrays and Fields can now be used in Read and Write operations. A common use case is the six-tile representation of the cubed sphere grid. There are a number of restrictions that apply to the multi-tile I/O support made available in this release. Please see the associated release note below for more information. The plan is to relax restrictions in future releases.

On the ESMF object level a new concept of “Named Aliases” was introduced. This concept builds on the existing ESMF alias concept for deep objects (objects explicitly created and destroyed by Create() and Destroy() methods). Named aliases manage their own private name without affecting the name of the aliased object. This allows the same object to be known under different names inside of different components.
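
The difference between a regular alias and a named alias can be modeled with a short Python sketch. This is a conceptual illustration only, not the ESMF implementation or its Fortran API: the classes DeepObject, Alias, and NamedAlias and the example names are hypothetical. A regular alias is modeled as sharing the name of the underlying object, while a named alias keeps a private name.

```python
class DeepObject:
    """Stand-in for a deep ESMF object (hypothetical model, not ESMF code)."""

    def __init__(self, name):
        self.name = name


class Alias:
    """Regular alias: reads and writes the name of the underlying object."""

    def __init__(self, target):
        self._target = target

    @property
    def name(self):
        return self._target.name

    @name.setter
    def name(self, value):
        self._target.name = value


class NamedAlias(Alias):
    """Named alias: same underlying object, but with a private name."""

    def __init__(self, target, name=None):
        super().__init__(target)
        self._name = name if name is not None else target.name

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        self._name = value


field = DeepObject("sst")
alias = Alias(field)
named = NamedAlias(field, name="sea_surface_temperature")

alias.name = "sst_ocn"   # renaming through a regular alias renames the object
print(field.name, alias.name, named.name)

named.name = "sst_atm"   # renaming a named alias leaves the object untouched
print(field.name, named.name)
```

In this model both aliases still refer to the same underlying object; only the name lookup differs, which is what lets one object be known under different names in different components.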

Incremental progress was also made in the areas of XGrid, LocStream, and Grid. Here user feedback identified a number of issues that have been addressed in this release.

With the increasing importance of ARM based architectures, such as Apple’s M1, etc., work was done in this release to support the native development environment on Darwin (Mac OS X). The compiler combination GFortran + Clang is now considered fully supported on Darwin. The release has been fully tested on an Apple M1 system with MPIUNI, MPICH, and OpenMPI.

ESMPy users please note that the Python module has been renamed from "ESMF" to "esmpy" for better alignment with Python Enhancement Proposal (PEP) guidelines. Please check out the respective release note for details and potential impact on user code.
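
For scripts affected by the module rename, the search and replace can be sketched in Python as follows. This is an assumption-laden helper, not a tool shipped with ESMF: the function name rename_esmpy_module and the three patterns are hypothetical, and they only cover plain import statements and attribute access of the form ESMF.something. Any rewritten script should be reviewed by hand, since the patterns will also touch matching text inside comments or strings.

```python
import re

# Patterns cover the common import forms and attribute access on the module.
_RULES = [
    (re.compile(r"^([ \t]*)import ESMF\b", re.MULTILINE), r"\1import esmpy"),
    (re.compile(r"^([ \t]*)from ESMF\b", re.MULTILINE), r"\1from esmpy"),
    (re.compile(r"\bESMF\."), "esmpy."),
]


def rename_esmpy_module(source: str) -> str:
    """Rewrite references to the old 'ESMF' Python module name to 'esmpy'."""
    for pattern, replacement in _RULES:
        source = pattern.sub(replacement, source)
    return source


old_script = "import ESMF\nmgr = ESMF.Manager(debug=True)\n"
print(rename_esmpy_module(old_script))
```

Identifiers such as ESMF_Something are left alone, because the last pattern only matches the bare module name followed by a dot.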

Release Notes

  • The public Fortran API in this release is backward compatible with the last release, ESMF 8.3.0, and patch release ESMF 8.3.1. There were a few minor API changes, none of which require user code changes. The list of API changes is summarized in a table showing interface changes since ESMF 8.3.0, including the rationale and impact for each change.
  • The Python module name for ESMPy was changed in this release from “ESMF” to “esmpy” for better alignment with Python Enhancement Proposal (PEP) guidelines. This change will require user code changes! ESMPy users will have to do a global search and replace on their scripts to adjust to the module name change. Also notice that the internal ESMF source tree directory structure has changed, moving ./src/addon/ESMPy to ./src/addon/esmpy. This might affect anybody keeping an ESMF repository clone, and maintaining scripts that depend on the internal directory naming.
  • No bit-for-bit changes were observed for this release compared to release ESMF 8.3.0. This is based on test runs with the Intel compilers using options “-O2 -fp-model precise”.
  • No changes were made to the ESMF regrid weight generation methods and applications. The ESMF tables summarizing the ESMF regridding status are unchanged.
  • All of the fixes and improvements released with patch 8.3.1 are also included in release 8.4.0. Of particular note is support for messages above the 2GiB limit by the ESMF VM communication layer.
  • The Earth System Model eXecutable (ESMX) layer was added to ESMF. ESMX removes technical hurdles that impede hierarchical NUOPC model testing. It includes a unified executable capable of driving multiple configurations of model components: single component, one-way forced, and fully dynamic two-way coupled. A simple YAML file is used to list the component build dependencies (CMake based). A standard NUOPC run configuration file is used to specify the processor layout, run sequence, the field dictionary, and model component options. ESMX supports hierarchical testing by allowing decentralized testing at the component level before integration into larger coupling systems, as well as testing on the integration level.
  • The concept of “Named Aliases” has been introduced. Regular aliases of deep ESMF objects continue being created using a simple assignment operator. Changing the name of a regular alias affects the object and thus all other aliases. Named aliases are created using the new ESMF_NamedAlias() function. Named aliases manage their own private name. Changing the name of a named alias does not affect the object or other aliases.
  • Using a Mesh that contains one or more elements with greater than 4 sides when creating an XGrid is now supported. Previously doing so would result in an error when data was transferred into or out of the XGrid.
  • LocStreams can now be created from a Cartesian Mesh file. Previously the attempt to do so would result in an error.
  • Attempting to create a Grid from a GRIDSPEC mosaic file that doesn’t contain variables with standard_name set to "grid_tile_spec", "geographic_latitude", or "geographic_longitude" now triggers clear ERROR log messages before returning with an error. Previously no ERROR messages were logged, making this situation hard to debug.
  • Read and Write operations are now permitted for multi-tile Arrays and Fields. A common use case for this is representing a cubed sphere grid as a six-tile grid. This change applies to ESMF_ArrayRead(), ESMF_ArrayWrite(), ESMF_ArrayBundleRead(), ESMF_ArrayBundleWrite(), ESMF_FieldRead(), ESMF_FieldWrite(), ESMF_FieldBundleRead(), and ESMF_FieldBundleWrite(). Each of these methods reads or writes to multiple files in the multi-tile case. See the respective API documentation for details of how the tile-specific file names are constructed.
    Current limitations are:
    • For I/O of ArrayBundles and FieldBundles, all Arrays / Fields in the bundle must contain the same number of tiles;
    • I/O is not yet permitted for multi-tile Arrays / Fields with ungridded / undistributed dimensions;
    • I/O is currently only permitted for multi-tile Arrays / Fields with 1 DE per PET.
  • ESMF profiles now include a summary of the full application run time for convenience. Previously user instrumentation was required to introduce an end-to-end profiling level.
  • The configuration for ESMF_COMPILER=gfortranclang now works on Darwin systems (Mac OS X). Note that on Darwin, the system-level g++ invokes clang++, and unless you ensure that you have a true g++ early in your path (or build the MPI compiler wrappers to ensure that they wrap the true g++), you will end up using clang++ even if you think you are using the GNU C++ compiler. Setting ESMF_COMPILER=gfortranclang is correct for this typical situation. Attempting to use ESMF_COMPILER=gfortran when the C++ compiler is actually clang++ now issues an error message.

Known Issues

Platform-specific issues:

  • The GNU and Intel compilers require GCC>=4.8 for C++11 support (Intel uses the GCC headers). By default, ESMF uses the C++11 standard and cannot be downgraded. If you run into build issues due to the C++11 dependency, you must make sure a GCC>=4.8 is loaded.
  • On Darwin, with the GNU gfortran+gcc combination, when building MPICH3 from source, it is important to specify the "--enable-two-level-namespace" configure option. By default, i.e. without this option, on Darwin, the produced MPICH compiler wrappers include a linker flag (-flat_namespace) that causes issues with C++ exception handling. Building and linking ESMF applications with MPICH compiler wrappers that specify this linker option leads to “mysterious” application aborts during execution.
  • On Darwin, with the Intel Fortran compiler, command line arguments cannot be accessed from ESMF applications when linked against the shared library version of libesmf. There is no issue when linked against the static libesmf.a version. Setting the environment variable ESMF_SHARED_LIB_BUILD=OFF, during the ESMF build, can be used as a workaround for this issue.
  • There is an issue with intercepting the MPI calls for profiling on some of the supported platforms. This results in a single FAIL reported for ESMF_TraceMPIUTest.F90. The affected platforms are:
    • Catania: Darwin+GNU+mpich
    • Green: Darwin+gfortranclang+mpich/openmpi
    • Gaea: Unicos+GNU+cray-mpich
  • There is an issue with loading the libesmftrace_preload.so library on some of the supported platforms. This results in a reported CRASH for ESMF_TraceIOUTest.F90 and ESMF_TraceMPIUTest.F90. The affected platforms are:
    • Discover: Linux+GNU+intelmpi
    • Gaea: Unicos+Intel+cray-mpich
    • Gaea: Unicos+Intel+mpiuni

Published by theurich over 2 years ago

ESMF - ESMF 8.3.1

Overview

ESMF 8.3.1 is a patch release that fixes a number of issues that were noticed after the 8.3.0 release. While most of the issues are minor, they have been reported as problems under specific user applications. The 8.3.1 patch release provides a path for affected applications to upgrade and use an official ESMF release instead of a beta tag in the 8.4.0 series currently in development.

As a patch release, 8.3.1 does not introduce new features. Applications that work fine with 8.3.0, or are already on an 8.4.0 beta tag, need not upgrade to the 8.3.1 patch. For these cases an upgrade to 8.4.0, when officially released, is the recommended next step. All of the fixes in the 8.3.1 release are also available in beta tag v8.4.0b11 and newer.

Release Notes

  • This patch release is backward compatible with ESMF 8.3.0.
  • No bit-for-bit changes are expected for this release compared to ESMF 8.3.0.
  • No changes were made to the ESMF regrid weight generation methods and applications. The ESMF tables summarizing the ESMF regridding status are unchanged.
  • The ESMF communication layer now supports single messages that are above the previous 2GiB limit. This applies to the direct usage of ESMF_VM communication calls, but also extends to ESMF_RouteHandle based communication methods: Regrid, Redist, Halo, and SMM. User applications have been observed to push over the previous 32-bit message limit when each PET addresses a substantial amount of memory and calls into the ESMF communication methods. Before this fix, a user could experience application crashes in the MPI layer due to ESMF attempting to send messages that exceed the 32-bit size limit.
  • An issue inside ESMF_Info was fixed in which the data type and precision of attributes were not properly set when querying for an attribute through ESMF_InfoGet(). This led to a downstream issue of writing NetCDF attributes with a precision inconsistent with the attribute's precision inside the ESMF_Info object.
  • Fixed an issue in which the IO layer was incorrectly querying for the number of compute cores on a node. In some cases, the bug led to application hangs during IO operations (e.g. ArrayWrite) that span multiple nodes.
  • An issue was fixed in the IO layer that occurred when reading through PNetCDF into a destination with repeating elements (e.g. halo points). When used to read a Mesh from file, this problem caused bad coordinate values to be set in the created mesh. This in turn led to very poor performance when subsequently using the Mesh in a regrid weight generation operation.
  • The IO performance and memory requirement for the Read() operation was significantly improved by eliminating a costly check only relevant for Write() operations.
  • An issue observed on Darwin M1 systems with ESMP_Initialize() called by ESMPy was fixed.
  • The ESMPy syntax was corrected to allow for calls with property accessor Manager.local_pet.
  • The internal MOAB library included with ESMF now builds under the older GCC 5.4.x series.
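
The 2GiB message limit mentioned in the release notes above can be put into numbers with a short Python calculation. The sketch assumes that the limit comes from a signed 32-bit byte count, which is how the “2GiB” and “32-bit” figures in the note relate to each other; how ESMF sizes its messages internally is not described here, and the helper exceeds_32bit_limit is hypothetical.

```python
INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit integer
GIB = 2**30            # bytes in one GiB


def exceeds_32bit_limit(num_elements: int, bytes_per_element: int = 8) -> bool:
    """Return True if a single message of this size needs more than a
    signed 32-bit byte count (illustrative helper, not an ESMF routine)."""
    return num_elements * bytes_per_element > INT32_MAX


# Largest number of 8-byte reals whose total byte size still fits.
max_r8 = INT32_MAX // 8
print(max_r8)                           # 268435455
print(exceeds_32bit_limit(max_r8))      # False
print(exceeds_32bit_limit(max_r8 + 1))  # True
print((max_r8 + 1) * 8 / GIB)           # 2.0
```

Under this assumption, a single message holding 268,435,456 or more 8-byte values crosses the 2GiB boundary.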

Known Issues

  • Same as ESMF 8.3.0.

Published by theurich over 2 years ago

ESMF - ESMF 8.3.0

Overview

The 8.3.0 release of ESMF implements a number of incremental improvements and bug fixes across the library. Highlights of the 8.3.0 release are outlined in the following paragraphs. A detailed list of release notes is provided further down.

On the code management side, ESMF has aligned its tagging scheme with the standard convention used by many other packages on GitHub. Standard tags now start with the lowercase letter “v”, followed by the version triplet. For example, the tag for release 8.3.0 on the ESMF GitHub repository is “v8.3.0”. Beta snapshots leading to a future release have the same root, followed by the lowercase letter “b” and a two-digit snapshot number. For instance, v8.3.0b17 was the last 8.3.0 beta snapshot tag before the official release tag.
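
The tag convention described above can be checked mechanically. The following Python sketch is not part of ESMF; the helper parse_esmf_tag, the EsmfTag type, and the regular expression are assumptions derived only from the description in the preceding paragraph.

```python
import re
from typing import NamedTuple, Optional


class EsmfTag(NamedTuple):
    major: int
    minor: int
    patch: int
    beta: Optional[int]  # None for an official release tag


# "v" + version triplet, optionally followed by "b" + two-digit snapshot number.
_TAG_RE = re.compile(r"v(\d+)\.(\d+)\.(\d+)(?:b(\d{2}))?")


def parse_esmf_tag(tag: str) -> EsmfTag:
    """Parse a tag such as 'v8.3.0' or 'v8.3.0b17' (hypothetical helper)."""
    match = _TAG_RE.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a recognized tag: {tag!r}")
    major, minor, patch, beta = match.groups()
    return EsmfTag(int(major), int(minor), int(patch),
                   int(beta) if beta is not None else None)


print(parse_esmf_tag("v8.3.0"))
print(parse_esmf_tag("v8.3.0b17"))
```

A tag without the leading “v”, such as “8.3.0”, is rejected with a ValueError, which matches the convention that standard tags start with the lowercase letter “v”.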

ESMF uses a library called ParallelIO (PIO) for its internal I/O operations, such as reading in mesh files and writing out fields. During this release, the version of PIO used internally was upgraded from a very outdated 1.x version to version 2.5. A new option was also added to the ESMF build system to allow linking to an external build of the PIO library.

Built on top of the PIO upgrade, the ESMF_MeshCreate() method that reads a mesh from file was re-implemented. It now reads the mesh coordinate information in a fully distributed way. This reduces the memory footprint dramatically, allowing the creation of much larger meshes from file than before.

Further progress was made toward the full adoption of MOAB as the internal mesh representation in ESMF. The internal MOAB library, included with ESMF, was updated to version 5.3, and combinatorial testing was added to the ESMF testing framework to ensure consistency and backward compatibility between the native mesh implementation and the MOAB-based implementation. Several consistency issues were resolved as a result of the new testing. By default, ESMF 8.3.0 still uses the native mesh implementation internally. As in previous releases, users can enable the MOAB-based implementation at run-time by calling ESMF_MeshSetMOAB().

Support for dynamically changing grid coordinates (e.g. storm following grids) was added to ESMF. The ESMF_GridCreate() method that creates a new Grid from an existing Grid with new DistGrid was extended to optionally return a RouteHandle object. The RouteHandle allows subsequent calls into the new ESMF_GridRedist() method to efficiently redistribute the coordinate values from the original source grid to the new destination grid. NUOPC_Connector support for handling changing grid coordinates is not available in this release but will be added in a future release.

Two issues were encountered and addressed in the ESMF_XGrid implementation. First, element areas can now be set in a Field built on an XGrid by using the ESMF_FieldRegridGetArea() method. Previously, this capability was only supported for a Field built on a Grid or Mesh. Second, the algorithm used to generate interpolation weights was improved to guarantee that each exchange grid cell overlaps exactly one cell on each side. Prior to this change, small numerical errors prevented this property from holding and resulted in small remapping errors.

Release Notes

  • This release is backward compatible with the last release ESMF 8.2.0, for all the interfaces that are marked as backward compatible in the Reference Manual. There were API changes to a few unmarked methods that may require minor modifications to user code that uses these methods. The entire list of API changes is summarized in a table showing interface changes since ESMF 8.2.0, including the rationale and impact for each change.
  • No bit-for-bit changes were observed for this release compared to release ESMF v8.2.0 with Intel compilers using “-O2 -fp-model precise”.
  • Tables summarizing the ESMF regridding status have been updated. These include supported grids and capabilities of the offline and integrated regridding.
  • A new section was added to the NUOPC Reference Manual describing the use of NUOPC_AddNestedState() for the coupling of multiple nests or multiple data sets between components.
  • The option to profile the execution time of each individual iteration through a NUOPC run sequence has been implemented in the Driver Component Metadata Profiling attribute. Setting the appropriate profiling bit results in a profile where the timing for each individual run sequence iteration is reported in the timing profile under a unique label. This information can be helpful for cases where the cost per iteration changes throughout the execution.
  • An issue in the ESMF_StateReconcile() method used by ESMF and NUOPC to generate a consistent object view across multiple components was fixed. The optimization implemented in v8.1.0 introduced the unintended behavior of switching out geom objects (Grid, Mesh, etc.) for Fields contained in States that are used in multiple ESMF_StateReconcile() operations. The incorrect association of geom objects with Fields resulted in unexpected results during subsequent operations using those Fields, such as creating a RouteHandle for regridding.
  • Progress was made in full adoption of MOAB as the internal mesh representation in ESMF. This includes updating the internal MOAB library included with ESMF to version 5.3.1 and the addition of combinatorial testing designed to ensure consistency and backward compatibility between the native mesh implementation and the MOAB-based implementation. Several consistency issues were resolved as a result of the new testing.
  • The previous version of ESMF_MeshCreate() from a file used to read all the node coordinate information on every processor. For large mesh files this global read can lead to high memory consumption and prevent reading in certain large meshes entirely. To reduce the memory footprint, a fully distributed read of mesh coordinate information was implemented. This change allows the creation of much larger meshes from file.
  • The nodeOwners argument for the methods ESMF_MeshCreate() and ESMF_MeshAddNodes() was made optional. This allows the user to defer specification of node ownership to ESMF in cases where a specific ownership assignment is not needed to match the application data distribution. When this argument is absent, ESMF generates a consistent assignment of node owners.
  • The ESMF_GridCreate() method that creates a new Grid from an existing Grid with new DistGrid was extended to optionally return a RouteHandle object. The RouteHandle allows subsequent calls into the new ESMF_GridRedist() method to redistribute the coordinate values from the original source grid to the new destination grid. This feature supports efficient handling of dynamically changing grids between components.
  • The implementation of the exchange grid (ESMF_XGrid) class that supports efficient conservative regridding between multiple grids on source and destination sides has been improved:
    • Element areas can now be set in a Field built on an XGrid by using the ESMF_FieldRegridGetArea() method. Previously, this capability was only supported for a Field built on a Grid or Mesh.
    • The algorithm used to generate interpolation weights was improved to guarantee that each exchange grid cell overlaps exactly one cell on each side. Prior to this change, small numerical errors prevented this property from holding.
  • Added the optional --checkFlag argument to the ESMF_RegridWeightGen application. This flag allows the user to turn on more expensive error checking that may not be appropriate for an operational run. Initially this flag turns on a check for grid self-intersection during conservative regridding.
  • The VM Epoch implementation now provides an option to reduce the memory pressure on the sending side PETs. By default, internal send buffers, once allocated, are kept until the VM is destroyed. This can lead to high memory pressure for cases where the same sending PETs participate in communication with multiple sets of receiving PETs. Setting keepAlloc=.false. when calling ESMF_VMEpochEnter() instructs ESMF to immediately deallocate internal send buffers once the data has been transferred. This is analogous to the handling of internal receive buffers with keepAlloc=.false. when calling ESMF_VMEpochExit(). The default remains .true. for both sides for efficiency.
  • Two internal fixed size buffers that caused issues when precomputing RouteHandles (e.g. via RegridStore()) for high-resolution, high PET count cases (~10,000 and above) were modified. The size of one of the buffers was doubled, while the other fixed size limitation was removed. The symptom of the first buffer size issue (now increased in size) was an error trace in the ESMF Log starting with "ESMCI_DELayout.C:9616 ESMCI::XXE::storeBufferInfo() Internal error: Bad condition - bufferInfoList overflow!!!". The second buffer size issue (now eliminated) was an error trace starting with "ESMCI_DELayout.C:8416 ESMCI::XXE::execReady() Internal error: Bad condition - sendnbCount out of range".
  • ESMF uses a library called ParallelIO (PIO) for its internal I/O operations, such as reading in mesh files and writing out fields. During this release, the version of PIO used internally was upgraded from a very outdated 1.x version to version 2.5. As a result, the binary output option ESMF_IOFMT_BIN is no longer supported and has been removed. A new option was also added to the ESMF build system to allow linking to an external build of the PIO library, as long as the external build is at least version 2.5.8. The upgrade eliminates the need to pass the compiler flags "-fallow-argument-mismatch -fallow-invalid-boz" when building with GNU 10.x or newer compilers. If the internal build of PIO is used, CMake version 2.8.12 or newer must be available in the system path. See the User's Guide for information about the environment variables used to configure PIO build and linking options.
  • An extra column was added to the ESMF profiler summary output, reporting the number of PEs (CPU cores) associated with the executing PETs. This information is helpful for example when profiling components that run with ESMF-managed threading. In the single-threaded case, each PET is associated with a single PE, and the number of PEs equals that of PETs. However, for the multi-threaded case, where N threads (e.g. OpenMP) are spawned under each PET, the number of PEs will be N times the number of PETs.
  • The ESMF_COMM build setting for MPICH has been reworked to better align with the current state of the MPICH project, and other ESMF_COMM settings. ESMF_COMM=mpich now covers the current MPICH versions 3 and 4. ESMF_COMM=mpich3 is still supported for backward compatibility. The old MPICH2 continues to be supported via ESMF_COMM=mpich2.
  • A problem with ESMF library installation linking for dylibs under Darwin was fixed. Previously the installed ESMF library remained dependent on files under the src directory of the ESMF build tree.
  • The FindESMF.cmake file included with ESMF, which is provided as a convenience to users who use CMake in their projects, has been updated. The module now searches ESMF_ROOT if ESMFMKFILE is not provided by the environment. Option USE_ESMF_STATIC_LIBS has been added to use the static ESMF library when building executables. This module requires CMake v3.12 or above.
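The optional nodeOwners argument noted in the list above leaves the ownership choice to ESMF. The release notes do not describe the rule ESMF applies; the Python sketch below assumes one plausible consistent rule (the lowest-numbered PET that references a node owns it) purely for illustration. The function `assign_node_owners` and the sample data are hypothetical.

```python
def assign_node_owners(nodes_by_pet):
    """Assign each node id to the lowest-numbered PET that references it.

    Assumed rule for illustration only; not taken from the ESMF source.
    """
    owners = {}
    for pet in sorted(nodes_by_pet):
        for node in nodes_by_pet[pet]:
            # setdefault keeps the first (lowest) PET seen for each node.
            owners.setdefault(node, pet)
    return owners


# Nodes 3 and 4 sit on the boundary and are referenced by both PETs.
nodes_by_pet = {0: [1, 2, 3, 4], 1: [3, 4, 5, 6]}
owners = assign_node_owners(nodes_by_pet)
```

Any deterministic rule of this kind yields the same answer on every PET, which is the sense in which an automatically generated assignment can be consistent.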

Known Issues

Platform-specific issues:

  • The GNU and Intel compilers require GCC>=4.8 for C++11 support (Intel uses the GCC headers). By default, ESMF uses the C++11 standard and cannot be downgraded. If you run into build issues due to the C++11 dependency, you must make sure a GCC>=4.8 is loaded.
  • On Darwin, with the GNU gfortran+gcc combination, when building MPICH3 from source, it is important to specify the "--enable-two-level-namespace" configure option. By default, i.e. without this option, on Darwin, the produced MPICH compiler wrappers include a linker flag (-flat_namespace) that causes issues with C++ exception handling. Building and linking ESMF applications with MPICH compiler wrappers that specify this linker option leads to “mysterious” application aborts during execution.
  • On Darwin, with the Intel Fortran compiler, command line arguments cannot be accessed from ESMF applications when linked against the shared library version of libesmf. There is no issue when linked against the static libesmf.a version. Setting the environment variable ESMF_SHARED_LIB_BUILD=OFF during the ESMF build can be used as a workaround for this issue.
  • There is an issue with intercepting the MPI calls for profiling on some of the supported platforms. This results in a single FAIL reported for ESMF_TraceMPIUTest.F90. The affected platforms are:
    • Catania: Darwin+GNU+MPICH3
    • Gaea: Unicos+GNU+cray-mpich
  • There is an issue with loading the libesmftrace_preload.so library on some of the supported platforms. This results in a reported CRASH for ESMF_TraceIOUTest.F90 and ESMF_TraceMPIUTest.F90. The affected platforms are:
    • Cori: Unicos+Intel+cray-mpich
    • Cori: Unicos+Intel+mpiuni
    • Discover: Linux+GNU+intelmpi
    • Gaea: Unicos+Intel+cray-mpich
    • Gaea: Unicos+Intel+mpiuni
    • Hera: Linux+GNU+intelmpi
    • Orion: Linux+GNU+mpiuni

Documentation

Tables

Climate Change - Earth and Climate Modeling - Fortran
Published by theurich almost 3 years ago

ESMF - ESMF 8.2.0

Overview

Starting with version 8.2.0, the ESMF team has moved to a more frequent release cadence with new releases anticipated approximately every six months. This approach helps to ensure that new features, bug fixes, and optimizations are available more frequently in official releases of ESMF.

Highlights of the 8.2.0 release are outlined below. A detailed list of release notes is also provided below.

The NUOPC run sequence feature has proven a viable formalism to capture and express the control- and data-flow among the components of a wide range of coupled applications. Recent application work has demonstrated the need for more succinctly specifying conditional execution of run sequence elements. This release extends the NUOPC RunSequence syntax to include Alarm Blocks. Alarm blocks allow the user to specify that certain run sequence elements should be called less frequently than the parent timestep.
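The behavior of an alarm block can be sketched as a small Python simulation. This does not use the NUOPC RunSequence syntax; the element names `ATM` and `WRITE_RESTART`, the function `run_sequence`, and the chosen intervals are assumptions made for illustration.

```python
def run_sequence(n_steps, dt, alarm_interval):
    """Simulate a parent loop with one element inside an alarm block.

    `alarm_interval` is assumed to be a multiple of `dt` in this sketch.
    """
    log = []
    for step in range(1, n_steps + 1):
        t = step * dt
        log.append((t, "ATM"))            # runs every parent timestep
        if t % alarm_interval == 0:       # alarm rings
            log.append((t, "WRITE_RESTART"))
    return log


log = run_sequence(n_steps=6, dt=900, alarm_interval=2700)
restart_times = [t for t, name in log if name == "WRITE_RESTART"]
```

With a 900 s parent timestep and a 2700 s alarm interval, the element inside the alarm block runs on every third parent step.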

Several groups have started implementing exchange grids in their modeling systems, including within NUOPC Mediators. To facilitate these efforts, the ESMF_XGrid support was extended in this release. It now includes the use of all ESMF regridding methods (bilinear, patch, etc.) and options (extrapolation, regridding status, etc.) when regridding to or from Fields built on exchange grids.

A number of issues were uncovered during the deployment of the ESMF-managed threading and resource control features. This release addresses these issues, and the NUOPC level support for resource control and handling of threaded components is now more robust and has been demonstrated in several large-scale applications. This feature allows model components to independently set OpenMP threading levels so that all components in a coupled system are best utilizing available HPC resources, based on their individual scaling profiles.

The VMEpoch feature is an important communication optimization used by the NUOPC_Connector, and by some applications directly. This release fixes a problem that was encountered when using VMEpoch with any of the Redist() methods. The release also addresses an out-of-memory issue that can be triggered when the sending side runs many iterations ahead of the receiving side, by introducing automatic message throttling. Finally, a new reference manual section is available that describes the use of VMEpoch for asynchronous RouteHandle communications.
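The effect of message throttling can be sketched with a deterministic Python simulation. It models a sender that runs faster than the receiver and compares the peak number of outstanding messages with and without a limit. The function `simulate` and the rates are assumptions for illustration; this is not how ESMF implements VMEpoch internally.

```python
from collections import deque


def simulate(n_messages, throttle, sends_per_tick=3, recvs_per_tick=1):
    """Return the peak number of outstanding messages.

    The sender may post several messages per tick but never lets more than
    `throttle` be outstanding; the receiver drains `recvs_per_tick` per tick.
    """
    queue = deque()
    sent = 0
    peak = 0
    while sent < n_messages or queue:
        for _ in range(sends_per_tick):
            if sent < n_messages and len(queue) < throttle:
                queue.append(sent)
                sent += 1
        peak = max(peak, len(queue))
        for _ in range(recvs_per_tick):
            if queue:
                queue.popleft()
    return peak


peak_throttled = simulate(n_messages=100, throttle=10)
peak_unthrottled = simulate(n_messages=100, throttle=100)
```

With the limit in place the backlog on the receiving side stays bounded no matter how far ahead the sender could otherwise run.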

The process of replacing the native ESMF mesh implementation with the MOAB library, developed by the U.S. Department of Energy, is continuing. This release makes the MOAB mesh backend available to ESMPy users by calling Manager.set_moab(). This option allows users to test the impacts of using the MOAB mesh backend instead of the default native mesh through ESMPy.

Release Notes

  • This release is backward compatible with the last major release update, ESMF 8.1.0 and patch release ESMF 8.1.1, for all the interfaces that are marked as backward compatible in the Reference Manual. There were API changes to a few unmarked methods that may require minor modifications to user code that uses these methods. The entire list of API changes is summarized in a table showing interface changes since ESMF_8_1_0, including the rationale and impact for each change.
  • No bit-for-bit changes were observed for this release compared to release ESMF 8.1.0 and patch release ESMF 8.1.1, with Intel compilers using “-O2 -fp-model precise”. However, the release contains code changes to the regridding implementation that have the potential to lead to bit-for-bit changes in regridding weights. Any release item with the potential to introduce a bit-for-bit change is indicated in the respective release note.
  • Tables summarizing the ESMF regridding status have been updated. These include supported grids and capabilities of the offline and integrated regridding.
  • The NUOPC RunSequence syntax was extended to support Alarm Blocks. An alarm block specifies the time interval at which the elements within the block are executed. This adds additional flexibility to the RunSequence approach, e.g. to write restart files at certain intervals that are multiples of the parent timestep.
  • Fields created on XGrids can now be used as either source, destination, or both when calling the general ESMF regrid methods (ESMF_FieldRegridStore(), ESMF_FieldRegrid(), ESMF_FieldBundleRegridStore(), ESMF_FieldBundleRegrid()). This enables the use of all ESMF regridding methods (bilinear, patch, etc.) and options (extrapolation, regridding status, etc.) when regridding to or from Fields on an XGrid. Prior to this release, regridding to or from Fields on an XGrid was only supported when going from one of the grids used to originally create the XGrid. Also, only conservative methods were supported.
  • A change in the 3D spherical bilinear weight calculation to handle more complex cells led to a decrease in performance in releases 8.0.0, 8.1.0, and 8.1.1. The current release restores the performance to the level of ESMF 7.1.0r, and better, while retaining support for the complex cells. (Note that this change has the potential to introduce round off level changes in weights calculated for the 3D spherical bilinear method compared to previous ESMF releases. However, bit-for-bit testing with the Intel compiler using “-O2 -fp-model precise” did not detect any changes.)
  • A number of issues that were found with ESMF-managed threading under real application usage, as released with ESMF 8.1.0, have been addressed: (1) PETs that execute a threaded component are no longer instantiated as Pthreads by default but instead execute under the original MPI process. This resolves the issue of not being able to set an unlimited stack size. (2) Issues within the automatic garbage collection of ESMF objects, which led to memory corruption during ESMF_Finalize() when Grids or Meshes were transferred between threaded components, have been resolved. (3) Thread affinities and number of OpenMP threads are reset when exiting from a threaded component method, and global resource control can be turned on/off via the optional argument globalResourceControl during ESMF_Initialize().
  • It is now possible to override the defaults of a number of global ESMF settings by specifying an ESMF_Config file during ESMF_Initialize(). This is particularly useful for adjusting log specific settings, or to turn on/off resource control on the global VM.
  • A new section was added to the ESMF Reference Manual that discusses use of VMEpoch for asynchronous RouteHandle communications.
  • The VMEpoch feature allows sending PETs to fill the message queue up to the limit set by the MPI implementation. For message sizes where an MPI implementation chooses to use the EAGER protocol, this can lead to memory exhaustion on the receiving PETs. To prevent this issue, VMEpoch now limits the number of outstanding send cycles to ten by default. This default can be overridden by the user through the optional argument throttle to ESMF_VMEpochEnter().
  • The process of replacing the native ESMF mesh implementation with the MOAB library is continuing. The MOAB mesh backend is now available to ESMPy by calling Manager.set_moab(). This allows the user to test ESMPy regridding features with the new MOAB backend in preparation for MOAB becoming the default. Manager.moab returns a boolean value to indicate if the MOAB backend is currently in use. The default is to use the native ESMF mesh backend.

Known Bugs

  • The ESMF_XGrid construction can lead to degenerate cells in cases where the source and destination grids have edges that are almost the same. Often these cells don't produce weights and are benign, but when weights are produced, they can lead to low accuracy results when transferring data to/from the XGrid.

  • Attempting to write weight files from the ESMPy Regrid object when using filemode=FileMode.WITHAUX currently crashes.


Platform-specific bugs:

  • The GNU and Intel compilers require GCC>=4.8 for C++11 support (Intel uses the GCC headers). By default, ESMF uses the C++11 standard and cannot be downgraded. If you run into build issues due to the C++11 dependency, you must make sure a GCC>=4.8 is loaded.

  • For GNU compilers GCC>=10.x, the default Fortran argument mismatch checking has become stricter. This results in build failures in some of the code that comes with ESMF. Setting environment variable ESMF_F90COMPILEOPTS="-fallow-argument-mismatch -fallow-invalid-boz", during the ESMF build, can be used as a work-around for this issue.

  • On Darwin, with the GNU gfortran+gcc combination, when building MPICH3 from source, it is important to specify the "--enable-two-level-namespace" configure option. By default, i.e. without this option, on Darwin, the produced MPICH compiler wrappers include a linker flag (-flat_namespace) that causes issues with C++ exception handling. Building and linking ESMF applications with MPICH compiler wrappers that specify this linker option leads to “mysterious” application aborts during execution.

  • On Darwin, with the Intel Fortran compiler, command line arguments cannot be accessed from ESMF applications when linked against the shared library version of libesmf. There is no issue when linked against the static libesmf.a version. Setting the environment variable ESMF_SHARED_LIB_BUILD=OFF during the ESMF build can be used as a workaround for this issue.

  • The ESMF_ArrayIOUTest unit test fails the binary read test on the S4 test system (Linux+Intel+IntelMPI).

  • There is an issue with intercepting the MPI calls for profiling on some of the supported platforms. This results in a single FAIL reported for ESMF_TraceMPIUTest.F90. The affected platforms are:

    • Catania: Darwin+GNU+MPICH3
    • Gaea: Unicos+GNU+cray-mpich
  • There is an issue with loading the libesmftrace_preload.so library on some of the supported platforms. This results in a reported CRASH for ESMF_TraceIOUTest.F90 and ESMF_TraceMPIUTest.F90. The affected platforms are:

    • Cori: Unicos+Intel+cray-mpich
    • Cori: Unicos+Intel+mpiuni
    • Discover: Linux+GNU+intelmpi
    • Gaea: Unicos+Intel+cray-mpich
    • Gaea: Unicos+Intel+mpiuni
    • Hera: Linux+GNU+intelmpi
    • Orion: Linux+GNU+mpiuni

Documentation

Tables

Climate Change - Earth and Climate Modeling - Fortran
Published by theurich over 3 years ago

ESMF - ESMF 8.1.1

Overview

ESMF 8.1.1 is a patch release that fixes a major regression in 8.1.0. The regression is in the area of user level OpenMP thread handling, where we received several reports of reduced performance. Performance in these cases has been restored with ESMF 8.1.1.

Release Notes

  • This patch release is backward compatible with ESMF 8.1.0.
  • No bit-for-bit changes are expected for this release compared to ESMF 8.1.0.
  • No changes were made to the ESMF regrid weight generation methods and applications. The ESMF tables summarizing the ESMF regridding status are unchanged.
  • This patch release corrects a regression in the handling of user level OpenMP threads. See the Known Bugs section for details.

Known Bugs

  • Same as ESMF_8_1_0 with the following exceptions:
    • The issue with OpenMP thread count being reset to 1 within all ESMF components has been fixed.
    • The PETs of all ESMF components, and any potentially created OpenMP threads under such PETs, are no longer pinned automatically to any specific PEs.
  • Platform-specific bugs:
    • The same as ESMF_8_1_0.

Climate Change - Earth and Climate Modeling - Fortran
Published by theurich about 4 years ago

ESMF - ESMF 8.1.0


IMPORTANT: This release of ESMF contains a known issue with OpenMP thread counts reset to 1 and incorrect pinning of OpenMP threads, leading to reduced application performance in threaded regions. See Known Bugs below for more details. ESMF release 8.1.1 addresses this issue.


Overview

During the 8.1.0 development cycle, a number of exciting new features were added to the ESMF library, ease of use was improved in several areas, and performance was optimized in critical parts of the library. Highlights of the 8.1.0 release are listed below. A detailed list of release notes is also provided further down.

The integration of the MOAB mesh library, developed by the U.S. Department of Energy, into ESMF, has reached a milestone that supports application-level testing. All users of ESMF are encouraged to begin testing their applications with the MOAB option turned ON. Preliminary results indicate that MOAB provides improved performance and scaling, and reduced memory footprint. Applications seeking to go to very high resolution grids should benefit from this new capability. The MOAB mesh feature is still OFF by default in this release, but can easily be turned ON from the application level, any time during the run. The available regridding features supported by the MOAB based implementation are clearly listed in the detailed release notes.

Another area of the library that received significant attention is the key-value storage. The new ESMF_Info class, based on a modern C++ JSON implementation, replaces ESMF_Attribute. Most of the legacy ESMF_Attribute API is preserved for backward compatibility, but users are encouraged to migrate their code to using the ESMF_Info API. As a consequence of the new key-value implementation, the NUOPC initialization time has been reduced. This is most pronounced in applications with large Field and PET counts.

Major ease of use improvements went into the NUOPC Layer. One is the introduction of semantic specialization labels into the NUOPC API. The new approach no longer uses Initialize Phase Definition (IPD) versions or the IPDvXXpY nomenclature when registering methods in the SetServices() method. This leads to clearer, more concise NUOPC “cap” implementations. Another improvement is the seamless integration of the NUOPC profiling options into the ESMF profiling infrastructure. Simply by setting the Profiling attribute on NUOPC Driver, Model, Mediator, or Connector components, it is now possible for the user to generate a detailed NUOPC-level performance profile.

A number of new features were added in the area of resource control, handling of threaded components, and shared memory access. The NUOPC layer now supports resource control and handling of threaded components. This mechanism supports hybrid MPI+OpenMP components with different threading levels, allowing each component to fully utilize HPC resources independently. Coupling between threaded components is supported automatically via the standard NUOPC Connectors. Further, data can now be shared by reference between components that run on the same compute nodes even if running with different threading levels. Both Field-level sharing and Array-level sharing are supported through the ESMF API.

New features were also added to the ESMF regridding implementation. It now supports the “nearest destination” and “creep nearest destination” extrapolation methods. The “creep fill” extrapolation method, introduced in 8.0.0, is now available through ESMPy.
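The relationship between these extrapolation methods can be sketched on a one-dimensional destination array in Python. This is a simplified illustration, not ESMF's algorithm: unmapped points are marked with `None`, creep fill is assumed to average the already-filled immediate neighbors, and distance is measured in index space. The functions `creep_fill` and `nearest_dest_fill` are hypothetical.

```python
def creep_fill(values, levels):
    """Spread data into unmapped (None) points, one neighbor ring per level."""
    vals = list(values)
    for _ in range(levels):
        prev = list(vals)  # read from the previous level only
        for i, v in enumerate(prev):
            if v is None:
                nbrs = [prev[j] for j in (i - 1, i + 1)
                        if 0 <= j < len(prev) and prev[j] is not None]
                if nbrs:
                    vals[i] = sum(nbrs) / len(nbrs)
    return vals


def nearest_dest_fill(values):
    """Fill each unmapped point from the nearest already-filled point."""
    filled = [i for i, v in enumerate(values) if v is not None]
    # Ties are resolved toward the lower index in this sketch.
    return [v if v is not None
            else values[min(filled, key=lambda j: (abs(j - i), j))]
            for i, v in enumerate(values)]


# Only the two end points received data from the initial regridding.
dst = [1.0, None, None, None, None, None, 7.0]
after_creep = creep_fill(dst, levels=1)
after_both = nearest_dest_fill(after_creep)
```

Applying one creep level and then the nearest-point fill mirrors the two-stage idea behind “creep nearest destination”: creep a limited distance first, then fill whatever remains from the nearest filled destination point.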

Finally, 8.1.0 includes additions to improve the overall user experience with ESMF. In the area of regridding, the user now has the option to specify the ‘checkFlag’ argument to ESMF_FieldRegridStore(). This turns on more comprehensive grid/mesh error checking at the cost of performance. The option should therefore be used during application development and debugging, and not during production runs. In addition, for users faced with debugging failing ESMF applications, a new section was added to the User’s Guide entitled "Debugging of ESMF User Applications" that provides some hints on how to interpret stack traces and errors that appear in the ESMF log files.

Release Notes

  • This release is backward compatible with the last major release update, ESMF 8.0.0, for all the interfaces that are marked as backward compatible in the Reference Manual. There were API changes to a few unmarked methods that may require minor modifications to user code that uses these methods. The entire list of API changes is summarized in a table showing interface changes since ESMF_8_0_0, including the rationale and impact for each change.
  • Some bit-for-bit changes are expected for this release compared to release ESMF 8.0.0 and patch release ESMF 8.0.1. We observe the following impact with Intel compilers using “-O2 -fp-model precise”:
    • Fixed a problem that could result in erroneously unmapped destinations when going from a very fine source grid to a coarse destination grid (e.g. 1/20 degree to 10x15 degree). Expected bit-for-bit changes:
      • ESMF_REGRIDMETHOD_CONSERVE_2ND: roundoff level changes in weight values because of a change in the order of calculation.
      • All regrid methods: Missing weights being added for very fine source grid to coarse destination grid regridding cases as this fix comes into play.
    • Fixed a problem where using the bilinear method between two identical grids did not result in an identity matrix for the regridding weights. The fix also improves the efficiency of the code when using bilinear or patch between identical grids. Expected bit-for-bit changes:
      • ESMF_REGRIDMETHOD_BILINEAR: small changes in regridding weights when a destination point exactly matches a source point.
      • ESMF_REGRIDMETHOD_PATCH: small changes in regridding weights when a destination point exactly matches a source point.
    • Fixed a problem where a set of points with latitudes set at exactly -90.0 are not all mapped to the same point. Expected bit-for-bit changes:
      • All regrid methods: Small weight changes when a point in the grid lies at exactly -90.0 latitude.
    • Optimized the creep fill so the memory doesn't increase as quickly for large numbers of creep levels. Expected bit-for-bit changes:
      • ESMF_EXTRAPMETHOD_CREEP: Small weight changes for the extrapolated destination locations.
    • Fixed an issue where in some cases creep fill weights can trigger an assertion in the code that redistributes weights to their final decomposition. Expected bit-for-bit changes:
      • ESMF_EXTRAPMETHOD_CREEP: Small weight changes for the extrapolated destination locations.
  • Tables summarizing the ESMF regridding status have been updated. These include supported grids and capabilities of the offline and integrated regridding, and numerical results of some specific test cases.
  • The ESMF_Info class was introduced as a replacement for ESMF_Attribute. ESMF_Info is based on a modern C++ JSON implementation to provide efficient key-value pair storage. Most of the legacy ESMF_Attribute API is preserved for backward compatibility.
  • ESMF is in the process of upgrading the internal mesh representation to use the MOAB mesh library developed by the U.S. Department of Energy. In this release, ESMF capabilities using MOAB have been significantly optimized and expanded, allowing for application-level testing of ESMF with MOAB as the underlying mesh representation. MOAB is built into the ESMF library by default, but its use must be enabled at run-time by calling ESMF_MeshSetMOAB(). When MOAB is activated, the following new capabilities are supported in this release:
    • The Mesh creation, conservative regridding, and bilinear regridding algorithms when MOAB is active have been optimized to reduce their memory use and expand the size of Meshes they can be used on.
    • Grids can now be explicitly converted to a Mesh when MOAB is active, using the ESMF_MeshCreate() method.
    • Grids can be used to do first-order conservative regridding using MOAB.
    • Grids can be used for bilinear regridding on cell centers or corners using MOAB.
  • The NUOPC Layer contained in this release has been improved in the following specific technical areas:
    • The NUOPC API has been simplified through the introduction of semantic specialization labels. The new approach leads to clearer and more concise NUOPC “cap” implementations that do not require specifying an Initialize Phase Definition (IPD) version or using the IPDvXXpY nomenclature when registering methods in the SetServices() method. Existing caps do not have to be re-written or updated, although updating to the new semantic specialization labels is recommended for existing and new NUOPC caps. The older IPD version based approach is supported for backward compatibility.
    • The NUOPC layer now provides features for resource control and handling of threaded components. This mechanism supports mixing of hybrid MPI+OpenMP components with different threading levels and mixing with standard MPI components on the same processing elements (PEs), i.e. cores. It allows each component to fully utilize HPC resources independently. Coupling between threaded components is supported automatically via the standard NUOPC Connectors.
    • The external NUOPC interface that supports interaction between an entire NUOPC application and a layer outside of NUOPC (e.g. a Data Assimilation system) was further refined. The associated prototype code (ExternalDriverAPIProto) has been updated to reflect the current status.
    • The NUOPC Profiling attribute, available in the Driver Metadata, Model Metadata, Mediator Metadata, and Connector Metadata, has been re-implemented. The NUOPC layer profiling features now integrate with the ESMF profiling infrastructure.
    • The NUOPC transfer protocol for geometry objects (Grid, Mesh, LocStream) has been made more efficient. Geometries used for multiple Fields are only transferred once, reducing the initialization overhead associated with the transfer.
    • Several optimizations were implemented in the NUOPC layer to reduce overhead. All applications using NUOPC benefit from these optimizations without requiring code changes.
    • Added the creep_nrst_d value to the extrapMethod NUOPC connection options. This is equivalent to the ESMF_EXTRAPMETHOD_CREEP_NRST_D option in ESMF_FieldRegridStore() discussed below.
  • Added the CREEP_FILL extrapolation option to ESMPy. This option fills unmapped destination points by repeatedly moving data from mapped locations to neighboring unmapped locations.
  • The implementation of the ESMF_StateReconcile() method was redesigned to improve performance and scalability. Most users do not interact with this method directly. However, NUOPC initialization time has been reduced as a consequence, most noticeably in applications with large Field and PET counts.
  • Added a new extrapolation method called "nearest destination" to the regrid weight generation system. This capability fills destination points that were not filled by an initial regridding by using the nearest regridded destination point. The nearest destination method is accessible by specifying the extrapMethod=ESMF_EXTRAPMETHOD_NEAREST_D option in any of the ESMF_*RegridStore() methods, or the --extrap_method nearestd option when using the ESMF_RegridWeightGen application.
  • Added a new extrapolation method called "creep nearest destination" to the regrid weight generation system. This capability fills destination points that were not filled by an initial regridding by first applying a creep fill extrapolation and then filling the remaining unmapped destination points using nearest destination extrapolation. The creep nearest destination method is accessible by specifying the extrapMethod=ESMF_EXTRAPMETHOD_CREEP_NRST_D option in any of the ESMF_*RegridStore() methods, or the --extrap_method creepnrstd option when using the ESMF_RegridWeightGen application.
  • The creep fill extrapolation has been optimized so that it uses less memory per level extrapolated. This allows the creep fill method to extrapolate several times further into unmapped parts of the destination Field.
  • Added the optional checkFlag argument to ESMF_FieldRegridStore(). The intention behind this flag is to allow the user to turn on more expensive error checking that may not be appropriate for an operational run. Initially this flag turns on a check for grid self-intersection during conservative regridding.
  • The ESMF_MeshGet() call has been expanded to allow the user to query a full set of information for most Meshes. The exception is 2D Meshes with cells of more than four sides for which the element information (e.g. element connectivity) is not yet available.
  • A new shared memory feature was introduced that allows sharing of decomposition elements (DEs) between PETs that execute on the same single system image (SSI), i.e. node. This feature provides an efficient way to access data by reference between components that are running on the same PEs (cores), but with different threading levels. Both Field-level DE sharing and Array-level DE sharing are supported.
  • Key-value pair storage was added to the ESMF_Mesh and ESMF_LocStream classes through the overloaded ESMF_InfoGetFromHost() method.
  • A section discussing “Debugging of ESMF User Applications” has been added to the User’s Guide. This section is designed to help users interpret error traces and efficiently locate issues in failing ESMF applications.
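
The "nearest destination" and "creep nearest destination" items above can be illustrated with a small Python sketch. This is not ESMF or ESMPy code: the function names, the 1D layout, the use of None for unmapped points, the neighbor averaging in the creep step, and the tie-breaking rule are all assumptions made for illustration, and ESMF's actual weights will differ.

```python
def creep_level(values):
    """Apply one creep-fill level: each unfilled point that touches a filled
    point takes the average of its filled neighbors (an assumption)."""
    updated = list(values)
    for i, v in enumerate(values):
        if v is None:
            nbrs = [values[j] for j in (i - 1, i + 1)
                    if 0 <= j < len(values) and values[j] is not None]
            if nbrs:
                updated[i] = sum(nbrs) / len(nbrs)
    return updated


def nearest_destination_fill(values):
    """Fill every unfilled point with the value of the nearest filled point.
    Distance is index distance; ties go to the lower index (an assumption)."""
    filled = [i for i, v in enumerate(values) if v is not None]
    if not filled:
        return list(values)
    return [v if v is not None
            else values[min(filled, key=lambda j: (abs(j - i), j))]
            for i, v in enumerate(values)]


def creep_nearest_fill(values, levels):
    """Creep fill for a number of levels, then nearest destination fill
    for whatever is still unfilled."""
    current = list(values)
    for _ in range(levels):
        current = creep_level(current)
    return nearest_destination_fill(current)


# None marks destination points left unfilled by the initial regridding.
regridded = [1.0, None, 3.0, None, None, None]
nearest_only = nearest_destination_fill(regridded)
creep_then_nearest = creep_nearest_fill(regridded, 1)
```

In this toy case the two approaches differ only at the point lying between two filled values: the creep step blends both neighbors, while nearest destination copies a single value.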

Known Bugs

  • FieldBundles don't currently enforce that every contained Field is built on the same Grid, Mesh, LocStream, or XGrid object, although the documentation says that this should be so.

  • The packed FieldBundle implementation uses a concatenated string to create a base object. When this string has more than 255 characters, e.g. a large number of Fields with long individual names is packed, the base object is not created correctly resulting in incorrect behavior at the FieldBundle level.

  • When the ESMF regrid weight generation methods and applications are used with the nearest destination to source interpolation method, the unmapped destination point detection does not work. Even if the option is set to return an error for unmapped destination points (the default), no error will be returned.

  • The ESMF regrid weight generation methods and applications do not currently work for source Fields created on Grids which contain a DE of width less than 2 elements. For conservative regridding the destination Field also has this restriction.

  • The ESMF regrid weight generation methods and applications do not currently work on Fields created on Grids with arbitrary distribution.

  • When using the ESMF_RegridWeightGen application to generate conservative weights for a Mesh with > 16 million cells, the weight file produced has some of its factors scrambled in a subtle way. This can lead to higher than usual conservation error (>1.0E-7).

  • When pole extrapolation is used during regridding operations and a quadrilateral cell that degenerates into a triangle is in the top or bottom row of the grid, the error "Condition {etopo->num_nodes == 4} failed..." is incorrectly returned.

  • The ESMF_GridCreate() interface that allows the user to create a copy of an existing Grid with a new distribution will give incorrect results when used on a Grid with 3 or more dimensions and whose coordinate arrays are less than the full dimension of the Grid (i.e. it contains factorized coordinates).

  • The ESMF_XGrid construction can lead to degenerate cells for cases where the source and destination grids have edges that are almost the same. Often these cells don't produce weights and are benign, but when weights are produced they can lead to low accuracy results when transferring data to/from the XGrid.

  • When any of the Redist() methods are used inside a VMEpoch, the execution crashes with an MPI error.

  • The OpenMP thread count is being reset to 1 within all ESMF components. This affects user code that leverages OpenMP threading inside of components, and uses the OMP_NUM_THREADS environment variable to set the desired number of OpenMP threads. As a consequence the expected speed up from OpenMP threading in user code will not be present.

  • The PETs of all ESMF components, and any potentially created OpenMP threads under such PETs, are pinned to the PE on the respective shared memory node, corresponding to the PET number. As a consequence, even if a user overcomes the OpenMP thread count reset to 1 bug, e.g. by using omp_set_num_threads() API directly, the performance of OpenMP threaded user code is far below that of the expected speed up.

Platform-specific bugs:

  • The GNU and Intel compilers require GCC>=4.8 for C++11 support (Intel uses the GCC headers). By default ESMF now uses the C++11 standard and cannot be downgraded. If you run into build issues due to the C++11 dependency, you must make sure a GCC>=4.8 is loaded.

  • For GNU compilers GCC>=10.x, the default Fortran argument mismatch checking has become stricter. This results in build failures in some of the code that comes with ESMF. Setting environment variable ESMF_F90COMPILEOPTS="-fallow-argument-mismatch -fallow-invalid-boz", during the ESMF build, can be used as a work around for this issue.

  • On Darwin, with the GNU gfortran+gcc combination, when building MPICH3 from source, it is important to specify the "--enable-two-level-namespace" configure option. By default, i.e. without this option, on Darwin, the produced MPICH compiler wrappers include a linker flag (-flat_namespace) that causes issues with C++ exception handling. Building and linking ESMF applications with MPICH compiler wrappers that specify this linker option leads to “mysterious” application aborts during execution.

  • On Darwin, with the Intel Fortran compiler, command line arguments cannot be accessed from ESMF applications when linked against the shared library version of libesmf. There is no issue when linked against the static libesmf.a version. Setting environment variable ESMF_SHARED_LIB_BUILD=OFF, during the ESMF build, can be used as a work around for this issue.

  • Currently the ESMPy interface to retrieve regridding weights from Python is only supported under the GNU compiler. On all other compilers the method will flag an error.

  • There is an issue with intercepting the MPI calls for profiling on some of the supported platforms. This results in a single FAIL reported for ESMF_TraceMPIUTest.F90. The affected platforms are:

    • Catania: Darwin+GNU+MPICH3
    • Gaea: Unicos+GNU+cray-mpich
  • There is an issue with loading the libesmftrace_preload.so library on some of the supported platforms. This results in a reported CRASH for ESMF_TraceIOUTest.F90 and ESMF_TraceMPIUTest.F90. The affected platforms are:

    • Discover: Linux+GNU+intelmpi
    • Gaea: Unicos+Intel+cray-mpich
    • Gaea: Unicos+Intel+mpiuni
    • Orion: Linux+GNU+mpiuni

Climate Change - Earth and Climate Modeling - Fortran
Published by theurich about 4 years ago

ESMF - ESMF 8.0.0

Overview

The ESMF 8.0.0 release concludes another phase of evolving and improving the library. The number of applications in which the library is used continues to grow. Requirements from these applications shaped and guided the developments included in this release.

The four typical ways of using ESMF have not changed: 1) to create high-performance, interoperable component-based modeling systems; 2) as a source of data communication, time management, metadata handling, and other libraries; 3) as a fast, parallel generator of interpolation weights from file for many different grids (see the ESMF_RegridWeightGen application website); and 4) as a Python grid remapping library (see the ESMPy website).

Highlights of 8.0.0 include improved component timing profiles and the introduction of a community based NUOPC Field Dictionary. Further extensions to the NUOPC layer include improved performance, and support for driving a NUOPC system from a higher level such as a data assimilation (DA) system. The ESMPy interface now offers in-memory weight access, and the ESMF regridding implementation was extended to include "creep-fill" extrapolation. Core ESMF data classes were extended: A C interface was added to ESMF_XGrid, the creation and use of ESMF_Mesh has been simplified, ESMF_FieldBundles can be created for packed data allocations, and a shared memory access capability was added to ESMF_Array.

More details of the highlighted items are provided in the following paragraphs for convenience.

The tracing capability that was introduced in the previous release now supports a simple mechanism to generate component timing profiles in text files. A single summary timing file can be generated at the end of a run that provides timing statistics across all the PETs. This provides a simple way to understand the relative cost (in terms of wall clock time) of each component in a coupled application.
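
As a rough illustration of what "timing statistics across all the PETs" can mean, the following Python sketch aggregates per-PET wall clock times into per-region statistics. It is not the ESMF summary file format or implementation: the region names, the timing values, and the choice of min/max/mean are hypothetical.

```python
from statistics import mean


def summarize(per_pet_timings):
    """Combine per-PET timings (one dict per PET, region name -> seconds)
    into min/max/mean statistics per timed region."""
    regions = sorted({name for timings in per_pet_timings for name in timings})
    summary = {}
    for name in regions:
        samples = [t[name] for t in per_pet_timings if name in t]
        summary[name] = {
            "pets": len(samples),   # number of PETs that executed the region
            "min": min(samples),
            "max": max(samples),
            "mean": mean(samples),
        }
    return summary


# Hypothetical timings for three PETs; the third PET does not run the ocean.
per_pet = [
    {"ATM Run": 12.0, "OCN Run": 30.0},
    {"ATM Run": 14.0, "OCN Run": 28.0},
    {"ATM Run": 13.0},
]
summary = summarize(per_pet)
```

Comparing the mean of each region gives a first impression of relative component cost, and a large gap between min and max hints at load imbalance.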

Management of the NUOPC Field Dictionary has been made more flexible. The dictionary can now be loaded from file during run-time. A community version of the NUOPC Field Dictionary resides in a dedicated public repository and can evolve independently of the NUOPC Layer.

The NUOPC layer now supports situations where a NUOPC system is driven by a higher level driver (outside of NUOPC). One application of this feature is the ability to integrate NUOPC-based forecast systems (such as UFS) with DA systems that require their own driver layer (such as JEDI). Further, the NUOPC run sequence was extended to support switching between different run sequence sections during execution. For example, this capability allows changing which components are active at different stages during a run. The overall performance of the NUOPC layer was improved by eliminating unnecessary synchronization, allowing greater opportunity for component concurrency.

The ESMF regridding system was extended in several areas. The "creep-fill" extrapolation method was added to allow the user to spread data from mapped destination points to neighboring unmapped destination points. Regridding weights can now be returned in memory through the ESMPy interface, eliminating the need to go through netCDF files when accessing weights from Python. The ESMF_Regrid application has been updated to support GRIDSPEC Mosaic files, and regridding on different stagger locations. The ESMF_RegridCheck external demo was added to test the ESMF_Regrid application and the ESMF regridding system with a collection of test grids and data sets.
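
The creep-fill idea of spreading data level by level can be sketched in a few lines of Python. This is an illustration only, not the ESMF algorithm or the ESMPy API: the function name, the use of None for unmapped points, the 4-point neighborhood, and the averaging of filled neighbors are assumptions.

```python
def creep_fill_2d(grid, levels):
    """Spread data from filled points to neighboring unfilled points.

    grid:   list of rows; None marks an unmapped destination point.
    levels: number of times the spreading step is repeated.
    At each level, an unfilled point with at least one filled 4-neighbor
    takes the average of those neighbors (an assumption of this sketch).
    """
    rows, cols = len(grid), len(grid[0])
    current = [row[:] for row in grid]
    for _ in range(levels):
        updated = [row[:] for row in current]
        for r in range(rows):
            for c in range(cols):
                if current[r][c] is not None:
                    continue
                nbrs = [current[rr][cc]
                        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                        if 0 <= rr < rows and 0 <= cc < cols
                        and current[rr][cc] is not None]
                if nbrs:
                    updated[r][c] = sum(nbrs) / len(nbrs)
        current = updated
    return current


grid = [
    [1.0, None, None, None],
    [3.0, None, None, None],
]
level1 = creep_fill_2d(grid, 1)
level2 = creep_fill_2d(grid, 2)
```

Each additional level moves the filled region one point further into the unmapped area; points beyond the requested number of levels stay unfilled.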

The ESMF_XGrid interface was wrapped with ESMC bindings, providing simplified access to the ESMF exchange grid implementation through native interfaces from model code written in the C programming language.

The use of the unstructured mesh class, ESMF_Mesh, has been simplified. ESMF_Mesh objects can now be created directly from a structured ESMF_Grid object, using a simple ESMF_MeshCreate() call. Further, a Mesh object can now be queried for its mask and area information making this information accessible to model code that depends on it.

The ESMF_FieldBundle class was extended to cover the case where multiple Fields are packed into a single data allocation. Fields can be interleaved along any dimension of the packed data allocation. The current implementation of the packed feature is limited to cases where the user provides the data allocation to ESMF. Communication calls are supported going from a packed source FieldBundle to a packed destination FieldBundle. However, both sides must provide the same number of Fields, and the order of Fields must be the same on both sides. Further, the NUOPC layer does not currently support exchanging data via FieldBundles, packed or unpacked. Future ESMF releases are planned to address these limitations.
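
To make the notion of several Fields interleaved in one allocation concrete, here is a minimal Python sketch. It is not the ESMF_FieldBundle API: the class name PackedBundle, its methods, and the choice to interleave along the fastest-varying index of a flat list are hypothetical.

```python
class PackedBundle:
    """Hypothetical stand-in for a packed bundle: several fields share one
    user-provided flat allocation, interleaved point by point."""

    def __init__(self, data, n_fields):
        if len(data) % n_fields != 0:
            raise ValueError("allocation size must be a multiple of n_fields")
        self.data = data              # referenced, not copied
        self.n_fields = n_fields
        self.n_points = len(data) // n_fields

    def get(self, field, point):
        # Element of `field` at `point` inside the shared allocation.
        return self.data[point * self.n_fields + field]

    def set(self, field, point, value):
        self.data[point * self.n_fields + field] = value

    def field_values(self, field):
        # Strided slice; note that this returns a copy of one field.
        return self.data[field::self.n_fields]


# Three fields (say temperature, u, v) at four points: t0,u0,v0,t1,u1,v1,...
allocation = [10.0, 1.0, -1.0, 11.0, 2.0, -2.0, 12.0, 3.0, -3.0, 13.0, 4.0, -4.0]
bundle = PackedBundle(allocation, n_fields=3)
bundle.set(field=1, point=2, value=30.0)   # writes through to `allocation`
u_values = bundle.field_values(1)
```

Because the bundle only references the user's allocation, updates made through it are visible to the code that owns the memory, which is the point of packing.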

The ESMF_Array class now allows sharing of data between PETs that execute on the same single system image (SSI). This feature can be used for shared memory data access between components that run on the same set of compute nodes (i.e. same SSIs), but run with a different number of PETs on each node. This situation is typically encountered when components use a different number of threads under each PET. Exchanging data by shared memory access is usually more efficient than having to transfer the data between PETs.

There are many other features and options added throughout ESMF, detailed in the release notes (see link below). Backward compatibility of the Fortran user interface with the ESMF 5.2.0r release series was preserved for methods that are labeled backward compatible in the Reference Manual; the majority of methods fall into this category.

Release Notes

  • This release is backward compatible with the last release, ESMF 7.1.0r, for all the interfaces that are marked as backward compatible in the Reference Manual. There were API changes to a few unmarked methods that may require minor modifications to user code that uses these methods. A number of new interfaces were added. The entire list of API changes is summarized in a table showing interface changes since ESMF_7_1_0r, including the rationale and impact for each change.
  • Some bit-for-bit changes are expected for this release compared to the last release, ESMF 7.1.0r. We observe the following impact with Intel compilers using "-O2 -fp-model precise":
    • Roundoff level differences in conservative regridding due to an improvement in an area calculation algorithm
    • Roundoff level differences in regridding when used on a Mesh created from a SCRIP format file that contains longitudes <=0 degrees. This change was due to removing a conversion for non-positive longitudes to improve consistency
    • Minor differences in 2nd order conservative regridding for cells that protrude outside their neighbors (e.g. a peninsula made up of a single cell) due to a bug fix in the weight calculation algorithm for that regridding method
  • Tables summarizing the ESMF regridding status have been updated. These include supported grids and capabilities of the offline and integrated regridding, and numerical results of some specific test cases.
  • Added an option to output component timing profiles in text format by setting the ESMF_RUNTIME_PROFILE environment variable. ESMF and NUOPC component phases are automatically instrumented and user-defined timed regions are also supported. Timing profiles can be written to the end of the ESMF PET log files, to separate per-PET text files, and/or to a single timing summary file. The summary file provides timing statistics across all the PETs. This provides a simple way to understand the relative cost (in terms of wall clock time) of each component in a coupled application.
  • The NUOPC Layer contained in this release has been improved in the following specific technical areas:
    • The NUOPC Field Dictionary can now be ingested from a community-based YAML file which resides in a dedicated public repository and can evolve independently of the NUOPC Layer. YAML Ain't Markup Language (YAML) is a human friendly, Unicode-based data serialization language for all programming languages.
    • A NUOPC_Driver can now be called from a higher level driver outside of NUOPC, going through an ESMF API. This is useful for systems that come with their own driver layer, but need to drive a NUOPC system (e.g. Data Assimilation).
    • NUOPC run sequences now support the "*" wildcard character in the "@" timestep syntax. This allows the timestep length to be set in code via driver specialization for run sequences ingested from a text file.
    • NUOPC now allows switching between different run sequence sections during execution. For example, this capability allows changing which components are active at different stages during a run. To enable this, the NUOPC_DriverIngestRunSequence() method now supports specification of a run duration, run sequence concatenation, and component run sequence elements outside loops.
    • ConnectionOptions is now an official NUOPC level Connector attribute. The attribute is read during RunSequence ingestion (for each Connector line), and appended by default to all of the CplList entries of the Connector.
    • The NUOPC_DriverAddComp() method now supports adding components with a SetVM() method, allowing the component to configure its own VM, e.g. for PET idling for PE reuse under threaded PETs.
    • Fields that are mirrored now arrive on the acceptor side with attributes set to reflect information about the provider side Field (TypeKind, GeomLoc, MinIndex, MaxIndex, ArbDimCount, GridToFieldMap, UngriddedLBound, UngriddedUBound). These attributes can then be used when creating Fields on the acceptor side.
    • The NUOPC Mesh transfer protocol was extended to correctly transfer either the node DistGrid, the element DistGrid, or both if present on the provider Mesh.
    • Completed the implementation of the sharing protocol. Sharing the GeomObject and sharing the Field are now independent decisions, and all four possible combinations are supported. Sharing is available through component hierarchies and supports nested states (e.g. to use the Namespace and/or CplSet features of NUOPC).
    • The sharing protocol now checks whether the combination of provider and acceptor VMs allows Fields and/or GeomObjects to be shared between them and properly sets the share status attributes.
    • The generic Connector was optimized to also reuse Redist RouteHandles for Fields built on Meshes and LocStreams. Previously only the Redist for Fields built on Grids was optimized in this manner. Interleaving Redist and Regrid Field pairs is supported as before.
    • Implemented a communication optimized approach to propagating Field timestamps during Connector Run() to remove unnecessary synchronization between PETs. Only real data dependencies remain.
    • The NUOPC timestamp definition has been extended to include the ESMF calendar kind for more accurate timestamp validation between components.
    • Improved, standardized, and documented Verbosity attribute handling across all NUOPC generic component kinds. Preset Verbosity levels are available: "off", "low", "high", and "max".
    • The Diagnostic attribute option has been implemented across all NUOPC generic component kinds. This attribute allows the user to specify when Fields contained in the importState and exportState of a component are dumped to file when entering and exiting component methods (Initialize/Run/Finalize).
    • Several new NUOPC prototype (example) codes have been added:
      • AtmOcnCplSetProto demonstrates the use of the CplSet feature to support coupling multiple independent sets of Fields between components (e.g., multi-domain coupling).
      • AtmOcnLogNoneProto demonstrates the option to turn off ESMF PET log files completely.
      • AtmOcnMirrorFieldsProto demonstrates the NUOPC Field mirroring protocol.
      • CustomFieldDictionaryProto demonstrates the use of an external YAML file to populate the NUOPC Field Dictionary.
      • ExternalDriverAPIProto demonstrates how an external layer can drive a NUOPC driver component going through ESMF.
      • SingleModelOpenMPProto demonstrates the use of OpenMP threading inside a NUOPC model component and resource allocation through the generic SetVM method.
  • The following capabilities were added to the ESMF Python interface (ESMPy):
    • Added an in-memory weight generation option to the Regrid class, allowing re-use of weight vectors without writing them to netCDF. The weight arrays can be returned as NumPy objects or as a Python dictionary of weight vectors. This allows retrieval of the weights by source and destination key/value pairs.
  • The ESMF_Regrid application now supports additional options, and one option was removed:
    • Added --srcdatafile, --dstdatafile, and --tilefile_path options to support the GRIDSPEC Mosaic file format.
    • Added the --dst_loc option to support regridding on different stagger locations when the destination grid is in UGRID format and the regridding method is non-conservative.
    • Added the --check option to check the regridding results using a synthetic field, generated by an analytic function.
    • Removed the --user_areas option because none of the currently supported file formats provide user areas.
  • Added new features to the ESMF_FileRegrid() method and the ESMF_Regrid application:
    • Regridding of multiple variables
    • Multi-tile GRIDSPEC Mosaic file format with data stored in separate files, one per tile
    • 2nd order conservative regridding
    • Regridding to the corner stagger location if the regridding method is non-conservative and the destination file is in UGRID format
  • Added new features to the ESMF_RegridWeightGen application:
    • 1D network topology support in UGRID format
    • 2D Cartesian grid in CF Single Tile file format
    • Creep fill added as an extrapolation method
  • Added a new extrapolation method called "creep fill" to the entire regrid weight generation system. This capability allows the user to spread data from mapped destination points to neighboring unmapped destination points. This action can be repeated for a number of levels where at each level the data is spread from filled to neighboring unfilled points. The creep fill method is accessible by specifying the extrapMethod=ESMF_EXTRAPMETHOD_CREEP option in any of the ESMF_*RegridStore() methods, or the --extrap_method creep option when using the ESMF_RegridWeightGen application.
  • Additional regridding methods are now supported when ESMF_Mesh is switched to be based on DOE's MOAB mesh library. The new supported regrid methods are: bilinear (when the source Field is not built on Grid), and nearest neighbor. All extrapolation methods except "creep fill" are supported.
  • RouteHandles can now be written to file via ESMF_RouteHandleWrite() and read back into memory (on the same number of PETs) via ESMF_RouteHandleCreate(). This provides an opportunity to lower the initialization cost e.g. for short production runs that repeatedly require regridding between the same grids.
  • The ESMF_GridCreateCubedSphere() and ESMF_GridCreateMosaic() methods now support irregular decompositions.
  • The ESMF_GridCreateCubedSphere() method can now apply the Schmidt transformation on the coordinates.
  • The ESMF_GridCreateMosaic() and the ESMF_GridCreate() method that reads a grid from file now support different coordinate typekinds.
  • The ESMF_GridCreate() method now automatically determines the correct file format if it was not explicitly specified as an argument.
  • The ESMF_GridCreate() method now supports grid creation from a GRIDSPEC Mosaic supergrid tile file.
  • The ESMF_LocStreamCreate() method now supports 1D network topology in UGRID format.
  • Added the capability to create an unstructured Mesh object from a structured Grid object. This allows components that internally work with a Mesh (e.g. a generic Mesh based mediator) to construct a Mesh from a transferred Grid.
  • The ESMF_MeshGet() method has been extended to allow the user to get mask and area information from an unstructured Mesh object.
  • Wrapped the ESMF_XGrid methods with ESMC bindings, to make them easily available to applications implemented in C.
  • Added a new ESMF_FieldBundleCreate() method that allows the creation of a packed FieldBundle. This is an initial capability with some limitations that will be addressed in a future release. The method takes a pre-allocated Fortran array pointer containing the memory of a set of interleaved fields. This is often how sets of fields are structured in model code and this capability allows ESMF to reference this memory directly. Interleaving along any dimension is supported. Packed FieldBundles support communication methods including redistribution, sparse matrix multiplication, and regridding. Currently the number of fields on source and destination must be the same, and permutations of fields are not supported. This means that the order of fields on source and destination must agree.
  • Implemented the ESMF_DECOMP_SYMMEDGEMAX option for cases where the number of elements is not evenly divisible by the number of decomposition elements. This option assigns the largest number of elements to the two edge DEs. It then progresses by assigning a descending number of elements to DEs as the center of the decomposition is approached from both sides. This produces a decomposition that is identical to the decomposition the FV3 model uses for this situation.
  • The ESMF_Array class now allows sharing of DEs between PETs that run on the same single system image (SSI). This feature can be used for shared memory data access between components that run on different numbers of PETs (e.g. for threading) but are located on the same SSI.
  • The ESMF_Config class API has been extended. The ESMF_ConfigCreate() method now supports creating a Config object from a subsection of an existing Config object. This feature allows the consolidation of the content of multiple Config objects into a single configuration file.
  • The YAML-CPP parser has been included in the ESMF distribution. It is active by default, but can be turned off by setting ESMF_YAMLCPP=OFF.
  • Added support for C++11 standard. ESMF builds with the compiler default, but switches to C++11 if ESMF_YAMLCPP is enabled (default). The new ESMF_CXXSTD environment variable can be used to explicitly switch to specific C++ standards.
  • Simplified the use of CMake for projects using ESMF by providing a FindESMF.cmake file. This file is located under the new cmake subdirectory. It parses the esmf.mk file of an ESMF installation, exposing ESMF build variables as global variables accessible by other CMake modules.
  • The ESMF regression test suite can now be used to validate a pre-installed ESMF installation. This can be done by setting the ESMF_TESTESMFMKFILE environment variable to ON, and pointing ESMFMKFILE to the esmf.mk file of the ESMF installation to be validated.
  • Added an external demo ESMF_RegridCheck to test the ESMF_Regrid application with a set of test grids and data sets. In all the test cases, the input variables were constructed using an analytic function and the regridded destination variables were compared with that function to calculate the mean relative errors.
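
The checking approach mentioned in the ESMF_RegridCheck item above (build the input from an analytic function, regrid, compare against the same function at the destination) can be sketched as follows. This is not the demo's code: the analytic function, the 1D grids, and the use of linear interpolation as a stand-in for regridding are assumptions.

```python
import math


def analytic(x):
    # Hypothetical smooth test function; the demo's actual function may differ.
    return 2.0 + math.cos(x) ** 2


def linear_regrid(src_x, src_values, dst_x):
    """Stand-in for regridding: piecewise-linear interpolation on a 1D grid."""
    result = []
    for x in dst_x:
        for k in range(len(src_x) - 1):
            x0, x1 = src_x[k], src_x[k + 1]
            if x0 <= x <= x1:
                w = (x - x0) / (x1 - x0)
                result.append((1.0 - w) * src_values[k] + w * src_values[k + 1])
                break
        else:
            raise ValueError(f"destination point {x} is outside the source grid")
    return result


def mean_relative_error(regridded, exact):
    """Average of |regridded - exact| / |exact| over all destination points."""
    return sum(abs(r - e) / abs(e) for r, e in zip(regridded, exact)) / len(exact)


src_x = [0.1 * i for i in range(11)]          # source points
dst_x = [0.1 * i + 0.05 for i in range(10)]   # destination points (midpoints)
src_values = [analytic(x) for x in src_x]     # input built from the function
regridded = linear_regrid(src_x, src_values, dst_x)
exact = [analytic(x) for x in dst_x]          # reference at the destination
error = mean_relative_error(regridded, exact)
```

A relative measure is only meaningful when the analytic function stays away from zero, which is why the function in this sketch is offset by a constant.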

Known Bugs

  • FieldBundles don't currently enforce that every contained Field is built on the same Grid, Mesh, LocStream, or XGrid object, although the documentation says that this should be so.
  • When the ESMF regrid weight generation methods and applications are used with the nearest destination to source interpolation method, the unmapped destination point detection does not work. Even if the option is set to return an error for unmapped destination points (the default), no error will be returned.
  • The ESMF regrid weight generation methods and applications do not currently work for source Fields created on Grids which contain a DE of width less than 2 elements. For conservative regridding the destination Field also has this restriction.
  • The ESMF regrid weight generation methods and applications do not currently work on Fields created on Grids with arbitrary distribution.
  • There is a race condition in the ESMF_FileRegrid() method and the ESMF_Regrid application when the destination grid is of GRIDSPEC Mosaic format and >=12 PETs are used. This issue leads to intermittent failures in the external_demos tests for the GRIDSPEC_1x1_time_to_C48_mosaic_bilinear case when run on the 16 PET configuration.
  • Applying the sparse matrix multiplication to cases where the local data allocation is above the 32-bit limit will fail with a memory allocation error. This affects all Regrid(), Redist(), Halo(), and SMM() calls.
  • The ESMF_GridCreate() interface that allows the user to create a copy of an existing Grid with a new distribution will give incorrect results when used on a Grid with 3 or more dimensions and whose coordinate arrays are less than the full dimension of the Grid (i.e. it contains factorized coordinates).
  • Using the ESMF_GridCreate1PeriDim() method to create a grid with a bipole connection on the lower side (typically referring to the southern hemisphere) results in no connection there.
  • The ESMF_XGrid construction can lead to degenerate cells for cases where the source and destination grids have edges that are almost the same. Often these cells don't produce weights and are benign, but when weights are produced they can lead to low accuracy results when transferring data to/from the XGrid.
  • The ESMF_ArrayCreate() crashes when used with pinflag=ESMF_PIN_DE_TO_SSI or pinflag=ESMF_PIN_DE_TO_SSI_CONTIG from within a component. The crash is from inside MPI with "invalid communicator". The "pinflag" option works correctly from the application level, i.e. in the context of the global VM.
  • Querying the ESMF_DistGridGet() method for "de" or "tile" information for a "localDe" will return incorrect results, and/or crash.
  • ESMF_AttributeWrite() has only been verified to work for ESMF standard Attribute packages. Non-standard Attribute packages may trigger a crash inside the ESMF_AttributeWrite() implementation.
  • For NetCDF installations that have the C and Fortran bindings installed in different locations, a NetCDF enabled build of ESMF does not correctly include the Fortran NetCDF library during linking.
  • When installing ESMF into a location that is shared with other libraries, it can happen that executing the ESMF install target fails with a "permission denied" error.
  • The Darwin.intelclang.default build configuration is broken.

Platform-specific bugs:

  • The GNU and Intel compilers require GCC>=4.8 for C++11 support (Intel uses the GCC headers). By default ESMF now uses the C++11 standard. If you run into build issues due to the C++11 dependency, you can either (1) make sure a GCC>=4.8 is loaded, or (2) set ESMF_YAMLCPP=OFF. In the latter case the YAML-dependent features in ESMF will not be available.
  • For GNU compilers GCC>=10.x, the default Fortran argument mismatch checking has become stricter, which results in build failures. Setting the environment variable ESMF_F90COMPILEOPTS="-fallow-argument-mismatch -fallow-invalid-boz" during the ESMF build can be used as a workaround for this issue.
  • On some systems with the PGI compiler, there is an issue with shared memory pointers between PETs on the same SSI. We see failures or crashes for Array tests that exercise this feature (ESMF_ArraySharedDeSSISTest.F90, ESMF_ArrayCreateGetUTest.F90) on the following platforms:
    • Hera/PGI-18.10.1
    • Gaea/PGI-16.5.0
    • Electra/PGI-17.1.0
    • Pleiades/PGI-17.1.0
    • Summitdev/PGI-19.7.0
      However, we do not observe these failures or crashes on:
    • Discover/PGI-14.1.0
    • Discover/PGI-17.7.0
  • On Summitdev/PGI-19.7.0 we see ESMF_XGridUTest.F90 unit test failures due to erroneously produced weights for source and destination grids that have edges that are almost the same.
  • On Discover/PGI-14.1.0 the ESMF_FieldRegridUTest.F90 and ESMF_FieldBundleCrGetUTest.F90 unit tests are failing.
  • Currently the ESMPy interface to retrieve regridding weights from Python is only supported under the GNU compiler. On all other compilers the method will flag an error.
  • On Darwin, with the Intel Fortran compiler, command line arguments cannot be accessed from ESMF applications when linked against the shared library version of libesmf. There is no issue when linked against the static libesmf.a version. Setting the environment variable ESMF_SHARED_LIB_BUILD=OFF during the ESMF build can be used as a workaround for this issue.
  • On some systems with the Cray compiler (CCE version 8.x), the ESMF library fails to build. The error can be prevented by setting the ESMF build environment variable ESMF_MOAB=OFF.
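
The build-time workarounds listed above could be collected in a small helper for a build script. The following is a minimal Python sketch, not part of ESMF: the function name esmf_build_env and its selector arguments (compiler, major_version, platform) are hypothetical, and only the environment variable names and values are taken from the notes above. Because the Cray issue is reported on only some systems, the conditions are a starting point rather than a definitive rule.

```python
import os


def esmf_build_env(compiler, major_version=None, platform=None, base_env=None):
    """Return a copy of base_env with the workaround variables from the notes.

    compiler, major_version and platform are hypothetical selectors used only
    by this sketch; they are not ESMF settings.
    """
    # Copy the environment so the caller's mapping is never modified.
    env = dict(os.environ if base_env is None else base_env)

    if compiler == "gnu" and major_version is not None and major_version >= 10:
        # GCC >= 10: relax the stricter Fortran argument mismatch checking,
        # keeping any compile options that were already set.
        flags = "-fallow-argument-mismatch -fallow-invalid-boz"
        existing = env.get("ESMF_F90COMPILEOPTS", "")
        env["ESMF_F90COMPILEOPTS"] = f"{existing} {flags}".strip()

    if compiler == "intel" and platform == "Darwin":
        # Intel Fortran on Darwin: build only the static library so that
        # command line arguments stay accessible.
        env["ESMF_SHARED_LIB_BUILD"] = "OFF"

    if compiler == "cray" and major_version == 8:
        # CCE 8.x: disable MOAB if the library fails to build.
        env["ESMF_MOAB"] = "OFF"

    return env
```

The returned mapping can be passed as the env argument of subprocess.run when invoking the ESMF build, for example subprocess.run(["make"], env=esmf_build_env("gnu", 10), check=True).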

Climate Change - Earth and Climate Modeling - Fortran
Published by rsdunlapiv over 4 years ago

ESMF - ESMF 8.0.1

Overview

The 8.0.1 release fulfills two purposes: to patch a number of bugs discovered in 8.0.0, and to introduce a small selection of critical new performance features. These features were needed by operational centers in an official release that is fully backward compatible with 8.0.0. No bit-for-bit changes from ESMF regridding functions are expected in 8.0.1 relative to release 8.0.0.

One of the performance improvements introduced by ESMF 8.0.1 is message aggregation on the ESMF_VM level. This mechanism significantly improves the efficiency of inter-component data exchanges, especially in situations where there is an imbalance between the sending and receiving side. The imbalance can either be in the total number of sending vs receiving PETs, or in the timing, where the receiving PETs arrive at the exchange late, and out of sync with the sending PETs. The feature is automatically leveraged by the NUOPC level when executing the NUOPC_Connector between components on disjoint sets of PETs.

The other performance improvement added is on the ESMPy level. ESMPy now supports writing/reading of ESMF RouteHandles to/from file. This allows a user to perform the costly RouteHandle generation once, and re-use it in subsequent runs. This provides a more efficient approach to Regridding in the situation where the number of PETs does not change between runs.

Release Notes

  • This release is backward compatible with ESMF 8.0.0. Two new interfaces were added to the Fortran API: ESMF_VMEpochEnter() and ESMF_VMEpochExit().
  • No bit-for-bit changes are expected for this release compared to ESMF 8.0.0. This has been verified for a large number of regridding tests with Intel compilers using flags "-O2 -fp-model precise".
  • No changes were made to the ESMF regrid weight generation methods and applications. The ESMF tables summarizing the ESMF regridding status are unchanged.
  • The ESMF Virtual Machine (ESMF_VM) now supports message aggregation to improve performance for some very common communication patterns. Two new methods, ESMF_VMEpochEnter() and ESMF_VMEpochExit(), allow explicit use of this feature.
  • The generic NUOPC_Connector automatically takes advantage of VMEpoch message aggregation when used between components on disjoint petLists.
  • The ability to write/read ESMF RouteHandles to/from file was added to the ESMPy layer. This allows a user to perform the costly RouteHandle generation once, and re-use it in subsequent runs. This provides a more efficient approach to Regridding in the situation where the number of PETs does not change between runs.
  • The pole_kind parameter was added to allow specification of pole behavior when creating an ESMPy Grid.

Known Bugs

  • Same as ESMF_8_0_0 with the following exceptions:
    • The race condition in the ESMF_FileRegrid() method and ESMF_Regrid application has been fixed. Now if the destination grid is a multi-tile grid in GRIDSPEC MOSAIC format and the tile is distributed into multiple PETs, the regridded field is written correctly into the output file.
    • Applying the sparse matrix multiplication to cases where the local data allocation is above the 32-bit limit now works reliably.
    • The ESMF_GridCreate1PeriDim() method can now be used to create a bipole connection on the lower side.
    • The ESMF_ArrayCreate() method now succeeds when called from inside a component, requesting DE sharing.
    • The ESMF_DistGridGet() method now correctly returns "de" and "tile" information for a "localDe".
    • ESMF now correctly links against the Fortran bindings of NetCDF, even when the C and Fortran bindings of NetCDF are provided in different locations.
    • The ESMF install target now reliably works when installing ESMF into a location that is shared with other library installations.
    • The Darwin.intelclang.default build configuration now works correctly.
  • Platform-specific bugs:
    • The same as ESMF_8_0_0 with the following exceptions:
      • The GNU and Intel compilers require GCC>=4.8 for C++11 support (Intel uses the GCC headers). By default ESMF now uses the C++11 standard and cannot be downgraded. If you run into build issues due to the C++11 dependency, you must make sure a GCC>=4.8 is loaded.

Climate Change - Earth and Climate Modeling - Fortran
Published by theurich almost 5 years ago