A curated list of open technology projects to sustain a stable climate, energy supply, biodiversity and natural resources.

ParFlow

An open-source, modular, parallel watershed flow model.
https://github.com/parflow/parflow

Category: Hydrosphere
Sub Category: Freshwater and Hydrology

Keywords from Contributors

measurement archiving transforms hydrology mesh animals observation conversion geocoder optimize

Last synced: about 21 hours ago

Repository metadata

Parflow is an open-source parallel watershed flow model.

README-GPU.md

Building Parflow with GPU acceleration

WARNING! Parflow GPU support is still in beta and there may be issues with it.

The GPU acceleration is currently compatible only with NVIDIA GPUs either using CUDA directly or using CUDA through the Kokkos library. The minimum supported CUDA compute capability for the hardware is 6.0 (NVIDIA Pascal architecture).

Building with CUDA or Kokkos can improve performance significantly for large problems, but it is often slower for test cases and small problems because of the initialization overhead associated with GPU use. Currently, the only preconditioner that yields good performance is MGSemi, when applied to spinup problems. Installation references can be found in the Ubuntu recipe, the Dockerfile, and linux.yml.

CMake

Building with GPU acceleration requires a CUDA installation and, if the Kokkos backend is used, a Kokkos installation as well. Performance can be further improved by using pool allocation for Unified Memory; the supported memory managers for pool allocation are RMM v0.10 and Umpire. Note that we are in the process of updating RMM to the newest API, but this should not affect users. Only one memory manager can be used at a time: you cannot enable both RMM and Umpire. Performance can be improved even more with direct communication between GPUs, which requires a CUDA-Aware MPI library.

GPU acceleration is activated by passing either the PARFLOW_ACCELERATOR_BACKEND=cuda option to CMake, i.e.,

cmake ../parflow -DPARFLOW_AMPS_LAYER=mpi1 -DCMAKE_BUILD_TYPE=Release -DPARFLOW_ENABLE_TIMING=TRUE -DPARFLOW_HAVE_CLM=ON -DCMAKE_INSTALL_PREFIX=${PARFLOW_DIR} -DPARFLOW_ACCELERATOR_BACKEND=cuda

or the PARFLOW_ACCELERATOR_BACKEND=kokkos and Kokkos_ROOT=/path/to/Kokkos options, i.e.,

cmake ../parflow -DPARFLOW_AMPS_LAYER=mpi1 -DCMAKE_BUILD_TYPE=Release -DPARFLOW_ENABLE_TIMING=TRUE -DPARFLOW_HAVE_CLM=ON -DCMAKE_INSTALL_PREFIX=${PARFLOW_DIR} -DPARFLOW_ACCELERATOR_BACKEND=kokkos -DKokkos_ROOT=/path/to/Kokkos

where -DPARFLOW_AMPS_LAYER=mpi1 leverages GPU-based data packing and unpacking. By default, the packed data is copied to a host staging buffer, which is then passed to MPI so that no special requirements are placed on the MPI library. Direct communication between GPUs (with GPUDirect P2P/RDMA) can be activated by setting the environment variable PARFLOW_USE_GPUDIRECT=1 at runtime; in that case the memory copy between CPU and GPU is avoided and a GPU pointer is passed to MPI, but this requires a CUDA-Aware MPI library. Support for Unified Memory is not required with the native CUDA backend, because the pointers passed to the MPI library point to pinned GPU memory allocations, but it is required with the Kokkos backend.
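
For example, assuming a bash-like shell and a CUDA-Aware MPI installation, the variable can simply be exported before launching the simulation (the exact launch command depends on your MPI library and input script):

# Enable direct GPU-to-GPU communication for the subsequent ParFlow run
# (requires a CUDA-Aware MPI library).
export PARFLOW_USE_GPUDIRECT=1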

Furthermore, the RMM library can be activated by specifying the RMM root directory with -DRMM_ROOT=/path/to/rmm_root as follows:

cmake ../parflow -DPARFLOW_AMPS_LAYER=mpi1 -DCMAKE_BUILD_TYPE=Release -DPARFLOW_ENABLE_TIMING=TRUE -DPARFLOW_HAVE_CLM=ON -DCMAKE_INSTALL_PREFIX=${PARFLOW_DIR} -DPARFLOW_ACCELERATOR_BACKEND=cuda -DRMM_ROOT=/path/to/RMM

or

cmake ../parflow -DPARFLOW_AMPS_LAYER=mpi1 -DCMAKE_BUILD_TYPE=Release -DPARFLOW_ENABLE_TIMING=TRUE -DPARFLOW_HAVE_CLM=ON -DCMAKE_INSTALL_PREFIX=${PARFLOW_DIR} -DPARFLOW_ACCELERATOR_BACKEND=kokkos -DKokkos_ROOT=/path/to/Kokkos -DRMM_ROOT=/path/to/RMM

Similarly, the Umpire library can be activated by specifying the Umpire root directory with -Dumpire_ROOT=/path/to/umpire/root as follows:

cmake ../parflow -DPARFLOW_AMPS_LAYER=mpi1 -DCMAKE_BUILD_TYPE=Release -DPARFLOW_ENABLE_TIMING=TRUE -DPARFLOW_HAVE_CLM=ON -DCMAKE_INSTALL_PREFIX=${PARFLOW_DIR} -DPARFLOW_ACCELERATOR_BACKEND=cuda -Dumpire_ROOT=/path/to/umpire

or

cmake ../parflow -DPARFLOW_AMPS_LAYER=mpi1 -DCMAKE_BUILD_TYPE=Release -DPARFLOW_ENABLE_TIMING=TRUE -DPARFLOW_HAVE_CLM=ON -DCMAKE_INSTALL_PREFIX=${PARFLOW_DIR} -DPARFLOW_ACCELERATOR_BACKEND=kokkos -DKokkos_ROOT=/path/to/Kokkos -Dumpire_ROOT=/path/to/umpire

Note that on some systems nvcc cannot locate the MPI include files by default; if this is the case, defining the environment variable CUDAHOSTCXX=mpicxx might help.
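
For example, assuming a bash-like shell, the variable can be set before invoking CMake with the CUDA configuration shown above:

export CUDAHOSTCXX=mpicxx
cmake ../parflow -DPARFLOW_AMPS_LAYER=mpi1 -DCMAKE_BUILD_TYPE=Release -DPARFLOW_ENABLE_TIMING=TRUE -DPARFLOW_HAVE_CLM=ON -DCMAKE_INSTALL_PREFIX=${PARFLOW_DIR} -DPARFLOW_ACCELERATOR_BACKEND=cuda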

Finally, you must make sure you are building the code for the correct GPU architecture. First, find the compute capability of your device (e.g., an A100 is CC 80 and an H100 is CC 90). Then specify it as a CMake option with -DCMAKE_CUDA_ARCHITECTURES={CC}:

cmake ../parflow -DPARFLOW_AMPS_LAYER=mpi1 -DCMAKE_BUILD_TYPE=Release -DPARFLOW_ENABLE_TIMING=TRUE -DPARFLOW_HAVE_CLM=ON -DCMAKE_INSTALL_PREFIX=${PARFLOW_DIR} -DPARFLOW_ACCELERATOR_BACKEND=cuda -Dumpire_ROOT=/path/to/umpire -DCMAKE_CUDA_ARCHITECTURES=90
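
If you are unsure of the compute capability, recent NVIDIA drivers can report it directly (this assumes a driver whose nvidia-smi supports the compute_cap query field; older releases do not):

nvidia-smi --query-gpu=name,compute_cap --format=csv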

Running Parflow with GPU acceleration

Running Parflow built with GPU support requires that each MPI process has access to a GPU device. Best performance is typically achieved by launching one MPI process per available GPU device. The MPI processes are mapped to the available GPUs by

cudaSetDevice(node_local_rank % local_num_devices);

where node_local_rank and local_num_devices are the node-local rank of the process and the number of GPUs associated with the corresponding node, respectively. Therefore, launching 4 MPI processes on a node with 4 GPUs automatically means that each process uses a different GPU. Launching more processes than the number of available GPUs is only supported when using the CUDA Multi-Process Service (MPS), but this typically results in reduced performance.
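
The following is a minimal, self-contained sketch of this mapping (it is not ParFlow source code; it assumes an MPI 3.0 implementation and the CUDA runtime API), showing how the node-local rank can be obtained by splitting MPI_COMM_WORLD into per-node communicators:

#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* Split the global communicator into per-node (shared-memory) communicators
       to obtain the node-local rank of this process. */
    MPI_Comm node_comm;
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &node_comm);
    int node_local_rank;
    MPI_Comm_rank(node_comm, &node_local_rank);

    /* Query the GPUs visible on this node and assign one to this process,
       exactly as in the formula above. */
    int local_num_devices;
    cudaGetDeviceCount(&local_num_devices);
    cudaSetDevice(node_local_rank % local_num_devices);

    MPI_Comm_free(&node_comm);
    MPI_Finalize();
    return 0;
}

Built against a CUDA-Aware MPI wrapper, four ranks launched on a four-GPU node would thus each select a distinct device.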

Any existing input script can be run with GPU acceleration; no changes are necessary. Note, however, that two subsequent runs of the same input script with the same compiled executable are not guaranteed to produce identical results. This is expected behavior caused by atomic operations performed by the GPU device (i.e., the order of floating-point operations may change between two runs).


Committers metadata

Last synced: 6 days ago

Total Commits: 887
Total Committers: 50
Avg Commits per committer: 17.74
Development Distribution Score (DDS): 0.346

Commits in past year: 63
Committers in past year: 11
Avg Commits per committer in past year: 5.727
Development Distribution Score (DDS) in past year: 0.397

Name Email Commits
Steven Smith s****4@l****v 580
Reed Maxwell r****l@m****u 130
Ian Ferguson i****n@u****v 31
gartavanis 3****s 16
Paul Rigor k****r 9
Nick Engdahl n****l@w****u 9
Fabian Gasper f****r@f****e 8
Jaro Hokkanen j****n@f****e 7
Calla Chennault c****t@g****m 7
Patrick Avery p****y@k****m 6
Juan S. Acero Triana 5****t 6
Andrew Bennett b****r@g****m 5
Sebastien Jourdain s****n@k****m 5
grapp1 4****1 5
reedmaxwell r****l@p****u 5
Laura Condon l****n@e****u 5
Ketan Kulkarni k****i@g****m 3
Vineet Bansal v****l@p****m 3
dependabot[bot] 4****] 3
xy124 x****4 3
alanquits g****e@g****m 2
Basile Hector b****r@g****m 2
Joe Beisman j****n@g****m 2
Muhammad Fahad Azeemi m****d@f****e 2
elena-leo 7****o 2
Rob de Rooij r****j@u****u 2
Jackson Swilley 5****m 2
DrewLazzeriKitware 7****e 2
David Thompson d****n@k****m 2
John Williams j****i@g****m 2
and 20 more...


Issue and Pull Request metadata

Last synced: 2 days ago

Total issues: 220
Total pull requests: 393
Average time to close issues: 3 months
Average time to close pull requests: 18 days
Total issue authors: 63
Total pull request authors: 45
Average comments per issue: 1.52
Average comments per pull request: 0.74
Merged pull requests: 353
Bot issues: 0
Bot pull requests: 3

Past year issues: 20
Past year pull requests: 65
Past year average time to close issues: 12 days
Past year average time to close pull requests: 13 days
Past year issue authors: 9
Past year pull request authors: 13
Past year average comments per issue: 0.7
Past year average comments per pull request: 0.31
Past year merged pull requests: 53
Past year bot issues: 0
Past year bot pull requests: 1

More stats: https://issues.ecosyste.ms/repositories/lookup?url=https://github.com/parflow/parflow

Top Issue Authors

  • smithsg84 (115)
  • jsacerot (11)
  • xy124 (9)
  • elappala (4)
  • vineetbansal (4)
  • alanquits (3)
  • surak (3)
  • cozisco (3)
  • hmandela (2)
  • kulkarni1 (2)
  • chrisreidy (2)
  • nbengdahl (2)
  • wrs827 (2)
  • reedmaxwell (2)
  • m5a0r7 (2)

Top Pull Request Authors

  • smithsg84 (228)
  • gartavanis (21)
  • reedmaxwell (17)
  • kvrigor (12)
  • callachennault (8)
  • jourdain (8)
  • hokkanen (7)
  • jsacerot (7)
  • psavery (6)
  • arbennett (6)
  • grapp1 (5)
  • nbengdahl (5)
  • lecondon (5)
  • westb2 (5)
  • kulkarni1 (5)

Top Issue Labels

  • enhancement (12)
  • bug (10)
  • question (2)

Top Pull Request Labels

  • dependencies (3)
  • enhancement (2)
  • bug (1)

Package metadata

pypi.org: pftools

A Python package providing an interface to the ParFlow hydrologic model.

  • Homepage: https://github.com/parflow/parflow/tree/master/pftools/python
  • Documentation: https://pftools.readthedocs.io/
  • Licenses: BSD
  • Latest release: 1.3.11 (published about 1 year ago)
  • Last Synced: 2025-04-25T13:34:45.421Z (2 days ago)
  • Versions: 20
  • Dependent Packages: 2
  • Dependent Repositories: 8
  • Downloads: 988 last month
  • Docker Downloads: 64
  • Rankings:
    • Docker downloads count: 3.836%
    • Forks count: 4.848%
    • Dependent repos count: 5.252%
    • Stargazers count: 5.943%
    • Average: 6.48%
    • Dependent packages count: 7.306%
    • Downloads: 11.695%
  • Maintainers (7)

Dependencies

docs/user_manual/requirements.txt pypi
  • PyYAML ==5.4
  • Sphinx >=4.0.0
  • sphinx-rtd-theme *
  • sphinxcontrib-bibtex *
.github/workflows/linux.yml actions
  • actions/cache v3 composite
  • actions/checkout v3 composite
Dockerfile docker
  • ubuntu 22.04 build
docker/dev/Dockerfile docker
  • ${BASE_IMAGE} latest build
docker/runtime/Dockerfile docker
  • ${BASE_IMAGE} latest build
  • ${DEV_IMAGE} latest build
pf-keys/generators/simput/requirements.txt pypi
  • PyYAML ==5.4.1
  • click ==8.0.1
  • importlib-metadata ==4.4.0
  • typing-extensions ==3.10.0.0
  • zipp ==3.4.1
pftools/python/requirements.txt pypi
  • PyYAML ==5.4
  • dask *
  • numba *
  • numpy *
  • xarray *
pftools/python/requirements_dev.txt pypi
  • twine * development
pftools/python/requirements_pfsol.txt pypi
  • imageio >=2.9.0
pftools/python/setup.py pypi
  • pyyaml >=5.4
pftools/python/requirements_all.txt pypi

Score: 16.515058046574733