river-dl

Deep learning model for predicting environmental variables on river systems.
https://github.com/USGS-R/river-dl

Last synced: over 1 year ago

Repository metadata

README.md

Active development moved to https://code.usgs.gov/wma/wp/river-dl

Deep Graph Convolutional Neural Network for Predicting Environmental Variables on River Networks

This repository contains code for predicting environmental variables on river networks. The models included are all either
temporally or spatiotemporally aware and incorporate information from the river network. The original intent of
this repository was to predict stream temperature and streamflow.
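
As a rough illustration of what "incorporating information from the river network" can mean, the sketch below applies one graph-convolution-style averaging step to a toy network. It is not the repository's implementation: the three-segment network, the adjacency matrix, the temperature values, and the function name graph_conv_step are all hypothetical, and real models use learned weight matrices rather than a single scalar.

```python
# Illustrative only: one graph-convolution-style step on a toy river network.
# The network, values, and names are hypothetical and not taken from river-dl.

def graph_conv_step(adjacency, features, weight=1.0):
    """Average each segment's feature over the segments marked in its row."""
    mixed = []
    for row in adjacency:
        degree = sum(row)
        neighbor_sum = sum(a * x for a, x in zip(row, features))
        mixed.append(weight * neighbor_sum / degree if degree else 0.0)
    return mixed

# Toy network: segments 0 and 1 both flow into segment 2.
# Row i marks segment i itself plus the segments directly upstream of it.
ADJACENCY = [
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 1],
]
water_temp_c = [10.0, 14.0, 15.0]  # hypothetical per-segment values

mixed = graph_conv_step(ADJACENCY, water_temp_c)
print(mixed)
```

Headwater segments keep their own value, while the downstream segment blends its own value with those of its upstream neighbors.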

This work is being developed by researchers in the Data Science branch of the U.S. Geological Survey and researchers at the
University of Minnesota in Vipin Kumar's lab. Sources for specific models are included as comments within the code.

Running the code

There are functions for facilitating pre-processing and post-processing of the data in addition to running the models themselves.
Included within the workflow_examples folder of the repository are several example Snakemake workflows that show how to
run the entire process with a variety of models and end-goals.

To run the Snakemake workflow locally:

  1. Install the dependencies listed in the environment.yaml file. With conda you can do this with: conda env create -f environment.yaml
  2. Activate your conda environment with: source activate rdl_torch_tf
  3. (Optional) Install the local river-dl package with: pip install path/to/river-dl/
  4. Edit the river-dl run configuration (including paths for I/O data) in the appropriate config.yml
    from the workflow_examples folder.
  5. Run Snakemake with snakemake --configfile config.yml -s Snakemake --cores <n>
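
The final step can also be scripted. The sketch below only assembles the command line from step 5 and prints it; the helper name build_snakemake_command and the example paths are hypothetical, and the flags are the ones shown above.

```python
import shlex

def build_snakemake_command(configfile, snakefile, cores):
    """Return the argument list for the Snakemake call described in step 5."""
    if cores < 1:
        raise ValueError("cores must be a positive integer")
    return [
        "snakemake",
        "--configfile", str(configfile),
        "-s", str(snakefile),
        "--cores", str(cores),
    ]

# Hypothetical paths: substitute the config and Snakefile you edited
# in the workflow_examples folder.
command = build_snakemake_command("config.yml", "path/to/Snakefile", cores=4)
print(shlex.join(command))

# To actually launch the workflow from the activated conda environment:
# import subprocess; subprocess.run(command, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems if a path contains spaces.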

To run the Snakemake workflow on TallGrass:

  1. Request a GPU allocation and start an interactive shell

     salloc -N 1 -t 2:00:00 -p gpu -A <account> --gres=gpu:1 
     srun -A <account> --pty bash
    
  2. Load the necessary cuda toolkit module and add paths to the cudnn drivers

     module load cuda11.3/toolkit/11.3.0 
     export LD_LIBRARY_PATH=/cm/shared/apps/nvidia/TensorRT-6.0.1.5/lib:/cm/shared/apps/nvidia/cudnn_8.0.5/lib64:$LD_LIBRARY_PATH
    
  3. Follow steps 1-5 above as you would to run the workflow locally (note: you may need to change tensorflow
    to tensorflow-gpu in environment.yaml).
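
If the GPU libraries are not picked up, one quick sanity check is to confirm that every directory on LD_LIBRARY_PATH actually exists on the node. The sketch below is a generic check, not part of river-dl; the function name missing_library_dirs is hypothetical.

```python
import os

def missing_library_dirs(ld_library_path):
    """Return entries of a colon-separated path string that are not directories."""
    entries = [p for p in ld_library_path.split(":") if p]
    return [p for p in entries if not os.path.isdir(p)]

# Inspect the variable exported in step 2 (empty if it was never set).
missing = missing_library_dirs(os.environ.get("LD_LIBRARY_PATH", ""))
if missing:
    print("Not found on this node:")
    for path in missing:
        print("  " + path)
else:
    print("All LD_LIBRARY_PATH entries exist.")
```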

After building your environment, you may want to make sure the recommended versions of PyTorch and CUDA were installed
according to the PyTorch documentation. You can see the installed versions
by calling conda list within your activated environment.
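
The same information can be read from Python. The sketch below reports the PyTorch version, the CUDA version that build was compiled against, and whether a GPU is visible; it falls back to empty values when PyTorch is not importable. The function name cuda_report is hypothetical.

```python
def cuda_report():
    """Summarize the PyTorch build and GPU visibility in the current environment."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return {"torch": None, "cuda_build": None, "gpu_visible": False}
    return {
        "torch": torch.__version__,
        "cuda_build": torch.version.cuda,  # None for CPU-only builds
        "gpu_visible": torch.cuda.is_available(),
    }

report = cuda_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

A cuda_build value that does not match the loaded CUDA toolkit module is a common reason for gpu_visible being False.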

The data

The data used to run this model currently are specific to the Delaware River Basin but will soon be made more generic.


Disclaimer

This software is in the public domain because it contains materials that originally came from the U.S. Geological Survey, an agency of the United States Department of the Interior. For more information, see the official USGS copyright policy.

Although this software program has been used by the U.S. Geological Survey (USGS), no warranty, expressed or implied, is made by the USGS or the U.S. Government as to the accuracy and functioning of the program and related program material nor shall the fact of distribution constitute any such warranty, and no responsibility is assumed by the USGS in connection therewith.

This software is provided “AS IS.”


Committers metadata

Total Commits: 770
Total Committers: 13
Avg Commits per committer: 59.231
Development Distribution Score (DDS): 0.457

Commits in past year: 55
Committers in past year: 3
Avg Commits per committer in past year: 18.333
Development Distribution Score (DDS) in past year: 0.364

Name              Email            Commits
jsadler2          j****r@u****v    418
Janet             j****y@u****v    198
Simon Topp        3****p           56
Jared D. Smith    j****h@u****v    35
Janet Barclay     j****y@i****v    29
Janet Barclay     j****y@i****v    17
Jake Zwart        j****t@u****v    4
Jacob Zwart       j****t@i****v    4
Jeremy Diaz       j****z@c****u    3
Jacob Zwart       j****t@i****v    3
Julie Padilla     j****s@g****m    1
Janet Barclay     j****y@i****v    1
Simon Topp        s****p@i****v    1
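
The summary figures can be reproduced from the per-committer counts. The sketch below assumes the Development Distribution Score is defined as 1 minus the top committer's share of all commits; that definition is an assumption here, although it is consistent with the totals listed above. Each listed row is treated as one committer, as the page does.

```python
# Commit counts copied from the committer list, one entry per row.
commits = [418, 198, 56, 35, 29, 17, 4, 4, 3, 3, 1, 1, 1]

total_commits = sum(commits)
total_committers = len(commits)
avg_commits = total_commits / total_committers

# Assumed definition: DDS = 1 - (commits by top committer / total commits).
dds = 1 - max(commits) / total_commits

print(total_commits, total_committers, round(avg_commits, 3), round(dds, 3))
```

A score near 0 means a single committer wrote almost everything; higher values indicate that work is spread across more people.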

Issue and Pull Request metadata

Total issues: 128
Total pull requests: 92
Average time to close issues: 3 months
Average time to close pull requests: 7 days
Total issue authors: 7
Total pull request authors: 8
Average comments per issue: 2.71
Average comments per pull request: 1.85
Merged pull requests: 86
Bot issues: 0
Bot pull requests: 0

Past year issues: 8
Past year pull requests: 6
Past year average time to close issues: about 1 month
Past year average time to close pull requests: 13 days
Past year issue authors: 2
Past year pull request authors: 3
Past year average comments per issue: 0.25
Past year average comments per pull request: 2.5
Past year merged pull requests: 5
Past year bot issues: 0
Past year bot pull requests: 0

More stats: https://issues.ecosyste.ms/repositories/lookup?url=https://github.com/USGS-R/river-dl

Top Issue Authors

  • jsadler2 (96)
  • janetrbarclay (12)
  • SimonTopp (11)
  • jds485 (6)
  • jdiaz4302 (1)
  • jesse-ross (1)
  • matiiin (1)

Top Pull Request Authors

  • jsadler2 (41)
  • janetrbarclay (34)
  • SimonTopp (10)
  • jds485 (2)
  • jdiaz4302 (2)
  • Reemyos (1)
  • jesse-ross (1)
  • jzwart (1)

Top Issue Labels

  • enhancement (12)
  • experiment (6)
  • logistics (6)
  • backburner (1)

Dependencies

environment.yaml conda
  • dask
  • git
  • jupyterlab
  • matplotlib
  • pandas
  • pip
  • pyarrow
  • pytorch
  • scikit-learn
  • snakemake
  • torchaudio
  • torchvision
  • tqdm
  • xarray
  • zarr
Dockerfile docker
  • python 3.6 build
setup.py pypi

Score: 6.415096959171596