ml_drought
A Machine Learning Pipeline to Predict Vegetation Health.
https://github.com/ecmwfcode4earth/ml_drought
Category: Natural Resources
Sub Category: Soil and Land
Keywords
2019 copernicus machine-learning
Repository metadata
Machine learning to better predict and understand drought. Moving to github.com/ml-clim
- Host: GitHub
- URL: https://github.com/ecmwfcode4earth/ml_drought
- Owner: ECMWFCode4Earth
- Created: 2019-05-01T16:05:56.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2022-05-18T18:41:07.000Z (almost 3 years ago)
- Last Synced: 2025-03-16T16:02:12.561Z (about 2 months ago)
- Topics: 2019, copernicus, machine-learning
- Language: Jupyter Notebook
- Homepage: https://ml-clim.github.io/drought-prediction/
- Size: 309 MB
- Stars: 93
- Watchers: 7
- Forks: 19
- Open Issues: 42
- Releases: 0
Metadata Files:
- Readme: README.md
README.md
A Machine Learning Pipeline for Climate Science
This repository is an end-to-end pipeline for the creation, intercomparison and evaluation of machine learning methods in climate science.
The pipeline carries out a number of tasks to create a unified-data format for training and testing machine learning methods.
These tasks are split into the different classes defined in the src folder and explained further below:
NOTE: Some basic working knowledge of Python is required to use this pipeline, although it is not too onerous.
Using the Pipeline
There are three entrypoints to the pipeline:
A blog post describing the goals and design of the pipeline can be found here.
View the initial presentation of our pipeline here.
Setup
Anaconda running Python 3.7 is used as the package manager. To get set up with an environment, install Anaconda from the link above, and (from this directory) run
conda env create -f environment.yml
This will create an environment named esowc-drought with all the necessary packages to run the code. To activate this environment, run
conda activate esowc-drought
Docker can also be used to run this code. To do this, first either start the Docker app (Docker Desktop) or configure docker-machine:
# on macOS
brew install docker-machine docker
docker-machine create --driver virtualbox default
docker-machine env default
See here for help on all machines or here for macOS.
Then build the docker image:
docker build -t ml_drought .
Then, use it to run a container, mounting the data folder to the container:
docker run -it \
--mount type=bind,source=<PATH_TO_DATA>,target=/ml_drought/data \
ml_drought /bin/bash
You will also need to create a .cdsapirc file with the following information:
url: https://cds.climate.copernicus.eu/api/v2
key: <INSERT KEY HERE>
verify: 1
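A quick way to catch a malformed .cdsapirc before a download fails is to parse it and check that the expected entries are present. The sketch below is illustrative only: the helper names parse_cdsapirc and missing_keys are hypothetical and not part of this repository, and treating url and key as the required entries is an assumption based on the three lines shown above.

```python
# Hypothetical helpers for sanity-checking a .cdsapirc file.
# Assumption: "url" and "key" are the entries a client cannot do without.
REQUIRED_KEYS = ("url", "key")


def parse_cdsapirc(text: str) -> dict:
    """Parse 'name: value' lines into a dict, skipping blanks and comments."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so the "https://" in the URL survives.
        name, sep, value = line.partition(":")
        if sep:
            config[name.strip()] = value.strip()
    return config


def missing_keys(config: dict) -> list:
    """Return the required entries that are absent or empty."""
    return [name for name in REQUIRED_KEYS if not config.get(name)]


example = """\
url: https://cds.climate.copernicus.eu/api/v2
key: <INSERT KEY HERE>
verify: 1
"""

config = parse_cdsapirc(example)
print(config["url"])
print(missing_keys(config))
```

To check a real file, read its text and pass it to parse_cdsapirc. The placeholder key above passes the presence check, so this only confirms the file's shape, not that the credentials are valid.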
Testing
This pipeline can be tested by running pytest. flake8 is used for linting.
We use mypy for type checking. To run it on the src directory, use mypy src.
We use black for code formatting.
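The four tools above can be chained into a single pre-commit style check. The following is a minimal sketch, not a script shipped with the repository: run_checks is a hypothetical helper, and the black --check . invocation is an assumption, since the README states only that black is used for formatting.

```python
import subprocess

# Commands taken from the Testing section. "black --check ." is an assumption:
# the README does not say how black is invoked.
CHECKS = (
    ["pytest"],
    ["flake8"],
    ["mypy", "src"],
    ["black", "--check", "."],
)


def run_checks(checks=CHECKS, runner=subprocess.run):
    """Run each command and return those that exited with a non-zero code.

    The runner argument exists so the function can be exercised without the
    tools installed; by default it is subprocess.run.
    """
    failed = []
    for command in checks:
        result = runner(command)
        if result.returncode != 0:
            failed.append(command)
    return failed
```

Calling run_checks() from the repository root, inside the activated environment, returns an empty list when every tool passes. All commands run even after one fails, so a single pass reports every problem.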
Team: @tommylees112, @gabrieltseng
For updates, follow @tommylees112 on Twitter or look out for our blog posts!
Acknowledgements
This project was completed as part of the ECMWF Summer of Weather Code Challenge #12. The challenge was set up to use ECMWF/Copernicus open datasets to evaluate machine learning techniques for the prediction of droughts.
Huge thanks to @ECMWF for making this project possible!
Owner metadata
- Name: ECMWF Code for Earth
- Login: ECMWFCode4Earth
- Email:
- Kind: organization
- Description: ECMWF Code for Earth is a collaborative programme where each summer several developer teams work on innovative earth sciences-related software.
- Website: https://codeforearth.ecmwf.int
- Location: Online
- Twitter: ECMWFCode4Earth
- Company:
- Icon url: https://avatars.githubusercontent.com/u/44897980?v=4
- Repositories: 37
- Last synced at: 2023-05-10T13:37:08.293Z
- Profile URL: https://github.com/ECMWFCode4Earth
GitHub Events
Total
- Watch event: 1
- Fork event: 1
Last Year
- Watch event: 1
- Fork event: 1
Committers metadata
Last synced: 2 days ago
Total Commits: 248
Total Committers: 3
Avg Commits per committer: 82.667
Development Distribution Score (DDS): 0.46
Commits in past year: 0
Committers in past year: 0
Avg Commits per committer in past year: 0.0
Development Distribution Score (DDS) in past year: 0.0
Name | Email | Commits |
---|---|---|
tommylees112 | t****2@g****m | 134 |
Gabriel Tseng | g****g@m****a | 113 |
Julia Wagemann | w****a@g****e | 1 |
Committer domains:
- gmx.de: 1
- mail.mcgill.ca: 1
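The committer figures above can be reproduced from the per-committer commit counts. The sketch below assumes the Development Distribution Score is defined as one minus the top committer's share of all commits; that definition is an assumption, not something stated on this page, although it does reproduce the listed value. The function name is hypothetical.

```python
def development_distribution_score(commit_counts):
    """Assumed definition: 1 - (commits by top committer / total commits)."""
    total = sum(commit_counts)
    if total == 0:
        # No commits in the period: report 0.0, as the past-year rows do.
        return 0.0
    return 1 - max(commit_counts) / total


# Commit counts copied from the committers table.
commits = {"tommylees112": 134, "Gabriel Tseng": 113, "Julia Wagemann": 1}
counts = list(commits.values())

total = sum(counts)
average = total / len(counts)
dds = development_distribution_score(counts)

print(total)              # 248
print(round(average, 3))  # 82.667
print(round(dds, 2))      # 0.46
```

A score near 0 means one person wrote almost everything. Here the two main authors split the work fairly evenly, which is why the score sits close to 0.5.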
Issue and Pull Request metadata
Last synced: 1 day ago
Total issues: 41
Total pull requests: 131
Average time to close issues: 2 months
Average time to close pull requests: 17 days
Total issue authors: 8
Total pull request authors: 2
Average comments per issue: 1.1
Average comments per pull request: 0.79
Merged pull requests: 108
Bot issues: 0
Bot pull requests: 0
Past year issues: 0
Past year pull requests: 0
Past year average time to close issues: N/A
Past year average time to close pull requests: N/A
Past year issue authors: 0
Past year pull request authors: 0
Past year average comments per issue: 0
Past year average comments per pull request: 0
Past year merged pull requests: 0
Past year bot issues: 0
Past year bot pull requests: 0
Top Issue Authors
- tommylees112 (25)
- cvitolo (5)
- jwagemann (4)
- gabrieltseng (2)
- shaunharrigan (2)
- rpitonak (1)
- AlineBornschein (1)
- v2thegreat (1)
Top Pull Request Authors
- gabrieltseng (74)
- tommylees112 (57)
Top Issue Labels
- review (11)
- export (2)
- model validation (2)
- pipeline entrypoint (1)
- analysis (1)
Top Pull Request Labels
- modelling (33)
- wip (16)
- preprocess (14)
- feature engineering (13)
- export (12)
- model validation (12)
- analysis (7)
- documentation (4)
- pipeline entrypoint (1)
Dependencies
- continuumio/miniconda3 latest build
Score: 6.00388706710654