A curated list of open technology projects to sustain a stable climate, energy supply, biodiversity and natural resources.

weather-tools

A series of command-line tools to make common data engineering tasks easier for researchers in climate and weather.
https://github.com/google/weather-tools

Category: Climate Change
Sub Category: Climate Data Processing and Analysis

Keywords

apache-beam python weather

Keywords from Contributors

archiving measur transforms animals optimize compose generic conversion observation projection

Last synced: about 16 hours ago

Repository metadata

Tools to make weather data accessible and useful.

README.md

weather-tools

Apache Beam pipelines to make weather data accessible and useful.

Introduction

This project contributes a series of command-line tools to make common data engineering tasks easier for researchers in
climate and weather. These solutions were born out of the need to improve repeated work performed by research teams
across Alphabet.

The first tool created was the weather downloader (weather-dl). This makes it easier to ingest data from the European
Centre for Medium-Range Weather Forecasts (ECMWF). weather-dl enables users to describe very specifically what data they'd
like to ingest from ECMWF's catalogs. It also offers them control over how to parallelize requests, empowering users to
retrieve data efficiently. Downloads are driven from a
configuration file, which can be reviewed (and version-controlled) independently of pipeline or
analysis code.
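
To illustrate the idea of a config-driven, partitioned download, here is a minimal Python sketch. It is not weather-dl's implementation, and the section and key names (`[parameters]`, `[selection]`, `partition_keys`) are assumptions chosen for illustration; consult the project's configuration reference for the real format. The sketch expands one request into independent sub-requests, one per combination of the partition keys, which is what makes it possible to issue them in parallel.

```python
import configparser
import itertools

# Hypothetical config text; the key names are illustrative, not the tool's documented schema.
EXAMPLE_CFG = """
[parameters]
partition_keys = year, month

[selection]
variable = temperature
year = 2019, 2020
month = 01, 02, 03
"""


def expand_requests(cfg_text):
    """Return one selection dict per combination of the partition keys."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    # Every selection value is treated as a comma-separated list.
    selection = {
        key: [v.strip() for v in value.split(",")]
        for key, value in parser["selection"].items()
    }
    keys = [k.strip() for k in parser["parameters"]["partition_keys"].split(",")]
    requests = []
    for combo in itertools.product(*(selection[k] for k in keys)):
        # Start from the shared values, then pin each partition key to one value.
        request = dict(selection)
        request.update({k: [v] for k, v in zip(keys, combo)})
        requests.append(request)
    return requests


if __name__ == "__main__":
    for request in expand_requests(EXAMPLE_CFG):
        print(request)
```

Because each sub-request is an ordinary dictionary, the list can be reviewed before anything is downloaded, which mirrors the benefit of keeping the configuration separate from pipeline code.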

We also provide two additional tools to aid climate and weather researchers: the weather mover (weather-mv) and the
weather splitter (weather-sp). These CLIs are still in alpha, yet they have already been used in production workflows
by several partner teams.

We created the weather mover (weather-mv) to load geospatial data from cloud buckets
into Google BigQuery. This enables rapid exploratory analysis and visualization of
weather data: from BigQuery, scientists can load arbitrary climate data fields into a pandas DataFrame or an xarray
Dataset via a simple SQL query.
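
As a rough sketch of that workflow, the helper below builds a SQL query for one field over a time range. The table layout is an assumption: the column names `time`, `latitude`, and `longitude`, the example table, and the example field name are hypothetical and may not match the schema weather-mv produces for your data. `build_field_query` is an illustrative name, not part of weather-tools.

```python
import re
from datetime import datetime

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_TABLE = re.compile(r"^[A-Za-z0-9_\-]+\.[A-Za-z0-9_]+\.[A-Za-z0-9_\-]+$")


def build_field_query(table, field, start, end):
    """Build a SQL string selecting one field between two ISO-8601 timestamps.

    Assumes (hypothetically) that the table has `time`, `latitude`, and
    `longitude` columns alongside the requested field.
    """
    if not _TABLE.match(table):
        raise ValueError(f"expected project.dataset.table, got {table!r}")
    if not _IDENTIFIER.match(field):
        raise ValueError(f"invalid column name: {field!r}")
    # Round-trip the timestamps so only well-formed values reach the SQL text.
    start_ts = datetime.fromisoformat(start).isoformat()
    end_ts = datetime.fromisoformat(end).isoformat()
    return (
        f"SELECT time, latitude, longitude, {field}\n"
        f"FROM `{table}`\n"
        f"WHERE time >= TIMESTAMP('{start_ts}') AND time < TIMESTAMP('{end_ts}')\n"
        f"ORDER BY time"
    )


if __name__ == "__main__":
    print(build_field_query("my-project.weather.era5", "temperature",
                            "2020-01-01", "2020-01-02"))
```

With the `google-cloud-bigquery` client library installed, passing the resulting string to `bigquery.Client().query(sql).to_dataframe()` returns the rows as a pandas DataFrame.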

The weather splitter (weather-sp) helps normalize how archival weather data is stored in cloud buckets.
Whether you're trying to merge two datasets with overlapping variables or simply need
to open GRIB data from xarray, it's really useful to split datasets into
their component variables.

Installing

We currently recommend creating a local Python environment (with
Anaconda) and installing the
sources as follows:

conda env create --name weather-tools --file=environment.yml
conda activate weather-tools

Note: Due to its use of 3rd-party binary dependencies such as GDAL and MetView, weather-tools
is transitioning from PyPI to Conda for its main release channel. The instructions above
are a temporary workaround before our Conda-forge release.

From here, you can use the weather-* tools from your Python environment. Currently, the following tools are available:

  • weather-dl (beta) – Download weather data (namely, from ECMWF's API).
  • ⛅️ weather-mv (alpha) – Load weather data into analytics engines, like BigQuery.
  • 🌪 weather-sp (alpha) – Split weather data by arbitrary dimensions.

Quickstart

In this tutorial, we will
download the ERA5 pressure level dataset
and ingest it into Google BigQuery using weather-dl and weather-mv, respectively.

Prerequisites

  1. Register for a license from
    ECMWF's Copernicus (CDS) API.
  2. Install your license by copying your API URL and key from this page to a new file, $HOME/.cdsapirc.[^1] The file should look like this:
    url: https://cds.climate.copernicus.eu/api/v2
    key: <YOUR_USER_ID>:<YOUR_API_KEY>
    
  3. If you do not already have a Google Cloud project, create one by following
    these steps. If you are working on
    an existing project, make sure your user has the BigQuery Admin role.
    To learn more about granting IAM roles to users in Google Cloud, visit the
    official docs.
  4. Create an empty BigQuery Dataset. This can be done using
    the Google Cloud Console
    or via the bq CLI tool.
    For example:
    bq mk --project_id=$PROJECT_ID $DATASET_ID
    
  5. Follow these steps
    to create a bucket for staging temporary files in Google Cloud Storage.
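
Before running the tools, it can help to check that the `$HOME/.cdsapirc` file from step 2 is well-formed. The following is a minimal sketch, not part of weather-tools; `parse_cdsapirc` and `read_cdsapirc` are hypothetical helpers that only check for the two entries shown above and treat lines starting with `#` as comments (an assumption of this sketch).

```python
from pathlib import Path


def parse_cdsapirc(text):
    """Parse `name: value` lines and require the `url` and `key` entries."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so URLs and `id:key` values stay intact.
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"expected 'name: value', got {line!r}")
        settings[name.strip()] = value.strip()
    missing = {"url", "key"} - settings.keys()
    if missing:
        raise ValueError(f"missing entries: {sorted(missing)}")
    return settings


def read_cdsapirc(path=None):
    """Read and validate the license file, defaulting to $HOME/.cdsapirc."""
    path = Path(path) if path else Path.home() / ".cdsapirc"
    return parse_cdsapirc(path.read_text())
```

Calling `read_cdsapirc()` raises `FileNotFoundError` if the file is absent and `ValueError` if either entry is missing, which surfaces a bad setup before any download starts.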

Steps

For the purpose of this tutorial, we will use your local machine to run the
data pipelines. Note that all weather-tools can also be run in Cloud Dataflow,
which is fully managed and easier to scale.

  1. Use weather-dl to download the ERA5 pressure level dataset.

    weather-dl configs/era5_example_config_local_run.cfg \
       --local-run # Use the local machine
    

    Recommendation: Pass the -d, --dry-run flag to any of these commands to preview the effects.

    NOTE: By default, local downloads are saved to the ./local_run directory unless another file system is specified.
    The recommended output location for weather-dl is Cloud Storage.
    The source and destination of the download are configured using the .cfg configuration file which is passed to the command.
    To learn more about this configuration file's format and features,
    see this reference. To learn more about the weather-dl command, visit here.

  2. (optional) Split your downloaded dataset up with weather-sp:

     weather-sp --input-pattern "./local_run/era5-*.nc" \
        --output-dir "split_data" 
    

    Visit the weather-sp docs for more information.

  3. Use weather-mv to ingest the downloaded data into BigQuery, in a structured format.

    # --uris: use "./split_data/**.nc" instead if weather-sp was used.
    # --output_table: the path to the destination BigQuery table.
    # --temp_location: needed to stage temporary files before writing to BigQuery.
    weather-mv bigquery --uris "./local_run/**.nc" \
       --output_table "$PROJECT.$DATASET_ID.$TABLE_ID" \
       --temp_location "gs://$BUCKET/tmp" \
       --direct_num_workers 2
    

    See these docs for more about the weather-mv command.
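
The three steps can also be scripted. The sketch below only assembles the command lines shown in this tutorial and, optionally, runs them with `subprocess`. `build_commands` and `run_all` are hypothetical names, and the flags are limited to those used above; this is an illustration rather than a supported interface.

```python
import shlex
import subprocess


def build_commands(config, table, temp_location, split=False, dry_run=False):
    """Return the tutorial's commands as argument lists, in execution order."""
    data_glob = "./local_run/**.nc"
    commands = [["weather-dl", config, "--local-run"]]
    if split:
        commands.append(["weather-sp", "--input-pattern", "./local_run/era5-*.nc",
                         "--output-dir", "split_data"])
        data_glob = "./split_data/**.nc"
    commands.append(["weather-mv", "bigquery", "--uris", data_glob,
                     "--output_table", table,
                     "--temp_location", temp_location,
                     "--direct_num_workers", "2"])
    if dry_run:
        # This tutorial recommends --dry-run to preview the effect of each command.
        commands = [cmd + ["--dry-run"] for cmd in commands]
    return commands


def run_all(commands, execute=False):
    """Print each command; run it only when execute=True, stopping on failure."""
    for cmd in commands:
        print(shlex.join(cmd))
        if execute:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(build_commands("configs/era5_example_config_local_run.cfg",
                           "my-project.my_dataset.my_table", "gs://my-bucket/tmp",
                           dry_run=True))
```

The commands are passed as argument lists without a shell, so the glob patterns reach the tools unexpanded, just as the quoted patterns do in the shell examples.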

That's it! After the pipeline has completed, you should be able to query the ingested
dataset in the BigQuery SQL workspace
and analyze it using BigQuery ML.

Contributing

The weather tools are under active development, and contributions are welcome! Please check out
our guide to get started.

License

This is not an official Google product.

Copyright 2021 Google LLC

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

[^1]: Note that you need to be logged in for the CDS API page to actually show your user ID and API key. Otherwise, it will display a placeholder, which is confusing to some users.


Committers metadata

Last synced: 6 days ago

Total Commits: 353
Total Committers: 33
Avg Commits per committer: 10.697
Development Distribution Score (DDS): 0.626

Commits in past year: 37
Committers in past year: 10
Avg Commits per committer in past year: 3.7
Development Distribution Score (DDS) in past year: 0.703

Name Email Commits
Alex Merose a****r@g****m 132
Rahul Mahrsee 8****7 40
DeepGabani 6****8 27
Darshan Prajapati 9****9 20
aniketinfocusp 1****p 20
Ulrike Hager u****r@g****m 16
dabhi_cusp 1****p 13
Aniket Singh Rawat 1****t 12
Cillian Fennell c****s@g****m 8
Jash Rana 1****4 8
Piyush-Ingale 1****e 6
ksic8 1****8 6
Shail Parekh 1****h 5
Aaron Bell a****l@g****m 4
pbattaglia p****a 4
pramodg 6****g 4
Steven Greenberg s****g@g****m 4
Iman Akbari i****i@g****m 3
David Lowell d****l@g****m 3
Sean Campbell c****n@g****m 2
Valliappa (Lak) Lakshmanan l****k@v****m 2
dependabot[bot] 4****] 2
ksic8 k****l@i****n 2
kbp45-cusp k****h@i****m 1
Sascha Kahrs s****s@g****m 1
Shirish Jamthe s****e@g****m 1
Stephan Rasp r****n@g****m 1
Saverio Guzzo s****o@g****m 1
SangamSwadik s****8@g****m 1
RoshiniFernando 4****o 1
and 3 more...


Issue and Pull Request metadata

Last synced: 1 day ago

Total issues: 153
Total pull requests: 356
Average time to close issues: 2 months
Average time to close pull requests: 10 days
Total issue authors: 21
Total pull request authors: 29
Average comments per issue: 1.16
Average comments per pull request: 0.56
Merged pull request: 299
Bot issues: 0
Bot pull requests: 3

Past year issues: 5
Past year pull requests: 47
Past year average time to close issues: 1 day
Past year average time to close pull requests: 5 days
Past year issue authors: 3
Past year pull request authors: 10
Past year average comments per issue: 2.0
Past year average comments per pull request: 0.15
Past year merged pull request: 40
Past year bot issues: 0
Past year bot pull requests: 2

More stats: https://issues.ecosyste.ms/repositories/lookup?url=https://github.com/google/weather-tools

Top Issue Authors

  • alxmrs (84)
  • mahrsee1997 (13)
  • blackvvine (12)
  • deepgabani8 (9)
  • dabhicusp (6)
  • ksic8 (4)
  • CillianFn (4)
  • mt467 (4)
  • aniketinfocusp (3)
  • DarshanSP19 (2)
  • floraxue (2)
  • jongwooo (1)
  • pramodg (1)
  • heyanand (1)
  • lakshmanok (1)

Top Pull Request Authors

  • alxmrs (86)
  • aniketinfocusp (58)
  • mahrsee1997 (44)
  • deepgabani8 (35)
  • DarshanSP19 (20)
  • aniketsinghrawat (16)
  • dabhicusp (14)
  • j9sh264 (12)
  • CillianFn (9)
  • pbattaglia (8)
  • Piyush-Ingale (7)
  • ksic8 (6)
  • blackvvine (6)
  • shail-parekh (5)
  • pramodg (5)

Top Issue Labels

  • weather-mv (31)
  • weather-dl (31)
  • bug (27)
  • P1 (23)
  • P2 (20)
  • good first issue (17)
  • enhancement (11)
  • documentation (7)
  • help wanted (5)
  • P0 (5)
  • P3 (4)
  • weather-sp (4)
  • can't reproduce (1)
  • 20% (1)
  • wontfix (1)

Top Pull Request Labels

  • weather-dl (3)
  • dependencies (3)
  • weather-sp (2)
  • enhancement (2)
  • weather-mv (2)
  • github_actions (1)

Package metadata

pypi.org: google-weather-tools

Apache Beam pipelines to make weather data accessible and useful.

  • Homepage: https://weather-tools.readthedocs.io/
  • Documentation: https://google-weather-tools.readthedocs.io/
  • Licenses: License :: OSI Approved :: Apache Software License
  • Latest release: 0.3.2 (published almost 3 years ago)
  • Last Synced: 2025-04-25T14:35:05.678Z (1 day ago)
  • Versions: 7
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 300 Last month
  • Rankings:
    • Dependent packages count: 10.002%
    • Average: 21.348%
    • Dependent repos count: 21.718%
    • Downloads: 32.324%
  • Maintainers (1)

Dependencies

docs/requirements.txt pypi
  • myst-parser ==0.13.7
  • sphinx >=2.1
setup.py pypi
  • apache-beam *
.github/workflows/ci.yml actions
  • actions/cache v2 composite
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • conda-incubator/setup-miniconda v2 composite
  • s-weigand/setup-conda v1 composite
  • styfle/cancel-workflow-action 0.7.0 composite
.github/workflows/publish.yml actions
  • actions/checkout v2 composite
  • actions/download-artifact v2 composite
  • actions/setup-python v2.3.1 composite
  • actions/upload-artifact v2 composite
  • pypa/gh-action-pypi-publish v1.4.2 composite
Dockerfile docker
  • apache/beam_python${py_version}_sdk 2.40.0 build
  • continuumio/miniconda3 latest build
environment.yml pypi
  • cython ==0.29.34
  • earthengine-api ==0.1.329
  • firebase-admin ==6.0.1
pyproject.toml pypi
weather_dl/setup.py pypi
weather_mv/setup.py pypi
weather_sp/setup.py pypi

Score: 14.962676792428262