weather-tools
A series of command-line tools to make common data engineering tasks easier for researchers in climate and weather.
https://github.com/google/weather-tools
Category: Climate Change
Sub Category: Climate Data Processing and Analysis
Keywords
apache-beam python weather
Keywords from Contributors
archiving measur transforms animals optimize compose generic conversion observation projection
Last synced: about 16 hours ago
Repository metadata
Tools to make weather data accessible and useful.
- Host: GitHub
- URL: https://github.com/google/weather-tools
- Owner: google
- License: apache-2.0
- Created: 2021-11-22T22:30:09.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2025-04-16T07:19:26.000Z (11 days ago)
- Last Synced: 2025-04-25T14:44:20.445Z (1 day ago)
- Topics: apache-beam, python, weather
- Language: Python
- Homepage: https://weather-tools.readthedocs.io/
- Size: 196 MB
- Stars: 227
- Watchers: 14
- Forks: 43
- Open Issues: 89
- Releases: 8
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
README.md
weather-tools
Apache Beam pipelines to make weather data accessible and useful.
Introduction
This project contributes a series of command-line tools to make common data engineering tasks easier for researchers in
climate and weather. These solutions were born out of the need to improve repeated work performed by research teams
across Alphabet.
The first tool created was the weather downloader (weather-dl). It makes it easier to ingest data from the European
Centre for Medium-Range Weather Forecasts (ECMWF). weather-dl lets users describe precisely which data they'd
like to ingest from ECMWF's catalogs. It also gives them control over how requests are parallelized, so that data
can be retrieved efficiently. Downloads are driven from a configuration file, which can be reviewed (and
version-controlled) independently of pipeline or analysis code.
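To make the idea of a config-driven, partitioned download concrete, here is a minimal sketch in Python. It is not the weather-dl implementation. The config text, the `partition_keys` and `target_path` entries, the `{year}`/`{month}` placeholders, and the helper `expand_partitions` are assumptions made for illustration; the real format is described in the project's configuration reference. The sketch only shows how one config could fan out into several independent requests that can then be fetched in parallel.

```python
import configparser
import itertools

# Hypothetical config text, loosely modeled on an INI-style download config.
# Section and key names are assumptions for this sketch.
EXAMPLE_CFG = """\
[parameters]
client = cds
dataset = reanalysis-era5-pressure-levels
target_path = ./local_run/era5-{year}-{month}.nc
partition_keys =
    year
    month

[selection]
variable = temperature
year =
    2020
    2021
month =
    01
    02
    03
"""


def expand_partitions(cfg_text):
    """Return one (target_path, request) pair per combination of partition keys."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)

    keys = parser["parameters"]["partition_keys"].split()
    template = parser["parameters"]["target_path"]
    # Every selection value becomes a list, so multi-line values are handled uniformly.
    selection = {name: value.split() for name, value in parser["selection"].items()}

    requests = []
    for combo in itertools.product(*(selection[key] for key in keys)):
        chosen = dict(zip(keys, combo))
        # Narrow the full selection down to the single value chosen for each partition key.
        request = {**selection, **{key: [value] for key, value in chosen.items()}}
        requests.append((template.format(**chosen), request))
    return requests


partitions = expand_partitions(EXAMPLE_CFG)
for path, request in partitions:
    print(path, request)
```

With two years and three months selected, the sketch yields six requests, each with its own output path. Choosing fewer or more partition keys changes how finely the work is divided.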
We also provide two additional tools to aid climate and weather researchers: the weather mover (weather-mv) and the
weather splitter (weather-sp). These CLIs are still in their alpha stages of development, yet they have already been
used in production workflows by several partner teams.
We created the weather mover (weather-mv) to load geospatial data from cloud buckets into Google BigQuery. This
enables rapid exploratory analysis and visualization of weather data: from BigQuery, scientists can load arbitrary
climate data fields into a Pandas or XArray dataframe via a simple SQL query.
The weather splitter (weather-sp) helps normalize how archival weather data is stored in cloud buckets. Whether
you're trying to merge two datasets with overlapping variables or simply need to open GRIB data from XArray, it's
really useful to split datasets into their component variables.
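As a rough illustration of what splitting a dataset into its component variables means, the following Python sketch groups toy records by variable and assigns each group its own output path. It does not reflect how weather-sp reads or writes GRIB or NetCDF files. The record layout, the sample values, the helper `split_by_variable`, and the `split_data/{variable}.nc` template are all hypothetical.

```python
from collections import defaultdict


def split_by_variable(records, output_template="split_data/{variable}.nc"):
    """Group records by their 'variable' field, one output path per variable."""
    groups = defaultdict(list)
    for record in records:
        path = output_template.format(variable=record["variable"])
        groups[path].append(record)
    return dict(groups)


# Toy records standing in for the messages of a mixed-variable file.
records = [
    {"variable": "temperature", "level": 500, "value": 252.1},
    {"variable": "geopotential", "level": 500, "value": 54321.0},
    {"variable": "temperature", "level": 850, "value": 271.4},
]

split = split_by_variable(records)
for path, group in sorted(split.items()):
    print(path, len(group))
```

Once every output holds a single variable, files can be opened or merged without worrying about overlapping fields.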
Installing
We currently recommend creating a local Python environment (with Anaconda) and installing the sources as follows:
conda env create --name weather-tools --file=environment.yml
conda activate weather-tools
Note: Due to its use of 3rd-party binary dependencies such as GDAL and MetView, weather-tools is transitioning from
PyPI to Conda for its main release channel. The instructions above are a temporary workaround before our Conda-forge
release.
From here, you can use the weather-* tools from your Python environment. Currently, the following tools are available:
- ⛈ weather-dl (beta) – Download weather data (namely, from ECMWF's API).
- ⛅️ weather-mv (alpha) – Load weather data into analytics engines, like BigQuery.
- 🌪 weather-sp (alpha) – Split weather data by arbitrary dimensions.
Quickstart
In this tutorial, we will download the ERA5 pressure level dataset and ingest it into Google BigQuery using
weather-dl and weather-mv, respectively.
Prerequisites
- Register for a license from ECMWF's Copernicus (CDS) API.
- Install your license by copying your API url & key from this page to a new file $HOME/.cdsapirc.[^1] The file should look like this:
    url: https://cds.climate.copernicus.eu/api/v2
    key: <YOUR_USER_ID>:<YOUR_API_KEY>
- If you do not already have a Google Cloud project, create one by following these steps. If you are working on
  an existing project, make sure your user has the BigQuery Admin role. To learn more about granting IAM roles to
  users in Google Cloud, visit the official docs.
- Create an empty BigQuery Dataset. This can be done using the Google Cloud Console or via the bq CLI tool. For example:
    bq mk --project_id=$PROJECT_ID $DATASET_ID
- Follow these steps to create a bucket for staging temporary files in Google Cloud Storage.
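Before running a download, it can help to confirm that the license file has the two entries shown in the prerequisites. The following is a minimal sketch, not part of weather-tools: the helper `parse_cdsapirc` is hypothetical, and it only checks that `url` and `key` are present, without validating them against the API.

```python
def parse_cdsapirc(text):
    """Parse 'name: value' lines and require non-empty 'url' and 'key' entries."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, since both the URL and the key contain colons.
        name, separator, value = line.partition(":")
        if separator:
            settings[name.strip()] = value.strip()

    missing = [name for name in ("url", "key") if not settings.get(name)]
    if missing:
        raise ValueError("missing entries in .cdsapirc: " + ", ".join(missing))
    return settings


# Placeholder content copied from the prerequisites; replace with your own file's text.
example = (
    "url: https://cds.climate.copernicus.eu/api/v2\n"
    "key: <YOUR_USER_ID>:<YOUR_API_KEY>\n"
)
settings = parse_cdsapirc(example)
print(settings["url"])
```

To check a real file, pass the contents of `$HOME/.cdsapirc` to the function instead of the placeholder string.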
Steps
For the purpose of this tutorial, we will use your local machine to run the data pipelines. Note that all of the
weather-tools can also be run in Cloud Dataflow, which is fully managed and easier to scale.
- Use weather-dl to download the ERA5 pressure level dataset.
    weather-dl configs/era5_example_config_local_run.cfg \
      --local-run  # Use the local machine
  Recommendation: Pass the -d, --dry-run flag to any of these commands to preview the effects.
  NOTE: By default, local downloads are saved to the ./local_run directory unless another file system is specified.
  The recommended output location for weather-dl is Cloud Storage. The source and destination of the download are
  configured using the .cfg configuration file which is passed to the command. To learn more about this
  configuration file's format and features, see this reference. To learn more about the weather-dl command, visit here.
- (optional) Split your downloaded dataset up with weather-sp:
    weather-sp --input-pattern "./local_run/era5-*.nc" \
      --output-dir "split_data"
  Visit the weather-sp docs for more information.
- Use weather-mv to ingest the downloaded data into BigQuery, in a structured format.
    # --uris: use "./split_data/**.nc" instead if weather-sp was used
    # --output_table: the path to the destination BigQuery table
    # --temp_location: needed to stage temporary files before writing to BigQuery
    weather-mv bigquery --uris "./local_run/**.nc" \
      --output_table "$PROJECT.$DATASET_ID.$TABLE_ID" \
      --temp_location "gs://$BUCKET/tmp" \
      --direct_num_workers 2
  See these docs for more about the weather-mv command.
That's it! After the pipeline has completed, you should be able to query the ingested dataset in the BigQuery SQL
workspace and analyze it using BigQuery ML.
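The ingested table can also be pulled into Python. The sketch below is one possible approach and makes several assumptions: the table name `my-project.my_dataset.era5` is hypothetical, the column names `time`, `latitude`, `longitude`, and `temperature` are guesses at the ingested schema that you should check against your own table, and the `google-cloud-bigquery` client library (plus `pandas`) must be installed for `load_dataframe` to run. The table and column names are interpolated directly into the SQL, so pass only trusted values.

```python
import datetime


def build_query(table, variable):
    """Build a SQL query selecting one variable over a time window."""
    return (
        f"SELECT time, latitude, longitude, {variable} FROM `{table}` "
        "WHERE time BETWEEN @start_time AND @end_time ORDER BY time"
    )


def load_dataframe(table, variable, start_time, end_time):
    """Run the query and return a pandas DataFrame (needs google-cloud-bigquery)."""
    from google.cloud import bigquery  # imported lazily; third-party dependency

    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("start_time", "TIMESTAMP", start_time),
            bigquery.ScalarQueryParameter("end_time", "TIMESTAMP", end_time),
        ]
    )
    client = bigquery.Client()
    job = client.query(build_query(table, variable), job_config=job_config)
    return job.to_dataframe()


# Hypothetical table and column names, for illustration only.
sql = build_query("my-project.my_dataset.era5", "temperature")
window = (datetime.datetime(2020, 1, 1), datetime.datetime(2020, 1, 2))
print(sql)
```

Calling `load_dataframe("my-project.my_dataset.era5", "temperature", *window)` would then return the matching rows, provided your Google Cloud credentials are configured.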
Contributing
The weather tools are under active development, and contributions are welcome! Please check out
our guide to get started.
License
This is not an official Google product.
Copyright 2021 Google LLC
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
[^1]: Note that you need to be logged in for the CDS API page to actually show your user ID and API key. Otherwise, it will display a placeholder, which is confusing to some users.
Owner metadata
- Name: Google
- Login: google
- Email: [email protected]
- Kind: organization
- Description: Google ❤️ Open Source
- Website: https://opensource.google/
- Location: United States of America
- Twitter: GoogleOSS
- Company:
- Icon url: https://avatars.githubusercontent.com/u/1342004?v=4
- Repositories: 2754
- Last synced at: 2025-04-19T22:26:37.811Z
- Profile URL: https://github.com/google
GitHub Events
Total
- Issues event: 5
- Watch event: 15
- Delete event: 5
- Issue comment event: 11
- Push event: 14
- Pull request review comment event: 42
- Pull request event: 21
- Pull request review event: 57
- Fork event: 3
- Create event: 3
Last Year
- Issues event: 5
- Watch event: 15
- Delete event: 5
- Issue comment event: 11
- Push event: 14
- Pull request review comment event: 42
- Pull request event: 21
- Pull request review event: 57
- Fork event: 3
- Create event: 3
Committers metadata
Last synced: 6 days ago
Total Commits: 353
Total Committers: 33
Avg Commits per committer: 10.697
Development Distribution Score (DDS): 0.626
Commits in past year: 37
Committers in past year: 10
Avg Commits per committer in past year: 3.7
Development Distribution Score (DDS) in past year: 0.703
Name | Email | Commits |
---|---|---|
Alex Merose | a****r@g****m | 132 |
Rahul Mahrsee | 8****7 | 40 |
DeepGabani | 6****8 | 27 |
Darshan Prajapati | 9****9 | 20 |
aniketinfocusp | 1****p | 20 |
Ulrike Hager | u****r@g****m | 16 |
dabhi_cusp | 1****p | 13 |
Aniket Singh Rawat | 1****t | 12 |
Cillian Fennell | c****s@g****m | 8 |
Jash Rana | 1****4 | 8 |
Piyush-Ingale | 1****e | 6 |
ksic8 | 1****8 | 6 |
Shail Parekh | 1****h | 5 |
Aaron Bell | a****l@g****m | 4 |
pbattaglia | p****a | 4 |
pramodg | 6****g | 4 |
Steven Greenberg | s****g@g****m | 4 |
Iman Akbari | i****i@g****m | 3 |
David Lowell | d****l@g****m | 3 |
Sean Campbell | c****n@g****m | 2 |
Valliappa (Lak) Lakshmanan | l****k@v****m | 2 |
dependabot[bot] | 4****] | 2 |
ksic8 | k****l@i****n | 2 |
kbp45-cusp | k****h@i****m | 1 |
Sascha Kahrs | s****s@g****m | 1 |
Shirish Jamthe | s****e@g****m | 1 |
Stephan Rasp | r****n@g****m | 1 |
Saverio Guzzo | s****o@g****m | 1 |
SangamSwadik | s****8@g****m | 1 |
RoshiniFernando | 4****o | 1 |
and 3 more... |
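The summary figures above can be reproduced from the commit counts. This sketch assumes the Development Distribution Score is defined as one minus the top committer's share of all commits. That assumption is not stated on this page, but it matches the listed value when applied to the top row of the table (132 of 353 commits).

```python
def development_distribution_score(top_committer_commits, total_commits):
    """Assumed definition: 1 - (commits by the top committer / total commits)."""
    return 1 - top_committer_commits / total_commits


total_commits = 353
total_committers = 33
top_committer_commits = 132  # first row of the committers table

average_commits = round(total_commits / total_committers, 3)
dds = round(development_distribution_score(top_committer_commits, total_commits), 3)
print(average_commits, dds)
```

A score near 0 would mean one person wrote almost every commit, while a score near 1 indicates work spread across many committers.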
Committer domains:
- google.com: 7
- arun.blog: 1
- infocusp.com: 1
- infocusp.in: 1
- vlakshman.com: 1
Issue and Pull Request metadata
Last synced: 1 day ago
Total issues: 153
Total pull requests: 356
Average time to close issues: 2 months
Average time to close pull requests: 10 days
Total issue authors: 21
Total pull request authors: 29
Average comments per issue: 1.16
Average comments per pull request: 0.56
Merged pull request: 299
Bot issues: 0
Bot pull requests: 3
Past year issues: 5
Past year pull requests: 47
Past year average time to close issues: 1 day
Past year average time to close pull requests: 5 days
Past year issue authors: 3
Past year pull request authors: 10
Past year average comments per issue: 2.0
Past year average comments per pull request: 0.15
Past year merged pull request: 40
Past year bot issues: 0
Past year bot pull requests: 2
Top Issue Authors
- alxmrs (84)
- mahrsee1997 (13)
- blackvvine (12)
- deepgabani8 (9)
- dabhicusp (6)
- ksic8 (4)
- CillianFn (4)
- mt467 (4)
- aniketinfocusp (3)
- DarshanSP19 (2)
- floraxue (2)
- jongwooo (1)
- pramodg (1)
- heyanand (1)
- lakshmanok (1)
Top Pull Request Authors
- alxmrs (86)
- aniketinfocusp (58)
- mahrsee1997 (44)
- deepgabani8 (35)
- DarshanSP19 (20)
- aniketsinghrawat (16)
- dabhicusp (14)
- j9sh264 (12)
- CillianFn (9)
- pbattaglia (8)
- Piyush-Ingale (7)
- ksic8 (6)
- blackvvine (6)
- shail-parekh (5)
- pramodg (5)
Top Issue Labels
- weather-mv (31)
- weather-dl (31)
- bug (27)
- P1 (23)
- P2 (20)
- good first issue (17)
- enhancement (11)
- documentation (7)
- help wanted (5)
- P0 (5)
- P3 (4)
- weather-sp (4)
- can't reproduce (1)
- 20% (1)
- wontfix (1)
Top Pull Request Labels
- weather-dl (3)
- dependencies (3)
- weather-sp (2)
- enhancement (2)
- weather-mv (2)
- github_actions (1)
Package metadata
- Total packages: 1
- Total downloads:
  - pypi: 300 last-month
- Total dependent packages: 0
- Total dependent repositories: 1
- Total versions: 7
- Total maintainers: 1
pypi.org: google-weather-tools
Apache Beam pipelines to make weather data accessible and useful.
- Homepage: https://weather-tools.readthedocs.io/
- Documentation: https://google-weather-tools.readthedocs.io/
- Licenses: License :: OSI Approved :: Apache Software License
- Latest release: 0.3.2 (published almost 3 years ago)
- Last Synced: 2025-04-25T14:35:05.678Z (1 day ago)
- Versions: 7
- Dependent Packages: 0
- Dependent Repositories: 1
- Downloads: 300 Last month
- Rankings:
  - Dependent packages count: 10.002%
  - Average: 21.348%
  - Dependent repos count: 21.718%
  - Downloads: 32.324%
- Maintainers (1)
Dependencies
- myst-parser ==0.13.7
- sphinx >=2.1
- apache-beam *
- actions/cache v2 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- conda-incubator/setup-miniconda v2 composite
- s-weigand/setup-conda v1 composite
- styfle/cancel-workflow-action 0.7.0 composite
- actions/checkout v2 composite
- actions/download-artifact v2 composite
- actions/setup-python v2.3.1 composite
- actions/upload-artifact v2 composite
- pypa/gh-action-pypi-publish v1.4.2 composite
- apache/beam_python${py_version}_sdk 2.40.0 build
- continuumio/miniconda3 latest build
- cython ==0.29.34
- earthengine-api ==0.1.329
- firebase-admin ==6.0.1
Score: 14.962676792428262