Open Sustainable Technology

A curated list of open technology projects to sustain a stable climate, energy supply, biodiversity and natural resources.


DeepSensor

A Python package for tackling diverse environmental prediction tasks with neural processes.
https://github.com/alan-turing-institute/deepsensor

Last synced: about 14 hours ago

Repository metadata

A Python package for tackling diverse environmental prediction tasks with NPs.

README

        




    A Python package and open-source project for modelling environmental
    data with neural processes

    -----------

    [![release](https://img.shields.io/badge/release-v0.3.6-green?logo=github)](https://github.com/alan-turing-institute/deepsensor/releases)
    [![Latest Docs](https://img.shields.io/badge/docs-latest-blue.svg)](https://alan-turing-institute.github.io/deepsensor/)
    ![Tests](https://github.com/alan-turing-institute/deepsensor/actions/workflows/tests.yml/badge.svg)
    [![Coverage Status](https://coveralls.io/repos/github/alan-turing-institute/deepsensor/badge.svg?branch=main)](https://coveralls.io/github/alan-turing-institute/deepsensor?branch=main)
    [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
    [![slack](https://img.shields.io/badge/slack-deepsensor-purple.svg?logo=slack)](https://ai4environment.slack.com/archives/C05NQ76L87R)
    [![All Contributors](https://img.shields.io/github/all-contributors/alan-turing-institute/deepsensor?color=ee8449&style=flat-square)](#contributors)
    [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://github.com/alan-turing-institute/deepsensor/blob/main/LICENSE)

    DeepSensor streamlines the application of neural processes (NPs) to environmental sciences by
    providing a simple interface for building, training, and evaluating NPs using `xarray` and `pandas`
    data. Our developers and users form an open-source community whose vision is to accelerate the next
    generation of environmental ML research. The DeepSensor Python package facilitates this by
    drastically reducing the time and effort required to apply NPs to environmental prediction tasks.
    This allows DeepSensor users to focus on the science and rapidly iterate on ideas.

    DeepSensor is an experimental package, and we
    welcome [contributions from the community](https://github.com/alan-turing-institute/deepsensor/blob/main/CONTRIBUTING.md).
    We have an active Slack channel for code and research discussions; you can request to
    join [via this Google Form](https://docs.google.com/forms/d/e/1FAIpQLScsI8EiXDdSfn1huMp1vj5JAxi9NIeYLljbEUlMceZvwVpugw/viewform).

    ![DeepSensor example application figures](https://raw.githubusercontent.com/alan-turing-institute/deepsensor/main/figs/deepsensor_application_examples.png)

    Why neural processes?
    -----------
    NPs are a highly flexible class of probabilistic models that offer unique opportunities to model
    satellite observations, climate model output, and in-situ measurements.
    Their key features are the ability to:

    - ingest multiple data streams of pointwise or gridded modalities
    - handle missing data and varying resolutions
    - predict at arbitrary target locations
    - quantify prediction uncertainty

    These capabilities make NPs well suited to a range of
    spatio-temporal data fusion tasks such as downscaling, sensor placement, gap-filling, and forecasting.

    Why DeepSensor?
    -----------
    This package aims to faithfully match the flexibility of NPs with a simple and intuitive interface.
    Under the hood, DeepSensor wraps around the
    powerful [neuralprocesses](https://github.com/wesselb/neuralprocesses) package for core modelling
    functionality, while allowing users to stay in the familiar [xarray](https://xarray.pydata.org)
    and [pandas](https://pandas.pydata.org) world from end-to-end.
    DeepSensor also provides convenient plotting tools and active learning functionality for finding
    optimal [sensor placements](https://doi.org/10.1017/eds.2023.22).

    Documentation
    -----------
    We have an extensive documentation page [here](https://alan-turing-institute.github.io/deepsensor/),
    containing steps for getting started, a user guide built from reproducible Jupyter notebooks,
    learning resources, research ideas, community information, an API reference, and more!

    DeepSensor Gallery
    -----------
    For real-world DeepSensor research demonstrators, check out the
    [DeepSensor Gallery](https://github.com/tom-andersson/deepsensor_gallery).
    Consider submitting a notebook showcasing your research!

    Deep learning library agnosticism
    -----------
    DeepSensor leverages the [backends](https://github.com/wesselb/lab) package to be compatible with
    either [PyTorch](https://pytorch.org/) or [TensorFlow](https://www.tensorflow.org/).
    Simply `import deepsensor.torch` or `import deepsensor.tensorflow` to choose between them!
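
If a script needs to run on machines that may have only one of the two libraries installed, the choice of import can be made at runtime. The following is a minimal sketch using only the standard library; `pick_backend_module` is a hypothetical helper, not part of DeepSensor, and preferring PyTorch when both are present is an arbitrary assumption.

```python
import importlib.util


def pick_backend_module():
    """Return the DeepSensor backend module name to import, or None.

    Hypothetical helper: checks which deep learning library is installed
    without importing it, preferring PyTorch when both are available.
    """
    if importlib.util.find_spec("torch") is not None:
        return "deepsensor.torch"
    if importlib.util.find_spec("tensorflow") is not None:
        return "deepsensor.tensorflow"
    return None


backend_module = pick_backend_module()
print(f"Backend module to import: {backend_module}")
```

The returned name can then be passed to `importlib.import_module` before any other DeepSensor imports.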

    Quick start
    ----------

    Here we will demonstrate a simple example of training a convolutional conditional neural process
    (ConvCNP) to spatially interpolate random grid cells of NCEP reanalysis air temperature data
    over the US. First, pip install the package. In this case we will use the PyTorch backend
    (note: follow the [PyTorch installation instructions](https://pytorch.org/) if you
    want GPU support).

    ```bash
    pip install deepsensor
    pip install torch
    ```

    We can go from imports to predictions with a trained model in fewer than 30 lines of code!

    ```python
    import deepsensor.torch
    from deepsensor.data import DataProcessor, TaskLoader
    from deepsensor.model import ConvNP
    from deepsensor.train import Trainer

    import xarray as xr
    import pandas as pd
    import numpy as np
    from tqdm import tqdm

    # Load raw data
    ds_raw = xr.tutorial.open_dataset("air_temperature")

    # Normalise data
    data_processor = DataProcessor(x1_name="lat", x2_name="lon")
    ds = data_processor(ds_raw)

    # Set up task loader
    task_loader = TaskLoader(context=ds, target=ds)

    # Set up model
    model = ConvNP(data_processor, task_loader)

    # Generate training tasks with 0 to 99 random grid cells as context
    # (np.random.randint excludes the upper bound) and all grid cells as targets
    train_tasks = []
    for date in pd.date_range("2013-01-01", "2014-11-30")[::7]:
        N_context = np.random.randint(0, 100)
        task = task_loader(date, context_sampling=N_context, target_sampling="all")
        train_tasks.append(task)

    # Train model
    trainer = Trainer(model, lr=5e-5)
    for epoch in tqdm(range(10)):
        batch_losses = trainer(train_tasks)

    # Predict on new task with 50 context points and a dense grid of target points
    test_task = task_loader("2014-12-31", context_sampling=50)
    pred = model.predict(test_task, X_t=ds_raw)
    ```
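
The training loop above draws a random context size for each task and passes it as `context_sampling`. As a rough illustration of what that amounts to on a 25 x 53 grid, here is a minimal standard-library sketch. `sample_context_cells` is a hypothetical function, not the DeepSensor API, and sampling without replacement is an assumption made for simplicity; the library's own sampling rule may differ.

```python
import random


def sample_context_cells(n_lat, n_lon, n_context, seed=None):
    """Pick n_context distinct (lat_idx, lon_idx) grid cells uniformly at random.

    Hypothetical illustration only: samples without replacement.
    """
    rng = random.Random(seed)
    all_cells = [(i, j) for i in range(n_lat) for j in range(n_lon)]
    return rng.sample(all_cells, n_context)


rng = random.Random(0)
# Like np.random.randint(0, 100): the upper bound is excluded, so 0..99.
n_context = rng.randrange(0, 100)
context_cells = sample_context_cells(25, 53, n_context, seed=0)
print(f"{n_context} context cells out of {25 * 53} grid cells")
```

Because a context size of zero is possible, the model is also trained on tasks where it must predict with no observations at all.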

    After training, the model can predict directly to `xarray` in your data's original units and
    coordinate system:

    ```python
    >>> pred["air"]

    Dimensions:  (time: 1, lat: 25, lon: 53)
    Coordinates:
      * time     (time) datetime64[ns] 2014-12-31
      * lat      (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
      * lon      (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
    Data variables:
        mean     (time, lat, lon) float32 267.7 267.2 266.4 ... 297.5 297.8 297.9
        std      (time, lat, lon) float32 9.855 9.845 9.848 ... 1.356 1.36 1.487
    ```
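
Returning predictions in the original units can be pictured as inverting the normalisation applied before training. The sketch below assumes a simple mean-std scaling; the function names and the temperature values are illustrative and are not the DeepSensor API or real data. Note that the predicted mean is rescaled and shifted, while the predicted standard deviation is only rescaled.

```python
from statistics import fmean, pstdev


def fit_mean_std(values):
    """Return the (mean, std) used to normalise a variable."""
    return fmean(values), pstdev(values)


def normalise(values, mean, std):
    """Map raw values to zero mean and unit variance."""
    return [(v - mean) / std for v in values]


def unnormalise_mean(pred_mean, mean, std):
    """Map a normalised predictive mean back to raw units."""
    return [m * std + mean for m in pred_mean]


def unnormalise_std(pred_std, std):
    """Map a normalised predictive std back to raw units (scale only, no shift)."""
    return [s * std for s in pred_std]


raw_kelvin = [267.7, 270.1, 281.4, 297.9]  # illustrative values only
mean, std = fit_mean_std(raw_kelvin)
normalised = normalise(raw_kelvin, mean, std)
recovered = unnormalise_mean(normalised, mean, std)
raw_std = unnormalise_std([0.1, 0.5], std)
```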

    We can also predict directly to a `pandas` DataFrame containing a time series of predictions at
    off-grid locations by passing a `numpy` array of target locations to the `X_t` argument of
    `.predict`:

    ```python
    # Predict at two off-grid locations over December 2014 with 50 random, fixed context points
    test_tasks = task_loader(pd.date_range("2014-12-01", "2014-12-31"), 50, seed_override=42)
    pred = model.predict(test_tasks, X_t=np.array([[50, 280], [40, 250]]).T)
    ```

    ```python
    >>> pred["air"]
                               mean       std
    time       lat lon
    2014-12-01 50  280   260.282562  5.743976
               40  250   270.770111  4.271546
    2014-12-02 50  280   255.572098  6.165956
               40  250   277.588745  3.727404
    2014-12-03 50  280   260.894196  6.02924
    ...                         ...       ...
    2014-12-29 40  250   266.594421  4.268469
    2014-12-30 50  280   250.936386  7.048379
               40  250   262.225464  4.662592
    2014-12-31 50  280   249.397919  7.167142
               40  250   257.955505  4.697775

    [62 rows x 2 columns]
    ```
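
The layout of `X_t` and the row count of the result can be checked by hand. The transpose in the call above turns two (lat, lon) pairs into one row of latitudes and one row of longitudes, and 31 days times 2 locations gives the 62 rows reported. The sketch below reproduces this with the standard library only; reading the first row as `x1` (lat) and the second as `x2` (lon) is an assumption inferred from the `DataProcessor(x1_name="lat", x2_name="lon")` call and the output index.

```python
from datetime import date, timedelta

# Two off-grid target locations as (lat, lon) pairs.
locations = [(50, 280), (40, 250)]

# Pure-Python analogue of np.array(locations).T: one row per coordinate.
x1, x2 = (list(axis) for axis in zip(*locations))

# Every day of December 2014, inclusive of both endpoints.
start, end = date(2014, 12, 1), date(2014, 12, 31)
days = [start + timedelta(days=i) for i in range((end - start).days + 1)]

# One output row per (time, lat, lon) combination.
index = [(day, lat, lon) for day in days for lat, lon in locations]
print(f"{len(days)} days x {len(locations)} locations = {len(index)} rows")
```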

    DeepSensor offers far more functionality than this simple example demonstrates.
    For more information on the package's capabilities, check out the
    [User Guide](https://tom-andersson.github.io/deepsensor/user-guide/index.html)
    in the documentation.

    ## Citing DeepSensor

    If you use DeepSensor in your research, please consider citing this repository.
    You can generate a BibTeX entry by clicking the 'Cite this repository' button
    on the top right of this page.

    ## Funding

    DeepSensor is funded by [The Alan Turing Institute](https://www.turing.ac.uk/) under the [Environmental monitoring: blending satellite and surface data](https://www.turing.ac.uk/research/research-projects/environmental-monitoring-blending-satellite-and-surface-data) and [Scivision](https://www.turing.ac.uk/research/research-projects/scivision) projects, led by PI [Dr Scott Hosking](https://www.turing.ac.uk/people/researchers/scott-hosking).

    ## Contributors

    We appreciate all contributions to DeepSensor, big or small, code-related or not, and we thank all
    contributors below for supporting open-source software and research.
    For code-specific contributions, check out our graph of [code contributions](https://github.com/tom-andersson/deepsensor/graphs/contributors).
    See our [contribution guidelines](https://github.com/tom-andersson/deepsensor/blob/main/CONTRIBUTING.md)
    if you would like to join this list!



    - Alejandro ©: 📓 🐛 🧑‍🏫 🤔 🔬
    - Anna Vaughan: 🔬
    - Jim Circadian: 🤔 📆 🚧
    - Jonas Scholz: 📓 🔬 💻 🐛 🤔
    - Kalle Westerling: 📖 🚇 🤔 📆 📣 💬
    - Kenza Tazi: 🤔
    - Magnus Ross: 🔣
    - Nils Lehmann: 🤔 📓 🐛
    - Paolo Pelucchi: 📓 🐛
    - Rohit Singh Rathaur: 💻
    - Scott Hosking: 🔍 🤔 📆
    - Tom Andersson: 💻 🔬 🚧 🐛 ⚠️ 📖 👀 📢 💬
    - Wessel: 🔬 💻 🤔
    - Zeel B Patel: 🐛 💻 📓 🤔
    - ots22: 🤔

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: 'DeepSensor: A Python package for modelling environmental data with convolutional neural processes'
message: >-
  If you use DeepSensor in your research, please cite it
  using the information below.
type: software
authors:
  - given-names: Tom Robin
    family-names: Andersson
    email: [email protected]
    affiliation: Google DeepMind
    orcid: 'https://orcid.org/0000-0002-1556-9932'
repository-code: 'https://github.com/alan-turing-institute/deepsensor'
abstract: >-
  DeepSensor is a Python package for modelling environmental
  data with convolutional neural processes (ConvNPs).
  ConvNPs are versatile deep learning models capable of
  ingesting multiple environmental data streams of varying
  modalities and resolutions, handling missing data, and
  predicting at arbitrary target locations with uncertainty.
  DeepSensor allows users to tackle a diverse array of
  environmental prediction tasks, including downscaling
  (super-resolution), sensor placement, gap-filling, and
  forecasting. The library includes a user-friendly
  pandas/xarray interface, automatic unnormalisation of
  model predictions, active learning functionality,
  integration with both PyTorch and TensorFlow, and model
  customisation. DeepSensor streamlines and simplifies the
  environmental data modelling pipeline, enabling
  researchers and practitioners to harness the potential of
  ConvNPs for complex environmental prediction challenges.
keywords:
  - machine learning
  - environmental science
  - neural processes
  - active learning
license: MIT
version: 0.3.6
date-released: '2024-02-02'
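
For readers who want a BibTeX entry without using the GitHub button, the fields above can be mapped by hand. This is a minimal sketch: `cff_to_bibtex` and the citation key `deepsensor` are hypothetical, the field mapping is one reasonable choice, and the entry GitHub generates may be formatted differently.

```python
def cff_to_bibtex(meta, key="deepsensor"):
    """Format a few CITATION.cff fields as a BibTeX @software entry.

    Hypothetical helper: the key and field selection are illustrative.
    """
    year = meta["date_released"].split("-")[0]
    fields = {
        "author": f'{meta["family_names"]}, {meta["given_names"]}',
        "title": meta["title"],
        "version": meta["version"],
        "year": year,
        "url": meta["repository_code"],
    }
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
    return f"@software{{{key},\n{body}\n}}"


# Values copied from the CITATION.cff content above.
cff = {
    "title": (
        "DeepSensor: A Python package for modelling environmental data "
        "with convolutional neural processes"
    ),
    "given_names": "Tom Robin",
    "family_names": "Andersson",
    "version": "0.3.6",
    "date_released": "2024-02-02",
    "repository_code": "https://github.com/alan-turing-institute/deepsensor",
}
entry = cff_to_bibtex(cff)
print(entry)
```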


Committers metadata

Last synced: 14 days ago

Total Commits: 640
Total Committers: 13
Avg Commits per committer: 49.231
Development Distribution Score (DDS): 0.167

Commits in past year: 640
Committers in past year: 13
Avg Commits per committer in past year: 49.231
Development Distribution Score (DDS) in past year: 0.167

Name Email Commits
Tom Andersson t****d@b****k 533
allcontributors[bot] 4****] 40
Tom Andersson t****3@g****m 25
Kalle Westerling k****g@b****k 21
Kalle Westerling 7****g 6
Jonas Scholz j****3@g****m 3
RohitRathore1 r****5@g****m 3
polpel 5****l 3
patel-zeel p****l@i****n 2
Alejandro © a****c@g****m 1
Paolo Pelucchi p****i@y****m 1
Scott Hosking j****g@g****m 1
David Wilby 2****y 1
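
The summary figures above can be reproduced from the commit counts. The sketch below assumes the Development Distribution Score is one minus the top committer's share of all commits; this page does not define the score, so treat that formula as an assumption that happens to match the reported value.

```python
# Commit counts per committer entry, copied from the table above.
commits = [533, 40, 25, 21, 6, 3, 3, 3, 2, 1, 1, 1, 1]

total_commits = sum(commits)
total_committers = len(commits)
avg_commits = total_commits / total_committers

# Assumed definition: 1 - (commits by top committer / total commits).
dds = 1 - max(commits) / total_commits

print(f"Total commits: {total_commits}")
print(f"Avg commits per committer: {avg_commits:.3f}")
print(f"DDS: {dds:.3f}")
```

A score near 0 means one committer accounts for most of the commits, which is the case here.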



Issue and Pull Request metadata

Last synced: 1 day ago

Total issues: 91
Total pull requests: 75
Average time to close issues: 23 days
Average time to close pull requests: 1 day
Total issue authors: 9
Total pull request authors: 12
Average comments per issue: 3.37
Average comments per pull request: 0.84
Merged pull request: 65
Bot issues: 0
Bot pull requests: 32

Past year issues: 91
Past year pull requests: 75
Past year average time to close issues: 23 days
Past year average time to close pull requests: 1 day
Past year issue authors: 9
Past year pull request authors: 12
Past year average comments per issue: 3.37
Past year average comments per pull request: 0.84
Past year merged pull request: 65
Past year bot issues: 0
Past year bot pull requests: 32

More stats: https://issues.ecosyste.ms/repositories/lookup?url=https://github.com/alan-turing-institute/deepsensor

Top Issue Authors

  • tom-andersson (56)
  • patel-zeel (10)
  • nilsleh (7)
  • kallewesterling (5)
  • jonas-scholz123 (4)
  • magnusross (4)
  • kenzaxtazi (2)
  • scotthosking (2)
  • polpel (1)

Top Pull Request Authors

  • allcontributors[bot] (32)
  • kallewesterling (13)
  • polpel (6)
  • tom-andersson (6)
  • RohitRathore1 (4)
  • nilsleh (4)
  • scotthosking (2)
  • davidwilby (2)
  • magnusross (2)
  • acocac (2)
  • patel-zeel (1)
  • jonas-scholz123 (1)

Top Issue Labels

  • enhancement (34)
  • bug (13)
  • good first issue (13)
  • thoughts welcome (7)

Top Pull Request Labels

  • bug (4)
  • good first issue (2)

Dependencies

.github/workflows/publish.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
  • pypa/gh-action-pypi-publish release/v1 composite
.github/workflows/style.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
.github/workflows/tests.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • coverallsapp/github-action v1 composite
.github/workflows/docs.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v3 composite
  • peaceiris/actions-gh-pages v3 composite
docs/requirements.txt pypi
  • jupyter-book *
  • matplotlib *
  • numpy *
pyproject.toml pypi
requirements/requirements.dev.txt pypi
  • coveralls * development
  • parameterized * development
  • pytest * development
  • pytest-cov * development
  • tox * development
  • tox-gh-actions * development
requirements/requirements.docs.txt pypi
  • jupyter-book ==0.15.1
  • sphinx *
requirements/requirements.txt pypi
  • backends *
  • backends-matrix *
  • dask *
  • distributed *
  • gcsfs *
  • jupyter *
  • matplotlib *
  • neuralprocesses >=0.2.2
  • numpy *
  • pandas *
  • pooch *
  • pyshp *
  • rioxarray *
  • seaborn *
  • shapely *
  • tensorflow *
  • tensorflow_probability *
  • torch >=2
  • tqdm *
  • xarray *
  • zarr *
setup.py pypi

Score: 6.895682697747867