Open Sustainable Technology

A curated list of open technology projects to sustain a stable climate, energy supply, biodiversity and natural resources.


earthaccess

Search, download or stream NASA Earth science data with just a few lines of code.
https://github.com/nsidc/earthaccess

access climate-data cloud-computing cmr data gis nasa nasa-api nasa-data open-science openscience pangeo remote-sensing

Last synced: about 21 hours ago

Repository metadata

Python Library for NASA Earthdata APIs

README

        



Art Designer: Allison Horst


## **Overview**

*earthaccess* is a **python library to search, download or stream NASA Earth science data** with just a few lines of code.

In the age of cloud computing, the power of open science only reaches its full potential if we have easy-to-use workflows that facilitate research in an inclusive, efficient and reproducible way. Unfortunately, as it stands today, scientists and students alike face a steep learning curve adapting to systems that have grown too complex, and they end up spending more time on the technicalities of the tools, cloud and NASA APIs than on their important science.

During several workshops organized by [NASA Openscapes](https://nasa-openscapes.github.io/events.html), the need to provide easy-to-use tools to our users became evident. Open science is a collaborative effort; it involves people from different technical backgrounds, and the data analysis to solve the pressing problems we face cannot be limited by the complexity of the underlying systems. Therefore, providing easy access to NASA Earthdata regardless of the data storage location (hosted within or outside of the cloud) is the main motivation behind this Python library.

## **Installing earthaccess**

You will need Python 3.8 or higher installed.

Install the latest release using conda:

```bash
conda install -c conda-forge earthaccess
```

Or using pip:

```bash
pip install earthaccess
```

Try it in your browser without installing anything! [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/nsidc/earthaccess/main)

## **Usage**

With *earthaccess* we can log in, search and download data with a few lines of code and, more importantly, our code works the same way whether we run it in the cloud or on our laptop. ***earthaccess*** handles authentication with [NASA's Earthdata Login (EDL)](https://urs.earthdata.nasa.gov), search using NASA's [CMR](https://cmr.earthdata.nasa.gov/search/site/docs/search/api.html) and access through [`fsspec`](https://github.com/fsspec/filesystem_spec).

The only requirement to use this library is to open a free account with NASA [EDL](https://urs.earthdata.nasa.gov).

### **Authentication**

By default, `earthaccess` will automatically look for your EDL account credentials in two locations:

1. A `~/.netrc` file
2. `EARTHDATA_USERNAME` and `EARTHDATA_PASSWORD` environment variables

If neither of these options is configured, you can authenticate by calling `earthaccess.login()`
and manually entering your EDL account credentials.

```python
import earthaccess

earthaccess.login()
```

Note you can pass `persist=True` to `earthaccess.login()` to have the EDL account credentials you enter
automatically saved to a `~/.netrc` file for future use.
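
The lookup order described above can be checked before calling `earthaccess.login()`. The following is a minimal sketch and not earthaccess's own implementation: `find_edl_credentials` and `login_with_fallback` are hypothetical helpers, and the `strategy` values passed to `login()` (`"netrc"`, `"environment"`, `"interactive"`) are assumed to be supported by the installed release.

```python
import netrc
import os
from pathlib import Path

# Earthdata Login host, taken from the EDL link in this README.
EDL_HOST = "urs.earthdata.nasa.gov"


def find_edl_credentials(env=None, netrc_path=None):
    """Return "netrc", "environment" or None, mirroring the documented order.

    Hypothetical helper: it only reports where credentials appear to be
    configured; it never reads or returns the secrets themselves.
    """
    env = os.environ if env is None else env
    path = Path(netrc_path) if netrc_path else Path.home() / ".netrc"

    # 1. A ~/.netrc file with an entry for the EDL host.
    if path.is_file():
        try:
            if netrc.netrc(str(path)).authenticators(EDL_HOST):
                return "netrc"
        except netrc.NetrcParseError:
            pass  # unreadable file: fall through to the environment check

    # 2. EARTHDATA_USERNAME and EARTHDATA_PASSWORD environment variables.
    if env.get("EARTHDATA_USERNAME") and env.get("EARTHDATA_PASSWORD"):
        return "environment"

    return None


def login_with_fallback():
    """Log in with whatever is configured, else prompt and persist."""
    import earthaccess  # imported lazily so the helper above works without it

    source = find_edl_credentials()
    if source is None:
        # Nothing configured: prompt once and save to ~/.netrc for next time.
        return earthaccess.login(strategy="interactive", persist=True)
    return earthaccess.login(strategy=source)
```

Calling `login_with_fallback()` in a notebook prompts for credentials only on the first run; later runs pick up the saved `~/.netrc` entry.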

Once you are authenticated with NASA EDL you can:

* Get a file from a DAAC using a `fsspec` session.
* Request temporary S3 credentials from a particular DAAC (needed to download or stream data from an S3 bucket in the cloud).
* Use the library to download or stream data directly from S3.
* Regenerate CMR tokens (used for restricted datasets).
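
As an illustration of the temporary S3 credentials mentioned in the list above, here is a hedged sketch of handing them to `s3fs` ourselves. It assumes that `earthaccess.get_s3_credentials(daac=...)` is available in the installed release and returns a dictionary with `accessKeyId`, `secretAccessKey` and `sessionToken` keys; `s3fs_kwargs` and `open_daac_filesystem` are hypothetical names, and `"PODAAC"` is only an example DAAC.

```python
def s3fs_kwargs(creds):
    """Map temporary S3 credentials to s3fs.S3FileSystem keyword arguments.

    Assumes the key names accessKeyId / secretAccessKey / sessionToken.
    """
    return {
        "key": creds["accessKeyId"],
        "secret": creds["secretAccessKey"],
        "token": creds["sessionToken"],
    }


def open_daac_filesystem(daac="PODAAC"):
    """Return an s3fs filesystem for one DAAC (needs an EDL account).

    Direct S3 access is expected to work only from within us-west-2.
    """
    import earthaccess  # lazy imports: only needed when actually connecting
    import s3fs

    earthaccess.login()
    creds = earthaccess.get_s3_credentials(daac=daac)
    return s3fs.S3FileSystem(**s3fs_kwargs(creds))
```

The credentials are temporary, so a long-running job would need to request them again once they expire.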

### **Searching for data**

Once we have selected our dataset we can search for the data granules using *doi*, *short_name* or *concept_id*.
If we are not sure how to search for a particular dataset, we can start with the ["Introducing NASA earthaccess"](https://nsidc.github.io/earthaccess/tutorials/demo/#querying-for-datasets) tutorial or the [NASA Earthdata Search portal](https://search.earthdata.nasa.gov/). For a complete list of search parameters, see the extended [API documentation](https://earthaccess.readthedocs.io/en/latest/user-reference/api/api/).

```python
import earthaccess

results = earthaccess.search_data(
    short_name="SEA_SURFACE_HEIGHT_ALT_GRIDS_L4_2SATS_5DAY_6THDEG_V_JPL2205",
    cloud_hosted=True,
    # (lower-left lon, lower-left lat, upper-right lon, upper-right lat)
    bounding_box=(-10, 20, 10, 50),
    temporal=("1999-02", "2019-03"),
    count=10,
)
```
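
Because a swapped latitude or an inverted date range can silently return no granules, it can help to validate the query before sending it. This is a minimal sketch under stated assumptions: `build_search_kwargs` and `run_search` are hypothetical helpers, the bounding box is taken to be ordered (west, south, east, north), and both temporal strings are assumed to use the same ISO-like format so that they compare chronologically.

```python
def build_search_kwargs(short_name, bounding_box, temporal, count=10):
    """Validate search parameters and return them as keyword arguments."""
    west, south, east, north = bounding_box
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes must be within [-180, 180]")
    # west > east is left alone: a box may cross the antimeridian.
    if not (-90 <= south <= north <= 90):
        raise ValueError("latitudes must satisfy -90 <= south <= north <= 90")

    start, end = temporal
    if start > end:  # valid only when both strings share one format
        raise ValueError("temporal range must be (start, end)")

    return {
        "short_name": short_name,
        "bounding_box": (west, south, east, north),
        "temporal": (start, end),
        "count": count,
    }


def run_search(kwargs):
    """Send the validated query (needs earthaccess and network access)."""
    import earthaccess  # lazy import keeps the validator usable on its own

    return earthaccess.search_data(**kwargs)
```

For example, `run_search(build_search_kwargs("SEA_SURFACE_HEIGHT_ALT_GRIDS_L4_2SATS_5DAY_6THDEG_V_JPL2205", (-10, 20, 10, 50), ("1999-02", "2019-03")))` sends a query with the same short name, bounding box, time range and count as the search shown earlier.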

Now that we have our results, we can do several things: iterate over them to get HTTP (or S3) links, download the files to a local folder, or open the files and stream their content directly to other libraries such as xarray.

### **Accessing the data**

**Option 1: Using the data links**

If we already have a workflow in place for downloading our data, we can use *earthaccess* as a search-only library and get HTTP links from our query results. This could be the case if our current workflow uses a different language and we only need the links as input.

```python
# If the dataset is cloud hosted, S3 links will be available. The access
# parameter accepts "direct" or "external"; direct access is only possible
# from within the us-west-2 region in the cloud.
data_links = [granule.data_links(access="direct") for granule in results]

# Or, if the data is an on-prem dataset:
data_links = [granule.data_links(access="external") for granule in results]
```

> Note: *earthaccess* can get S3 credentials for us, or authenticated HTTP sessions in case we want to use them with a different library.
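
The list comprehension in Option 1 produces one list of links per granule. When the links feed another tool, a flat, de-duplicated list written one URL per line is often more convenient. A minimal sketch, where `flatten_links` and `write_url_list` are hypothetical helpers and the example URLs are placeholders:

```python
from itertools import chain
from pathlib import Path


def flatten_links(nested_links):
    """Flatten per-granule link lists, dropping repeats but keeping order."""
    seen = set()
    flat = []
    for url in chain.from_iterable(nested_links):
        if url not in seen:
            seen.add(url)
            flat.append(url)
    return flat


def write_url_list(urls, path):
    """Write one URL per line and return the path that was written."""
    out = Path(path)
    out.write_text("\n".join(urls) + "\n", encoding="utf-8")
    return out


# Placeholder data standing in for the nested `data_links` list.
nested = [
    ["https://example.com/a.nc"],
    ["https://example.com/b.nc", "https://example.com/a.nc"],
]
urls = flatten_links(nested)
```

The resulting file can then be handed to any downloader that reads a URL list, keeping in mind that the downloader still has to authenticate with EDL.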

**Option 2: Download data to a local folder**

This option is practical if you have the necessary space available on disk. The *earthaccess* library will print out the approximate size of the download and its progress.

```python
files = earthaccess.download(results, "./local_folder")
```
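
Since this option depends on having enough disk space, the check can be made explicit before downloading. The sketch below is an illustration only: `fits_on_disk` and `download_if_space` are hypothetical helpers, the 20% headroom is an arbitrary choice, and `granule.size()` is assumed to return the granule size in megabytes.

```python
import shutil
from pathlib import Path


def fits_on_disk(total_mb, folder=".", headroom=1.2):
    """Return True if total_mb (plus headroom) fits in folder's free space."""
    free_mb = shutil.disk_usage(folder).free / 1024**2
    return total_mb * headroom <= free_mb


def download_if_space(results, folder="./local_folder"):
    """Download the granules only when the target disk has room for them."""
    import earthaccess  # lazy import: the check above needs only the stdlib

    Path(folder).mkdir(parents=True, exist_ok=True)
    # Assumption: size() reports megabytes for each granule.
    total_mb = sum(granule.size() for granule in results)
    if not fits_on_disk(total_mb, folder):
        raise RuntimeError(f"about {total_mb:.0f} MB needed, not enough free space")
    return earthaccess.download(results, folder)
```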

**Option 3: Direct S3 Access - Stream data directly to xarray**

This method works best if you are in the same Amazon Web Services (AWS) region as the data (us-west-2) and you are working with gridded datasets (processing level 3 and above).

```python
import xarray as xr

files = earthaccess.open(results)

ds = xr.open_mfdataset(files)

```

And that's it! Just a couple of lines of code, and the same code will also work for data that are not hosted in the cloud, i.e. located at NASA storage centers.
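
Because streaming works best from within us-west-2, one possible pattern is to stream when running in that region and download first otherwise. This is a rough sketch, not part of earthaccess: `in_data_region` and `open_results` are hypothetical helpers, and reading `AWS_REGION` or `AWS_DEFAULT_REGION` is only a heuristic, since these variables are not set in every cloud environment.

```python
import os

DATA_REGION = "us-west-2"  # region named in this README


def in_data_region(env=None):
    """Guess from environment variables whether we run in the data's region."""
    env = os.environ if env is None else env
    region = env.get("AWS_REGION") or env.get("AWS_DEFAULT_REGION")
    return region == DATA_REGION


def open_results(results, local_folder="./local_folder"):
    """Stream in-region, otherwise download, then open with xarray."""
    import earthaccess  # lazy imports: only needed when data is accessed
    import xarray as xr

    if in_data_region():
        files = earthaccess.open(results)  # file-like objects, streamed
    else:
        files = earthaccess.download(results, local_folder)  # local paths
    return xr.open_mfdataset(files)
```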

> More examples coming soon!

### Compatibility

Only **Python 3.8+** is supported.

## Contributors

[![Contributors](https://contrib.rocks/image?repo=nsidc/earthaccess)](https://github.com/nsidc/earthaccess/graphs/contributors)

## Contributing Guide

Welcome! 😊👋

> Please see the [Contributing Guide](CONTRIBUTING.md).

### [Project Board](https://github.com/nsidc/earthdata/discussions)

### Glossary

NASA Earth Science Glossary

## License

earthaccess is licensed under the MIT license. See [LICENSE](LICENSE.txt).

## Level of Support





This repository is supported by a joint effort of NSIDC, NASA DAACs, and the Earth science community, and we welcome any contribution in the form of issue submissions, pull requests, or discussions. Issues labeled [good first issue](https://github.com/nsidc/earthaccess/labels/good%20first%20issue) are a great place to get started.

Citation (CITATION.cff)

cff-version: 1.2.0
message: |
  Please cite this software using these metadata.
  Authors are listed in alphabetical order.

title: "earthaccess"
doi: "10.5281/zenodo.8365009"
abstract: "Python Library for NASA Earthdata APIs"
contact:
  - name: "The earthaccess community"
    website: "https://github.com/nsidc/earthaccess/discussions"
  - name: "NSIDC"
    email: "[email protected]"
type: "software"
# NOTE: The CFF spec says `license` can be a list, but Zenodo is currently not
#       accepting lists for this key:
#       https://github.com/zenodo/zenodo/issues/2515
license: "MIT"
keywords:
  - "data"
  - "Remote sensing"
  - "Cloud computing"
  - "authentication"
  - "Earthdata Login"

url: "https://earthaccess.readthedocs.io"
repository-code: "https://github.com/nsidc/earthaccess"

version: "0.9.0"
date-released: "2024-02-28"

authors:
  - family-names: "Barrett"
    given-names: "Andrew"
    orcid: "https://orcid.org/0000-0003-4394-5445"
    website: "https://github.com/andypbarrett"
  - family-names: "Battisto"
    given-names: "Chris"
    orcid: "https://orcid.org/0000-0002-9608-3634"
    website: "https://github.com/battistowx"
  - family-names: "Bourbeau"
    given-names: "James"
    orcid: "https://orcid.org/0000-0003-2164-7789"
    website: "https://github.com/jrbourbeau"
  - family-names: "Fisher"
    given-names: "Matt"
    orcid: "https://orcid.org/0000-0003-3260-5445"
    website: "https://mfisher87.github.io/"
  - family-names: "Kaufman"
    given-names: "Daniel"
    orcid: "https://orcid.org/0000-0002-1487-7298"
    website: "https://github.com/danielfromearth"
  - family-names: "Kennedy"
    given-names: "Joseph"
    orcid: "https://orcid.org/0000-0002-9348-693X"
    website: "https://github.com/jhkennedy"
  - family-names: "Lopez"
    given-names: "Luis"
    orcid: "https://orcid.org/0000-0003-4896-3263"
    website: "https://github.com/betolink"
  - family-names: "Lowndes"
    given-names: "Julia"
    orcid: "https://orcid.org/0000-0003-1682-3872"
    website: "https://github.com/jules32"
  - family-names: "Scheick"
    given-names: "Jessica"
    orcid: "https://orcid.org/0000-0002-3421-4459"
    website: "https://github.com/JessicaS11"
  - family-names: "Steiker"
    given-names: "Amy"
    orcid: "https://orcid.org/0000-0002-3039-0260"
    website: "https://github.com/asteiker"



references:
  - type: "grant"
    title: "Openscapes: Supporting better science for future us"
    institution:
      name: "National Aeronautics and Space Administration"
    number: "20-TWSC20-2-0003"
    authors:
      - family-names: "Lowndes"
        given-names: "Julia"
        orcid: "https://orcid.org/0000-0003-1682-3872"
        website: "https://github.com/jules32"
      - family-names: "Robinson"
        given-names: "Erin"
        orcid: "https://orcid.org/0000-0001-9998-0114"
        website: "https://erinrobinson.info/"


Committers metadata

Last synced: 1 day ago

Total Commits: 542
Total Committers: 30
Avg Commits per committer: 18.067
Development Distribution Score (DDS): 0.657

Commits in past year: 327
Committers in past year: 23
Avg Commits per committer in past year: 14.217
Development Distribution Score (DDS) in past year: 0.768
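
The all-time figures above can be reproduced from the commit counts. The sketch assumes that the Development Distribution Score is one minus the top committer's share of all commits, a definition that is consistent with the numbers shown here but is not stated on this page; the top committer's count (186) is taken from the committer list.

```python
total_commits = 542
total_committers = 30
top_committer_commits = 186  # betolink, from the committer list

# Average commits per committer.
avg = total_commits / total_committers

# Assumed definition: DDS = 1 - (top committer's commits / total commits).
dds = 1 - top_committer_commits / total_commits

print(f"{avg:.3f}")  # 18.067
print(f"{dds:.3f}")  # 0.657
```

The past-year DDS cannot be checked the same way, because the page does not list per-committer counts for the past year.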

| Name | Email | Commits |
| --- | --- | --- |
| betolink | `b****n@g****m` | 186 |
| Matt Fisher | `m****7@g****m` | 76 |
| James Bourbeau | `j****u@g****m` | 51 |
| betolink:w | `l****z@n****g` | 30 |
| doug-newman-nasa | `d****n@n****v` | 25 |
| jennifer | `j****k@c****u` | 24 |
| danielfromearth | `d****n@n****v` | 23 |
| Joseph H Kennedy | `m****e@j****g` | 20 |
| Andy Barrett | `a****t@n****g` | 19 |
| Matt Fisher | `m****r@n****g` | 14 |
| Chris Battisto | `3****x` | 10 |
| karthik venkataramani | `k****t@v****u` | 10 |
| Andy Barrett | `a****t@g****m` | 7 |
| Doug Newman | `d****n@n****v` | 6 |
| dependabot[bot] | `4****]` | 5 |
| Amy Steiker | `4****r` | 5 |
| Chuck Daniels | `c****k@d****g` | 4 |
| Ian Carroll | `c****n@g****m` | 3 |
| pre-commit-ci[bot] | `6****]` | 3 |
| Trey Stafford | `t****d@c****u` | 3 |
| Julia Stewart Lowndes | `j****a@o****g` | 3 |
| Jessica Scheick | `j****k@g****m` | 3 |
| Ian Carroll | `i****l@n****v` | 3 |
| vincentsarago | `v****o@g****m` | 2 |
| luz paz | `l****z` | 2 |
| Aimee Barciauskas | `a****e@d****g` | 1 |
| Chelle Gentemann | `c****n@e****g` | 1 |
| Paulius Sarka | `p****a@g****m` | 1 |
| rsignell | `1****l` | 1 |
| Rupesh Shrestha | `r****a@g****m` | 1 |


Issue and Pull Request metadata

Last synced: 2 days ago

Total issues: 221
Total pull requests: 349
Average time to close issues: 2 months
Average time to close pull requests: 11 days
Total issue authors: 56
Total pull request authors: 27
Average comments per issue: 3.09
Average comments per pull request: 2.18
Merged pull request: 127
Bot issues: 0
Bot pull requests: 203

Past year issues: 173
Past year pull requests: 167
Past year average time to close issues: 21 days
Past year average time to close pull requests: 8 days
Past year issue authors: 47
Past year pull request authors: 23
Past year average comments per issue: 2.94
Past year average comments per pull request: 3.1
Past year merged pull request: 94
Past year bot issues: 0
Past year bot pull requests: 54

More stats: https://issues.ecosyste.ms/repositories/lookup?url=https://github.com/nsidc/earthaccess

Top Issue Authors

  • mfisher87 (53)
  • betolink (34)
  • jrbourbeau (18)
  • andypbarrett (11)
  • MattF-NSIDC (11)
  • itcarroll (8)
  • danielfromearth (7)
  • chuckwondo (6)
  • asteiker (5)
  • JessicaS11 (4)
  • rupesh2 (4)
  • jhkennedy (3)
  • nikki-t (3)
  • Rapsodia86 (3)
  • aldotapia (3)

Top Pull Request Authors

  • dependabot[bot] (199)
  • jrbourbeau (35)
  • betolink (29)
  • mfisher87 (25)
  • danielfromearth (6)
  • itcarroll (6)
  • chuckwondo (5)
  • jhkennedy (5)
  • MattF-NSIDC (5)
  • pre-commit-ci[bot] (4)
  • asteiker (4)
  • andypbarrett (4)
  • jules32 (3)
  • doug-newman-nasa (3)
  • JessicaS11 (3)

Top Issue Labels

  • enhancement (43)
  • documentation (33)
  • bug (27)
  • good first issue (19)
  • automation (12)
  • question (4)
  • feedback requested (1)
  • help wanted (1)

Top Pull Request Labels

  • dependencies (199)
  • github_actions (10)
  • python (9)
  • help wanted (2)
  • enhancement (1)

Package metadata

pypi.org: earthaccess

Client library for NASA Earthdata APIs

  • Homepage: https://github.com/nsidc/earthaccess
  • Documentation: https://earthaccess.readthedocs.io/
  • Licenses: MIT
  • Latest release: 0.9.0 (published 2 months ago)
  • Last Synced: 2024-05-09T08:37:00.139Z (2 days ago)
  • Versions: 14
  • Dependent Packages: 3
  • Dependent Repositories: 7
  • Downloads: 4,339 Last month
  • Docker Downloads: 155
  • Rankings:
    • Docker downloads count: 2.317%
    • Dependent packages count: 2.377%
    • Stargazers count: 4.196%
    • Average: 4.412%
    • Dependent repos count: 5.533%
    • Downloads: 5.775%
    • Forks count: 6.275%
  • Maintainers (1)

Dependencies

pyproject.toml pypi
  • autoflake >=1.3 develop
  • black >=21.11b0 develop
  • flake8 >=3.7 develop
  • ipywidgets ^7.7.0 develop
  • isort >=5 develop
  • jupyterlab >=3 develop
  • markdown-include >=0.6 develop
  • mkdocs >=1.2 develop
  • mkdocs-jupyter =0.19.0 develop
  • mkdocs-material >=7.1,<9.0 develop
  • mkdocstrings >=0.18 develop
  • mypy >=0.812 develop
  • pre-commit >=2.4 develop
  • pygments =2.11.1 develop
  • pymdown-extensions =9.2 develop
  • pytest >=6.0 develop
  • pytest-cov >=2.8 develop
  • pytest-watch >=4.2 develop
  • responses >=0.14 develop
  • types-requests >=0.1 develop
  • types-setuptools >=0.1 develop
  • widgetsnbextension ^3.6.0 develop
  • pqdm >=0.1
  • python >=3.8,<4.0
  • python-benedict >=0.25
  • python-cmr >=0.7
  • requests >=2.26
  • s3fs >=2021.11, <2024
  • tinynetrc ^1.3.1
.github/workflows/issue-manager.yml actions
  • tiangolo/issue-manager master composite
.github/workflows/publish.yml actions
  • JRubics/poetry-publish v1.8 composite
  • actions/checkout v2 composite
.github/workflows/test.yml actions
  • actions/cache v1 composite
  • actions/checkout v2 composite
  • actions/setup-python v1 composite
  • codecov/codecov-action v1 composite
.github/workflows/integration-test.yml actions
  • actions/cache v1 composite
  • actions/checkout v2 composite
  • actions/setup-python v1 composite
  • codecov/codecov-action v1 composite
poetry.lock pypi
  • 194 dependencies
.github/workflows/pr-rtd-link.yml actions
  • readthedocs/actions/preview v1 composite
.github/workflows/static-analysis.yml actions
  • actions/checkout v4 composite
  • actions/setup-python v4 composite
.github/workflows/test-mindeps.yml actions
  • actions/checkout v4.1.1 composite
  • codecov/codecov-action v1 composite
  • conda-incubator/setup-miniconda v2.2.0 composite
binder/environment.yml conda
  • cartopy >=0.18.0
  • dask >=2022.1
  • geopandas >=0.9
  • h5netcdf >=0.11
  • h5py >=3.2
  • holoviews
  • hvplot
  • ipyleaflet >=0.15
  • jupyterlab >=3
  • matplotlib-base >=3.3
  • netcdf4 >=1.5
  • panel
  • python 3.9.*
  • rioxarray >=0.3
  • xarray >=0.19
  • zarr >=2.9.5

Score: 17.941453588466988