Recent Releases of earthaccess

earthaccess - v0.14.0

[v0.14.0] - 2025-02-11

📣 📣 📣 BREAKING CHANGE 📣 📣 📣

From now on, any time Earthdata Login rejects credentials, a Python exception will be raised. You can get the old behavior with a standard try/except block:

# Caution: If credentials are rejected, you should know about it and update any env vars
# or .netrc files. If credentials are rejected too many times, you could get locked out
# of your account.
try:
    earthaccess.login()
except Exception:
    pass
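
If silently passing feels too risky, one option is to catch the failure and log it instead. The helper below is only a sketch: login_or_none is a hypothetical name (not part of earthaccess), and it catches the broad Exception because these notes do not name a narrower class.

```python
import logging
from typing import Callable, Optional, TypeVar

T = TypeVar("T")
logger = logging.getLogger(__name__)


def login_or_none(login: Callable[[], T]) -> Optional[T]:
    """Call `login`; if it raises, log a warning and return None."""
    try:
        return login()
    except Exception as exc:  # broad on purpose: no narrower class is documented here
        logger.warning("Earthdata Login failed: %s", exc)
        return None
```

Calling login_or_none(earthaccess.login) would then return the auth object on success, or None plus a logged warning when credentials are rejected, so the rejection is at least visible.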

Added

  • search_datasets now accepts a has_granules keyword argument. Use has_granules=False to search for metadata about collections with no associated granules. The default value set in DataCollections remains True. (#939) (@juliacollins)

Changed

  • Breaking: earthaccess will now raise an exception when login credentials are rejected. If you need the old behavior, please use a try block. (#888) (@mfisher87, @chuckwondo, @jhkennedy)

Full Changelog: https://github.com/nsidc/earthaccess/compare/v0.13.0...v0.14.0

Sustainable Development - Data Catalogs and Interfaces - Python
Published by juliacollins 3 months ago

earthaccess - v0.13.0

[v0.13.0] - 2025-01-28

Changed

  • Integration tests: Tests are no longer randomized, so each failure should be reproducible. We now test the most
    popular datasets from all DAACs; see the files under tests/integration/popular_collections.
    (#215)
    (@mfisher87)

Added

  • VirtualiZarr: earthaccess can open archival formats (NetCDF, HDF5) as if they were Zarr by leveraging VirtualiZarr.
    To use this capability, the collection must be supported by OPeNDAP and have dmrpp files.
    See the example notebooks!
    (@ayushnag and @TomNicholas)

Fixed

  • earthaccess.download will let requests automatically decode compressed content
    (#887)
    (@itcarroll)

  • earthaccess.download now shares the authenticated session cookie among threads to avoid overloading EDL.
    (#913)
    (@hailiangzhang)
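
The idea behind the session-cookie fix can be sketched with the standard library alone: authenticate once, then let every worker thread reuse the result. This is not the earthaccess implementation; FakeAuthServer and download_all are hypothetical stand-ins used only to show that a single login serves all threads.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


class FakeAuthServer:
    """Hypothetical stand-in for a login service; counts login requests."""

    def __init__(self) -> None:
        self.logins = 0
        self._lock = threading.Lock()

    def login(self) -> Dict[str, str]:
        with self._lock:
            self.logins += 1
        return {"session": "cookie-value"}


def download_all(urls: List[str], server: FakeAuthServer, threads: int = 4) -> List[Tuple[str, str]]:
    cookies = server.login()  # authenticate once, before any worker starts

    def fetch(url: str) -> Tuple[str, str]:
        # Each worker gets its own copy of the shared cookies; no extra login.
        worker_cookies = dict(cookies)
        return url, worker_cookies["session"]

    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(fetch, urls))


server = FakeAuthServer()
urls = ["https://example.invalid/a", "https://example.invalid/b", "https://example.invalid/c"]
results = download_all(urls, server)
```

However many URLs are fetched, the stand-in server sees one login request, which is the property the fix is after.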

Full Changelog: https://github.com/nsidc/earthaccess/compare/v0.12.0...v0.13.0

Published by betolink 3 months ago

earthaccess - v0.12.0

v0.12.0

Changed

  • Use built-in assert statements instead of unittest assertions in integration tests (#743) (@chuckwondo)

Added

  • Add support for NETRC environment variable to override default .netrc file location (#480) (@chuckwondo)
  • Add nox session for running integration tests locally (#815; @chuckwondo and #872; @jhkennedy)
  • Auto-add comment to PR that requires maintainer to review and re-run integration tests (#824) (@chuckwondo)
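
To make the NETRC override concrete, here is a minimal sketch using only the standard library. It is not earthaccess code: resolve_netrc_path and read_login are hypothetical helpers, the POSIX-style ~/.netrc default and the urs.earthdata.nasa.gov host name are assumptions, and the demo writes a throwaway file so nothing touches a real .netrc.

```python
import netrc
import os
import tempfile
from pathlib import Path
from typing import Mapping, Optional, Tuple

EDL_HOST = "urs.earthdata.nasa.gov"  # assumed Earthdata Login host name


def resolve_netrc_path(env: Mapping[str, str] = os.environ) -> Path:
    """Return the NETRC override if set, else the conventional ~/.netrc."""
    override = env.get("NETRC")
    return Path(override) if override else Path.home() / ".netrc"


def read_login(path: Path, host: str = EDL_HOST) -> Optional[Tuple[str, str]]:
    """Return (login, password) for `host`, or None if there is no entry."""
    entry = netrc.netrc(str(path)).authenticators(host)
    if entry is None:
        return None
    login, _account, password = entry
    return login, password


# Demo: point NETRC at a temporary file and read the credentials back.
with tempfile.TemporaryDirectory() as tmp:
    custom = Path(tmp) / "my_netrc"
    custom.write_text(f"machine {EDL_HOST} login demo_user password demo_pass\n")
    demo_path = resolve_netrc_path({"NETRC": str(custom)})
    demo_login = read_login(demo_path)
```

In a real shell session the equivalent would be exporting NETRC before starting Python, so the credentials file can live outside the home directory.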

Removed

  • The scripts/integration-test.sh script has been removed in favor of the integration-tests nox session. (#872) (@jhkennedy)
  • Python 3.9 is no longer supported. (#876) (@mfisher87)

Fixed

  • earthaccess.download will not ignore errors by default (#581) (@Sherwin-14, @chuckwondo, @mfisher87)
  • Integration tests no longer clobber existing .netrc file (#806) (@chuckwondo)
  • Return an empty list instead of raising an IndexError when searches find no results. (#526) (@jhkennedy)

Full Changelog: https://github.com/nsidc/earthaccess/compare/v0.11.0...v0.12.0

Published by betolink 6 months ago

earthaccess - v0.11.0

v0.11.0

Changed

  • Automatically refresh EDL token and deprecate the Auth.refresh_tokens method with no replacement, as there is no longer a need to explicitly refresh (#484) (@fwfichtner)
  • Deprecate earthaccess.get_s3fs_session and Store.get_s3fs_session. Use earthaccess.get_s3_filesystem and Store.get_s3_filesystem, respectively, instead (#766) (@Sherwin-14, @chuckwondo)

Added

  • Add Issue Templates (#281) (@Sherwin-14)
  • Support Service queries (#447) (@nikki-t, @chuckwondo, @mfisher87, @betolink)
  • Add example PR links to pull request template (#756) (@Sherwin-14, @mfisher87)
  • Add Contributing Naming Convention document (#532) (@Sherwin-14, @mfisher87)

Fixed

  • Remove broken link "Introduction to NASA earthaccess" (#779) (@Sherwin-14)
  • Restore automation for tidying notebooks used in documentation (#788) (@itcarroll)

Removed

  • Remove binder/ directory, as we no longer need a special binder environment with the top-level environment.yml introduced in #733 (@jhkennedy)

Full Changelog: https://github.com/nsidc/earthaccess/compare/v0.10.0...v0.11.0

Published by nikki-t 7 months ago

earthaccess - v0.10.0

v0.10.0

Full Changelog: https://github.com/nsidc/earthaccess/compare/v0.9.0...v0.10.0

Published by betolink 10 months ago

earthaccess - v0.9.0

Full Changelog: https://github.com/nsidc/earthaccess/compare/v0.8.2...v0.9.0

Published by betolink about 1 year ago

earthaccess - v0.8.2

What's Changed

  • Bug fixes:
    • Enable AWS check with IMDSv2
    • Add region to running in AWS check
    • Handle opening multi-file granules
  • Maintenance:
    • Add CI tests with minimum supported versions
    • Update poetry lockfile
    • Add python-dateutil as a direct dependency
    • Remove binder PR comments
    • Add YAML formatting (prettier)

Full Changelog: https://github.com/nsidc/earthaccess/compare/v0.8.1...v0.8.2

Published by jrbourbeau over 1 year ago

earthaccess - v0.8.1

What's Changed

  • New Features:
    • Add kerchunk metadata consolidation utility.
  • Enhancements:
    • Handle S3 credential expiration more gracefully.
  • Maintenance:
    • Use Dependabot to update GitHub Actions.
    • Consolidate dependabot updates.
    • Switch to ruff for formatting.

Full Changelog: https://github.com/nsidc/earthaccess/compare/v0.8.0...v0.8.1

Published by jrbourbeau over 1 year ago

earthaccess - v0.8.0

What's Changed

  • Bug fixes:
    • Fix zero granules being reported for restricted datasets. (#358)
  • Enhancements:
    • earthaccess will raise errors instead of printing them in more cases. (#351)
    • daac and provider parameters are now normalized to uppercase, since lowercase
      characters are never valid. (#355)

Full Changelog: https://github.com/nsidc/earthaccess/compare/v0.7.1...v0.8.0

Published by mfisher87 over 1 year ago

earthaccess - v0.7.1

Full Changelog: https://github.com/nsidc/earthaccess/compare/v0.7.0...v0.7.1

Published by mfisher87 over 1 year ago

earthaccess - v0.7.0

  • Bug Fixes:
    • Fix spelling mistake in access variable assignment (direc -> direct)
      in earthaccess.store._get_granules.
    • Pass threads arg to _open_urls_https in
      earthaccess.store._open_urls, replacing the hard-coded value of 8.
    • Return S3 data links by default when in region.
  • Enhancements:
    • earthaccess.download now accepts a single granule as input in addition to a list of granules.
    • earthaccess.download now returns fully qualified local file paths.
  • New Features:
    • earthaccess will now automatically search for Earthdata authentication. earthaccess.login()
      still works as before, but is no longer required if you have a ~/.netrc file or have set the
      EARTHDATA_USERNAME and EARTHDATA_PASSWORD environment variables.
    • Add earthaccess.auth_environ() utility for getting Earthdata authentication environment variables.
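
Before relying on the automatic lookup, it can help to check which credential sources are actually present. The sketch below uses only the standard library; credential_sources is a hypothetical helper, not an earthaccess function, and it makes no claim about the order in which earthaccess itself tries these sources.

```python
import os
from pathlib import Path
from typing import List, Mapping


def credential_sources(env: Mapping[str, str], netrc_path: Path) -> List[str]:
    """List which of the two sources named above appear to be available."""
    found = []
    # Both variables must be set and non-empty to count as a usable source.
    if env.get("EARTHDATA_USERNAME") and env.get("EARTHDATA_PASSWORD"):
        found.append("environment")
    if netrc_path.is_file():
        found.append("netrc")
    return found


if __name__ == "__main__":
    print(credential_sources(os.environ, Path.home() / ".netrc"))
```

An empty list suggests that an explicit earthaccess.login() call (or setting one of the sources) is still needed.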

Published by jrbourbeau over 1 year ago

earthaccess - v0.6.1

Hotfix: A version number was out of sync prior to the last release. This release brings all the version numbers in sync and enables a successful publish to PyPI.

Published by MattF-NSIDC over 1 year ago

earthaccess - v0.6.0

Bug fixes

  • earthaccess.search_datasets() and earthaccess.search_data() can find restricted datasets #296
  • distributed serialization fixed for EarthAccessFile #301 and #276

New features

  • earthaccess.get_s3fs_session() can use the results to find the right set of S3 credentials

Published by mfisher87 over 1 year ago

earthaccess - v0.5.3

Enhancements

  • We can search by DOI at the granule level: if a collection is found, earthaccess will grab the concept_id from the CMR record and search using it.
  • We can now use pattern matching on granule file names (closes #198). Combining the two, we can run searches like:
results = earthaccess.search_data(
    doi="10.5067/SLREF-CDRV3",
    granule_name="2005-*.nc",
    count=100,
)
  • If using a remote Dask cluster, earthaccess will open the files using HTTPS links and will switch on the fly to S3 links if the cluster is in us-west-2. Thanks to @jrbourbeau! This change implemented a thin wrapper around fsspec.AbstractFileSystem.

  • The granule representation drops the spatial output (it was a blob of JSON) in favor of a simpler is_cloud_hosted, until we have a nicer spatial formatter.
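
To get a feel for what a pattern like "2005-*.nc" selects, here is a local illustration with the standard library's fnmatch. The file names are made up, and CMR's server-side wildcard matching may not behave exactly like fnmatch, so treat this as intuition for the glob rather than a description of the search itself.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def match_names(names: Iterable[str], pattern: str) -> List[str]:
    """Return the names that match a shell-style pattern, case-sensitively."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical granule file names, for illustration only.
names = ["2005-01-05.nc", "2005-12-31.nc", "2006-01-01.nc", "2005-01-05.nc.md5"]
matched = match_names(names, "2005-*.nc")
print(matched)  # ['2005-01-05.nc', '2005-12-31.nc']
```

The 2006 file and the .md5 sidecar are excluded because the pattern has to match the whole name, not just a prefix.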

Bugs fixed

  • The size() method for granules had a typo and always returned 0; this is now fixed.
  • HTTPS sessions are back to trust_env=False. With a True value, the session reads the .netrc file and sends both simple auth and tokens at the same time, causing an authentication error with most services.

Documentation improvements

  • Reorganized docs to include resources and end to end examples
  • README now uses the SSHA dataset from PODAAC, as it is simpler to explain and work with than ATL data; addresses #241
  • SSL and EMIT examples included in the documentation, they are executed end to end on CI
  • Added a minimal example of search_data() filtering thanks @andypbarrett!

CI Maintenance:

  • Integration tests are on a different file
  • Integration tests are going to run only on pushes to main
  • Documentation is only going to be updated when we update main
  • PODAAC migrated all their data to the cloud already so there is no point in having it on the on_prem tests

Contributors to this release

@MattF-NSIDC @jrbourbeau @mrocklin @andypbarrett @betolink

🚀

Published by betolink over 1 year ago

earthaccess - v0.5.2

  • Deprecated Benedict as the dict data structure in favor of the built-in Python dict. Thanks @psarka!
  • Fixed the NSIDC S3 credentials endpoint

Published by betolink about 2 years ago

earthaccess - v0.5.1

This release fixes #212 and implements more testing for the Auth and S3Credentials endpoints. Eventually they will all support bearer tokens, but only ASF does at the moment.

  • Fix call to S3Credentials
  • Fix readthedocs
  • Removed python_magic from core dependencies (will fix Windows for conda)
  • Updated example notebooks to use the new top level API
  • Support EARTHDATA_USERNAME and EARTHDATA_PASSWORD, same as in IcePyx (work in progress with @JessicaS11)
  • Once logged in, we can access our profile (and email) with:
auth = earthaccess.login()

profile = auth.user_profile
email = profile["email_address"]

Published by betolink about 2 years ago

earthaccess - v0.5.0

This release will fix some bugs and bring new capabilities to the top level API

import earthaccess

auth = earthaccess.login()

This will automatically try all strategies, so there is no need to specify one. If our credentials are not found, it will ask the user to provide them interactively.

s3_credentials = earthaccess.get_s3_credentials(daac="PODAAC")
# use them with your fav library, e.g. boto3
# another thing we can do with our auth instance is to refresh our EDL tokens
auth.refresh_tokens()

We can also get authenticated fsspec sessions:

url = "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20220903T163129_2224611_012/EMIT_L2A_RFL_001_20220903T163129_2224611_012.nc"

fs = earthaccess.get_fsspec_https_session()
with fs.open(url) as f:
    data = f.read(10)
data

or we can use them in tandem with xarray/rioxarray

import xarray as xr

ds = xr.open_mfdataset(earthaccess.open([url]))
ds

This PR fixes #195 and #187 and completes #167.

Published by betolink about 2 years ago

earthaccess - v0.4.7 AGU Edition

Bug fixes:

  • direct access streaming: .open() now works with granules from results when we run the code in us-west-2
  • python-magic is a dev dependency, moved to the dev section in pyproject.toml

Published by betolink over 2 years ago

earthaccess - v0.4.6

This is the first formal release under the new name. 0.4.6 will be available on both PyPI and conda-forge.

The first thing to mention is the new API notation, which should evolve to support all the use cases:

import earthaccess

earthaccess.login(strategy="netrc")

granules = earthaccess.search_data(params)

earthaccess.download(granules, local_path="./test")

is equivalent to

from earthdata import Store, Auth, DataGranules

auth = Auth()
auth.login(strategy="netrc")
store = Store(auth)

granules = DataGranules().params(params).get()

store.get(granules, local_path="./test")

We can still use the classes the same way, but eventually we should support only the module-level API.

Features:

  • search datasets by DOI, e.g.
datasets = earthaccess.search_datasets(
    doi="10.5067/AQR50-3Q7CS",
    cloud_hosted=True,
)

Searching by DOI should usually return only one dataset, but I'm not sure what would happen if the same data is also in the cloud. To be sure, we can use the cloud_hosted parameter if we want to operate on the AWS-hosted version.

The documentation is being updated, and soon we should have a "gallery" with more examples of how to use the library.

Published by betolink over 2 years ago

earthaccess - earthaccess

First release under the new name. PyPI was updated and the current earthaccess package installs v0.4.5; conda-forge is still pending.

The old notation is still supported: we can import the classes and instantiate them the same way, but a simpler notation is probably a better idea. From now on we can do the following:

import earthaccess

earthaccess.login(strategy="netrc")

granules = earthaccess.search_data(params)

earthaccess.download(granules, local_path="./test")

and voila!

This is still beta, and the thought is that we can have a stable package starting with v0.5.0. We need to add more tests and deal with EULAs, as they represent a big obstacle for programmatic access, especially for new accounts with NASA.

Published by betolink over 2 years ago

earthaccess - v0.4.1

This is a minor release with some bug fixes but the last one with the old name. The next release will come with the earthaccess name.

  • store.get() had a bug when we used it with empty lists
  • GESDISC didn't have S3 credential endpoints
  • LP DAAC changed its S3 credential endpoint
  • documentation from superclasses was not showing due to a recent change in mkdocstrings; had to re-implement the inherited members and call super()

Published by betolink over 2 years ago

earthaccess - v0.4.0

earthdata can now persist the user's credentials in a .netrc file:

from earthdata import Auth, DataCollections, DataGranules, Store

auth = Auth().login(strategy="netrc")
# are we authenticated?
if not auth.authenticated:
    # ask for credentials and persist them in a .netrc file
    auth.login(strategy="interactive", persist=True)

We can also renew our CMR token to make sure our authenticated queries work:

auth.refresh_token()
collections = DataCollections(auth).concept_id("c-some-restricted-dataset").get()

We can get authenticated fsspec file sessions. closes #41

store = Store(auth)

fs = store.get_https_session()
# we can use fsspec to get any granule from any DAAC!
fs.get("https://DAAC/granule", "./data")

We can use Store to get our files from a URL list. closes #43

store = Store(auth)
files = store.get(["https://GRANULE_URL"], "./data/")

Lastly, we can stream certain datasets directly into xarray (even if we are not in AWS)

%%time 
import xarray as xr

query_results = DataGranules().concept_id("C2036880672-POCLOUD").temporal("2018-01-01", "2018-12-31").get()
ds = xr.open_mfdataset(store.open(query_results))
ds

Published by betolink over 2 years ago

earthaccess - v0.3.1 First usable version 🚀

This is probably the first usable version for earthdata

New features:

  • python-cmr:

    • it now uses the latest python-cmr version (NASA fork), which opens new possibilities for querying CMR. Soon, on top of datasets and data files (granules), platforms and variables will also be supported.
  • Documentation:

  • Authentication:

    • Auth can persist user credentials into a netrc file
    • Auth can refresh CMR tokens

Published by betolink about 3 years ago

earthaccess - Beta Release

Core features are now working making the library usable.

New features and improvements

  • The Auth class can now authenticate using a .netrc file or environment variables
  • Queries can be debugged with .debug(True)
  • cloud collections will return S3 links by default or HTTPS with .data_links(direct_s3=False)

Bug fixes

  • Incomplete dates are now parsed in a more predictable way

Published by betolink over 3 years ago

earthaccess - Initial Release

earthdata v0.1.1-alpha.0

Initial beta release of earthdata, a client library for NASA CMR and EDL.

New features and improvements

  • Added simple classes to search and download collections and granules
  • Authentication is managed using an Auth class that gets the user's EDL credentials one time.
  • No need to use .netrc as all the calls from the client use the Auth session if provided.

Acknowledgments

  • NASA OpenScapes: A NASA-funded project to support open science and scientific researchers using data from NASA Distributed Active Archive Centers (DAACs) as they migrate workflows to the cloud.

Published by betolink over 3 years ago