GeoTessera
A foundation model that can process time-series satellite imagery for applications such as land classification and canopy height prediction.
https://github.com/ucam-eo/geotessera
Category: Natural Resources
Sub Category: Soil and Land
Keywords from Contributors
standards
Last synced: about 23 hours ago
Repository metadata
Python library for the Tessera embeddings
- Host: GitHub
- URL: https://github.com/ucam-eo/geotessera
- Owner: ucam-eo
- License: isc
- Created: 2025-07-02T10:24:37.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2025-11-30T17:08:38.000Z (26 days ago)
- Last Synced: 2025-12-01T11:14:34.262Z (25 days ago)
- Language: Python
- Size: 67.2 MB
- Stars: 185
- Watchers: 4
- Forks: 21
- Open Issues: 40
- Releases: 0
- Metadata Files:
  - Readme: README.md
  - Changelog: CHANGES.md
  - License: LICENSE.md
README.md
GeoTessera
Python library for accessing and working with Tessera geospatial foundation model embeddings.
Overview
GeoTessera provides access to geospatial embeddings from the Tessera
foundation model, which processes
Sentinel-1 and Sentinel-2 satellite imagery to generate 128-channel
representation maps at 10m resolution. These embeddings compress a full year of
temporal-spectral features into dense representations optimized for downstream
geospatial analysis tasks. Read more details about the model.

Request missing embeddings
This repo provides precomputed embeddings for multiple years and regions.
Embeddings are generated by randomly sampling tiles within each region to ensure broad spatial coverage.
If a year (2017–2024) or an area you need is still missing, please submit an Embedding Request:
- Open an Embedding Request
- Please include: your organization, intended use, ROI as a bounding box with four points (lon,lat, 4 decimals), and the year(s).
After you submit the request, we will prioritize your ROI and notify you via a comment in the issue once the embeddings are ready.
A request for support
Due to limited compute resources, we're unable to fulfill embedding requests covering large geographic areas or requiring substantial processing time. To help us serve the community better, we kindly ask requesters, especially those from commercial organizations or those requiring large-scale processing, to sponsor their requests by providing us with Azure credits. Importantly, the resulting outputs will be contributed to our global embeddings database, making them freely available for the entire research and user community. This approach allows us to scale our service while building a shared resource that benefits everyone. If you are in a position to support us in this way, please contact Prof. S. Keshav at sk818@cam.ac.uk. We greatly appreciate your understanding and support in making Tessera more accessible to all.
Table of Contents
- Installation
- Architecture
- Quick Start
- Python API
- CLI Reference
- Complete Workflows
- Registry System
- Data Organization
- Contributing
Installation
pip install geotessera
For development:
git clone https://github.com/ucam-eo/geotessera
cd geotessera
pip install -e .
Architecture
Core Concepts
GeoTessera is built around a simple two-step workflow:
- Retrieve embeddings: Fetch raw numpy arrays for a geographic bounding box
- Export to desired format: Save as raw numpy arrays or convert to georeferenced GeoTIFF files
Coordinate System and Tile Grid
The Tessera embeddings use a 0.1-degree grid system:
- Tile size: Each tile covers 0.1° × 0.1° (approximately 11km × 11km at the equator)
- Tile naming: Tiles are named by their center coordinates (e.g., grid_0.15_52.05)
- Tile bounds: A tile at center (lon, lat) covers:
  - Longitude: [lon - 0.05°, lon + 0.05°]
  - Latitude: [lat - 0.05°, lat + 0.05°]
- Resolution: 10m per pixel (variable number of pixels per tile depending on latitude)
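The grid arithmetic above can be sketched in a few lines. This is a minimal illustration, not GeoTessera's own code: the helper names (tile_center, tile_name, tile_bounds, approx_tile_width_km) are hypothetical, and the sketch assumes tile edges sit on multiples of 0.1°, which is consistent with the example name grid_0.15_52.05. The width estimate uses a spherical approximation of about 111.32 km per degree of longitude at the equator.

```python
import math

TILE_DEG = 0.1  # tile edge length in degrees, per the grid description


def tile_center(lon, lat):
    """Return the center of the 0.1-degree tile containing (lon, lat).

    Assumption: tile edges fall on multiples of 0.1 degrees.
    """
    def snap(value):
        # Work in tenths of a degree; the inner round() guards against
        # float noise such as 0.3 * 10 == 3.0000000000000004.
        index = math.floor(round(value * 10, 9))
        return round(index / 10 + TILE_DEG / 2, 2)

    return snap(lon), snap(lat)


def tile_name(center_lon, center_lat):
    """Format a tile name from its center coordinates."""
    return f"grid_{center_lon:.2f}_{center_lat:.2f}"


def tile_bounds(center_lon, center_lat):
    """Return (min_lon, min_lat, max_lon, max_lat) for a tile center."""
    half = TILE_DEG / 2
    return (
        round(center_lon - half, 2),
        round(center_lat - half, 2),
        round(center_lon + half, 2),
        round(center_lat + half, 2),
    )


def approx_tile_width_km(lat):
    """Approximate east-west tile extent in km (spherical Earth assumption)."""
    return TILE_DEG * 111.32 * math.cos(math.radians(lat))


center = tile_center(0.12, 52.03)
print(tile_name(*center), tile_bounds(*center))
print(f"{approx_tile_width_km(52.05):.1f} km wide at this latitude")
```

The east-west extent shrinks with the cosine of the latitude, which is one way to picture why the pixel count per tile varies: a tile near Cambridge is roughly 6.8 km wide under this approximation, against about 11 km at the equator.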
File Structure and Downloads
When you request embeddings, GeoTessera downloads several files via Pooch:
Embedding Files (via fetch_embedding)
- Quantized embeddings (grid_X.XX_Y.YY.npy):
  - Shape: (height, width, 128)
  - Data type: int8 (quantized for storage efficiency)
  - Contains the compressed embedding values
- Scale files (grid_X.XX_Y.YY_scales.npy):
  - Shape: (height, width) or (height, width, 128)
  - Data type: float32
  - Contains scale factors for dequantization
- Dequantization: final_embedding = quantized_embedding * scales
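The dequantization step can be illustrated with synthetic arrays. This is a sketch under stated assumptions rather than the library's implementation: the dequantize helper is hypothetical, and it assumes that a 2-D scales array holds one factor per pixel that applies to all 128 channels, so it needs a trailing axis before NumPy can broadcast it.

```python
import numpy as np


def dequantize(quantized, scales):
    """Multiply int8 embeddings by their float32 scale factors.

    Assumption: a (height, width) scales array holds one factor per pixel,
    shared by every channel; a (height, width, 128) array is per channel.
    """
    values = quantized.astype(np.float32)  # avoid int8 overflow and truncation
    if scales.ndim == 2:
        scales = scales[..., np.newaxis]   # (h, w) -> (h, w, 1) for broadcasting
    return values * scales


# Synthetic stand-ins for grid_X.XX_Y.YY.npy and grid_X.XX_Y.YY_scales.npy
quantized = np.full((4, 5, 128), 4, dtype=np.int8)
scales = np.full((4, 5), 0.5, dtype=np.float32)

embedding = dequantize(quantized, scales)
print(embedding.shape, embedding.dtype)
```

Without the extra axis, multiplying a (height, width, 128) array by a (height, width) array fails for most tile sizes, because NumPy aligns shapes from the trailing dimension.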
Landmask Files (for GeoTIFF export)
When exporting to GeoTIFF, additional landmask files are fetched:
- Landmask tiles (grid_X.XX_Y.YY.tiff):
  - Provide UTM projection information
  - Define precise geospatial transforms
  - Contain land/water masks
Data Flow
User Request (lat/lon bbox)
    ↓
Registry Lookup (find available tiles)
    ↓
Download Files (via Pooch with caching)
    ├── embedding.npy (quantized)
    └── embedding_scales.npy
    ↓
Dequantization (multiply arrays)
    ↓
Output Format
    ├── NumPy arrays → Direct analysis
    └── GeoTIFF → GIS integration
Quick Start
Check Available Data
Before downloading, check what data is available:
# Generate a coverage map showing all available tiles
geotessera coverage --output coverage_map.png
# Generate a coverage map for the UK
geotessera coverage --country uk
# View coverage for a specific year
geotessera coverage --year 2024 --output coverage_2024.png
# Customize the visualization
geotessera coverage --year 2024 --tile-color blue --tile-alpha 0.3 --dpi 150
Download Embeddings
Download embeddings as either numpy arrays or GeoTIFF files:
# Download as GeoTIFF (default, with georeferencing)
geotessera download \
--bbox "-0.2,51.4,0.1,51.6" \
--year 2024 \
--output ./london_tiffs
# Download as raw numpy arrays (with metadata JSON)
geotessera download \
--bbox "-0.2,51.4,0.1,51.6" \
--format npy \
--year 2024 \
--output ./london_arrays
# Download using a GeoJSON/Shapefile region
geotessera download \
--region-file cambridge.geojson \
--format tiff \
--year 2024 \
--output ./cambridge_tiles
# Download specific bands only
geotessera download \
--bbox "-0.2,51.4,0.1,51.6" \
--bands "0,1,2" \
--year 2024 \
--output ./london_rgb
Create Visualizations
Generate web maps from downloaded GeoTIFFs:
# Create an interactive web map
geotessera visualize \
./london_tiffs \
--type web \
--output ./london_web
# Create an RGB mosaic
geotessera visualize \
./london_tiffs \
--type rgb \
--bands "30,60,90" \
--output ./london_rgb
# Serve the web map locally
geotessera serve ./london_web --open
Python API
Core Methods
The library provides two main methods for retrieving embeddings:
from geotessera import GeoTessera
# Initialize the client
gt = GeoTessera()
# Method 1: Fetch a single tile
embedding, crs, transform = gt.fetch_embedding(lon=0.15, lat=52.05, year=2024)
print(f"Shape: {embedding.shape}") # e.g., (1200, 1200, 128)
print(f"CRS: {crs}") # Coordinate reference system from landmask
# Method 2: Fetch all tiles in a bounding box
bbox = (-0.2, 51.4, 0.1, 51.6) # (min_lon, min_lat, max_lon, max_lat)
embeddings = gt.fetch_embeddings(bbox, year=2024)
for tile_lon, tile_lat, embedding_array, crs, transform in embeddings:
    print(f"Tile ({tile_lat}, {tile_lon}): {embedding_array.shape}")
Export Formats
Export as GeoTIFF
# Export embeddings for a region as individual GeoTIFF files
files = gt.export_embedding_geotiffs(
    bbox=(-0.2, 51.4, 0.1, 51.6),
    output_dir="./output",
    year=2024,
    bands=None,      # Export all 128 bands (default)
    compress="lzw"   # Compression method
)
print(f"Created {len(files)} GeoTIFF files")
# Export specific bands only (e.g., first 3 for RGB visualization)
files = gt.export_embedding_geotiffs(
    bbox=(-0.2, 51.4, 0.1, 51.6),
    output_dir="./rgb_output",
    year=2024,
    bands=[0, 1, 2]  # Only export first 3 bands
)
Work with NumPy Arrays
import numpy as np

# Fetch and process embeddings directly (gt and bbox as defined under Core Methods)
embeddings = gt.fetch_embeddings(bbox, year=2024)
for lon, lat, embedding, crs, transform in embeddings:
    # Compute statistics
    mean_values = np.mean(embedding, axis=(0, 1))  # Mean per channel
    std_values = np.std(embedding, axis=(0, 1))    # Std per channel
    # Extract specific pixels
    center_pixel = embedding[embedding.shape[0]//2, embedding.shape[1]//2, :]
    # Apply custom processing (your_analysis_function is a placeholder for your own code)
    processed = your_analysis_function(embedding)
Visualization Functions
from geotessera.visualization import (
    create_rgb_mosaic,
    visualize_global_coverage
)
from geotessera.web import (
    create_coverage_summary_map,
    geotiff_to_web_tiles
)
# Create an RGB mosaic from multiple GeoTIFF files
create_rgb_mosaic(
    geotiff_paths=["tile1.tif", "tile2.tif"],
    output_path="mosaic.tif",
    bands=(0, 1, 2)  # RGB bands
)
# Generate web tiles for interactive maps
geotiff_to_web_tiles(
    geotiff_path="mosaic.tif",
    output_dir="./web_tiles",
    zoom_levels=(8, 15)
)
# Create a global coverage visualization
visualize_global_coverage(
    tessera_client=gt,
    output_path="global_coverage.png",
    year=2024,  # Or None for all years
    width_pixels=2000,
    tile_color="red",
    tile_alpha=0.6
)
CLI Reference
download
Download embeddings for a region in your preferred format:
geotessera download [OPTIONS]
Options:
-o, --output PATH Output directory [required]
--bbox TEXT Bounding box: 'min_lon,min_lat,max_lon,max_lat'
--region-file PATH GeoJSON/Shapefile to define region
-f, --format TEXT Output format: 'tiff' or 'npy' (default: tiff)
--year INT Year of embeddings (default: 2024)
--bands TEXT Comma-separated band indices (default: all 128)
--compress TEXT Compression for TIFF format (default: lzw)
--list-files List all created files with details
-v, --verbose Verbose output
Output formats:
- tiff: Georeferenced GeoTIFF files with UTM projection
- npy: Raw numpy arrays with metadata.json file
visualize
Create visualizations from GeoTIFF files:
geotessera visualize INPUT_PATH [OPTIONS]
Options:
-o, --output PATH Output directory [required]
--type TEXT Visualization type: rgb, web, coverage
--bands TEXT Comma-separated band indices for RGB
--normalize Normalize bands
--min-zoom INT Min zoom for web tiles (default: 8)
--max-zoom INT Max zoom for web tiles (default: 15)
--force Force regeneration of tiles
coverage
Generate a world map showing data availability:
geotessera coverage [OPTIONS]
Options:
-o, --output PATH Output PNG file (default: tessera_coverage.png)
--year INT Specific year to visualize
--tile-color TEXT Color for tiles (default: red)
--tile-alpha FLOAT Transparency 0-1 (default: 0.6)
--tile-size FLOAT Size multiplier (default: 1.0)
--dpi INT Output resolution (default: 100)
--width INT Figure width in inches (default: 20)
--height INT Figure height in inches (default: 10)
--no-countries Don't show country boundaries
serve
Serve web visualizations locally:
geotessera serve DIRECTORY [OPTIONS]
Options:
-p, --port INT Port number (default: 8000)
--open/--no-open Auto-open browser (default: open)
--html TEXT Specific HTML file to serve
info
Display information about GeoTIFF files or the library:
geotessera info [OPTIONS]
Options:
--geotiffs PATH Analyze GeoTIFF files/directory
--dataset-version TEXT Tessera dataset version
-v, --verbose Verbose output
Registry System
Overview
GeoTessera uses a registry system to efficiently manage and access the large Tessera dataset:
- Block-based organization: Registry divided into 5×5 degree geographic blocks
- Lazy loading: Only loads registry blocks for the region you're accessing
- Automatic caching: Downloads are cached locally using Pooch
- Integrity checking: SHA256 checksums ensure data integrity
Registry Sources
The registry can be loaded from multiple sources (in priority order):
- Local directory (via --registry-dir or registry_dir parameter)
- Environment variable (TESSERA_REGISTRY_DIR)
- Auto-cloned repository (default, from GitHub)
# Use local registry
gt = GeoTessera(registry_dir="/path/to/tessera-manifests")
# Use auto-updating registry
gt = GeoTessera(auto_update=True)
# Use custom manifest repository
gt = GeoTessera(
manifests_repo_url="https://github.com/your-org/custom-manifests.git"
)
Registry Structure
tessera-manifests/
└── registry/
    ├── embeddings/
    │   ├── embeddings_2024_lon-5_lat50.txt  # 5×5° block
    │   ├── embeddings_2024_lon0_lat50.txt
    │   └── ...
    └── landmasks/
        ├── landmasks_lon-5_lat50.txt
        ├── landmasks_lon0_lat50.txt
        └── ...
Each registry file contains:
# Pooch registry format
filepath SHA256checksum
2024/grid_0.15_52.05/grid_0.15_52.05.npy sha256:abc123...
2024/grid_0.15_52.05/grid_0.15_52.05_scales.npy sha256:def456...
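To make the registry format concrete, the following sketch parses lines of the form "path algorithm:digest" and checks a payload against its recorded digest using only the standard library. Pooch performs this verification itself when fetching; parse_registry and matches_checksum are hypothetical helpers for illustration, and the sample path and payload are made up.

```python
import hashlib


def parse_registry(text):
    """Map each file path to an (algorithm, hex digest) pair."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        path, checksum = line.split()
        algorithm, _, digest = checksum.partition(":")
        if not digest:
            # No "algorithm:" prefix: treat the value as a bare SHA256 digest
            algorithm, digest = "sha256", algorithm
        entries[path] = (algorithm, digest)
    return entries


def matches_checksum(data, algorithm, digest):
    """Return True if the bytes hash to the recorded digest."""
    return hashlib.new(algorithm, data).hexdigest() == digest


# Build a one-line registry for a made-up payload, then verify it
payload = b"example tile bytes"
sample_path = "2024/grid_0.15_52.05/grid_0.15_52.05.npy"
registry_text = (
    "# Pooch registry format\n"
    f"{sample_path} sha256:{hashlib.sha256(payload).hexdigest()}\n"
)

entries = parse_registry(registry_text)
print(matches_checksum(payload, *entries[sample_path]))
```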
How Registry Loading Works
- Request tiles for bbox → Determine which 5×5° blocks overlap
- Load block registries → Parse only the needed registry files
- Find available tiles → List tiles within the requested region
- Fetch via Pooch → Download with caching and integrity checks
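The first step, finding the blocks that overlap a bounding box, can be sketched as follows. This is an illustration rather than the library's code: blocks_for_bbox and embeddings_registry_name are hypothetical helpers, and the sketch assumes that each block is named by its south-west corner snapped to a multiple of 5°, which matches the file names shown under Registry Structure.

```python
import math

BLOCK_DEG = 5  # registry block size in degrees


def blocks_for_bbox(bbox):
    """Return (block_lon, block_lat) corners of blocks overlapping the bbox.

    Assumption: blocks are identified by their south-west corner.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    lon_start = math.floor(min_lon / BLOCK_DEG) * BLOCK_DEG
    lon_end = math.floor(max_lon / BLOCK_DEG) * BLOCK_DEG
    lat_start = math.floor(min_lat / BLOCK_DEG) * BLOCK_DEG
    lat_end = math.floor(max_lat / BLOCK_DEG) * BLOCK_DEG
    return [
        (lon, lat)
        for lat in range(lat_start, lat_end + 1, BLOCK_DEG)
        for lon in range(lon_start, lon_end + 1, BLOCK_DEG)
    ]


def embeddings_registry_name(year, block_lon, block_lat):
    """Build a registry file name following the pattern in Registry Structure."""
    return f"embeddings_{year}_lon{block_lon}_lat{block_lat}.txt"


london = (-0.2, 51.4, 0.1, 51.6)
for block_lon, block_lat in blocks_for_bbox(london):
    print(embeddings_registry_name(2024, block_lon, block_lat))
```

Because the London box straddles the prime meridian, it touches two blocks even though it is only 0.3° wide, so both embeddings_2024_lon-5_lat50.txt and embeddings_2024_lon0_lat50.txt would need to be parsed.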
Data Organization
Tessera Data Structure
Remote Server (dl-2.tessera.wiki)
├── v1/                                     # Dataset version
│   ├── 2024/                               # Year
│   │   ├── grid_0.15_52.05/                # Tile (named by center coords)
│   │   │   ├── grid_0.15_52.05.npy         # Quantized embeddings
│   │   │   └── grid_0.15_52.05_scales.npy  # Scale factors
│   │   └── ...
│   └── landmasks/
│       ├── grid_0.15_52.05.tiff            # Landmask with projection info
│       └── ...
Local Cache Structure
~/.cache/geotessera/                # Default cache location
├── tessera-manifests/              # Auto-cloned registry
│   └── registry/
├── pooch/                          # Downloaded data files
│   ├── grid_0.15_52.05.npy
│   ├── grid_0.15_52.05_scales.npy
│   └── ...
Coordinate Reference Systems
- Embeddings: Stored in simple arrays, referenced by center coordinates
- GeoTIFF exports: Use UTM projection from corresponding landmask tiles
- Web visualizations: Reprojected to Web Mercator (EPSG:3857)
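For orientation, the UTM zone that a tile is likely to fall in can be estimated from its center coordinates with the standard 6° zone formula. This is only a rough guide: the landmask tile remains the authoritative source of the projection, utm_epsg is a hypothetical helper, and the formula ignores the special zone boundaries around Norway and Svalbard.

```python
import math


def utm_epsg(lon, lat):
    """Estimate the WGS 84 / UTM EPSG code for a point.

    Uses the standard 6-degree zones: EPSG 326xx in the northern
    hemisphere and 327xx in the southern hemisphere.
    """
    zone = math.floor((lon + 180) / 6) % 60 + 1
    base = 32600 if lat >= 0 else 32700
    return base + zone


print(utm_epsg(0.15, 52.05))    # tile east of the prime meridian
print(utm_epsg(-0.15, 51.45))   # tile west of the prime meridian
```

Neighbouring tiles on either side of a zone boundary, such as the prime meridian, can therefore end up in different projections, which is worth keeping in mind before mosaicking exported GeoTIFFs.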
Environment Variables
# Set custom cache directory for downloaded files
export TESSERA_DATA_DIR=/path/to/cache
# Use local registry directory
export TESSERA_REGISTRY_DIR=/path/to/tessera-manifests
# Configure per-command
TESSERA_DATA_DIR=/tmp/cache geotessera download ...
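The same override can be read from a Python script, for example to report where downloads will land. The sketch below is an assumption about the lookup order, not the library's actual logic: resolve_cache_dir is a hypothetical helper, and the fallback path is taken from the Local Cache Structure listing.

```python
import os
from pathlib import Path


def resolve_cache_dir(environ=None):
    """Return TESSERA_DATA_DIR if set, else the documented default cache path."""
    environ = os.environ if environ is None else environ
    override = environ.get("TESSERA_DATA_DIR")
    if override:
        return Path(override).expanduser()
    # Assumed default, as shown under Local Cache Structure
    return Path.home() / ".cache" / "geotessera"


print(resolve_cache_dir())
print(resolve_cache_dir({"TESSERA_DATA_DIR": "/tmp/cache"}))
```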
Contributing
Contributions are welcome! Please see our Contributing Guide for details.
This project is licensed under the ISC License - see the LICENSE file for details.
Citation
If you use Tessera in your research, please cite the arXiv paper:
@misc{feng2025tesseratemporalembeddingssurface,
title={TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis},
author={Zhengpeng Feng and Clement Atzberger and Sadiq Jaffer and Jovana Knezevic and Silja Sormunen and Robin Young and Madeline C Lisaius and Markus Immitzer and David A. Coomes and Anil Madhavapeddy and Andrew Blake and Srinivasan Keshav},
year={2025},
eprint={2506.20380},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2506.20380},
}
Owner metadata
- Name: Cambridge Centre for Earth Observation
- Login: ucam-eo
- Email:
- Kind: organization
- Description: Cambridge University and partner organisations are conducting research on the remote sensing of environmental change
- Website: https://eo.conservation.cam.ac.uk
- Location: United Kingdom
- Twitter:
- Company:
- Icon url: https://avatars.githubusercontent.com/u/217043827?v=4
- Repositories: 1
- Last synced at: 2025-06-27T11:47:06.801Z
- Profile URL: https://github.com/ucam-eo
GitHub Events
Total
- Issues event: 43
- Watch event: 87
- Member event: 1
- Issue comment event: 68
- Push event: 93
- Pull request event: 13
- Fork event: 10
- Create event: 10
Last Year
- Issues event: 43
- Watch event: 87
- Member event: 1
- Issue comment event: 68
- Push event: 93
- Pull request event: 13
- Fork event: 10
- Create event: 10
Committers metadata
Last synced: about 2 months ago
Total Commits: 169
Total Committers: 9
Avg Commits per committer: 18.778
Development Distribution Score (DDS): 0.189
Commits in past year: 169
Committers in past year: 9
Avg Commits per committer in past year: 18.778
Development Distribution Score (DDS) in past year: 0.189
| Name | Email | Commits |
|---|---|---|
| Anil Madhavapeddy | a****l@r****g | 137 |
| frankfeng | f****g@d****e | 14 |
| Robin Young | 5****g | 6 |
| Sadiq Jaffer | s****q@t****m | 3 |
| Frank Feng | 6****3 | 3 |
| E-Ping Rau | e****s@g****m | 2 |
| Nicolas Karasiak | n****k@e****m | 2 |
| Srinivasan Keshav | 6****8 | 1 |
| GitHub Actions Bot | a****s@g****m | 1 |
Committer domains:
- github.com: 1
- earthdaily.com: 1
- toao.com: 1
- recoil.org: 1
Issue and Pull Request metadata
Last synced: about 2 months ago
Total issues: 44
Total pull requests: 8
Average time to close issues: 5 days
Average time to close pull requests: 4 days
Total issue authors: 39
Total pull request authors: 6
Average comments per issue: 1.11
Average comments per pull request: 0.75
Merged pull request: 5
Bot issues: 0
Bot pull requests: 0
Past year issues: 44
Past year pull requests: 8
Past year average time to close issues: 5 days
Past year average time to close pull requests: 4 days
Past year issue authors: 39
Past year pull request authors: 6
Past year average comments per issue: 1.11
Past year average comments per pull request: 0.75
Past year merged pull request: 5
Past year bot issues: 0
Past year bot pull requests: 0
Top Issue Authors
- epingchris (3)
- barbarametzler (2)
- kt-sa7716 (2)
- ratsakatika (2)
- rbnyng (1)
- DalelanW (1)
- sampathyetiraj-cpu (1)
- CBonannella (1)
- JBehanRio (1)
- jfprieur (1)
- RossDF (1)
- Rudigithub12345 (1)
- jdoblas (1)
- miquel-espinosa (1)
- yoshitos (1)
Top Pull Request Authors
- avsm (3)
- sadiqj (1)
- epingchris (1)
- nkarasiak (1)
- olli4 (1)
- rbnyng (1)
Top Issue Labels
- embedding-request (32)
- enhancement (2)
- bug (1)
Top Pull Request Labels
Package metadata
- Total packages: 1
- Total downloads:
  - pypi: 698 last-month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 8
- Total maintainers: 2
pypi.org: geotessera
Python library interface to the Tessera geofoundation model embeddings
- Homepage: https://github.com/ucam-eo/geotessera
- Documentation: https://geotessera.readthedocs.io
- Licenses: ISC License Copyright 2025 Anil Madhavapeddy <anil@recoil.org> Copyright 2025 Frank Feng Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies. THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
- Latest release: 0.6.0 (published 3 months ago)
- Last Synced: 2025-10-30T08:52:17.060Z (about 2 months ago)
- Versions: 8
- Dependent Packages: 0
- Dependent Repositories: 0
- Downloads: 698 Last month
- Rankings:
  - Dependent packages count: 8.836%
  - Average: 29.312%
  - Dependent repos count: 49.788%
- Maintainers (2)
Dependencies
- matplotlib *
- numpy *
- pooch *
- tqdm >=4.67.1
- actions/checkout v4 composite
- astral-sh/setup-uv v5 composite
- stefanzweifel/git-auto-commit-action v5 composite
- affine 2.4.0
- alabaster 1.0.0
- attrs 25.3.0
- babel 2.17.0
- certifi 2025.6.15
- charset-normalizer 3.4.2
- click 8.2.1
- click-plugins 1.1.1.2
- cligj 0.7.2
- colorama 0.4.6
- contourpy 1.3.2
- cycler 0.12.1
- docutils 0.21.2
- fonttools 4.58.5
- geopandas 1.1.1
- geotessera 0.2.0
- idna 3.10
- imagesize 1.4.1
- jinja2 3.1.6
- kiwisolver 1.4.8
- markdown-it-py 3.0.0
- markupsafe 3.0.2
- matplotlib 3.10.3
- mdurl 0.1.2
- numpy 2.3.1
- packaging 25.0
- pandas 2.3.0
- pillow 11.3.0
- platformdirs 4.3.8
- pooch 1.8.2
- pygments 2.19.2
- pyogrio 0.11.0
- pyparsing 3.2.3
- pyproj 3.7.1
- python-dateutil 2.9.0.post0
- pytz 2025.2
- rasterio 1.4.3
- requests 2.32.4
- rich 14.0.0
- roman-numerals-py 3.1.0
- shapely 2.1.1
- six 1.17.0
- snowballstemmer 3.0.1
- sphinx 8.2.3
- sphinxcontrib-applehelp 2.0.0
- sphinxcontrib-devhelp 2.0.0
- sphinxcontrib-htmlhelp 2.1.0
- sphinxcontrib-jsmath 1.0.1
- sphinxcontrib-qthelp 2.0.0
- sphinxcontrib-serializinghtml 2.0.0
- tqdm 4.67.1
- tzdata 2025.2
- urllib3 2.5.0
Score: 14.164405314584045