satlas
Aims to provide open AI-generated geospatial data that is highly accurate, available globally, and updated on a frequent (monthly) basis.
https://github.com/allenai/satlas
Category: Sustainable Development
Sub Category: Data Catalogs and Interfaces
Last synced: about 10 hours ago
Repository metadata
- Host: GitHub
- URL: https://github.com/allenai/satlas
- Owner: allenai
- License: apache-2.0
- Created: 2023-01-06T19:31:26.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-09-24T16:45:12.000Z (7 months ago)
- Last Synced: 2025-04-25T12:45:54.038Z (2 days ago)
- Language: Python
- Size: 397 KB
- Stars: 235
- Watchers: 10
- Forks: 30
- Open Issues: 3
- Releases: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README.md
Satlas: Open AI-Generated Geospatial Data
Satlas aims to provide open AI-generated geospatial data that is highly accurate, available globally, and updated on a frequent (monthly) basis.
For an introduction to Satlas, see https://satlas.allen.ai/.
Quick links:
- Download SatlasPretrain, our large-scale remote sensing dataset.
- Download and fine-tune our foundation models for remote sensing. These models are pre-trained on SatlasPretrain.
- Download the AI-generated geospatial data in Satlas for offline analysis.
- Access Satlas super-resolution data and code.
- View ongoing Satlas projects.
Overview
The AI-generated geospatial data in Satlas is computed by applying deep learning models on Sentinel-2 satellite imagery, which is open imagery released by the European Space Agency.
The images are relatively low-resolution, at 10 m/pixel, but they are captured frequently: the bulk of Earth's land mass is imaged weekly by Sentinel-2. We retrieve these images and update the geospatial data products on a monthly basis.
Training Data and Models
The models in Satlas are developed in four phases:
- Pre-train models on SatlasPretrain.
- Annotate high-quality task-specific training labels.
- Fine-tune models on the task-specific labels.
- Test the models on the whole world, and iterate on the training data until the models provide high accuracy.
SatlasPretrain
SatlasPretrain is a large-scale remote sensing image understanding dataset appearing in ICCV 2023.
It contains 302M labels under 137 categories, collected through a combination of crowdsourced annotation and processing existing data sources like OpenStreetMap.
Pre-training on SatlasPretrain helps to improve the downstream performance of our models when fine-tuning on the smaller sets of task-specific labels.
See https://satlas-pretrain.allen.ai/ for more information on SatlasPretrain. You can also download the dataset or download and fine-tune the pre-trained models.
Task-Specific Labels and Model Weights
The fine-tuning training data and model weights can be downloaded at https://pub-956f3eb0f5974f37b9228e0a62f449bf.r2.dev/satlas_explorer_datasets/satlas_explorer_datasets_2023-07-24.tar.
This download link contains an archive with four folders:
- base_models/ contains models trained on SatlasPretrain that are used as initialization for fine-tuning.
- labels/ contains the fine-tuning task-specific training data.
- models/ contains the trained model weights.
- splits/ contains metadata about the training and validation splits.
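After extracting the archive, a quick check that the four folders are present can catch an incomplete download early. The sketch below assumes the archive extracts to a satlas_explorer_datasets directory with the four folders directly beneath it; the helper name missing_folders is hypothetical and not part of the Satlas code.

```python
from pathlib import Path

# Folder names as listed in the archive description above.
EXPECTED_FOLDERS = ("base_models", "labels", "models", "splits")


def missing_folders(root):
    """Return the expected folder names that are not directories under root."""
    root = Path(root)
    return [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]


# Assumption: the tar was extracted in the current working directory.
missing = missing_folders("satlas_explorer_datasets")
print("missing folders:", missing or "none")
```

An empty result means all four folders were found; any names printed point to folders that still need to be extracted.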
There is also a smaller (1.5 GB) download with just the model weights.
See Using the Code below for details on training and applying models.
The format of the task-specific datasets is described in DatasetSpec.md.
The models are trained to make predictions from multiple Sentinel-2 images.
They first extract features from each image independently through a Swin Transformer.
They then apply temporal max pooling on corresponding feature maps at each of four resolutions.
The pooled feature maps are then passed to task-specific heads to make predictions.
See ModelArchitecture.md for more details.
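The temporal max pooling step can be illustrated with a small NumPy sketch. It stands in for the real model only loosely: the random arrays are placeholders for Swin Transformer feature maps, and the channel counts and spatial sizes are made-up values, not the ones Satlas uses. The function name temporal_max_pool is hypothetical.

```python
import numpy as np


def temporal_max_pool(per_image_features):
    """Pool features over time.

    per_image_features: one entry per image; each entry is a list of feature
    maps (one per resolution), each shaped (channels, height, width).
    Returns one pooled map per resolution: the element-wise max over images.
    """
    num_scales = len(per_image_features[0])
    pooled = []
    for scale in range(num_scales):
        # Stack this resolution's maps along a new leading time axis.
        stacked = np.stack([feats[scale] for feats in per_image_features], axis=0)
        pooled.append(stacked.max(axis=0))
    return pooled


# Placeholder features for 3 images at 4 resolutions (shapes are illustrative).
rng = np.random.default_rng(0)
shapes = [(8, 32, 32), (16, 16, 16), (32, 8, 8), (64, 4, 4)]
features = [[rng.standard_normal(s) for s in shapes] for _ in range(3)]

pooled = temporal_max_pool(features)
print([p.shape for p in pooled])
```

Because the max is taken per element, the pooled maps keep the shape of a single image's feature maps regardless of how many images are supplied.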
AI-Generated Geospatial Data
The AI-generated geospatial data in Satlas can be downloaded here for offline analysis.
We have evaluated the accuracy of each model in terms of its precision and recall on each continent. View the Data Validation Report here.
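For reference, precision and recall follow their standard definitions: precision is TP / (TP + FP) and recall is TP / (TP + FN), where TP, FP, and FN count true positives, false positives, and false negatives. The sketch below uses made-up counts that are not taken from the Data Validation Report, and it does not cover how predictions are matched to ground truth for each task.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from counts; 0.0 when a denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Illustrative counts only.
p, r = precision_recall(tp=90, fp=10, fn=30)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.90 recall=0.75
```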
Using the Code
Here we describe using the code for the task-specific training data. For using the code for pre-training models on SatlasPretrain, click here.
Training and Validation
First clone this repository and extract the training data to a subfolder called satlas_explorer_datasets:
git clone https://github.com/allenai/satlas
cd satlas
wget https://pub-956f3eb0f5974f37b9228e0a62f449bf.r2.dev/satlas_explorer_datasets/satlas_explorer_datasets_2023-07-24.tar
tar xvf satlas_explorer_datasets_2023-07-24.tar
Run training if desired (this will overwrite the models extracted from the tar download):
python -m satlas.cmd.model.train --config_path configs/satlas_explorer_wind_turbine.txt
python -m satlas.cmd.model.train --config_path configs/satlas_explorer_solar_farm.txt
python -m satlas.cmd.model.train --config_path configs/satlas_explorer_marine_infrastructure.txt
python -m satlas.cmd.model.train --config_path configs/satlas_explorer_tree_cover.txt
Compute precision and recall stats on the validation data:
python -m satlas.cmd.model.infer --config_path configs/satlas_explorer_wind_turbine.txt --details
python -m satlas.cmd.model.infer --config_path configs/satlas_explorer_solar_farm.txt --details
python -m satlas.cmd.model.infer --config_path configs/satlas_explorer_marine_infrastructure.txt --details
python -m satlas.cmd.model.infer --config_path configs/satlas_explorer_tree_cover.txt --details
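Since the four tasks use the same commands with different config files, the runs can also be scripted. The following is a sketch built from the commands shown above; the helpers build_command and run_validation are hypothetical, and the script assumes it is run from the root of the cloned repository.

```python
import subprocess
import sys

TASKS = ["wind_turbine", "solar_farm", "marine_infrastructure", "tree_cover"]


def build_command(module, task, details=False):
    """Build the argument list for one training or inference run."""
    cmd = [sys.executable, "-m", module,
           "--config_path", f"configs/satlas_explorer_{task}.txt"]
    if details:
        cmd.append("--details")
    return cmd


def run_validation():
    """Run inference with --details for every task, stopping on the first failure."""
    for task in TASKS:
        cmd = build_command("satlas.cmd.model.infer", task, details=True)
        subprocess.run(cmd, check=True)


# Print the commands without executing them; call run_validation() to run them.
for task in TASKS:
    print(" ".join(build_command("satlas.cmd.model.infer", task, details=True)))
```

Passing "satlas.cmd.model.train" as the module, without details, yields the corresponding training commands.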
Inference on Custom Images
See guide on applying Satlas/SatlasPretrain models on custom images.
Contact
If you have feedback about the code, data, or models, or if you would like to see new types of geospatial data that are feasible to produce from Sentinel-2 imagery,
you can contact us by opening an issue or via e-mail at [email protected].
Owner metadata
- Name: AI2
- Login: allenai
- Email: [email protected]
- Kind: organization
- Description:
- Website: http://www.allenai.org
- Location: Seattle, WA
- Twitter:
- Company:
- Icon url: https://avatars.githubusercontent.com/u/5667695?v=4
- Repositories: 454
- Last synced at: 2024-04-14T22:06:46.803Z
- Profile URL: https://github.com/allenai
GitHub Events
Total
- Issues event: 13
- Watch event: 39
- Issue comment event: 19
- Fork event: 7
Last Year
- Issues event: 13
- Watch event: 39
- Issue comment event: 19
- Fork event: 7
Committers metadata
Last synced: 7 days ago
Total Commits: 54
Total Committers: 6
Avg Commits per committer: 9.0
Development Distribution Score (DDS): 0.352
Commits in past year: 21
Committers in past year: 5
Avg Commits per committer in past year: 4.2
Development Distribution Score (DDS) in past year: 0.476
Name | Email | Commits |
---|---|---|
Favyen Bastani | f****b@a****g | 35 |
Favyen Bastani | f****i@p****m | 12 |
Piper Wolters | p****w@p****n | 4 |
Piper Wolters | p****w@p****n | 1 |
Piper Wolters | p****w@p****n | 1 |
Favyen Bastani | 9****2 | 1 |
Committer domains:
- prior-cirrascale-89.reviz.ai2.in: 1
- prior-cirrascale-64.reviz.ai2.in: 1
- prior-cirrascale-65.reviz.ai2.in: 1
- perennate.com: 1
- allenai.org: 1
Issue and Pull Request metadata
Last synced: 1 day ago
Total issues: 57
Total pull requests: 2
Average time to close issues: 18 days
Average time to close pull requests: about 3 hours
Total issue authors: 37
Total pull request authors: 2
Average comments per issue: 2.09
Average comments per pull request: 0.0
Merged pull request: 2
Bot issues: 0
Bot pull requests: 0
Past year issues: 15
Past year pull requests: 0
Past year average time to close issues: 5 days
Past year average time to close pull requests: N/A
Past year issue authors: 11
Past year pull request authors: 0
Past year average comments per issue: 2.2
Past year average comments per pull request: 0
Past year merged pull request: 0
Past year bot issues: 0
Past year bot pull requests: 0
Top Issue Authors
- robmarkcole (8)
- rbavery (6)
- Randomdude11 (4)
- alimkarimi (2)
- ShreelekhaR (2)
- moonboy12138 (2)
- srinify (2)
- ando-shah (2)
- ShileiCao (1)
- pyaada (1)
- samar-khanna (1)
- oguzhannysr (1)
- AlexeySudakovB01-109 (1)
- HuangShiqi128 (1)
- schmmd (1)
Top Pull Request Authors
- favyen2 (1)
- piperwolters (1)
Top Issue Labels
Top Pull Request Labels
Dependencies
- eyediagram *
- numpy *
- rasterio *
- scikit-image *
- scipy *
- shapely *
- torch *
- torchaudio *
- torchvision *
- tqdm *
- vit_pytorch *
Score: 7.26403014289953