WildlifeDatasets
Pipeline for wildlife re-identification including dataset zoo, training tools and trained models.
https://github.com/WildlifeDatasets/wildlife-datasets
Category: Biosphere
Sub Category: Terrestrial Wildlife
Keywords
dataset datasets deep-learning ecology ecology-modelling machine-learning
Last synced: about 17 hours ago
Repository metadata
WildlifeDatasets: An open-source toolkit for animal re-identification
- Host: GitHub
- URL: https://github.com/WildlifeDatasets/wildlife-datasets
- Owner: WildlifeDatasets
- License: mit
- Created: 2022-10-05T12:57:28.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2025-12-17T11:10:22.000Z (8 days ago)
- Last Synced: 2025-12-22T08:21:50.457Z (3 days ago)
- Topics: dataset, datasets, deep-learning, ecology, ecology-modelling, machine-learning
- Language: Jupyter Notebook
- Homepage: https://wildlifedatasets.github.io/wildlife-datasets/
- Size: 253 MB
- Stars: 142
- Watchers: 2
- Forks: 21
- Open Issues: 0
- Releases: 7
- Metadata Files:
  - Readme: README.md
  - License: LICENSE
README.md
| Dataset for identification of individual animals | Trained model for individual re‑identification | Tools for training re‑identification models |
Wildlife Re-Identification (Re-ID) Datasets
The aim of the project is to provide a comprehensive overview of datasets for wildlife individual re-identification and an easy-to-use package for developers of machine learning methods. The core functionality includes:
- overview of 50 publicly available wildlife re-identification datasets and 2 metadatasets.
- utilities to mass-download the datasets, convert them into a unified format and fix some incorrect labels.
- default splits for several machine learning tasks including the ability to create additional splits.
- synergy with WildlifeTools used for training ML models.
An introductory example is provided in a Jupyter notebook. The package pairs naturally with WildlifeTools, which provides our MegaDescriptor model and tools for training neural networks.
Do you know of an animal re-identification dataset that is not included? Please post it to the discussion forum.
Changelog
[18/08/2025] Reached 50 datasets by adding BristolGorillas2020 (primates), CattleMuzzle, CoBRAReIdentificationYoungstock, HolsteinCattleRecognition (cows), CzechLynx (lynxes) and WildRaptorID (eagles).
[14/04/2025] Added AnimalCLEF2025, WildlifeReID-10k (unifications of multiple datasets), MultiCamCows2024 (cows) and PrimFace (primates).
[31/10/2024] Added AmvrakikosTurtles, ReunionTurtles, SouthernProvinceTurtles, ZakynthosTurtles (sea turtles), ELPephants (elephants) and Chicks4FreeID (chickens).
[09/05/2024] Added CatIndividualImages (cats), CowDataset (cows) and DogFaceNet (dogs).
[28/02/2024] Added MPDD (dogs), PolarBearVidID (polar bears) and SeaStarReID2023 (sea stars).
[04/01/2024] Received Best paper award at WACV 2024.
Summary of datasets
An overview of the provided datasets is available in the documentation. We include basic characteristics such as publication years, number of images, number of individuals, dataset time spans (difference between the last and first image taken) and additional information such as source, number of poses, inclusion of timestamps, whether the animals were captured in the wild and whether the dataset contains multiple species.
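As a rough illustration of the "time span" characteristic described above (the difference between the last and first image taken), the following sketch computes it with pandas on a small made-up table. The column name `date` and the sample values are assumptions for illustration only, not taken from any of the listed datasets.

```python
import pandas as pd

# Hypothetical metadata table; the 'date' column name and values are assumptions.
df = pd.DataFrame({
    "identity": ["a", "a", "b"],
    "date": ["2014-03-01", "2015-03-01", "2014-06-15"],
})

# Parse the timestamps and take the difference between the last and first image.
dates = pd.to_datetime(df["date"])
span = dates.max() - dates.min()
span_days = span.days

print(f"Time span: {span_days} days")
```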
MetaDatasets
Datasets
Installation
Install the package with pip:
pip install wildlife-datasets
Adding new datasets
WildlifeDatasets is meant as a community effort to provide easy access to wildlife re-identification datasets. New datasets can be added as described in the documentation.
Basic functionality
We show an example of downloading, extracting and processing the MacaqueFaces dataset.
from wildlife_datasets import analysis, datasets
datasets.MacaqueFaces.get_data('data/MacaqueFaces')
dataset = datasets.MacaqueFaces('data/MacaqueFaces')
The dataset object stores the dataset's contents in its df attribute. The content depends on the dataset. Every dataset contains the identity and the path of each image. This particular dataset also contains the date each image was taken and its contrast. Other datasets store bounding boxes, segmentation masks, the position from which the image was taken, keypoints, or other information such as age or gender.
dataset.df
The dataset also contains basic metadata, including the number of individuals, time span, licences and year of publication.
dataset.summary
This particular dataset already contains cropped images of faces. Other datasets may contain uncropped images with bounding boxes or even segmentation masks.
dataset.plot_grid()
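Since every dataset lists an identity and an image path per row, simple statistics can be computed directly on that table. The sketch below uses a small hand-made DataFrame as a stand-in for `dataset.df`, so it runs without downloading anything; the column names `identity` and `path` and the individual names are assumptions for illustration.

```python
import pandas as pd

# Hypothetical stand-in for dataset.df; column names and values are assumptions.
df = pd.DataFrame({
    "identity": ["Dan", "Dan", "Kim", "Kim", "Kim", "Leo"],
    "path": [f"images/{i}.jpg" for i in range(6)],
})

# Number of distinct individuals and number of images per individual.
n_individuals = df["identity"].nunique()
images_per_identity = df["identity"].value_counts()

print(n_individuals)
print(images_per_identity)
```

With a real dataset, the same two calls could be applied to `dataset.df`, provided its identity column carries this name.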

Additional functionality
For additional functionality, including mass loading, dataset splitting and evaluation metrics, see the documentation or the notebooks.
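To give a feel for what dataset splitting involves, here is a minimal pandas/NumPy sketch of a split in which every individual with at least two images appears in both the training and the test set. It is not the package's own implementation; the function name `closed_set_split`, its parameters and the `identity` column name are hypothetical, and the package's documented splitters should be preferred in practice.

```python
import numpy as np
import pandas as pd

def closed_set_split(df, ratio_train=0.8, seed=0, col="identity"):
    """Hypothetical helper: split the images of each identity into train/test."""
    rng = np.random.default_rng(seed)
    idx_train = []
    for _, group in df.groupby(col):
        idx = rng.permutation(group.index.to_numpy())
        n_train = int(round(ratio_train * len(idx)))
        # Keep at least one image on each side when the identity has 2+ images.
        if len(idx) > 1:
            n_train = min(max(n_train, 1), len(idx) - 1)
        idx_train.extend(idx[:n_train])
    idx_train = np.array(sorted(idx_train))
    idx_test = np.setdiff1d(df.index.to_numpy(), idx_train)
    return idx_train, idx_test

# Toy table with two individuals and five images each (made-up data).
df = pd.DataFrame({
    "identity": ["a"] * 5 + ["b"] * 5,
    "path": [f"images/{i}.jpg" for i in range(10)],
})
idx_train, idx_test = closed_set_split(df)
df_train, df_test = df.loc[idx_train], df.loc[idx_test]
```

Splitting per identity rather than over the whole table keeps rare individuals from ending up only on one side of the split.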
Additional datasets
For a list of additional datasets not included in WildlifeDatasets, see this webpage.
Citation
If you like our package, please cite our paper. You may also be interested in our SeaTurtleID2022 dataset published in another paper.
@InProceedings{Cermak_2024_WACV,
author = {\v{C}erm\'ak, Vojt\v{e}ch and Picek, Luk\'a\v{s} and Adam, Luk\'a\v{s} and Papafitsoros, Kostas},
title = {{WildlifeDatasets: An Open-Source Toolkit for Animal Re-Identification}},
booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
month = {January},
year = {2024},
pages = {5953-5963}
}
Owner metadata
- Name: WildlifeDatasets
- Login: WildlifeDatasets
- Email:
- Kind: organization
- Description:
- Website:
- Location:
- Twitter:
- Company:
- Icon url: https://avatars.githubusercontent.com/u/115085642?v=4
- Repositories: 1
- Last synced at: 2023-03-25T20:55:01.880Z
- Profile URL: https://github.com/WildlifeDatasets
GitHub Events
Total
- Create event: 5
- Release event: 3
- Issues event: 7
- Watch event: 57
- Delete event: 3
- Member event: 1
- Issue comment event: 8
- Push event: 100
- Pull request event: 6
- Fork event: 10
Last Year
- Create event: 4
- Issues event: 4
- Release event: 2
- Watch event: 42
- Delete event: 3
- Issue comment event: 2
- Member event: 1
- Push event: 45
- Pull request event: 6
- Fork event: 9
Committers metadata
Last synced: 11 days ago
Total Commits: 841
Total Committers: 7
Avg Commits per committer: 120.143
Development Distribution Score (DDS): 0.493
Commits in past year: 140
Committers in past year: 4
Avg Commits per committer in past year: 35.0
Development Distribution Score (DDS) in past year: 0.064
| Name | Email | Commits |
|---|---|---|
| sadda | l****r@g****m | 426 |
| adamluk3 | a****3@l****z | 337 |
| cermavo3 | c****3@l****z | 46 |
| Vojtech Cermak | c****h@s****z | 23 |
| Karl Ahrendsen | k****n@g****m | 4 |
| white-richard | r****5@y****m | 3 |
| Lukas Picek | l****k@g****m | 2 |
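The summary figures above can be reproduced from the table. The sketch below assumes the Development Distribution Score is defined as one minus the top committer's share of all commits; that definition is an assumption here, although it is consistent with the reported total values.

```python
# Commit counts copied from the committers table above.
commits = {
    "sadda": 426,
    "adamluk3": 337,
    "cermavo3": 46,
    "Vojtech Cermak": 23,
    "Karl Ahrendsen": 4,
    "white-richard": 3,
    "Lukas Picek": 2,
}

total = sum(commits.values())
avg_per_committer = total / len(commits)

# Assumed definition: DDS = 1 - (commits by the top committer / total commits).
dds = 1 - max(commits.values()) / total

print(total, round(avg_per_committer, 3), round(dds, 3))
```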
Committer domains:
- login1.rci.cvut.cz: 2
- seznam.cz: 1
Issue and Pull Request metadata
Last synced: 8 days ago
Total issues: 6
Total pull requests: 5
Average time to close issues: 4 months
Average time to close pull requests: 12 days
Total issue authors: 5
Total pull request authors: 3
Average comments per issue: 1.83
Average comments per pull request: 0.0
Merged pull request: 3
Bot issues: 0
Bot pull requests: 0
Past year issues: 1
Past year pull requests: 5
Past year average time to close issues: 5 days
Past year average time to close pull requests: 12 days
Past year issue authors: 1
Past year pull request authors: 3
Past year average comments per issue: 2.0
Past year average comments per pull request: 0.0
Past year merged pull request: 3
Past year bot issues: 0
Past year bot pull requests: 0
Top Issue Authors
- mfruhner (2)
- zhoumu53 (1)
- ahrendsen (1)
- VojtechCermak (1)
- MatthiasZuerl (1)
Top Pull Request Authors
- ahrendsen (2)
- picekl (2)
- RoberAlcaraz (1)
Top Issue Labels
Top Pull Request Labels
Package metadata
- Total packages: 3
- Total downloads:
  - pypi: 581 last-month
- Total dependent packages: 1 (may contain duplicates)
- Total dependent repositories: 0 (may contain duplicates)
- Total versions: 69
- Total maintainers: 2
proxy.golang.org: github.com/WildlifeDatasets/wildlife-datasets
- Homepage:
- Documentation: https://pkg.go.dev/github.com/WildlifeDatasets/wildlife-datasets#section-documentation
- Licenses: mit
- Latest release: v1.0.7 (published 4 months ago)
- Last Synced: 2025-12-21T22:07:15.248Z (4 days ago)
- Versions: 8
- Dependent Packages: 0
- Dependent Repositories: 0
- Rankings:
  - Dependent packages count: 5.395%
  - Average: 5.576%
  - Dependent repos count: 5.758%
proxy.golang.org: github.com/wildlifedatasets/wildlife-datasets
- Homepage:
- Documentation: https://pkg.go.dev/github.com/wildlifedatasets/wildlife-datasets#section-documentation
- Licenses: mit
- Latest release: v1.0.7 (published 4 months ago)
- Last Synced: 2025-12-21T22:07:15.738Z (4 days ago)
- Versions: 8
- Dependent Packages: 0
- Dependent Repositories: 0
- Rankings:
  - Dependent packages count: 5.395%
  - Average: 5.576%
  - Dependent repos count: 5.758%
pypi.org: wildlife-datasets
Library for easier access and research of wildlife re-identification datasets
- Homepage: https://github.com/WildlifeDatasets/wildlife-datasets
- Documentation: https://wildlifedatasets.github.io/wildlife-datasets/
- Licenses: mit
- Latest release: 1.0.7 (published 4 months ago)
- Last Synced: 2025-12-21T22:07:14.285Z (4 days ago)
- Versions: 53
- Dependent Packages: 1
- Dependent Repositories: 0
- Downloads: 581 Last month
- Rankings:
  - Downloads: 5.006%
  - Dependent packages count: 6.633%
  - Average: 20.189%
  - Stargazers count: 28.203%
  - Forks count: 30.492%
  - Dependent repos count: 30.611%
- Maintainers (2)
Score: 13.2716381894848