BatteryLife
A Comprehensive Dataset and Benchmark for Battery Life Prediction.
https://github.com/ruifeng-tan/batterylife
Category: Energy Storage
Sub Category: Battery
Last synced: about 11 hours ago
Repository metadata
The official BatteryLife repository
- Host: GitHub
- URL: https://github.com/ruifeng-tan/batterylife
- Owner: Ruifeng-Tan
- License: mit
- Created: 2025-02-21T02:13:04.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2026-03-16T09:20:32.000Z (8 days ago)
- Last Synced: 2026-03-16T13:37:20.114Z (8 days ago)
- Language: Jupyter Notebook
- Size: 94.8 MB
- Stars: 234
- Watchers: 3
- Forks: 35
- Open Issues: 4
- Releases: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README.md
(KDD 2025) BatteryLife
This is the official repository for BatteryLife: A Comprehensive Dataset and Benchmark for Battery Life Prediction. If you find this repository useful, we would appreciate citations to our paper and stars to this repository.
🚩 News (2026.02) BatteryLife has exceeded 30,000 downloads. BatteryLife v10 is now released, with fixes for issues reported over the past year (update details are available here). We sincerely appreciate the support from the community.
🚩News (2025.10) Added the standardized SDU dataset to BatteryLife. Corrected the time_in_s column for all batteries.
🔥News (2025.08) BatteryLife downloads exceed 10,000.
🔥News (2025.07) BatteryLife downloads exceed 7,000.
🔥News (2025.06) BatteryLife downloads exceed 5,000.
🚩News (2025.06) Added the complete Stanford dataset as "Stanford_2" (now including both releases of the Stanford dataset).
🚩News (2025.05) BatteryLife was accepted by KDD 2025.
🔥News (2025.05) BatteryLife downloads exceed 3,000.
🚩News (2025.02) BatteryLife was released!
Highlights
(Data statistics are based on the initial release of BatteryLife.)
- The largest battery life dataset: BatteryLife is created by integrating 16 datasets, providing 99,000 samples from 990 batteries with life labels. This is 2.5 times the size of BatteryML, which is the previous largest battery life resource.
- The most diverse battery life dataset: BatteryLife contains 8 battery formats, 59 chemical systems, 9 operating temperatures, and 421 charge/discharge protocols. Compared with the previous largest battery life resource (BatteryML), BatteryLife offers 4 times as many formats, 11.8 times as many chemical systems, 1.8 times as many operating temperatures, and 2.2 times as many charge/discharge protocols.
- A comprehensive benchmark for battery life prediction: BatteryLife provides 18 benchmark methods with open-source code in this repository. The 18 benchmark methods include popular methods for battery life prediction, popular baselines in time series analysis, and a series of baselines proposed by this work.
Data availability
The processed datasets can be accessed in several ways:
- You can download the datasets from Huggingface [tutorial].
- You can download the datasets from Zenodo.
Note that brief introductions to each dataset are available under the directory of each dataset.
All the raw datasets are publicly available. Interested users can download them from the following links:
- Zn-ion, Na-ion, and CALB datasets: Zenodo link Huggingface link [tutorial]
- CALCE: link
- MATR: Three batches and Batch 9
- HUST: link
- RWTH: link
- ISU_ILCC: link
- XJTU: link
- Tongji: link
- Stanford: link
- HNEI, SNL, MICH, MICH_EXP and UL_PUR datasets: BatteryArchive.
- SDU dataset: link.
Benchmark results of Battery Life Prediction (BLP) task
Benchmark results for battery life prediction. The compared methods fall into five types:
- Dummy, a baseline that uses the mean of training labels as the prediction.
- MLPs, a series of multilayer perceptron models including DLinear, MLP, and CPMLP.
- Transformers, a series of transformer models including PatchTST, Autoformer, iTransformer, Transformer, and CPTransformer.
- CNNs, a series of convolutional neural network models including CNN and MICN.
- RNNs, a series of recurrent neural network models including CPGRU, CPBiGRU, CPLSTM, CPBiLSTM, GRU, BiGRU, LSTM, and BiLSTM.
| Datasets | Li-ion | Li-ion | Zn-ion | Zn-ion | Na-ion | Na-ion | CALB | CALB |
|---|---|---|---|---|---|---|---|---|
| Metrics | MAPE | 15%-Acc | MAPE | 15%-Acc | MAPE | 15%-Acc | MAPE | 15%-Acc |
| Dummy | 0.831±0.000 | 0.296±0.000 | 1.297±0.214 | 0.083±0.047 | 0.404±0.029 | 0.067±0.094 | 1.811±0.550 | 0.267±0.094 |
| DLinear | 0.586±0.028 | 0.275±0.017 | 0.814±0.026 | 0.124±0.020 | 0.319±0.031 | 0.329±0.042 | 0.164±0.049 | 0.601±0.114 |
| MLP | 0.233±0.010 | 0.503±0.013 | 0.805±0.103 | 0.079±0.055 | 0.281±0.067 | 0.364±0.098 | 0.149±0.014 | 0.641±0.115 |
| CPMLP | 0.179±0.003 | 0.620±0.004 | 0.558±0.034 | 0.297±0.084 | 0.274±0.026 | 0.337±0.038 | 0.140±0.009 | 0.704±0.053 |
| PatchTST | 0.288±0.042 | 0.430±0.053 | 0.716±0.024 | 0.133±0.001 | 0.396±0.094 | 0.258±0.070 | 0.347±0.045 | 0.511±0.139 |
| Autoformer | 0.437±0.093 | 0.287±0.067 | 0.987±0.243 | 0.106±0.039 | 0.372±0.047 | 0.177±0.128 | 0.761±0.061 | 0.329±0.121 |
| iTransformer | 0.209±0.015 | 0.516±0.028 | 0.690±0.110 | 0.188±0.037 | 0.321±0.087 | 0.249±0.178 | 0.164±0.020 | 0.649±0.044 |
| Transformer | - | - | - | - | - | - | - | - |
| CPTransformer | 0.184±0.003 | 0.573±0.016 | 0.515±0.067 | 0.202±0.084 | 0.255±0.036 | 0.406±0.084 | 0.149±0.005 | 0.672±0.107 |
| CNN | 0.337±0.068 | 0.371±0.050 | 0.928±0.093 | 0.115±0.029 | 0.307±0.047 | 0.273±0.027 | 0.278±0.011 | 0.582±0.032 |
| MICN | 0.249±0.004 | 0.494±0.019 | 0.579±0.101 | 0.227±0.127 | 0.305±0.040 | 0.335±0.065 | 0.233±0.050 | 0.471±0.257 |
| CPGRU | 0.189±0.008 | 0.585±0.013 | 0.616±0.049 | 0.289±0.076 | 0.298±0.063 | 0.203±0.160 | 0.141±0.012 | 0.681±0.178 |
| CPBiGRU | 0.190±0.001 | 0.566±0.034 | 0.774±0.202 | 0.193±0.156 | 0.282±0.055 | 0.395±0.008 | 0.160±0.015 | 0.686±0.063 |
| CPLSTM | 0.196±0.006 | 0.585±0.020 | 0.932±0.227 | 0.085±0.028 | 0.272±0.051 | 0.386±0.009 | 0.156±0.073 | 0.613±0.153 |
| CPBiLSTM | 0.191±0.007 | 0.421±0.255 | 0.645±0.049 | 0.150±0.104 | 0.299±0.043 | 0.399±0.001 | 0.173±0.075 | 0.663±0.247 |
| GRU&BiGRU | NA | NA | NA | NA | NA | NA | NA | NA |
| LSTM&BiLSTM | NA | NA | NA | NA | NA | NA | NA | NA |
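The table reports MAPE and 15%-Acc, and the Dummy row uses the mean of the training labels as its prediction. The sketch below shows one plausible way to compute these quantities. It assumes that 15%-Acc is the fraction of predictions whose relative error is at most 15%, which this page does not define explicitly; the function names and label values are illustrative and are not taken from the repository.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, as a fraction (0.18 means 18%)."""
    return sum(abs(p - t) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


def acc_within(y_true, y_pred, tol=0.15):
    """Fraction of predictions whose relative error is at most `tol`.

    Assumed definition of 15%-Acc; check the paper for the exact one.
    """
    hits = sum(abs(p - t) / abs(t) <= tol for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


def dummy_predict(train_labels, n_test):
    """Dummy baseline: predict the mean training label for every test sample."""
    mean_label = sum(train_labels) / len(train_labels)
    return [mean_label] * n_test


# Made-up cycle-life labels, not taken from BatteryLife.
train_labels = [500, 1000, 1500]
test_labels = [800, 1000, 2000]

preds = dummy_predict(train_labels, len(test_labels))
print(f"MAPE: {mape(test_labels, preds):.3f}")
print(f"15%-Acc: {acc_within(test_labels, preds):.3f}")
```

With these toy labels the dummy prediction is 1000 for every test battery, so only the second battery falls within the 15% band.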
Quick start
Install
pip install -r requirements.txt
# You should also install BatteryML (https://github.com/microsoft/BatteryML)
Preprocessing [tutorial]
After downloading all raw datasets provided in "Data availability" section, you can run the following script to obtain the processed datasets:
python preprocess_scripts.py
If you download the processed datasets, you can skip this step.
During the development of BatteryLife, we often found that processed data still contained issues after processing. Based on that experience, we provide Jupyter notebooks in the ./check_data_scripts/ folder for double-checking the processed data. Quickly confirming that all characteristic curves look as expected helps avoid downstream problems.
- check_capacity_curves.ipynb: checks the charge and discharge capacity curves of the batteries.
- check_soh_curves.ipynb: checks the degradation trajectory of the batteries.
- check_voltage_current_curves.ipynb: checks the voltage and current curves of the batteries.
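As a rough illustration of the kind of check these notebooks support, the sketch below flags non-physical points in a degradation trajectory. It is not the repository's code: the function names, the input format (a plain list of per-cycle discharge capacities), and the thresholds are all assumptions made for this example.

```python
def soh_curve(discharge_capacities, nominal_capacity):
    """State of health per cycle: discharge capacity over nominal capacity."""
    return [q / nominal_capacity for q in discharge_capacities]


def find_suspicious_cycles(soh, max_jump=0.05, max_soh=1.2):
    """Return (cycle_index, reason) pairs for points that look non-physical.

    The thresholds are illustrative assumptions, not values used by BatteryLife.
    """
    issues = []
    for i, value in enumerate(soh):
        if value <= 0 or value > max_soh:
            issues.append((i, "SOH out of range"))
        elif i > 0 and abs(value - soh[i - 1]) > max_jump:
            issues.append((i, "abrupt jump from previous cycle"))
    return issues


# Made-up capacities in Ah for a hypothetical 1.1 Ah cell.
capacities = [1.10, 1.09, 1.08, 0.55, 1.07, 1.06]
soh = soh_curve(capacities, nominal_capacity=1.1)
for cycle, reason in find_suspicious_cycles(soh):
    print(f"cycle {cycle}: {reason}")
```

Note that a single outlier is reported twice here, once for the drop and once for the recovery, so flagged cycles are best inspected on a plot rather than removed automatically.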
How to calculate the statistical information of aging conditions for processed data:
- First, run the aging_conditions.py script to generate name2agingConditionID.json, which records the aging condition number for each battery.
- Second, run the dataset_overview_calculation.py script to calculate the aging-condition statistics for the preprocessed data.
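To make the idea of a name-to-condition mapping concrete, here is a minimal sketch of how batteries that share the same aging condition could be given the same ID. It is not the logic of aging_conditions.py: the battery names, the condition fields, and the ID numbering are hypothetical, and the actual JSON layout may differ.

```python
import json


def assign_aging_condition_ids(batteries):
    """Map each battery name to an integer ID shared by batteries
    that have identical aging conditions."""
    condition_to_id = {}
    name_to_id = {}
    for name, condition in batteries.items():
        # A sorted tuple of items makes the condition hashable and order-independent.
        key = tuple(sorted(condition.items()))
        if key not in condition_to_id:
            condition_to_id[key] = len(condition_to_id)
        name_to_id[name] = condition_to_id[key]
    return name_to_id


# Hypothetical battery names and condition fields.
batteries = {
    "cell_A": {"temperature_C": 25, "charge_protocol": "1C-CCCV", "discharge_protocol": "1C-CC"},
    "cell_B": {"temperature_C": 25, "charge_protocol": "1C-CCCV", "discharge_protocol": "1C-CC"},
    "cell_C": {"temperature_C": 45, "charge_protocol": "2C-CCCV", "discharge_protocol": "1C-CC"},
}

name2id = assign_aging_condition_ids(batteries)
print(json.dumps(name2id, indent=2))
print("distinct aging conditions:", len(set(name2id.values())))
```

Counting the distinct IDs in such a mapping is one way to arrive at a per-dataset total of aging conditions.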
Train the model [tutorial]
Before you start training, please move all processed dataset folders (such as HUST and MATR) and the Life labels folder (downloaded from Huggingface or Zenodo) into the ./dataset folder under the root folder.
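A quick way to confirm the layout before launching a training script is to check that the expected folders exist. The helper below is a hypothetical convenience, not part of the repository, and the folder names in `expected` (including "Life labels") are assumptions that should be adjusted to match what you downloaded.

```python
from pathlib import Path


def missing_dataset_folders(root, expected):
    """Return the expected folder names that are not directories under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]


# Example folder names; "Life labels" is an assumed name for the label folder.
expected = ["HUST", "MATR", "Life labels"]
missing = missing_dataset_folders("./dataset", expected)
if missing:
    print("Missing under ./dataset:", ", ".join(missing))
else:
    print("All expected folders found.")
```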
After that, just feel free to run any benchmark method. For example:
sh ./train_eval_scripts/CPTransformer.sh
Evaluate the model
To evaluate a model in detail, use the provided evaluation script:
sh ./train_eval_scripts/evaluate.sh
Fine-tuning [tutorial]
To fine-tune a pretrained model on another dataset, use the provided fine-tuning script and tutorial:
sh ./train_eval_scripts/finetune_script.sh
Domain adaptation [tutorial]
To perform domain adaptation to another dataset, use the provided domain adaptation script and tutorial:
sh ./train_eval_scripts/domain_adaptation_script.sh
Documentation
- The main information is described in our BatteryLife paper.
- The data structure of the standardized data is described in Data_structure_description.md.
- Further details of data statistics are available at Further_details_of_data_statistics.md.
- Further details of processed charge and discharge capacity data are available at Further_details_of_processed_charge_and_discharge_capacity_data.md.
- BatteryLife v10 update details are available at Version10_Update_Details.md.
Welcome contributions
Advancing AI4Battery requires standardized datasets. However, the available battery life datasets are typically stored in different places and in different formats. We have put great effort into integrating 13 previously available datasets and 3 of our own datasets. BatteryLife aims to become a unified platform for sharing standardized battery aging and lifetime datasets. We warmly welcome contributions from the community, whether by sharing new datasets or by standardizing existing ones according to the BatteryLife guidelines.
To further broaden the range of available resources, we list below several open-source but currently unprocessed datasets in the battery life domain:
If you are interested in contributing, please either submit a pull request or contact us via email at rtan474@connect.hkust-gz.edu.cn and whong719@connect.hkust-gz.edu.cn. To integrate your data into the BatteryLife repositories, please provide:
- Raw datasets
- Processed datasets
- Preprocessing scripts (for reproducibility)
- A list of contributors (for acknowledgment in the repo)
- Papers related to the data generation (we will prompt users to cite these in the repository's Citation section).
Citation
If you use the benchmark, processed datasets, or the raw datasets produced by this work, you should cite the BatteryLife paper:
@inproceedings{10.1145/3711896.3737372,
author = {Tan, Ruifeng and Hong, Weixiang and Tang, Jiayue and Lu, Xibin and Ma, Ruijun and Zheng, Xiang and Li, Jia and Huang, Jiaqiang and Zhang, Tong-Yi},
title = {BatteryLife: A Comprehensive Dataset and Benchmark for Battery Life Prediction},
year = {2025},
isbn = {9798400714542},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3711896.3737372},
doi = {10.1145/3711896.3737372},
booktitle = {Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2},
pages = {5789–5800},
numpages = {12},
location = {Toronto ON, Canada},
series = {KDD '25}
}
- Additionally, please cite the original papers that conducted experiments. Please cite BatteryArchive as the data source for the HNEI, SNL, MICH, MICH_EXP, and UL_PUR datasets.
- Please cite BatteryML if you use the processed CALCE, MATR, HUST, HNEI, RWTH, SNL, and UL_PUR datasets. Our preprocessing for these 7 datasets relies heavily on BatteryML's preprocessing scripts.
- Please cite the SDU paper if you use the SDU dataset.
Acknowledgement
This repo is constructed based on the following repos:
All thanks to our contributors
Owner metadata
- Name: Ruifeng T
- Login: Ruifeng-Tan
- Email:
- Kind: user
- Description: a first-year PhD student at HKUST(GZ)
- Website: https://Ruifeng-Tan.github.io
- Location: Guangzhou, Guangdong, China
- Twitter:
- Company: HKUST(GZ)
- Icon url: https://avatars.githubusercontent.com/u/50574059?u=8f24bdaae72a19bba14fbef6d04f3292fc866e32&v=4
- Repositories: 2
- Last synced at: 2023-03-07T20:46:49.520Z
- Profile URL: https://github.com/Ruifeng-Tan
GitHub Events
Total
- Member event: 1
- Pull request event: 5
- Fork event: 21
- Issues event: 19
- Watch event: 154
- Issue comment event: 24
- Push event: 125
- Create event: 3
Last Year
- Pull request event: 5
- Fork event: 17
- Issues event: 13
- Watch event: 82
- Issue comment event: 19
- Push event: 68
- Create event: 1
Committers metadata
Last synced: 4 days ago
Total Commits: 169
Total Committers: 5
Avg Commits per committer: 33.8
Development Distribution Score (DDS): 0.515
Commits in past year: 105
Committers in past year: 5
Avg Commits per committer in past year: 21.0
Development Distribution Score (DDS) in past year: 0.533
| Name | Email | Commits |
|---|---|---|
| Ruifeng Tan (谭瑞锋) | 5****n | 82 |
| Hong Weixiang | 1****4@q****m | 79 |
| HWX | h****x@p****s | 5 |
| Jintao Dong | d****o@o****m | 2 |
| Kevin Wang | 1****6 | 1 |
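The summary figures above can be reproduced from the commit counts in the table. The sketch below assumes that the Development Distribution Score is defined as one minus the top committer's share of all commits; that definition is not stated on this page, but it matches the all-time value listed here.

```python
# Commit counts copied from the committers table above.
commits = {
    "Ruifeng Tan": 82,
    "Hong Weixiang": 79,
    "HWX": 5,
    "Jintao Dong": 2,
    "Kevin Wang": 1,
}

total_commits = sum(commits.values())
avg_per_committer = total_commits / len(commits)

# Assumed definition: DDS = 1 - (commits by the top committer / total commits).
dds = 1 - max(commits.values()) / total_commits

print("Total commits:", total_commits)
print("Avg commits per committer:", round(avg_per_committer, 1))
print("DDS:", round(dds, 3))
```

Under this definition a score near 0 means one person wrote almost everything, while a score near 0.5, as here, reflects two committers contributing roughly equal shares.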
Committer domains:
Issue and Pull Request metadata
Last synced: 3 days ago
Total issues: 10
Total pull requests: 3
Average time to close issues: 13 days
Average time to close pull requests: N/A
Total issue authors: 5
Total pull request authors: 2
Average comments per issue: 1.3
Average comments per pull request: 0.0
Merged pull request: 0
Bot issues: 0
Bot pull requests: 0
Past year issues: 6
Past year pull requests: 3
Past year average time to close issues: about 4 hours
Past year average time to close pull requests: N/A
Past year issue authors: 4
Past year pull request authors: 2
Past year average comments per issue: 1.5
Past year average comments per pull request: 0.0
Past year merged pull request: 0
Past year bot issues: 0
Past year bot pull requests: 0
Top Issue Authors
- si520 (3)
- JaceJu-frog (3)
- bysdyc (2)
- yyyac (1)
- Norlan-Ch (1)
Top Pull Request Authors
- DJTGtao (2)
- KevinWang676 (1)
Top Issue Labels
Top Pull Request Labels
Dependencies
- BatteryML ==0.0.1
- Requests ==2.32.3
- accelerate ==0.29.3
- datasets ==2.19.0
- denseweight ==0.1.2
- evaluate ==0.4.1
- joblib ==1.4.0
- matplotlib ==3.8.4
- numpy ==2.2.3
- pandas ==2.2.3
- peft ==0.12.0
- reformer_pytorch ==1.4.4
- scikit_learn ==1.4.2
- scipy ==1.15.2
- seaborn ==0.13.2
- sympy ==1.12
- torch ==2.4.1
- tqdm ==4.66.2
- transformers ==4.43.4
- scikit-learn *
Score: 7.0817085861055755