BOxCrete
An open-source Bayesian optimization framework for probabilistic strength curve prediction and sustainable mix design of concrete.
https://github.com/facebookresearch/sustainableconcrete
Category: Consumption
Sub Category: Buildings and Heating
Last synced: about 4 hours ago
Repository metadata
Repository to track versions of concrete strength data, models, and active learning proposals.
- Host: GitHub
- URL: https://github.com/facebookresearch/sustainableconcrete
- Owner: facebookresearch
- License: mit
- Created: 2023-01-17T12:51:19.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2026-06-04T18:34:36.000Z (6 days ago)
- Last Synced: 2026-06-04T19:07:47.300Z (6 days ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 10.8 MB
- Stars: 190
- Watchers: 10
- Forks: 39
- Open Issues: 3
- Releases: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Citation: CITATION
README.md
BOxCrete: Bayesian Optimization for Sustainable Concrete Mix Design
🌐 Try the interactive explorer to predict concrete strength from mix composition in your browser, no installation needed.
Concrete, the second most widely used material in the world, accounts for 6–8% of global anthropogenic CO₂ emissions, largely due to Portland cement production (~0.8 tons CO₂ per ton of cement).
Here, we introduce BOxCrete, an open-source Bayesian optimization framework for probabilistic strength curve prediction and sustainable mix design.
We invite researchers and practitioners from all disciplines including AI, machine learning, computer science, materials science, and civil engineering
to collaborate on discovering more sustainable concrete formulations that are applicable
to a wide array of construction projects, at scale.
For more information,
please see "BOxCrete: A Bayesian Optimization Open-Source AI Model for Concrete Strength Forecasting and Mix Optimization".
This repository contains probabilistic models and data for
- the compressive strength of concrete and mortar mixes,
- the associated global warming potential (GWP), and
- slump, predicted using Gaussian Process regression with derived features,
as a function of mix composition, which consists of basic ingredients such as cement, fly ash, slag, fine and coarse aggregate, admixtures, and water. See boxcrete/models.py for implementation details.
Included Data
- BOxCrete data (data/boxcrete_data.csv): Combined mortar and concrete mix compositions with strength measurements at multiple curing ages, GWP values, and multiple material sources. This is the single unified dataset used for all model training.
Installation
Install directly from GitHub (no cloning required):
pip install git+https://github.com/facebookresearch/SustainableConcrete.git
Or install from source for development:
git clone https://github.com/facebookresearch/SustainableConcrete.git
cd SustainableConcrete
pip install -e .
For development (includes testing and linting tools):
pip install -e ".[dev]"
For running notebooks:
pip install -e ".[notebooks]"
Usage
import torch
from boxcrete.utils import load_concrete_strength, get_bounds
from boxcrete.models import SustainableConcreteModel
from boxcrete.plotting import plot_strength_curve
# Load data and fit models
data = load_concrete_strength()
data.bounds = get_bounds(data.X_columns)
model = SustainableConcreteModel(strength_days=[1, 28])
model.fit_gwp_model(data)
model.fit_strength_model(data)
# model.model_names shows the ordering: ["GWP", "1-day Strength", "28-day Strength"]
model_list = model.get_model_list()
# Plot strength curves: 100% cement vs 60% fly ash + 40% cement
cols = data.X_columns[:-1] # composition columns (without Time)
compositions = torch.zeros(2, len(cols))
compositions[0, cols.index("Cement (kg/m3)")] = 500.0 # 100% cement
compositions[1, cols.index("Cement (kg/m3)")] = 200.0 # 40% cement
compositions[1, cols.index("Fly Ash (kg/m3)")] = 300.0 # 60% fly ash
plot_strength_curve(model, compositions)
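As a rough sanity check on the two compositions above, the cement-related emissions can be estimated directly from the figure of ~0.8 tons CO₂ per ton of cement quoted in the introduction. The sketch below is plain arithmetic and is not the repository's fitted GWP model; the function name cement_co2_kg_per_m3 is hypothetical, and emissions from all other ingredients are ignored.

```python
# Approximate emission factor from the introduction: ~0.8 t CO2 per t cement,
# i.e. ~0.8 kg CO2 per kg cement. Contributions of other ingredients are ignored.
CEMENT_CO2_FACTOR = 0.8


def cement_co2_kg_per_m3(cement_kg_per_m3: float) -> float:
    """Approximate CO2 attributable to cement alone, in kg CO2 per m3 of concrete."""
    return CEMENT_CO2_FACTOR * cement_kg_per_m3


pure_cement = cement_co2_kg_per_m3(500.0)  # 100% cement mix
blended = cement_co2_kg_per_m3(200.0)      # 40% cement + 60% fly ash mix
reduction = 1.0 - blended / pure_cement    # relative saving from cement alone

print(f"{pure_cement:.0f} vs {blended:.0f} kg CO2/m3 ({reduction:.0%} lower)")
```

The estimate only bounds the cement contribution; the fitted GWP model in the repository should be preferred for actual comparisons.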
Slump Prediction
The slump model uses a SingleTaskGP with an AppendDerivedFeatures input transform
that automatically computes the ratio of high-range water reducer (HRWR) to binder, a key determinant of concrete
workability. Slump prediction is opt-in: pass SLUMP_Y_COLUMNS to include slump
when loading data:
from boxcrete.utils import load_concrete_strength, SLUMP_Y_COLUMNS
# Load data with slump (opt-in)
data = load_concrete_strength(Y_columns=SLUMP_Y_COLUMNS)
# Fit the slump model (in addition to GWP and strength);
# `model` and `compositions` are the objects created in the Usage example above
model.fit_slump_model(data)
# Get slump predictions for a composition
slump_post = model.slump_model.posterior(compositions)
print(f"Predicted slump: {slump_post.mean}")
See notebooks/slump_prediction_demo.ipynb for a complete walkthrough including calibration plots, LOO cross-validation, and feature importance.
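To illustrate what a derived-feature transform of this kind computes, here is a minimal sketch in plain Python. It is not the repository's AppendDerivedFeatures implementation: the "Slag (kg/m3)" and "HRWR (kg/m3)" column names, the definition of binder as cement plus fly ash plus slag, and the helper append_hrwr_to_binder are all assumptions made for illustration.

```python
from typing import Dict, List

# Assumed column names; only the cement and fly ash names appear in the usage example.
BINDER_COLUMNS: List[str] = ["Cement (kg/m3)", "Fly Ash (kg/m3)", "Slag (kg/m3)"]
HRWR_COLUMN = "HRWR (kg/m3)"


def append_hrwr_to_binder(mix: Dict[str, float]) -> Dict[str, float]:
    """Return a copy of `mix` with an added HRWR-to-binder ratio feature."""
    binder = sum(mix.get(col, 0.0) for col in BINDER_COLUMNS)
    if binder <= 0.0:
        raise ValueError("mix must contain a positive amount of binder")
    augmented = dict(mix)  # leave the caller's dictionary untouched
    augmented["HRWR/Binder"] = mix.get(HRWR_COLUMN, 0.0) / binder
    return augmented


# Illustrative mix, not taken from the dataset.
mix = {
    "Cement (kg/m3)": 200.0,
    "Fly Ash (kg/m3)": 300.0,
    "Slag (kg/m3)": 0.0,
    "HRWR (kg/m3)": 2.5,
}
augmented = append_hrwr_to_binder(mix)
print(augmented["HRWR/Binder"])
```

Feeding the ratio to the model as an explicit input spares the GP from having to learn the division from the raw ingredient amounts.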
The models can be used for a variety of tasks, including but not limited to
- continuous-time strength curve predictions with uncertainty bands for a user-specified concrete mix,
- experimental design, i.e. suggesting promising concrete mixtures to be tested in a lab, and
- the computation of optimal strength-GWP trade-offs based on user-specified (possibly location-specific) constraints.
Examples
Compressive Strength Model
The SustainableConcreteModel in boxcrete/models.py includes a strength_model that predicts the evolution of compressive strength as a function of mixture composition. A demo is provided in notebooks/strength_curve_prediction_demo.ipynb, which demonstrates how the model can be used to predict the full strength development curve for any user-specified mix. A comprehensive tutorial covering prediction, calibration, Pareto frontiers, and gradient-based experimental design is available in notebooks/prediction_and_optimization_tutorial.ipynb. The model is based on Gaussian Process (GP) regression and incorporates custom modeling steps to ensure physically consistent strength evolution and calibrated uncertainty.
Strength Curve Predictions
The following figure shows predicted strength curves for two compositions: Portland cement (blue) and a mix with high cement substitution (green). The model captures the distinct strength development trajectories associated with different binder chemistries while providing physically consistent uncertainty estimates.
Model Calibration
Cross-Validation on Independent Test Set
When the model is trained on the full training dataset and evaluated on an independent set of mixtures, it demonstrates strong predictive performance. The predicted compressive strengths closely match the experimentally measured values across the range of mixes and curing ages.
Training Set Calibration
When trained on the mortar and concrete mix strength data contained in this repository, the training set predictions also look sensible and well calibrated.
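One simple way to quantify calibration, complementing the plots, is the empirical coverage of the predictive intervals: if the predictive distributions are Gaussian and well calibrated, about 95% of measurements should fall within the central 95% interval around the predicted mean. The sketch below assumes Gaussian predictions; empirical_coverage is a hypothetical helper that is not part of boxcrete, and the numbers are made up purely to exercise the function.

```python
from statistics import NormalDist
from typing import Sequence


def empirical_coverage(
    y_true: Sequence[float],
    mean: Sequence[float],
    std: Sequence[float],
    level: float = 0.95,
) -> float:
    """Fraction of observations inside the central `level` Gaussian interval."""
    if not (len(y_true) == len(mean) == len(std)) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Two-sided quantile, e.g. about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    inside = sum(abs(y - m) <= z * s for y, m, s in zip(y_true, mean, std))
    return inside / len(y_true)


# Made-up strengths (MPa) and predictions, for illustration only.
measured = [20.0, 31.5, 42.0, 55.0]
predicted_mean = [21.0, 30.0, 45.0, 50.0]
predicted_std = [2.0, 2.0, 1.0, 2.0]

coverage = empirical_coverage(measured, predicted_mean, predicted_std)
print(f"95% interval coverage: {coverage:.2f}")
```

Coverage well below the nominal level suggests overconfident uncertainty estimates, while coverage well above it suggests intervals that are wider than necessary.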
Experimental Design
Inferring Optimal Trade-Offs under Constraints
While the previous section focused on using the models to predict strength curves,
we can also use the trained model to predict what the optimal trade-offs between GWP and strength
are likely to look like under constraints on the concrete composition
that were not necessarily present during the training of the model.
In particular, the figure below shows the predicted Pareto frontiers
of GWP and strength subject to two constraints on the water-to-binder ratio,
i.e.:
- water-to-binder ratio > 0.2 (solid lines), and
- water-to-binder ratio > 0.35 (dashed lines),
as well as constraints on ingredients:
- no constraints (blue),
- no fly ash (orange), and
- no slag (green).
Notably, while the figure is purely based on model predictions,
the trends in the figure conform to expert knowledge.
In particular,
- the increase in the minimum water-to-binder ratio has an outsize negative effect on the evolution of strength,
- removing fly ash from the composition appears to have a negligible effect during the time window we consider (< 28 days), and
- removing slag from the composition has a significant negative effect on strength, similar to the increase in the water-to-binder ratio.
These are just a few insights we can gain from querying the model,
and we believe that many more questions about the behavior of concrete
can be investigated in a similar way.
From a practical perspective, the insight that excluding slag (a by-product of steel production)
has a larger effect than excluding fly ash (a by-product of coal power plants)
can inform site selection
for large construction projects that seek to minimize carbon impact.
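The composition constraints behind the Pareto frontiers in this section can be expressed as a simple feasibility check. The following is a minimal sketch under stated assumptions: binder is taken to be cement plus fly ash plus slag, the "Slag (kg/m3)" and "Water (kg/m3)" column names are guesses, and the helpers water_to_binder and is_feasible are hypothetical rather than part of boxcrete.

```python
from typing import Dict, Iterable

# Assumed column names and binder definition (cement + fly ash + slag).
BINDER_COLUMNS = ("Cement (kg/m3)", "Fly Ash (kg/m3)", "Slag (kg/m3)")
WATER_COLUMN = "Water (kg/m3)"


def water_to_binder(mix: Dict[str, float]) -> float:
    """Water-to-binder mass ratio of a mix."""
    binder = sum(mix.get(col, 0.0) for col in BINDER_COLUMNS)
    if binder <= 0.0:
        raise ValueError("mix must contain a positive amount of binder")
    return mix.get(WATER_COLUMN, 0.0) / binder


def is_feasible(
    mix: Dict[str, float], min_wb: float, excluded: Iterable[str] = ()
) -> bool:
    """Check a minimum water-to-binder ratio and a set of excluded ingredients."""
    no_excluded = all(mix.get(col, 0.0) == 0.0 for col in excluded)
    return water_to_binder(mix) > min_wb and no_excluded


# Illustrative mix, not taken from the dataset: w/b = 150 / 500 = 0.3.
mix = {"Cement (kg/m3)": 200.0, "Fly Ash (kg/m3)": 300.0, "Water (kg/m3)": 150.0}

print(is_feasible(mix, min_wb=0.2))                                # ratio above 0.2
print(is_feasible(mix, min_wb=0.35))                               # ratio below 0.35
print(is_feasible(mix, min_wb=0.2, excluded=["Fly Ash (kg/m3)"]))  # contains fly ash
```

In an optimizer, the same conditions would typically be passed as constraints on the search space rather than applied as a filter after the fact.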
Empirical Pareto Frontier Evolution
The probabilistic model for compressive strength can in addition be used to design new concrete mixtures that are likely to exhibit an optimal trade-off between strength and GWP.
The following figure shows the evolution of the empirical Pareto frontier,
i.e. the points with empirically optimal trade-offs,
as a function of our experimental batches.
Importantly, the experimental design methodology has been able to propose mortar mixes (orange-yellow)
that have been experimentally shown to exhibit superior trade-offs between GWP and strength
compared to human-designed mixes (blue-purple).
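For readers unfamiliar with the term, an empirical Pareto frontier can be extracted from measured (GWP, strength) pairs by keeping only the non-dominated points, where lower GWP and higher strength are both preferred. The sketch below is a generic implementation and not code from this repository; the function pareto_frontier is hypothetical and the data points are made up for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (GWP, strength)


def pareto_frontier(points: List[Point]) -> List[Point]:
    """Return the non-dominated points: lower GWP and higher strength are better."""
    frontier: List[Point] = []
    best_strength = float("-inf")
    # Sweep in order of increasing GWP (ties: strongest first). A point is kept
    # only if it is stronger than every point with lower or equal GWP.
    for gwp, strength in sorted(points, key=lambda p: (p[0], -p[1])):
        if strength > best_strength:
            frontier.append((gwp, strength))
            best_strength = strength
    return frontier


# Made-up (GWP, strength) pairs, for illustration only.
mixes = [(300.0, 40.0), (250.0, 35.0), (400.0, 55.0), (350.0, 38.0), (250.0, 30.0)]
frontier = pareto_frontier(mixes)
print(frontier)
```

Recomputing the frontier after each experimental batch gives the kind of evolution shown in the figure: a new mix moves the frontier only if no earlier mix is at least as good in both objectives.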
Multi-Objective Optimization (Concrete Data)
The framework also enables multi-objective optimization of early-age (1-day) and later-age (28-day) compressive strength alongside Global Warming Potential (GWP). By systematically exploring the composition space, BOxCrete can generate candidate mixes that balance structural performance requirements with carbon reduction targets.
The following figure shows the distribution of model-generated mixes plotted together with the training dataset, illustrating how the optimization explores the design space while remaining guided by experimentally validated compositions.
Citing
If you use the data or models contained in this repository, please cite
"BOxCrete: A Bayesian Optimization Open-Source AI Model for Concrete Strength Forecasting and Mix Optimization":
@misc{baten2026boxcrete,
title={BOxCrete: A Bayesian Optimization Open-Source AI Model for Concrete Strength Forecasting and Mix Optimization},
author={Bayezid Baten and M. Ayyan Iqbal and Sebastian Ament and Julius Kusuma and Nishant Garg},
year={2026},
eprint={2603.21525},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2603.21525},
}
For the earlier workshop paper that introduced the model with mortar data, please cite
"Sustainable Concrete via Bayesian Optimization":
@misc{ament2023sustainable,
title={Sustainable Concrete via Bayesian Optimization},
author={Sebastian Ament and Andrew Witte and Nishant Garg and Julius Kusuma},
year={2023},
eprint={2310.18288},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2310.18288},
}
License
SustainableConcrete is released under the MIT license, as found in the LICENSE file.
Owner metadata
- Name: Meta Research
- Login: facebookresearch
- Email:
- Kind: organization
- Description:
- Website: https://opensource.fb.com
- Location: Menlo Park, California
- Twitter:
- Company:
- Icon url: https://avatars.githubusercontent.com/u/16943930?v=4
- Repositories: 1060
- Last synced at: 2024-12-17T01:36:52.238Z
- Profile URL: https://github.com/facebookresearch
GitHub Events
Total
- Delete event: 7
- Pull request event: 6
- Fork event: 9
- Watch event: 48
- Issue comment event: 2
- Push event: 52
- Pull request review comment event: 2
- Pull request review event: 2
- Create event: 7
Last Year
- Delete event: 7
- Pull request event: 6
- Fork event: 5
- Watch event: 39
- Issue comment event: 2
- Push event: 49
- Pull request review comment event: 2
- Pull request review event: 2
- Create event: 7
Committers metadata
Last synced: 3 days ago
Total Commits: 40
Total Committers: 5
Avg Commits per committer: 8.0
Development Distribution Score (DDS): 0.225
Commits in past year: 28
Committers in past year: 5
Avg Commits per committer in past year: 5.6
Development Distribution Score (DDS) in past year: 0.214
| Name | Email | Commits |
|---|---|---|
| Sebastian Ament | s****t@m****m | 31 |
| Julius Kusuma | 1****a | 5 |
| maiqbal2 | m****2@i****u | 2 |
| dependabot[bot] | 4****] | 1 |
| Arpit Jain | a****9@g****m | 1 |
Committer domains:
- illinois.edu: 1
- meta.com: 1
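The summary figures in this section can be reproduced from the per-committer commit counts in the table. The sketch assumes the common definition of the Development Distribution Score, namely one minus the top committer's share of all commits, which is consistent with the reported all-time value of 0.225; the function name is hypothetical.

```python
from typing import Dict

# All-time commit counts as listed in the committers table.
commits: Dict[str, int] = {
    "Sebastian Ament": 31,
    "Julius Kusuma": 5,
    "maiqbal2": 2,
    "dependabot[bot]": 1,
    "Arpit Jain": 1,
}


def development_distribution_score(commit_counts: Dict[str, int]) -> float:
    """Assumed definition: 1 - (commits by the top committer / total commits)."""
    total = sum(commit_counts.values())
    if total == 0:
        raise ValueError("at least one commit is required")
    return 1.0 - max(commit_counts.values()) / total


total_commits = sum(commits.values())
avg_per_committer = total_commits / len(commits)
dds = development_distribution_score(commits)

print(f"Total commits: {total_commits}")
print(f"Avg commits per committer: {avg_per_committer:.1f}")
print(f"DDS: {dds:.3f}")
```

A score near 0 means one person wrote almost all commits, while a score near 1 means the work is spread across many committers.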
Issue and Pull Request metadata
Last synced: 3 days ago
Total issues: 0
Total pull requests: 17
Average time to close issues: N/A
Average time to close pull requests: 1 day
Total issue authors: 0
Total pull request authors: 4
Average comments per issue: 0
Average comments per pull request: 0.18
Merged pull requests: 10
Bot issues: 0
Bot pull requests: 0
Past year issues: 0
Past year pull requests: 13
Past year average time to close issues: N/A
Past year average time to close pull requests: 1 day
Past year issue authors: 0
Past year pull request authors: 2
Past year average comments per issue: 0
Past year average comments per pull request: 0.23
Past year merged pull requests: 9
Past year bot issues: 0
Past year bot pull requests: 0
Top Issue Authors
Top Pull Request Authors
- SebastianAment (9)
- maiqbal2 (4)
- dreww2 (3)
- facebook-github-bot (1)
Top Issue Labels
Top Pull Request Labels
- CLA Signed (15)
Score: 6.872128101338986