Carbonara
Enrichment pipeline for CUR / FOCUS reports which adds energy and carbon data, allowing you to report on and reduce the impact of your cloud usage.
https://github.com/digitalpebble/carbonara
Category: Consumption
Sub Category: Computation and Communication
Keywords
apachespark aws carbon-emissions climate cloud focus greenops greensoftware sustainability
Last synced: about 17 hours ago
Repository metadata
Enrichment pipeline for CUR / FOCUS reports which adds energy and carbon data, allowing you to report on and reduce the impact of your cloud usage.
- Host: GitHub
- URL: https://github.com/digitalpebble/carbonara
- Owner: DigitalPebble
- License: apache-2.0
- Created: 2025-05-22T14:59:47.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2025-06-25T16:00:51.000Z (1 day ago)
- Last Synced: 2025-06-25T17:21:59.008Z (1 day ago)
- Topics: apachespark, aws, carbon-emissions, climate, cloud, focus, greenops, greensoftware, sustainability
- Language: Java
- Homepage:
- Size: 318 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 8
- Releases: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README.md
CARBONARA
Carbonara helps estimate the environmental impact of your cloud usage. By leveraging open source models and data, it enriches
usage reports generated by cloud providers and allows you to build reports and visualisations. Having the greenops and finops data in the same
place makes it easier to expose your costs and impacts side by side.
Carbonara uses Apache Spark to read and write the usage reports (typically in Parquet format) in a scalable way and, thanks to its modular approach,
splits the enrichment of the data into configurable stages.
A typical sequence of stages would be:
- estimation of embedded emissions from resources used
- estimation of energy used
- application of PUE and other overheads
- application of carbon intensity factors
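The stages above can be sketched as simple arithmetic on a single usage line item. This is a minimal illustration only, not Carbonara's actual implementation; the PUE and grid carbon-intensity values below are hypothetical placeholders.

```python
# Sketch of the enrichment stages for one line item.
# All factors are illustrative placeholders, not Carbonara's real data.

def enrich(usage_kwh: float, embodied_g: float,
           pue: float = 1.2,
           grid_intensity_g_per_kwh: float = 400.0) -> dict:
    """Apply a PUE overhead, then a grid carbon-intensity factor."""
    facility_kwh = usage_kwh * pue  # datacentre overhead (PUE)
    operational_g = facility_kwh * grid_intensity_g_per_kwh  # carbon intensity
    return {
        "energy_usage_kwh": facility_kwh,
        "operational_emissions_co2eq_g": operational_g,
        "embodied_emissions_co2eq_g": embodied_g,
    }

# 0.5 kWh of server energy with a hypothetical embodied share of 10 g
row = enrich(usage_kwh=0.5, embodied_g=10.0)
```

Carbonara runs each such step as a separate configurable Spark stage, so factors like PUE or regional carbon intensity can be swapped or reordered without touching the rest of the pipeline.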
Please note that this is currently a prototype which handles only CUR reports from AWS. Not all AWS services are covered.
One of the benefits of using Apache Spark is that you can use EMR on AWS to enrich
the CURs at scale without having to export or expose any of your data.
Prerequisites
You will need CUR reports as input. These are generated via AWS Data Exports and stored on S3 as Parquet files.
Local install
With Apache Maven, Java and Apache Spark installed locally and added to the $PATH:
mvn clean package
spark-submit --class com.digitalpebble.carbonara.SparkJob --driver-memory 4g ./target/carbonara-1.0.jar ./curs ./output
Docker
Build the Docker image with
docker build -t digitalpebble/carbonara:1.0 .
The command below processes the data locally by mounting the directories containing the CURs and output as volumes:
docker run -it -v ./curs:/curs -v ./output:/output digitalpebble/carbonara:1.0 \
/opt/spark/bin/spark-submit \
--class com.digitalpebble.carbonara.SparkJob \
--driver-memory 4g \
--master 'local[*]' \
/usr/local/lib/carbonara-1.0.jar \
/curs /output/enriched
Explore the output
Using DuckDB
create table enriched_curs as select * from 'output/*/*.parquet';
select line_item_product_code, product_servicecode,
round(sum(operational_emissions_co2eq_g),2) as co2_usage_g,
round(sum(energy_usage_kwh),2) as energy_usage_kwh
from enriched_curs where operational_emissions_co2eq_g > 0.01
group by line_item_product_code, product_servicecode order by co2_usage_g desc;
This should give an output similar to:
line_item_product_code | product_servicecode | co2_usage_g | energy_usage_kwh
---|---|---|---
AmazonS3 | AWSDataTransfer | 659.2 | 3.31
AmazonRDS | AWSDataTransfer | 361.59 | 1.09
AmazonEC2 | AWSDataTransfer | 162.59 | 1.43
AmazonECR | AWSDataTransfer | 88.75 | 0.8
AmazonVPC | AWSDataTransfer | 40.55 | 0.38
AWSELB | AWSDataTransfer | 6.3 | 0.06
Owner metadata
- Name: DigitalPebble Ltd
- Login: DigitalPebble
- Email: [email protected]
- Kind: organization
- Description:
- Website: http://www.digitalpebble.com
- Location: Bristol, UK
- Twitter:
- Company:
- Icon url: https://avatars.githubusercontent.com/u/1726647?v=4
- Repositories: 27
- Last synced at: 2024-11-24T19:46:52.245Z
- Profile URL: https://github.com/DigitalPebble
GitHub Events
Total
- Issues event: 4
- Delete event: 1
- Issue comment event: 1
- Push event: 7
- Public event: 1
- Gollum event: 1
- Pull request event: 2
- Create event: 1
Last Year
- Issues event: 4
- Delete event: 1
- Issue comment event: 1
- Push event: 7
- Public event: 1
- Gollum event: 1
- Pull request event: 2
- Create event: 1
Committers metadata
Last synced: 4 days ago
Total Commits: 17
Total Committers: 1
Avg Commits per committer: 17.0
Development Distribution Score (DDS): 0.0
Commits in past year: 17
Committers in past year: 1
Avg Commits per committer in past year: 17.0
Development Distribution Score (DDS) in past year: 0.0
Name | Email | Commits
---|---|---
Julien Nioche | j****n@d****m | 17
Committer domains:
Issue and Pull Request metadata
Last synced: 1 day ago
Total issues: 9
Total pull requests: 2
Average time to close issues: 20 minutes
Average time to close pull requests: 4 minutes
Total issue authors: 1
Total pull request authors: 1
Average comments per issue: 0.11
Average comments per pull request: 0.5
Merged pull request: 2
Bot issues: 0
Bot pull requests: 0
Past year issues: 9
Past year pull requests: 2
Past year average time to close issues: 20 minutes
Past year average time to close pull requests: 4 minutes
Past year issue authors: 1
Past year pull request authors: 1
Past year average comments per issue: 0.11
Past year average comments per pull request: 0.5
Past year merged pull request: 2
Past year bot issues: 0
Past year bot pull requests: 0
Top Issue Authors
- jnioche (9)
Top Pull Request Authors
- jnioche (2)
Top Issue Labels
- good first issue (4)
- help wanted (4)
- enhancement (4)
- documentation (1)
Top Pull Request Labels
Dependencies
- apache/spark 4.0.0-java21 build
- maven 3.9.9-eclipse-temurin-21 build
- org.apache.spark:spark-sql_2.13 4.0.0 provided
Score: 2.302585092994046