eGRID
A comprehensive source of data from EPA's Clean Air and Power Division (CAPD) on the environmental characteristics of almost all electric power generated in the United States.
https://github.com/usepa/egrid
Category: Energy Systems
Sub Category: Energy Data Accessibility and Integration
Keywords
egrid oar
Last synced: about 13 hours ago
Repository metadata
Emissions & Generation Resource Integrated Database (eGRID)
- Host: GitHub
- URL: https://github.com/usepa/egrid
- Owner: USEPA
- License: mit
- Created: 2023-06-13T13:34:46.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2026-03-19T18:52:17.000Z (8 days ago)
- Last Synced: 2026-03-20T10:07:46.026Z (7 days ago)
- Topics: egrid, oar
- Language: R
- Homepage: https://www.epa.gov/egrid
- Size: 250 MB
- Stars: 27
- Watchers: 7
- Forks: 10
- Open Issues: 7
- Releases: 3
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE.md
README.md
eGRID
This repository includes all necessary scripts and documentation to create the Emissions & Generation Resource Integrated Database (eGRID).
Background
eGRID is a comprehensive source of data from EPA's Clean Air and Power Division (CAPD) on the environmental
characteristics of almost all electric power generated in the United States. eGRID is based on available plant-specific data for all
U.S. electricity generating plants that provide power to the electric grid and report emissions and electricity data to the U.S. government. Data reported include,
but are not limited to, net electric generation; resource mix (the share of generation by resource or fuel type); mass emissions of carbon dioxide
(CO2), nitrogen oxides (NOx), sulfur dioxide (SO2), methane (CH4), and nitrous oxide (N2O); emission rates for CO2, NOx, SO2,
CH4, and N2O; heat input; and nameplate capacity. eGRID reports this information on an annual basis (as well as by ozone season for
heat input and NOx) at different levels of geographic aggregation.
The final eGRID dataset includes eight levels of data aggregation:
- Generator: A set of equipment that produces electricity and is connected to the U.S. electricity grid.
- Unit: A set of equipment that either produces electricity and is connected to the U.S. electricity grid, or is connected to a generator that produces electricity and is connected to the U.S. electricity grid.
- Plant: A facility with one or more units and/or generators that provide power to the electric grid.
- State: U.S. states, Puerto Rico (PR), and the District of Columbia (DC).
- Balancing authority: Regional power system operators that ensure a balance of supply and demand.
- eGRID subregion: EPA-defined subregions designed to limit the impacts of the import and export of electricity (shown in Figure 1).
- NERC (North American Electric Reliability Corporation) region: Each NERC region listed in eGRID represents one of nine regional portions of the North American electricity transmission grid: six in the contiguous United States, plus Alaska, Hawaii, and Puerto Rico (which are not part of the formal NERC regions but are treated as such in eGRID).
- National U.S.: Contains all 50 states, Puerto Rico (PR), and the District of Columbia (DC).
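To make the idea of aggregation levels concrete, here is a minimal sketch of rolling plant-level rows up to the state level in base R. The data frame, its column names, and the values are invented for illustration and are not the repository's actual schema. The unit convention (emissions in short tons, rates in lb/MWh) is also an assumption. The sketch computes the state rate from summed totals rather than by averaging plant rates, which is one common approach and not necessarily the exact method in the eGRID scripts.

```r
# Toy plant-level data; column names and values are illustrative only.
plants <- data.frame(
  plant_id = c(1, 2, 3),
  state = c("AK", "AK", "HI"),
  generation_mwh = c(1000, 3000, 2000),
  co2_tons = c(500, 1500, 800)
)

# Sum generation and CO2 mass within each state.
states <- aggregate(
  cbind(generation_mwh, co2_tons) ~ state,
  data = plants,
  FUN = sum
)

# Emission rate from the summed totals, assuming short tons (2000 lb per ton).
states$co2_rate_lb_per_mwh <- 2000 * states$co2_tons / states$generation_mwh

print(states)
```

The same pattern would apply to any other grouping column, such as a balancing authority or subregion code.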
Further information on the eGRID methodology can be found in the eGRID Technical Guide.
The dataset that this code produces is publicly available here.

Architecture
This year EPA will be releasing the methodology to develop eGRID as an RStudio project. Recently, there has been increased interest from users in understanding the methods used to create the eGRID data. To increase transparency in the eGRID production process, EPA has made the R scripts available for users to view and use. EPA used the RStudio project beginning in 2024 to produce eGRID2023.
Figure 2 displays a summary of eGRID architecture, which specifies data sources, inputs, and outputs for creating eGRID.
A data dictionary is provided in eGRID Production Model Data Dictionary.xlsx. This file provides the row number, name, description, imperial units, metric units, source, and calculation method for each column reported in the final eGRID dataset.

Code base organization
This project is structured as an RStudio project. To ensure that all scripts run correctly, load the eGRID_R.Rproj within RStudio to enable the project environment. eGRID_master.qmd is a Quarto document that serves as a master script (i.e., it runs all necessary scripts in the correct order), while also providing documentation for the scripts and steps performed therein.
The code base is structured as follows:
- scripts/: all scripts to download and clean data, create each data aggregation level, format the final dataset, and convert to metric units.
- scripts/functions/: all helper functions.
- data/raw_data/: raw data obtained from EPA and EIA sites.
- data/clean_data/: data created from EPA and EIA data cleaning steps.
- data/outputs/: outputs generated by this code base.
- data/static_tables/: static tables used within the code base. These include crosswalks that match data between EIA and EPA data or regions, and manual corrections.
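As an illustration of what a crosswalk in data/static_tables/ is for, the sketch below joins EPA-keyed records to EIA plant identifiers through a lookup table. All table contents and column names here are hypothetical. The real crosswalk files and their columns are defined in the repository and may differ.

```r
# Hypothetical EPA unit records keyed by an EPA plant ID.
epa_units <- data.frame(
  epa_plant_id = c(10, 10, 20),
  unit_id = c("1", "2", "A"),
  heat_input_mmbtu = c(100, 150, 300)
)

# Hypothetical crosswalk mapping EPA plant IDs to EIA plant IDs.
crosswalk <- data.frame(
  epa_plant_id = c(10, 20),
  eia_plant_id = c(10, 2001)
)

# Left join: keep every EPA record, attach the EIA ID where one exists.
matched <- merge(epa_units, crosswalk, by = "epa_plant_id", all.x = TRUE)

# Records without an EIA match would need review or a manual correction.
unmatched <- matched[is.na(matched$eia_plant_id), ]

print(matched)
```

The left join (all.x = TRUE) keeps unmatched rows visible instead of silently dropping them.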
The data used to create eGRID are EPA and Energy Information Administration (EIA) electricity data. EPA data are loaded via an application programming interface (API) in scripts/data_load_epa.R and cleaned in scripts/data_clean_epa.R. EIA data are downloaded from EIA's website in scripts/data_load_eia.R and scripts/functions/function_download_eia_files.R and cleaned in scripts/data_clean_eia.R.
Creating eGRID
To create the eGRID dataset:
- Obtain an API key for EPA data.
  - Request an API key from https://www.epa.gov/power-sector/cam-api-portal.
  - Create the folder api_keys/ within the root of the eGRID project.
  - Create a text file named epa_api_key.txt within the folder api_keys/ and save the API key there on a single line.
- Load eGRID_R.Rproj within RStudio to enable the project environment.
- Render eGRID_master.qmd.
  - Set the data year in params (eGRID_year) in the YAML as a string in the format "YYYY" (ex: "2023").
  - Render eGRID_master.qmd. This will run all scripts and build the eGRID dataset.
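The setup steps above can also be scripted. The following is a minimal sketch, not part of the repository: save_epa_api_key() and render_egrid() are hypothetical helper names, and the key string is a placeholder. The render step assumes the quarto R package is installed and passes eGRID_year through execute_params instead of editing the YAML by hand. It skips rendering when eGRID_master.qmd or the package is missing.

```r
# Save the EPA API key on a single line in api_keys/epa_api_key.txt.
# `api_key` and `project_root` are placeholders supplied by the user.
save_epa_api_key <- function(api_key, project_root = ".") {
  key_dir <- file.path(project_root, "api_keys")
  dir.create(key_dir, showWarnings = FALSE, recursive = TRUE)
  key_file <- file.path(key_dir, "epa_api_key.txt")
  writeLines(trimws(api_key), key_file)
  invisible(key_file)
}

# Render the master document for one data year, if it can be found.
render_egrid <- function(year = "2023", project_root = ".") {
  qmd <- file.path(project_root, "eGRID_master.qmd")
  if (!file.exists(qmd) || !requireNamespace("quarto", quietly = TRUE)) {
    message("Skipping render: eGRID_master.qmd or the quarto package not found.")
    return(invisible(FALSE))
  }
  # Override the eGRID_year parameter at render time (year as a "YYYY" string).
  quarto::quarto_render(qmd, execute_params = list(eGRID_year = year))
  invisible(TRUE)
}

# Demonstration in a temporary directory so no real project files are touched.
demo_root <- file.path(tempdir(), "egrid_demo")
key_file <- save_epa_api_key("YOUR-API-KEY", demo_root)
print(readLines(key_file))
rendered <- render_egrid("2023", demo_root)
```

In a real checkout, project_root would be the folder containing eGRID_R.Rproj.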
Outputs
The codebase outputs each data aggregation level in the eGRID dataset as an .RDS file and the final dataset as an Excel sheet in data/outputs/{params$eGRID_year}. Rendering eGRID_master.qmd also creates an HTML file that summarizes the data, methods, and output files used and created throughout the code base.
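Once the .RDS outputs exist, they can be read back into R for inspection. The sketch below loads every .RDS file in an output folder into a named list. read_egrid_outputs() is a hypothetical helper, and the demo file name state_demo.RDS is invented so the example runs on its own. The actual output file names are set by the repository's scripts.

```r
# Read all .RDS files in a folder into a list named by file (without extension).
read_egrid_outputs <- function(output_dir) {
  rds_files <- list.files(
    output_dir,
    pattern = "\\.rds$",
    full.names = TRUE,
    ignore.case = TRUE
  )
  outputs <- lapply(rds_files, readRDS)
  names(outputs) <- tools::file_path_sans_ext(basename(rds_files))
  outputs
}

# Demo: write one small file to a temporary folder, then read it back.
demo_dir <- file.path(tempdir(), "outputs", "2023")
dir.create(demo_dir, recursive = TRUE, showWarnings = FALSE)
saveRDS(
  data.frame(state = "AK", generation_mwh = 4000),
  file.path(demo_dir, "state_demo.RDS")
)

outputs <- read_egrid_outputs(demo_dir)
print(names(outputs))
```

For a real run, output_dir would be something like file.path("data", "outputs", "2023"), following the path pattern described above.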
QA
The codebase contains two QA files as Quarto documents:
- qa_all.qmd: Annual checks to confirm results are as expected for each file created in the codebase.
- qa_annual_comparison.qmd: Comparison of output data to previous eGRID years.
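The kind of check a year-over-year comparison performs can be sketched as follows. This is not the content of qa_annual_comparison.qmd. The data, column names, and the 25 percent threshold are all assumptions chosen for illustration.

```r
# Hypothetical state-level CO2 totals for two data years.
current <- data.frame(state = c("AK", "HI"), co2_tons = c(2000, 800))
previous <- data.frame(state = c("AK", "HI"), co2_tons = c(1900, 1600))

# Align the two years by state.
comparison <- merge(
  current, previous,
  by = "state",
  suffixes = c("_current", "_previous")
)

# Percent change relative to the previous year.
comparison$pct_change <- 100 *
  (comparison$co2_tons_current - comparison$co2_tons_previous) /
  comparison$co2_tons_previous

# Flag rows whose change exceeds an illustrative threshold.
threshold <- 25
flagged <- comparison[abs(comparison$pct_change) > threshold, ]

print(flagged)
```

Flagged rows are candidates for manual review, not automatic errors, since large changes can reflect real shifts in generation.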
Contributing to eGRID
Please submit any questions about the eGRID dataset to this web form.
If you would like to ask a question about or report an issue in the code, review the CONTRIBUTING policy and submit an issue under the "Issues" tab in the GitHub repository. Provide a concise summary as the title of the issue and a clear description, including steps to reproduce the issue.
Disclaimer
The United States Environmental Protection Agency (EPA) GitHub project code is provided on an "as is" basis and the user assumes responsibility for its use. EPA has relinquished control of the information and no longer has responsibility to protect the integrity, confidentiality, or availability of the information. Any reference to specific commercial products, processes, or services by service mark, trademark, manufacturer, or otherwise, does not constitute or imply their endorsement, recommendation, or favoring by EPA. The EPA seal and logo shall not be used in any manner to imply endorsement of any commercial product or activity by EPA or the United States Government.
Owner metadata
- Name: U.S. Environmental Protection Agency
- Login: USEPA
- Email:
- Kind: organization
- Description:
- Website: https://www.epa.gov
- Location: United States of America
- Twitter: EPA
- Company:
- Icon url: https://avatars.githubusercontent.com/u/1304320?v=4
- Repositories: 449
- Last synced at: 2024-04-14T19:47:37.473Z
- Profile URL: https://github.com/USEPA
GitHub Events
Total
- Release event: 1
- Delete event: 1
- Pull request event: 4
- Fork event: 9
- Issues event: 8
- Watch event: 22
- Issue comment event: 3
- Push event: 194
- Public event: 1
- Pull request review comment event: 32
- Pull request review event: 40
- Create event: 7
Last Year
- Pull request event: 2
- Fork event: 1
- Issues event: 2
- Watch event: 11
- Issue comment event: 2
- Push event: 132
- Pull request review comment event: 31
- Pull request review event: 35
- Create event: 2
Committers metadata
Last synced: 6 days ago
Total Commits: 759
Total Committers: 10
Avg Commits per committer: 75.9
Development Distribution Score (DDS): 0.565
Commits in past year: 7
Committers in past year: 1
Avg Commits per committer in past year: 7.0
Development Distribution Score (DDS) in past year: 0.0
| Name | Email | Commits |
|---|---|---|
| ABT\gofortht | T****h@a****m | 330 |
| ABT\russelle | E****l@a****m | 159 |
| ABT\BockS | s****k@a****m | 108 |
| ABT\zhangm | m****g@a****m | 69 |
| sarasoko | s****a@g****m | 31 |
| Caroline Watson | C****n@a****m | 26 |
| Sean Bock | b****s@a****l | 22 |
| Sean Bock | b****s@a****l | 11 |
| russelle | r****e@a****l | 2 |
| Johnathan Tafoya | 6****a | 1 |
Committer domains:
- abtglobal.com: 3
- abtassoc.com: 2
Issue and Pull Request metadata
Last synced: 8 days ago
Total issues: 72
Total pull requests: 27
Average time to close issues: 4 months
Average time to close pull requests: 12 days
Total issue authors: 5
Total pull request authors: 4
Average comments per issue: 0.75
Average comments per pull request: 0.78
Merged pull request: 26
Bot issues: 0
Bot pull requests: 0
Past year issues: 8
Past year pull requests: 2
Past year average time to close issues: 5 months
Past year average time to close pull requests: 3 days
Past year issue authors: 2
Past year pull request authors: 1
Past year average comments per issue: 0.38
Past year average comments per pull request: 0.0
Past year merged pull request: 2
Past year bot issues: 0
Past year bot pull requests: 0
Top Issue Authors
- teagan-goforth (44)
- brendan-small (12)
- seanbock (10)
- russell-e (3)
- zhang-madeline (3)
Top Pull Request Authors
- teagan-goforth (16)
- zhang-madeline (5)
- russell-e (4)
- j-tafoya (2)
Top Issue Labels
Top Pull Request Labels
- bug (3)
Score: 5.8289456176102075