censusdis
A Python package for discovering, loading, and analyzing U.S. Census demographic, economic, and geographic data and metadata with access to the full collection of data and maps the U.S. Census publishes via their APIs.
https://github.com/censusdis/censusdis
Category: Sustainable Development
Sub Category: Population and Poverty
Keywords
data-science maps python us-census us-census-api
Keywords from Contributors
transforms archiving measur observation routes conversion animals bird compose optimize
Last synced: 11 minutes ago
Repository metadata
censusdis is a Python package for discovering, loading, and analyzing U.S. Census demographic, economic, and geographic data and metadata. It is designed to be intuitive and Pythonic, giving users access to the full collection of data and maps the U.S. Census publishes via their APIs.
- Host: GitHub
- URL: https://github.com/censusdis/censusdis
- Owner: censusdis
- License: other
- Created: 2022-09-04T13:34:13.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2025-04-18T05:24:07.000Z (9 days ago)
- Last Synced: 2025-04-21T09:17:20.058Z (6 days ago)
- Topics: data-science, maps, python, us-census, us-census-api
- Language: Python
- Homepage:
- Size: 146 MB
- Stars: 106
- Watchers: 3
- Forks: 18
- Open Issues: 15
- Releases: 54
Metadata Files:
- Readme: .github/README.md
- License: LICENSE.md
- Code of conduct: CODE_OF_CONDUCT.md
.github/README.md
censusdis
censusdis is a package for discovering, loading, and analyzing U.S. Census demographic data and for computing diversity, integration, and segregation metrics from it.
It is designed
- to support every dataset, every geography, and every year. It's not just about ACS data through the last time the software was updated and released;
- to support all geographies, on and off-spine, not just states, counties, and census tracts;
- to have integrated mapping capabilities that save you time and extra coding;
- to be intuitive, Pythonic, and fast.
Installation and First Example
censusdis can be installed with pip:
pip install censusdis
Every censusdis query needs four things:
- What data set we want to query.
- What vintage, or year.
- What variables.
- What geographies.
Here is an example of how we can use censusdis to download data once we know
those four things.
import censusdis.data as ced
from censusdis.datasets import ACS5
from censusdis import states
df_median_income = ced.download(
    # Data set: American Community Survey 5-Year
    dataset=ACS5,
    # Vintage: 2022
    vintage=2022,
    # Variable: median household income
    download_variables=['NAME', 'B19013_001E'],
    # Geography: All counties in New Jersey.
    state=states.NJ,
    county='*'
)
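For intuition, the four ingredients of a query correspond roughly to the parts of a raw U.S. Census API request URL. The sketch below only builds such a URL and does not send a request. The helper `build_census_url` and the constant `CENSUS_API_BASE` are hypothetical names made up for this illustration; they are not part of censusdis and do not describe how censusdis works internally.

```python
from urllib.parse import quote, urlencode

# Assumed base URL of the public U.S. Census data API.
CENSUS_API_BASE = "https://api.census.gov/data"


def build_census_url(dataset, vintage, variables, for_clause, in_clause=None):
    """Hypothetical helper: assemble a raw Census API query URL."""
    params = {"get": ",".join(variables), "for": for_clause}
    if in_clause is not None:
        params["in"] = in_clause
    # Keep ':', '*', and ',' readable instead of percent-encoding them.
    query = urlencode(params, safe=":*,", quote_via=quote)
    return f"{CENSUS_API_BASE}/{vintage}/{dataset}?{query}"


url = build_census_url(
    dataset="acs/acs5",                 # What data set
    vintage=2022,                       # What vintage
    variables=["NAME", "B19013_001E"],  # What variables
    for_clause="county:*",              # What geographies: all counties...
    in_clause="state:34",               # ...in New Jersey (FIPS code 34)
)
print(url)
```

Compared with this, the `ced.download` call above lets us use symbolic names such as `states.NJ` and returns a data frame rather than raw JSON.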
There are many more examples in the tutorial and in the sample notebooks.
Tutorial (A Great Place to Start!)
We presented a half-day tutorial on censusdis at SciPy '24. All the
material covered in the tutorial is available in a GitHub repo at
https://github.com/censusdis/censusdis-tutorial-2024.
The tutorial consists of a series of five lessons,
each with worked exercises, and two choices for a final project. If you
really want to learn the ins and outs of what censusdis can do, from the
most basic queries all the way through some relatively advanced topics, this
is the tutorial for you.
An Older Tutorial
For an older tutorial that is shorter but does not include some of the newest features,
please see the censusdis-tutorial repository.
This tutorial was presented at PyData Seattle 2023. If you want to try it out for yourself, the README.md
contains links that let you run the tutorial notebooks live on mybinder.org in your browser without needing to set up a
local development environment or download or install any code.
Tutorial Video
We expect a video of the SciPy '24 tutorial to be available soon,
hopefully by some time in August '24.
An 86-minute video of the older tutorial as presented at
PyData Seattle 2023 is also available.
Overview
censusdis is a package for discovering, loading, and analyzing
U.S. Census demographic data and for computing
diversity, integration, and segregation metrics from it.
It is designed to be intuitive and Pythonic,
but gives users access to the full collection of data and maps the U.S. Census
publishes via their APIs. It also avoids hard-coding metadata
about U.S. Census variables, such as their names, types, and
hierarchies in groups. Instead, it queries this metadata from the
U.S. Census API. This allows it to operate over a large set
of datasets and years, likely including many that don't
exist as of the time of this writing. It also integrates
downloading and merging the geometries of geographic
areas to make plotting data and derived metrics simple
and easy. Finally, it interacts with the divintseg
package to compute diversity and integration metrics.
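To illustrate what a hierarchy of variables in a group can look like, here is a minimal sketch that nests variables by their labels. It assumes labels use `!!` to separate hierarchy levels, as Census API variable labels generally do. The sample entries are hand-written for illustration and are not fetched from the API, and the functions `build_tree` and `leaves` are hypothetical names, not part of censusdis.

```python
# Illustrative sample only: hand-written entries shaped like Census
# variable labels. Nothing here is downloaded from the API.
SAMPLE_VARIABLES = {
    "B03002_001E": "Estimate!!Total:",
    "B03002_002E": "Estimate!!Total:!!Not Hispanic or Latino:",
    "B03002_003E": "Estimate!!Total:!!Not Hispanic or Latino:!!White alone",
    "B03002_012E": "Estimate!!Total:!!Hispanic or Latino:",
}


def build_tree(variables):
    """Nest variables by the '!!'-separated parts of their labels."""
    root = {"variable": None, "children": {}}
    for name, label in variables.items():
        node = root
        for part in label.split("!!"):
            key = part.rstrip(":")
            node = node["children"].setdefault(
                key, {"variable": None, "children": {}}
            )
        node["variable"] = name
    return root


def leaves(node):
    """Yield the names of variables that have no children in the tree."""
    if not node["children"]:
        if node["variable"] is not None:
            yield node["variable"]
        return
    for child in node["children"].values():
        yield from leaves(child)


tree = build_tree(SAMPLE_VARIABLES)
print(list(leaves(tree)))
```

In this sample, the leaves are the variables that do not subdivide further, so summing them avoids counting the same people twice.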
The design goals of censusdis are discussed in more
detail in design-goals.md.
I'm not sure I get it. Show me what it can do.
The Nationwide Diversity and Integration
notebook demonstrates how we can download, process, and
plot a large amount of US Census demographic data quickly
and easily to produce compelling results with just a few
lines of code.
I'm sold! I want to dive right in!
To get straight to installing and trying out code, hop over to our
Getting Started guide.
censusdis lets you quickly and easily load US Census data and make plots like
this one:
We downloaded the data behind this plot, including
the geometry of all the block groups, with a
single call:
import censusdis.data as ced
from censusdis.states import STATE_GA
# This is a census variable for median household income.
# See https://api.census.gov/data/2020/acs/acs5/variables/B19013_001E.html
MEDIAN_HOUSEHOLD_INCOME_VARIABLE = "B19013_001E"
gdf_bg = ced.download(
    "acs/acs5",  # The American Community Survey 5-Year Data
    2020,
    ["NAME", MEDIAN_HOUSEHOLD_INCOME_VARIABLE],
    state=STATE_GA,
    block_group="*",
    with_geometry=True
)
Similarly, we can download data and geographies, do a little
analysis on our own using familiar Pandas
data frame operations, and plot the results.
Modules
The public modules that make up the censusdis package are
Module | Description |
---|---|
censusdis.geography | Code for managing geography hierarchies in which census data is organized. |
censusdis.data | Code for fetching data from the US Census API, including managing datasets, groups, and variable hierarchies. |
censusdis.maps | Code for downloading map data from the US, caching it locally, and using it to render maps. |
censusdis.states | Constants defining the US States. Used by the other modules. |
censusdis.counties | Constants defining counties in all of the US States. |
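To make the idea of a geography hierarchy concrete: on-spine geographies nest inside one another (state, county, census tract, block group), and that nesting shows up in the standard 12-digit block group GEOID. The sketch below splits such an identifier into its parts. `BlockGroupId` and `split_block_group_geoid` are hypothetical names for this example, not censusdis APIs, and the GEOID used is illustrative rather than taken from a real query.

```python
from typing import NamedTuple


class BlockGroupId(NamedTuple):
    state: str        # 2-digit state FIPS code
    county: str       # 3-digit county code within the state
    tract: str        # 6-digit census tract code within the county
    block_group: str  # 1-digit block group within the tract


def split_block_group_geoid(geoid: str) -> BlockGroupId:
    """Split a 12-digit block group GEOID into its nested components."""
    if len(geoid) != 12 or not geoid.isdigit():
        raise ValueError(f"expected 12 digits, got {geoid!r}")
    return BlockGroupId(geoid[:2], geoid[2:5], geoid[5:11], geoid[11:])


# Illustrative GEOID: state 13 is Georgia; the remaining digits are
# chosen for the example only.
parts = split_block_group_geoid("131210001001")
print(parts)
```

Off-spine geographies do not fit this nesting, which is why working with them typically involves spatial operations rather than simple prefix matching.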
Demonstration Notebooks
There are several demonstration notebooks available to illustrate how censusdis can
be used. They are found in the notebook directory of the source code.
The demo notebooks include
Notebook Name | Description |
---|---|
ACS Comparison Profile.ipynb | Load and plot American Community Survey (ACS) Comparison Profile data at the state level. |
ACS Data Profile.ipynb | Load and plot American Community Survey (ACS) Data Profile data at the state level. |
ACS Demo.ipynb | Load American Community Survey (ACS) Detail Table data for New Jersey and plot diversity statewide at the census block group level. |
ACS Subject Table.ipynb | Load and plot American Community Survey (ACS) Subject Table data at the state level. |
Block Groups in CBSAs.ipynb | Load and spatially join on-spine and off-spine geographies and plot the results on a map. |
Congressional Districts.ipynb | Load congressional districts and tract-level data within them. |
Data With Geometry.ipynb | Load American Community Survey (ACS) data for New Jersey and plot diversity statewide at the census block group level. |
Exploring Variables.ipynb | Load metadata on a group of variables, visualize the tree hierarchy of variables in the group, and load data from the leaves of the tree. |
Geographies Contained within Geographies.ipynb | Demonstrate working with geographies from different hierarchies. |
Getting Started Examples.ipynb | Sample code from the Getting Started guide. |
Nationwide Diversity and Integration.ipynb | Load nationwide demographic data, compute diversity and integration, and plot. |
Map Demo.ipynb | Demonstrate loading and plotting maps of New Jersey at different geographic granularity. |
Map Geographies.ipynb | Illustrates a large number of different map geographies and how to load them. |
Population Change 2020-2021.ipynb | Track the change in state population from 2020 to 2021 using ACS5 data. |
PUMS Demo.ipynb | Load Public-Use Microdata Samples (PUMS) data for Massachusetts and plot it. |
Querying Available Data Sets.ipynb | Query all available data sets. A starting point for moving beyond ACS. |
Seeing White.ipynb | Load nationwide demographic data at the county level and plot a map of the US showing the percent of the population who identify as white only (no other race) at the county level. |
SoMa DIS Demo.ipynb | Load race and ethnicity data for two towns in Essex County, NJ and compute diversity and integration metrics. |
Time Series School District Poverty.ipynb | Demonstrates how to work with time series datasets, which are a little different than vintaged data sets. |
Diversity and Integration Metrics
Diversity and integration metrics from the divintseg package are
demonstrated in some notebooks.
For more information on these metrics see the divintseg project.
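As a rough intuition for these metrics, the sketch below assumes that diversity is the probability that two randomly chosen residents belong to different groups, and that integration is the population-weighted average of the diversity of the smaller areas that make up a region. These definitions are an assumption based on our reading of the divintseg project, so please check its documentation for the authoritative versions. The functions here are plain Python written for this example and are not the divintseg API; the counts are made up.

```python
def diversity(counts):
    """Probability that two random residents belong to different groups."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def integration(areas):
    """Population-weighted mean of the diversity of each sub-area."""
    total = sum(sum(counts) for counts in areas)
    if total == 0:
        return 0.0
    return sum(sum(counts) * diversity(counts) for counts in areas) / total


# Two hypothetical towns, each made of two neighborhoods, with
# population counts for two groups in every neighborhood.
segregated = [[100, 0], [0, 100]]
mixed = [[50, 50], [50, 50]]

# Town-wide group totals are the same in both cases: [100, 100].
town_totals = [sum(group) for group in zip(*segregated)]

print(diversity(town_totals))
print(integration(segregated))
print(integration(mixed))
```

Both towns are equally diverse overall, but only the mixed town is integrated at the neighborhood level, which is the distinction these metrics are meant to capture.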
Owner metadata
- Name: censusdis
- Login: censusdis
- Email:
- Kind: organization
- Description:
- Website:
- Location:
- Twitter:
- Company:
- Icon url: https://avatars.githubusercontent.com/u/153302187?v=4
- Repositories: 2
- Last synced at: 2024-02-13T17:01:53.941Z
- Profile URL: https://github.com/censusdis
GitHub Events
Total
- Create event: 19
- Release event: 5
- Issues event: 23
- Watch event: 42
- Delete event: 20
- Issue comment event: 57
- Push event: 214
- Pull request review event: 9
- Pull request review comment event: 4
- Pull request event: 33
- Fork event: 7
Last Year
- Create event: 19
- Release event: 5
- Issues event: 23
- Watch event: 42
- Delete event: 20
- Issue comment event: 57
- Push event: 214
- Pull request review event: 9
- Pull request review comment event: 4
- Pull request event: 33
- Fork event: 7
Committers metadata
Last synced: 6 days ago
Total Commits: 805
Total Committers: 8
Avg Commits per committer: 100.625
Development Distribution Score (DDS): 0.368
Commits in past year: 439
Committers in past year: 5
Avg Commits per committer in past year: 87.8
Development Distribution Score (DDS) in past year: 0.128
Name | Email | Commits |
---|---|---|
GitHub Action | a****n@g****m | 509 |
Darren Erik Vengroff | v****f | 277 |
Canyon Foot | c****t@t****m | 6 |
Ari Lamstein | a****n@g****m | 5 |
audreymarthin | 1****n | 4 |
dependabot[bot] | 4****] | 2 |
Maxime Rey | 8****y | 1 |
Ibrahim Hasaan | 1****h | 1 |
Committer domains:
- twosigma.com: 1
- github.com: 1
Issue and Pull Request metadata
Last synced: 1 day ago
Total issues: 35
Total pull requests: 287
Average time to close issues: 27 days
Average time to close pull requests: about 7 hours
Total issue authors: 6
Total pull request authors: 3
Average comments per issue: 1.94
Average comments per pull request: 0.09
Merged pull request: 281
Bot issues: 0
Bot pull requests: 0
Past year issues: 17
Past year pull requests: 25
Past year average time to close issues: about 1 month
Past year average time to close pull requests: 2 days
Past year issue authors: 4
Past year pull request authors: 2
Past year average comments per issue: 2.41
Past year average comments per pull request: 0.96
Past year merged pull request: 22
Past year bot issues: 0
Past year bot pull requests: 0
Top Issue Authors
- vengroff (14)
- CanyonFoot (9)
- arilamstein (7)
- v-dev (3)
- riesthorsten (1)
- ctriley (1)
Top Pull Request Authors
- vengroff (273)
- arilamstein (8)
- CanyonFoot (6)
Top Issue Labels
- good first issue (6)
- documentation (3)
- dependencies (2)
- enhancement (1)
- python (1)
Top Pull Request Labels
Dependencies
- astroid 2.12.10 develop
- black 22.8.0 develop
- coverage 6.4.4 develop
- dill 0.3.5.1 develop
- iniconfig 1.1.1 develop
- isort 5.10.1 develop
- lazy-object-proxy 1.7.1 develop
- mypy 0.971 develop
- mypy-extensions 0.4.3 develop
- pathspec 0.10.1 develop
- platformdirs 2.5.2 develop
- pluggy 1.0.0 develop
- py 1.11.0 develop
- pylint 2.15.2 develop
- pytest 7.1.3 develop
- pytest-cov 3.0.0 develop
- tomlkit 0.11.4 develop
- types-requests 2.28.10 develop
- types-urllib3 1.26.24 develop
- wrapt 1.14.1 develop
- Babel 2.10.3
- Fiona 1.8.21
- Jinja2 3.1.2
- MarkupSafe 2.1.1
- Pillow 9.2.0
- Pygments 2.13.0
- Rtree 1.0.0
- Shapely 1.8.4
- Sphinx 5.1.1
- alabaster 0.7.12
- attrs 22.1.0
- certifi 2022.9.14
- charset-normalizer 2.1.1
- click 8.1.3
- click-plugins 1.1.1
- cligj 0.7.2
- colorama 0.4.5
- contourpy 1.0.5
- cycler 0.11.0
- defusedxml 0.7.1
- divintseg 0.1.3
- docutils 0.17.1
- flake8 5.0.4
- flake8-html 0.4.2
- fonttools 4.37.2
- genbadge 1.1.0
- geopandas 0.11.1
- idna 3.4
- imagesize 1.4.1
- importlib-metadata 4.12.0
- kiwisolver 1.4.4
- matplotlib 3.6.0
- mccabe 0.7.0
- munch 2.5.0
- numpy 1.23.3
- packaging 21.3
- pandas 1.4.4
- pockets 0.9.1
- pycodestyle 2.9.1
- pyflakes 2.5.0
- pyparsing 3.0.9
- pyproj 3.4.0
- python-dateutil 2.8.2
- pytz 2022.2.1
- requests 2.28.1
- setuptools 65.3.0
- setuptools-scm 7.0.5
- six 1.16.0
- snowballstemmer 2.2.0
- sphinx-rtd-theme 1.0.0
- sphinxcontrib-applehelp 1.0.2
- sphinxcontrib-devhelp 1.0.2
- sphinxcontrib-htmlhelp 2.0.0
- sphinxcontrib-jsmath 1.0.1
- sphinxcontrib-napoleon 0.7
- sphinxcontrib-qthelp 1.0.3
- sphinxcontrib-serializinghtml 1.1.5
- tomli 2.0.1
- typing-extensions 4.3.0
- urllib3 1.26.12
- zipp 3.8.1
- Rtree ^1.0.0
- Sphinx ^5.1.1
- divintseg ^0.1.3
- geopandas ^0.11.1
- matplotlib ^3.5.3
- python ^3.9
- requests ^2.28.1
- sphinx-rtd-theme 1.0.0
- sphinxcontrib-napoleon 0.7
- actions/cache v3 composite
- actions/checkout v3 composite
- actions/setup-python v4 composite
- actions/upload-artifact v3 composite
- snok/install-poetry v1 composite
- actions/checkout v3 composite
- actions/configure-pages v2 composite
- actions/deploy-pages v1 composite
- actions/upload-pages-artifact v1 composite
- actions/cache v3 composite
- actions/checkout v3 composite
- actions/setup-python v4 composite
- snok/install-poetry v1 composite
- actions/cache v3 composite
- actions/checkout v3 composite
- actions/setup-python v4 composite
- actions/cache v3 composite
- actions/checkout v3 composite
- actions/setup-python v4 composite
- snok/install-poetry v1 composite
- actions/cache v3 composite
- actions/checkout v3 composite
- actions/setup-python v4 composite
- actions/upload-artifact v3 composite
- snok/install-poetry v1 composite
- actions/cache v3 composite
- actions/checkout v3 composite
- actions/setup-python v4 composite
- actions/upload-artifact v3 composite
- snok/install-poetry v1 composite
Score: 6.8752320872765775