Top Of The Poops
Analysing Sewage Information from the UK Environment Agency.
https://github.com/top-poop/top-of-the-poops
Category: Natural Resources
Sub Category: Water Supply and Quality
Keywords
environment-agency sewage uk-government
Repository metadata
- Host: GitHub
- URL: https://github.com/top-poop/top-of-the-poops
- Owner: top-poop
- License: other
- Created: 2021-10-28T11:07:20.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2026-03-30T08:17:39.000Z (2 months ago)
- Last Synced: 2026-06-05T01:03:59.798Z (8 days ago)
- Topics: environment-agency, sewage, uk-government
- Language: Python
- Homepage: https://top-of-the-poops.org
- Size: 13.6 MB
- Stars: 26
- Watchers: 3
- Forks: 6
- Open Issues: 10
- Releases: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
README.md
Top Of The Poops
Website: top-of-the-poops.org
Seems that #sewage is on people's minds right now.
The UK publishes some information about sewage outfalls - here are some scripts to get this information, analyse it, and
perhaps publish some interesting findings.
Data Reuse and Attribution
Please re-use our data.
Press contact: press [at] top-of-the-poops.org
If you publish data, content, or images from our site, please note that it is CC-BY-SA 4.0, and as such we require suitable attribution:
- Derived Data / General Content - should be attributed, with name and hyperlink Top of the Poops
- Images / Maps - should have the caption '(c) top-of-the-poops.org', or similar, either as plain text or as a hyperlink, and there should be a hyperlink as above in the main body of the text.
Please refer to: https://wiki.creativecommons.org/wiki/Recommended_practices_for_attribution
Derived data is (C) Top-Of-The-Poops - CC-BY-SA 4.0, all original data is (C) the original data owner, and is used under appropriate licence
Maps
We previously used Mapbox, but after getting very popular we couldn't afford it anymore!
We now render the maps ourselves, so they are not going to be as fast as Mapbox.
We use TileServer GL in combination with a UK vector map from MapTiler.
How to use
You can clone the repo - I use IntelliJ IDEA to make a hot-reloading web page.
The build runs locally with make watch
All data files are generated on a developer machine; only the JavaScript build runs on CI. This keeps the CI build acceptably fast.
Currently it runs in about 10 seconds, which is OK but could be faster.
make watch uses inotify - this may not work on macOS.
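For anyone who cannot use inotify, a polling loop is one possible substitute. The following is only a sketch of that general technique, not what make watch does: the function names `snapshot`, `changed` and `watch` are made up for illustration, and the directory, suffixes and command passed in are assumptions to be replaced with whatever the makefiles actually build.

```python
import subprocess
import time
from pathlib import Path
from typing import Dict, Iterable, List, Set


def snapshot(root: Path, suffixes: Iterable[str]) -> Dict[Path, float]:
    """Map every file under root with a wanted suffix to its modification time."""
    wanted = tuple(suffixes)
    return {
        p: p.stat().st_mtime
        for p in root.rglob("*")
        if p.is_file() and p.suffix in wanted
    }


def changed(before: Dict[Path, float], after: Dict[Path, float]) -> Set[Path]:
    """Return paths that were added, removed, or modified between two snapshots."""
    return {p for p in before.keys() | after.keys() if before.get(p) != after.get(p)}


def watch(root: Path, suffixes: Iterable[str], command: List[str],
          interval: float = 1.0) -> None:
    """Poll root forever and run command whenever a watched file changes."""
    suffixes = tuple(suffixes)
    previous = snapshot(root, suffixes)
    while True:
        time.sleep(interval)
        current = snapshot(root, suffixes)
        if changed(previous, current):
            subprocess.run(command, check=False)
            # Snapshot again so files written by the build itself
            # do not trigger another run on the next poll.
            current = snapshot(root, suffixes)
        previous = current
```

A call might look like `watch(Path("src"), [".js", ".css"], ["make", "some-target"])`, where both the directory and the make target are placeholders. Polling is slower to react than inotify and costs a directory walk per interval, but it behaves the same on Linux and macOS.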
Contributing
Contributions are welcome - especially CSS / JavaScript improvements! But please chat before doing any real work, to make sure everyone is aligned on direction.
Development Environment
This has been developed on Linux; the makefiles may or may not work on a Mac.
Setting up the database
make python
cd db/data
make load-all
Generating json data files
You'll need to have set up the database stuff first
make generated
React
Why is there a React app per page?
Because it makes it easy to write the software
MP Data
https://www.theyworkforyou.com/mps/?f=csv
https://www.politics-social.com/list/name
Not fetched yet
http://everypolitician.org/uk/commons/download.html
Constituency Shapes
Source: Office for National Statistics licensed under the Open Government Licence v.3.0
Contains OS data © Crown copyright and database right 2021
Sewage Data
Event Duration Monitoring
https://environment.data.gov.uk/dataset/21e15f12-0df8-4bfc-b763-45226c16a8ac
https://environment.data.gov.uk/portalstg/home/item.html?id=045af51b3be545b79b0c219811d3d243
https://environment.data.gov.uk/portalstg/sharing/rest/content/items/045af51b3be545b79b0c219811d3d243/data
2022
https://environment.data.gov.uk/portalstg/home/item.html?id=2f8d9b7628dd4f60a30fb1a8483fc2ae
Consented Discharges with Conditions
https://environment.data.gov.uk/dataset/5fe5ab2e-d465-11e4-8a42-f0def148f590
https://environment.data.gov.uk/portalstg/sharing/rest/content/items/5e618f2b5c7f47cca44eb468aa2e43f0/data
Wales
Consented Discharges with Conditions
https://lle.gov.wales/catalogue/item/ConsentedDischargesToControlledWatersWithConditions/?lang=en
https://naturalresourceswales.sharefile.eu/share/view/s05adea6ab5d4df58/fo289e69-abc0-4acb-9923-271512440118
https://storage-eu-205.sharefile.com/download.ashx?dt=dt99e5eec3bd194293acd60049575d41ee&cid=9AQXBd2ldhvlRrRbQ8tE-w&zoneid=zpc3159d90-01f7-41a7-a8ab-3704157466&exp=1637152468&zsid=FB&h=F%2BC3TQBtcWx%2BYjb4jglnxmRAZLWwiRKrwDw7xn%2BoShI%3D
Event Duration Monitoring
2020 - Can't find! - Partial information at: https://www.dwrcymru.com/en/our-services/wastewater/combined-storm-overflows/valleys-and-south-east-wales
2021 - Main page: https://www.dwrcymru.com/en/our-services/wastewater/river-water-quality/combined-storm-overflows
2021 - Seems to be split over 3 files (with different formats), unknown overlap with Environment Agency data.
- https://www.dwrcymru.com/-/media/Project/Files/Page-Documents/Our-Services/Wastewater/CSO/EDM-Return-Dwr-Cymru-Welsh-Water-Emergency-Overflow-Annual-2021.ashx
- https://www.dwrcymru.com/-/media/Project/Files/Page-Documents/Our-Services/Wastewater/CSO/EDM-Return-Dwr-Cymru-Welsh-Water-Storm-Overflow-Annual-2021.ashx
- https://www.dwrcymru.com/-/media/Project/Files/Page-Documents/Our-Services/Wastewater/CSO/EDM-Return-DCWW_Wales-Water-Annual-2021.ashx
Bathing
Bathing Water Monitoring Locations
https://www.data.gov.uk/dataset/dcb8bd46-c4cf-4749-bad0-7663da96845c/bathing-waters-monitoring-locations
Name + Classification by year
Sensitive Areas Bathing
https://www.data.gov.uk/dataset/4e2bbdb4-15d3-49dc-ba22-904045b091fb/sensitive-areas-bathing-waters
https://datamap.gov.wales/layers/inspire-nrw:NRW_UWWTD_SA_BATHING_WATERS
Postcodes
https://geoportal.statistics.gov.uk/datasets/ons-postcode-directory-february-2020/about
Software
You'll need the following:
- python3
- libreoffice
- gdal-bin
Things to do
- Link with voting results - need to find the division results...
- Rivers and beaches by constituency?
- Constituency page showing all the things by constituency?
Data Quality
To be sure, the quality of the data is unbelievably poor. Perhaps it is so poor in order to make it hard to understand?
2021 Data
Issues
- Distributed as an Excel file, which is hard to process
- Should ideally be a machine-readable format. I'd suggest simple XML with a schema, but a consistent CSV file would be OK.
- Mix and match of data types
- Numeric columns have "N/A", "#N/A", and "#NA"
- Name columns contain "0"
- Percentage values scale from 0-1 in some sheets, and 0-100 in others, because some sheets have cells set to "Numeric", and others to "Percent"
- Continuation rows
- A few of the sheets don't stick to "one row per record", which is kinda mandatory in a machine readable file.
- Inconsistent data
- In particular, consent ids don't match the consent ids in the consent database, because the formatting differs
- Consent ids don't have a consistent format.
- Loads of EDM rows don't match valid consents.
- Duplicate data rows
- Some data rows are duplicated in many of the source files. It is not clear why; it looks as though an upstream database extract may have repeated rows where multiple assets share the same consent information.
- Wales data is spread over multiple files, with different formats, and may or may not overlap with EA data.
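Two of the issues above (missing-value markers in numeric columns, and percentages on two different scales) can be handled with small helpers. This is a minimal sketch, not the repository's own loader: the names `parse_number`, `to_percent` and `sheet_uses_fractions` are invented here, and treating any unparseable text as missing is an assumption.

```python
from typing import Optional

# Placeholder strings reported above in numeric columns, plus the empty cell.
MISSING_MARKERS = {"", "N/A", "#N/A", "#NA"}


def parse_number(raw: object) -> Optional[float]:
    """Return a float, or None when the cell holds a missing-value marker."""
    if raw is None:
        return None
    text = str(raw).strip()
    if text.upper() in MISSING_MARKERS:
        return None
    try:
        return float(text)
    except ValueError:
        # Assumption: other non-numeric text (e.g. "TBC") is missing, not zero.
        return None


def to_percent(raw: object, sheet_uses_fractions: bool) -> Optional[float]:
    """Normalise a percentage cell to the 0-100 scale.

    The scale is supplied per sheet rather than guessed per cell, because a
    value such as 0.5 is ambiguous on its own (0.5% or 50%).
    """
    value = parse_number(raw)
    if value is None:
        return None
    return value * 100.0 if sheet_uses_fractions else value
```

For example, `to_percent("0.25", sheet_uses_fractions=True)` and `to_percent("25", sheet_uses_fractions=False)` both give `25.0`, while `parse_number("#N/A")` gives `None`. Which sheets use which scale still has to be recorded by hand.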
Noted Improvements
- The files are now tabs in a single document with almost consistent data across the tabs.
Example Duplicate Data
'Anglian Water', 'DAVENTRY SEWER SYSTEM', 'AW5NF181', 'A1'
'Dwr Cymru Welsh Water', '#TBC', '#N/A', '', '', '', '0.25', '1', '100', ''
'Severn Trent Water', 'WITTON - GEORGE ROAD XXX (CSO)', 'TBC', '', '', '', '', '', '', ''
'South West Water', 'KINGSAND SEWAGE PUMPING STATION', '301903', 'A1', '', 'KINGSAND BEACH', '33.72', '21', '100', ''
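Rows like these can be counted and collapsed with an exact-match comparison. The sketch below assumes each row has already been read into a sequence of strings; `normalise`, `find_duplicates` and `drop_duplicates` are illustrative names, and the repeated row in the sample is added here purely to demonstrate the behaviour.

```python
from collections import Counter
from typing import Iterable, List, Sequence, Tuple

Row = Tuple[str, ...]


def normalise(row: Sequence[str]) -> Row:
    """Strip surrounding whitespace so visually identical rows compare equal."""
    return tuple(cell.strip() for cell in row)


def find_duplicates(rows: Iterable[Sequence[str]]) -> List[Tuple[Row, int]]:
    """Return each row that occurs more than once, with its occurrence count."""
    counts = Counter(normalise(row) for row in rows)
    return [(row, n) for row, n in counts.items() if n > 1]


def drop_duplicates(rows: Iterable[Sequence[str]]) -> List[Row]:
    """Keep the first occurrence of each row, preserving the input order."""
    seen = set()
    unique: List[Row] = []
    for row in rows:
        key = normalise(row)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Sample rows taken from the examples above; the first is repeated on purpose.
sample = [
    ["Anglian Water", "DAVENTRY SEWER SYSTEM", "AW5NF181", "A1"],
    ["Anglian Water", "DAVENTRY SEWER SYSTEM", "AW5NF181", "A1"],
    ["South West Water", "KINGSAND SEWAGE PUMPING STATION", "301903", "A1",
     "", "KINGSAND BEACH", "33.72", "21", "100", ""],
]
```

With this sample, `find_duplicates(sample)` reports the Anglian Water row with a count of 2, and `drop_duplicates(sample)` returns two rows. If the repeats really do stand for separate assets under one consent, dropping them loses information, so it may be safer to report them first and decide afterwards.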
Owner metadata
- Name: Top Poop
- Login: top-poop
- Email:
- Kind: user
- Description:
- Website:
- Location:
- Twitter:
- Company:
- Icon url: https://avatars.githubusercontent.com/u/93323279?u=33b53dae1cbf201b9d9b8dd1d4fd30eeb2fa73f4&v=4
- Repositories: 2
- Last synced at: 2024-05-01T10:58:58.063Z
- Profile URL: https://github.com/top-poop
GitHub Events
Total
- Fork event: 1
- Issues event: 1
- Watch event: 5
- Issue comment event: 1
- Push event: 5
Last Year
- Issues event: 1
- Watch event: 1
- Issue comment event: 1
- Push event: 4
Committers metadata
Last synced: 2 days ago
Total Commits: 279
Total Committers: 1
Avg Commits per committer: 279.0
Development Distribution Score (DDS): 0.0
Commits in past year: 10
Committers in past year: 1
Avg Commits per committer in past year: 10.0
Development Distribution Score (DDS) in past year: 0.0
| Name | Email | Commits |
|---|---|---|
| Top Poop | 9****p | 279 |
Issue and Pull Request metadata
Last synced: 23 days ago
Total issues: 20
Total pull requests: 3
Average time to close issues: about 1 year
Average time to close pull requests: 5 months
Total issue authors: 19
Total pull request authors: 2
Average comments per issue: 2.1
Average comments per pull request: 1.33
Merged pull request: 0
Bot issues: 0
Bot pull requests: 2
Past year issues: 2
Past year pull requests: 0
Past year average time to close issues: 3 months
Past year average time to close pull requests: N/A
Past year issue authors: 2
Past year pull request authors: 0
Past year average comments per issue: 3.5
Past year average comments per pull request: 0
Past year merged pull request: 0
Past year bot issues: 0
Past year bot pull requests: 0
Top Issue Authors
- V2G-EVSE (2)
- KJ-UoL (1)
- jonese1 (1)
- AmySlack (1)
- seagulljim (1)
- cleansouthernwater (1)
- FelixAJNobes (1)
- asibs (1)
- paul-hammant (1)
- petercrwilliams2013-jpg (1)
- 100Nicola (1)
- browndg (1)
- Stekkles (1)
- MadMackMcMac (1)
- Darkdeap (1)
Top Pull Request Authors
- dependabot[bot] (2)
- asibs (1)
Top Issue Labels
- data-sources (1)
Top Pull Request Labels
- dependencies (2)
Score: 3.58351893845611