solar-panel-detection
Using a combination of AI (machine vision), open data and short-term forecasting, the project aims to determine the amount of solar electricity being put into the UK grid at a given time (i.e., "right now", or "nowcasting")
https://github.com/alan-turing-institute/solar-panel-detection
Keywords
hut23 hut23-425
Last synced: over 1 year ago
Acceptance Criteria
- Relevant topics? true
- External users? true
- Open source license? false
- Active? false
- Fork? false
Repository metadata
Solar Panel Detection (Turing Climate Action Call)
- Host: GitHub
- URL: https://github.com/alan-turing-institute/solar-panel-detection
- Owner: alan-turing-institute
- Created: 2019-11-13T11:23:12.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2020-06-23T17:41:18.000Z (almost 5 years ago)
- Last Synced: 2024-01-21T03:31:51.415Z (over 1 year ago)
- Topics: hut23, hut23-425
- Language: Jupyter Notebook
- Size: 87.2 MB
- Stars: 20
- Watchers: 8
- Forks: 6
- Open Issues: 7
- Releases: 0
Metadata Files:
- Readme: README.md
README.md
Solar Panel Detection (Turing Climate Action Call)
Project code: R-SPES-115 - Enabling worldwide solar PV nowcasting via machine vision and open data
Hut23 issue: https://github.com/alan-turing-institute/Hut23/issues/425
Main Project Description
Using a combination of AI (machine vision), open data and short-term forecasting, the project aims to determine the amount of solar electricity being put into the UK grid at a given time (i.e., “right now”, or “nowcasting”).
Dan Stowell (Queen Mary) and collaborators are working with a number of datasets, each of which is incomplete and messy, to create an estimate of all solar panels in the UK and their orientation. This will involve some data wrangling to combine a number of geospatial data sources, followed by data science methods to determine the solar panel assets across the UK, and a web service to disseminate the results.
Data sources will be OpenStreetMap, whose contributors have been tagging solar panels in the UK, as well as other data provided by Sheffield Solar and Open Climate Fix. The REG would be doing most of the data wrangling and machine learning on the project, with the other partners providing data and expertise.
REG Project
Goals
- Aggregate UK solar PV data into a structured, accessible format.
- Link the tagged panels in OSM to the other data sources.
Overview of the directory structure
.
|-- admin -- project process and planning docs
|-- data
| |-- as_received -- downloaded data files
| |-- raw -- manually edited files (replace dummy data)
| |-- processed
|-- db -- database creation
|-- doc -- documentation
|-- explorations -- exploratory work
`-- notebooks
Data
Data is held in three directories: `as_received` contains the data precisely as downloaded from its original source and in its original format; `raw` contains data that has been manually restructured or reformatted to be suitable for use by software in the project (see "Using this repo" header below). `processed` contains data that may have been processed in some way, such as by Python code, but is still thought of as “source” data.
The following sources of data are used:
- OpenStreetMap - Great Britain download (Geofabrik).
- FiT - Report of installed PV (and other technologies, including wind). Hundreds of thousands of entries.
- REPD - Official UK data from the "Renewable Energy Planning Database". It contains large solar farms only.
- Machine Vision dataset - supplied by Descartes Labs (Oxford), not publicly available yet.
Project outcome
This repo includes a set of scripts that take the input datasets (REPD, OSM, FiT and machine vision, each in a different format), perform data cleaning and conversion, populate a PostgreSQL database, group data where necessary (there are duplicate entries in REPD and multiple solar farm components in OSM) and then match entries between the data tables, based on the matching criteria we have come up with.
The database creation and matching scripts should work with newer versions of the source data files, or at least do so with minimal changes to the data processing (see "Using this repo" below).
The result of matching is a table in the database called `matches` that links the unique identifiers of the data tables. This also contains a column called `match_rule`, which refers to the method by which the match was determined, as documented in doc/matching.
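Once the database has been built, a quick way to see how the matching behaved is to count rows in `matches` per `match_rule`. The following is a minimal sketch, not part of the repo: it assumes only the table and column names given above, and it works with any Python DB-API connection (for PostgreSQL this would typically come from a driver such as psycopg2, which is an assumption about your environment). The small demo at the bottom uses an in-memory SQLite table with placeholder rule values purely as a stand-in so the function can be tried without a PostgreSQL server.

```python
import sqlite3  # used only for the stand-in demo below


def count_matches_by_rule(conn):
    """Return [(match_rule, n_matches), ...], most frequent rule first.

    `conn` is any DB-API 2.0 connection to a database that contains the
    `matches` table with a `match_rule` column.
    """
    cur = conn.cursor()
    cur.execute(
        "SELECT match_rule, COUNT(*) AS n "
        "FROM matches GROUP BY match_rule "
        "ORDER BY n DESC, match_rule"
    )
    rows = cur.fetchall()
    cur.close()
    return [(rule, n) for rule, n in rows]


if __name__ == "__main__":
    # Stand-in data: the real table lives in PostgreSQL and has more columns;
    # the rule values here are placeholders, not the repo's actual rules.
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE matches (match_rule TEXT)")
    demo.executemany(
        "INSERT INTO matches VALUES (?)",
        [("rule_a",), ("rule_b",), ("rule_a",)],
    )
    for rule, n in count_matches_by_rule(demo):
        print(rule, n)
```

A rule that accounts for an unexpectedly large or small share of the matches is a reasonable place to start when reviewing the criteria described in doc/matching.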
Using this repo
Install requirements
- Install PostgreSQL
- Install Python 3 (version 3.7 or later) and `pip`
- Run `pip install -r requirements.txt`
- Install Osmium
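Before moving on, it can help to confirm that the tools used in the later steps are actually available. The sketch below is an illustration rather than something shipped with the repo: it checks the Python version against the 3.7 minimum and looks on `PATH` for the commands this README invokes (`createdb`, `psql`, `make`). Osmium is deliberately not checked, because the README does not say which Osmium package or executable is meant. The function and constant names are hypothetical.

```python
import shutil
import sys

# Command-line tools that the steps in this README invoke directly.
REQUIRED_TOOLS = ("createdb", "psql", "make")
MIN_PYTHON = (3, 7)


def check_requirements(tools=REQUIRED_TOOLS, min_python=MIN_PYTHON,
                       which=shutil.which):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < min_python:
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted} or later is required")
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if which(tool) is None:
            problems.append(f"'{tool}' was not found on PATH")
    return problems


if __name__ == "__main__":
    found = check_requirements()
    for problem in found:
        print("PROBLEM:", problem)
    if not found:
        print("All checked requirements are available.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default is sufficient.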
Download and prepare data files
- Download the following data files from the internet and store them locally. We recommend saving these original data files within the directory structure under `data/as_received`:
  - OSM PBF file (GB extract): Download
  - FiT reports: Navigate to ofgem and click the link for the latest Installation Report (during the Turing project, 30 September 2019 was used), then download the main document AND subsidiary documents
  - REPD CSV file: Download - this is always the most up to date version
  - Machine Vision dataset: supplied by Descartes labs (Oxford), not publicly available yet.
- Navigate to `submodules/compile_osm_solar` and edit the `osmsourcefpath` in `compile_osm_solar.py` so that the file path points to the OSM PBF file you downloaded. After installing the requirements in the submodule README, run `python compile_osm_solar.py`. One of the data files produced is a csv, which we use as source data. You can move this file to `data/as_received`
- Carry out manual edits to the data files, as described in doc/preprocessing, and save them in `data/raw` under the names suggested by the doc, replacing the default dummy data files.
- Navigate to `data/processed` and type `make` - this will create versions of the data files ready for import to PostgreSQL
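The preparation steps above move files through `data/as_received`, `data/raw` and `data/processed`, so a quick look at what each directory holds can catch a skipped step before running `make`. This is a minimal sketch with hypothetical helper names; it assumes it is run from the repository root and makes no assumption about the file names themselves, which are defined in doc/preprocessing.

```python
from pathlib import Path

# The three data directories described in this README.
DATA_DIRS = ("data/as_received", "data/raw", "data/processed")


def missing_data_dirs(repo_root="."):
    """Return the data directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in DATA_DIRS if not (root / d).is_dir()]


def list_stage_files(repo_root=".", stage="data/as_received"):
    """Return the sorted names of regular files in one data directory."""
    folder = Path(repo_root) / stage
    return sorted(p.name for p in folder.iterdir() if p.is_file())


if __name__ == "__main__":
    absent = missing_data_dirs()
    for d in absent:
        print("Missing directory:", d)
    for d in DATA_DIRS:
        if d not in absent:
            print(d, "->", list_stage_files(stage=d))
```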
Run the database creation and data matching
- Make sure you have PostgreSQL on your machine, then run the command: `createdb hut23-425 "Solar PV database matching"` - this creates the empty database.
- Navigate to `db` and run the command `psql -f make-database.sql hut23-425` - this populates the database (see doc/database), carries out some de-duplication of the datasets and performs the matching procedure (see doc/matching). Note: this may take several minutes.
Note that the above commands require you to have admin rights on your PostgreSQL server. On standard Debian-based machines you could prepend the commands with `sudo -u postgres`, or you could assign privileges to your own user account.
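The two database commands can also be driven from Python, which makes the optional `sudo -u postgres` prefix easy to toggle. The sketch below is an illustration under stated assumptions, not a script from the repo: it assumes it is run from the repository root, the function names are hypothetical, and by default it only prints the commands (a dry run) rather than executing them.

```python
import shlex
import subprocess

DB_NAME = "hut23-425"
DB_COMMENT = "Solar PV database matching"


def build_commands(as_postgres_user=False):
    """Return (argv, working_directory) pairs for the two database steps."""
    prefix = ["sudo", "-u", "postgres"] if as_postgres_user else []
    return [
        # Step 1: create the empty database (run from the repo root).
        (prefix + ["createdb", DB_NAME, DB_COMMENT], "."),
        # Step 2: populate, de-duplicate and match (run from the db directory).
        (prefix + ["psql", "-f", "make-database.sql", DB_NAME], "db"),
    ]


def run_all(as_postgres_user=False, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for argv, cwd in build_commands(as_postgres_user):
        print(f"(in {cwd}) " + " ".join(shlex.quote(arg) for arg in argv))
        if not dry_run:
            # check=True stops at the first failing command.
            subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    run_all(as_postgres_user=False, dry_run=True)
```

Passing the arguments as a list avoids shell quoting problems with the database comment, which contains spaces.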
External collaborators guidance
From April 2020 this repo is no longer under active development. However, a fork of the project is being created by Open Climate Fix, and you can open issues and pull requests there.
Owner metadata
- Name: The Alan Turing Institute
- Login: alan-turing-institute
- Email: [email protected]
- Kind: organization
- Description: The UK's national institute for data science and artificial intelligence.
- Website: https://turing.ac.uk
- Location:
- Twitter:
- Company:
- Icon url: https://avatars.githubusercontent.com/u/18304793?v=4
- Repositories: 242
- Last synced at: 2023-02-25T16:15:56.736Z
- Profile URL: https://github.com/alan-turing-institute
GitHub Events
Total
- Issues event: 56
- Watch event: 21
- Delete event: 24
- Issue comment event: 18
- Member event: 2
- Push event: 213
- Pull request review comment event: 4
- Pull request event: 60
- Fork event: 5
- Create event: 27
Last Year
- Watch event: 3
Committers metadata
Last synced: over 1 year ago
Total Commits: 297
Total Committers: 4
Avg Commits per committer: 74.25
Development Distribution Score (DDS): 0.162
Commits in past year: 0
Committers in past year: 0
Avg Commits per committer in past year: 0.0
Development Distribution Score (DDS) in past year: 0.0
Name | Email | Commits |
---|---|---|
Ed Chalstrey | e****y@M****k | 249 |
James Geddes | j****s@t****k | 32 |
Ed Chalstrey | e****y@g****m | 11 |
Dan Stowell | d****l@u****t | 5 |
Issue and Pull Request metadata
Last synced: over 1 year ago
Total issues: 31
Total pull requests: 31
Average time to close issues: 19 days
Average time to close pull requests: 1 day
Total issue authors: 2
Total pull request authors: 3
Average comments per issue: 0.35
Average comments per pull request: 0.19
Merged pull request: 28
Bot issues: 0
Bot pull requests: 0
Past year issues: 0
Past year pull requests: 0
Past year average time to close issues: N/A
Past year average time to close pull requests: N/A
Past year issue authors: 0
Past year pull request authors: 0
Past year average comments per issue: 0
Past year average comments per pull request: 0
Past year merged pull request: 0
Past year bot issues: 0
Past year bot pull requests: 0
Top Issue Authors
- edwardchalstrey1 (28)
- triangle-man (3)
Top Pull Request Authors
- edwardchalstrey1 (24)
- triangle-man (4)
- danstowell (3)
Top Issue Labels
- enhancement (2)
- bug (1)
Top Pull Request Labels
Score: 4.68213122712422