Global Flood Database Scripts & Data
Used to produce the Global Flood Database and assess changes in population exposed to floods.
https://github.com/cloudtostreet/MODIS_GlobalFloodDatabase
Category: Climate Change
Sub Category: Natural Hazard and Storm
Repository metadata
This repository contains the code used to produce the Global Flood Database and assess changes in population exposed to floods.
- Host: GitHub
- URL: https://github.com/cloudtostreet/MODIS_GlobalFloodDatabase
- Owner: cloudtostreet
- License: mit
- Created: 2021-04-23T18:06:02.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2021-09-08T15:23:10.000Z (over 3 years ago)
- Last Synced: 2025-04-17T20:38:09.015Z (11 days ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 6.67 MB
- Stars: 104
- Watchers: 8
- Forks: 28
- Open Issues: 3
- Releases: 0
Metadata Files:
- Readme: README.html
- License: LICENSE
README.html
Global Flood Database Scripts & Data
This repository includes code and supporting data for the Global Flood Database. Below are descriptions of the data and code and how they relate to Tellman et al., "Satellite observations indicate increasing proportion of population exposed to floods".
Data
Flood Maps
The flood maps (.tif files) can be accessed through a visualization and data portal at: http://global-flood-database.cloudtostreet.info/
You can also download the entire database as GeoTIFF files directly from the Google Cloud Storage (GCS) bucket "gfd_v3" using gsutil. The following command copies the entire bucket to a local directory (the -r flag makes the copy recursive, which gsutil requires when copying a whole bucket):
gsutil -m cp -r gs://gfd_v3 your/local/directory/to/save/to
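For scripted downloads, the same gsutil call can be wrapped in Python. This is only a minimal sketch: the helper names build_download_command and download_gfd are hypothetical and not part of the repository, and it assumes gsutil is installed and on the PATH. It passes -r so that the copy recurses through the bucket.

```python
import shutil
import subprocess

GFD_BUCKET = "gs://gfd_v3"  # bucket name as given in the README


def build_download_command(dest_dir, bucket=GFD_BUCKET):
    """Return the gsutil argument list for a parallel, recursive copy.

    Hypothetical helper: -m runs transfers in parallel and -r recurses
    through the whole bucket.
    """
    return ["gsutil", "-m", "cp", "-r", bucket, dest_dir]


def download_gfd(dest_dir):
    """Run the download; raises if gsutil is missing or the copy fails."""
    if shutil.which("gsutil") is None:
        raise RuntimeError("gsutil not found on PATH")
    subprocess.run(build_download_command(dest_dir), check=True)
```

Calling download_gfd("your/local/directory/to/save/to") would start the transfer. The full database is large, so it may be worth checking available disk space first.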
Flood Mapping
- data\shp_files\dfo_polys_20191203.shp: the Dartmouth Flood Observatory (DFO) flood polygon dataset used in our analyses and processing of satellite imagery.
- data\gfd_qcdatabase_2019_08_01.csv: the Quality Control (QC) database described in Tellman et al.
Validation
- data\gfd_validation_points_2018_12_17.csv: validation data for 123 selected flood events, including the geolocation of each assessment point, the classified data for different methods (e.g. 3day Standard), analyst initials, and spectral data from the interpretation imagery (i.e. Landsat-5, 7 & 8). Field values are explained in main_validation.pynb (see below).
- data\gfd_validation_sensitivity.csv: up to 400 assessed validation points for selected flood events, used to test the appropriate sampling intensity.
- data\gfd_validation_metrics.csv: summarized validation metrics (e.g. commission error) for each validation flood.
- data\sample_frame_CC20_D1_051618.csv: a summary of available Landsat images (5, 7 & 8) for each flood event, used to determine which flood events can be used to collect validation data. The field DELTA is the number of days following max flood extent, CLOUD_COVER is the maximum allowable percent cloud cover for a validation image, and X and Y are the coordinates of the flood event centroid from the DFO polygon.
Exposed Population Estimates
- data\SSP2010.csv: 2010 population estimates from SSP2 (a Shared Socioeconomic Pathways scenario).
- data\SSP2030.csv: 2030 population estimates from SSP2 (a Shared Socioeconomic Pathways scenario).
- data\aqueductcountrydata.csv: WRI Aqueduct flood exposure estimates for various return periods for 2010 and 2030.
- data\aqueduct_dictionary.xlsx: data dictionary explaining the columns in the WRI Aqueduct flood exposure estimates.
- data\gfd_popsummary.csv: Global Flood Database population exposure estimates per country, in 2000 and 2015, and associated statistics.
- data\GFDabove_13_wBias.csv: Global Flood Database population exposure estimates per country, in 2000 and 2015, with a bias correction factor based on comparison to HRSL data.
- data\gfd_popdictionary.xlsx: data dictionary explaining the columns in the Global Flood Database exposure estimates.
- Population Exposed Per Event: population exposure estimate per event. To access, click the INFO button on our data portal at: http://global-flood-database.cloudtostreet.info/
- Population Exposed Per Country Per Event: population exposure estimates per country by event. To access, click the INFO button on our data portal at: http://global-flood-database.cloudtostreet.info/
Pop Sensitivity & Uncertainty
- data\gfd_popsensitivity.csv: Global Flood Database population exposure estimates per country using the Global Human Settlement Layer (GHSL), High Resolution Settlement Layer (HRSL) and GridPop3. Countries are limited to those with HRSL data.
Flood Mechanism
- data\gfd_floodmechanism.csv: Global Flood Database disaggregated by "flood type" (data from the Dartmouth Flood Observatory), with estimated population exposure in 2000 and 2015.
Code
Our code includes modules written in Python, JavaScript and R. The JavaScript code is stored as .txt files (.js files are prohibited as Gmail attachments) and can be run by copying and pasting it into Google Earth Engine's code editor. Python scripts are based on Google Earth Engine's Python API, which must be installed before running them. The R code requires a publicly available installation of R or RStudio.
Below is a short description of the scripts in our repository and how they relate to Tellman et al., "Satellite observations indicate increasing proportion of population exposed to floods".
Flood Mapping
- main_gfd.py: uses the GEE Python API to create flood maps for each Dartmouth Flood Observatory flood event. This script relies on modules found in the flood_detection folder. The exports are stored in Google Cloud Storage, which can be accessed as described above.
Validation
- gee_sampleFrameLandsat.txt: uses the GEE Code Editor to determine which floods have Landsat imagery available within 1 day of the max extent of a flood event. This code produces data\sample_frame_CC20_D1_051618.csv.
- gee_validationGUI.txt: uses the GEE Code Editor to collect validation data with a custom tool designed in GEE that retrieves a flood event and coincident Landsat imagery and creates a stratified sample. An example of our validation GUI can be seen below in Figure 1. Analysts then interpret sample points based on Landsat imagery, and the results are recorded. This code relies upon the gee_landsatTools.txt and gee_misc.txt sub-modules. The assessment points from each analyst were stored in Google Cloud Storage and are compiled in data\gfd_validation_points_2018_12_17.csv.
- main_validation.pynb: uses the accuracy assessment points (i.e. data\gfd_validation_points_2018_12_17.csv) to calculate various accuracy metrics, including omission and commission errors. The results are stored in data\gfd_validation_metrics.csv. This script also analyzes the validation sensitivity (Extended Data Fig 8).
Exposed Population Estimates
- main_popstats.py: uses the GEE Python API to estimate exposed populations for each flood event and country. This script relies on modules found in the flood_stats folder. Outputs are available on our data portal by clicking on the INFO button. These population estimates do not filter out isolated pixels as described in the methods.
- main_popchange.txt: uses the GEE Code Editor to calculate population change in areas of observed inundation from the GFD between 2000 and 2015 for each country. This method removes isolated pixels for a conservative estimate of change. This script yields data\gfd_popsummary.csv. Additional fields in data\gfd_popsummary.csv are described in data\gfd_popdictionary.xlsx.
- ext.datafig10.R: used to make Extended Data Figure 10, which compares, at the country scale, the population exposed to at least one flood event between 2000 and 2018 in the Global Flood Database with the population exposed to floods in 2010 under the WRI Aqueduct 100-year return period.
- ext.datafig7.R: used to make Extended Data Figure 7, which is a sensitivity analysis of the proportion of population exposed to floods under climate change and population growth across return periods.
- main_gfdsummarystats.R: used to generate summary statistics from the Global Flood Database for the paper.
Pop Sensitivity & Uncertainty
- main_popsensitivity.txt: uses the GEE Code Editor to calculate population exposure using the Global Human Settlement Layer (GHSL), High Resolution Settlement Layer (HRSL) and GridPop3. This method removes isolated pixels for a conservative estimate of change. This script yields per-region files that are later compiled into data\gfd_popsensitivity.csv.
- main_sensitivityanalysis.R: R script that compiles the individual region files generated by main_popsensitivity.txt and then calculates a bias factor. It additionally joins the bias factor to a number of datasets, including data\gfd_popsummary.csv and data\gfd_floodmechanism.csv.
- uncertaintyanalysis.R: R script that estimates uncertainty in population trend estimates per country using the population dataset data\GFDabove_13_wBias.csv. It identifies countries we deem uncertain and reproduces Figure 2 in the Supplementary Discussion. It also recalculates the global flood exposure trend analysis after removing the "uncertain" countries.
Flood Mechanism
- main_floodmechanism.txt: uses the GEE Code Editor to disaggregate the Global Flood Database into flood plains representing different causes/drivers. Population exposure is calculated using the Global Human Settlement Layer (GHSL) for 2000 and 2015. This script yields per-mechanism files that are later compiled into data\gfd_floodmechanism.csv.
Figure 1. Example of the GFD Validation GUI
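As an illustration of the omission and commission errors mentioned for main_validation.pynb above, the sketch below computes both from paired classified and reference labels. It is not the notebook's code: the function name accuracy_metrics and the 0/1 label encoding are assumptions, and the formulas are the standard remote-sensing definitions for a single "flood" class (commission = false positives / all points classified as flood; omission = false negatives / all reference flood points), which may differ in detail from what the notebook does.

```python
def accuracy_metrics(classified, reference):
    """Omission and commission error for the flood class.

    classified, reference: equal-length sequences of 0/1 labels
    (1 = flood). The encoding is an assumption for this sketch.
    """
    if len(classified) != len(reference):
        raise ValueError("label sequences must have the same length")

    pairs = list(zip(classified, reference))
    tp = sum(1 for c, r in pairs if c == 1 and r == 1)  # flood mapped and observed
    fp = sum(1 for c, r in pairs if c == 1 and r == 0)  # flood mapped, not observed
    fn = sum(1 for c, r in pairs if c == 0 and r == 1)  # flood observed, not mapped

    # Guard against empty denominators (no mapped or no reference flood points).
    commission = fp / (tp + fp) if (tp + fp) else float("nan")
    omission = fn / (tp + fn) if (tp + fn) else float("nan")
    return {"omission": omission, "commission": commission}


# Eight made-up assessment points, for illustration only.
classified = [1, 1, 1, 0, 0, 1, 0, 0]
reference = [1, 1, 0, 1, 0, 1, 0, 0]
metrics = accuracy_metrics(classified, reference)
```

With these made-up labels there are three true positives, one false positive and one false negative, so both errors come out to 0.25.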
Owner metadata
- Name: Floodbase (formerly Cloud to Street)
- Login: cloudtostreet
- Email: [email protected]
- Kind: organization
- Description: Flood tracking for disasters and insurance.
- Website: https://boards.greenhouse.io/floodbase/
- Location:
- Twitter: floodbase
- Company:
- Icon url: https://avatars.githubusercontent.com/u/33142951?v=4
- Repositories: 9
- Last synced at: 2024-04-23T18:21:20.261Z
- Profile URL: https://github.com/cloudtostreet
GitHub Events
Total
- Issues event: 2
- Watch event: 10
- Issue comment event: 1
- Fork event: 1
Last Year
- Issues event: 2
- Watch event: 10
- Issue comment event: 1
- Fork event: 1
Committers metadata
Last synced: 7 days ago
Total Commits: 8
Total Committers: 3
Avg Commits per committer: 2.667
Development Distribution Score (DDS): 0.5
Commits in past year: 0
Committers in past year: 0
Avg Commits per committer in past year: 0.0
Development Distribution Score (DDS) in past year: 0.0
Name | Email | Commits |
---|---|---|
Beth Tellman | 6****n | 4 |
Colin Doyle | c****e@g****m | 3 |
Cole Erickson | 6****t | 1 |
Committer domains:
Issue and Pull Request metadata
Last synced: 2 days ago
Total issues: 8
Total pull requests: 0
Average time to close issues: 4 days
Average time to close pull requests: N/A
Total issue authors: 5
Total pull request authors: 0
Average comments per issue: 0.88
Average comments per pull request: 0
Merged pull request: 0
Bot issues: 0
Bot pull requests: 0
Past year issues: 5
Past year pull requests: 0
Past year average time to close issues: about 2 hours
Past year average time to close pull requests: N/A
Past year issue authors: 2
Past year pull request authors: 0
Past year average comments per issue: 0.6
Past year average comments per pull request: 0
Past year merged pull request: 0
Past year bot issues: 0
Past year bot pull requests: 0
Top Issue Authors
- MathildeBossut (4)
- sbaber1 (1)
- yxy-biubiubiu (1)
- ReptarK (1)
- mgmanalili (1)
Top Pull Request Authors
Top Issue Labels
Top Pull Request Labels
Score: 5.771441123130016