Global Flood Monitor
A global database of historic and real-time flood events based on social media.
https://github.com/jensdebruijn/global-flood-monitor
Category: Climate Change
Sub Category: Natural Hazard and Storm
Last synced: about 21 hours ago
JSON representation
Repository metadata
A global database of historic and real-time flood events based on social media
- Host: GitHub
- URL: https://github.com/jensdebruijn/global-flood-monitor
- Owner: jensdebruijn
- License: mit
- Created: 2019-12-09T09:03:56.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2023-06-22T07:23:33.000Z (almost 2 years ago)
- Last Synced: 2025-03-15T12:06:53.053Z (about 1 month ago)
- Language: Python
- Size: 28.4 MB
- Stars: 26
- Watchers: 3
- Forks: 9
- Open Issues: 0
- Releases: 0
Metadata Files:
- Readme: readme.md
- License: LICENSE
readme.md
Abstract
Early event detection and response can significantly reduce the societal impact of floods. Currently, early warning systems rely on gauges, radar data, models and informal local sources. However, the scope and reliability of these systems are limited. Recently, the use of social media for detecting disasters has shown promising results, especially for earthquakes. Here, we present a new database for detecting floods in real-time on a global scale using Twitter. The method was developed using 88 million tweets, from which we derived over 10,000 flood events (i.e., flooding occurring in a country or first-order administrative subdivision) across 176 countries in 11 languages in just over four years. Using strict parameters, validation shows that approximately 90% of the events were correctly detected. In countries where the first official language is included, our algorithm detected 63% of the events in the NatCatSERVICE disaster database at admin 1 level. Moreover, a large number of flood events not included in NatCatSERVICE are detected. All results are publicly available on www.globalfloodmonitor.org.
Cite as
de Bruijn, J.A., de Moel, H., Jongman, B. et al. A global database of historic and real-time flood events based on social media. Sci Data 6, 311 (2019). doi:10.1038/s41597-019-0326-9
How to run
- Setup
  - Install Python (3.6+) and all modules in `requirements.txt`.
  - Install PostgreSQL (tested with 12) and PostGIS (tested with 3.0).
  - Set all parameters in `config.py`. This includes `TWITTER_CONSUMER_KEY`, `TWITTER_CONSUMER_SECRET`, `TWITTER_ACCESS_TOKEN` and `TWITTER_ACCESS_TOKEN_SECRET`, for which you will need to register as a Twitter developer.
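The credential parameters above could be kept out of version control by reading them from environment variables. This is a hypothetical sketch of what that part of `config.py` might look like, not the repository's actual file; the variable names match those listed in the setup step.

```python
# Hypothetical config.py fragment: read Twitter developer credentials
# from the environment, falling back to empty strings when unset.
import os

TWITTER_CONSUMER_KEY = os.environ.get("TWITTER_CONSUMER_KEY", "")
TWITTER_CONSUMER_SECRET = os.environ.get("TWITTER_CONSUMER_SECRET", "")
TWITTER_ACCESS_TOKEN = os.environ.get("TWITTER_ACCESS_TOKEN", "")
TWITTER_ACCESS_TOKEN_SECRET = os.environ.get("TWITTER_ACCESS_TOKEN_SECRET", "")
```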
- Preparing data & preprocessing
  - Obtain a high-resolution population raster (e.g., LandScan Global), convert it to GeoTIFF (e.g., `gdal_translate w001001.adf population.tif -co "COMPRESS=LZW"`) and place it in `input/maps/population.tif`.
  - Set all parameters in `config.py`.
  - Create the Elasticsearch index for tweets using `create_index.py`. This script automatically uses the proper index settings (see `input/es_document_index_settings.json`).
  - Fill the index with tweets (an example for reading tweets from jsonlines into the database is given in `fill_es.py`). This assumes that each line of the file `input/example.jsonl` contains one JSON object obtained from the Twitter API.
  - Run `preprocessing.py`.
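The jsonlines-to-index step could be sketched as follows. This is an illustration, not the repository's `fill_es.py`: it only parses newline-delimited tweets into the action dictionaries that the `elasticsearch` library's bulk helper accepts, and the `"id_str"` field is assumed to be present as in the Twitter v1.1 API.

```python
import json

def jsonl_to_bulk_actions(jsonl_text, index_name="tweets"):
    """Turn newline-delimited Twitter API JSON objects into
    Elasticsearch bulk actions, one per tweet, keyed by tweet id."""
    actions = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        tweet = json.loads(line)
        actions.append({
            "_index": index_name,
            "_id": tweet["id_str"],   # assumed tweet id field
            "_source": tweet,
        })
    return actions

# Two toy tweets in the one-JSON-object-per-line format described above.
sample = ('{"id_str": "1", "text": "Severe flooding downtown"}\n'
          '{"id_str": "2", "text": "River levels rising fast"}\n')
actions = jsonl_to_bulk_actions(sample)
```

The resulting `actions` list could then be passed to `elasticsearch.helpers.bulk` to fill the index.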
- Creating the text classifier
  - Hydrate the labelled data (`input/labeled_tweets.xlsx`) by running `hydrate.py`. This creates a new file with additional data obtained from the Twitter API, including the tweets' texts, in `input/labeld_tweets_hydrated.xlsx`. Don't forget to set the Twitter developer tokens in `config.py`.
  - Train the classifier by running `train_text_classifier.py`. This script exports the trained classifier to `input/classifier`.
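To make the training step concrete, here is a minimal stand-in for a flood-tweet text classifier using scikit-learn. The repository's actual classifier is transformer-based (see the `transformers` dependency), so this sketch only illustrates the labelled-tweets-in, fitted-model-out workflow; the toy texts and labels are invented.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labelled tweets (1 = flood-related, 0 = not). The real training
# data comes from input/labeled_tweets.xlsx after hydration.
texts = [
    "streets flooded after heavy rain",
    "river burst its banks, homes under water",
    "flood warning issued for the valley",
    "great sunny day at the beach",
    "watching the football match tonight",
    "new coffee shop opened downtown",
]
labels = [1, 1, 1, 0, 0, 0]

# TF-IDF features plus logistic regression as a simple baseline.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)
pred = clf.predict(["flood water everywhere after the rain"])[0]
```

The fitted pipeline could be persisted with `joblib.dump`, analogous to the exported model in `input/classifier`.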
- Finding time corrections per region
  - First, run only the localization algorithm TAGGS to derive the number of localized tweets per hour of the day (see paper). To do so, run the main script with detection disabled: `main.py --detection false`.
  - Run `get_time_correction.py`. This creates a new file, `input/time_correction.json`.
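The idea behind the hourly correction can be sketched as follows. This is an assumption about the general shape of such a correction, not the actual computation in `get_time_correction.py`: hours of the day with systematically fewer localized tweets get a larger multiplicative factor, so that activity spikes are comparable across hours.

```python
from statistics import mean

def hourly_time_correction(counts_per_hour):
    """Given observed localized-tweet counts for each hour of the day
    (24 values), return a multiplicative correction factor per hour:
    below-average hours get a factor > 1, busy hours a factor < 1."""
    avg = mean(counts_per_hour)
    return [avg / c if c else 1.0 for c in counts_per_hour]

# Invented diurnal pattern: quiet at night, busy during the day.
counts = [5, 3, 2, 2, 2, 3, 6, 10, 14, 16, 18, 20,
          20, 19, 18, 17, 16, 15, 14, 13, 12, 10, 8, 6]
factors = hourly_time_correction(counts)
```

A mapping like `{hour: factor}` is a plausible shape for the contents of `input/time_correction.json`.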
- Run the Global Flood Monitor
  - Finally, run `main.py` without arguments to run the Global Flood Monitor. The resulting events are stored in the PostgreSQL database.
Contact
Jens de Bruijn -- j.a.debruijn at outlook dot com
Owner metadata
- Name: Jens de Bruijn
- Login: jensdebruijn
- Email:
- Kind: user
- Description:
- Website: www.globalfloodmonitor.org
- Location: Amsterdam
- Twitter:
- Company:
- Icon url: https://avatars.githubusercontent.com/u/2176353?v=4
- Repositories: 8
- Last synced at: 2024-06-11T15:38:56.734Z
- Profile URL: https://github.com/jensdebruijn
GitHub Events
Total
- Fork event: 1
Last Year
- Fork event: 1
Committers metadata
Last synced: 7 days ago
Total Commits: 15
Total Committers: 1
Avg Commits per committer: 15.0
Development Distribution Score (DDS): 0.0
Commits in past year: 1
Committers in past year: 1
Avg Commits per committer in past year: 1.0
Development Distribution Score (DDS) in past year: 0.0
Name | Email | Commits
---|---|---
Jens de Bruijn | j****n@o****m | 15
Committer domains:
Issue and Pull Request metadata
Last synced: 2 days ago
Total issues: 0
Total pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Total issue authors: 0
Total pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull request: 0
Bot issues: 0
Bot pull requests: 0
Past year issues: 0
Past year pull requests: 0
Past year average time to close issues: N/A
Past year average time to close pull requests: N/A
Past year issue authors: 0
Past year pull request authors: 0
Past year average comments per issue: 0
Past year average comments per pull request: 0
Past year merged pull request: 0
Past year bot issues: 0
Past year bot pull requests: 0
Top Issue Authors
Top Pull Request Authors
Top Issue Labels
Top Pull Request Labels
Dependencies
- TwitterAPI *
- dill *
- elasticsearch *
- gdal *
- geopy *
- keras-radam *
- nltk *
- numpy *
- pandas *
- psycopg2 *
- pytz *
- scikit-learn *
- tensorflow-gpu *
- transformers *
- xlrd *
Score: 3.258096538021482