WOUDC Data Registry
A platform that manages Ozone and Ultraviolet Radiation data in support of the World Ozone and Ultraviolet Radiation Data Centre (WOUDC), one of six World Data Centres as part of the Global Atmosphere Watch programme of the WMO.
https://github.com/woudc/woudc-data-registry
Category: Atmosphere
Sub Category: Radiative Transfer
Keywords
gaw ozone ozonesonde spectral totalozone ultraviolet umkehr uv wmo
Keywords from Contributors
downscaled
Repository metadata
- Host: GitHub
- URL: https://github.com/woudc/woudc-data-registry
- Owner: woudc
- License: other
- Created: 2017-07-26T01:36:59.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2025-04-08T22:17:29.000Z (19 days ago)
- Last Synced: 2025-04-13T00:32:05.731Z (14 days ago)
- Topics: gaw, ozone, ozonesonde, spectral, totalozone, ultraviolet, umkehr, uv, wmo
- Language: Python
- Homepage: https://woudc.org
- Size: 794 KB
- Stars: 4
- Watchers: 2
- Forks: 9
- Open Issues: 0
- Releases: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Support: docs/support.rst
README.md
WOUDC Data Registry
Overview
WOUDC Data Registry is a platform that manages ozone and ultraviolet
radiation data in support of the World Ozone and Ultraviolet Radiation Data
Centre (WOUDC), one of six World Data Centres as part of
the Global Atmosphere Watch programme of the
WMO.
Installation
Requirements
- Python 3 and above
- virtualenv
- Elasticsearch (5.5.0 and above)
- woudc-extcsv
Dependencies
Dependencies are listed in requirements.txt. Dependencies
are automatically installed during installation.
Installing woudc-data-registry
# setup virtualenv
python3 -m venv woudc-data-registry_env
cd woudc-data-registry_env
source bin/activate
# clone woudc-extcsv and install
git clone https://github.com/woudc/woudc-extcsv.git
cd woudc-extcsv
pip install -r requirements.txt
pip install .
cd ..
# clone codebase and install
git clone https://github.com/woudc/woudc-data-registry.git
cd woudc-data-registry
pip install .
# optional: for PostgreSQL backends
pip install -r requirements-pg.txt
# set system environment variables
cp default.env foo.env
vi foo.env # edit database connection parameters, etc.
. foo.env
Initializing the Database
# create database
make ENV=foo.env createdb
# drop database
make ENV=foo.env dropdb
# show configuration
woudc-data-registry admin config
# initialize model (database tables)
woudc-data-registry admin registry setup
# initialize search engine
woudc-data-registry admin search setup
# load core metadata
woudc-data-registry admin init -d data/
# cleanups
# re-initialize model (database tables)
woudc-data-registry admin registry teardown
woudc-data-registry admin registry setup
# re-initialize search engine
woudc-data-registry admin search teardown
woudc-data-registry admin search setup
# if required, reinitialize the StationDobsonCorrections table and index
woudc-data-registry admin setup-dobson-correction -d data/
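The first-time initialization steps above could be scripted so that they always run in the same order. The following is a minimal sketch and not part of woudc-data-registry itself: the helper names `init_commands` and `run_all` are hypothetical, and only the command lines are taken from the steps above.

```python
import subprocess


def init_commands(data_dir="data/"):
    """Return the initialization commands, in order, as argv lists."""
    cli = "woudc-data-registry"
    return [
        [cli, "admin", "registry", "setup"],     # initialize model (database tables)
        [cli, "admin", "search", "setup"],       # initialize search engine
        [cli, "admin", "init", "-d", data_dir],  # load core metadata
    ]


def run_all(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it.

    check=True makes subprocess raise on a non-zero exit status,
    so the sequence stops at the first failing step.
    """
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Dry run by default: shows what would be executed without touching anything.
    run_all(init_commands(), dry_run=True)
```

Keeping the default at `dry_run=True` is a deliberate choice in this sketch, since the setup commands modify the database and search index.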
Running woudc-data-registry
TIP: autocompletion can be made available in some shells via:
eval "$(_WOUDC_DATA_REGISTRY_COMPLETE=source woudc-data-registry)"
Core Metadata Management
# list all instances of foo (where foo is one of:
# project|dataset|contributor|country|station|instrument|deployment)
woudc-data-registry <foo> list
# e.g.
woudc-data-registry contributor list
# show a specific instance of foo with a given registry identifier
woudc-data-registry <foo> show <identifier>
# e.g.
woudc-data-registry station show 023
woudc-data-registry instrument show ECC:2Z:4052:002:OzoneSonde
# add a new instance of foo (contributor|country|station|instrument|deployment)
woudc-data-registry <foo> add <options>
# e.g.
woudc-data-registry deployment add -s 001 -c MSC:WOUDC
woudc-data-registry contributor add -id foo -n "Contributor name" -c Canada -w IV -u https://example.org -e <email> -f foouser -g -75,45
# update an existing instance of foo with a given registry identifier
woudc-data-registry <foo> update -id <identifier> <options>
# e.g.
woudc-data-registry station update -id <identifier> -n "New station name"
woudc-data-registry deployment update -id <identifier> --end-date 'Deployment end date'
# delete an instance of foo with a given registry identifier
woudc-data-registry <foo> delete <identifier>
# e.g.
woudc-data-registry deployment delete 018:MSC:WOUDC
# for more information about options on operation (add|update):
woudc-data-registry <foo> <operation> --help
# e.g.
woudc-data-registry instrument update --help
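When these commands are driven from a script, it can help to validate the entity name before calling the CLI. The sketch below assumes only what the comments above state: which entities can be listed and which can be added. The function name `metadata_command` is hypothetical, and the sketch does not check per-entity options.

```python
# Entity names as listed in the comments above.
LISTABLE = {"project", "dataset", "contributor", "country",
            "station", "instrument", "deployment"}
# The "add" comment above omits project and dataset.
ADDABLE = LISTABLE - {"project", "dataset"}
OPERATIONS = {"list", "show", "add", "update", "delete"}


def metadata_command(entity, operation, *args):
    """Build the argv list for `woudc-data-registry <entity> <operation> ...`."""
    if entity not in LISTABLE:
        raise ValueError(f"unknown entity: {entity}")
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation}")
    if operation == "add" and entity not in ADDABLE:
        raise ValueError(f"'add' is not listed for {entity}")
    return ["woudc-data-registry", entity, operation, *args]


if __name__ == "__main__":
    print(metadata_command("station", "show", "023"))
    print(metadata_command("deployment", "add", "-s", "001", "-c", "MSC:WOUDC"))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting problems with values such as station names containing spaces.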
Data Processing
# ingest directory of files (walks directory recursively)
woudc-data-registry data ingest /path/to/dir
# ingest single file
woudc-data-registry data ingest foo.dat
# ingest without prompting for permission checks
woudc-data-registry data ingest foo.dat -y
# verify directory of files (walks directory recursively)
woudc-data-registry data verify /path/to/dir
# verify single file
woudc-data-registry data verify foo.dat
# verify core metadata only
woudc-data-registry data verify foo.dat -l
# ingest with only core metadata checks
woudc-data-registry data ingest /path/to/dir -l
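The flag combinations above can be captured in a small helper for use in automation. This is a sketch under stated assumptions: the function name `data_command` and its keyword arguments are hypothetical, and because `-y` appears above only with `data ingest`, the sketch rejects it for `data verify` (the CLI itself may behave differently).

```python
def data_command(action, path, core_only=False, assume_yes=False):
    """Build the argv list for `woudc-data-registry data ingest|verify`."""
    if action not in ("ingest", "verify"):
        raise ValueError(f"unsupported action: {action}")
    if assume_yes and action != "ingest":
        # Assumption: -y is only documented for ingest.
        raise ValueError("-y is only shown with 'data ingest'")
    argv = ["woudc-data-registry", "data", action, path]
    if core_only:
        argv.append("-l")  # core metadata checks only
    if assume_yes:
        argv.append("-y")  # skip the permission prompt
    return argv


if __name__ == "__main__":
    # Verify a directory with core metadata checks, then ingest it unattended.
    print(data_command("verify", "/path/to/dir", core_only=True))
    print(data_command("ingest", "/path/to/dir", assume_yes=True))
```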
Search Index Generation
# sync all data and metadata tables (except data product tables) to Elasticsearch
woudc-data-registry admin search sync
# sync the data product tables (uv_index_hourly, totalozone, and ozonesonde) to Elasticsearch
woudc-data-registry admin search product-sync
UV Index Generation
# tear down and generate the entire uv_index_hourly table
woudc-data-registry product uv-index generate /path/to/archive/root
# Only generate uv_index_hourly records within year range
woudc-data-registry product uv-index update -sy start-year -ey end-year /path/to/archive/root
Total Ozone Generation
# tear down and generate the entire totalozone table
woudc-data-registry product totalozone generate /path/to/archive/root
OzoneSonde Generation
# tear down and generate the entire ozonesonde table
woudc-data-registry product ozonesonde generate /path/to/archive/root
Report Generation
The woudc-data-registry data ingest command accepts a -r/--report flag, which takes the path to a directory. When that flag is provided, an operator report and a run report are automatically written to that directory while the files are being processed.
woudc-data-registry data ingest /path/to/dir -r /path/to/reports/location
The run report has the filename run_report. The file contains a series of blocks, one per contributor in a processing run, in the following format:
<contributor acronym>
<status>: <filepath>
<status>: <filepath>
<status>: <filepath>
...
Where <status> is either Pass or Fail, depending on how the file reported in that line fared in processing.
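The block format above is simple enough to parse with a few lines of Python. The following is a sketch, not the project's own parser: the function names are hypothetical, and it assumes that any non-empty line that does not start with `Pass: ` or `Fail: ` is a contributor acronym that opens a new block.

```python
def parse_run_report(text):
    """Parse run report text into {contributor: [(status, filepath), ...]}."""
    report = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        status, sep, filepath = line.partition(": ")
        if sep and status in ("Pass", "Fail"):
            if current is None:
                raise ValueError("status line found before any contributor acronym")
            report[current].append((status, filepath))
        else:
            # Assumption: anything else is a contributor acronym.
            current = line
            report.setdefault(current, [])
    return report


def failed_files(report):
    """Return the file paths that were reported as Fail, per contributor."""
    return {
        contributor: [path for status, path in entries if status == "Fail"]
        for contributor, entries in report.items()
    }


if __name__ == "__main__":
    # Hypothetical sample content, for illustration only.
    sample = "AAA\nPass: /data/a.csv\nFail: /data/b.csv\nBBB\nPass: /data/c.csv\n"
    print(failed_files(parse_run_report(sample)))
```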
The operator report is a more in-depth error log in CSV format, with a filename like operator-report-<date>.csv. Operator reports contain one line per error or warning that occurred during the processing run. The operator report is meant to be a human-readable log that makes specific errors easy to find and diagnose.
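For a quick overview after a run, the operator reports in a directory can be tallied with the standard library. This sketch makes no assumption about the column layout, which is not described here: it only counts non-empty CSV rows per file, so a header row, if present, is included in the count. The function name `count_operator_report_rows` is hypothetical.

```python
import csv
from pathlib import Path


def count_operator_report_rows(report_dir):
    """Return {filename: number of non-empty CSV rows} for each operator report."""
    counts = {}
    for path in sorted(Path(report_dir).glob("operator-report-*.csv")):
        # newline="" is the recommended mode for files read by the csv module.
        with path.open(newline="") as handle:
            counts[path.name] = sum(1 for row in csv.reader(handle) if row)
    return counts


if __name__ == "__main__":
    for name, rows in count_operator_report_rows("/path/to/reports/location").items():
        print(f"{name}: {rows} rows")
```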
Sending Emails to Contributors
To generate emails for contributors:
woudc-data-registry data generate-emails /path/to/dir
Delete Record
woudc-data-registry data delete-record /path/to/bad/file/
If a bad file was previously ingested, it can be removed using this command. This removes the file from the registry and the WAF.
Development
# install dev requirements
pip install -r requirements-dev.txt
Building the Documentation
# build local copy of https://woudc.github.io/woudc-data-registry
cd docs
make html
python3 -m http.server # view on http://localhost:8000/
Running Tests
# run tests like this:
cd woudc_data_registry/tests
python3 test_data_registry.py
python3 test_delete_record.py
# or this:
python3 setup.py test
# measure code coverage
coverage run --source=woudc_data_registry -m unittest woudc_data_registry.tests.test_data_registry
coverage report -m
Code Conventions
Bugs and Issues
All bugs, enhancements and issues are managed on GitHub.
Contact
Owner metadata
- Name: World Ozone and Ultraviolet Radiation Data Centre
- Login: woudc
- Email:
- Kind: organization
- Description: Collaborative software, issue tracker and wiki for WOUDC, one of six World Data Centres as part of the Global Atmosphere Watch programme of the WMO.
- Website: https://woudc.org
- Location: Canada
- Twitter:
- Company:
- Icon url: https://avatars.githubusercontent.com/u/10194891?v=4
- Repositories: 10
- Last synced at: 2024-03-26T09:32:21.289Z
- Profile URL: https://github.com/woudc
GitHub Events
Total
- Watch event: 1
- Issue comment event: 7
- Push event: 26
- Pull request review event: 2
- Pull request review comment event: 9
- Pull request event: 17
Last Year
- Watch event: 1
- Issue comment event: 7
- Push event: 26
- Pull request review event: 2
- Pull request review comment event: 9
- Pull request event: 17
Committers metadata
Last synced: 6 days ago
Total Commits: 270
Total Committers: 12
Avg Commits per committer: 22.5
Development Distribution Score (DDS): 0.574
Commits in past year: 43
Committers in past year: 5
Avg Commits per committer in past year: 8.6
Development Distribution Score (DDS) in past year: 0.558
Name | Email | Commits |
---|---|---|
Alex Hurka | a****a@c****a | 115 |
Tom Kralidis | t****s@h****m | 47 |
danielwaiforssell | 5****l | 22 |
Victoria Rose Spada | 8****a | 22 |
Kevin Ngai | k****i | 19 |
ahurka | 3****a | 18 |
Simran Mattu | m****s@w****a | 11 |
Kevin Ngai | n****k@w****a | 10 |
Bob Du | b****u@c****a | 3 |
Victoria Spada | v****a@e****a | 1 |
Kevin Ngai | n****k@w****a | 1 |
BobMDu | n****e@g****m | 1 |
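The summary figures above can be reproduced from the per-committer counts in the table. The sketch below assumes that the Development Distribution Score is one minus the share of commits made by the most active committer; that definition is an assumption, although it does reproduce the 0.574 reported above. Each table row is treated as a separate committer, as in the listing.

```python
# Commit counts copied from the table above, one entry per row.
commits = [115, 47, 22, 22, 19, 18, 11, 10, 3, 1, 1, 1]


def summarize(counts):
    """Return (total commits, average per committer, DDS rounded to 3 places)."""
    total = sum(counts)
    average = total / len(counts)
    # Assumed definition: DDS = 1 - (commits by top committer / total commits).
    dds = 1 - max(counts) / total
    return total, average, round(dds, 3)


if __name__ == "__main__":
    print(summarize(commits))  # (270, 22.5, 0.574)
```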
Committer domains:
Issue and Pull Request metadata
Last synced: 1 day ago
Total issues: 0
Total pull requests: 154
Average time to close issues: N/A
Average time to close pull requests: 8 days
Total issue authors: 0
Total pull request authors: 7
Average comments per issue: 0
Average comments per pull request: 0.44
Merged pull request: 144
Bot issues: 0
Bot pull requests: 0
Past year issues: 0
Past year pull requests: 12
Past year average time to close issues: N/A
Past year average time to close pull requests: 2 days
Past year issue authors: 0
Past year pull request authors: 3
Past year average comments per issue: 0
Past year average comments per pull request: 0.58
Past year merged pull request: 11
Past year bot issues: 0
Past year bot pull requests: 0
Top Issue Authors
Top Pull Request Authors
- victoriarspada (50)
- danielwaiforssell (40)
- tomkralidis (24)
- ahurka (23)
- simranmattu14 (9)
- BobMDu (6)
- kngai (2)
Top Issue Labels
Top Pull Request Labels
Dependencies
- alembic * development
- coverage * development
- flake8 * development
- sphinx * development
- wheel * development
- sphinx *
- sphinx-click *
- psycopg2 *
- click *
- elasticsearch <8
- jsonschema <4.4.0
- pyyaml *
- requests *
- sqlalchemy *
- woudc-extcsv >=0.5.0
- actions/checkout v2 composite
- actions/setup-python v2 composite
Score: 3.8712010109078907