Open Sustainable Technology

A curated list of open technology projects to sustain a stable climate, energy supply, biodiversity and natural resources.



The Wind Turbine Prognostics and Health Management library processes wind turbine events data, as well as operational SCADA data for easier fault detection, prognostics or reliability research.

fault-detection machine-learning scada wind-energy wind-turbine


Repository metadata

SCADA data pre-processing library for prognostics, health management and fault detection of wind turbines. Successor to



.. comment


The **W**\ind **T**\urbine **P**\rognostics and **H**\ealth **M**\anagement library
processes wind turbine events (also called alarms or status) data, as well as
operational SCADA data (typically logged by wind turbines at 10-minute intervals)
for easier fault detection, prognostics or reliability research.

Turbine alarms often appear in high numbers during fault events, and significant
effort can be involved in processing these alarms in order to find what actually
happened, what the root cause was, and when the turbine came back online.
This module addresses the problem by automatically identifying stoppages and
fault periods in the data and assigning a high-level "stoppage category" to each.
It also provides functionality to use this information to label SCADA data for
training predictive maintenance algorithms.
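
The idea of grouping alarms into stoppages and then labelling SCADA timestamps can be illustrated with a minimal, standard-library-only sketch. This is not the library's own implementation or API: the function names, the ``OK_CODE`` and ``NON_STOP_CAT`` values and the sample events are all hypothetical, and the stoppage category is simply taken from the first stopping event, which is a simplification.

```python
from datetime import datetime

# Hypothetical values: real codes and categories depend on the turbine make.
OK_CODE = 0          # assumed "return to normal operation" event code
NON_STOP_CAT = "ok"  # assumed stop_cat for events that do not stop the turbine


def find_stoppages(events):
    """Group one turbine's events into stoppage periods.

    Each event is a dict with the keys ``code``, ``time_on`` and ``stop_cat``.
    A stoppage opens at the first stopping event and closes at the next
    return-to-normal event.
    """
    stoppages = []
    current = None
    for event in sorted(events, key=lambda e: e["time_on"]):
        if current is None:
            # Only a stopping event can open a new stoppage.
            if event["stop_cat"] != NON_STOP_CAT and event["code"] != OK_CODE:
                current = {"start": event["time_on"], "end": None,
                           "stop_cat": event["stop_cat"],
                           "codes": [event["code"]]}
        elif event["code"] == OK_CODE:
            current["end"] = event["time_on"]
            stoppages.append(current)
            current = None
        else:
            # Any other event during the stoppage is kept for later analysis.
            current["codes"].append(event["code"])
    if current is not None:  # turbine still stopped at the end of the data
        stoppages.append(current)
    return stoppages


def label_scada_times(times, stoppages):
    """Label each SCADA timestamp with a stoppage category or "normal"."""
    labels = []
    for t in times:
        label = "normal"
        for s in stoppages:
            end = s["end"] or datetime.max
            if s["start"] <= t < end:
                label = s["stop_cat"]
                break
        labels.append(label)
    return labels


events = [
    {"code": 0, "time_on": datetime(2020, 1, 1, 0, 0), "stop_cat": "ok"},
    {"code": 144, "time_on": datetime(2020, 1, 1, 8, 3), "stop_cat": "pitch"},
    {"code": 145, "time_on": datetime(2020, 1, 1, 8, 4), "stop_cat": "pitch"},
    {"code": 9, "time_on": datetime(2020, 1, 1, 8, 5), "stop_cat": "ok"},
    {"code": 0, "time_on": datetime(2020, 1, 1, 9, 30), "stop_cat": "ok"},
]
stoppages = find_stoppages(events)
scada_times = [datetime(2020, 1, 1, 8, 0), datetime(2020, 1, 1, 8, 10),
               datetime(2020, 1, 1, 9, 20), datetime(2020, 1, 1, 9, 40)]
labels = label_scada_times(scada_times, stoppages)
```

Here the five sample events collapse into a single "pitch" stoppage, and only the SCADA timestamps that fall inside it receive that label.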

Although there are commercial packages that can perform this task, this library
aims to be an open-source alternative for use by the research community.

Please reference this repo if you use it in any research. Bugs, questions or
feature requests can be raised on GitHub. I can also be reached on Twitter.

This library was used to build the "batch creation" and "data labelling" steps of `this paper `_.


Install using pip::

    pip install wtphm


Full documentation and user guide can be found on
`readthedocs `_.

A local copy of the docs can
be built by running ``_ with sphinx installed.

Is my Data Compatible?

The data manipulated in this library are turbine events/status/alarms data and
10-minute operational SCADA data.
They must be in the formats described below.

Event Data

.. start event comment

The ``event_data`` is related to any fault or information messages generated by
the turbine. This is instantaneous, and records information like faults that have
occurred, or status messages like low- or no- wind, or turbine shutting down due
to storm winds.

The data must have the following column headers and information available:

* ``turbine_num``: The turbine the data applies to
* ``code``: There is a fixed list of events which can occur on the
  turbine. Each one of these has an event code
* ``description``: Each event code also has an associated description
* ``time_on``: The start time of the event
* ``stop_cat``: This is a category for events which cause the turbine to come to
  a stop. It could be the functional location of where in the turbine the event
  originated (e.g. pitch system), a category for grid-related events,
  that the turbine is down for testing or maintenance, in curtailment due to
  shadow flicker, etc.
* In addition, there must be a specific event ``code`` which signifies return to
  normal operation after any downtime or abnormal operating period.

.. end event comment
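
A quick way to see whether an events table matches the format above is to build a small pandas ``DataFrame`` and check it. The sketch below is illustrative only: ``check_event_data``, the value of ``OK_CODE`` and the sample rows are hypothetical and are not part of the library.

```python
import pandas as pd

REQUIRED_EVENT_COLUMNS = ["turbine_num", "code", "description",
                          "time_on", "stop_cat"]
OK_CODE = 0  # hypothetical "return to normal operation" code

# Made-up sample rows in the expected layout.
event_data = pd.DataFrame({
    "turbine_num": [1, 1, 1],
    "code": [144, 145, 0],
    "description": ["Pitch fault", "Pitch battery low", "Turbine OK"],
    "time_on": pd.to_datetime(["2020-01-01 08:03:00",
                               "2020-01-01 08:04:10",
                               "2020-01-01 09:30:00"]),
    "stop_cat": ["pitch", "pitch", "ok"],
})


def check_event_data(df, ok_code=OK_CODE):
    """Raise an error if ``df`` does not look like valid event data."""
    missing = [c for c in REQUIRED_EVENT_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    if not pd.api.types.is_datetime64_any_dtype(df["time_on"]):
        raise TypeError("time_on must be a datetime column")
    if not (df["code"] == ok_code).any():
        raise ValueError("no return-to-normal event code found")
    return True


is_valid = check_event_data(event_data)
```

If the source system exports timestamps as text, converting them with ``pd.to_datetime`` before running a check like this avoids the ``TypeError``.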

SCADA/Operational data

.. start scada comment

The ``scada_data`` is typically recorded in 10-minute intervals and has attributes like
average power output, maximum, minimum and average windspeeds, etc. over the previous
10-minute period.

For the purposes of this library, it must have the following column headers and
information available:

* ``turbine_num``: The turbine the data applies to
* ``time``: The 10-minute period the data belongs to
* availability counters: Some of the functions for giving the batches a stop
  category rely on availability counters. These are sometimes stored as part of
  the SCADA data, and sometimes in separate availability data. They count the
  portion of time the turbine was in some mode of operation in each 10-minute
  period, for availability calculations. For example, maintenance time, fault
  time, etc. In order to be used in this library, the availability counters are
  assumed to range between 0 and *n* in each period, where *n* is some arbitrary
  maximum (typically 600, for the 600 seconds in the 10-minute period).

.. end scada comment
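
The 0 to *n* convention for availability counters can be checked, and the counters converted to a fraction of each period, with a few lines of pandas. This is a sketch under stated assumptions: the column names ``wind_speed_avg``, ``power_avg``, ``fault_time`` and ``maintenance_time``, the helper ``counters_as_fraction`` and the sample values are hypothetical, and *n* is assumed to be 600.

```python
import pandas as pd

COUNTER_MAX = 600  # assumed n: seconds in a 10-minute period

# Made-up sample rows; all column names except turbine_num and time
# are hypothetical.
scada_data = pd.DataFrame({
    "turbine_num": [1, 1, 1],
    "time": pd.to_datetime(["2020-01-01 08:00",
                            "2020-01-01 08:10",
                            "2020-01-01 08:20"]),
    "wind_speed_avg": [7.2, 7.9, 8.1],
    "power_avg": [850.0, 120.0, 0.0],
    "fault_time": [0, 420, 600],
    "maintenance_time": [0, 0, 0],
})


def counters_as_fraction(df, counter_cols, counter_max=COUNTER_MAX):
    """Check counters lie in [0, counter_max] and scale them to [0, 1]."""
    counters = df[counter_cols]
    out_of_range = (counters < 0) | (counters > counter_max)
    if out_of_range.any().any():
        raise ValueError("availability counters outside 0..counter_max")
    return counters / counter_max


fractions = counters_as_fraction(scada_data,
                                 ["fault_time", "maintenance_time"])
```

In this sample, the turbine spends 420 of the 600 seconds in the second period in a fault state, so its ``fault_time`` fraction for that period is 0.7.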


Committers metadata

Last synced: 1 day ago

Total Commits: 66
Total Committers: 1
Avg Commits per committer: 66.0
Development Distribution Score (DDS): 0.0

Commits in past year: 0
Committers in past year: 0
Avg Commits per committer in past year: 0.0
Development Distribution Score (DDS) in past year: 0.0

Name Email Commits
Kevin Leahy l****v@g****m 66


Issue and Pull Request metadata

Last synced: 1 day ago

Total issues: 5
Total pull requests: 1
Average time to close issues: 3 days
Average time to close pull requests: 34 minutes
Total issue authors: 3
Total pull request authors: 1
Average comments per issue: 0.2
Average comments per pull request: 0.0
Merged pull request: 1
Bot issues: 0
Bot pull requests: 0

Past year issues: 1
Past year pull requests: 0
Past year average time to close issues: N/A
Past year average time to close pull requests: N/A
Past year issue authors: 1
Past year pull request authors: 0
Past year average comments per issue: 0.0
Past year average comments per pull request: 0
Past year merged pull request: 0
Past year bot issues: 0
Past year bot pull requests: 0


Top Issue Authors

  • lkev (3)
  • anmoljaggi (1)
  • GuShuai02 (1)

Top Pull Request Authors

  • lkev (1)

Top Issue Labels

  • enhancement (1)

Top Pull Request Labels

Package metadata wtphm

SCADA data pre-processing library for prognostics, health management and fault detection of wind turbines

  • Homepage:
  • Documentation:
  • Licenses: GNU General Public License v3 (GPLv3)
  • Latest release: 0.1.3 (published almost 4 years ago)
  • Last Synced: 2024-02-27T18:00:58.386Z (1 day ago)
  • Versions: 3
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 39 last month
  • Rankings:
    • Dependent packages count: 7.373%
    • Stargazers count: 9.155%
    • Forks count: 9.367%
    • Average: 22.022%
    • Dependent repos count: 22.233%
    • Downloads: 61.981%
  • Maintainers (2)

Dependencies pypi
  • pandas *

Score: 7.8567067930958405