Recent Releases of TECA
TECA - TECA 6.0.0
TECA 6.0.0 Release Highlights
This is a major release that contains numerous improvements and fixes. TECA BARD is
fully GPUized. Temporal reductions have been ported to C++ and optimized. The data
and execution models have been extended for batching (processing multiple steps per
request). New spatial parallel and space-time parallel execution patterns allow the full
space-time extent of high resolution data to be processed in memory. The new spatial parallelism is
used in the low, high, and band pass filters as well as the temporal percentile calculation.
Numerous I/O optimizations have been introduced including the use of MPI collective
buffering for spatial parallel execution.
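To illustrate the batching idea mentioned above (multiple steps per request), here is a minimal sketch in plain Python. It is not TECA's API: the function name `batch_steps` and the parameter `steps_per_request` are hypothetical names chosen for this example. It only shows how a time axis might be split into inclusive index ranges, one range per request.

```python
def batch_steps(n_steps, steps_per_request):
    """Split step indices 0..n_steps-1 into inclusive (first, last) ranges.

    Hypothetical helper for illustration only, not part of TECA.
    """
    if steps_per_request < 1:
        raise ValueError("steps_per_request must be at least 1")
    return [
        (first, min(first + steps_per_request, n_steps) - 1)
        for first in range(0, n_steps, steps_per_request)
    ]


# 10 time steps processed 4 at a time: the last request is shorter
requests = batch_steps(10, 4)
print(requests)  # [(0, 3), (4, 7), (8, 9)]
```

Each tuple stands for one request covering several time steps, so fewer requests are needed than with one step per request.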
Execution Model Improvements
e134264b add spatial executive
c6f9bc62 cf_writer add partitioning constraints
4d53de68 add space_time_executive
97efc350 add cf_space_time_time_step_mapper
98bbb97b adds cf_spatial_time_step_mapper
3d915ee9 cf_space_time_time_step_mapper add partitioning constraints
a8fa4d8d cf_spatial_time_step_mapper add partitioning constraints
19f5e229 coordinate_util partition add constraints
4d2a8f1c index_reduce execution controls
c765bd51 cf_writer command line parsing of spatial parallel properties
7c0c8a32 spatial_executive constrain partitioning
792b7f94 space_time_executive constrain partitioning
f572c81e metadata_probe report number of intervals
b295896e mesh wrap temporal bounds and extent
daa684d7 index_request_key update
25bd3b44 index_executive clean up verbose report
a61ec638 test cf_reader temporal extent handling
6e3323dc dataset_diff handle temporal extents
d5dad5eb test temporal reduction spatial parallelism
019dc836 cf_writer spatial parallelism
f3c14a09 cf_layout_manager spatial parallelism
ba50dd85 cf_time_step_mapper layout manager API
9aa17f18 interval time step mapper refactor
e3a25a8c block time step mapper refactor
1cfcc08f coordinate util spatial partitioning
e341f638 cf_reader reads temporal extents
423a8da9 data model updates for multiple time steps per mesh
Data Model Improvements
03939e15 add and apply simplified dispatch macros
69e88df9 hamr update to latest
422f3835 hamr fully asynchronous by default
2cd9c8e7 hamr enforce const for read only data access
95de5935 hamr update to latest master
2927a95e HAMR update to latest master
adf56038 variant_array_util add host synchronization helper
69760845 variant_array add synchronization method
c7b1b2d9 add teca_variant_array_util
29897a4c variant_array better dispatch
1ce73a70 variant_array better dispatch
a70cdfe8 variant_array make test for accessibility virtual
1422ea30 variant_array provide direct access to internal memory
59111349 variant_array python construct from numpy scalar
c3562a76 cartesian_mesh fall back to mesh extents
03143cb2 cartesian_mesh_source spatial parallelism
b8615ed7 cartesian_mesh_regrid per array dimensions
ca4fcbb3 cartesian_mesh per array extent and shape const
42446f27 cartesian_mesh_source generate data on the assigned GPU
d3082de4 cartesian_mesh_source include bounds metadata in output mesh
acf3fe2e cartesian_mesh overload array shape to return a tuple
6a9f3ac0 cartesian_mesh_regrid pass array attributes from the source
e5e8e4a7 cartesian_mesh array extent time dim and add shape
73b58ebb cartesian_mesh fix Python bindings for array shape/extent
86ef5616 cartesian_mesh_source fix calendaring metadata in output
New Algorithms
f730aa81 add teca_surface_integral alg
f79c2c8d add teca_regional_moisture_flux
dc66e328 add teca_table_join
f2af4c41 add spectral filter
e439275e add teca_vtk_util::partition_writer to help debug space-time partitioning
0fe459e0 add temporal_percentile temporal reduction
140008c5 wrote temporal_index_select and tests
New Applications
acfcaffe add regional_moisture_flux app
cfd6ce85 Add the spectral filter app
GPUization
a64839b6 bayesian_ar_detect add CUDA implementation
cf74102e 2d_component_area thrust use stream per thread stream
42d16f76 2d_component_area set cuda device before doing any work
e54e33b4 component_area_filter set cuda device before doing any work
c3efa90d connected_components set cuda device before doing any work
45a87f1d bayesian_ar_detect set cuda device before doing any work
3791b67d latitude_damper set cuda device before doing any work
8993ed66 unpack_data set cuda device before doing any work
640ee577 index_executive explicitly assign device ids
79445b3b binary_segmentation use streams for sorting and data movement
23347358 cuda_util add a 1d domain decomposition
9644b346 latitude_damper add CUDA implementation
a2432065 component_area_filter add CUDA implementation
5a2f6603 2d_component_area use restrict on kernels
ad65931f 2d_component_area GPU-ize the area calculation
96c59666 cf_reader don't use page locked memory for cuda
7549e888 cuda_util simplify device assignment
1b14777c connected_components use 8 connectivity
52be3623 ha4 test code use 8 connectivity
2f4047f9 index_executive environment variable override CUDA device assignment
0919c784 connected_components integrate CUDA ha4 implementation
77884268 shape_file_mask add CUDA implementation
c44aded2 cuda_util implement a container for cuda streams
edf6c588 geometry_util GPUize point in poly
693a7b2c thread_util threads per device behavior
ac2f59fe cuda warning cleanup
3f2ba7f7 spatial_executive load balance across GPUs
5c082594 space_time_executive load balance across GPUs
Threading Improvements
62410659 bayesian_ar_detect fix thread safety issues
fa1c2099 thread_util warn about too few threads wo MPI
1d5f4158 thread_util clamp the number of threads
c9704448 thread_util report num threads when not binding
af1592a4 threaded_algorithm propagate_device_assignment
81d4e2d0 threaded_algorithm expose ranks_per_device in API
Optimizations
60c9e718 cf_restripe app add collective buffer mode
3dbc0e22 Added C++ version of the temporal reduction algorithm and application
9735209c cf_reader open file in collective mode
5558ff66 spectral_filter app command line options for collective buffering
c0efea8f cf/multi_cf_reader option to use collective buffering
f304f275 cf_writer use collective buffering
Documentation
d5eb0fcc cf_reader fix copy paste error in documentation
e5306fac component_area_filter fix indent add comments
30adda58 algorithm fix a documentation typo
bb730837 shape_file_mask improve documentation
d8fcade0 table_reduce improve documentation
b166667f integrated_water_vapor improve documentation
ef2cd480 integrated_vapor_transport improve documentation
f3623803 threaded_algorithm improve documentation
e5a26ff2 doc doxygen style comments for programmable_algorithm
dc367728 doc doxygen style comments for teca_table
de5e8d68 doc data access developer tutorial
1d25525b interval_iterator subclasses fix units doxygen doc strings
dd5f1fee doc update temporal_reduction user guide
c71e9057 cf_writer fix typo in docs
53effc02 doc update m1517 install locations for perlmutter
1b71d8eb coordinate_util improve documentation
ff383a0f rtd add section explaining execution model
ae237bd9 rtd docs fix doxygen install location
c51132bf rtd pin sphinx version as latest is incompatible with rtddocs
5ea6e10c rtd doc array access tutorial spell check
af9d2e6c doc rtd improve array access tutorial
b528ec9d rtd fix a rst warning
9a6e888b rtd updates to the install for mac os
1a7dc382 doc rtd exclude variant_array_operator from doxygen
Testing
bf97e954 test disable periodic bc in bard app test
238db9f6 test bayesian ar detect sort by label area
49e83a90 deeplab_ar_detect remove tests
b7d14f17 testing update linux distributions
c38337f9 testing cleanup use of %e% in tests
d40d800a temporal_reduction: added tests
80a01599 test add test for cpp_temporal_reduction w. io
3b277b3a test temporal reduction steps_per_request command line argument
9e614ea1 add test_temporal_reduction
3b338bf9 ha4 test code update ctests command
5dd84cb9 connected_components test ignore component labels
6569a79f ha4 test code improvements
a1012ed6 ha4 test code handles periodic BC in x-direction
a380f62c ha4 test code works on images not divisible by 32
e6216c3b add ha4 connected component label test code
a769ff73 test_streaming_reduce_threads: specifying netcdf file name to avoid conflict with temporal reduction all iterator test
6e02fa62 test temporal_reduction app python and C++
d79206a4 testing temporal_reduction tests specify number of threads
709f6853 temporal_reduction C++ impl improvements and regression test
5120006d update the DOI badge to point to the latest release
18533f8c Changed teca data revision from 149 to 151
General Improvement
24142094 bayesian_ar_detect_parameters add properties to select specific start row
be087dc5 bayesian_ar_detect instrument the BARD app
37f4237e bayesian_ar_detect app control writer thread pool size
176c1f6b connected_components cleanup a warning
10eaf195 connected_components minor improvements
ee8cbf23 temporal_reduction: set steps_per_request in python app; included definition in cpp app
27f3ef3e temporal_reduction: standardized n_threads command line
b371bea9 temporal_reduction construct output at end and others
494a3b42 temporal_reduction: caching the intermediate result
07a119ae temporal_reduction: any number of time steps per request is allowed
bd321844 descriptive_statistics remove debugging code
18768fd8 index_executive fix a compiler warning
ff551dce cpp_temporal_reduction algorithm errors are fatal
95bd6a88 temporal_reduction: set_thread_pool_size [cf_writer] changed from -1 to 1 to fix intermittent bugs
7953cbbb temporal_reduction: change the 1 time step per request to a run time specified number of steps
1bab4257 dataset_diff ignore specified arrays
03fc0bc7 table_sort sort either ascending or descending
b29c4fd7 coordinate_util wrap bounds to extent overload
d0ac7a98 integrated_vapor_transport handle ascending coordinates in the first order method
b593e57d integrated_vapor_transport app enable automatic z-axis unit conversion
78675ec9 integrated_vapor_transport warn if vertical axis units are incorrect
63087d20 normalize_coordinates check z-axis units
df9378e0 integrated_vapor_transport layer thickness
eb4853a7 evaluate_expression netcdf attributes for the cf_writer
4cccc26f table include dataset property for array attributes
98e0a891 table_join pass array attributes for NetCDF I/O
3b815827 integrated_water_vapor reformat units string
f6eabe0f algorithm add a single value setter for vector properties
657ba214 index_reduce use std::vector instead of std::array
abac3f23 indexed_dataset_cache override request index
eb86345c integrated_vapor_transport change format of units
b0a4390c dataset_source report variables from tables and meshes
31b748a2 dataset_source move to alg to access typed datasets
290db6bf coordinate_util improvements
20c5ad67 table_reduce report and request use default implementations
b878f89f program_options support std::array in algorithm properties
06b52b2f shape_file_mask improvements
9a506d7f dataset typed accessors
40ea89bd derived quantity improvements
ebb12862 array_attributes include mesh_dim_active
254f9e7f temporal_reduction app/alg cpp/python catch user errors
a1eb0f0e cf_writer improve error message
589f70c6 cf_writer improve collective buffering error message
d01da07f cf_restripe app runs in CPU only mode by default
19334ed6 cf_reader improve collective buffering error
c4de2426 spectral_filter per-rank timing output in verbose mode
51a26703 spectral_filter add ideal butterworth frequency response
dadb4911 spectral_filter fixes issue found when processing real data
2a8816ff spectral_filter refactor regression tests
976482d6 spectral filter fix high pass kernel generation
57e3f31d teca_temporal_reduction: added all iterator average test
9924d676 teca_temporal_reduction: added all iterator
fbb866ea teca_calendar_util: added new class all_iterator
c6704cdc temporal_reduction: added flag to spatial parallelism
4b0d251d teca_calendar_util: added the new class n_steps_iterator
ec98d675 added index selection to the temporal reduction
6f1ae9da metadata add support for std::array
6e99362f vorticity better identifiers in dispatch macro
4407b536 cuda_util remove redundant error check
d11f4803 valid_value_mask export mask type
296ec4a0 temporal_reduction app command line option controlling threadpool size
48e32130 temporal_reduction: rename the C++ implementation
bd6718ac temporal_reduction: handle the case where the number of inputs < 2.
19ead29e temporal_reduction: renamed the original python implementation
fbd22354 temporal_reduction: resolving a warning
9ba6d353 temporal_reduction clean up warnings with nvcc
3bcfcf43 temporal_reduction app integrate multi threading
ec71e980 Renamed python version of temporal reduction; python bindings
cd28e3e8 teca_threaded_programmable_algorithm: increased the size of the class_name variable from 64 to 96
ccfdba31 potential_intensity user provided masking threshold
e7c53c0c potential_intensity units checks and conversions
c674b798 potential_intensity app reduce verbosity
8f6a1ed3 teca_potential_intensity clean up runtime warnings
df50b49a python functions returning typed scalars
11814513 potential_intensity app use spatial partitioning
e18f4d49 potential_intensity app land mask from mcf file
839ef1c3 app_util error out with positional options
Bug fixes
6cb6cccf component_area_filter fix indentation
ee9a4d84 connected_components fix 8-way connectivity across periodic boundary
e85aa72f system_interface fix double free in stack trace generation
d19e7270 testing fix the component area filter test
10e94597 temporal_reduction: fix data access
d6b22e63 teca_profiler: fixed conversion of hexstring to int
220587ad cpu_thread_pool fix bind argument position
bf99eff3 cpu/cuda_thread_pool fix streaming bug
3d5d4db2 cf_writer fix let threaded_algorithm process command line
80820ef1 threaded_algorithm fix set algorithm props from command line
59c42b53 threaded_algorithm fix threads_per_device parameter name
bada5a60 cpp_temporal_reduction fix thread safety issues
7de9224c cpp_temporal_reduction fix a typo in documentation
59eb4c79 ha4 test code fix race condition
2eaa71b6 connected_components fix race condition
ddaf758f connected_components fix compile w/o cuda
4c8032c1 connected_components 8-way connectivity bug fixes
55b0908f ha4 test code 8-way connectivity bug fixes
18e0c92d rename_variables fix set variables in the output attributes
e7396820 fixes for cuda 12 and warning cleanup for gcc 12
9d13cd42 temporal_reduction fix missing virtual destructor in base class
76ab59d8 array_collection fix double move
f462ac83 normalize_coordinates fix a bug in the output extents
e8dcfca3 tests fix regex that picks up new file
e3dc08f3 cpp_temporal_reduction cleanup, fixes, and improvements
00ba2421 temporal_reduction: included flag to choose python or c++ implementation; fixing the n_steps interval
bc43a364 temporal_reduction: rename the python implementation; fixing name of two python tests
244f58e5 temporal_reduction: fixing the parameter order in a test
79b36732 temporal_reduction: added a new finalize function to fix a bug
942aa111 temporal_reduction fix a warning and set stream size
7120ecb8 cpu/cuda_thread_pool fix thread safety issues
d2519402 threaded_algorithm fix indentation
26fba6d7 potential_intensity units checks and conversion fix
45dbd2b9 Fixed n_steps_iterator class of python version of temporal_reduction
86e6ea74 calendaring fix buffer overflow warnings
5ffc2d4e Fixing issue
98f04ee3 temporal_percentile fixes
Python
42ca1d80 python support wrapping API with fixed length C-arrays
61e9f34a remove numpy deprecated types
Build System
a76c7cf9 build cleanup cmake code
8f965035 added CMAKE_INSTALL_RPATH to CMakeLists.txt
3e43838f build define NDEBUG in CUDA release build
08b95f05 build always update the version descriptor
944a3f25 build system don't relink unless necessary
Climate Change - Natural Hazard and Storm
- C++
Published by burlen over 1 year ago
TECA - TECA 5.0.0
Major features
The TECA data model now supports memory management on CPUs as well as CUDA,
OpenMP device offload, and HIP capable GPUs and accelerators.
TECA's execution model was extended to support CUDA capable GPUs. This includes
automated load balancing across multi GPU accelerated compute nodes on
supercomputing systems as well as CUDA kernel launching and load balancing
infrastructure.
Support for zero-copy interoperability with CuPy and Numba on CUDA capable
GPUs was added.
GPUized algorithms
teca_binary_segmentation
teca_l2_norm
teca_valid_value_mask
teca_unpack_data
teca_integrated_vapor_transport
teca_temporal_reduction
teca_lapse_rate
teca_cf_reader
teca_cf_writer
New algorithms and apps
teca_lapse_rate
teca_tc_potential_intensity
teca_time_axis_convolution
teca_shapefile_mask
teca_tempest_remap
teca_cartesian_mesh_coordinate_transform
teca_array_collection_reader
teca_array_collection_writer
Improvements
Make the teca_array_collection a data set
Add user defined intervals and operators to the teca_temporal_reduction
teca_temporal_reduction handle integer data in the averaging reduction
teca_temporal_reduction use the valid value mask
add a summation reduction to the teca_temporal_reduction
improved threading support on MacOS
users can provide call backs at runtime for custom error handling
Documentation
Numerous improvements to the user guide and Doxygen documentation including
documentation of new applications and installation on GPU enabled systems.
Updated examples illustrating how to use CuPy in Python applications
New Perlmutter specific examples were added to TECA_Examples
Published by burlen almost 3 years ago
TECA - TECA 4.1.0
4.1.0 is a feature release with a number of new and exciting features and a number of critical bug fixes.
- new mask below surface algorithm that creates a point-wise binary (0,1) mask identifying mesh points that are below the land surface based on an externally provided DEM.
- integrated the mask below surface stage into the BARD, IWV, and IVT apps
- new unpack NetCDF packed data stage
- add coordinate normalization stage transform for longitude from -180 to 180 to 0 to 360
- new IWV algorithm
- new IWV command line application
- new time based file layouts (daily, monthly, yearly, seasonal)
- BARD app can now generate output fields weighted by AR probabilities
- new rename variables stage
- improvements to cartesian_mesh_source for remeshing
- cf_reader correctly detects centering and per field dimensionality
- multi_cf_reader MCF file format improvements. Add support for reader properties, globally and per reader.
- cf_reader option to produce 2D field when the 3rd dimension is length 1
- Cartesian meshes can now contain both 2D and 3D arrays, metadata annotations are used to differentiate at run time
- metadata probe improvements to report per-field centering
- new remeshing capability deployed in cf_restripe and apps that utilize elevation mask
- improvements to the user guide
- refactored source code documentation to be compatible with Doxygen
- published Doxygen on the rtd site: https://teca.readthedocs.io/en/integrating_breathe/doxygen/index.html
- new capabilities in the cf_restripe command line application for remeshing
- 25+ bug fixes
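The longitude normalization listed above (from -180 to 180 to 0 to 360) can be pictured with a short Python sketch. This is an illustration under stated assumptions, not the TECA implementation: the function `normalize_longitude` is a hypothetical name, and the data is a simple 1D list paired with its longitudes. The point it makes is that shifting the coordinate also requires reordering the data so the coordinate stays ascending.

```python
def normalize_longitude(lon, values):
    """Map longitudes in [-180, 180) to [0, 360) and keep them ascending.

    Hypothetical helper for illustration only. `values` holds one
    data value per longitude and is reordered along with it.
    """
    # wrap negative longitudes into the 0..360 range
    shifted = [(x % 360.0, v) for x, v in zip(lon, values)]
    # the wrap breaks ascending order, so sort coordinate and data together
    shifted.sort(key=lambda pair: pair[0])
    new_lon = [x for x, _ in shifted]
    new_values = [v for _, v in shifted]
    return new_lon, new_values


lon = [-180.0, -90.0, 0.0, 90.0]
values = ["a", "b", "c", "d"]
print(normalize_longitude(lon, values))
# ([0.0, 90.0, 180.0, 270.0], ['c', 'd', 'a', 'b'])
```

Note that if both -180 and 180 are present in the input they map to the same longitude; handling that duplicate is left out of this sketch.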
Published by burlen almost 4 years ago
TECA - TECA 4.0.0
Documentation
- A major overhaul of the command line application section of the user guide including the addition of examples.
- Publish batch scripts illustrating running TECA at scale in the new TECA_examples repo.
- Giving tutorials and publishing the materials in the new TECA_tutorials repo
- Updates to the installation section of the [TECA User's Guide](https://teca.readthedocs.io/en/latest/installation.html)
Data Model Improvements
- Added support for Arakawa C Grids in teca_arakawa_c_grid.
- Added support for logically Cartesian so called curvilinear grids in teca_curvilinear_mesh.
- Refactored mesh related class hierarchy so that common codes such as array accessing and I/O live in teca_mesh.
- Added support for face and edge centered mesh based data.
I/O Capabilities
- Added a reader for WRF simulations, teca_wrf_reader.
- Added support for writing logically Cartesian curvilinear meshes in teca_cartesian_mesh_writer.
- Added a new NetCDF based output format for tabular data to the teca_table_writer.
- Added support for reading tabular CSV files to the teca_table_reader. This enables tabular outputs saved from TECA apps, such as TC tracks, to be stored in a format ingestible by other tools such as Python and Excel without the need to convert from TECA's internal binary format.
- Added versioning and error checking to TECA's internal binary serialization format across all datasets. This enables us to catch version differences and handle bad or corrupted files gracefully.
- Use of NetCDF parallel 4 (i.e. MPI collective I/O) for writing results. This enables the use of any number of files with any number of ranks.
Execution Patterns
- Implement a new streaming mode reduction where data is incrementally reduced as it becomes available. This parallelizes the reduction step and reduces the memory overhead.
- Introducing a new MPI parallel approach to scan the time axis. This has substantial benefit when there are a large number of files.
- Expose MPI aware thread load balancing to Python. This was used in the teca_pytorch_algorithm to automatically load balance the OpenMP backend of PyTorch.
- Implement GPU load balancing strategy in the teca_pytorch_algorithm.
- Enable process groups to be excluded from execution. This lets a pipeline run on a subset of MPI_COMM_WORLD.
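The streaming mode reduction described in the list above can be sketched in a few lines of Python. This is only a conceptual illustration, not TECA code: the class `StreamingMean` and the generator `time_steps` are hypothetical names. The idea shown is that each incoming dataset is folded into a running result as soon as it arrives, so only the accumulator is held in memory rather than every time step.

```python
class StreamingMean:
    """Incrementally average equally sized fields (illustrative only)."""

    def __init__(self):
        self.count = 0
        self.total = None

    def update(self, field):
        # fold one newly available field into the running sum
        if self.total is None:
            self.total = list(field)
        else:
            self.total = [t + f for t, f in zip(self.total, field)]
        self.count += 1

    def finalize(self):
        if self.count == 0:
            raise ValueError("no data was reduced")
        return [t / self.count for t in self.total]


def time_steps():
    """Stand-in for data arriving one time step at a time."""
    yield [1.0, 2.0]
    yield [3.0, 4.0]
    yield [5.0, 6.0]


reducer = StreamingMean()
for field in time_steps():
    reducer.update(field)  # reduce as soon as the data is available

print(reducer.finalize())  # [3.0, 4.0]
```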
Algorithmic Capabilities
- Added teca_pytorch_algorithm, a base class that handles tasks common to interfacing to PyTorch when developing Machine Learning based detectors.
- Added teca_deeplab_ar_detect, a new PyTorch based Machine Learning AR detector.
- Added teca_valid_value_mask, an algorithm that generates a mask identifying the presence of NetCDF _FillValue values in arrays. Downstream algorithms use the mask to handle _FillValue values in an algorithm appropriate manner.
- Added teca_temporal_reduction, an algorithm that implements transformations from one time resolution to another. The implementation includes min, max, and average operators and supports daily, monthly, and seasonal intervals.
- Added teca_vertical_reduction, an algorithm that converts 3D data to 2D by applying a reduction in the vertical spatial dimension. This is a base class that contains code common to vertical reductions.
- Added teca_integrated_vapor_transport, a vertical reduction that computes IVT from horizontal wind vector and specific humidity.
- An improved floating point differencing algorithm was developed and a number of codes were updated to use it.
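As a rough picture of what a temporal reduction does, the following Python sketch reduces daily samples to monthly values with a selectable operator. It is a simplified stand-in, not the teca_temporal_reduction implementation: the names `reduce_monthly` and `OPERATORS` are hypothetical, and each sample is a single scalar rather than a mesh.

```python
from collections import defaultdict
from datetime import date

# operators named in the release notes: min, max, and average
OPERATORS = {
    "minimum": min,
    "maximum": max,
    "average": lambda xs: sum(xs) / len(xs),
}


def reduce_monthly(samples, operator):
    """Reduce (date, value) samples to one value per (year, month).

    Hypothetical helper for illustration only.
    """
    op = OPERATORS[operator]
    groups = defaultdict(list)
    for day, value in samples:
        # the interval key decides which samples are reduced together
        groups[(day.year, day.month)].append(value)
    return {key: op(vals) for key, vals in sorted(groups.items())}


daily = [
    (date(2020, 1, 30), 2.0),
    (date(2020, 1, 31), 4.0),
    (date(2020, 2, 1), 10.0),
]
print(reduce_monthly(daily, "average"))
# {(2020, 1): 3.0, (2020, 2): 10.0}
```

Changing the grouping key (for example to the date itself, or to a season) would give the other interval types under the same structure.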
Command Line Applications
- Added teca_integrated_vapor_transport command line application for computing IVT.
- Added teca_restripe command line application for re-organizing NetCDF datasets.
- Added teca_deeplab_ar_detector command line application for detecting ARs using machine learning.
- Integrated IVT calculations into the teca_bayesian_ar_detector.
- Normalized names and meaning of command line options across command line applications
Python Capabilities
- A polymorphic redesign of the teca_python_algorithm makes it easier to use.
- Handle numpy scalar types
- Expose more features such as MPI aware thread load balancing, calendaring, profiling, and file manipulation utilities.
Testing
- Added testing infrastructure and tests for command line applications
- Deployed testing on Ubuntu 18.04, Fedora 31, Fedora 32, and Mac OS with xcode 12.2.
Bug fixes
More than 50 bug fixes were reported.
Published by burlen over 4 years ago
TECA - TECA 3.0.0
This is a major release in support of:
T.A. O'Brien et al, "Detection of Atmospheric Rivers with Inline
Uncertainty Quantification: TECA-BARD v1.0", Geoscientific Model
Development, submitted winter 2020
The pipeline internals were refactored to be more general: the assumption that
time is the dimension across which the reduction is applied was removed, and
changes were made that enable nested map-reduce.
The TECA User Guide was ported to "Read the Docs". https://teca.readthedocs.io
Our Travis CI test infrastructure was updated to use Docker, and two new OS
images, Fedora 28 and Ubuntu 18.04, were deployed.
More than 40 bug fixes
New algorithms included in this release:
Type | Name | Description |
---|---|---|
general purpose | teca_2d_component_area | Computes the areas of regions identified by the connected components filter. |
general purpose | teca_bayesian_ar_detect | Detects atmospheric rivers using a Bayesian method. |
general purpose | teca_bayesian_ar_detect_parameters | Parameters used by Bayesian AR detector. |
general purpose | teca_cartesian_mesh_source | Used to create Cartesian meshes in memory and inject them into a pipeline. |
general purpose | teca_component_area_filter | Masks regions with area outside a user specified range |
general purpose | teca_component_statistics | Gathers information about connected component regions into a tabular format |
general purpose | teca_latitude_damper | Multiplies a field by an inverted Gaussian (user specified mean and HWHM) |
general purpose | teca_normalize_coordinates | Transforms Cartesian meshes such that coordinates are always in ascending order |
general purpose | teca_python_algorithm | Base class for TECA algorithms written in Python. Handles internal plumbing |
core infrastructure | teca_memory_profiler | Supporting class that samples memory consumption during application execution |
core infrastructure | teca_profiler | Supporting class that logs start, stop, and duration of developer defined events |
I/O | teca_cartesian_mesh_reader | Reads TECA Cartesian meshes in TECA's internal binary format |
I/O | teca_cartesian_mesh_writer | Writes TECA Cartesian meshes in TECA's internal binary format |
I/O | teca_cf_writer | Writes TECA Cartesian meshes in NetCDF CF2 conventions |
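The latitude damper entry in the table above describes multiplying a field by an inverted Gaussian parameterized by a mean and a half width at half maximum (HWHM). The sketch below shows one plausible reading of that description in Python. It is an assumption, not the teca_latitude_damper source: the weight is taken to be 1 - exp(-(lat - center)^2 / (2 sigma^2)), with sigma derived from the HWHM by the standard Gaussian relation HWHM = sigma * sqrt(2 ln 2). The names `damping_weight` and `damp` are hypothetical.

```python
import math


def damping_weight(lat, center, hwhm):
    """Inverted Gaussian weight: 0 at `center`, approaching 1 far away.

    Assumed form for illustration; sigma follows from
    HWHM = sigma * sqrt(2 ln 2).
    """
    sigma = hwhm / math.sqrt(2.0 * math.log(2.0))
    return 1.0 - math.exp(-((lat - center) ** 2) / (2.0 * sigma ** 2))


def damp(field, lats, center=0.0, hwhm=15.0):
    """Multiply each value by the weight at its latitude (illustrative)."""
    return [v * damping_weight(lat, center, hwhm) for v, lat in zip(field, lats)]


# a uniform field is suppressed near the center latitude
lats = [0.0, 15.0, 60.0]
print(damp([1.0, 1.0, 1.0], lats))
```

With this form the weight is exactly half at a distance of one HWHM from the center, which is what makes the HWHM a convenient user facing parameter.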
New applications included in this release:
Name | Description |
---|---|
teca_bayesian_ar_detect | Command line application that can be used to detect AR's on HPC systems |
teca_profile_explorer | Interactive tool for exploring run time profiling data |
Published by burlen over 4 years ago