Recent Releases of Zeus
Zeus - Zeus v0.13.1
Better AMD GPU support
I got access to AMD MI210, MI250X, and MI300X GPUs, so I smoothed out edge cases. This is a quick follow-up release to v0.13.0 for AMD GPU users.
What's Changed
- [AMD] Harden AMD GPU support by @jaywonchung in https://github.com/ml-energy/zeus/pull/203
Full Changelog: https://github.com/ml-energy/zeus/compare/zeus-v0.13.0...zeus-v0.13.1
Consumption - Computation and Communication
- Python
Published by jaywonchung 4 months ago
Zeus - Zeus v0.13.0
Breaking Changes
The low-level device APIs are now all snake_case instead of camelCase. It had to be done; camelCase was an old mistake that came from following how pynvml names its methods.
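For readers migrating call sites, the rename is largely mechanical. The helper below is a hypothetical convenience (not part of Zeus) that converts an old camelCase method name into its likely snake_case spelling. The one-to-one mapping is an assumption, so confirm each result against the Zeus device API reference.

```python
import re


def camel_to_snake(name: str) -> str:
    """Convert a camelCase identifier to snake_case (hypothetical helper, not a Zeus API)."""
    # Insert an underscore before every uppercase letter that is not at the
    # start of the string, then lowercase everything.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


# `getAverageMemoryPowerUsage` is the old-style name that appears in the v0.10.1 notes.
print(camel_to_snake("getAverageMemoryPowerUsage"))
```

Running this over the names found by a search of your codebase gives a candidate list of replacements to review by hand.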
What's New
Various monitor usability improvements. Zeus now also follows logging best practices.
What's Changed
- `ZeusMonitor` and `PowerMonitor` usability improvements by @jaywonchung in https://github.com/ml-energy/zeus/pull/196
- Fix `stop` and GC for `PowerMonitor` and `TemperatureMonitor` by @jaywonchung in https://github.com/ml-energy/zeus/pull/197
- [Device] Use snake case for device methods by @jaywonchung in https://github.com/ml-energy/zeus/pull/198
- [Chore] Switch formatter to `ruff format` by @jaywonchung in https://github.com/ml-energy/zeus/pull/199
- [`show_env`] Show device physical to application mapping by @jaywonchung in https://github.com/ml-energy/zeus/pull/200
- Follow logging best practices by @jaywonchung in https://github.com/ml-energy/zeus/pull/201
- [Device] Implement `get_power_management_limit` by @jaywonchung in https://github.com/ml-energy/zeus/pull/202
Full Changelog: https://github.com/ml-energy/zeus/compare/zeus-v0.12.3...zeus-v0.13.0
Published by jaywonchung 4 months ago
Zeus - Zeus v0.12.3
New Features
CuPy synchronization support
It's not just deep learning our users are measuring energy for. There are other CUDA-based applications (e.g., cuDF) that are Python bindings of CUDA. Now, ZeusMonitor allows cupy as another mechanism for CPU-GPU synchronization at the boundary of measurement windows.
Temperature monitor
Temperature is a metric that also has a lot to do with power. It's a nice-to-have addition.
What's Changed
- Add CuPy sync support by @jaywonchung in https://github.com/ml-energy/zeus/pull/194
- [Feature] GPU Temperature Monitor by @jaywonchung in https://github.com/ml-energy/zeus/pull/195
Full Changelog: https://github.com/ml-energy/zeus/compare/zeus-v0.12.2...zeus-v0.12.3
Published by jaywonchung 5 months ago
Zeus - Zeus v0.12.2
This is a maintenance release focused on security.
What's Changed
- Add SECURITY.md by @jaywonchung in https://github.com/ml-energy/zeus/pull/188
- Add `scripts/check_licenses.sh` by @jaywonchung in https://github.com/ml-energy/zeus/pull/190
- Add `scripts/generate_sbom.sh` by @jaywonchung in https://github.com/ml-energy/zeus/pull/191
- Add permissions to GitHub Action workflows by @jaywonchung in https://github.com/ml-energy/zeus/pull/192
Full Changelog: https://github.com/ml-energy/zeus/compare/zeus-v0.12.1...zeus-v0.12.2
Published by jaywonchung 6 months ago
Zeus - Zeus v0.12.1
Change Highlights
New PowerMonitor
Power measurement over time was not a first-class feature, but now it is. The new PowerMonitor allows you to measure (1) GPU 1s windowed average power, (2) GPU instantaneous power, and (3) GPU memory windowed average power -- if supported by your GPU model -- over time, and export deduplicated power samples into a list of timestamps and power measurements.
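To illustrate what "deduplicated power samples" could look like, here is a minimal sketch. It assumes that deduplication means dropping consecutive samples whose reading did not change between polls. The `deduplicate` function and the sample data are hypothetical and do not reproduce the `PowerMonitor` implementation.

```python
def deduplicate(samples: list[tuple[float, float]]) -> tuple[list[float], list[float]]:
    """Split (timestamp, power) samples into two lists, dropping consecutive repeats."""
    timestamps: list[float] = []
    powers: list[float] = []
    for ts, power in samples:
        # Keep a sample only when its reading differs from the last kept one.
        if not powers or power != powers[-1]:
            timestamps.append(ts)
            powers.append(power)
    return timestamps, powers


# Polling faster than the counter updates yields repeated readings (made-up data).
polled = [(0.0, 100.0), (0.1, 100.0), (0.2, 105.0), (0.3, 105.0), (0.4, 100.0)]
timestamps, powers = deduplicate(polled)
print(timestamps, powers)
```

Keeping the timestamp of the first occurrence of each new value preserves the moment at which the reading changed.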
Grace Hopper support
Zeus now supports measurements on Grace Hopper platforms. When you use the same Zeus APIs, it'll give you back the whole module's power and energy consumption (i.e., including the Grace CPU and the Hopper GPU). Support is still early stage, so please let us know if you bump into any rough edges.
uv
We're using uv in CI and local dev flow, and now uv.lock is in our codebase as well. Notably, uv has cut our CI time to literally half of what it used to be!
What's Changed
- [Feature] Grace Hopper support by @jaywonchung in https://github.com/ml-energy/zeus/pull/172
- ci: skip ci tests when only markdown files are changed in a push by @kitsiosk in https://github.com/ml-energy/zeus/pull/174
- Revert "ci: skip ci tests when only markdown files are changed in a push" by @jaywonchung in https://github.com/ml-energy/zeus/pull/175
- [UX] Improve `python -m zeus.show_env` by @jaywonchung in https://github.com/ml-energy/zeus/pull/176
- [CI] Remove unnecessary tests that just raise warnings by @jaywonchung in https://github.com/ml-energy/zeus/pull/177
- [CI] Use `uv` by @jaywonchung in https://github.com/ml-energy/zeus/pull/178
- [UX] Catch base errors in `python -m zeus.show_env` by @jaywonchung in https://github.com/ml-energy/zeus/pull/179
- [CI] Use latest uv to avoid GitHub API call by @jaywonchung in https://github.com/ml-energy/zeus/pull/180
- [UX] Improve `python -m show_env` for CPU/RAPL by @jaywonchung in https://github.com/ml-energy/zeus/pull/182
- [Zeusd] Check `libnvidia-ml.so.1` if `libnvidia-ml.so` is not available by @jaywonchung in https://github.com/ml-energy/zeus/pull/183
- New `PowerMonitor` by @jaywonchung in https://github.com/ml-energy/zeus/pull/184
New Contributors
- @kitsiosk made their first contribution in https://github.com/ml-energy/zeus/pull/174
Full Changelog: https://github.com/ml-energy/zeus/compare/zeus-v0.12.0...zeus-v0.12.1
Published by jaywonchung 8 months ago
Zeus - Zeus v0.12.0
Change Highlights
New SoC device measurement support!
We have a new device abstraction in zeus.device.soc. Measurements can be accessed from the soc field in ZeusMonitor measurement objects.
Apple Silicon
Zeus now provides energy measurement on Apple Silicon chips with component breakdowns like CPU, GPU, DRAM, and ANE (specifics depend on the underlying chip). This is done via a new child project called zeus-apple-silicon. Check out details in our documentation.
NVIDIA Jetson Platform
NVIDIA Jetson is an embedded platform for AI workloads. Zeus now supports energy measurement on Jetson platforms by reading off of its on-board power monitor. Check out details in our documentation.
Electricity price tracking
Via integration with the OpenEI API, Zeus now allows electricity price tracking with the EnergyCostMonitor class. Its API is essentially the same as ZeusMonitor (i.e., measurement windows).
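The arithmetic behind a cost estimate is to convert joules to kilowatt-hours and multiply by the price. The sketch below assumes a single flat price over the whole measurement window. The function name and the price are made up for illustration, and `EnergyCostMonitor` itself may handle prices that change over time.

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 1000 W * 3600 s


def energy_cost_usd(energy_joules: float, price_usd_per_kwh: float) -> float:
    """Cost of a measurement window under a flat electricity price (hypothetical helper)."""
    return energy_joules / JOULES_PER_KWH * price_usd_per_kwh


# 7.2 MJ is 2 kWh; the price of 0.15 USD/kWh is an arbitrary example value.
cost = energy_cost_usd(7.2e6, 0.15)
print(f"{cost:.2f} USD")
```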
What's Changed
- [Misc] Redirect stdout to `None` on `import amdsmi` by @jaywonchung in https://github.com/ml-energy/zeus/pull/154
- [CI] Upgrade QEMU image version to fix segfault in CI by @jaywonchung in https://github.com/ml-energy/zeus/pull/155
- [Feat] SoC Device Abstraction by @michahn01 in https://github.com/ml-energy/zeus/pull/160
- [Feat] Integrating Instruction Profiler in PFO.server.scheduler by @DdIiVvYyAaMm in https://github.com/ml-energy/zeus/pull/158
- [Fix] Update `amdsmi` exception handling by @michahn01 in https://github.com/ml-energy/zeus/pull/165
- [Docs] Code Block Fix by @DdIiVvYyAaMm in https://github.com/ml-energy/zeus/pull/166
- [CI] Fix Pyright private import errors, upgrade actions by @jaywonchung in https://github.com/ml-energy/zeus/pull/169
- [Feat] Update SoC device common by @michahn01 in https://github.com/ml-energy/zeus/pull/168
- [Feat] Created price.py to incorporate OpenEI API integration. by @vishwa-11 in https://github.com/ml-energy/zeus/pull/162
- [Feat] Apple Silicon Integration by @michahn01 in https://github.com/ml-energy/zeus/pull/170
- [Feat] Jetson platform measurement support by @jxunn in https://github.com/ml-energy/zeus/pull/167
- [Misc] Update project news by @jaywonchung in https://github.com/ml-energy/zeus/pull/171
New Contributors
- @DdIiVvYyAaMm made their first contribution in https://github.com/ml-energy/zeus/pull/158
- @vishwa-11 made their first contribution in https://github.com/ml-energy/zeus/pull/162
- @jxunn made their first contribution in https://github.com/ml-energy/zeus/pull/167
Full Changelog: https://github.com/ml-energy/zeus/compare/zeus-v0.11.0...zeus-v0.12.0
Published by jaywonchung 10 months ago
Zeus - Zeus Daemon v0.2.0
Change Highlights
CPU and DRAM energy measurements
Zeus daemon now also supports CPU and DRAM energy measurements with RAPL, which also requires root privileges just for measurement. Zeus daemon has also been integrated into the Zeus Python library, so as long as you have the daemon deployed and you set the ZEUSD_SOCK_PATH environment variable, you'll be all set!
What's Changed
- [Feat] Implement CPU and DRAM monitoring for `zeusd` by @wbjin in https://github.com/ml-energy/zeus/pull/137
- Incorporate Zeusd for CPU and DRAM monitoring in ZeusMonitor by @michahn01 in https://github.com/ml-energy/zeus/pull/150
- Trace GPU ID in Zeusd GPU routes by @jaywonchung in https://github.com/ml-energy/zeus/pull/152
Published by jaywonchung about 1 year ago
Zeus - Zeus v0.11.0
Change Highlights
Renamed to zeus!
Until now we used zeus-ml because the name zeus was taken on PyPI, but now we're finally able to move to zeus:
pip install zeus
Prometheus Metrics
Zeus power and energy measurements can now be exported as Prometheus metrics! We currently support three metrics:
- Energy consumption of a fixed code range (Histogram)
- Power draw over time (Gauge)
- Cumulative energy consumption over time (Counter)
We wrote up a detailed metric monitoring guide and integration examples.
AMD GPU enhancements
We created an official distribution of ROCm AMDSMI Python bindings (GitHub, PyPI) and integrated it with Zeus. Before this, users had to cd into their ROCm installation's AMDSMI distribution directory and run pip install, which isn't very convenient.
Carbon Emission Estimations
The new `zeus.monitor.carbon.CarbonEmissionMonitor` (https://ml.energy/zeus/reference/monitor/carbon/#zeus.monitor.carbon.CarbonEmissionMonitor) takes in a carbon intensity provider (e.g., from ElectricityMaps) and provides an estimate for operational carbon emissions. The window-based API is essentially the same as ZeusMonitor.
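As a rough picture of how such an estimate can be formed, the sketch below multiplies the energy used in each time slice by the carbon intensity reported for that slice and sums the results. The slicing scheme, function name, and numbers are assumptions for illustration and are not taken from `CarbonEmissionMonitor`.

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 1000 W * 3600 s


def carbon_emission_g(intervals: list[tuple[float, float]]) -> float:
    """Sum emissions over time slices.

    Each interval is (energy in joules, carbon intensity in gCO2eq/kWh).
    """
    return sum(energy_j / JOULES_PER_KWH * intensity for energy_j, intensity in intervals)


# Two made-up slices: 1 kWh at 400 gCO2eq/kWh, then 0.5 kWh at 300 gCO2eq/kWh.
hourly = [(3.6e6, 400.0), (1.8e6, 300.0)]
total_g = carbon_emission_g(hourly)
print(f"{total_g:.1f} gCO2eq")
```

Splitting the window into slices matters because grid carbon intensity changes over the day, so one average value can misstate emissions for long-running jobs.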
Full Changelog
- [Misc] Reorganize Zeus NSDI 23 paper artifacts by @jaywonchung in https://github.com/ml-energy/zeus/pull/126
- [Docs] Add `BUILD_SOCIAL_CARD` env, skip social card build by default by @jaywonchung in https://github.com/ml-energy/zeus/pull/130
- [Feat] `CarbonIntensityProvider` and ElectricityMaps implementation by @danielhou0515 in https://github.com/ml-energy/zeus/pull/129
- [Misc] Fix link in PLO example README by @jaywonchung in https://github.com/ml-energy/zeus/pull/136
- Fix typo in profiler script by @dkopczyk in https://github.com/ml-energy/zeus/pull/138
- [Feat] `amdsmi` bindings integration by @parthraut in https://github.com/ml-energy/zeus/pull/132
- Make sure to assign EmptyCPUs to cpus if there is a permission error by @wbjin in https://github.com/ml-energy/zeus/pull/139
- [Feat] Implement CPU and DRAM monitoring for `zeusd` by @wbjin in https://github.com/ml-energy/zeus/pull/137
- [Fix] Fix tests failing due to deprecated `app` argument in httpx client by @jaywonchung in https://github.com/ml-energy/zeus/pull/140
- Out of Bounds Power Limit in `GlobalPowerLimitOptimizer` by @parthraut in https://github.com/ml-energy/zeus/pull/143
- [CI] Upgrade `actions/cache` to V4 by @jaywonchung in https://github.com/ml-energy/zeus/pull/144
- [Misc] Update Perseus paper link by @jaywonchung in https://github.com/ml-energy/zeus/pull/145
- [feat] `CarbonEmissionMonitor` by @danielhou0515 in https://github.com/ml-energy/zeus/pull/148
- Update `zeusd` dependencies following dependabot suggestions by @jaywonchung in https://github.com/ml-energy/zeus/pull/149
- [Feat] Prometheus metric export by @sharonsyh in https://github.com/ml-energy/zeus/pull/134
- Pytorch Fully Sharded Data Parallel (FSDP) Integration by @parthraut in https://github.com/ml-energy/zeus/pull/147
- Rename package from `zeus-ml` to `zeus` by @jaywonchung in https://github.com/ml-energy/zeus/pull/151
- Incorporate Zeusd for CPU and DRAM monitoring in ZeusMonitor by @michahn01 in https://github.com/ml-energy/zeus/pull/150
- Trace GPU ID in Zeusd GPU routes by @jaywonchung in https://github.com/ml-energy/zeus/pull/152
New Contributors
- @dkopczyk made their first contribution in https://github.com/ml-energy/zeus/pull/138
- @michahn01 made their first contribution in https://github.com/ml-energy/zeus/pull/150
Published by jaywonchung about 1 year ago
Zeus - Zeus v0.10.1
This is a maintenance release aimed at enhancing usability and fixing small bugs.
What's Changed
- Feat: Catch `PermissionError` and raise with more information by @wbjin in https://github.com/ml-energy/zeus/pull/111
- Feat: Alternative RAPL directory inside Docker containers by @wbjin in https://github.com/ml-energy/zeus/pull/115
- Feat: added utility function to retrieve CPU index from PID by @danielhou0515 in https://github.com/ml-energy/zeus/pull/117
- Docs: More documentation on CPU monitoring by @wbjin in https://github.com/ml-energy/zeus/pull/118
- Feat: `python -m zeus.show_env` by @jaywonchung in https://github.com/ml-energy/zeus/pull/119
- Feat: `getAverageMemoryPowerUsage` by @jaywonchung in https://github.com/ml-energy/zeus/pull/122
- Fix: Add `getAverageMemoryPowerUsage` to `GPUs` as well by @jaywonchung in https://github.com/ml-energy/zeus/pull/124
New Contributors 🎉
- @danielhou0515 made their first contribution in https://github.com/ml-energy/zeus/pull/117
Full Changelog: https://github.com/ml-energy/zeus/compare/zeus-v0.10.0...zeus-v0.10.1
Published by jaywonchung over 1 year ago
Zeus - Zeus v0.10.0: Broader support
What's New
CPU and DRAM energy measurement
We implemented support for Intel RAPL, which allows CPU and DRAM energy measurement on supported CPUs.
Generally speaking, most Intel CPUs support both, and some AMD CPUs support RAPL as well, albeit only for CPU measurement.
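RAPL reports energy as an increasing counter that wraps around at a maximum value, so a measurement window has to account for wraparound (the changelog below mentions a RAPL monitor for exactly this). The sketch assumes the Linux powercap sysfs layout (`energy_uj` and `max_energy_range_uj` under `/sys/class/powercap/intel-rapl:0`) and at most one wraparound per window. The helper names are hypothetical and this is not Zeus's implementation.

```python
from pathlib import Path

# Assumed sysfs location of CPU package 0; it may differ on your system.
RAPL_DIR = Path("/sys/class/powercap/intel-rapl:0")


def read_uj(name: str) -> int:
    """Read an integer microjoule value from a RAPL sysfs file (often needs root)."""
    return int((RAPL_DIR / name).read_text().strip())


def energy_delta_uj(start_uj: int, end_uj: int, max_range_uj: int) -> int:
    """Energy consumed between two counter reads, assuming at most one wraparound."""
    if end_uj >= start_uj:
        return end_uj - start_uj
    # The counter passed its maximum and restarted from zero.
    return max_range_uj - start_uj + end_uj


# Made-up counter values in which the counter wrapped once.
print(energy_delta_uj(start_uj=900, end_uj=100, max_range_uj=1000))
```

On a real machine, `read_uj("energy_uj")` would be called at the start and end of the window and `read_uj("max_energy_range_uj")` would supply the wrap point.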
JAX support
We added preliminary JAX support. Check out our full example here.
API usage is mostly identical:
from zeus.monitor import ZeusMonitor

monitor = ZeusMonitor(sync_execution_with="jax") # JAX!
monitor.begin_window("computations")
# Run computation
measurement = monitor.end_window("computations")
Zeus Daemon
Our energy optimizers require changing settings on the GPU, including power limit and frequency. This requires admin privileges. More details in our docs.
Zeus Daemon lets you circumvent this by running as a standalone daemon process on the node that implements privileged operations on your behalf, so that you don't have to give the entire Zeus-integrated application admin privileges.
We wrote the Zeus Daemon in Rust: Check out the source code and crates.io for details.
Breaking Changes
ZeusMonitor.begin_window and ZeusMonitor.end_window's second parameter sync_cuda was renamed to sync_execution.
This is because JAX asynchronously runs CPU code as well, and we would like to synchronize both CUDA and CPU computations. This created the need to generalize sync_cuda to sync_execution.
Changelog
- Docs: Add warnings about instantiating `ZeusMonitor` as a global variable. by @jaywonchung in https://github.com/ml-energy/zeus/pull/68
- Docs: Fix typo by @Sunt-ing in https://github.com/ml-energy/zeus/pull/69
- Docs: Improve the GPU energy monitoring demo by @Sunt-ing in https://github.com/ml-energy/zeus/pull/70
- Feat: Detect and reject unofficial `pynvml` bindings by @jaywonchung in https://github.com/ml-energy/zeus/pull/71
- Fix: Pandas warnings from `PowerMonitor` by @jaywonchung in https://github.com/ml-energy/zeus/pull/75
- Feat: Zeus daemon by @jaywonchung in https://github.com/ml-energy/zeus/pull/81
- Test: Allow `zeusd` dev and testing on MacOS by @jaywonchung in https://github.com/ml-energy/zeus/pull/82
- Refactor: Reorg `zeus.device.gpu` by @jaywonchung in https://github.com/ml-energy/zeus/pull/83
- Feat: Integrate `zeusd` into `zeus.device.gpu` by @jaywonchung in https://github.com/ml-energy/zeus/pull/85
- Chore: Fix typo in GitHub Actions by @jaywonchung in https://github.com/ml-energy/zeus/pull/86
- Chore: `zeusd` debug outputs and doc comments by @jaywonchung in https://github.com/ml-energy/zeus/pull/87
- Feat: Add CPU measurement (via Intel RAPL) to ZeusMonitor by @wbjin in https://github.com/ml-energy/zeus/pull/90
- Fix: RAPL DRAM measurements not to be included in package measurements by @wbjin in https://github.com/ml-energy/zeus/pull/92
- Chore: Run checks in PRs from forks by @jaywonchung in https://github.com/ml-energy/zeus/pull/95
- Docs: Fix attribute name in `ZeusMonitor` example by @HGangloff in https://github.com/ml-energy/zeus/pull/96
- Feat: Add zero energy warning in `ZeusMonitor` by @sharonsyh in https://github.com/ml-energy/zeus/pull/93
- Feat: Add jax support in CUDA sync by @HGangloff in https://github.com/ml-energy/zeus/pull/97
- Docs: Refine JAX integration and example by @jaywonchung in https://github.com/ml-energy/zeus/pull/99
- Feat: Multi arch docker build by @sharonsyh in https://github.com/ml-energy/zeus/pull/104
- News: Add Perseus news and write Perseus blog by @jaywonchung in https://github.com/ml-energy/zeus/pull/107
- Feat: Multi-Arch Docker Build - Pushing to symbioticlab/zeus and mlenergy/zeus by @sharonsyh in https://github.com/ml-energy/zeus/pull/106
- Feat: RAPL Monitor for monitoring wraparounds for a rapl file by @wbjin in https://github.com/ml-energy/zeus/pull/105
- Test: Tests for CPU monitoring on ZeusMonitor by @wbjin in https://github.com/ml-energy/zeus/pull/100
- Chore: Fix lint warnings from ruff by @wbjin in https://github.com/ml-energy/zeus/pull/108
New Contributors 🎉
- @Sunt-ing made their first contribution in https://github.com/ml-energy/zeus/pull/69
- @wbjin made their first contribution in https://github.com/ml-energy/zeus/pull/90
- @HGangloff made their first contribution in https://github.com/ml-energy/zeus/pull/96
- @sharonsyh made their first contribution in https://github.com/ml-energy/zeus/pull/93
Full Changelog: https://github.com/ml-energy/zeus/compare/v0.9.1...zeus-v0.10.0
Published by jaywonchung over 1 year ago
Zeus - v0.9.0: Batch size optimizer and big cleanups
What's new
- The batch size optimizer is now a full-fledged server that can be deployed independently, with Docker Compose, or on Kubernetes + KubeFlow.
- GPU abstraction: We created an abstraction layer over GPU vendors (NVIDIA and AMD). We're on our way to supporting AMD GPUs.
- Completely revamped documentation under https://ml.energy/zeus.
Deprecated
- See #20 (`ZeusDataLoader`, `ZeusMaster`, and the C++ Zeus monitor)
Published by jaywonchung almost 2 years ago
Zeus - v0.8.0: Energy-efficient large model training
This release features Perseus, an optimizer for energy-efficient large model training.
See the Perseus docs for details.
Published by jaywonchung over 2 years ago
Zeus - v0.7.1: Moved to under `ml-energy`!
We moved our repository to under ml-energy. No feature changes :)
Published by jaywonchung over 2 years ago
Zeus - v0.7.0: Python-based power monitor
What's New
- We used to have a C++ power monitor under `zeus_monitor`, but we've deprecated that. There's no need for high-speed polling because NVML power counters do not update that quickly anyway.
  - In order to poll power consumption programmatically, use `zeus.monitor.power.PowerMonitor`.
- CLI power & energy monitor:
  - `python -m zeus.monitor power`
  - `python -m zeus.monitor energy`
- We switched from the old `setup.py` to the new package metadata standard `pyproject.toml`.
- Docker image sizes are drastically smaller now! The compressed image used to be 8.48 GB, but now it's down to 2.71 GB.
Published by jaywonchung over 2 years ago
Zeus - v0.6.1: `approx_instant_energy`
What's New
approx_instant_energy in ZeusMonitor
- Sometimes, the NVML energy counter update period is longer than the measurement window, in which case energy consumption may be returned as `0.0`. In this case, when `approx_instant_energy=True`, `ZeusMonitor` will approximate the energy consumption of the window as instant power consumption multiplied by the duration of the measurement window:

  $$\textrm{Energy} = \int_0^T \textrm{Power}(t) \, dt \approx \textrm{Power}(T) \cdot T$$
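A minimal sketch of that fallback logic follows. The function and parameter names are hypothetical, and the real `ZeusMonitor` internals may differ.

```python
def window_energy(
    measured_energy_j: float,
    instant_power_w: float,
    duration_s: float,
    approx_instant_energy: bool,
) -> float:
    """Return the window's energy, approximating it when the counter did not update."""
    if measured_energy_j == 0.0 and approx_instant_energy:
        # Energy ~= Power(T) * T, using the instantaneous power at the window's end.
        return instant_power_w * duration_s
    return measured_energy_j


# A 20 ms window in which the energy counter did not tick (made-up numbers).
approx = window_energy(0.0, instant_power_w=250.0, duration_s=0.02, approx_instant_energy=True)
print(approx)
```

The approximation is only as good as the assumption that power stays roughly constant over the window, which is more plausible for short windows.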
Published by jaywonchung over 2 years ago
Zeus - v0.6.0: `OptimumSelector`
What's New
OptimumSelector
- Until now, the optimal power limit for `GlobalPowerLimitOptimizer` was the one that minimizes the Zeus time-energy cost. Not everyone would want that.
- Now, `OptimumSelector` is an abstract base class with which you can implement your own optimal power limit selection policy.
- Pre-implemented ones are `Time`, `Energy`, `ZeusCost`, and `MaxSlowdownConstraint`. These are thoroughly tested.
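To make the idea concrete, here is a simplified sketch of a selector hierarchy. The `Measurement` dataclass, the `select` signature, and the exact policy of each class are assumptions made for illustration. Consult `zeus.optimizer.power_limit` for the real interface.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Measurement:
    """Profiling result for one power limit (hypothetical structure)."""

    power_limit: int  # watts
    energy: float  # joules consumed during the profiling window
    time: float  # seconds taken by the profiling window


class OptimumSelector(ABC):
    @abstractmethod
    def select(self, measurements: list[Measurement]) -> int:
        """Return the power limit considered optimal."""


class Energy(OptimumSelector):
    """Pick the power limit with the lowest energy consumption."""

    def select(self, measurements: list[Measurement]) -> int:
        return min(measurements, key=lambda m: m.energy).power_limit


class MaxSlowdownConstraint(OptimumSelector):
    """Pick the lowest power limit whose slowdown stays within `factor`."""

    def __init__(self, factor: float) -> None:
        self.factor = factor

    def select(self, measurements: list[Measurement]) -> int:
        fastest = min(m.time for m in measurements)
        allowed = [m for m in measurements if m.time <= self.factor * fastest]
        return min(allowed, key=lambda m: m.power_limit).power_limit


# Made-up profiling results for three power limits.
results = [
    Measurement(power_limit=300, energy=1000.0, time=10.0),
    Measurement(power_limit=250, energy=900.0, time=10.5),
    Measurement(power_limit=200, energy=850.0, time=12.0),
]
print(Energy().select(results), MaxSlowdownConstraint(factor=1.1).select(results))
```

A custom policy only needs to subclass `OptimumSelector` and implement `select`, which is the extension point this release describes.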
wait_steps
- Now, you can specify `wait_steps` in `GlobalPowerLimitOptimizer`, and it'll wait for the specified number of steps before profiling and optimizing.
- `wait_steps` is set to 1 by default because users may have `torch.backends.cudnn.benchmark = True` and `DataLoader` workers usually need time to warm up before ramping up to their normal fetch throughput.
Breaking Changes
- `GlobalPowerLimitOptimizer` now takes an instance of `OptimumSelector` in its constructor, instead of `eta_knob`. If you want to recover the functionality of v0.5.0, modify your code like this:

      # Before
      plo = GlobalPowerLimitOptimizer(..., eta_knob=0.5, ...)

      # After
      from zeus.optimizer.power_limit import ZeusCost
      plo = GlobalPowerLimitOptimizer(..., optimum_selector=ZeusCost(eta_knob=0.5), ...)
Published by jaywonchung over 2 years ago
Zeus - v0.5.0: Big refactor, `GlobalPowerLimitOptimizer`
What's New
Callback-based architecture
- `zeus.callback.Callback` is the new backbone for Zeus components.
- `GlobalPowerLimitOptimizer` is the shiny new way to online-profile and optimize the power limit of DNN training.
- `EarlyStopController` monitors and manages all sorts of conditions to determine whether training should stop.
Extensive testing
- `tests/` is richer than ever. With deep component tests with exhaustive parametrization, there are now around 1500 test cases.
- Especially, `zeus.util.testing.ReplayZeusMonitor` exposes the same public API as `ZeusMonitor` but replays the measurement window logs produced by `ZeusMonitor`, instead of doing actual measurement. With this, Zeus can now be tested without any actual GPUs.
Published by jaywonchung over 2 years ago
Zeus - v0.4.0: `ZeusMonitor`
What's New
- Just measuring energy with Zeus has been non-trivial. Now, `ZeusMonitor` is the only way to measure time and energy consumed by an arbitrary set of GPUs from executing an arbitrary range of code. There should be one -- and preferably only one -- obvious way to do it.
- `ZeusDataLoader` was refactored to build around `ZeusMonitor`.
- `ZeusMonitor` is quite thoroughly tested now.
Published by jaywonchung almost 3 years ago
Zeus - v0.3.0: `ZeusMonitorContext` for in-training-loop profiling
What's New
- `ZeusMonitorContext` allows users to profile their per-iteration energy and time consumption.
- It's aimed at those who would like to get a feel for the energy consumption of their DNN training job with a couple of additional lines (as opposed to modified lines).
- Documentation and integration example: here
Published by jaywonchung over 3 years ago
Zeus - v0.2.0: Single-Node Data Parallel Support
New Features
- Single-node multi-GPU data parallel training support added (#2)
- `zeus_monitor` is built at Docker image build time and baked into the image (#6)
Breaking Changes
- `ZeusDataLoader`'s profile window for each power limit is now based on the number of iterations, not time. (#2)
  - This was done to ease synchronization between GPUs while profiling power limits.
- The `ZEUS_PROFILE_PARAMS` environment variable is now parsed as a comma-separated string of the number of warmup and measure iterations.
- `ZeusMaster`'s constructor now takes arguments `profile_warmup_iters` and `profile_measure_iters`.
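As an illustration of the new format, the sketch below parses a value such as `10,40` into warmup and measure iteration counts. The function name, the ordering (warmup first, as listed above), and the example value are assumptions, not Zeus's actual parsing code.

```python
def parse_profile_params(raw: str) -> tuple[int, int]:
    """Parse 'warmup,measure' into two iteration counts (hypothetical helper)."""
    parts = [p.strip() for p in raw.split(",")]
    if len(parts) != 2:
        raise ValueError(f"expected 'warmup,measure', got {raw!r}")
    warmup_iters, measure_iters = (int(p) for p in parts)
    return warmup_iters, measure_iters


# Example value of the kind one might export as ZEUS_PROFILE_PARAMS.
warmup, measure = parse_profile_params("10,40")
print(warmup, measure)
```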
Published by jaywonchung over 3 years ago