Releases: lfwa/carbontracker
# v2.0.1: Log parsing fix, GitHub Actions updates
v2.0.0: Documentation site, deprecation of (some) fetchers and more.
Features
📘 Documentation site is (almost) live!
We have set up a new documentation site on GitHub Pages using mkdocs and mkdocstrings. It will be available very soon.
🔧 Better type hints
Many of the classes and functions in Carbontracker now have type hints for nicer IDE integration, automatic documentation and fewer type-related bugs. This meant changing the return types of some functions slightly; see the breaking changes below.
🗺 Updated default intensity values and PUE
When Carbontracker cannot read intensity values from its API, it falls back to a national average value. These values were updated to the latest reports from Our World in Data. The PUE used for the computations was also updated using the latest global average from Uptime Institute.
⌨️ Added carbontracker --parse <log_dir> command
This command aggregates all logs in a folder, making it easier to read one's total carbon emissions across multiple experiments.
⚠️ Breaking changes
❗️ Change of return types and parameters
While adding type hints, we found that some internal functions had inconsistent typing, so we tightened their parameter and return types.
- The devices_by_pid parameter for components no longer accepts dictionaries
- The power_usage function in handlers now always returns lists of floats
❗️ Change of monitor_epochs default value from 1 to -1
When using carbontracker.tracker.CarbonTracker, the previous default value of monitor_epochs meant that actual consumption would be printed after 1 epoch. This does not align with the most common use case of wanting the actual consumption after training has been completed.
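To make the new default concrete, here is a minimal sketch of how a sentinel value of -1 could be resolved to "monitor every epoch". The helper `resolve_monitor_epochs` is hypothetical and not part of Carbontracker's API; it only illustrates the semantics described above, under the assumption that -1 stands for all epochs.

```python
def resolve_monitor_epochs(epochs: int, monitor_epochs: int = -1) -> int:
    """Return how many epochs are monitored before actual consumption is reported.

    Hypothetical helper: -1 is assumed to mean "monitor all epochs".
    """
    if monitor_epochs == -1:
        return epochs
    if monitor_epochs < 1 or monitor_epochs > epochs:
        raise ValueError("monitor_epochs must be -1 or between 1 and epochs")
    return monitor_epochs


# With the new default, reporting happens once training has completed.
new_default = resolve_monitor_epochs(epochs=10)
# Passing monitor_epochs=1 explicitly mirrors the previous default.
old_default = resolve_monitor_epochs(epochs=10, monitor_epochs=1)
print(new_default, old_default)
```

Code that relied on the old behaviour can pass monitor_epochs=1 explicitly when constructing the tracker.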
❗ ElectricityMaps is now the default fetcher for all regions.
This means that while EnergiDataService and CarbonIntensityGB still exist as modules, CarbonTracker and the CLI will now only use ElectricityMaps. In addition, a warning now appears whenever ElectricityMaps is missing an API key.
Minor internal changes and bug fixes
🐍 Set up testing across different Python versions
This runs both in our CI setup using GitHub Actions and locally using tox.
This also included fixing a few version-specific bugs on Python 3.7, 3.8, 3.11 and 3.12.
🐛 Log statements no longer go into overdrive on Jupyter Notebooks.
Originally posed as issue #70 (thanks @andreasgoethals!)
Before, every newly instantiated CarbonTracker object created new threads. This is problematic in interactive computing environments like Jupyter Notebook, where one might accidentally call tracker = CarbonTracker(...) many times. This was solved by fixing how Carbontracker identifies threads.
🪳 Fixed log parsing on logs generated by the CLI
Acknowledgements
This release was developed by @Snailed, @PedramBakh, @raghavian. A special thanks to @andreasgoethals and @jonathanwww for pinpointing bugs.
v1.2.1: CarbonTracker CLI Tool Update
🛠️ CarbonTracker CLI Tool Update
Issues: #60, #61
We've made a minor update to the CarbonTracker CLI tool. It now supports arbitrary command execution, broadening its utility beyond Python scripts. Do note that programs need to be executable for this to work.
You can use the tool in this manner:
carbontracker myscript arg1 arg2 --log_dir ./logs
CarbonTracker v1.2.0: Live Carbon Intensity from 160+ Regions, CLI Tool, and Apple Silicon Support
Highlights
🚀 Introducing the CarbonTracker CLI Tool
We are pleased to introduce the Command Line Interface (CLI) tool for CarbonTracker. This addition offers an efficient way to monitor and manage the carbon footprint of your Python scripts. Upon installing CarbonTracker via PyPI, the CLI tool is immediately available for use.
For straightforward usage without live carbon intensity API integration:
carbontracker --script train.py --log_dir ./logs
For users aiming to use live carbon intensity measurements (a feature also introduced in this release, detailed below), the API key can be integrated with the CarbonTracker CLI tool as follows:
carbontracker --script train.py --log_dir ./logs --api_keys '{"electricitymaps": "YOUR_KEY_HERE"}'
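The --api_keys value is a single shell argument holding a JSON-style object, which is why it is wrapped in single quotes. As a sketch, the same string can be decoded in Python with json.loads. The helper `parse_api_keys` is hypothetical, and whether the CLI itself decodes the argument this way is an assumption.

```python
import json


def parse_api_keys(raw: str) -> dict:
    """Decode an --api_keys style string into a dict (hypothetical helper)."""
    keys = json.loads(raw)
    if not isinstance(keys, dict):
        raise ValueError("api_keys must be a JSON object mapping service names to keys")
    return keys


raw_argument = '{"electricitymaps": "YOUR_KEY_HERE"}'
api_keys = parse_api_keys(raw_argument)
print(api_keys["electricitymaps"])
```

The resulting dictionary has the same shape as the api_keys argument accepted by the CarbonTracker constructor.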
🔌 Transition from CO2Signal API
Issues: #1, #52
We have phased out support for the standalone CO2Signal API in favor of its integration into the ElectricityMaps API. This transition ensures greater consistency and addresses previous timeout issues experienced with consecutive requests.
🍏 OS X Support for Apple Silicon Chips
Issue: #24
We have rolled out support for OS X on M1/M2 Apple Silicon chips. This support encompasses all cores of the CPU and GPU, including the neural engine. Note: To initiate power measurements on these chips, users will be required to grant sudo access to the script.
📢 Enhanced Feedback with Verbose Setting
Issue: #35
An identified issue where setting verbose=0 rendered both stdout and the output log empty has been addressed. With the current update, the verbose setting will only affect stdout, leaving the output log intact.
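The separation described here can be sketched with Python's standard logging module: the handler that writes the log is always attached, while the console handler depends on the verbosity level. This is an illustration only, not Carbontracker's actual logger setup; `build_logger` and the in-memory streams are hypothetical stand-ins.

```python
import io
import logging
import sys


def build_logger(verbose: int, log_stream, out_stream=sys.stdout) -> logging.Logger:
    """Return a logger whose log handler ignores `verbose` (hypothetical setup).

    `log_stream` stands in for the output log file.
    """
    logger = logging.getLogger(f"sketch.verbose.{id(log_stream)}")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    # The log handler is always attached, whatever `verbose` is.
    logger.addHandler(logging.StreamHandler(log_stream))
    # The console handler is attached only when console output is requested.
    if verbose > 0:
        logger.addHandler(logging.StreamHandler(out_stream))
    return logger


log_file, console = io.StringIO(), io.StringIO()
logger = build_logger(verbose=0, log_stream=log_file, out_stream=console)
logger.info("epoch 1 finished")
print(repr(log_file.getvalue()))  # the log still receives the message
print(repr(console.getvalue()))   # the console stream stays empty
```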
📏 Decimal Precision Update
Issues: #25, #45
We've increased the default decimal precision to 12 to align with kWh and gCO2/kWh units, which are standards in the energy sector. This enhancement has been integrated without affecting existing functionality.
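A short sketch shows why precision matters for these units: a brief measurement period can consume far less than a thousandth of a kWh, so a low decimal precision would print it as zero. The function `format_measurement` is hypothetical; how Carbontracker formats its output internally is not shown in these notes.

```python
def format_measurement(value: float, unit: str, precision: int = 12) -> str:
    """Format a measurement with a fixed number of decimals (hypothetical helper)."""
    return f"{value:.{precision}f} {unit}"


energy_kwh = 0.000001234567  # a very short measurement period

low_precision = format_measurement(energy_kwh, "kWh", precision=6)
default_precision = format_measurement(energy_kwh, "kWh")
print(low_precision)      # most significant digits are lost
print(default_precision)  # all twelve decimals are kept
```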
🚨 Enhanced Carbon Intensity Estimation Notifications
Issue: #43
A gap was identified where users were not alerted when default fallback values were used for carbon intensity estimations. This has been addressed to provide notifications in both the log file and stdout.
⚡ Performance Optimization
Issue: #41
Feedback regarding performance slowdowns attributed to busy-waiting in the CarbonTrackerThread() has been addressed. We have transitioned to an event-based approach, resulting in optimized performance.
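The difference between busy-waiting and an event-based loop can be sketched with threading.Event. The class below is a hypothetical worker, not the actual CarbonTrackerThread; it blocks in Event.wait between samples instead of spinning on a flag.

```python
import threading
import time


class PeriodicSampler(threading.Thread):
    """Hypothetical worker that takes a sample every `interval` seconds."""

    def __init__(self, interval: float):
        super().__init__(daemon=True)
        self.interval = interval
        self.samples = 0
        self._stop_event = threading.Event()

    def run(self):
        # wait() blocks until the timeout expires or stop() sets the event,
        # so the thread does not spin between samples.
        while not self._stop_event.wait(self.interval):
            self.samples += 1

    def stop(self):
        # Setting the event wakes the worker immediately, then we join it.
        self._stop_event.set()
        self.join()


sampler = PeriodicSampler(interval=0.01)
sampler.start()
time.sleep(0.1)
sampler.stop()
print(sampler.samples > 0)
```

A busy-waiting version would replace the wait call with a loop that repeatedly checks a boolean, which keeps a CPU core occupied and can slow down the workload being measured.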
🛠️ Additional Updates
- An issue related to fetching NVML device names in carbontracker/components/gpu/nvidia.py for Python versions below 3.10 has been resolved. Issue: #53
- We have extended our support for live carbon intensity measurements through integration with the ElectricityMaps API, enabling access to over 160 regions. Issue: #54
To leverage this feature, refer to the example below:
from carbontracker.tracker import CarbonTracker
from time import sleep

max_epochs = 10
api_keys = {
    "electricitymaps": "YOUR_API_KEY_HERE"
}
tracker = CarbonTracker(epochs=max_epochs, log_dir="./logs", api_keys=api_keys)

# Training loop.
for epoch in range(max_epochs):
    tracker.epoch_start()
    # Your work here; sleep() stands in for a training step.
    sleep(1)
    tracker.epoch_end()

# Ensure actual consumption is reported even if the loop exits early.
tracker.stop()
We are committed to providing valuable updates to enhance your experience with CarbonTracker.
CarbonTracker v1.1.7: Country-specific data, updated measurement factors, logging labels and error handling
Highlights
This release addresses issues preventing users from using CarbonTracker due to the new kernel update on Linux, which necessitates root privileges for energy measurements through Intel's RAPL monitoring. CarbonTracker now throws more descriptive error messages in the case above and when GPUs do not support the retrieval of power usages in NVML.
The constant factors for estimating ML tasks' power consumption and carbon footprint are now updated, reflecting the latest numbers reported by the European Environment Agency (EEA). Also, the carbon intensity values are now country-specific.
Additionally, we have resolved a bug that caused incorrect logging data when using multiple instances of CarbonTracker in the same script, and added an option for prefixing (labelling) logging files for individual instances of CarbonTracker.
Summary:
- Catch Intel RAPL permission error (Issue: #40)
- Throw descriptive error message when GPU does not support retrieval of power usages in NVML (Issue: #36)
- Fix the issue with log files being overwritten due to short measurement periods when multiple instances of carbontracker are instantiated.
- Add prefix labelling for individual logging instances (Issue: #26)
- Fix energydataservice API (Issue: #46)
- Update PUE
- Updated default/fallback value for when live carbon intensity cannot be fetched
- Update carbon intensity to be country-specific (PR: #49)
- Update factor for equivalent km travelled by car
- Deprecate support for Python 3.6 (Issue: #48)
Monitoring power usage
Intel RAPL
A new security update for the Linux kernel now requires root privileges to access CPU power consumption through Intel's RAPL interface. This, unfortunately, caused jobs to abort when using CarbonTracker. In such cases we now skip CPU power monitoring to prevent crashes and show a message that explains the issue and where to look for a fix.
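The behaviour described above amounts to treating a permission failure as "CPU power unavailable" rather than as a fatal error. Below is a minimal sketch, assuming the commonly used powercap sysfs path for the first RAPL package; the function name, the path constant and the message text are illustrative and not taken from Carbontracker's code.

```python
from typing import Optional

# Typical sysfs location of the package-0 energy counter (assumed; may differ per system).
RAPL_ENERGY_FILE = "/sys/class/powercap/intel-rapl/intel-rapl:0/energy_uj"


def read_rapl_energy_uj(path: str = RAPL_ENERGY_FILE) -> Optional[int]:
    """Return the energy counter in microjoules, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except PermissionError:
        # Do not abort the job: report the problem and skip CPU monitoring.
        print(f"No permission to read {path}; skipping CPU power monitoring.")
        return None
    except OSError:
        # RAPL is not available on this machine (for example, the file is missing).
        return None


missing = read_rapl_energy_uj("/nonexistent/energy_uj")
print(missing)
```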
GPUs not supporting retrieval of power usages in NVML
Not all GPUs support retrieval of power usage through the NVML API, which is used for monitoring the power usage of GPUs. The user was previously left uninformed about this issue, and monitoring the remaining hardware components would continue. A message is now shown informing the user of the issue with a link for where to find additional information.
Logging
Multiple instances
There was an issue where log files would be overwritten due to short measurement periods when multiple instances of carbontracker were instantiated, since timestamps were used for naming logging files. The corresponding process ID of the logged task now prefixes logging files. Logging files now have the format processID_timestamp_carbontracker.log for the standard log and processID_timestamp_carbontracker_output.log for the output log.
Label log runs
It is now possible to label monitoring instances - logging files - by providing a prefix when instantiating CarbonTracker:
from carbontracker.tracker import CarbonTracker

max_epochs = 10  # example value
tracker = CarbonTracker(epochs=max_epochs, log_dir="logs", log_file_prefix="prefix")

# Training loop.
for epoch in range(max_epochs):
    tracker.epoch_start()
    # Your model training.
    tracker.epoch_end()

# Optional: Add a stop in case of early termination before all monitor_epochs have
# been monitored to ensure that actual consumption is reported.
tracker.stop()
The resulting log files will have the format prefix_processID_timestamp_carbontracker.log.
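The naming scheme for both the unlabelled and the labelled case can be sketched as a small function. The helper `log_file_names` is hypothetical, the timestamp layout is an assumption (the notes do not specify it), and applying the prefix to the output log as well is also assumed.

```python
import os
import time
from typing import Optional, Tuple


def log_file_names(prefix: Optional[str] = None,
                   pid: Optional[int] = None,
                   timestamp: Optional[str] = None) -> Tuple[str, str]:
    """Build (standard_log, output_log) names following the described format."""
    pid = os.getpid() if pid is None else pid
    if timestamp is None:
        # Assumed timestamp layout, for illustration only.
        timestamp = time.strftime("%Y-%m-%dT%H%MZ", time.gmtime())
    parts = [str(pid), timestamp, "carbontracker"]
    if prefix:
        parts.insert(0, prefix)
    stem = "_".join(parts)
    return f"{stem}.log", f"{stem}_output.log"


unlabelled = log_file_names(pid=1234, timestamp="TS")
labelled = log_file_names(prefix="run1", pid=1234, timestamp="TS")
print(unlabelled)
print(labelled)
```

Because the process ID is part of the name, two trackers started within the same timestamp window in different processes no longer write to the same file.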
Measurements
Carbon intensity
We updated the default/fallback value used when live carbon intensity cannot be fetched. We now use the latest average carbon intensity for the detected country. If that fails, we default to the 2019 worldwide average carbon intensity of 475 gCO2eq/kWh. The data is produced by a script that generates a small .csv file from our data source.
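The two-level fallback can be sketched as a lookup with a default. Only the 475 gCO2eq/kWh world average comes from the text above; the country table below holds placeholder values rather than real data, and `fallback_intensity` is a hypothetical name.

```python
from typing import Dict, Optional

WORLD_AVERAGE_2019 = 475.0  # gCO2eq/kWh, as stated in the release notes

# Placeholder entries for illustration only; these are not real country averages.
COUNTRY_AVERAGES: Dict[str, float] = {"XX": 100.0, "YY": 600.0}


def fallback_intensity(country_code: Optional[str],
                       averages: Dict[str, float] = COUNTRY_AVERAGES) -> float:
    """Return the country average if known, otherwise the world average."""
    if country_code is not None and country_code in averages:
        return averages[country_code]
    return WORLD_AVERAGE_2019


print(fallback_intensity("XX"))  # country found in the table
print(fallback_intensity("ZZ"))  # unknown country, world average used
print(fallback_intensity(None))  # country could not be detected
```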
PUE
The values for estimating power consumption and carbon footprint are now updated. The PUE is now 1.55.
Conversion
The CO2-performance of new passenger cars used is now 107.5. This value is used for estimating the CO2 equivalent emission for km travelled by car.
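As a worked example of how these constants combine, the sketch below scales measured energy by the PUE, converts it to emissions with a carbon intensity, and expresses the result as equivalent distance travelled by car. It assumes the 107.5 figure is in gCO2 per km, which the notes do not state explicitly, and `footprint` is a hypothetical function, not Carbontracker's own.

```python
from typing import Tuple

PUE = 1.55                 # from the release notes
CAR_G_CO2_PER_KM = 107.5   # from the release notes; unit gCO2/km is an assumption


def footprint(measured_kwh: float, intensity_g_per_kwh: float) -> Tuple[float, float, float]:
    """Return (total energy in kWh, emissions in gCO2eq, equivalent km by car)."""
    energy_kwh = measured_kwh * PUE              # account for data centre overhead
    co2_g = energy_kwh * intensity_g_per_kwh     # energy to emissions
    km_by_car = co2_g / CAR_G_CO2_PER_KM         # emissions to distance
    return energy_kwh, co2_g, km_by_car


# 2 kWh measured at the 475 gCO2eq/kWh world-average fallback.
energy, co2, km = footprint(measured_kwh=2.0, intensity_g_per_kwh=475.0)
print(f"{energy:.2f} kWh, {co2:.1f} gCO2eq, {km:.1f} km")
```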
API
The energydataservice API changed, and we have adjusted the API calls accordingly.
CarbonTracker v1.1.9-test: Country-specific data, updated measurement factors, logging labels and error handling