Releases: lfwa/carbontracker

# v2.0.1: Log parsing fix, GitHub Actions updates

18 Nov 13:35

🐛 Bug fixes

  • Fixed double-counting of log statements when multiple components were used and epochs were short.

👷 Internal changes

  • Update GitHub Actions (thanks @andife!)

Contributors:
@Snailed , @andife

v2.0.0: Documentation site, deprecation of (some) fetchers and more.

16 Sep 08:48

Features

📘 Documentation site is (almost) live!

We have set up a new documentation site on GitHub Pages using MkDocs and mkdocstrings. It will be available very soon.

🔧 Better type hints

Many of the classes and functions in Carbontracker now have type hints for nicer IDE integration, automatic documentation and fewer type-related bugs. This meant slightly changing the return types of some functions; see ⚠️ Breaking changes below.

🗺 Updated default intensity values and PUE

When Carbontracker cannot read intensity values from its API, it falls back to a national average value. These values were updated to the latest figures from Our World in Data. The PUE used in the computations was also updated to the latest global average from Uptime Institute.

⌨️ Added carbontracker --parse <log_dir> command

This command aggregates all logs in a folder, making it easier to read one's total carbon emissions across multiple experiments.

⚠️ Breaking changes

❗️ Change of return types and parameters

While adding type hints, we found that some internal functions had inconsistent typing, and decided to make their parameter and return types stricter.

  • The devices_by_pid parameter for components no longer accepts dictionaries
  • The power_usage function in handlers now always returns lists of floats
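
As a rough illustration of what "always returns lists of floats" means for callers, the sketch below normalizes a scalar or an iterable of numbers into a list of floats. The helper name `as_float_list` is hypothetical and is not part of Carbontracker; it only shows the shape of the stricter return type.

```python
from typing import Iterable, List, Union

Number = Union[int, float]


def as_float_list(value: Union[Number, Iterable[Number]]) -> List[float]:
    """Normalize a single reading or several readings to a list of floats."""
    if isinstance(value, (int, float)):
        # A single reading becomes a one-element list.
        return [float(value)]
    return [float(v) for v in value]


print(as_float_list(3))
print(as_float_list((1, 2.5)))
```

With a single, predictable return type, calling code can iterate over the readings without first checking whether it received a scalar or a collection.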

❗️ Change of monitor_epochs default value from 1 to -1

When using carbontracker.tracker.CarbonTracker, the previous default of monitor_epochs=1 meant that actual consumption was printed after the first epoch. This did not match the most common use case of wanting the actual consumption once training has completed, so the default is now -1, which monitors all epochs.

❗️ ElectricityMaps is now the default fetcher for all regions

This means that while EnergiDataService and CarbonIntensityGB still exist as modules, CarbonTracker and the CLI now only use ElectricityMaps. In addition, a warning now appears whenever ElectricityMaps is missing an API key.

Minor internal changes and bug fixes

🐍 Set up testing across different Python versions

This runs both in our CI setup using GitHub Actions and locally using tox.
This also included fixing a few version-specific bugs on Python 3.7, 3.8, 3.11 and 3.12.

🐛 Log statements no longer go into overdrive on Jupyter Notebooks.

Originally posed as issue #70 (thanks @andreasgoethals !)
Before, for every new CarbonTracker object instantiated, new threads would be created. This is problematic in interactive computing environments like Jupyter Notebook where one might accidentally call tracker = CarbonTracker(...) many times. This was solved by fixing how Carbontracker identifies threads.

🪳 Fixed log parsing on logs generated by the CLI

Acknowledgements

This release was developed by @Snailed, @PedramBakh, @raghavian. A special thanks to @andreasgoethals and @jonathanwww for pinpointing bugs.

v1.2.1: CarbonTracker CLI Tool Update

13 Sep 06:36

🛠️ CarbonTracker CLI Tool Update

Issues: #60, #61
We've made a minor update to the CarbonTracker CLI tool. It now supports arbitrary command execution, broadening its utility beyond Python scripts. Do note that programs need to be executable for this to work.

You can use the tool in this manner:

carbontracker myscript arg1 arg2 --log_dir ./logs

CarbonTracker v1.2.0: Live Carbon Intensity from 160+ Regions, CLI Tool, and Apple Silicon Support

11 Sep 15:16

Highlights

🚀 Introducing the CarbonTracker CLI Tool

We are pleased to introduce the Command Line Interface (CLI) tool for CarbonTracker. This addition offers an efficient way to monitor and manage the carbon footprint of your Python scripts. Once CarbonTracker is installed from PyPI, the CLI tool is immediately available for use.

For straightforward usage without live carbon intensity API integration:

carbontracker --script train.py --log_dir ./logs

For users aiming to use live carbon intensity measurements (a feature also introduced in this release, detailed below), the API key can be integrated with the CarbonTracker CLI tool as follows:

carbontracker --script train.py --log_dir ./logs --api_keys '{"electricitymaps": "YOUR_KEY_HERE"}'

🔌 Transition from CO2Signal API

Issues: #1, #52
We have phased out support for the standalone CO2Signal API in favor of its integration into the ElectricityMaps API. This transition ensures greater consistency and addresses previous timeout issues experienced with consecutive requests.

🍏 OS X Support for Apple Silicon Chips

Issue: #24
We have rolled out support for OS X on M1/M2 Apple Silicon chips. This support encompasses all cores of the CPU and GPU, including the neural engine. Note: To initiate power measurements on these chips, users will be required to grant sudo access to the script.

📢 Enhanced Feedback with Verbose Setting

Issue: #35
An identified issue where setting verbose=0 rendered both stdout and the output log empty has been addressed. With the current update, the verbose setting will only affect stdout, leaving the output log intact.
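
The sketch below illustrates the behaviour described above using only Python's standard logging module. It is not Carbontracker's actual logger setup: the `build_logger` helper and its handler layout are assumptions made for illustration. The log handler is always attached, while the console handler is attached only when `verbose` is greater than 0.

```python
import io
import logging
import sys


def build_logger(name, log_stream, verbose=1, stdout=None):
    """Return a logger that always writes to log_stream, and to stdout only if verbose > 0."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()

    # The output log is always attached, independent of verbose.
    logger.addHandler(logging.StreamHandler(log_stream))

    # Console output is optional and controlled by verbose.
    if verbose > 0:
        console_stream = stdout if stdout is not None else sys.stdout
        logger.addHandler(logging.StreamHandler(console_stream))
    return logger


log_file = io.StringIO()  # stands in for the output log file
console = io.StringIO()   # stands in for stdout
logger = build_logger("sketch.verbose", log_file, verbose=0, stdout=console)
logger.info("Epoch 1 finished")
```

With `verbose=0`, the message still reaches `log_file` while `console` stays empty.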

📏 Decimal Precision Update

Issues: #25, #45
We've increased the default decimal precision to 12 to align with kWh and gCO2/kWh units, which are standards in the energy sector. This enhancement has been integrated without affecting existing functionality.
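
To see why the extra decimals matter, consider a very short measurement. The snippet below is only an illustration with a made-up energy value; it does not reproduce Carbontracker's own formatting code.

```python
energy_kwh = 0.000001234567  # hypothetical energy use of a very short epoch

# Six decimals lose almost all of the measurement.
low_precision = f"{energy_kwh:.6f}"
# Twelve decimals keep the significant digits.
high_precision = f"{energy_kwh:.12f}"

print(low_precision)
print(high_precision)
```

With six decimals the value collapses to 0.000001, whereas twelve decimals preserve it as 0.000001234567.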

🚨 Enhanced Carbon Intensity Estimation Notifications

Issue: #43
A gap was identified where users were not alerted when default fallback values were used for carbon intensity estimations. This has been addressed to provide notifications in both the log file and stdout.

⚡ Performance Optimization

Issue: #41
Feedback regarding performance slowdowns attributed to busy-waiting in the CarbonTrackerThread() has been addressed. We have transitioned to an event-based approach, resulting in optimized performance.
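
The difference between busy-waiting and an event-based approach can be sketched with `threading.Event`. The `SamplerThread` class below is hypothetical and much simpler than CarbonTrackerThread; it only shows a thread that blocks on an event instead of spinning in a loop while nothing is being measured.

```python
import threading
import time


class SamplerThread(threading.Thread):
    """Hypothetical sampler that sleeps on an Event instead of busy-waiting."""

    def __init__(self, interval=0.01):
        super().__init__(daemon=True)
        self.interval = interval
        self.measuring = threading.Event()  # set while an epoch is being measured
        self.stopped = threading.Event()
        self.samples = 0

    def run(self):
        while not self.stopped.is_set():
            # Blocks without burning CPU until measuring is switched on.
            if not self.measuring.wait(timeout=0.1):
                continue
            self.samples += 1  # stands in for reading power usage
            # Sleep for the sampling interval, waking early on stop.
            self.stopped.wait(self.interval)

    def stop(self):
        self.stopped.set()
        self.measuring.set()  # wake the thread if it is waiting
        self.join()


sampler = SamplerThread()
sampler.start()
sampler.measuring.set()    # epoch start
time.sleep(0.2)
sampler.measuring.clear()  # epoch end
sampler.stop()
print(f"Collected {sampler.samples} samples")
```

Between epochs the thread is parked inside `Event.wait`, so it consumes essentially no CPU time until it is woken.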

🛠️ Additional Updates

  • An issue related to fetching NVML device names in carbontracker/components/gpu/nvidia.py for Python versions below 3.10 has been resolved. Issue: #53
  • We have extended our support for live carbon intensity measurements through integration with the ElectricityMaps API, enabling access to over 160 regions. Issue: #54

To leverage this feature, refer to the example below:

from time import sleep

from carbontracker.tracker import CarbonTracker

max_epochs = 10
api_keys = {
    "electricitymaps": "YOUR_API_KEY_HERE"
}

tracker = CarbonTracker(epochs=max_epochs, log_dir="./logs", api_keys=api_keys)

# Training loop.
for epoch in range(max_epochs):
    tracker.epoch_start()

    # Your work here; sleep is only a placeholder.
    sleep(1)

    tracker.epoch_end()

# Ensure that actual consumption is reported even if training ends early.
tracker.stop()

We are committed to providing valuable updates to enhance your experience with CarbonTracker.

CarbonTracker v1.1.7: Country-specific data, updated measurement factors, logging labels and error handling

06 Dec 08:03

Highlights

This release addresses issues preventing users from using CarbonTracker due to the new kernel update on Linux, which necessitates root privileges for energy measurements through Intel's RAPL monitoring. CarbonTracker now throws more descriptive error messages in the case above and when GPUs do not support the retrieval of power usages in NVML.

The constant factors for estimating ML tasks' power consumption and carbon footprint are now updated, reflecting the latest numbers reported by the European Economic Area (EEA). Also, the carbon intensity values are now country-specific.

Additionally, we have fixed a bug that produced incorrect logging data when multiple instances of CarbonTracker were used in the same script, and added an option for prefixing (labelling) the log files of individual CarbonTracker instances.

Summary:

  • Catch Intel RAPL permission error (Issue: #40)
  • Throw descriptive error message when GPU does not support retrieval of power usages in NVML (Issue: #36)
  • Fix the issue with log files being overwritten due to short measurement periods when multiple instances of carbontracker are instantiated.
  • Add prefix labelling for individual logging instances (Issue: #26)
  • Fix energydataservice API (Issue: #46)
  • Update PUE
  • Updated default/fallback value for when live carbon intensity cannot be fetched
  • Update carbon intensity to be country-specific (PR: #49)
  • Update factor for equivalent km travelled by car
  • Deprecate support for Python 3.6 (Issue: #48)

Monitoring power usage

Intel RAPL

A recent security update to the Linux kernel requires root privileges to read CPU power consumption through Intel's RAPL interface. Unfortunately, this caused jobs to abort when using CarbonTracker. In such cases we now skip CPU power monitoring to prevent crashes and show a message that explains the issue and where to look for a fix.
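
The general pattern can be sketched as follows. This is not Carbontracker's implementation: the function name `read_rapl_energy_uj` and the messages are hypothetical, and the sysfs path is the typical location of the first RAPL package counter on Linux, which may differ on a given machine.

```python
from pathlib import Path

# Assumption: typical sysfs location of the first RAPL package counter.
RAPL_ENERGY_FILE = Path("/sys/class/powercap/intel-rapl/intel-rapl:0/energy_uj")


def read_rapl_energy_uj(path=RAPL_ENERGY_FILE):
    """Return the energy counter in microjoules, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except PermissionError:
        # Newer kernels restrict this file to root.
        print(f"No permission to read {path}; skipping CPU power monitoring.")
    except (OSError, ValueError):
        # The file is missing (no RAPL support) or has unexpected content.
        print(f"{path} is not available; skipping CPU power monitoring.")
    return None


energy = read_rapl_energy_uj()
if energy is None:
    print("Continuing without CPU measurements.")
```

Returning `None` instead of raising lets the remaining components continue to be monitored when the CPU counter is inaccessible.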

GPUs not supporting retrieval of power usages in NVML

Not all GPUs support retrieval of power usage through the NVML API, which is used for monitoring the power usage of GPUs. The user was previously left uninformed about this issue, and monitoring the remaining hardware components would continue. A message is now shown informing the user of the issue with a link for where to find additional information.

Logging

Multiple instances

There was an issue where log files would be overwritten when multiple instances of carbontracker were instantiated within a short measurement period, because log files were named by timestamp only. Log file names are now prefixed with the process ID of the logged task: processID_timestamp_carbontracker.log for the standard log and processID_timestamp_carbontracker_output.log for the output log.

Label log runs

It is now possible to label monitoring instances - logging files - by providing a prefix when instantiating CarbonTracker:

from carbontracker.tracker import CarbonTracker

max_epochs = 10

tracker = CarbonTracker(epochs=max_epochs, log_dir="logs", log_file_prefix="prefix")

# Training loop.
for epoch in range(max_epochs):
    tracker.epoch_start()
    
    # Your model training.

    tracker.epoch_end()

# Optional: Add a stop in case of early termination before all monitor_epochs have
# been monitored to ensure that actual consumption is reported.
tracker.stop()

The resulting log files will have the format prefix_processID_timestamp_carbontracker.log.
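
The naming scheme can be sketched as a small helper. The function `log_file_names` is hypothetical, the timestamp layout is an assumption chosen for illustration, and the prefixed output log name is assumed to follow the same pattern as the standard log.

```python
import os
import time


def log_file_names(prefix="", pid=None, timestamp=None):
    """Build the standard and output log names following the scheme above."""
    pid = os.getpid() if pid is None else pid
    # Assumption: this timestamp layout is illustrative only.
    timestamp = timestamp or time.strftime("%Y-%m-%dT%H%MZ", time.gmtime())
    base = f"{pid}_{timestamp}_carbontracker"
    if prefix:
        base = f"{prefix}_{base}"
    return f"{base}.log", f"{base}_output.log"


standard_log, output_log = log_file_names(
    "prefix", pid=1234, timestamp="2022-12-06T0803Z"
)
print(standard_log)
print(output_log)
```

Because the process ID is part of the name, two trackers started within the same timestamp resolution in different processes no longer write to the same file.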

Measurements

Carbon intensity

We updated the default/fallback value used when live carbon intensity cannot be fetched. We now use the latest data for the average carbon intensity of the detected country. If that fails, we default to the 2019 worldwide average carbon intensity of 475 gCO2eq/kWh. The data is produced by a script that generates a small .csv file from our data source.
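
The fallback order can be sketched as below. The function `resolve_intensity` and the per-country table are hypothetical, and the country values are placeholders rather than Carbontracker's data; only the worldwide value of 475 gCO2eq/kWh comes from the text above.

```python
WORLD_AVERAGE_2019 = 475.0  # gCO2eq/kWh, the worldwide fallback named above

# Hypothetical per-country averages; placeholder numbers, not real data.
COUNTRY_AVERAGES = {"AA": 100.0, "BB": 600.0}


def resolve_intensity(live_value=None, country=None):
    """Pick the best available carbon intensity and report where it came from."""
    if live_value is not None:
        return live_value, "live"
    if country in COUNTRY_AVERAGES:
        return COUNTRY_AVERAGES[country], "country average"
    return WORLD_AVERAGE_2019, "world average"


print(resolve_intensity(live_value=52.0, country="AA"))
print(resolve_intensity(country="BB"))
print(resolve_intensity())
```

Returning the source alongside the value makes it straightforward to tell the user which kind of estimate was used.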

PUE

The values for estimating power consumption and carbon footprint are now updated. The PUE is now 1.55.

Conversion

The CO2 performance of new passenger cars used is now 107.5 gCO2/km. This value is used to estimate the equivalent distance travelled by car for a given CO2eq emission.
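
As a worked example of how these constants combine, the sketch below assumes that the footprint is the measured energy multiplied by the PUE and the carbon intensity, and that the car factor is expressed in gCO2 per km. The `footprint` function is hypothetical and not taken from Carbontracker's source.

```python
import math

PUE = 1.55               # from the PUE section above
WORLD_INTENSITY = 475.0  # gCO2eq/kWh, worldwide fallback from above
CAR_G_PER_KM = 107.5     # assumed unit: gCO2 per km


def footprint(energy_kwh, intensity=WORLD_INTENSITY):
    """Return (gCO2eq, equivalent km travelled by car) for measured energy in kWh."""
    total_energy_kwh = energy_kwh * PUE      # account for data centre overhead
    co2eq_g = total_energy_kwh * intensity   # convert energy to emissions
    return co2eq_g, co2eq_g / CAR_G_PER_KM   # convert emissions to distance


co2eq_g, km = footprint(1.0)
print(f"{co2eq_g:.2f} gCO2eq, {km:.2f} km")
```

Under these assumptions, 1 kWh of measured energy corresponds to 1.55 kWh after PUE, about 736 gCO2eq at the worldwide fallback intensity, or roughly 6.8 km travelled by car.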

API

The energydataservice API changed, and we have adjusted the API calls accordingly.

CarbonTracker v1.1.9-test: Country-specific data, updated measurement factors, logging labels and error handling

06 Dec 07:40

The notes for this test release are identical to the v1.1.7 notes above.