Merge amd-master into release/rocm-rel-6.1 20240405
Signed-off-by: Maisam Arif <[email protected]>
Change-Id: I3b881a909c1427fff56e8036f026c3120317168b
marifamd committed Apr 5, 2024
2 parents afcd367 + 1171c23 commit 6709757
Showing 24 changed files with 2,226 additions and 481 deletions.
731 changes: 722 additions & 9 deletions CHANGELOG.md

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion CMakeLists.txt
@@ -28,7 +28,7 @@ find_program(GIT NAMES git)

## Setup the package version based on git tags.
set(PKG_VERSION_GIT_TAG_PREFIX "amdsmi_pkg_ver")
get_package_version_number("24.4.0" ${PKG_VERSION_GIT_TAG_PREFIX} GIT)
get_package_version_number("24.5.1" ${PKG_VERSION_GIT_TAG_PREFIX} GIT)
message("Package version: ${PKG_VERSION_STR}")
set(${AMD_SMI_LIBS_TARGET}_VERSION_MAJOR "${CPACK_PACKAGE_VERSION_MAJOR}")
set(${AMD_SMI_LIBS_TARGET}_VERSION_MINOR "${CPACK_PACKAGE_VERSION_MINOR}")
78 changes: 73 additions & 5 deletions README.md
@@ -26,11 +26,18 @@ installed to query firmware information and hardware IPs.

### Installation

* Install amdgpu driver
* Install amd-smi-lib package through package manager
### Install amdgpu using ROCm
* Install amdgpu driver:
See the example below; your release and link may differ. Running `amdgpu-install --usecase=rocm` updates the amdgpu driver and installs the AMD SMI packages on your device.
```shell
sudo apt update
wget https://repo.radeon.com/amdgpu-install/6.0.2/ubuntu/jammy/amdgpu-install_6.0.60002-1_all.deb
sudo apt install ./amdgpu-install_6.0.60002-1_all.deb
sudo amdgpu-install --usecase=rocm
```
* amd-smi --help

### Install Example for Ubuntu 22.04
### Install Example for Ubuntu 22.04 (without ROCm)

``` bash
apt install amd-smi-lib
@@ -101,7 +108,7 @@ The only required AMD-SMI call for any program that wants to use AMD-SMI is the

When AMD-SMI is no longer being used, `amdsmi_shut_down()` should be called. This provides a way to do any releasing of resources that AMD-SMI may have held.

A simple "Hello World" type program that displays the temperature of detected devices would look like this:
1) A simple "Hello World" type program that displays the temperature of detected devices would look like this:

```c++
#include <iostream>
@@ -177,6 +184,67 @@ int main() {
}
```

2) A sample program that displays the power of detected CPUs would look like this:

```c++
#include <iostream>
#include <vector>
#include "amd_smi/amdsmi.h"

int main(int argc, char **argv) {
  amdsmi_status_t ret;
  uint32_t socket_count = 0;

  // Initialize amdsmi for AMD CPUs
  ret = amdsmi_init(AMDSMI_INIT_AMD_CPUS);

  ret = amdsmi_get_socket_handles(&socket_count, nullptr);

  // Allocate the memory for the sockets
  std::vector<amdsmi_socket_handle> sockets(socket_count);

  // Get the sockets of the system
  ret = amdsmi_get_socket_handles(&socket_count, &sockets[0]);

  std::cout << "Total Socket: " << socket_count << std::endl;

  // For each socket, get cpus
  for (uint32_t i = 0; i < socket_count; i++) {
    uint32_t cpu_count = 0;

    // Set processor type as AMD_CPU
    processor_type_t processor_type = AMD_CPU;
    ret = amdsmi_get_processor_handles_by_type(sockets[i], processor_type, nullptr, &cpu_count);

    // Allocate the memory for the cpus
    std::vector<amdsmi_processor_handle> plist(cpu_count);

    // Get the cpus for each socket
    ret = amdsmi_get_processor_handles_by_type(sockets[i], processor_type, &plist[0], &cpu_count);

    for (uint32_t index = 0; index < plist.size(); index++) {
      uint32_t socket_power;
      std::cout<<"CPU "<<index<<"\t"<< std::endl;
      std::cout<<"Power (Watts): ";

      ret = amdsmi_get_cpu_socket_power(plist[index], &socket_power);
      if(ret != AMDSMI_STATUS_SUCCESS)
        std::cout<<"Failed to get cpu socket power"<<"["<<index<<"] , Err["<<ret<<"] "<< std::endl;

      if (!ret) {
        std::cout<<static_cast<double>(socket_power)/1000<<std::endl;
      }
      std::cout<<std::endl;
    }
  }

  // Clean up resources allocated at amdsmi_init
  ret = amdsmi_shut_down();

  return 0;
}
```
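
The sample above calls each query twice: once with a null buffer to learn the count, then again with an allocated buffer. It also leaves most return codes unchecked and takes `&sockets[0]` even when the count could be zero. The following is a minimal sketch of that count-then-fill pattern with a status check after each call. `Status`, `Handle`, `fake_get_handles`, and `fetch_all` are hypothetical stand-ins so the snippet compiles without the AMD SMI headers; they are not part of the library. The conversion helper assumes the raw power value is in milliwatts, as the `/1000` in the sample suggests.

```c++
#include <cstdint>
#include <vector>

// Hypothetical stand-ins; the real library uses amdsmi_status_t and opaque handles.
enum class Status { kSuccess = 0, kError = 1 };
using Handle = int;

// Hypothetical query with the same count-then-fill shape as the
// amdsmi_get_socket_handles call in the sample: with a null buffer it only
// reports how many items exist; otherwise it fills up to *count entries.
inline Status fake_get_handles(uint32_t* count, Handle* out) {
  static const Handle kItems[] = {10, 20, 30};
  const uint32_t kTotal = 3;
  if (count == nullptr) return Status::kError;
  if (out == nullptr) {
    *count = kTotal;
    return Status::kSuccess;
  }
  const uint32_t n = (*count < kTotal) ? *count : kTotal;
  for (uint32_t i = 0; i < n; ++i) out[i] = kItems[i];
  *count = n;
  return Status::kSuccess;
}

// Two-call pattern with a status check after each call.
// Returns false if either call fails; on success `out` holds the items.
template <typename Query>
bool fetch_all(Query query, std::vector<Handle>& out) {
  uint32_t count = 0;
  if (query(&count, nullptr) != Status::kSuccess) return false;
  out.assign(count, Handle());
  if (count == 0) return true;  // nothing to fetch; never index an empty vector
  if (query(&count, out.data()) != Status::kSuccess) return false;
  out.resize(count);  // the second call may report fewer items
  return true;
}

// Assumes the raw reading is in milliwatts.
inline double milliwatts_to_watts(uint32_t mw) {
  return static_cast<double>(mw) / 1000.0;
}
```

With the real library, the same shape would wrap the calls shown in the sample, stopping early and still calling `amdsmi_shut_down()` when a query does not return `AMDSMI_STATUS_SUCCESS`.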
### Documentation
The reference manual, `AMD_SMI_Manual.pdf`, will be in the `/opt/rocm/share/doc/amd_smi` directory upon a successful build.
@@ -277,4 +345,4 @@ Path to the program `amdsmitst`: build/tests/amd_smi_test/

The information contained herein is for informational purposes only, and is subject to change without notice. In addition, any stated support is planned and is also subject to change. While every precaution has been taken in the preparation of this document, it may contain technical inaccuracies, omissions and typographical errors, and AMD is under no obligation to update or otherwise correct this information. Advanced Micro Devices, Inc. makes no representations or warranties with respect to the accuracy or completeness of the contents of this document, and assumes no liability of any kind, including the implied warranties of noninfringement, merchantability or fitness for particular purposes, with respect to the operation or use of AMD hardware, software or other products described herein.

© 2023 Advanced Micro Devices, Inc. All Rights Reserved.
© 2023-2024 Advanced Micro Devices, Inc. All Rights Reserved.
53 changes: 47 additions & 6 deletions amdsmi_cli/README.md
@@ -15,7 +15,7 @@ Recommended: At least one AMD GPU with AMD driver installed

### Installation

* Install amdgpu driver
* [Install amdgpu driver](../README.md#install-amdgpu-using-rocm)
* Optionally install amd_hsmp driver for ESMI CPU functions
* Install amd-smi-lib package through package manager
* amd-smi --help
@@ -79,7 +79,7 @@ amd-smi will report the version and current platform detected when running the c
~$ amd-smi
usage: amd-smi [-h] ...

AMD System Management Interface | Version: 24.4.0.0 | ROCm version: 6.1.0 | Platform: Linux Baremetal
AMD System Management Interface | Version: 24.5.1.0 | ROCm version: 6.1.1 | Platform: Linux Baremetal

options:
-h, --help show this help message and exit
@@ -513,6 +513,7 @@ Set Arguments:
NPS1, NPS2, NPS4, NPS8
-o, --power-cap WATTS Set power capacity limit
-p, --dpm-policy POLICY_ID Set the GPU DPM policy using policy id
-x, --xgmi-plpd POLICY_ID Set the GPU XGMI per-link power down policy using policy id

CPU Arguments:
--cpu-pwr-limit PWR_LIMIT Set power limit for the given socket. Input parameter is power limit value.
@@ -675,7 +676,7 @@ GPU: 0
PARTITION:
COMPUTE_PARTITION: SPX
MEMORY_PARTITION: NPS1
POLICY:
DPM_POLICY:
NUM_SUPPORTED: 4
CURRENT_ID: 1
POLICIES:
@@ -687,6 +688,16 @@ GPU: 0
POLICY_DESCRIPTION: soc_pstate_1
POLICY_ID: 3
POLICY_DESCRIPTION: soc_pstate_2
XGMI_PLPD:
NUM_SUPPORTED: 3
CURRENT_ID: 1
PLPDS:
POLICY_ID: 0
POLICY_DESCRIPTION: plpd_disallow
POLICY_ID: 1
POLICY_DESCRIPTION: plpd_default
POLICY_ID: 2
POLICY_DESCRIPTION: plpd_optimized
NUMA:
NODE: 0
AFFINITY: 0
@@ -783,7 +794,7 @@ GPU: 1
PARTITION:
COMPUTE_PARTITION: SPX
MEMORY_PARTITION: NPS1
POLICY:
DPM_POLICY:
NUM_SUPPORTED: 4
CURRENT_ID: 1
POLICIES:
@@ -795,6 +806,16 @@ GPU: 1
POLICY_DESCRIPTION: soc_pstate_1
POLICY_ID: 3
POLICY_DESCRIPTION: soc_pstate_2
XGMI_PLPD:
NUM_SUPPORTED: 3
CURRENT_ID: 1
PLPDS:
POLICY_ID: 0
POLICY_DESCRIPTION: plpd_disallow
POLICY_ID: 1
POLICY_DESCRIPTION: plpd_default
POLICY_ID: 2
POLICY_DESCRIPTION: plpd_optimized
NUMA:
NODE: 1
AFFINITY: 1
@@ -891,7 +912,7 @@ GPU: 2
PARTITION:
COMPUTE_PARTITION: SPX
MEMORY_PARTITION: NPS1
POLICY:
DPM_POLICY:
NUM_SUPPORTED: 4
CURRENT_ID: 1
POLICIES:
@@ -903,6 +924,16 @@ GPU: 2
POLICY_DESCRIPTION: soc_pstate_1
POLICY_ID: 3
POLICY_DESCRIPTION: soc_pstate_2
XGMI_PLPD:
NUM_SUPPORTED: 3
CURRENT_ID: 1
PLPDS:
POLICY_ID: 0
POLICY_DESCRIPTION: plpd_disallow
POLICY_ID: 1
POLICY_DESCRIPTION: plpd_default
POLICY_ID: 2
POLICY_DESCRIPTION: plpd_optimized
NUMA:
NODE: 2
AFFINITY: 2
@@ -999,7 +1030,7 @@ GPU: 3
PARTITION:
COMPUTE_PARTITION: SPX
MEMORY_PARTITION: NPS1
POLICY:
DPM_POLICY:
NUM_SUPPORTED: 4
CURRENT_ID: 1
POLICIES:
@@ -1011,6 +1042,16 @@ GPU: 3
POLICY_DESCRIPTION: soc_pstate_1
POLICY_ID: 3
POLICY_DESCRIPTION: soc_pstate_2
XGMI_PLPD:
NUM_SUPPORTED: 3
CURRENT_ID: 1
PLPDS:
POLICY_ID: 0
POLICY_DESCRIPTION: plpd_disallow
POLICY_ID: 1
POLICY_DESCRIPTION: plpd_default
POLICY_ID: 2
POLICY_DESCRIPTION: plpd_optimized
NUMA:
NODE: 3
AFFINITY: 3