Date: 2022-02-24
| software | description |
| --- | --- |
| OS | Ubuntu-20.04.3 |
| Python | 3.8.10 |
| Tensorflow-rocm | 2.4.3 |

| hardware | Product Name | ISA | CHIP IP |
| --- | --- | --- | --- |
| CPU | Xeon 2620v3 | | |
| GPU | RX580 8G | gfx803(Polaris10) | 0x67df |
On ROCm-4.5, only rocBLAS needs to be patched for gfx803 to run properly.
- gfx803 was removed from the official ROCm-4.0 support list on 2020-12-19
If you don't want to compile ROCm components from source, you can downgrade to ROCm-3.5.1; here is the documentation
- On ROCm-3.7+ with gfx803, run the TensorFlow text classification sample. The official TensorFlow sample reproduces this issue almost 90% of the time.
- Many people have hit this error; please refer to ROCm/ROCm#1265
- Not known yet
Delete library/src/blas3/Tensile/Logic/asm_full/r9nano_*.yaml
from rocBLAS and rebuild rocBLAS; this resolves the issue. If even one solution from these files is kept, the issue reproduces.
git clone
cd rocBLAS
git checkout release/rocm-rel-5.0
bash -d
rm -rf library/src/blas3/Tensile/Logic/asm_full/r9nano*
mkdir build
cd build
CXX=/opt/rocm/bin/hipcc cmake -lpthread \
-DROCM_PATH=/opt/rocm \
-DTensile_LOGIC=asm_full \
-DTensile_ARCHITECTURE=all \
-DTensile_LIBRARY_FORMAT=yaml \
-DTensile_COMPILER=hipcc \
-DHIP_CLANG_INCLUDE_PATH=/opt/rocm/llvm/include \
-DCMAKE_INSTALL_PREFIX=rocblas-install \
-G "Unix Makefiles" \
..
make -j
make package
sudo dpkg -i *.deb
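Before running the `rm -rf` step above, it can help to preview which Tensile logic files it would delete. The following is a minimal sketch, not part of the original procedure: `list_r9nano_logic` is a hypothetical helper name, and it assumes the path to the rocBLAS checkout is passed as the first argument.

```shell
# Hypothetical helper: print the Tensile logic files that match r9nano*
# under a rocBLAS checkout (default: the current directory).
list_r9nano_logic() {
    logic_dir="${1:-.}/library/src/blas3/Tensile/Logic/asm_full"
    for f in "$logic_dir"/r9nano*; do
        # When nothing matches, the unexpanded pattern is skipped here.
        [ -e "$f" ] && printf '%s\n' "$f"
    done
    return 0
}

# Usage (from inside the rocBLAS checkout):
#   list_r9nano_logic .
```

An empty result after the `rm` step suggests that no r9nano logic files remain to be picked up by the rebuild.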
A beta version of PyTorch-1.9.0 is available on the official PyTorch website.
It crashes as soon as PyTorch starts running, with the following error:
"hipErrorNoBinaryForGpu: Unable to find code object for all current devices!"
The official PyTorch-1.9.0 build does not provide a fatbin for gfx803.
(Waiting for Pytorch-1.11.0 to support ROCm-5.0.)
Rebuild Pytorch with PYTORCH_ROCM_ARCH=gfx803.
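To confirm which ISA name to pass in `PYTORCH_ROCM_ARCH` on a given machine, the output of `rocminfo` can be filtered for `gfx` identifiers. This is a rough sketch: `gfx_targets` is a hypothetical helper, and it assumes `rocminfo` is installed under `/opt/rocm/bin` as in the rest of this guide.

```shell
# Hypothetical helper: extract the unique gfx ISA names from
# rocminfo-style text supplied on stdin.
gfx_targets() {
    grep -o 'gfx[0-9a-f][0-9a-f]*' | sort -u
}

# Usage:
#   /opt/rocm/bin/rocminfo | gfx_targets
```

On the RX580 listed in the hardware table, the expected ISA is gfx803.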
PyTorch-1.9.0 needs a patch to account for the HIP version update in ROCm-4.3.
sudo ln -f -s /usr/bin/python3 /usr/bin/python
git clone
cd pytorch
git checkout v1.9.0
git submodule update --init --recursive
git apply /home/work/rocm-build/patch/pytorch-rocm43-1.patch
sudo apt install -y libopencv-highgui4.2 libopenblas-dev python3-dev python3-pip
pip3 install -r requirements.txt
export PATH=/opt/rocm/bin:$PATH
export ROCM_PATH=/opt/rocm
export PYTORCH_ROCM_ARCH=gfx803
python3 tools/amd_build/
USE_ROCM=1 USE_NINJA=1 python3 bdist_wheel
pip3 install dist/torch-1.9.0a0+gitd69c22d-cp38-cp38-linux_x86_64.whl
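The wheel filename above embeds a git hash, so it will likely differ on another checkout. As a small sketch, the most recently built wheel can be picked up from `dist/` instead of hard-coding the name. `latest_torch_wheel` is a hypothetical helper, and it assumes the wheel name starts with `torch-`.

```shell
# Hypothetical helper: print the newest torch wheel in a directory
# (default: dist), or nothing if none exists.
latest_torch_wheel() {
    ls -t "${1:-dist}"/torch-*.whl 2>/dev/null | head -n 1
}

# Usage (from the pytorch checkout):
#   pip3 install "$(latest_torch_wheel dist)"
```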
PS: PyTorch-1.9.0 for ROCm-4.2 reports that a shared library cannot be found:
Traceback (most recent call last):
File "", line 4, in <module>
import torch
File "/home/work/.local/lib/python3.8/site-packages/torch/", line 197, in <module>
from torch._C import * # noqa: F403
ImportError: cannot open shared object file: No such file or directory
Create a symbolic link for the missing library; the installed version of tinfo is 6, not 5.
sudo ln -s /usr/lib/x86_64-linux-gnu/ /usr/lib/x86_64-linux-gnu/
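To see which tinfo versions the dynamic linker knows about before or after creating the link, the `ldconfig -p` cache listing can be filtered. This sketch assumes the library in question is named `libtinfo.so.N`, which is inferred from the tinfo version note above rather than stated in the original. `tinfo_versions` is a hypothetical helper.

```shell
# Hypothetical helper: list the unique libtinfo sonames found in
# ldconfig-style text supplied on stdin.
# Assumption: the missing library follows the libtinfo.so.N naming.
tinfo_versions() {
    grep -o 'libtinfo\.so\.[0-9][0-9]*' | sort -u
}

# Usage:
#   ldconfig -p | tinfo_versions
```

Note that a link created by hand may not show up in the cache until `sudo ldconfig` has been rerun.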