The Ryzen AI Software includes support for deploying quantized LLMs on the NPU using an eager execution mode, which simplifies the model ingestion process. Instead of being compiled and executed as a complete graph, the model is processed on an operator-by-operator basis. Compute-intensive operations, such as GEMM/MATMUL, are dynamically offloaded to the NPU, while the remaining operators execute on the CPU.
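To make the dispatch model concrete, below is a minimal, self-contained PyTorch sketch of operator-level offload. The `npu_matmul` helper is hypothetical (a CPU stand-in so the sketch runs anywhere); in the actual flow, the Ryzen AI runtime intercepts the quantized GEMM/MATMUL operators and dispatches them to the NPU.

```python
import torch
import torch.nn as nn

def npu_matmul(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in for an NPU-offloaded GEMM.
    In the real flow the runtime hands this operator to the NPU;
    here it simply runs on the CPU so the sketch is executable."""
    return a @ b

class EagerOffloadLinear(nn.Module):
    """Linear layer whose matmul is routed through the (stub) NPU path,
    while the bias addition and everything else stays on the CPU."""
    def __init__(self, linear: nn.Linear):
        super().__init__()
        self.weight = linear.weight
        self.bias = linear.bias

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = npu_matmul(x, self.weight.t())  # compute-intensive op: offloaded
        if self.bias is not None:
            y = y + self.bias               # lightweight op: stays on CPU
        return y

def offload_linears(module: nn.Module) -> None:
    """Replace every nn.Linear in a model with the offloading wrapper."""
    for name, child in module.named_children():
        if isinstance(child, nn.Linear):
            setattr(module, name, EagerOffloadLinear(child))
        else:
            offload_linears(child)

model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 8))
offload_linears(model)
print(model(torch.randn(2, 64)).shape)  # torch.Size([2, 8])
```

The point of the eager mode is exactly this granularity: each operator is dispatched as it executes, so no ahead-of-time graph compilation is required.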
A set of performance-optimized models is available upon request on the AMD secure download site: Optimized LLMs on RyzenAI. A minimal generation example for this flow is shown after the list below.
- Applicability: benchmarking and deployment of specific LLMs
- Performance: highly optimized
- Supported platforms: STX (and onwards)
- Supported frameworks: ONNX Runtime GenAI
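The optimized models are driven through the ONNX Runtime GenAI API. The sketch below shows the typical token-by-token generation loop; the model folder path is a placeholder, and exact call names (for example, how prompt tokens are fed to the generator) vary between onnxruntime-genai releases, so treat this as an outline rather than a drop-in script.

```python
import onnxruntime_genai as og

# Path placeholder: point this at a downloaded, NPU-optimized model folder.
model = og.Model("path/to/optimized-llm")
tokenizer = og.Tokenizer(model)

params = og.GeneratorParams(model)
params.set_search_options(max_length=256)

generator = og.Generator(model, params)
generator.append_tokens(tokenizer.encode("What is an NPU?"))

# Token-by-token decode loop.
while not generator.is_done():
    generator.generate_next_token()

print(tokenizer.decode(generator.get_sequence(0)))
```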
A general-purpose flow can be found here: LLMs on RyzenAI with PyTorch. A sketch of this style of flow follows the list below.
- Applicability: prototyping and early development with a broad set of LLMs
- Performance: functional support only, not to be used for benchmarking
- Supported platforms: PHX, HPT, STX (and onwards)
- Supported frameworks: PyTorch
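The PyTorch flow follows the familiar Hugging Face pattern: load a causal LM, quantize it, and generate. The sketch below uses stock `torch.ao.quantization.quantize_dynamic` as a stand-in for the Ryzen AI quantization and offload step, which the actual flow performs with its own tooling; the model name and generation settings are purely illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative model choice; the actual flow supports a broad set of LLMs.
model_id = "facebook/opt-125m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

# Stand-in for the Ryzen AI quantization step: int8-quantize the Linear
# (GEMM) layers, i.e. the operators the real flow offloads to the NPU.
model = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

inputs = tokenizer("The NPU accelerates", return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Note that only the `nn.Linear` (GEMM) layers are quantized here, mirroring the split described above: the compute-intensive matmuls are the offload candidates, while everything else stays on the CPU.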