Adds gpu support for stateless llama #216

IanNod · 2023-12-01T21:24:42Z

- Adds support for compiling the vmfb for different device backends: cpu, vulkan, rocm, and cuda.
- Adds flags for device, iree_target_triple (backend- and device-specific target info), and the Vulkan max allocation size.
- Minor fix to gen_external_params for the case where quantization is not used.
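
As an illustration of the device flag described above, here is a minimal sketch of how a device choice might be translated into compiler flags. It is not the code from this PR: the helper name `build_compile_flags`, its parameters, and the default allocation size are hypothetical. The flag strings are assumptions modeled on IREE compiler options from around the time of this PR and should be checked against the IREE version in use.

```python
def build_compile_flags(device, target_triple="", vulkan_max_allocation="4294967296"):
    """Return a list of compiler flags for `device` (illustrative sketch only).

    All flag strings below are assumptions modeled on IREE compiler options;
    verify them against your IREE release before relying on them.
    """
    if device in ("cpu", "llvm-cpu"):
        flags = ["--iree-hal-target-backends=llvm-cpu"]
        if target_triple:
            flags.append("--iree-llvmcpu-target-triple=" + target_triple)
    elif device == "vulkan":
        flags = [
            "--iree-hal-target-backends=vulkan-spirv",
            "--iree-vulkan-target-triple=" + target_triple,
            # Hypothetical default; caps the size of a single allocation.
            "--iree-stream-resource-max-allocation-size=" + str(vulkan_max_allocation),
        ]
    elif device == "rocm":
        flags = [
            "--iree-hal-target-backends=rocm",
            "--iree-rocm-target-chip=" + target_triple,
        ]
    elif device == "cuda":
        flags = [
            "--iree-hal-target-backends=cuda",
            "--iree-hal-cuda-llvm-target-arch=" + target_triple,
        ]
    else:
        raise ValueError("unsupported device: " + repr(device))
    return flags


if __name__ == "__main__":
    # Example: print the flags that would be used for a ROCm target.
    print(build_compile_flags("rocm", target_triple="gfx90a"))
```

Keeping the mapping in one function makes it straightforward to cover every backend with a small unit test, and an unknown device fails early with a clear error instead of producing an unusable vmfb.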

raikonenfnu · 2023-12-01T21:47:24Z

python/turbine_models/custom_models/stateless_llama.py

@@ -48,6 +48,16 @@
    "--precision", type=str, default="fp16", help="dtype of model [f16, f32]"
 )

+parser.add_argument("--device", type=str, default="cpu", help="cpu, cuda, vulkan")


Nit: add rocm to the help text here.
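
One possible way to address this nit, sketched under assumptions: derive the help text from a single tuple of supported devices so the list cannot drift out of date. The `SUPPORTED_DEVICES` name and the use of `choices` are illustrative and not part of this PR. Note that `choices` also rejects any value outside the tuple, which may be stricter than the script intends.

```python
import argparse

# Hypothetical single source of truth for the accepted device names.
SUPPORTED_DEVICES = ("cpu", "cuda", "rocm", "vulkan")

parser = argparse.ArgumentParser()
parser.add_argument(
    "--device",
    type=str,
    default="cpu",
    choices=SUPPORTED_DEVICES,  # argparse rejects values outside this tuple
    help="target device: " + ", ".join(SUPPORTED_DEVICES),
)

# Parse an explicit argument list so the example runs without a real CLI call.
args = parser.parse_args(["--device", "rocm"])
print(args.device)
```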

raikonenfnu · 2023-12-01T21:48:26Z

I know this is a custom model, but would be nice to have a small test/runner for this, since this code can get a bit more complex. :)

IanNod · 2023-12-01T22:02:14Z

> I know this is a custom model, but would be nice to have a small test/runner for this, since this code can get a bit more complex. :)

We have a very basic one that was added in cd063df, but agreed, we do need something more extensive.

IanNod requested review from stellaraccident, qedawkins, dan-garvey and raikonenfnu December 1, 2023 21:24

qedawkins reviewed Dec 1, 2023

raikonenfnu reviewed Dec 1, 2023

IanNod force-pushed the aot_gpu branch from 21430d6 to 43364af Compare December 1, 2023 21:54

qedawkins approved these changes Dec 1, 2023

IanNod force-pushed the aot_gpu branch from 43364af to a5f2920 Compare December 1, 2023 22:05

IanNod merged commit 8399fdc into main Dec 1, 2023

IanNod deleted the aot_gpu branch December 1, 2023 22:33
