Suggestion Description
The AMD L3 cache (SRAM, aka Infinity Cache) has a very attractive capacity (256 MB on the MI300X). I would like to know whether we can gain performance by placing model data in the L3 cache when running applications on AMD GPUs. IIUC, ROCm is the right layer at which to build APIs for programming the L3 cache. So, here are my questions. First, is that right? Second, if it is, could you share some code pointers on how I can experiment with the idea myself? Many thanks.
Operating System
No response
GPU
No response
ROCm Component
No response
Hi @JianbaoTao, thanks for your interest! That's an interesting idea, but as far as I know this cache is not visible to ROCm applications, so this would have to be done at the driver or firmware level.
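For a rough sense of scale, the 256 MB figure quoted in the question can be turned into a parameter budget. This is only a minimal sketch: it assumes "256MB" means 256 MiB, and the helper name `params_that_fit` and the dtype table are hypothetical, introduced for illustration. It says nothing about whether data can actually be placed or kept in the cache, which, per the reply above, is not something ROCm applications control.

```python
# Back-of-the-envelope: how many model parameters fit in a cache of a given size.
# Assumption: "256MB" is read as 256 MiB; the dtype sizes below are illustrative.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1}


def params_that_fit(cache_mib: int, dtype: str) -> int:
    """Return how many parameters of `dtype` fit in `cache_mib` MiB."""
    cache_bytes = cache_mib * 1024 * 1024
    return cache_bytes // BYTES_PER_PARAM[dtype]


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {params_that_fit(256, dtype):,} parameters")
```

Under these assumptions, about 134 million fp16 parameters would fit, so at most a small model or a slice of a larger one could be resident at once, before counting activations or any other data competing for the same cache.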