[Feature]: Build APIs to make the L3 cache programmable for users (ie, application developers) #295

JianbaoTao · 2025-02-20T04:03:55Z

Suggestion Description

The AMD L3 cache (SRAM; aka Infinity Cache) has a very attractive capacity (256 MB on MI300X). I am very interested to know whether we can achieve a performance gain by placing model data in the L3 cache when running applications on AMD GPUs. IIUC, ROCm is the right layer at which to build APIs for programming the L3 cache. So, here are my questions. First, is that right? Second, if so, can you share some code pointers so that I can experiment with the idea myself? Many thanks.
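For a rough sense of scale, here is a back-of-the-envelope sketch of how much model data could fit in 256 MB. It is arithmetic only, not a measurement and not a ROCm API: the helper names `params_that_fit` and `fits_in_cache` are hypothetical, 256 MB is treated as 256 MiB, and the calculation assumes the whole cache is available to model data.

```python
# Back-of-the-envelope sizing: how many model parameters fit in a cache
# of a given size. Pure arithmetic; nothing here talks to a GPU.

# 256 MB figure taken from the issue text, treated as 256 MiB (assumption).
CACHE_BYTES = 256 * 1024 * 1024

# Storage width of one parameter for a few common data types.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def params_that_fit(cache_bytes: int, dtype: str) -> int:
    """Return the largest parameter count whose storage fits in cache_bytes."""
    return cache_bytes // BYTES_PER_PARAM[dtype]


def fits_in_cache(num_params: int, dtype: str, cache_bytes: int = CACHE_BYTES) -> bool:
    """Return True if num_params parameters of the given dtype fit in the cache."""
    return num_params * BYTES_PER_PARAM[dtype] <= cache_bytes


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {params_that_fit(CACHE_BYTES, dtype):,} parameters")
```

Under these assumptions, only fairly small models, or selected hot tensors of a larger model, would fit entirely. Any other data competing for the cache would reduce the usable capacity further.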

Operating System

No response

GPU

No response

ROCm Component

No response

schung-amd · 2025-02-28T15:00:47Z

Hi @JianbaoTao, thanks for your interest! That's an interesting idea, but as far as I know, this cache is not visible to ROCm applications, so programming it would have to be done at the driver or firmware level.

ppanchad-amd added the Under Investigation label Feb 21, 2025
