I noticed that the latest code includes Mma1_is_RS. Does this imply that the results of MMA0 might be stored in shared memory (smem) and then loaded back into registers for mma1? I remember that in previous versions, the results were directly reused in registers.
If `Mma1_is_RS == false`, the output of MMA0 is stored to smem and MMA1 reads it directly from there, without loading it back into registers. It's just an option to reduce register usage.
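To make the two paths concrete, here is a toy host-side C++17 sketch, not the actual kernel code. Only the flag name `Mma1_is_RS` comes from this thread. `Tile`, `FakeSmem`, `matmul`, `two_stage` and `demo` are hypothetical names, and a plain struct stands in for shared memory. It only shows that the flag changes where the second matmul reads its first operand from, not the result.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Toy 2x2 tiles; a real kernel works on much larger tiles per warpgroup.
constexpr std::size_t N = 2;
using Tile = std::array<std::array<float, N>, N>;

// Hypothetical stand-in for shared memory (a real kernel would use __shared__).
struct FakeSmem {
    Tile p{};
    bool written = false;
};

// Plain C = A * B on toy tiles.
inline Tile matmul(const Tile& a, const Tile& b) {
    Tile c{};
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j)
            for (std::size_t k = 0; k < N; ++k)
                c[i][j] += a[i][k] * b[k][j];
    return c;
}

// Two chained matmuls. The template flag picks the operand source for the second one.
template <bool Mma1_is_RS>
Tile two_stage(const Tile& a, const Tile& b, const Tile& v, FakeSmem& smem) {
    Tile p = matmul(a, b);  // "MMA0" output, held in locals (standing in for registers)
    if constexpr (Mma1_is_RS) {
        // RS path: "MMA1" takes its first operand straight from registers.
        return matmul(p, v);
    } else {
        // SS path: stage the MMA0 output in smem and let "MMA1" read it from there,
        // so the registers holding p do not need to stay live.
        smem.p = p;
        smem.written = true;
        return matmul(smem.p, v);
    }
}

// Runs both paths on the same inputs and checks that only the staging differs.
inline Tile demo() {
    Tile a{}, b{}, v{};
    a[0] = {1.f, 2.f}; a[1] = {3.f, 4.f};
    b[0] = {1.f, 0.f}; b[1] = {0.f, 1.f};  // identity
    v[0] = {2.f, 0.f}; v[1] = {0.f, 2.f};  // 2 * identity
    FakeSmem smem_rs, smem_ss;
    Tile rs = two_stage<true>(a, b, v, smem_rs);
    Tile ss = two_stage<false>(a, b, v, smem_ss);
    assert(rs == ss);                               // same numerical result
    assert(!smem_rs.written && smem_ss.written);    // only the SS path touches smem
    return rs;
}
```

Calling `demo()` exercises both instantiations. In the toy version the trade-off is only visible as the extra write to `FakeSmem`; on the GPU the point of the non-RS path is the lower register pressure mentioned above.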