Earlier I was able to run inference with the previous code at batch sizes up to 64, but now it fails even at batch size 16. What do I need to change to run it as before?
The earlier version of the text-generation code (run_generation.py) did not have flash attention, and it had an argument called attn_softmax_bf16. I don't know exactly why I can no longer run batch sizes beyond 16.
I only ran the commands that were in the README.
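For reference, the invocation was along these lines. This is a reconstruction from memory of the README, not the exact command; the model name and all flags other than --batch_size and --attn_softmax_bf16 are assumptions:

```bash
# Hypothetical reconstruction of the README command for run_generation.py.
# The model name and the flags other than --batch_size / --attn_softmax_bf16
# are assumptions and may differ from the README actually used.
python run_generation.py \
    --model_name_or_path meta-llama/Llama-2-7b-hf \
    --use_hpu_graphs \
    --use_kv_cache \
    --bf16 \
    --attn_softmax_bf16 \
    --max_new_tokens 100 \
    --batch_size 64
```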
The attached file error.txt contains the error I got while running.