Qwen and InternLM are Chinese LLMs.
Their architectures are basically the same as Llama3, with just a few extra biases in the attention layer. If I want to support them in Keras, should I create a new class such as `QwenCausalModel`, or simply add a few more configuration options to the existing Llama3?
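For context, the second option could amount to a single constructor flag on the attention projections. Here is a minimal sketch in plain Keras; the layer and argument names (`QkvProjections`, `use_qkv_bias`) are hypothetical and not the actual keras_nlp internals:

```python
import keras

class QkvProjections(keras.layers.Layer):
    # Hypothetical sketch: the q/k/v bias flag is the only structural
    # difference here, so it could be a backbone config option rather
    # than a whole new model class. Names are illustrative, not the
    # real keras_nlp internals.
    def __init__(self, hidden_dim, use_qkv_bias=False, **kwargs):
        super().__init__(**kwargs)
        # Llama3-style checkpoints: use_qkv_bias=False.
        # Qwen-style checkpoints: use_qkv_bias=True.
        self.query = keras.layers.Dense(hidden_dim, use_bias=use_qkv_bias)
        self.key = keras.layers.Dense(hidden_dim, use_bias=use_qkv_bias)
        self.value = keras.layers.Dense(hidden_dim, use_bias=use_qkv_bias)

    def call(self, x):
        return self.query(x), self.key(x), self.value(x)
```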
Additionally, is it possible for me not to provide a Kaggle link, but instead to convert the weights directly from Hugging Face (HF)?
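For reference, recent keras_nlp releases can already load some architectures straight from a Hugging Face handle via the `hf://` preset prefix, converting the safetensors checkpoint on the fly. A minimal sketch, using Llama3 since a Qwen class does not exist yet; the model handle is illustrative:

```python
import keras_nlp

# Sketch assuming `hf://` preset support in a recent keras_nlp version:
# the HF safetensors weights are converted on load, so no Kaggle upload
# is required.
model = keras_nlp.models.Llama3CausalLM.from_preset(
    "hf://meta-llama/Meta-Llama-3-8B"
)
```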