
Fix flan-t5 models load fail with llama-server #8993

Closed
wants to merge 1 commit

Conversation

@kylo5aby (Contributor) commented Aug 12, 2024

Fixes: #8990

@kylo5aby changed the title from "Fix T5 model load fail with llama-server" to "Fix flan-t5 models load fail with llama-server" on Aug 12, 2024
@fairydreaming (Collaborator)

I'm not sure there is a point in doing this, since T5 models require a llama_encode() call and preparation of a custom input for llama_decode() (with decoder start tokens). This is not implemented in the current llama-server, so even if the model loads it won't work correctly. But if you are willing to extend your PR by implementing comprehensive support for T5 models in llama-server, then you have my blessing.
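
For reference, a rough sketch of the encoder-decoder flow described above, written against the llama.cpp C API of that period (llama_encode, llama_model_decoder_start_token, then the usual llama_decode loop). The t5_prefill helper, its batch handling, and its error paths are illustrative only, not the server's actual code, and exact signatures may differ between llama.cpp versions:

```cpp
#include "llama.h"
#include <vector>

// Hypothetical helper: run the encoder pass plus the first decoder step for a
// T5-style model. `prompt_tokens` is assumed to already hold the tokenized
// encoder input for sequence 0.
static bool t5_prefill(llama_context * ctx, const llama_model * model,
                       std::vector<llama_token> & prompt_tokens) {
    // 1) Feed the full prompt to the encoder with llama_encode().
    llama_batch enc_batch = llama_batch_init((int32_t) prompt_tokens.size(), 0, 1);
    for (size_t i = 0; i < prompt_tokens.size(); ++i) {
        enc_batch.token    [i]    = prompt_tokens[i];
        enc_batch.pos      [i]    = (llama_pos) i;
        enc_batch.n_seq_id [i]    = 1;
        enc_batch.seq_id   [i][0] = 0;
        enc_batch.logits   [i]    = false; // encoder pass, no logits needed
    }
    enc_batch.n_tokens = (int32_t) prompt_tokens.size();

    if (llama_encode(ctx, enc_batch) != 0) {
        llama_batch_free(enc_batch);
        return false;
    }
    llama_batch_free(enc_batch);

    // 2) Start the decoder from the model's decoder-start token,
    //    falling back to BOS if the model does not define one.
    llama_token dec_start = llama_model_decoder_start_token(model);
    if (dec_start == -1) {
        dec_start = llama_token_bos(model);
    }

    llama_batch dec_batch = llama_batch_init(1, 0, 1);
    dec_batch.token    [0]    = dec_start;
    dec_batch.pos      [0]    = 0;
    dec_batch.n_seq_id [0]    = 1;
    dec_batch.seq_id   [0][0] = 0;
    dec_batch.logits   [0]    = true; // logits needed to sample the first output token

    const bool ok = llama_decode(ctx, dec_batch) == 0;
    llama_batch_free(dec_batch);

    // From here generation would continue with the regular llama_decode()
    // sampling loop, one decoder token at a time.
    return ok;
}
```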

@fairydreaming (Collaborator)

It looks like T5 model loading in llama-server is already fixed by #8997

@ggerganov (Owner)

Yup, this should fix the loading, but llama_encode() support needs more work

@ggerganov closed this Aug 12, 2024
Successfully merging this pull request may close these issues.

Bug: GGML_ASSERT(llama_add_eos_token(model) != 1) failed: llama-server critical error with flan-t5 models
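
For context on the linked issue, a minimal standalone sketch of the assertion it names. The model path is hypothetical, the program is illustrative (the real check sits inside llama-server's startup path, not a standalone binary), and the API names follow the llama.cpp C API of that period:

```cpp
#include "ggml.h"
#include "llama.h"

int main() {
    llama_backend_init();

    llama_model_params mparams = llama_model_default_params();
    // Hypothetical path to a flan-t5 GGUF conversion.
    llama_model * model = llama_load_model_from_file("flan-t5-base.gguf", mparams);
    if (model == nullptr) {
        return 1;
    }

    // The condition llama-server asserted on before the fix: per the linked
    // issue, llama_add_eos_token() returns 1 for flan-t5 models, so this
    // assertion aborts the process at load time.
    GGML_ASSERT(llama_add_eos_token(model) != 1);

    llama_free_model(model);
    llama_backend_free();
    return 0;
}
```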