t5xxl_fp8_e4m3fn_scaled.safetensors
t5-v1_1-xxl-encoder-Q8_0.gguf
What are the differences between these two models, and how should I choose between them? In terms of model precision, which one is higher?
https://huggingface.co/city96/t5-v1_1-xxl-encoder-gguf/blob/main/t5-v1_1-xxl-encoder-Q8_0.gguf
https://huggingface.co/Comfy-Org/stable-diffusion-3.5-fp8/blob/main/text_encoders/t5xxl_fp8_e4m3fn_scaled.safetensors
I tried both but could not really see a difference in prompt following; I think the other two CLIP encoders do most of the heavy lifting anyway.
I think the new scaled FP8 ones should be on par with Q8_0 for T5. The logic is fairly similar from what I can tell, but I haven't tested them myself.
The main benefit of FP8 is that it's faster (even more so on 40XX cards with the `--fast` launch argument) and natively supported in ComfyUI.
I guess the main benefit of GGUF would be lower initial system memory usage on Windows, due to the way the model is loaded.
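As a rough illustration of why the two schemes land close to each other, here is a minimal sketch (my own simplification, not ComfyUI's or llama.cpp's actual code) comparing round-trip error of fp8 e4m3fn with a single scale factor against Q8_0-style int8 with one scale per 32-element block. The per-tensor scale is an assumption for brevity; the real "scaled" checkpoints store a scale per weight tensor.

```python
# Sketch: compare round-trip quantization error of the two schemes.
# Assumes PyTorch >= 2.1 for the float8_e4m3fn dtype.
import torch

def quantize_fp8_scaled(w: torch.Tensor) -> torch.Tensor:
    # Scale so the max magnitude fits e4m3fn's largest finite value (448),
    # cast to fp8, then dequantize back to float32 for comparison.
    fp8_max = torch.finfo(torch.float8_e4m3fn).max
    scale = w.abs().max() / fp8_max
    q = (w / scale).to(torch.float8_e4m3fn)
    return q.to(torch.float32) * scale

def quantize_q8_0(w: torch.Tensor, block: int = 32) -> torch.Tensor:
    # Q8_0 stores int8 values with one scale per 32-element block.
    blocks = w.reshape(-1, block)
    scale = blocks.abs().max(dim=1, keepdim=True).values / 127.0
    q = torch.round(blocks / scale).clamp(-127, 127).to(torch.int8)
    return (q.to(torch.float32) * scale).reshape(w.shape)

w = torch.randn(4096 * 32)  # stand-in for one T5 weight tensor
for name, deq in [("fp8 e4m3fn (scaled)", quantize_fp8_scaled(w)),
                  ("Q8_0 (int8, block=32)", quantize_q8_0(w))]:
    err = (w - deq).abs().mean().item()
    print(f"{name}: mean abs error = {err:.6f}")
```

Running this, the block-wise int8 scheme tends to show lower raw round-trip error, but as noted above both are close enough for T5 that output quality looks the same in practice, so speed and memory behaviour end up being the deciding factors.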
Thank you. Although I have a 40XX graphics card, I chose GGUF for its lower initial RAM usage, since I only have 16 GB of system memory.