max_length of Siglip2 #157

Open
Yu-xm opened this issue Feb 23, 2025 · 1 comment

Comments


Yu-xm commented Feb 23, 2025

"When using the standalone GemmaTokenizerFast make sure to pass padding="max_length" and max_length=64 as that’s how the model was trained." Does Siglip2 support longer text input? If the max_length is set to 256 or 512, will text exceeding 64 be truncated?

mitscha (Collaborator) commented Feb 24, 2025

SigLIP 2 was trained with a text length of 64. The big_vision Gemma tokenizer implementation will pad/truncate to 64 if you set length=64. I'm not sure how other implementations behave (it seems you're referencing the HF transformers implementation). It's also unclear how model quality will change if you set length/max_length to a different value (and resize the positional embedding of the text encoder accordingly), since the model was trained with 64.
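
For reference, a rough sketch of what "resize the positional embedding of the text encoder" could look like in the HF transformers implementation; the attribute paths follow the SigLIP module layout and are assumptions for SigLIP 2, and, as noted above, model quality at lengths other than 64 is untested:

```python
# Rough sketch, assuming the text tower stores its learned positions in
# text_model.embeddings.position_embedding (assumed attribute path).
import torch
import torch.nn.functional as F

def resize_text_pos_embedding(model, new_length):
    emb = model.text_model.embeddings.position_embedding  # nn.Embedding(64, dim)
    dim = emb.weight.shape[1]
    # Linearly interpolate the 64 learned positions up to new_length.
    old = emb.weight.data.T.unsqueeze(0)                  # (1, dim, 64)
    new = F.interpolate(old, size=new_length, mode="linear", align_corners=False)
    new_emb = torch.nn.Embedding(new_length, dim).to(emb.weight.device, emb.weight.dtype)
    new_emb.weight.data.copy_(new.squeeze(0).T)           # (new_length, dim)
    model.text_model.embeddings.position_embedding = new_emb
    # The position_ids buffer must cover the new length as well (assumed name).
    model.text_model.embeddings.position_ids = torch.arange(new_length).expand((1, -1))
    return model
```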
