Update transformers version to 4.48 #1372

Merged: wizeng23 merged 6 commits into main from wizeng/transformers on Feb 5, 2025

Conversation

@wizeng23 (Contributor) commented Feb 4, 2025

Description

Llama vision attention is still broken. Nikolai determined it occurs when gradient checkpointing is enabled, and filed an issue here: huggingface/transformers#36040.

Since this can be worked around and affects only one family of models, we still want to upgrade. Version 4.45 is four months old, and we need a newer transformers version to support new vision models.

We attempted this upgrade previously, but reverted it in #1111 because of the same issue.

Main code changes required are:

Related issues

Fixes OPE-738, OPE-875
Towards OPE-1018

Before submitting

  • This PR only changes documentation. (You can ignore the following checks in that case)
  • Did you read the Pull Request guidelines in the contributor guideline?
  • Did you link the issue(s) related to this PR in the section above?
  • Did you add / update tests where needed?

@wizeng23 marked this pull request as ready for review on February 4, 2025 23:06

@xrdaukar (Collaborator) commented Feb 4, 2025

What config was used for Llama Vision testing?

@wizeng23 (Contributor, Author) commented Feb 4, 2025

configs/recipes/vision/llama3_2_vision/sft/11b_lora/gcp_job.yaml. I added it to the PR description.

@xrdaukar changed the title from "Update transformers version to 4.48" to "Update transformers version to 4.48.2" on Feb 5, 2025
@wizeng23 marked this pull request as draft on February 5, 2025 00:14
@wizeng23 changed the title from "Update transformers version to 4.48.2" to "Update transformers version to 4.48" on Feb 5, 2025
@wizeng23 marked this pull request as ready for review on February 5, 2025 06:28
Review comment on pyproject.toml (outdated, resolved)
@wizeng23 merged commit f6e47fe into main on Feb 5, 2025
2 checks passed
@wizeng23 deleted the wizeng/transformers branch on February 5, 2025 19:34
4 participants