Update transformers version to 4.48 #1372

wizeng23 · 2025-02-04T22:11:25Z

Description

Llama vision attention is still broken. Nikolai determined it occurs when gradient checkpointing is enabled, and filed an issue here: huggingface/transformers#36040.

Since this can be worked around and only affects one family of models, we still want to upgrade. Transformers 4.45 is four months old, and we need newer transformers versions to support new vision models.

We attempted this upgrade previously and reverted it in #1111 because of the same issue.

Main code changes required are:

Rename tokenizer to processing_class
Update some attention class names to match HF Transformers' attention refactor ("🚨All attention refactor🚨", huggingface/transformers#35235)
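As a rough illustration of the first change, the sketch below picks the keyword name by inspecting a trainer class's constructor, so the same call site can work with either the older `tokenizer` argument or the newer `processing_class` argument. The helper names (`tokenizer_kwarg_name`, `build_trainer`) and the two stand-in trainer classes are hypothetical and are not taken from this PR or from the transformers library; it has not been checked against the real `Trainer`.

```python
import inspect


def tokenizer_kwarg_name(trainer_cls: type) -> str:
    """Return the constructor keyword used to pass the tokenizer/processor.

    Hypothetical helper: prefers `processing_class` when the constructor
    declares it, otherwise falls back to the older `tokenizer` keyword.
    """
    params = inspect.signature(trainer_cls.__init__).parameters
    return "processing_class" if "processing_class" in params else "tokenizer"


def build_trainer(trainer_cls: type, tokenizer, **kwargs):
    """Instantiate `trainer_cls`, passing `tokenizer` under the right keyword."""
    kwargs[tokenizer_kwarg_name(trainer_cls)] = tokenizer
    return trainer_cls(**kwargs)


# Stand-in classes for illustration only (not real transformers classes).
class OldStyleTrainer:
    def __init__(self, model=None, tokenizer=None):
        self.model = model
        self.tokenizer = tokenizer


class NewStyleTrainer:
    def __init__(self, model=None, processing_class=None):
        self.model = model
        self.processing_class = processing_class


old = build_trainer(OldStyleTrainer, "my-tokenizer", model="m")
new = build_trainer(NewStyleTrainer, "my-tokenizer", model="m")
```

If only one transformers version needs to be supported, passing `processing_class=` directly is simpler; the signature check is only useful while both argument names are in play.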

Related issues

Fixes OPE-738, OPE-875
Towards OPE-1018

Before submitting

This PR only changes documentation. (You can ignore the following checks in that case)
Did you read the Pull Request guidelines in the contributor guideline?
Did you link the issue(s) related to this PR in the section above?
Did you add / update tests where needed?

xrdaukar · 2025-02-04T23:35:55Z

What config was used for Llama Vision testing?

wizeng23 · 2025-02-04T23:42:02Z

configs/recipes/vision/llama3_2_vision/sft/11b_lora/gcp_job.yaml. Added to the PR description.

pyproject.toml

src/oumi/models/layers/ring_attention.py

Update transformers version to 4.48

23a542f

wizeng23 requested review from optas, oelachqar, taenin and xrdaukar February 4, 2025 22:11

wizeng23 marked this pull request as ready for review February 4, 2025 23:06

a

fda50c2

xrdaukar changed the title ~~Update transformers version to 4.48~~ Update transformers version to 4.48.2 Feb 5, 2025

wizeng23 marked this pull request as draft February 5, 2025 00:14

xrdaukar and others added 3 commits February 4, 2025 16:31

save

a48a94f

merge main

d3dde0e

Add comments

0c46e18

wizeng23 changed the title ~~Update transformers version to 4.48.2~~ Update transformers version to 4.48 Feb 5, 2025

wizeng23 marked this pull request as ready for review February 5, 2025 06:28

oelachqar approved these changes Feb 5, 2025

taenin approved these changes Feb 5, 2025

xrdaukar reviewed Feb 5, 2025

pyproject.toml (outdated, resolved)

xrdaukar reviewed Feb 5, 2025

src/oumi/models/layers/ring_attention.py (resolved)

xrdaukar reviewed Feb 5, 2025

src/oumi/models/layers/ring_attention.py (resolved)

xrdaukar approved these changes Feb 5, 2025

Address comments

84f6049

wizeng23 merged commit f6e47fe into main Feb 5, 2025
2 checks passed

wizeng23 deleted the wizeng/transformers branch February 5, 2025 19:34
