
Fix Qwen2VL mrope for transformers 4.47.0 #464

Merged 1 commit into linkedin:main on Dec 10, 2024
Conversation

@li-plus (Collaborator) commented Dec 10, 2024

Summary

Fix #461

Testing Done

  • Hardware Type: A800-SXM4-80GB
  • run make test to ensure correctness
  • run make checkstyle to ensure code style
  • run make test-convergence to ensure convergence

@ByronHsu (Collaborator) commented Dec 10, 2024

Dude this is fast lol. I'm curious why the tests can pass on AMD?

@li-plus (Collaborator, Author) commented Dec 10, 2024

No idea, but it seems I don't have permission to run tests on the NVIDIA GPU, so they failed:

Run modal run dev.modal.tests
╭─ Error ──────────────────────────────────────────────────────────────────────╮
│ Token missing. Could not authenticate client. If you have token credentials, │
│ see modal.com/docs/reference/modal.config for setup help. If you are a new   │
│ user, register an account at modal.com, then run `modal token new`.          │
╰──────────────────────────────────────────────────────────────────────────────╯

@austin362667 (Collaborator) commented

Amazing! Thanks for the super fast turnaround, @li-plus. The NVIDIA CI works as expected in my Modal account.

I can help with reverting the workaround of #463 after this PR.

Wondering whether this code change corresponds to https://github.com/huggingface/transformers/pull/34274/files#diff-09bc594f9680f1d042fd485106c68022d77b59831697a00b3b38f12a3e40f395R1698-R1715?

@li-plus (Collaborator, Author) commented Dec 10, 2024

@austin362667 Thanks! I haven't dug into the code diffs. From the model input perspective, before 4.47.0 the position_ids for each sample within a batch always start from 0 and increase, so I only accessed the first row of the cos/sin embeddings to save HBM bandwidth. After 4.47.0, however, the position_ids are left-padded with 1s and differ for each sample because sequence lengths vary, so I have to access the full rotary embeddings to get the correct position for each sample. This change is backward compatible with older transformers versions.
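
A minimal sketch of the difference, assuming toy shapes and a simplified 1-D rope (the actual Qwen2VL mrope positions are multi-dimensional, and all names here are illustrative, not the actual Liger or transformers identifiers):

```python
import torch

# Toy shapes for illustration only.
batch, seq_len, head_dim = 2, 6, 8

# Before transformers 4.47.0: every sample's positions start at 0,
# so the first row serves the whole batch.
pos_old = torch.arange(seq_len).expand(batch, seq_len)
# [[0, 1, 2, 3, 4, 5],
#  [0, 1, 2, 3, 4, 5]]

# After 4.47.0: shorter samples are left-padded with 1s, so rows differ.
lengths = [6, 4]
pos_new = torch.stack([
    torch.cat([torch.ones(seq_len - n, dtype=torch.long), torch.arange(n)])
    for n in lengths
])
# [[0, 1, 2, 3, 4, 5],
#  [1, 1, 0, 1, 2, 3]]

inv_freq = 1.0 / (10000.0 ** (torch.arange(0, head_dim, 2) / head_dim))

# Old shortcut: cos/sin built once from the shared first row.
cos_shared = torch.cos(pos_old[0, :, None] * inv_freq)  # (seq_len, head_dim // 2)

# Fix: build cos/sin from the full per-sample position_ids.
cos_full = torch.cos(pos_new[..., None] * inv_freq)     # (batch, seq_len, head_dim // 2)
```

With left padding, reusing only `pos_new[0]` would hand the second sample the wrong rotary angles, consistent with the convergence failure reported in #461.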

@ByronHsu (Collaborator) commented Dec 10, 2024

> I can help with reverting the workaround of #463 after this PR.

@austin362667 Please do that. Thanks! Also, @li-plus, I have added you as a maintainer, so you can push branches directly to the main repo to run CI. We have currently disabled CI from external forks for security reasons.

@ByronHsu ByronHsu merged commit 78e8a85 into linkedin:main Dec 10, 2024
3 of 5 checks passed
@li-plus li-plus deleted the fix-mrope branch December 11, 2024 02:16
ByronHsu pushed a commit that referenced this pull request Dec 11, 2024
## Summary

After fix #464, we can revert some changes in:

- #463
- #459

These were workarounds for #461.


## Testing Done

- Hardware Type: <BLANK>
- [ ] run `make test` to ensure correctness
- [X] run `make checkstyle` to ensure code style
- [X] run `make test-convergence` to ensure convergence

---------

Signed-off-by: Austin Liu <[email protected]>
Closes: Qwen VL Convergence Test Fails for Transformers >= 4.47.0 (#461)