Add mps support. #75

Merged (3 commits) on Oct 15, 2024

iamnotagentleman

Describe your changes and approach used

Added MPS support for better performance on Apple silicon.
Added __init__.py in models/ to make it a Python package.

TEST MACHINE


{'cpu_count': 11,
 'machine': 'arm64',
 'platform': 'macOS-14.4.1-arm64-arm-64bit',
 'processor': 'arm',
 'ram': '36 GB',
 'system': 'Darwin'}
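
For anyone who wants to report their own test machine in the same shape, here is a minimal sketch that builds a similar dictionary with the standard library only. The helper name `collect_machine_info` is hypothetical, and how the dictionary above was actually produced is not stated in this PR. The RAM lookup via `os.sysconf` is an assumption that holds on POSIX systems and falls back to "unknown" elsewhere.

```python
import os
import platform
from pprint import pprint


def _ram_label() -> str:
    """Return total physical memory as e.g. '36 GB', or 'unknown'."""
    try:
        # POSIX-only lookup; os.sysconf is missing on Windows.
        total_bytes = os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        return "unknown"
    return f"{round(total_bytes / 1024 ** 3)} GB"


def collect_machine_info() -> dict:
    """Hypothetical helper: gather the same keys as the report above."""
    return {
        "cpu_count": os.cpu_count(),
        "machine": platform.machine(),
        "platform": platform.platform(),
        "processor": platform.processor(),
        "ram": _ram_label(),
        "system": platform.system(),
    }


if __name__ == "__main__":
    pprint(collect_machine_info())
```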

@luca-medeiros
Owner

Cool! guess we can do the same here

DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
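
One way the line above could be extended to cover MPS is sketched below. This is an assumption about the shape of the change, not the code merged in this PR. The helper names `pick_device_name` and `detect_device_name` are hypothetical, and the priority order (CUDA, then MPS, then CPU) is a design choice. Only `torch.cuda.is_available()` and `torch.backends.mps.is_available()` are real PyTorch calls.

```python
def pick_device_name(cuda_available: bool, mps_available: bool) -> str:
    """Hypothetical helper: choose a device name from availability flags."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device_name() -> str:
    """Probe PyTorch for the best available backend, defaulting to CPU."""
    try:
        import torch
    except ImportError:
        # Without PyTorch installed there is nothing to probe.
        return "cpu"
    # Older PyTorch builds may not ship the MPS backend module at all.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps_backend is not None and mps_backend.is_available())
    return pick_device_name(torch.cuda.is_available(), mps_ok)


if __name__ == "__main__":
    print(detect_device_name())
```

With PyTorch installed, the result can be passed straight to `torch.device(detect_device_name())` in place of the two-way conditional.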

@iamnotagentleman
Author

iamnotagentleman commented Oct 14, 2024

Great catch, @luca-medeiros! I’ve also added some additional information regarding this PR.

MPS users will encounter warnings like the following:

UserWarning: The operator 'aten::upsample_bicubic2d.out' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:13.)

As of PyTorch 2.4.1, upsample_bicubic2d.out is not supported on the MPS backend. However, support has been added in the nightly builds, so native support for this operator should reach a stable release soon.
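
Until that lands, users who find the warning noisy could filter it out. The sketch below uses only the standard `warnings` module. The helper name `silence_mps_fallback_warnings` and the regex are hypothetical, with the pattern written to match the message text quoted above. Note that this only hides the message: the operator still runs on the CPU.

```python
import warnings

# Assumed pattern, based on the warning text quoted in this PR.
MPS_FALLBACK_PATTERN = (
    r"The operator '.*' is not currently supported on the MPS backend"
)


def silence_mps_fallback_warnings() -> None:
    """Hypothetical helper: ignore only the MPS CPU-fallback UserWarning."""
    warnings.filterwarnings(
        "ignore",
        message=MPS_FALLBACK_PATTERN,  # matched against the start of the message
        category=UserWarning,
    )


if __name__ == "__main__":
    silence_mps_fallback_warnings()
    # This warning matches the pattern and is suppressed.
    warnings.warn(
        "The operator 'aten::upsample_bicubic2d.out' is not currently "
        "supported on the MPS backend and will fall back to run on the CPU.",
        UserWarning,
    )
```

Calling the helper once, before the model is loaded, should be enough, since the filter is process-wide.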

Additionally, autocast support for MPS is in progress, though no merge is scheduled yet. Once available, it should improve performance even further.

MPS operator coverage tracking issue in PyTorch: pytorch/pytorch#77764

P.S.

I have tested with the nightly builds below, and the warning is gone, as expected.

torch @ https://download.pytorch.org/whl/nightly/cpu/torch-2.6.0.dev20241014-cp311-none-macosx_11_0_arm64.whl#sha256=34c8b2dab12214b0cabf626927470e70139d8f3dbe7333d64f743f87cf178f9e
torchvision @ https://download.pytorch.org/whl/nightly/cpu/torchvision-0.20.0.dev20241014-cp311-cp311-macosx_11_0_arm64.whl#sha256=89f7a4c3e3565d8513a6ce9e2f6fbbc97f2211c09e01dbf0593f2bde65934baf

@luca-medeiros
Owner

@iamnotagentleman Nice, didn't know about that. Appreciate the explanation and PR!
What do you think would be best? Wait for the stable release, or revisit later?

@iamnotagentleman
Author

iamnotagentleman commented Oct 15, 2024

@luca-medeiros, I believe we can proceed for now. Operations other than "upsample_bicubic2d" are still accelerated by MPS, and SAM-2 has received optimizations for the MPS backend, so overall performance should still improve. I haven’t encountered any bugs so far.

Also, before this PR every operation ran on the CPU; now only "upsample_bicubic2d" remains on the CPU.

@luca-medeiros luca-medeiros merged commit e82f70f into luca-medeiros:main Oct 15, 2024
1 check failed