GroupedGEMM interface takes m_sizes instead of m_offsets. #3696

Open
wants to merge 2 commits into
base: main

levendlee
Member

Summary:
X-link: https://github.com/facebookresearch/FBGEMM/pull/771

Compute `m_offsets` on the fly from `m_sizes`, so that no separate small scan kernel has to be launched.

Differential Revision: D69686252
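
To illustrate the idea in the summary, here is a minimal sketch in plain Python, not the FBGEMM kernel or its API. It assumes `m_offsets` is the inclusive prefix sum of `m_sizes` (the exact convention is not stated in this PR), and the helper names `offsets_from_sizes` and `group_row_ranges` are hypothetical. The first function mimics the separate scan step; the second derives each group's row range while iterating over groups, which is roughly what computing the offsets "on the fly" would mean.

```python
from itertools import accumulate


def offsets_from_sizes(m_sizes):
    """Separate scan step: inclusive prefix sum of the per-group row counts.

    Assumption: m_offsets[i] is the end row of group i.
    """
    return list(accumulate(m_sizes))


def group_row_ranges(m_sizes):
    """Derive each group's [start, end) row range while walking the groups.

    A running sum replaces the precomputed offsets, so no scan is needed
    before the main loop.
    """
    ranges = []
    start = 0
    for size in m_sizes:
        end = start + size
        ranges.append((start, end))
        start = end
    return ranges


m_sizes = [3, 0, 5, 2]  # hypothetical per-group M sizes; empty groups are allowed
m_offsets = offsets_from_sizes(m_sizes)
ranges = group_row_ranges(m_sizes)
```

Under the inclusive convention assumed here, the end of each range produced by `group_row_ranges` equals the corresponding entry of `m_offsets`, so passing `m_sizes` carries the same information without the extra launch.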

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D69686252

netlify bot commented Feb 14, 2025

Deploy Preview for pytorch-fbgemm-docs ready!

Name Link
🔨 Latest commit 4772ba8
🔍 Latest deploy log https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/67afd387bff8b80008cba8b3
😎 Deploy Preview https://deploy-preview-3696--pytorch-fbgemm-docs.netlify.app

Summary:

X-link: facebookresearch/FBGEMM#764

- Always allocate the workspace.
  - Allocation is almost free with PyTorch sub-allocation.
  - Skipping the allocation could cause problems in multi-processing and CUDA graph capture.

- Disable TMA store for now.
  - Running into issues with on-device TMA store.

Reviewed By: jiawenliu64, jwfromm

Differential Revision: D69602533
…h#3696)

Summary:

X-link: facebookresearch/FBGEMM#771

Compute `m_offsets` on the fly from `m_sizes`, so that no separate small scan kernel has to be launched.

Differential Revision: D69686252
