Make tolerance configurable #1058

Open
wants to merge 10 commits into
base: main


Doris26
Collaborator

@Doris26 Doris26 commented Nov 21, 2024

Using pure (512 DCN) FSDP triggers the MaxText error "Number of unsharded parameters exceeds tolerance 2% of total parameters."

Make the tolerance a configurable parameter so that this check does not fail on certain machine setups in the future.
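
For illustration, here is a minimal sketch of what a configurable tolerance check could look like. It is not the MaxText implementation: the function name `check_unsharded_fraction`, the `tolerance` argument, and the dictionary of parameter sizes are all hypothetical, chosen only to show the idea of replacing a hard-coded 2% with a value the caller supplies.

```python
def check_unsharded_fraction(param_sizes, unsharded_names, tolerance=0.02):
    """Return the unsharded fraction, raising if it exceeds `tolerance`.

    Hypothetical helper, not MaxText code.
    param_sizes: dict mapping parameter name -> number of elements.
    unsharded_names: names of parameters that ended up unsharded.
    tolerance: maximum allowed unsharded fraction (0.02 means 2%).
    """
    total = sum(param_sizes.values())
    if total == 0:
        raise ValueError("param_sizes must contain at least one element")

    unsharded = sum(param_sizes[name] for name in unsharded_names)
    fraction = unsharded / total

    if fraction > tolerance:
        raise AssertionError(
            f"Number of unsharded parameters exceeds tolerance "
            f"{tolerance * 100:g}% of total parameters "
            f"(got {fraction * 100:.2f}%)."
        )
    return fraction


# Example sizes (made up): 5% of the elements are unsharded.
example_sizes = {"embedding": 950, "bias": 50}
example_unsharded = ["bias"]
```

With the default of 2%, `check_unsharded_fraction(example_sizes, example_unsharded)` raises; passing a looser value such as `tolerance=0.1` lets the same setup through. Whether a looser tolerance is appropriate depends on the machine setup.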


google-cla bot commented Nov 21, 2024

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@Doris26 Doris26 changed the title from "Tolerance configurable" to "Make tolerance configurable" Nov 21, 2024