Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(group): not fail on unverified KeyPackages and return failed to add installations when adding members to a conversation #1606

Open
wants to merge 26 commits into
base: main
Choose a base branch
from

Conversation

mchenani
Copy link
Contributor

@mchenani mchenani commented Feb 6, 2025

When an installation has a bad KeyPackage or a KeyPackage with an invalid lifetime, we currently fail to add other members as well.
This PR updates the logic to add all healthy installations while returning a list of installations that failed.

@mchenani mchenani requested a review from a team as a code owner February 6, 2025 16:41
…bers' into mc/validate-kp-before-adding-members

# Conflicts:
#	xmtp_mls/src/client.rs
#	xmtp_mls/src/groups/intents.rs
#	xmtp_mls/src/groups/mls_sync.rs
#	xmtp_mls/src/groups/mod.rs
xmtp_mls/src/client.rs Outdated Show resolved Hide resolved
@mchenani mchenani requested a review from codabrink February 6, 2025 17:28
new_installations,
new_key_packages,
installation_diff.removed_installations,
failed_installations,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this get persisted somewhere on the group? Ideally we would keep track of the failed installations, so that they can be retried at some point later. Otherwise once this fails it will be very difficult for us to add retrying later, because we won't know who is missing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say it’s not our responsibility to re-add failed installations. There are two cases to consider:

  1. All of a user’s installations fail – In this case, we inform the user that we can’t add them. They need to resolve the issue with their account before we can proceed.
  2. At least one of the user’s installations fails, but not all – In this case, the successful installations should handle adding any missing installations.

With this approach, we place the responsibility on the user rather than the admin to retry. This is because there’s a chance that the problematic installation may never be fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking about retrying at a much later date. For example, if you have an expired key package that got dropped from the network and then come back online and upload a new key package.

If we don't store the failed installations it requires a very expensive diffing operation to figure out who is missing from a large group.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still thinking of if we distribute the load on the clients of each user it would be better, then we can show their healthy clients some alerts that you have a bad installation go and fix it, or here you can revoke that installation!
Just showing the admin that we can't add that user bc they don't have a healthy client, then the admin can add them later rather than doing it in the background.

Copy link
Contributor

@neekolas neekolas Feb 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we have a mechanism for adding the unhealthy client later if we don't store it.

The inbox's sequence_id will be set to the latest. Uploading a new key package won't change their sequence_id. So even if you try to add the user again it won't see any installations in the diff and will be a no-op.

Copy link
Contributor Author

@mchenani mchenani Feb 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we need to implement it. this is my idea:

  • the client fetches the list of all installations from either the server or the sync_group
  • on every welcome message the client checks if all of installations are in the group, if not, then tries to add
  • on every membership change, we can do like what we do on welcome messages, just to make sure we have our installations always are updated
  • based on the installations list, the client will fetch all the keypackages for other installations, if something wrong, it will inform the user-- we can provide a list of all the user installations and their status

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we validate a commit we do some checks to make sure that the installations being added came from the diff between the old and new GroupMembers struct. We get a list of all installations that were expected to be added or removed based on that diff.

In the case of adding some missing installation it's not going to be present in that list and clients will reject it.

If we had a list of missing installations saved somewhere it would be easier to figure out that the newly added installation really is supposed to be there.

Otherwise to validate that commit you would have to get all the installations for all the members of the group to find out if the new leaf node is one of the missing installations. That's very expensive for large groups.

Copy link
Contributor Author

@mchenani mchenani Feb 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I 100% agree of saving the failed installations, just wanted to put the load on other clients.
Thanks for the comments Nick, as always insightful. will cover this part in the PR too 🙌

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I don't think you need to handle the retrying now. We just need to know how it will work to make sure the right data is persisted to the group so we can retry it later.

…bers' into mc/validate-kp-before-adding-members

# Conflicts:
#	xmtp_mls/src/client.rs
#	xmtp_mls/src/groups/intents.rs
#	xmtp_mls/src/groups/mod.rs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants