XMTP MLS

This document describes how XMTP implements Messaging Layer Security (MLS).

Database Schema

Some foreign key constraints and indexes are omitted for simplicity.

CREATE TABLE groups (
    -- Random ID generated by group creator
    "id" BLOB PRIMARY KEY NOT NULL,
    -- Based on the timestamp of the welcome message
    "created_at_ns" BIGINT NOT NULL,
    -- Enum of GROUP_MEMBERSHIP_STATE
    "membership_state" INT NOT NULL
);

-- Allow for efficient sorting of groups
CREATE INDEX groups_created_at_idx ON groups(created_at_ns);

CREATE INDEX groups_membership_state ON groups(membership_state);

-- Successfully processed messages meant to be returned to the user
CREATE TABLE group_messages (
    -- Derived via SHA256(CONCAT(decrypted_message_bytes, conversation_id, timestamp))
    "id" BLOB PRIMARY KEY NOT NULL,
    "group_id" BLOB NOT NULL,
    -- Message contents after decryption
    "decrypted_message_bytes" BLOB NOT NULL,
    -- Based on the timestamp of the message
    "sent_at_ns" BIGINT NOT NULL,
    -- Enum GROUP_MESSAGE_KIND
    "kind" INT NOT NULL,
    -- Could remove this if we added a table mapping installation_ids to wallet addresses
    "sender_installation_id" BLOB NOT NULL,
    "sender_account_address" TEXT NOT NULL,
    -- Enum: 1 = 'published' or 2 = 'unpublished'
    "delivery_status" INT NOT NULL,
    FOREIGN KEY (group_id) REFERENCES groups(id)
);

CREATE INDEX group_messages_group_id_sort_idx ON group_messages(group_id, sent_at_ns);

-- Used to keep track of the last seen message timestamp in a topic
CREATE TABLE topic_refresh_state (
    "topic" TEXT PRIMARY KEY NOT NULL,
    "last_message_timestamp_ns" BIGINT NOT NULL
);

-- This table is required to retry messages that do not send successfully due to epoch conflicts
CREATE TABLE group_intents (
    -- Serial ID auto-generated by the DB
    "id" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    -- Enum INTENT_KIND
    "kind" INT NOT NULL,
    "group_id" BLOB NOT NULL,
    -- Some sort of serializable blob that can be used to re-try the message if the first attempt failed due to conflict
    "publish_data" BLOB NOT NULL,
    -- Data needed after applying a commit, such as welcome messages
    "post_commit_data" BLOB NOT NULL,
    -- Enum INTENT_STATE
    "state" INT NOT NULL,
    -- The hash of the encrypted, concrete form of the message, if it was published.
    "message_hash" BLOB,
    FOREIGN KEY (group_id) REFERENCES groups(id)
);

CREATE INDEX group_intents_group_id_id ON group_intents(group_id, id);
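The schema comments describe the `group_messages` ID as a SHA-256 hash over the decrypted bytes, the conversation ID, and the timestamp. A minimal Python sketch of that derivation follows. The helper name `derive_message_id` and the timestamp encoding (unsigned 8-byte big-endian) are assumptions for illustration; the schema does not specify how the timestamp is serialized.

```python
import hashlib


def derive_message_id(decrypted_message_bytes: bytes, group_id: bytes, sent_at_ns: int) -> bytes:
    """Hypothetical helper: SHA256(CONCAT(decrypted_message_bytes, conversation_id, timestamp))."""
    # Assumption: the nanosecond timestamp is serialized as an unsigned 8-byte big-endian integer.
    timestamp_bytes = sent_at_ns.to_bytes(8, "big")
    return hashlib.sha256(decrypted_message_bytes + group_id + timestamp_bytes).digest()


message_id = derive_message_id(b"hello", b"\x01" * 16, 1_700_000_000_000_000_000)
```

Because the ID depends only on the message contents, the group, and the timestamp, processing the same payload twice yields the same primary key, which makes duplicate inserts easy to detect.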

Enums

GROUP_MEMBERSHIP_STATE

  • ALLOWED // User has agreed to be a member of the group
  • REJECTED // User has rejected an invite to the group or left
  • PENDING // User has neither accepted nor rejected the invitation to join the group

INTENT_STATE

  • TO_SEND // Either has never been sent to the network or needs to be re-sent
  • PUBLISHED // Sent to the network but has not been read back or committed
  • COMMITTED // Committed messages could be deleted

INTENT_KIND

  • SEND_MESSAGE // An intent to send a message to the group
  • ADD_MEMBERS // An intent to add members to the group
  • REMOVE_MEMBERS // An intent to remove members from the group
  • KEY_UPDATE // An intent to update your own group key

OUTBOUND_WELCOME_STATE

  • PENDING // Needs to wait for commit to be applied before sending
  • READY_TO_SEND
  • SENT // Messages may be deleted at this point. We may decide to remove this state altogether.

GROUP_MESSAGE_KIND

  • APPLICATION
  • MEMBER_ADDED
  • MEMBER_REMOVED
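Since the enums above are stored in `INT` columns, client code needs a stable mapping between names and integers. The sketch below shows one way to express that in Python. All integer values here are assumptions chosen for illustration; the only values this document states are those for `delivery_status` (1 = published, 2 = unpublished).

```python
from enum import IntEnum


# Assumption: values start at 1 and follow the order listed in this document.
class GroupMembershipState(IntEnum):
    ALLOWED = 1
    REJECTED = 2
    PENDING = 3


class IntentState(IntEnum):
    TO_SEND = 1
    PUBLISHED = 2
    COMMITTED = 3


class IntentKind(IntEnum):
    SEND_MESSAGE = 1
    ADD_MEMBERS = 2
    REMOVE_MEMBERS = 3
    KEY_UPDATE = 4


class GroupMessageKind(IntEnum):
    APPLICATION = 1
    MEMBER_ADDED = 2
    MEMBER_REMOVED = 3


class DeliveryStatus(IntEnum):
    # These two values are the ones given in the group_messages schema comment.
    PUBLISHED = 1
    UNPUBLISHED = 2
```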

State Machine

The following diagram illustrates some common flows in the state machine:

[Diagram: MLS State Machine]

In the first version of MLS in XMTP, each member commits its own proposals immediately and discards any proposals it receives from other members. Future versions of XMTP will add more sophisticated logic, such as batching proposals, allowing members to commit proposals from other members, and validating which proposals are permitted from which members.

Known missing items from the state machine

  • Key updates
  • Processing incoming welcome messages
  • Tracking group membership at the account/user level
  • Permissioning for adding/removing accounts/users
  • Mechanism for syncing installations under each account/user

Add members to a group

Simplified high-level flow for adding members to a group:

  1. Create a group_intent for adding the members
  2. Fetch Key Packages for all new members
  3. Convert the intent into concrete commit and welcome messages for the current epoch
    1. Write the welcome messages to the post_commit_data field for later
  4. Publish commit message
  5. Sync the state of the group with the network
  6. If no conflicts: Publish welcome messages to new members. If conflicts: Go back to step 2 and try again (reset the intent's state to TO_SEND and clear the publish_data and post_commit_data fields)
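The steps above can be sketched as a small retry loop. This is an illustrative Python sketch, not the actual client code: `Intent`, `FakeNetwork`, `add_members`, and the integer state values are all hypothetical stand-ins, and the fake network rejects the first commit to simulate an epoch conflict. Marking the intent `COMMITTED` after publishing the welcomes is also an assumption.

```python
from dataclasses import dataclass, field

# Assumed integer values for INTENT_STATE; the document does not specify them.
TO_SEND, PUBLISHED, COMMITTED = 1, 2, 3


@dataclass
class Intent:
    new_members: list
    state: int = TO_SEND
    publish_data: object = None
    post_commit_data: list = field(default_factory=list)


class FakeNetwork:
    """Test double: the first commit loses an epoch conflict, later ones are accepted."""

    def __init__(self):
        self.published = []
        self.epoch = 0

    def fetch_key_packages(self, members):
        return [f"kp:{m}" for m in members]

    def build_commit(self, key_packages):
        commit = f"commit@{self.epoch}"
        welcomes = [f"welcome:{kp}@{self.epoch}" for kp in key_packages]
        return commit, welcomes

    def publish(self, payload):
        self.published.append(payload)

    def sync_accepted(self, commit):
        accepted = self.epoch > 0
        self.epoch += 1
        return accepted


def add_members(intent, network, max_attempts=3):
    for _ in range(max_attempts):
        key_packages = network.fetch_key_packages(intent.new_members)  # step 2
        commit, welcomes = network.build_commit(key_packages)          # step 3
        intent.publish_data = commit
        intent.post_commit_data = welcomes                             # step 3.1
        network.publish(commit)                                        # step 4
        intent.state = PUBLISHED
        if network.sync_accepted(commit):                              # step 5
            for welcome in intent.post_commit_data:                    # step 6, no conflict
                network.publish(welcome)
            intent.state = COMMITTED
            return True
        # Step 6, conflict: reset the intent and retry from step 2.
        intent.state = TO_SEND
        intent.publish_data = None
        intent.post_commit_data = []
    return False


network = FakeNetwork()
intent = Intent(new_members=["alice"])
succeeded = add_members(intent, network)
```

Note that the welcome messages built for the conflicting epoch are never published: they are discarded when the intent is reset, and only the welcomes that match the accepted commit reach the new members.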

Remove members from a group

Simplified high-level flow for removing members from a group:

  1. Create a group_intent for removing the members
  2. Convert the intent into concrete commit for the current epoch
  3. Publish commit to the network
  4. Sync the state of the group with the network
  5. If no conflicts: Done. If conflicts: Go back to step 2 and try again (reset the intent's state to TO_SEND and clear the publish_data and post_commit_data fields)

Send a message

Simplified high-level flow for sending a group message:

  1. Create a group_intent for sending the message
  2. Convert the intent into a concrete message for the current epoch
  3. Publish message to the network
  4. Sync the state of the group with the network (can be debounced or otherwise only done periodically)
  5. If no conflicts: Mark the message as committed. If conflicts: Go back to step 2 and try again (reset the intent's state to TO_SEND and clear the publish_data and post_commit_data fields)
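The state transitions shared by these flows can be persisted directly in the `group_intents` table. The following Python sketch uses the standard `sqlite3` module. The helper names and integer enum values are assumptions, the foreign key is dropped so the example is self-contained, and because `publish_data` and `post_commit_data` are declared `NOT NULL`, "clearing" them is modeled here as writing an empty blob. Clearing `message_hash` on reset is likewise an assumption.

```python
import sqlite3

TO_SEND, PUBLISHED, COMMITTED = 1, 2, 3  # assumed values for INTENT_STATE
SEND_MESSAGE = 1                         # assumed value for INTENT_KIND

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE group_intents (
    "id" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    "kind" INT NOT NULL,
    "group_id" BLOB NOT NULL,
    "publish_data" BLOB NOT NULL,
    "post_commit_data" BLOB NOT NULL,
    "state" INT NOT NULL,
    "message_hash" BLOB
)""")


def create_intent(conn, group_id, kind):
    with conn:
        cur = conn.execute(
            "INSERT INTO group_intents (kind, group_id, publish_data, post_commit_data, state) "
            "VALUES (?, ?, x'', x'', ?)",
            (kind, group_id, TO_SEND),
        )
    return cur.lastrowid


def mark_published(conn, intent_id, publish_data, message_hash):
    # Only a TO_SEND intent may move to PUBLISHED.
    with conn:
        conn.execute(
            "UPDATE group_intents SET state = ?, publish_data = ?, message_hash = ? "
            "WHERE id = ? AND state = ?",
            (PUBLISHED, publish_data, message_hash, intent_id, TO_SEND),
        )


def mark_committed(conn, intent_id):
    with conn:
        conn.execute(
            "UPDATE group_intents SET state = ? WHERE id = ? AND state = ?",
            (COMMITTED, intent_id, PUBLISHED),
        )


def reset_after_conflict(conn, intent_id):
    # Conflict path: back to TO_SEND with the epoch-specific data cleared.
    with conn:
        conn.execute(
            "UPDATE group_intents SET state = ?, publish_data = x'', post_commit_data = x'', "
            "message_hash = NULL WHERE id = ? AND state = ?",
            (TO_SEND, intent_id, PUBLISHED),
        )


intent_id = create_intent(conn, b"group-1", SEND_MESSAGE)
mark_published(conn, intent_id, b"ciphertext-epoch-5", b"hash-1")
reset_after_conflict(conn, intent_id)  # sync detected an epoch conflict
mark_published(conn, intent_id, b"ciphertext-epoch-6", b"hash-2")
mark_committed(conn, intent_id)
state, publish_data = conn.execute(
    "SELECT state, publish_data FROM group_intents WHERE id = ?", (intent_id,)
).fetchone()
```

Guarding each `UPDATE` with the expected current state keeps a late or duplicate caller from moving an intent backwards.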

Syncing group state

The latest payloads on a group could be synced from the server in the following cases:

  • Push notifications
  • Application-triggered subscription
  • Application-triggered pull
  • Commit publishing flow

Any syncing strategy must be able to handle the following constraints:

  • Payload syncing could be initiated concurrently from multiple locations
  • Due to forward secrecy constraints, each payload may only be decrypted successfully once

The following strategies are possible, each with its own limitations:

  • Coordinated: Syncing happens in only one location at a time, enforced via locks or queues
  • Uncoordinated: Syncing is allowed to happen in parallel

The latter is simpler to implement in the short term, but raises the following potential challenges:

  • How to handle concurrent decryption failures and return the latest data regardless
  • How to handle updating the last_message_timestamp_ns on the topic_refresh_state table
  • How to know whether a failure is due to the message having already been decrypted, or to a permanent failure

For the initial version, this simple strategy can be used to pull the latest payloads:

  1. Read last_message_timestamp_ns from the database and pull all payloads from the server with a timestamp greater than that value
  2. For each payload, attempt to decrypt it
    1. If it succeeds, process the payload. Write the result, update the cryptographic state, and update last_message_timestamp_ns together in a single transaction. Set last_message_timestamp_ns to the larger of the value in the database and the payload's timestamp.
    2. If it fails, update only last_message_timestamp_ns in the database, setting it to the larger of the value in the database and the payload's timestamp.
  3. To return the result of the sync, pull the latest data from the database rather than using the in-memory data from the syncing process

This strategy effectively means that the processing of each payload succeeds or fails atomically. In the event of failure due to concurrency, the actual result can be read from the database.
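The pull strategy above can be sketched in Python with the standard `sqlite3` module. This is a simplified illustration under stated assumptions: the `processed` table stands in for both `group_messages` and the cryptographic state, and `sync`, `fetch_since`, `decrypt`, and `DecryptionError` are hypothetical names. In a real client, the cryptographic state update would need to live in the same transaction as the message write.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE topic_refresh_state (
    "topic" TEXT PRIMARY KEY NOT NULL,
    "last_message_timestamp_ns" BIGINT NOT NULL
);
-- Stand-in for group_messages plus the cryptographic state
CREATE TABLE processed (topic TEXT NOT NULL, timestamp_ns BIGINT NOT NULL, body BLOB NOT NULL);
""")


class DecryptionError(Exception):
    pass


def bump_timestamp(conn, topic, timestamp_ns):
    # Keep the larger of the stored value and the payload's timestamp.
    conn.execute(
        "UPDATE topic_refresh_state "
        "SET last_message_timestamp_ns = MAX(last_message_timestamp_ns, ?) WHERE topic = ?",
        (timestamp_ns, topic),
    )


def sync(conn, topic, fetch_since, decrypt):
    with conn:
        conn.execute("INSERT OR IGNORE INTO topic_refresh_state VALUES (?, 0)", (topic,))
    (last_seen,) = conn.execute(
        "SELECT last_message_timestamp_ns FROM topic_refresh_state WHERE topic = ?", (topic,)
    ).fetchone()
    for timestamp_ns, ciphertext in fetch_since(topic, last_seen):  # step 1
        try:
            plaintext = decrypt(ciphertext)                         # step 2
        except DecryptionError:
            with conn:                                              # step 2.2
                bump_timestamp(conn, topic, timestamp_ns)
            continue
        with conn:                                                  # step 2.1: one transaction
            conn.execute(
                "INSERT INTO processed VALUES (?, ?, ?)", (topic, timestamp_ns, plaintext)
            )
            bump_timestamp(conn, topic, timestamp_ns)
    # Step 3: return what is in the database, not in-memory results.
    return conn.execute(
        "SELECT timestamp_ns, body FROM processed WHERE topic = ? ORDER BY timestamp_ns",
        (topic,),
    ).fetchall()


payloads = [(10, b"ok:a"), (20, b"bad"), (30, b"ok:b")]


def fetch_since(topic, last_seen):
    return [p for p in payloads if p[0] > last_seen]


def decrypt(ciphertext):
    if not ciphertext.startswith(b"ok:"):
        raise DecryptionError
    return ciphertext[3:]


messages = sync(conn, "group-1", fetch_since, decrypt)
```

Using `MAX(...)` in the update means two concurrent syncs can finish in either order without moving the stored timestamp backwards.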

For now, we can put off the issue of detecting if a decryption failure is due to concurrency or permanent failure. If OpenMLS cryptographic state is entirely database-driven, we may be able to detect that a failure is due to concurrency by the fact that last_message_timestamp_ns has already been updated. If OpenMLS cryptographic state is partially driven by in-memory data, we can record per-payload successes and failures in a separate table, with successes always overwriting failures.
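The "separate table" idea could look roughly like the following Python sketch. The table name `payload_results`, its columns, and the helper `record_outcome` are hypothetical. Encoding failure as 0 and success as 1 lets a single `MAX` expression guarantee that a recorded success is never overwritten by a later failure. The upsert syntax requires SQLite 3.24 or newer.

```python
import sqlite3

FAILURE, SUCCESS = 0, 1  # assumed encoding; SUCCESS must compare greater than FAILURE

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE payload_results (
    payload_hash BLOB PRIMARY KEY NOT NULL,
    outcome INT NOT NULL
)""")


def record_outcome(conn, payload_hash, outcome):
    # Insert the outcome, or merge with the existing row so that successes always win.
    with conn:
        conn.execute(
            """
            INSERT INTO payload_results (payload_hash, outcome) VALUES (?, ?)
            ON CONFLICT(payload_hash) DO UPDATE
            SET outcome = MAX(outcome, excluded.outcome)
            """,
            (payload_hash, outcome),
        )


def outcome_of(conn, payload_hash):
    row = conn.execute(
        "SELECT outcome FROM payload_results WHERE payload_hash = ?", (payload_hash,)
    ).fetchone()
    return None if row is None else row[0]


# Two concurrent syncs race on the same payload: one fails to decrypt, one succeeds.
record_outcome(conn, b"payload-1", FAILURE)
record_outcome(conn, b"payload-1", SUCCESS)
record_outcome(conn, b"payload-1", FAILURE)
```

With this table, a payload whose only recorded outcomes are failures is a candidate for a permanent failure, while a payload with any success was simply decrypted elsewhere first.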

Updating your list of conversations

  1. Read from the welcome topic for your installation_id, filtering for messages since last_message_timestamp_ns
  2. For each message, create a group with a GROUP_MEMBERSHIP_STATE of PENDING
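A minimal Python sketch of these two steps follows, again using `sqlite3`. Decrypting the welcome message is elided: `fetch_welcomes_since` is a hypothetical callable assumed to return `(timestamp_ns, group_id)` pairs. The topic name, the integer value of `PENDING`, and advancing `last_message_timestamp_ns` in the same transaction are also assumptions.

```python
import sqlite3

PENDING = 3  # assumed integer value for GROUP_MEMBERSHIP_STATE.PENDING

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE groups (
    "id" BLOB PRIMARY KEY NOT NULL,
    "created_at_ns" BIGINT NOT NULL,
    "membership_state" INT NOT NULL
);
CREATE TABLE topic_refresh_state (
    "topic" TEXT PRIMARY KEY NOT NULL,
    "last_message_timestamp_ns" BIGINT NOT NULL
);
""")


def refresh_conversations(conn, welcome_topic, fetch_welcomes_since):
    row = conn.execute(
        "SELECT last_message_timestamp_ns FROM topic_refresh_state WHERE topic = ?",
        (welcome_topic,),
    ).fetchone()
    last_seen = row[0] if row else 0
    for timestamp_ns, group_id in fetch_welcomes_since(welcome_topic, last_seen):  # step 1
        with conn:
            # Step 2: created_at_ns is based on the welcome message's timestamp.
            conn.execute(
                "INSERT OR IGNORE INTO groups (id, created_at_ns, membership_state) "
                "VALUES (?, ?, ?)",
                (group_id, timestamp_ns, PENDING),
            )
            conn.execute(
                """
                INSERT INTO topic_refresh_state (topic, last_message_timestamp_ns)
                VALUES (?, ?)
                ON CONFLICT(topic) DO UPDATE
                SET last_message_timestamp_ns =
                    MAX(last_message_timestamp_ns, excluded.last_message_timestamp_ns)
                """,
                (welcome_topic, timestamp_ns),
            )


welcomes = [(100, b"group-a"), (200, b"group-b")]


def fetch_welcomes_since(topic, last_seen):
    return [w for w in welcomes if w[0] > last_seen]


refresh_conversations(conn, "welcome-topic-for-installation", fetch_welcomes_since)
pending_groups = conn.execute(
    "SELECT id, created_at_ns FROM groups WHERE membership_state = ? ORDER BY created_at_ns",
    (PENDING,),
).fetchall()
```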