-
Notifications
You must be signed in to change notification settings - Fork 385
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MSC1849: Proposal for m.relates_to aggregations #1849
Conversation
There is some related prior art here. Mjark and I thought about how to do updates vs relations back in 2015 - https://github.com/matrix-org/matrix-doc/blob/5795e1cedaa001994847702193bd02e517e07c83/drafts/general_api.rst#general-client-changes Some combination of the two may work for aggregating relations. |
@kegsay indeed - i had not forgotten it, and we should definitely take inspiration from it. In other news: we should perhaps consider extending redaction API to say "redact this event" or "redact all related events". |
proposals/1849-aggregations.md
Outdated
go walk the DAG to pull in the missing relations? Then, the next relation for an event | ||
could pull in any of the missing relations. | ||
* Could we also ask the server, after a gap, to provide all the relations which happened during the gap to events whose root preceeded the gap. | ||
* "Give me all relations which happened between this set of forward-extremities when I lost sync, and the point i've rejoined the DAG, for events which preceeded the gap"? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens when the set of relations becomes very large?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
excellent question. god knows; this is being swept under the carpet for now.
{ | ||
// event contents | ||
} | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to figure out how to send the aggregation key
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We add a key
query string param with the (url encoded) aggregation key
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@erikjohnston why? can't we just fish it out of the event contents?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some thoughts about things, that annoy me with the current implementations of edits and relations. I would like this to finally land in the spec, but I think the current design does have some issues. Those are my opinions though, so take them with a grain of salt :3
field of the `m.relationship` and the object is the `event_id` field. | ||
|
||
We consciously do not support multiple different relations within a single event, | ||
in order to keep the API simple, and in the absence of identifiable use cases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in practice this has turned out to be the wrong decision. While it is simpler, I do sometimes reply to the wrong person and not being able to change that is quite annoying. It also is usually quite annoying, that an edit loses the relationship to the reply. This makes it hard to strip out the reply fallback from edited events (which shouldn't be sent with the even, but that tends to break fairly often in Riots). This also limits any future usage, where you may want to reply to multiple events, reply in a thread, etc.
There are multiple alternatives, which may be worth to consider:
{
"m.relations" : {
"m.reply": {
"even_id": "someid"
},
"m.annotation": {
"even_id": "someid",
"key": "👍"
},
"m.thread": {
"even_id": "someid",
"thread": "internal-discussion"
}
}
}
or
{
"m.relations" : [
{
"type": "m.reply",
"even_id": "someid"
},
{
"type": "m.reply",
"even_id": "someotherid"
},
{
"type": "m.annotation",
"even_id": "someid",
"key": "👍"
},
{
"type": "m.thread",
"even_id": "someid",
"thread": "internal-discussion"
}
]
}
Some combinations make no sense of course and it does introduce some additional complexity, but it is a lot more extensible to use cases not in this proposal yet. It is also still changeable, since this change is not in the spec yet and I don't think clients use m.relations atm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your first alternative is exactly what replies currently do, except with different keys. If we did go with that, it should probably just keep the existing m.relates_to
and m.in_reply_to
keys
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I should have probably made it clearer that I don't understand, why the format is being simplified so much, that it limits its use to such an extent. Keeping the current replies format and extending it would be fine by me too, although it is more limited than an array of relations.
most recent replacement event (as determined by `origin_server_ts`). The | ||
replacement event must contain an `m.new_content` which defines the | ||
replacement content (allowing the normal `body` fields to be used for a | ||
fallback for clients who do not understand replacement events). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like, that this adds another fallback, since I consider the old fallbacks to be a mistake. They are hard to handle correctly and almost double data usage for almost no benefit. They are also very disliked in my experience, since it leads to a lot of spam, if someone edits a large message multiple times (which happens far more often than editing a short message). I think there should be no fallback, since just adding an asterisk in front of the message content provides marginal value and at least as much annoyance and bloat. (see also #2589)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So a short summary from my experience with this fallback:
Every client currently implements the fallback wrong. This MSC specifically states not to send the reply fallback in body. In practice every client does. For a short time it was fixed in Riot/Web, but it immediately regressed again (like 2 weeks or so after). Other clients never implemented it correctly, like RiotX.
I was corrected that the reply fallback should only be stripped from the new_content. That of course breaks every client that implements the current spec including replies, but doesn't implement edits. Which makes the fallback worthless half of the time.
This is a direct continuation of the issues the reply fallback has. The reply fallback led to multiple security relevant issues in Riot/Web. It could be used to impersonate other users multiple times and at some point you could even inject arbitrary html into events using it. The edit fallback is simpler, but why risk the same issues.- In practice clients only add a * in front of the edited body. This may be something IRC people understand, but it is horrible to new users. It makes no sense to them. A sed expression also doesn't explain anything to them. Just dropping new_content would be minimally more confusing and makes implementation easier.
m.new_content
makes implementation in statically typed languages harder. With normal events you just need 1 type for each type of content. If you want to handle edits properly now, you need to allow nesting different content types in each content type. There are many ways to solve that, but it is not trivial and makes handling edits unnecessarily complex.
Really, we shouldn't repeat the same mistake again. Reply fallbacks were bad, so don't add new ones. Especially if they are just there to waste everyones bandwidth.
there are multiple replacement events, the server must consistently choose | ||
the same one as all other servers. | ||
|
||
* `m.reference` - Intended in future for handling replies and threading, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving the bikeshed discussion from #2589 (comment) over to here.
I'm not entirely convinced by the relationship name (m.reference
) and I'd prefer to have m.reply
even if it happens to be unnecessarily specific in some contexts later on (which I believe is unlikely; see also what happened to In-Reply-To and References in e-mail).
Reply threads are distinct enough from other kinds of threading (label-based (#2326); continuation-style a la Twitter chains) for clients to want to treat them differently. #2326, notably, doesn't use m.reference
(and, arguably, m.reference
would anyway look alien there). It's also more difficult to defend that a "reference" should be singular, unlike a "reply" which is generally expected to revert to one specific prior piece of discussion.
@@ -0,0 +1,993 @@ | |||
# Proposal for aggregations via relations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ara4n says:
We should spec that emoji reactions should be sent with the colourful variation-16 selector when applicable
see element-hq/element-web#13926 and element-hq/element-web#13586 and friends
@@ -0,0 +1,993 @@ | |||
# Proposal for aggregations via relations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also the spec rule: #2635
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The push rule is tracked in this separate MSC, which seems to have been wedged in the state of saying it should be folded into MSC1849.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😭
} | ||
``` | ||
|
||
## End to end encryption |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are several issues with encrypting the key of a reaction to aggregate it. Soru will briefly recap her current understanding of this proposal off of which she builds this comment:
- We want to aggregate based on
event_id
,rel_type
ANDkey
- We don't want the
key
to be cleartext as that leaks potentially valuably content - Assuming Alice, Bob and Claire are in a room and Claire sends a message that Alice wants to react to, Alice takes the same AES key Claires used to encrypt that specific message to generate an encrypted key blob, that the server won't be able to understand.
- Since both Bob and Claire have the AES key, they can make sense of the blob
This raises the following attack scenario:
- Alice, Bob and Claire are all in a room
- Claire sends a message
- Alice reacts to it, creating the key blob that the server can't understand. She does, however, send the message as Bob (we can't trust the server, so Alice might have Bobs access token or something)
- The reaction will appear as if sent from bob and Claire won't be able to tell the difference. Alice successfully fooled Bob and Claire
Soru can't think of any real way how to fix this. Perhaps this raises the question if events in e2ee rooms should only be aggregated by event_id
and rel_type
, or, even better, just by event_id
as that leaks less meta-information.
A reaction could then look like following:
The encrypted context would be as following:
{
"type": "m.reaction",
"content": {
"m.relates_to": {
"rel_type": "m.annotation",
"key": "🦊"
}
}
}
The cleartext content would be as following:
{
"type": "m.room.encrypted",
"sender": "@alice:example.org",
// things for a normal event
"content": {
"algorithm": "m.megolm.v1.aes-sha2",
"ciphertext": "<cipertext>",
"device_id": "<device id>",
"sender_key": "<sender key>",
"session_id": "<session id>",
"m.relates_to": {
"event_id": "$some_event"
}
}
}
The client could then merge the m.relates_to objects and can figure out things for itself. The server is still able to aggregate which event relates to which one.
Does this mean more work for the client? Yes, yes it does, at the gain of preventing replay and impersonation attacks. We constantly have such trade-offs in e2ee rooms: Be it no server-side search, or push notifications not being able to trigger based on keywords. Such a tradeoff seems rather appropriate here, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a few tradeoffs to consider here:
- Encrypting the relation type:
- This will make pushes more noisy. If you encrypt the relation type for reactions, pushes in e2ee rooms won't be able to distinguish reactions from replies. So every reaction will cause a notification and push to the device, which may increase power use on mobile devices, since reactions tend to be unimportant. On the other hand, maybe reactions don't happen that often anyway in encrypted rooms, so that could be fine.
- You are losing endpoints that aggregate by relation type -> that's totally fine imo
- Encrypting the key
- You are losing the additional info in unsigned, that can make reaction initially easier to handle. On the other hand, you don't want to rely on unsigned in encrypted rooms anyway, to prevent serverside forgery. So I think that is actually a benefit!
- You lose serverside aggregated results for reactions via endpoints. I think that is totally fine too. You don't want to rely on the server to do that anyway in e2ee rooms at least.
I think most of that is alright for e2ee rooms and it may even be a benefit. The unsigned stuff and endpoints are mostly helpful for thin clients. Clients that support encryption already cache most events anyway, so handling reactions client side usually proves to be simpler. With that we could actually simplify the aggregation logic by a lot. We could just provide a single endpoint, that returns all events related to a specific event id and let clients sort through the events they are interested in. We may provide additional events for large, unencrypted rooms to target specific relation types, since there the list may be too large. It shouldn't matter that much in smaller, encrypted rooms.
With that, we could reduce reactions to the following:
{
"type": "m.reaction",
"content": {
"m.relation": {
"type": "m.annotation",
"key": "🦊",
"event_id": "$someevent so that we can be sure this is matches what the server tells us"
},
"m.relations": ["$someevent"]
}
In public rooms you would peek into the m.relation
key in content
to give access to the additional aggregation stuff, in encrypted rooms you would fall back to just aggregating everything depending on their event id. It would also make reactions saner, since their content is actually the key, so it should be in content imo. (Note that I'm still partial to having multiple relations and renaming the keys to not break clients, so m.relations is actually an array and you may want to structure the relation in content differently too.)
All in all I don't think much is lost by encrypting everything but the related to event id and it would reduce metadata leaks by a lot and make replay and reaction targets actually verifiable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will make pushes more noisy.
Wild idea: how about a flag to events is added if they should trigger a push notification or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, the idea of splitting up the one object into an encrypted and unencrypted part is to not introduce two places the server has to look for to aggregate events.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wild idea: how about a flag to events is added if they should trigger a push notification or not?
That sounds easily abusable... I think any encrypted event should just trigger a push and the client should then decide how to handle that (unless the whole room was muted).
Also, the idea of splitting up the one object into an encrypted and unencrypted part is to not introduce two places the server has to look for to aggregate events.
Ah, I guess that makes sense. I forgot relates_to is sent in content. Yeah, I think your json is better. I'm guessing your ciphertext includes the full relates_to, so that clients don't need to merge it and it is signed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I guess that makes sense. I forgot relates_to is sent in content. Yeah, I think your json is better. I'm guessing your ciphertext includes the full relates_to, so that clients don't need to merge it and it is signed?
Sorus suggestion actually did not include the full relates_to, but it would be trivial (and maybe good!) to add it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds easily abusable... I think any encrypted event should just trigger a push and the client should then decide how to handle that (unless the whole room was muted).
Wouldn't be much difference than sending it still as m.reaction
- only that if we eventually introduce more such events, then the server doesn't know which one got triggered.
It could even still trigger a push notification, and the client uses that flag to maybe not show it
Or the client is so advanced even that it is able to decrypt the event fully anyways and then no flag is needed and you can still determine to drop push notifications for reactions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or the client is so advanced even that it is able to decrypt the event fully anyways and then no flag is needed and you can still determine to drop push notifications for reactions.
That sounds great if there was a general mechanism for it. I would love it if I could simply mark my messages as unimportant, don't disturb the other party!
I've split this up into #2674 (general event shape), #2675 (serverside aggregations), #2676 (edits), and #2677 (reactions), and created matrix-org/matrix-spec#660 as an open issue for encryption. |
This one should probably be closed, then, and the original post updated? |
Aggregations MSC for edits & reactions to replace #441
Rendered
As of July 2020, this MSC has been split into: #2674 (general event shape), #2675 (serverside aggregations), #2676 (edits), and #2677 (reactions), and matrix-org/matrix-spec#660 as an open issue for encryption.