Clarify if MXIDs are case-sensitive #996

Open
ShadowJonathan opened this issue Mar 15, 2022 · 7 comments
Labels
clarification An area where the expected behaviour is understood, but the spec could do with being more explicit

Comments

@ShadowJonathan
Contributor

Link to problem area: (the spec in general, or more specifically, the absence of explanation regarding this)

Issue
Related to matrix-org/matrix-spec-proposals#3794

The spec does not clarify how case-sensitivity interacts with identifying users.

Precedent from implementations is that MXIDs have a “canonical” casing, but that lookups and other forms of addressing match them case-insensitively.

Conduit has already effectively disabled case-sensitive signups, and converts users on signup to lowercase.

Synapse historically allowed case-sensitive usernames.

The main question here is: given the MXIDs @Jo:matrix.org and @jo:matrix.org, do these point to the same user, or can they point to two different users?

(I understand the historical nature of this problem, but I think it’s useful to put this problem to bed once and for all by clarifying if MXIDs are case-sensitive or not.)


keywords: case insensitive, lowercase, case-sensitive

@ShadowJonathan ShadowJonathan added the clarification An area where the expected behaviour is understood, but the spec could do with being more explicit label Mar 15, 2022
@turt2live
Member

The spec should already be very clear on this - what do the appendices say?

Users differing only by case are two different users. Synapse collapses this down at registration to try and check for potential logical conflicts, and on login if there's no conflicts then I believe it maps to the correct user ID. It can do this because the username does not need to be the localpart - it just happens to be in many cases.
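The registration behaviour described above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory `Registry` class (not Synapse's actual data model or code) that stores historical localparts verbatim, rejects a new registration when any stored localpart differs from the request only by case, and otherwise creates a lowercase user ID.

```python
class UsernameTakenError(Exception):
    """Raised when a requested localpart collides with an existing one."""


class Registry:
    """Hypothetical in-memory user store, for illustration only."""

    def __init__(self, server_name: str) -> None:
        self.server_name = server_name
        # Maps the lowercased localpart to every stored casing of it.
        self._by_folded: dict[str, set[str]] = {}

    def add_legacy(self, localpart: str) -> None:
        """Record a pre-existing localpart verbatim, casing included."""
        self._by_folded.setdefault(localpart.lower(), set()).add(localpart)

    def register(self, requested: str) -> str:
        """Register a new user, always creating a lowercase localpart."""
        folded = requested.lower()
        # Any stored casing of the same name counts as a conflict.
        if self._by_folded.get(folded):
            raise UsernameTakenError(requested)
        self._by_folded[folded] = {folded}
        return f"@{folded}:{self.server_name}"
```

With a historical `Jo` on record, `register("jo")` raises `UsernameTakenError`, while `register("Alice")` returns `@alice:example.org` on a registry for `example.org`.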

@ShadowJonathan
Contributor Author

ShadowJonathan commented Mar 15, 2022

> Synapse collapses this down at registration to try and check for potential logical conflicts, and on login if there's no conflicts then I believe it maps to the correct user ID.

Yes, but this is an implementation detail, and the spec does not explicitly spell out that this is correct (in this case).

As an example of where this matters: I was considering making ruma's == for UserId (MXID) case-insensitive, but I realised the spec does not explicitly say that this is fine to do, hence this issue.

This cascades into a whole range of problems; for example, Conduit might treat @Jo:matrix.org as a different user to @jo:matrix.org, while Synapse would treat them as the same.


> What do the appendices say?

^ "Forbidding upper-case characters [...] is a relatively simple way to ensure that @USER:matrix.org cannot refer to a different user to @user:matrix.org."

It does say this, but it does not say "treat @USER:matrix.org as if it's the same as @user:matrix.org", which is the kind of explicit wording I'd like it to have, if that is what it is implying.

In any case, as an SCT member, can you confirm this is what the spec is implying? And that we can therefore make something like UserId over at ruma compare MXIDs in a case-insensitive way?

@turt2live
Member

The spec does not treat @USER:example.org the same as @user:example.org, which is what it is saying. The login and registration APIs also allow for the implementation-specific detail you're mentioning because the username is disassociated from the user ID, thus allowing someone to theoretically have access to @user, @USER, @USer, etc using a single username - the server just returns the one it wants the client to use.

In events though, @USER is very different from @user - they are entirely different senders.

Which area of the spec are you looking for this information in?
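The distinction drawn in this comment can be sketched as a small user ID type whose equality is exact, so that @USER and @user compare as different senders. The `UserId` class and its `differs_only_by_case` helper are hypothetical names for illustration, not ruma's or any other library's API; comparing the server name exactly is a simplifying assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserId:
    """Hypothetical user ID type; equality is exact, character for character."""

    localpart: str
    server_name: str

    @classmethod
    def parse(cls, raw: str) -> "UserId":
        if not raw.startswith("@") or ":" not in raw:
            raise ValueError(f"not a user ID: {raw!r}")
        # Split on the first colon only, so a port in the server name survives.
        localpart, server_name = raw[1:].split(":", 1)
        return cls(localpart, server_name)

    def differs_only_by_case(self, other: "UserId") -> bool:
        """True for distinct IDs whose localparts match once lowercased."""
        return (
            self != other
            and self.server_name == other.server_name
            and self.localpart.lower() == other.localpart.lower()
        )
```

An application could use a helper like `differs_only_by_case` to flag a possible mix-up (for example, warning before inviting a near-duplicate ID) without ever treating the two IDs as the same user.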

@ShadowJonathan
Contributor Author

> In events though, @USER is very different from @user - they are entirely different senders.

Thanks for clarifying this.

> Which area of the spec are you looking for this information in?

I was generally taking hints from a lot of implementations while looking at the spec, which seems to be saying exactly what you just clarified: they're different users.

This issue came to be because every implementation choice seems to act as if that's not the case, toeing a careful line around "they're the same" and "they're different" in different areas, which confused me.

Though, this clarifies it as a historical wart, one where - due to historical (in)decisions - MXIDs are case-sensitive, and MXIDs with uppercase characters exist in the wild.

However, on the client-server API front, when dealing with a server's own users, or another server's users, it did not seem to fundamentally matter what casing that user has (as is the case with synapse), because it would be automagically coalesced to whatever casing that remote/local user has in their MXID. Similarly to usernames, they'd be "upgraded" to the "canonical" casing by the server somewhere along the way.

What I fail to see in the spec, then, is where these upgrades can happen or are expected to happen, as they do in implementations such as Synapse. In that area, I think the spec is underdocumented.

Because, say, if I submit an invite for @user, and I get an incoming invite in the timeline for @User, an application might not "recognise" this is the same user, as they have different casings, and the server has upgraded this somewhere along the way.

Even if this is not the case today, then it might be useful to at least acknowledge this backwards-compatible behaviour in a way (by documenting cases where "upgrading" to uppercase is allowed or not), or at least build APIs for it, such as what matrix-org/matrix-spec-proposals#3794 originally suggested.

Note: I am speculating about the specific points where Synapse actually handles this automagically; I'm going off memories of instances where Element, Synapse, or other servers "updated"/"upgraded" the casing of an MXID I submitted. It might just as well have been Element masking this behaviour to a degree.

@turt2live
Member

I'm trying to figure out where we'd put the few sentences that are needed to clarify the situation, though I also believe they're already there in the depths of the spec. Is there a particular area you were looking in?

Element (Web) applies artificial restrictions to ensure lowercase user IDs get created, and Synapse in particular further assists in this by preventing new user IDs created by it being upper/mixed case. The login API then tries to map lowercase usernames to mixed/upper case localparts (because in Synapse your username happens to be your localpart - this isn't a requirement in the spec). If Synapse finds that, during login, there are multiple people who could match that username then it requires exact case to be used to continue with login, which is one of the reasons why Element doesn't apply validation to the login page.
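The login mapping described in this comment can be sketched as follows. `resolve_login` is a hypothetical helper written for illustration, not Synapse's actual code: an exact match always wins, a single case-insensitive match is mapped to its stored casing, and an ambiguous username resolves to nothing unless the exact case is supplied.

```python
from typing import List, Optional


def resolve_login(username: str, known_localparts: List[str]) -> Optional[str]:
    """Return the localpart to log in as, or None if there is no usable match."""
    # An exact match always wins, whatever its casing.
    if username in known_localparts:
        return username
    folded = username.lower()
    candidates = [lp for lp in known_localparts if lp.lower() == folded]
    # Exactly one case-insensitive match: map the username onto it.
    if len(candidates) == 1:
        return candidates[0]
    # No match, or several candidates that only exact case could tell apart.
    return None
```

For example, `resolve_login("jo", ["Jo"])` returns `"Jo"`, while `resolve_login("jo", ["Jo", "JO"])` returns `None` because the username is ambiguous.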

@ShadowJonathan
Contributor Author

ShadowJonathan commented Mar 15, 2022

> Element (Web) applies artificial restrictions to ensure lowercase user IDs get created

(Note: I wasn't talking about registration, but about inviting people who have an uppercase letter in their name.)

> Is there a particular area you were looking in?

Not specifically, though as a pointer I could say that any client-server endpoint which deals with MXIDs could qualify.

Though, now that I'm thinking about this, it's possible that synapse could implicitly be doing this on federation endpoints as well (/invite/)

However, at this point I'm unsure about most of the things I'm speculating or assuming to be there. This warrants more investigation on my side to figure out exactly what behaviour Synapse has that fits this description of "upgrading" casings in any join, invite, or user referral (profile requests, etc.) operation.

If I find none, and it turns out that Element has been correcting this quirk the whole time, then I'll close this issue; otherwise I'll post more specific endpoints and their implementation behaviours to help this discussion.

@turt2live
Member

Invites don't do anything to the user ID being invited? I feel you're talking far more broadly than the problem scope here ;)

I'm not aware of any code in Synapse or Element which would casefold user IDs on invite, at least. Happy to be proven wrong, obviously.
