Make api::icon::is_valid_domain more robust/accurate #4070
Replies: 4 comments 13 replies
-
I think this is a bit too much. The validation of the domain it's done here already Line 140 in f863ffb That has all the validation tooling already. Is there a specific reason for this? Are running into an issue? |
Beta Was this translation helpful? Give feedback.
-
Thx, I'll have a look at it too see if this also works for all cases i have locally. Performance is nice indeed, but security and validity are more of a trigger for me. |
Beta Was this translation helpful? Give feedback.
-
If the proposed code is rejected since you want to allow non-ASCII characters, then the current code still should be amended since it even violates RFC 2181 which is obviously the most permissive definition of "domain" since it's the DNS spec. Specifically, the current code currently allows domains with labels of lengths longer than 63 and also allows domains that are longer than the maximum length possible in representation format: 253 when the last character is not fn main() {
let illegal_label = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";
// This does _not_ `panic` but should since RFC 2181 max label lengths are violated.
assert!(illegal_label.len() == 64 && is_valid_domain_cur(illegal_label));
let mut illegal_domain = String::with_capacity(255);
let longest_legal_label = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";
illegal_domain.push_str(longest_legal_label);
illegal_domain.push('.');
illegal_domain.push_str(longest_legal_label);
illegal_domain.push('.');
illegal_domain.push_str(longest_legal_label);
illegal_domain.push('.');
illegal_domain.push_str(longest_legal_label);
// This does _not_ `panic` but should since RFC 2181 max length of a domain in
//representation format is violated.
assert!(longest_legal_label.len() == 63 && illegal_domain.len() == 255 && is_valid_domain_cur(illegal_domain.as_str()));
} |
Beta Was this translation helpful? Give feedback.
-
Calling the parse function and also checking if it starts with two dots doesn't rule out the check. If parse fails or the custom char start/end/contains check is valid it will return false. Domains aren't always punycode and thus we need to allow valid Unicode characters which are allowed to be used which is exactly what I never said that we adhere to the rfc, as you can see in our code. I only mentioned it because that doesn't allow the special characters. I also mentioned why allowing I also didn't say it doesn't matter anyway, i mentioned that most are coming from the clientsb from entries added in a normal way. But people can still call that endpoint manually or enter an invalid domain into the url field. We still should try to filter those out. Also, i mentioned several times i would take a look at this and see if we can improve. That doesn't mean today or maybe not even tomorrow. So closing this is in my opinion unnecessary and not very productive way of working. Especially when i mention why we do some checks and why we do not want or need specific characters to be allowed for a domain. Anyways, I'm still glad you mentioned this, any fresh look to things done longer time ago is good. Thx for that. |
Beta Was this translation helpful? Give feedback.
-
api::icon::is_valid_domain
can be made more accurate without too much extra code. I think the only RFC that should matter is RFC 2181. While RFCs 1123 and 5891 are nice, neither one is a subset of the other; and more relevantly, browsers like Firefox are more permissive than both of those RFCs. For that reason I propose changing the logic of this function to allow whatever URL web browsers allow (sans those that are transformed into Punycode). More specifically, the following Unicode scalar values should be allowed as they are allowed by Firefox and likely other browsers:!
,$
,&
,'
,(
,)
,+
,,
,-
,0
–9
,;
,=
,_
,`
,A
–Z
,a
–z
,{
,}
,~
,.
.Additionally the check on length is not strict enough. While RFC 2181 allows domain names to be as many as 255 bytes in length, that is the wire format and not the representation format that we are interested in. Specifically, the maximum length in representation format (ignoring server-specific escape characters) is 254 iff the last character is
.
; otherwise the max length is 253.diff:
Beta Was this translation helpful? Give feedback.
All reactions