i think this test is incorrect
assert_eq!(validator.validate_pattern("[\\c0-�]", false), Ok(()));
that range is out of order
the odd thing is V8 accepts it, but the other tools I tried don't
I think there's something weird going on with a test in dlint's regex validator tests: [🌷-🌸] is checked for being invalid, but running dlint on a file containing it doesn't yield any errors. I still don't know why V8 does not accept it. I think it has to do with UTF-16 code units, because the range is valid when compared by Unicode code points.
@bartlomieju yeah, according to the spec, if the regex doesn't have /u then the chars are UTF-16 code units; if it's parsed with /u then they are 32-bit code points (Rust chars)
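To make the difference concrete, here is a minimal sketch of how the endpoints of a two-character class range such as [🌷-🌸] come out in each mode. The function names `range_endpoints` and `range_in_order` are hypothetical and are not taken from the validator; the sketch only assumes that without /u the `-` ends up between the last code unit of the left char and the first code unit of the right char.

```rust
/// Endpoints that a class range `lo-hi` effectively compares.
/// Hypothetical helper for illustration, not part of the validator.
fn range_endpoints(lo: char, hi: char, unicode: bool) -> (u32, u32) {
    if unicode {
        // With /u, each char is one whole code point.
        (lo as u32, hi as u32)
    } else {
        // Without /u, the pattern is a sequence of UTF-16 code units, so the
        // `-` sits between the last unit of `lo` and the first unit of `hi`.
        let mut lo_buf = [0u16; 2];
        let mut hi_buf = [0u16; 2];
        let lo_units = lo.encode_utf16(&mut lo_buf);
        let hi_units = hi.encode_utf16(&mut hi_buf);
        let last_of_lo = lo_units[lo_units.len() - 1];
        let first_of_hi = hi_units[0];
        (u32::from(last_of_lo), u32::from(first_of_hi))
    }
}

/// A range is in order when its start does not exceed its end.
fn range_in_order(lo: char, hi: char, unicode: bool) -> bool {
    let (start, end) = range_endpoints(lo, hi, unicode);
    start <= end
}
```

For 🌷 (U+1F337) and 🌸 (U+1F338), the non-/u endpoints are the low surrogate 0xDF37 and the high surrogate 0xD83C, so the range is out of order; with /u the endpoints are 0x1F337 and 0x1F338, which is fine.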
I don't think it's very hard to fix the validator to treat code points right
as far as I know, it should be fine to just call encode_utf16() on the char, then if it's multiple code units, yield the first one
although UTF-8 makes this... weird
because I don't think it's possible in UTF-8 to partially advance over a multi-unit char without ending up in the middle of the char, off a char boundary. :sweating:
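A rough sketch of the encode_utf16() idea, plus one possible way around the char-boundary problem. Both helper names (`first_utf16_unit`, `to_code_units`) are hypothetical, and converting the whole pattern up front is an assumption about how it could be done, not what the validator does.

```rust
/// First UTF-16 code unit of a char: the unit itself for BMP chars,
/// the high surrogate for chars outside the BMP. Hypothetical helper.
fn first_utf16_unit(c: char) -> u16 {
    let mut buf = [0u16; 2];
    c.encode_utf16(&mut buf)[0]
}

/// Assumed workaround: instead of stepping through the UTF-8 source
/// (where a position inside a char is not a valid str boundary),
/// convert the pattern to code units once and index into that.
fn to_code_units(pattern: &str) -> Vec<u16> {
    pattern.encode_utf16().collect()
}
```

With this, "[🌷-🌸]" becomes seven code units (0x5B, 0xD83C, 0xDF37, 0x2D, 0xD83C, 0xDF38, 0x5D), and a reader can advance one unit at a time. The trade-off is that positions are then code-unit indices, so error spans would need mapping back to byte offsets in the original source.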
I think for now I'm going to just keep 32-bit code points since:
- 16-bit code units are hard to get working correctly
- most people don't put multi-unit chars in their regex
- it makes error reporting easier for rules like no-misleading-character-class
if that's fine with you
Reported by @RDambrosio016 on Discord: