-
Notifications
You must be signed in to change notification settings - Fork 896
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix versionsort chunk split on non-ASCII numerics #6407
fix versionsort chunk split on non-ASCII numerics #6407
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Style Guide mentions that numeric chunks are defined by ascii digits so using is_ascii_digit
is definitely the right function call to use.
I've left a few comments inline. I think we should rework the current test case and also add a style_edition=2015
case so that we can compare the sort order when dealing with non ascii numeric characters.
fn main() { | ||
first_print(); | ||
second_print(); | ||
third_print(); | ||
|
||
assert_eq!("print๙msg".cmp("printémsg"), Ordering::Greater); | ||
} | ||
|
||
/// '๙' = 0E59;THAI DIGIT NINE;Nd; | ||
mod print๙msg { | ||
pub fn print() { | ||
println!("Non-ASCII Decimal_Number") | ||
} | ||
} | ||
|
||
/// '0' = 0030;DIGIT ZERO;Nd; | ||
mod print0msg { | ||
pub fn print() { | ||
println!("ASCII Decimal_Number") | ||
} | ||
} | ||
|
||
/// 'é' = 00E9;LATIN SMALL LETTER E WITH ACUTE;Ll; | ||
mod printémsg { | ||
pub fn print() { | ||
println!("Lowercase_Letter") | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove any code that isn't an import since it's not needed for the test case. Though, It would be great to keep the explanatory comments you added about '๙'
, '0'
, and 'é'
to help document the test case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated and expanded on the comments to explain why they sort in that order.
use std::cmp::Ordering; | ||
use print๙msg::print as first_print; | ||
use print0msg::print as second_print; | ||
use printémsg::print as third_print; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
version sort is only be applied when using style_edition=2024
. You'll need to configure that for this test as follows:
use std::cmp::Ordering; | |
use print๙msg::print as first_print; | |
use print0msg::print as second_print; | |
use printémsg::print as third_print; | |
// rustfmt-style_edition: 2024 | |
use std::cmp::Ordering; | |
use print๙msg::print as first_print; | |
use print0msg::print as second_print; | |
use printémsg::print as third_print; |
You can read more about the comment configuration in test cases here.
Also, can you also add a style_edition=2015
import sorting test case so that we can compare the pre version sort ordering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added. The 2015 edition actually sorts the same way since U+0030 < U+00E9 < U+0E59
, so the bug only affected 2024 edition (and can't be reproduced on earlier versions because ASCII digits are the earliest Unicode numerics).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding that additional test case!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your help on this one!
I think we're good to go here, but just want to do a quick sanity check by running the updated rustfmt on some larger rust codebases. Here's a link to the diff check job. Edit: Job ran successfully ✅ |
Description
Fixes a bug with 2024 Edition versionsort, where a string chunk will split on all numeric characters. For example, this would cause imports containing non-ASCII numeric characters to be incorrectly sorted.
Unformatted (Playground link):
Formatted imports (nightly 2024-11-29, Playground link):
Here,
printémsg
should be sorted beforeprint๙msg
, but since๙
is a non-ASCII numeric, that import is split into two short string chunks, which will sort before the one longer chunk.Changes & Notes
char::is_numeric()
check with the correctchar::is_ascii_digit()
call inVersionChunkIter::parse_str_chunk()
.rustfmt
against the repo and found no changes, so there isn't a separate commit for that.