-
-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize recursive ls #2083
Optimize recursive ls #2083
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great start.
Some first comments (I know it is a draft)
Please also fix the conflicts
and please create a
src/uu/ls/BENCHMARKING.md
to document this like src/uu/sort/BENCHMARKING.md
don't hesitate to add more tests too |
- PathData will hold Path related metadata / strings that are required frequently in subsequent functions - All data is precomputed and cached and subsequent functions just use cached data
- Cache filename and sort by filename instead of full path - Cache uid->usr and gid->grp mappings
f059880
to
b4af0d0
Compare
@sylvestre thank you for the early review, I've addressed your comments:
I've added a small test for error path related to width specifically (it was a low hanging fruit for error path that was not covered as per local code coverage report) Please let me know if you think any specific cases I should try to cover
Yes, that was a great idea - I have created one to start with.
Conflicts are now fixed Marking this as ready, please revisit this for review/merge. Thanks! |
src/uu/ls/src/ls.rs
Outdated
@@ -16,6 +16,7 @@ extern crate uucore; | |||
mod quoting_style; | |||
mod version_cmp; | |||
|
|||
use cached::proc_macro::cached; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please fix the warning :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've fixed the warning - though I see failure in a CI/CD build for MinRustV: https://github.com/uutils/coreutils/pull/2083/checks?check_run_id=2373934810
Any suggestions on how to handle that would be welcome.
Maybe importing the cached just for caching uid2usr and gid2grp can be avoided and we could get away with something simpler - but I did not want to complicate things too much and this seemed like a textbook case for drop-in use of cached
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think MinRustV fails because cached
depends on async-mutex
which seems to be using a feature added in Rust 1.41 (our current MSRV is 1.40). Here is the PR for that feature: rust-lang/rust#64325
Error: --> /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/async-mutex-1.4.0/src/lib.rs:292:31
|
292 | pub fn try_lock_arc(self: &Arc<Self>) -> Option<MutexGuardArc<T>> {
| ^^^^^^^^^^
|
= note: for more information, see https://github.com/rust-lang/rust/issues/44874
= help: consider changing to `self`, `&self`, `&mut self`, `self: Box<Self>`, `self: Rc<Self>`, `self: Arc<Self>`, or `self: Pin<P>` (where P is one of the previous types except `Self`)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was looking forward to this PR and it does not disappoint! Very nice work! I think this also enables more optimizations in the future. I put some questions/suggestions below. I don't think these necessarily need to be changed in this PR, but I'm interested to hear whether you think they might be worthwhile.
Sort::Name => entries.sort_by_key(|k| k.to_string_lossy().to_lowercase()), | ||
Sort::Version => entries.sort_by(|a, b| version_cmp::version_cmp(a, b)), | ||
Sort::Name => entries.sort_by_key(|k| k.file_name.to_lowercase()), | ||
Sort::Version => entries.sort_by(|k, j| version_cmp::version_cmp(&k.p_buf, &j.p_buf)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
version_cmp
is actually one of the worst offenders of unnecessary conversions (which I know because I wrote it). It converts its inputs to String
via to_string_lossy
and disregards the Path
. So another easy win is to pass the lossy_string
to version_cmp
instead of the Path
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm - I did not get to check the implementation for version_cmp - we should definitely do this as it is a low hanging fruit.
I'll do it as part of this PR if I get the chance.
/// Caching data here helps eliminate redundant syscalls to fetch same information | ||
struct PathData { | ||
// Result<MetaData> got from symlink_metadata() or metadata() based on config | ||
md: std::io::Result<Metadata>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to return this Result
on PathData::new
? The fs::metadata
call fails if the path does not exist or the user does not have permissions to read the metadata, in which case I think we want to log that and then not consider the file for any later calls. So the new method would look rougly like this:
impl PathData {
fn new(p_buf: PathBuf, config: &Config, command_line: bool) -> std::io::Result<Self> {
let md = get_metadata(&p_buf, config, command_line)?;
...
Ok(Self{...})
}
}
And then we don't have to check for errors from the metadata later on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's another good idea.
Need to check where / when the error is expected to be logged, when we print the details of the file (after sorting) or if it is fine to log all errors upfront.
For metadata, I had another optimization in mind where if the config does not include any options that require metadata details, we could skip get_metadata call altogether (type for the md field could then become Option)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah, that's a neat idea! An easy way to implement this might be to make a method PathData::md()
that retrieves the metadata on the first call and then stores it in the struct and reuses that value when it's called again.
match md { | ||
Err(e) => { | ||
let filename = get_file_name(i, strip); | ||
let filename = get_file_name(&i.p_buf, strip); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think this could just be
let filename = i.file_name;
or is it too hard to implement strip
then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another nice suggestion 👍
Actually, I think all get_file_name could be updated to be something like (based what all current use-cases look like)
fn get_file_name(p: &PathData, strip: bool) -> String {
if strip {
p.file_name
} else {
p.lossy_string
}
}
cached will be only used for unix currently as current use of caching gid/uid mappings is only relevant on unix
@tertsdiepraam I think you are the best reviewer for this. please let me know when you think we can merge it :) |
Implement caching using HashMap and lazy_static
Rust 1.40 did not support map_or for result types
@tertsdiepraam I've updated to remove the dependency on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed! We can further optimize this in other PR's and it's definitely worth merging this and not do too much in this PR. All checks are green too, so this can be merged in my opinion. @sylvestre
match uid_cache.get(&uid) { | ||
Some(usr) => usr.clone(), | ||
None => { | ||
let usr = entries::uid2usr(uid).unwrap_or_else(|_| uid.to_string()); | ||
uid_cache.insert(uid, usr.clone()); | ||
usr | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could simplify this a bit with the HashMap::entry
method:
match uid_cache.get(&uid) { | |
Some(usr) => usr.clone(), | |
None => { | |
let usr = entries::uid2usr(uid).unwrap_or_else(|_| uid.to_string()); | |
uid_cache.insert(uid, usr.clone()); | |
usr | |
} | |
} | |
uid_cache.entry(&uid).or_insert_with(|| | |
entries::uid2usr(uid).unwrap_or_else(|_| uid.to_string()) | |
).clone() |
(haven't tested whether all the types and ownership work out)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had tried something like this, but it was not caching for reasons I didnt get a chance to debug.
I will give it another shot in a follow-up, worth switching to this for the benifit of cleaner code.
Thanks!
well done :) |
Interesting, I have different results than you. On firefox source code:
it is an improvement but still much slower than ls :) |
@sylvestre I also tried it on my machine with difference combinations of 'ls -al -R' ran
2.05 ± 0.05 times faster than '../coreutils/target/release/coreutils ls -al -R'
'ls -R' ran
5.03 ± 0.03 times faster than '../coreutils/target/release/coreutils ls -R'
'ls -a -R' ran
5.25 ± 0.07 times faster than '../coreutils/target/release/coreutils ls -a -R'
'ls -l -R' ran
2.18 ± 0.03 times faster than '../coreutils/target/release/coreutils ls -l -R' (this is also on the Firefox source code) |
@sylvestre for difference between results, I imagine that could be because I'm trying this on wsl2 rather than bare linux and / or other filesystem / hardware differences (these kind of differences will affect what is the bottleneck and hence the overall results I believe) I do get results along these lines (note this is on linux source tree - not sure of how that compares with firefox sources, but I've tried on larger directories too with more or less similar results)
There are some more optimization opportunities that this PR exposes - some are listed in comments from @tertsdiepraam 's review that I couldn't get to in this PR. As for the difference with vs without -a / -l, I imagine that maybe because GNU skips fetching information that is not really required (eg. GNU ls will do no stat calls with just -R but the way things are working after this PR, we do it once, always for every path regardless of whether we will require it down the line or not) We may be able to do the same following one of @tertsdiepraam 's suggestions of adding methods to PathData structure that will lazyily fetch information as opposed to the way this PR introduces of fetching all information upfront. I am open to keeping the linked issue open and getting into some of these optimizations with another PR if that sounds okay to you. |
Ok, thanks! |
Series of patches for optimizing recursive ls
Mostly resolves #2069
This will require an update to sync with latest de-reference related changes so marking as draft.
Brief results with the patch locally:
BEFORE:
AFTER: