Poor performance with large files #3204

Open
pamburus opened this issue Feb 8, 2025 · 3 comments
Labels
feature-request New feature or request

Comments

pamburus commented Feb 8, 2025

I am experiencing extremely poor performance with large assembly source files.
It takes 4 minutes for bat to process a 45-MiB assembly source file.

⇛ time bat target/release/deps/bench-f36188efc4eb690a.s --color always >/dev/null
bat target/release/deps/bench-f36188efc4eb690a.s --color always > /dev/null  237.31s user 10.16s system 102% cpu 4:00.49 total

⇛ lsd -lah target/release/deps/bench-f36188efc4eb690a.s
.rw-r--r-- pamburus staff 45 MB Sat Feb  8 18:30:04 2025 target/release/deps/bench-f36188efc4eb690a.s

⇛ sysctl machdep.cpu
machdep.cpu.cores_per_package: 10
machdep.cpu.core_count: 10
machdep.cpu.logical_per_package: 10
machdep.cpu.thread_count: 10
machdep.cpu.brand_string: Apple M1 Max

The average processing speed seems to be about 190 KiB per second.
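
The throughput figure can be reproduced from the numbers above. This is a minimal sketch; the helper name `throughput_kib_per_s` is made up for illustration, and it assumes the "45 MB" reported by lsd means 45 MiB and that the 4:00.49 wall-clock time (240.49 s) is the relevant duration.

```rust
/// Average throughput in KiB/s for `size_mib` MiB processed in `seconds` seconds.
/// Hypothetical helper, not part of bat.
fn throughput_kib_per_s(size_mib: f64, seconds: f64) -> f64 {
    // 1 MiB = 1024 KiB
    size_mib * 1024.0 / seconds
}
```

With `size_mib = 45.0` and `seconds = 240.49` this gives roughly 191.6 KiB/s, consistent with the "about 190 KiB per second" estimate.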

pamburus added the feature-request label Feb 8, 2025
pamburus changed the title from "Bad performance on large files" to "Poor performance with large files" Feb 8, 2025
keith-hall (Collaborator) commented

Related: #304 (comment)

pamburus (Author) commented Feb 9, 2025

Yes, this issue seems to be related, but the reasoning about performance bottlenecks in #304 does not seem to match reality.

Here is the reality:
Image

About 92% of the time is spent in syntect::easy::HighlightLines::highlight_line, which in turn spends about 97% of that time in parsing.

Only about 3% of the time is spent in write_fmt, so buffering the output at this point will not make any visible difference.

It looks like the performance problem needs to be redirected to syntect. It will probably not be easy to fix.
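
To put the profile numbers in perspective, here is a rough Amdahl's-law style estimate. It is only a sketch: `max_speedup` is a hypothetical helper, and it assumes the measured fractions (92% in highlight_line, 97% of that in parsing, 3% in write_fmt) are representative of the whole run.

```rust
/// Upper bound on the overall speedup if a stage that takes `fraction`
/// of the total run time were made completely free (Amdahl's law with
/// an infinitely fast stage). Hypothetical helper for illustration.
fn max_speedup(fraction: f64) -> f64 {
    assert!((0.0..1.0).contains(&fraction), "fraction must be in [0, 1)");
    1.0 / (1.0 - fraction)
}
```

Under these assumptions, `max_speedup(0.03)` is about 1.03, so even free output could save only around 3% of the run time. Parsing accounts for roughly 0.92 * 0.97 ≈ 0.89 of the total, and `max_speedup(0.92 * 0.97)` is about 9.3, which is why the parser is the place where an improvement would be noticeable.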

pamburus (Author) commented Feb 9, 2025

As an experiment, I tried replacing the output writer with std::io::sink.
Surprisingly, this reduced the total processing time by 13% instead of 3%, but the bottleneck remains the same.
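
For readers who want to try a similar experiment, the idea can be sketched with a rendering loop that is generic over its writer, so that the real output can be swapped for std::io::sink. This is not bat's actual code: `render_lines` is a hypothetical stand-in, and plain line writing takes the place of the highlighted output.

```rust
use std::io::{self, Write};
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a rendering loop: writes every line to `out`
/// and returns how long the whole loop took.
fn render_lines<W: Write>(lines: &[&str], out: &mut W) -> io::Result<Duration> {
    let start = Instant::now();
    for line in lines {
        // In the real program this would be the formatted, colored line.
        writeln!(out, "{line}")?;
    }
    out.flush()?;
    Ok(start.elapsed())
}
```

Calling `render_lines(&lines, &mut io::sink())` and then `render_lines(&lines, &mut io::stdout().lock())` on the same input and comparing the two durations gives a rough estimate of how much time the output path itself costs.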

2 participants