Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rust: amortize allocations #2

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

BurntSushi
Copy link

This is a PoC for only the Rust code that shows how to optimize it a bit
more by both amortizing allocation and reusing the result of UTF-8
decoding instead of repeating it for each iteration of the inner loop.

The results on my machine:

$ yarn bench
yarn run v1.22.4
$ node run.js bench
go: 1.573124
javascript: 2.94
rust: 1.295147587
Done in 6.17s.

Of course, this isn't a fair comparison, since this doesn't update the
Go code to reuse allocation either. Although it wouldn't be surprising
if its GC was doing this automatically anyway.

I'm not necessarily submitting this with a request to merge it, but
rather, just providing more data.

This is a PoC for only the Rust code that shows how to optimize it a bit
more by both amortizing allocation and reusing the result of UTF-8
decoding instead of repeating it for each iteration of the inner loop.

The results on my machine:

    $ yarn bench
    yarn run v1.22.4
    $ node run.js bench
    go: 1.573124
    javascript: 2.94
    rust: 1.295147587
    Done in 6.17s.

Of course, this isn't a fair comparison, since this doesn't update the
Go code to reuse allocation either. Although it wouldn't be surprising
if its GC was doing this automatically anyway.

I'm not necessarily submitting this with a request to merge it, but
rather, just providing more data.
@mlbright
Copy link

mlbright commented May 4, 2020

I'm on a MacBook Pro 2017, and I applied #3 because it fixes and improves the Golang version.
Then I added @BurntSushi's code, then switched to jemallocator.

  • master
go: 2.082300
javascript: 3.823
rust: 3.291851612
go: 1.684807
javascript: 3.861
rust: 3.337344558
go: 1.688000
javascript: 3.84
rust: 1.876212421
go: 1.685848
javascript: 3.845
rust: 1.655761329

MacOS's allocator could still be partly to blame.

@christianscott
Copy link
Owner

Thanks for making the comparisons @mlbright. Even with #3 it's still not fair to compare this version to the go implementation unfortunately. You would need to amortize the slice allocations like @BurntSushi has done here – I imagine placing them on a struct like he has done would work.

@mlbright
Copy link

mlbright commented May 4, 2020

That's true, but if we just focus on Rust for a moment, I think the memory allocator on the Mac makes a big difference. I have a similar language comparison for a less useful algorithm. Ignore that my data structures are not ideal and that the comparison is again not fair. The performance difference as a result of changing allocators is dramatic.

➜  sudoku-norvig-rs git:(slow) cargo run --release
   Compiling sudoku-norvig-rs v0.1.0 (/Users/brightm/Documents/sudoku-norvig-rs)
    Finished release [optimized] target(s) in 1.79s
     Running `target/release/sudoku-norvig-rs`
Solved 50 of 50 easy puzzles (avg. 0.0008 secs (1251.64 Hz), max 0.0019 secs).
Solved 95 of 95 hard puzzles (avg. 0.0020 secs (505.09 Hz), max 0.0085 secs).
Solved 11 of 11 hardest puzzles (avg. 0.0013 secs (788.14 Hz), max 0.0020 secs).
Solved 99 of 99 random puzzles (avg. 0.0007 secs (1378.78 Hz), max 0.0011 secs).
➜  sudoku-norvig-rs git:(slow) git checkout -
Switched to branch 'slow-add-jem'
➜  sudoku-norvig-rs git:(slow-add-jem) cargo run --release
    Updating crates.io index
   Compiling sudoku-norvig-rs v0.1.0 (/Users/brightm/Documents/sudoku-norvig-rs)
    Finished release [optimized] target(s) in 3.45s
     Running `target/release/sudoku-norvig-rs`
Solved 50 of 50 easy puzzles (avg. 0.0004 secs (2837.08 Hz), max 0.0004 secs).
Solved 95 of 95 hard puzzles (avg. 0.0010 secs (1011.52 Hz), max 0.0045 secs).
Solved 11 of 11 hardest puzzles (avg. 0.0004 secs (2382.72 Hz), max 0.0005 secs).
Solved 99 of 99 random puzzles (avg. 0.0004 secs (2790.27 Hz), max 0.0005 secs).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants