Too slow parallel execution with :crypto.strong_rand_bytes/1 #41

Open
zacky1972 opened this issue Dec 2, 2024 · 5 comments

@zacky1972
Contributor

Hi,

I ran the following benchmark on an M3 Max:

Mix.install([:flow, :benchee])

Benchee.run(%{
  "sequential execution" => fn ->
    1..1_000_000
    |> Enum.map(fn _ -> :crypto.strong_rand_bytes(1000) end)
    |> Enum.map(&Base.encode32(&1, case: :lower))
  end,
  "parallel execution" => fn ->
    1..1_000_000
    |> Flow.from_enumerable()
    |> Flow.map(fn _ -> :crypto.strong_rand_bytes(1000) end)
    |> Flow.map(&Base.encode32(&1, case: :lower))
    |> Enum.to_list()
  end
})

The log of the execution with Erlang/OTP installed by asdf is as follows:

Operating System: macOS
CPU Information: Apple M3 Max
Number of Available Cores: 16
Available memory: 128 GB
Elixir 1.17.3
Erlang 27.1.2
JIT enabled: true

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 14 s

Benchmarking parallel execution ...
Benchmarking sequential execution ...
Calculating statistics...
Formatting results...

Name                           ips        average  deviation         median         99th %
parallel execution            2.59         0.39 s    ±21.20%         0.35 s         0.53 s
sequential execution          0.35         2.89 s     ±3.14%         2.89 s         2.95 s

Comparison: 
parallel execution            2.59
sequential execution          0.35 - 7.46x slower +2.50 s

However, the log of the execution with the community-maintained pre-compiled Erlang/OTP for macOS is as follows:

Operating System: macOS
CPU Information: Apple M3 Max
Number of Available Cores: 16
Available memory: 128 GB
Elixir 1.17.3
Erlang 27.1.2
JIT enabled: true

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 14 s

Benchmarking parallel execution ...
Benchmarking sequential execution ...
Calculating statistics...
Formatting results...

Name                           ips        average  deviation         median         99th %
sequential execution          0.33         3.03 s     ±3.33%         3.03 s         3.10 s
parallel execution           0.186         5.37 s     ±0.00%         5.37 s         5.37 s

Comparison: 
sequential execution          0.33
parallel execution           0.186 - 1.77x slower +2.34 s

The parallel execution of this benchmark seems far too slow, so I am reporting it as an issue.
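
A smaller check without Flow (a sketch, assuming Task.async_stream/3 with near-default options; the module name and iteration count are arbitrary) can help isolate whether the slowdown comes from :crypto.strong_rand_bytes/1 itself rather than from Flow:

# Sketch: time sequential vs. Task-based parallel calls to the same NIF.
defmodule RandBench do
  @n 100_000

  def sequential do
    Enum.each(1..@n, fn _ -> :crypto.strong_rand_bytes(1000) end)
  end

  def parallel do
    1..@n
    |> Task.async_stream(fn _ -> :crypto.strong_rand_bytes(1000) end, ordered: false)
    |> Stream.run()
  end
end

{seq_us, _} = :timer.tc(&RandBench.sequential/0)
{par_us, _} = :timer.tc(&RandBench.parallel/0)
IO.puts("sequential: #{div(seq_us, 1000)} ms, parallel: #{div(par_us, 1000)} ms")

If the parallel version is also slower here, Flow can be ruled out and the difference narrows down to how the crypto NIF (and the OpenSSL it links against) behaves under concurrent load.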

@wojtekmach
Collaborator

Thank you for the report. The OpenSSL version might matter, but the OpenSSL build flags we use are the more likely culprit:

  1. : "${OPENSSL_VERSION:=3.1.6}"
  2. export CFLAGS="-Os -fno-common -mmacosx-version-min=11.0"
  3. ./config no-shared --prefix="${rel_dir}" ${CFLAGS}

You can make a build locally like this:

$ gh repo clone erlef/otp_builds
$ cd otp_builds
$ sh scripts/build_otp_macos.bash OTP-27.1.2
$ tmp/otp_builds/otp-OTP-27.1.2-openssl-3.1.6-wxwidgets-3.2.6/bin/erl
Erlang/OTP 27 [erts-15.1.2] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [jit]

Eshell V15.1.2 (press Ctrl+G to abort, type help(). for help)
1>
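
Once a build boots, :crypto.info_lib/0 is a quick way to confirm which OpenSSL the crypto application actually picked up (run from iex on the same build; the output naturally depends on the build):

# Reports the name, version number, and version string of the
# crypto library used by the crypto application.
IO.inspect(:crypto.info_lib())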

If you can reproduce the slowness and/or find performance improvements by changing the build script, we'd happily accept a PR.

@zacky1972
Contributor Author

Thank you for the information.

I ran the script without any modification, but got the following error:

In file included from sys/unix/erl_unix_sys.h:65:
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/unistd.h:525:10: fatal error: cannot open file 'aarch64-apple-darwin/opt/jit/_ctermid.h': Too many open files
  525 | #include <_ctermid.h>
      |          ^
 CXX	obj/aarch64-apple-darwin/opt/jit/instr_map.o
1 error generated.
make[4]: *** [obj/aarch64-apple-darwin/opt/jit/beam_asm_module.o] Error 1
make[4]: *** Waiting for unfinished jobs....
make[3]: *** [opt] Error 2
make[2]: *** [opt] Error 2
make[1]: *** [jit] Error 2
make: *** [emulator] Error 2

Fortunately, I knew how to fix this error:

ulimit -n 65536

Then, I got the executable at the following path:

tmp/otp_builds/otp-OTP-27.1.2-openssl-3.1.6-wxwidgets-3.2.6/bin/erl

I then ran the benchmark with the generated erl:

Operating System: macOS
CPU Information: Apple M3 Max
Number of Available Cores: 16
Available memory: 128 GB
Elixir 1.17.3
Erlang 27.1.2
JIT enabled: true

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 14 s

Benchmarking parallel execution ...
Benchmarking sequential execution ...
Calculating statistics...
Formatting results...

Name                           ips        average  deviation         median         99th %
sequential execution          0.33         3.06 s     ±1.42%         3.06 s         3.09 s
parallel execution            0.22         4.57 s     ±5.25%         4.57 s         4.74 s

Comparison: 
sequential execution          0.33
parallel execution            0.22 - 1.49x slower +1.51 s

These are the expected results: the local build reproduces the slow parallel execution seen with the pre-compiled build, so the build setup itself is fine.

I will keep trying to fix this issue. Thank you.

@zacky1972
Contributor Author

I tried building with the settings of Homebrew's openssl@3, but the benchmark score did not improve.

@wojtekmach
Collaborator

Could you try statically linking against the OpenSSL you got from Homebrew? That should hopefully rule out differences in OpenSSL build options. If that does not improve performance, could the remaining difference be between static and dynamic linking?
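
To check how a given build links OpenSSL, one option is to run otool -L on the crypto NIF, for example from Elixir (a sketch, assuming the usual priv/lib/crypto.so layout of an OTP install on macOS):

# Lists the dynamic libraries the crypto NIF depends on (macOS).
# A dynamically linked build shows a libcrypto dylib entry here;
# a statically linked build does not.
nif = Path.join([to_string(:code.priv_dir(:crypto)), "lib", "crypto.so"])
{output, 0} = System.cmd("otool", ["-L", nif])
IO.puts(output)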

@zacky1972
Contributor Author

OK, I'll try later.
