Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance improvement #1339

Open
rusher opened this issue Jun 29, 2023 · 4 comments
Open

Performance improvement #1339

rusher opened this issue Jun 29, 2023 · 4 comments

Comments

@rusher
Copy link
Contributor

rusher commented Jun 29, 2023

Just for information, here is a benchmark I did using multiple connectors and languages ​​on the same database, on a single thread. Drivers all performs differently.

These benchmarks allow you to see different things, a simple query, parsing/sending/managing parameters of small or big results, parsing many columns, ...

It is not a fair comparison (influence of language performance, sync/async), but gives indications of what can be improved.
Still, I would compare .net to node.js, since both are asynchronous (with lots of threads performance would downgrade less than sync drivers)

This driver is ".net communi." (community)

                                                      |--------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|
                                                      | java mysql   | java mariadb | .net mysql   | .net communi.| node mysql2  | node mariadb | python maria.| python mysql |
------------------------------------------------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|
do 1                           - TEXT                 | 51707 |  88% | 58868 | 100% | 43180 |  73% | 34151 |  58% | 21647 |  37% | 41118 |  70% | 46249 |  79% | 12802 |  22% |
do 1000 parameters             - BINARY EXECUTE ONLY  |  2897 |  19% | 15356 | 100% |       |      |  4045 |  26% |  6832 |  44% |  7527 |  49% |   884 |   6% |   753 |   5% |
do 1000 parameters             - TEXT                 |  2912 |  47% |  6229 | 100% |   386 |   6% |  2928 |  47% |  4363 |  70% |  5268 |  85% |       |      |       |      |
batch 100 insert of 100 chars  - BINARY EXECUTE ONLY  |   246 | 100% |       |      |       |      |       |      |       |      |       |      |       |      |       |      |
batch 100 insert of 100 chars  - BULK                 |       |      | 11687 | 100% |       |      |       |      |       |      |  6730 |  58% |  5795 |  50% |       |      |
batch 100 insert of 100 chars  - REWRITE              |  4429 | 100% |       |      |       |      |       |      |       |      |       |      |       |      |       |      |
batch 100 insert of 100 chars  - TEXT                 |   355 |   4% |  9426 | 100% |       |      |       |      |   171 |   2% |       |      |       |      |       |      |
batch 100 insert of 100 chars  - BINARY               |       |      |       |      |       |      |       |      |       |      |       |      |       |      |  1565 | 100% |
select 1                       - TEXT                 | 41323 |  82% | 50674 | 100% | 27929 |  55% | 30125 |  59% | 17207 |  34% | 30264 |  60% | 34896 |  69% |  9360 |  18% |
select 1000 rows               - BINARY EXECUTE ONLY  |  3054 |  71% |  4321 | 100% |   856 |  20% |  2974 |  69% |  2066 |  48% |  2916 |  67% |       |      |       |      |
select 1000 rows               - TEXT                 |  3073 |  75% |  4072 | 100% |   996 |  24% |  2977 |  73% |  2009 |  49% |  2943 |  72% |  3028 |  74% |   237 |   6% |
Select 100 int cols            - BINARY EXECUTE ONLY  | 10218 |  51% | 19847 | 100% |  1308 |   7% | 14883 |  75% |  3296 |  17% | 14363 |  72% |  7922 |  40% |       |      |
Select 100 int cols            - BINARY PIPELINE      |       |      |  9601 | 100% |       |      |       |      |       |      |       |      |       |      |       |      |
Select 100 int cols            - BINARY               |  6428 |  75% |  8531 | 100% |       |      |       |      |       |      |       |      |       |      |       |      |
Select 100 int cols            - TEXT                 | 11389 |  78% | 14656 | 100% |  1645 |  11% | 10325 |  70% |  3467 |  24% |  9080 |  62% | 10500 |  72% |  1528 |  10% |
------------------------------------------------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|--------------|

Just to indicate that it would seem that simple query resulting in OK_Packet might be improved, and passing multiple parameters.

@bgrainger
Copy link
Member

Just for clarity, are the numbers time taken (lower is better) or requests per second (higher is better)?

It is not a fair comparison

Especially compared to Node or PHP (and probably Python), MySqlConnector has some unfortunate overhead caused by conforming to the ADO.NET API. I've often considered (don't remember if I created a GitHub issue for it) creating a "Direct" API that maps much more closely on to the MySQL protocol and drops ADO.NET compatibility.

@rusher
Copy link
Contributor Author

rusher commented Jun 29, 2023

It is number of operation per second (higher is better).
If needed here is the code used for benchmark : https://github.com/rusher/connector-benchmark/blob/main/scripts/dotnet/Program.cs

Each connector has different problem, like some needs to parse column name, ADO.NET need possibility to cancel each row read, ...

There one thing i'm not still convinced : async versus sync. That is for 2 reasons:

  • Pools with more than 200 connections usually performs worse than with fewer connections. And 200 threads is not that much.
  • In all the benchmark i've done, even using huge number of threads (connections), sync seems to behave better, even if lots of people says that async should behave better using lots of threads. Sync implementation is generally 25-30% faster than async when using single thread. Even with lots of threads, that gap is never is never completely bridged. And I've still not study green thread, checking (recent java has that with project loom). That "might" permit to have better multi-threading with sync. If that's is confirmed, then i would definitively say that sync implementation is better.

@bgrainger
Copy link
Member

Async definitely adds a measurable amount of overhead in .NET. It's been greatly improved over the years (e.g., ValueTask, etc.), but there are necessarily extra allocations and thread context switches when "pausing" and "resuming" an async method.

The purported benefit is scalability: .NET threads can process other work that the server needs to do while waiting for a response from the MySQL server. It prioritises throughput over latency. If your benchmarks are primarily driven by latency, then I would expect sync to look better than async, and async only start to shine in a highly concurrent benchmark like TechEmpower (although note that it's "only" doing 200,000 rps on MySQL).

@rusher
Copy link
Contributor Author

rusher commented Jun 29, 2023

If your benchmarks are primarily driven by latency, then I would expect sync to look better than async, and async only start to shine in a highly concurrent benchmark like TechEmpower (although note that it's "only" doing 200,000 rps on MySQL).

Yes, async improved a lots when using huge number of threads, i'm not discussing that. Just wondering in the context of database connection, where most of the time best pool connection size is under 50 connections. Some other reading : https://github.com/brettwooldridge/HikariCP/wiki/About-Pool-Sizing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants