GRPC Document API benchmark #16711

Open
4 tasks
Tracked by #16787
amberzsy opened this issue Nov 23, 2024 · 4 comments
Labels: Meta, Search:Performance

Comments

amberzsy commented Nov 23, 2024

Please describe the end goal of this project

Based on the GRPC document API implementation, benchmark the various document APIs under different scenarios.

Supporting References

#15190

Issues

  • Bulk API benchmark.
  • Index document benchmark.
  • Update document benchmark.
  • Get document benchmark.

Related component

Indexing

amberzsy added the Meta and untriaged labels Nov 23, 2024
amberzsy changed the title [META] GRPC poc and benchmark - [Indexing] GRPC Document API benchmark Dec 5, 2024
amberzsy (Author) commented Dec 9, 2024

assigned to @karenyrx.

dblock (Member) commented Dec 16, 2024

[Catch All Triage - 1, 2, 3]

karenyrx commented Jan 2, 2025

Results

Latency benchmark results from testing against a cluster with a single data node, using a small GRPC payload:

[Image: Screenshot 2025-01-02 at 4 20 12 PM]

Summary

  1. GRPC latency for the first request is 1.2x to 1.25x higher than HTTP.
  2. GRPC latency for subsequent requests is 1.5x to 5x lower than HTTP.
  3. CPU, memory, and I/O usage were similar for GRPC and HTTP, based on internal dashboard observations.
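
To make the split between first-request and subsequent-request latency concrete, here is a minimal sketch of how such a comparison could be measured. It is not the harness used for the results above: `measure_latencies` and the `send` callable are hypothetical names, and `send` stands in for whatever issues one GRPC or HTTP request.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_latencies(send: Callable[[], None], requests: int = 20) -> Dict[str, float]:
    """Time `requests` sequential calls and report the first call separately.

    `send` is a hypothetical zero-argument callable that performs one request
    (for example one GRPC or one HTTP bulk call) and blocks until it completes.
    """
    if requests < 2:
        raise ValueError("need at least 2 requests to separate first and subsequent latency")

    samples_ms: List[float] = []
    for _ in range(requests):
        start = time.perf_counter()
        send()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    # The first sample includes connection setup; the rest reuse the connection.
    subsequent = samples_ms[1:]
    return {
        "first_ms": samples_ms[0],
        "subsequent_median_ms": statistics.median(subsequent),
        "subsequent_max_ms": max(subsequent),
    }
```

Running this once with a GRPC `send` and once with an HTTP `send`, then dividing the matching fields, gives ratios comparable to items 1 and 2. Reporting the first request apart from the rest keeps connection setup cost from being averaged into steady-state latency.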

Next steps:

  1. Confirm where the improvements come from: HTTP/2, Protobuf, or something else. To do so:
    i) Instrument the codebase with detailed metrics to break down the latency of individual GRPC vs. HTTP internal operations.
    ii) Confirm whether the results align with the HTTP/2 + JSON benchmarking results (to be performed by @reta).
  2. Test with a larger payload (e.g. a 5MB request) to gain more confidence in the results.
  3. Submit a PR for Protobuf and GRPC support for the Bulk endpoint in OpenSearch (in parallel with steps 1 and 2).
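
As a rough illustration of steps 1(i) and 2, the sketch below times named stages on the client side and builds a Bulk API request body of a chosen size. `StageTimer` and `build_bulk_body` are hypothetical helpers, not part of OpenSearch, and the index name and field layout are assumptions. The body uses the HTTP Bulk API's newline-delimited JSON format, so it covers only the HTTP/JSON side; a GRPC request would carry a Protobuf message instead.

```python
import json
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock time per named stage (hypothetical helper)."""

    def __init__(self):
        self.totals_ms = defaultdict(float)

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals_ms[name] += (time.perf_counter() - start) * 1000.0


def build_bulk_body(index: str, target_bytes: int, field_bytes: int = 1024) -> bytes:
    """Build an NDJSON Bulk API body of at least `target_bytes` bytes.

    Each document is an action line followed by a source line, and every
    line ends with a newline, as the Bulk API requires.
    """
    action = json.dumps({"index": {"_index": index}})
    pairs = []
    size = 0
    doc_id = 0
    while size < target_bytes:
        source = json.dumps({"id": doc_id, "message": "x" * field_bytes})
        pair = f"{action}\n{source}\n"
        pairs.append(pair)
        size += len(pair)  # content is ASCII, so characters equal bytes
        doc_id += 1
    return "".join(pairs).encode("utf-8")


if __name__ == "__main__":
    timer = StageTimer()
    with timer.stage("serialize"):
        body = build_bulk_body("benchmark-index", 5 * 1024 * 1024)
    with timer.stage("send"):
        pass  # replace with the real HTTP or GRPC call
    print(len(body), dict(timer.totals_ms))
```

Timing serialization and transport as separate stages is one way to tell whether a difference comes from the encoding or from the transport; server-side stages would need equivalent instrumentation inside OpenSearch itself.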

mgodwan (Member) commented Jan 6, 2025

@karenyrx Thanks for the benchmarks. The idea is indeed promising.
I would like to understand and discuss how you plan to map the document schema using Protobuf, and how dynamic mappings would work in such scenarios.
