Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Offload TLS negotiation to I/O threads #1338

Open
wants to merge 2 commits into
base: unstable
Choose a base branch
from

Conversation

uriyage
Copy link
Contributor

@uriyage uriyage commented Nov 21, 2024

TLS Negotiation Offloading to I/O Threads

Overview

This PR introduces the ability to offload TLS handshake negotiations to I/O threads, significantly improving performance under high TLS connection loads.

Key Changes

  • Added infrastructure to offload TLS negotiations to I/O threads
  • Refactored SSL event handling to allow I/O threads modify conn flags.
  • Introduced new connection flag to identify client connections

Performance Impact

Testing with 650 clients with SET commands and 160 new TLS connections per second in the background:

Throughput Impact of new TLS connections

  • With Offloading: Minimal impact (1050K → 990K ops/sec)
  • Without Offloading: Significant drop (1050K → 670K ops/sec)

New Connection Rate

  • With Offloading:
    • 1,757 conn/sec
  • Without Offloading:
    • 477 conn/sec

Implementation Details

  1. Main Thread:

    • Initiates negotiation-offload jobs to I/O threads
    • Adds connections to pending-read clients list (using existing read offload mechanism)
    • Post-negotiation handling:
      • Creates read/write events if needed for incomplete negotiations
      • Calls accept handler for completed negotiations
  2. I/O Thread:

    • Performs TLS negotiation
    • Updates connection flags based on negotiation result

Related issue:#761

Copy link

codecov bot commented Nov 21, 2024

Codecov Report

Attention: Patch coverage is 7.14286% with 26 lines in your changes missing coverage. Please review.

Project coverage is 70.74%. Comparing base (3d0c834) to head (dc5aa76).
Report is 14 commits behind head on unstable.

Files with missing lines Patch % Lines
src/io_threads.c 0.00% 25 Missing ⚠️
src/networking.c 50.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #1338      +/-   ##
============================================
+ Coverage     70.71%   70.74%   +0.02%     
============================================
  Files           115      116       +1     
  Lines         63159    63307     +148     
============================================
+ Hits          44666    44786     +120     
- Misses        18493    18521      +28     
Files with missing lines Coverage Δ
src/connection.h 87.05% <ø> (-0.16%) ⬇️
src/server.c 87.65% <100.00%> (+0.02%) ⬆️
src/server.h 100.00% <ø> (ø)
src/tls.c 100.00% <ø> (ø)
src/networking.c 88.53% <50.00%> (-0.14%) ⬇️
src/io_threads.c 6.94% <0.00%> (-0.67%) ⬇️

... and 17 files with indirect coverage changes

---- 🚨 Try these New Features:

Copy link
Member

@madolson madolson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation basically breaks the connection abstraction, since it has the TLS implementation calling functions related to IO threading (which is supposed to be agnostic to the connection type). I was sort of expecting we would offload the event handler.

src/io_threads.c Outdated
@@ -9,6 +9,7 @@
static __thread int thread_id = 0; /* Thread local var */
static pthread_t io_threads[IO_THREADS_MAX_NUM] = {0};
static pthread_mutex_t io_threads_mutex[IO_THREADS_MAX_NUM];
void (*tls_negotiation_cb)(void *);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make a typdef for this, it'll be a bit more readable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Done.

src/connection.h Outdated
@@ -56,6 +56,7 @@ typedef enum {

#define CONN_FLAG_CLOSE_SCHEDULED (1 << 0) /* Closed scheduled by a handler */
#define CONN_FLAG_WRITE_BARRIER (1 << 1) /* Write barrier requested */
#define CONN_FLAG_CLIENT (1 << 2) /* Connection is of a client - not a cluster link. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#define CONN_FLAG_CLIENT (1 << 2) /* Connection is of a client - not a cluster link. */
#define CONN_FLAG_NO_OFFLOAD (1 << 2) /* Connection should not be offloaded to IO threads. */

The goal was to keep the connection layer independent of the the server layer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

src/io_threads.c Outdated
Comment on lines 577 to 592
if (server.io_threads_num <= 1) {
return C_ERR;
}

if (!(conn->flags & CONN_FLAG_CLIENT)) {
return C_ERR;
}

client *c = connGetPrivateData(conn);
if (c->io_read_state != CLIENT_IDLE) {
return C_OK;
}

if (server.active_io_threads_num <= 1) {
return C_ERR;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (server.io_threads_num <= 1) {
return C_ERR;
}
if (!(conn->flags & CONN_FLAG_CLIENT)) {
return C_ERR;
}
client *c = connGetPrivateData(conn);
if (c->io_read_state != CLIENT_IDLE) {
return C_OK;
}
if (server.active_io_threads_num <= 1) {
return C_ERR;
}
if (server.active_io_threads_num <= 1) {
return C_ERR;
}
if (!(conn->flags & CONN_FLAG_CLIENT)) {
return C_ERR;
}
client *c = connGetPrivateData(conn);
if (c->io_read_state != CLIENT_IDLE) {
return C_OK;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems everywhere else we just check for active threads.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider a scenario where the main thread sends a job to the IO thread. After the IO thread completes the job, the main thread deactivates all threads. In this case, we want to ensure the main thread processes the returned job from the IO thread before performing an accept operation even if the io threads are not active.

To achieve this:

First, check if IO threads are enabled at all (less expensive check)
Then, verify the read_state is not idle
Finally, check the active state

Unlike read/write where we anyway check for the read/write states, the accept flow operates at the connection layer rather than the client layer, so we can't check the read state there.

Signed-off-by: Uri Yagelnik <[email protected]>
@uriyage
Copy link
Contributor Author

uriyage commented Nov 22, 2024

This implementation basically breaks the connection abstraction, since it has the TLS implementation calling functions related to IO threading (which is supposed to be agnostic to the connection type). I was sort of expecting we would offload the event handler.

We still don't check in any point for the connection type, since the TLS code calls the IO threads to offload the negotiation with a supplied callback, not the other way around. Maybe we can rename it to 'accept' instead of 'tls_negotiate' to be more abstract.

I was sort of expecting we would offload the event handler.

Not sure I get this, would you please elaborate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants