cl-dataplane: Disable controlplane TLS session keys #364

Merged
merged 1 commit into from
Mar 3, 2024

Conversation

orozery
Collaborator

@orozery orozery commented Feb 29, 2024

This PR stops envoy from using TLS session keys when connecting to the controlplane.
Enabling session keys produces large TLS client hello packets, which cause a "buffer full" error in the controlplane's SNI proxy.

@orozery orozery linked an issue Feb 29, 2024 that may be closed by this pull request
@kfirtoledo kfirtoledo self-requested a review February 29, 2024 13:52
@@ -92,6 +92,7 @@ static_resources:
   typed_config:
     "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
     sni: {{.controlplaneGRPCSNI}}
+    max_session_keys: 0
Collaborator

It would be better to add a TODO to remove this once the SNI proxy is removed, or at least a comment in the code explaining why it is there.

@praveingk
Collaborator

Interesting, thanks for the quick fix. Was this config changed recently in a way that made envoy unstable?

@orozery
Collaborator Author

orozery commented Feb 29, 2024

> Interesting, thanks for the quick fix. Was this config changed recently in a way that made envoy unstable?

Just to be clear, this is neither a bug in envoy nor an envoy misconfiguration.

In this case, envoy is the TLS client and our cl-controlplane is the TLS server.
Our cl-controlplane includes an SNI-based TCP proxy imported from inet.af/tcpproxy.
This SNI proxy works by "peeking" at the TLS client hello packet, reading it into a 4096-byte buffer.
The issue arose because the envoy TLS client was sending a legitimate TLS client hello packet larger than 4096 bytes.
Our SNI proxy was unable to handle this "large" client hello, which caused the "connection reset" error that you saw.
This is clearly a bug in the SNI proxy used by our controlplane (and by the go-dataplane as well, BTW).

The workaround in this PR is to configure envoy not to send such large client hello packets, by disabling TLS session resumption (setting max_session_keys to 0).

I don't know why we have not seen this error until now. Perhaps envoy only uses TLS session resumption, which is enabled by default, in certain setups.

@orozery
Collaborator Author

orozery commented Mar 1, 2024

Opened up a PR (and issue) to fix the bug in tcpproxy:
inetaf/tcpproxy#41

This commit disables envoy from using TLS session keys
when connecting to the controlplane.
Enabling session keys produces big TLS client hello packets,
which cause a "buffer full" error on the controlplane's SNI proxy.

Signed-off-by: Or Ozeri <[email protected]>
@orozery orozery merged commit ae4aa42 into clusterlink-net:main Mar 3, 2024
9 checks passed
Successfully merging this pull request may close these issues.

Envoy failing to successfully connect dataplane connections
4 participants