
Encrypt traffic on the data-path "natively" #11239

Closed
markusthoemmes opened this issue Apr 21, 2021 · 5 comments
Labels
area/networking kind/feature Well-understood/specified features, ready for coding. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@markusthoemmes
Contributor

In what area(s)?

/area networking

Describe the feature

Currently, whenever we're asked how the traffic inside of Knative can be secured, our default answer (I think) is: use a service mesh and enable mTLS.

I guess that works in theory, but it comes with a ton of drawbacks. Among others: it forces an additional sidecar onto every pod, which a) increases overhead and b) makes our data-plane actively worse, as the activator has to deal with pods not being directly addressable in the mesh case.

We also had a few users (read: customers) ask specifically for a solution to encrypt traffic on the data-path without having to buy into the increased overhead that a mesh brings.

So: Can and should we solve this use-case "natively" within Knative? Personally, I think we should.


Thinking about potential designs for this, a few key questions come to mind:

  1. What level of granularity is needed? Can we just configure a global certificate, serve that in all our components and call it a day? Do we need certs per-namespace or even per-service to secure the communication to the queue-proxy?
  2. Do we need passthrough TLS, i.e. the ability to pass through encrypted traffic and let the application decrypt it (more on that below)?
  3. Do we need mTLS? Or are we fine with just TLS?
  4. What kind of UX would this be? Do we ask people to muck with Secrets directly? Do we create a new API?
On passthrough encryption

Passthrough encryption would certainly be the best escape hatch here, as it'd allow us to defer this completely to the user's application. However, I don't believe we can implement it under our current assumptions. We need the ability to change the headers of a request for our routing to work, and even if we solved that, we'd need to be able to count requests for our autoscaling to work. Encrypted traffic wouldn't allow us to do either, AFAIK (not an expert, more than happy to be wrong here).

@markusthoemmes markusthoemmes added the kind/feature Well-understood/specified features, ready for coding. label Apr 21, 2021
@evankanderson
Member

evankanderson commented Apr 27, 2021

A few quick thoughts/answers:

  1. It's probably desirable to separate the certificates for system components from components that run in the user (developer) namespace (i.e. different certs for ingress/activator than for queue-proxy). Components that run in a developer namespace should be assumed to be available to all workloads (Knative and otherwise) in that namespace.

    In addition, we have two data paths to protect, one of which is managed by KIngress / multiple HTTP implementations, including non-Envoy implementations. This may restrict our ability to be especially creative (e.g. Client-cert-only TLS, which would be out of spec).

  2. I think you're correct that we'd need to do a lot of heavy lifting and co-option of the TLS protocol to get the desired semantics and support passthrough encryption. One option would be to have passthrough encryption but also pass the negotiated secrets through to the client.

    A simple thought experiment shows that we probably need to be able to crack the TLS request at the routing layer -- when the client connects to the ingress, it may not have provided a hostname until after the ServerHello; this means the ingress cannot scale the correct backend up from zero until after the initial TLS handshake.

  3. We may not need mTLS if we can trust the Kubernetes network routing layer within the cluster (possibly a reasonable assumption). In that case, we could potentially distribute a shared "not a secret" certificate which supports perfect forward secrecy to all queue-proxies, and have them validate the "client certificate" from the ingress or activator.

    This should probably get at least a once-over from someone with more crypto experience than myself, who will probably yell "armchair cryptographer" at me and throw things.

  4. It would be nice if we could manage the certificates (and any necessary secrets) ourselves, distributing them between the KIngress, activator, and queue-proxy. Alternatively, it would be nice if we could hook into a cert infrastructure like SPIFFE if it was available on the cluster; I'm not sure how difficult it would be to support both code paths.

@github-actions

This issue is stale because it has been open for 90 days with no
activity. It will automatically close after 30 more days of
inactivity. Reopen the issue with /reopen. Mark the issue as
fresh by adding the comment /remove-lifecycle stale.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 27, 2021
@markusthoemmes
Contributor Author

/lifecycle frozen

@knative-prow-robot knative-prow-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 27, 2021
@evankanderson
Member

This looks like it will be solved by #11906

/close

@knative-prow-robot
Contributor

@evankanderson: Closing this issue.

In response to this:

This looks like it will be solved by #11906

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
