
BackendTLSPolicy applied to a Kibana instance Service results in upstream connect error or disconnect/reset before headers. reset reason: connection termination #4769

Closed
ferdinandosimonetti opened this issue Nov 22, 2024 · 9 comments · Fixed by #4784
Labels: kind/bug Something isn't working

@ferdinandosimonetti

Description:

I expected to be able to reach a Kibana instance (HTTPS-enabled, created via the Elastic Cloud on Kubernetes (ECK) operator) through an HTTPRoute.
Each time I tried to connect to it through my newly built HTTPRoute, I received

upstream connect error or disconnect/reset before headers. reset reason: connection termination

along with HTTP status 503, whereas accessing the same Kibana through an Ingress resource whose configuration uses the following annotations

kind: Ingress
apiVersion: networking.k8s.io/v1
metadata:
  name: es-ingress
  namespace: elastic-dev
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    nginx.ingress.kubernetes.io/ssl-redirect: 'false'
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
...

works as usual.

Repro steps:

First, I extracted the certificate of the (self-generated) CA that issued the Kibana certificate:

[KS-Farmhub-admin|elastic-dev] ➜  ~ k get kibana
NAME         HEALTH   NODES   VERSION   AGE
fh-cluster   green    1       8.15.2    3y75d
[KS-Farmhub-admin|elastic-dev] ➜  ~ k get deploy
NAME            READY   UP-TO-DATE   AVAILABLE   AGE
fh-cluster-kb   1/1     1            1           3y75d
[KS-Farmhub-admin|elastic-dev] ➜  ~ k get -o json deploy/fh-cluster-kb | jq -r '.spec.template.spec.volumes[] | select( .name == "elastic-internal-http-certificates")'
{
  "name": "elastic-internal-http-certificates",
  "secret": {
    "defaultMode": 420,
    "optional": false,
    "secretName": "fh-cluster-kb-http-certs-internal"
  }
}

[KS-Farmhub-admin|elastic-dev] ➜  ~ k view-secret fh-cluster-kb-http-certs-internal ca.crt > ca.crt
[KS-Farmhub-admin|elastic-dev] ➜  ~ cat ca.crt
-----BEGIN CERTIFICATE-----
MIIDSTCCAjGgAwIBAgIQc8BYw8mZXdh5hMKi3OD5eDANBgkqhkiG9w0BAQsFADAv
MRMwEQYDVQQLEwpmaC1jbHVzdGVyMRgwFgYDVQQDEw9maC1jbHVzdGVyLWh0dHAw
HhcNMjQwOTA0MTY1MDA3WhcNMjUwOTA0MTcwMDA3WjAvMRMwEQYDVQQLEwpmaC1j
bHVzdGVyMRgwFgYDVQQDEw9maC1jbHVzdGVyLWh0dHAwggEiMA0GCSqGSIb3DQEB
AQUAA4IBDwAwggEKAoIBAQC17IddJ51zRdeGnlZ/8bxsUlUaCcxZwQ/OBAcCHhSb
dvM8ebaNa/fInKyDtyrskbj/uOPoJWOuVyHv46WGkRekyxvPHrgxWGM53dNTczL0
Xn2Zh4YQRuUejJYTlWWRmPefCJFnovO/kTmcUPdH2s0cSTbhahyn9zwilW8rNFd4
jyTHOm7Tf0DMRjPCdXuHTO3K6uqfPAa62m3s3PJlyqd9OGpgdkf8AJ55+n0o5pvR
1sxgqFkQBc9OkqPdU4IQ+LvAsVZ6715oMt/Zk8HDVOAOaOOSTMW8rOAtPUZY6Tez
MqCAXedOLgFA5IafOG2ktxBv5916lilFbRIFqEHKwXtnAgMBAAGjYTBfMA4GA1Ud
DwEB/wQEAwIChDAdBgNVHSUEFjAUBggrBgEFBQcDAQYIKwYBBQUHAwIwDwYDVR0T
AQH/BAUwAwEB/zAdBgNVHQ4EFgQUeJTkn4wJmIeCr/hyX4EG91r5hlswDQYJKoZI
hvcNAQELBQADggEBAJPAsne94nbM8xa/uJ0hVa+7WPa2me7j64gfrxS/yX4FvSGX
72/3ohyJdeckZlTgoqYE3urnpfpTmmM8hjjGYw4sMTJgnBOX0PY6Tz2Io4lxxXSQ
XEeiZoAriPUeJr0uoESXVijmvftGzIYMC+zn/6/0V08+6HGMJWd2m6SsiJaBZJIM
plb/OT9sCpoufVKu9FD2ZORw4C7+ZHrFXLyUpo0wNSNvrIIZFqhLw97eFB1TMsoo
BTR0Q9sfeHuzv2lO6ltx8O5PsXTQlJW9vxKw5Pz4TnYliBS36x2X3Jb7Efa3ZDCk
arDrFLxBF6MN7pQU4LgzGigy1J4oB4KPlXYoc28=
-----END CERTIFICATE-----
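
For reference, the same CA certificate can also be extracted without the view-secret krew plugin; a minimal sketch using plain kubectl, assuming the same secret and namespace:

# decode ca.crt straight from the secret (the key contains a dot, hence the escape in the jsonpath)
kubectl get secret fh-cluster-kb-http-certs-internal -n elastic-dev \
  -o jsonpath='{.data.ca\.crt}' | base64 -d > ca.crt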

Then I created a ConfigMap from it:

[KS-Farmhub-admin|elastic-dev] ➜  ~ kubectl create cm kb-ca --dry-run=client -o yaml --from-file=ca.crt > kb-ca.yml
[KS-Farmhub-admin|elastic-dev] ➜  ~ cat kb-ca.yml
---
apiVersion: v1
data:
  ca.crt: |
    -----BEGIN CERTIFICATE-----
    MIIDSTCCAjGgAwIBAgIQc8BYw8mZXdh5hMKi3OD5eDANBgkqhkiG9w0BAQsFADAv
    MRMwEQYDVQQLEwpmaC1jbHVzdGVyMRgwFgYDVQQDEw9maC1jbHVzdGVyLWh0dHAw
    HhcNMjQwOTA0MTY1MDA3WhcNMjUwOTA0MTcwMDA3WjAvMRMwEQYDVQQLEwpmaC1j
    bHVzdGVyMRgwFgYDVQQDEw9maC1jbHVzdGVyLWh0dHAwggEiMA0GCSqGSIb3DQEB
    AQUAA4IBDwAwggEKAoIBAQC17IddJ51zRdeGnlZ/8bxsUlUaCcxZwQ/OBAcCHhSb
    dvM8ebaNa/fInKyDtyrskbj/uOPoJWOuVyHv46WGkRekyxvPHrgxWGM53dNTczL0
    Xn2Zh4YQRuUejJYTlWWRmPefCJFnovO/kTmcUPdH2s0cSTbhahyn9zwilW8rNFd4
    jyTHOm7Tf0DMRjPCdXuHTO3K6uqfPAa62m3s3PJlyqd9OGpgdkf8AJ55+n0o5pvR
    1sxgqFkQBc9OkqPdU4IQ+LvAsVZ6715oMt/Zk8HDVOAOaOOSTMW8rOAtPUZY6Tez
    MqCAXedOLgFA5IafOG2ktxBv5916lilFbRIFqEHKwXtnAgMBAAGjYTBfMA4GA1Ud
    DwEB/wQEAwIChDAdBgNVHSUEFjAUBggrBgEFBQcDAQYIKwYBBQUHAwIwDwYDVR0T
    AQH/BAUwAwEB/zAdBgNVHQ4EFgQUeJTkn4wJmIeCr/hyX4EG91r5hlswDQYJKoZI
    hvcNAQELBQADggEBAJPAsne94nbM8xa/uJ0hVa+7WPa2me7j64gfrxS/yX4FvSGX
    72/3ohyJdeckZlTgoqYE3urnpfpTmmM8hjjGYw4sMTJgnBOX0PY6Tz2Io4lxxXSQ
    XEeiZoAriPUeJr0uoESXVijmvftGzIYMC+zn/6/0V08+6HGMJWd2m6SsiJaBZJIM
    plb/OT9sCpoufVKu9FD2ZORw4C7+ZHrFXLyUpo0wNSNvrIIZFqhLw97eFB1TMsoo
    BTR0Q9sfeHuzv2lO6ltx8O5PsXTQlJW9vxKw5Pz4TnYliBS36x2X3Jb7Efa3ZDCk
    arDrFLxBF6MN7pQU4LgzGigy1J4oB4KPlXYoc28=
    -----END CERTIFICATE-----
kind: ConfigMap
metadata:
  creationTimestamp: null
  name: kb-ca
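
The caCertificateRefs in a BackendTLSPolicy are local references, so the ConfigMap has to live in the same namespace as the policy and the backend Service (elastic-dev here); applying it is a one-liner:

kubectl apply -n elastic-dev -f kb-ca.yml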

Next, I found the Service I should point to:

➜  k8s git:(feature/envoygateway) ✗ k get svc/fh-cluster-kb-http
NAME                 TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
fh-cluster-kb-http   ClusterIP   10.0.143.212   <none>        5601/TCP   3y75d
➜  k8s git:(feature/envoygateway) ✗ k describe svc/fh-cluster-kb-http
Name:                     fh-cluster-kb-http
Namespace:                elastic-dev
Labels:                   common.k8s.elastic.co/type=kibana
                          kibana.k8s.elastic.co/name=fh-cluster
Annotations:              <none>
Selector:                 common.k8s.elastic.co/type=kibana,kibana.k8s.elastic.co/name=fh-cluster
Type:                     ClusterIP
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.0.143.212
IPs:                      10.0.143.212
Port:                     https  5601/TCP
TargetPort:               5601/TCP
Endpoints:                10.100.65.84:5601
Session Affinity:         None
Internal Traffic Policy:  Cluster
Events:                   <none>

then found the internal FQDN presented in Kibana's certificate

[KS-Farmhub-admin|elastic-dev] ➜  1-dev git:(feature/envoygateway) ✗ k view-cert fh-cluster-kb-http-certs-internal tls.crt
[
    {
        "SecretName": "fh-cluster-kb-http-certs-internal",
        "Namespace": "elastic-dev",
        "Version": 3,
        "SerialNumber": "6c7323b615d595670d08a795bd886f79",
        "Issuer": "CN=fh-cluster-http,OU=fh-cluster",
        "Validity": {
            "NotBefore": "2024-09-04T16:50:07Z",
            "NotAfter": "2025-09-04T17:00:07Z"
        },
        "Subject": "CN=fh-cluster-kb-http.elastic-dev.kb.local,OU=fh-cluster",
        "IsCA": false
    }
]
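
Without the view-cert plugin, the same subject and issuer information can be read with openssl; a sketch assuming the same secret:

# decode the serving certificate and print its subject, issuer, and validity window
kubectl get secret fh-cluster-kb-http-certs-internal -n elastic-dev \
  -o jsonpath='{.data.tls\.crt}' | base64 -d \
  | openssl x509 -noout -subject -issuer -dates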

and finally created both the BackendTLSPolicy and the HTTPRoute:

---
apiVersion: gateway.networking.k8s.io/v1alpha3
kind: BackendTLSPolicy
metadata:
  name: kb-enable-tls
  namespace: elastic-dev
spec:
  targetRefs:
  - group: ''
    kind: Service
    name: fh-cluster-kb-http
    sectionName: https
  validation:
    caCertificateRefs:
    - name: kb-ca
      group: ''
      kind: ConfigMap
    hostname: fh-cluster-kb-http.elastic-dev.kb.local
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: kb-route
  namespace: elastic-dev
spec:
  parentRefs:
    - name: gateway-private
      namespace: balancers-dev
      sectionName: https
  hostnames:
    - "kb-dev.xxx.yyy.com"
  rules:
    - backendRefs:
        - group: ""
          kind: Service
          name: fh-cluster-kb-http
          port: 5601
      matches:
        - path:
            type: PathPrefix
            value: /
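
After applying both manifests, acceptance of the policy can be checked on its status; a sketch (the file name kb-gateway.yml is hypothetical):

kubectl apply -f kb-gateway.yml
# each ancestor (the targeted Gateway) should report an Accepted condition
kubectl get backendtlspolicy kb-enable-tls -n elastic-dev \
  -o jsonpath='{.status.ancestors[*].conditions}'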


Environment:

Azure AKS 1.30.4, ECK Operator 2.14, Kibana 8.15.2

Logs:


[2024-11-22 15:18:06.930][13][debug][http2] [source/common/http/http2/codec_impl.cc:1695] [Tags: "ConnectionId":"32897"] updating connection-level initial window size to 1048576
[2024-11-22 15:18:06.930][13][debug][http] [source/common/http/conn_manager_impl.cc:393] [Tags: "ConnectionId":"32897"] new stream
[2024-11-22 15:18:06.930][13][debug][http] [source/common/http/conn_manager_impl.cc:1183] [Tags: "ConnectionId":"32897","StreamId":"5400320508560504825"] request headers complete (end_stream=true):
':method', 'GET'
':scheme', 'https'
':authority', 'kb-dev.xxx.yyy.com'
':path', '/'
'user-agent', 'curl/8.7.1'
'accept', '*/*'

[2024-11-22 15:18:06.930][13][debug][http] [source/common/http/conn_manager_impl.cc:1166] [Tags: "ConnectionId":"32897","StreamId":"5400320508560504825"] request end stream timestamp recorded
[2024-11-22 15:18:06.930][13][debug][connection] [./source/common/network/connection_impl.h:98] [Tags: "ConnectionId":"32897"] current connecting state: false
[2024-11-22 15:18:06.930][13][debug][router] [source/common/router/router.cc:527] [Tags: "ConnectionId":"32897","StreamId":"5400320508560504825"] cluster 'httproute/elastic-dev/kb-route/rule/0' match for URL '/'
[2024-11-22 15:18:06.930][13][debug][router] [source/common/router/router.cc:756] [Tags: "ConnectionId":"32897","StreamId":"5400320508560504825"] router decoding headers:
':method', 'GET'
':scheme', 'https'
':authority', 'kb-dev.xxx.yyy.com'
':path', '/'
'user-agent', 'curl/8.7.1'
'accept', '*/*'
'x-forwarded-for', '10.100.244.222'
'x-forwarded-proto', 'https'
'x-envoy-internal', 'true'
'x-request-id', '23c80288-1ec6-47a7-b469-06378122bc17'

[2024-11-22 15:18:06.930][13][debug][pool] [source/common/http/conn_pool_base.cc:78] queueing stream due to no available connections (ready=0 busy=0 connecting=0)
[2024-11-22 15:18:06.930][13][debug][pool] [source/common/conn_pool/conn_pool_base.cc:291] trying to create new connection
[2024-11-22 15:18:06.930][13][debug][pool] [source/common/conn_pool/conn_pool_base.cc:145] creating a new connection (connecting=0)
[2024-11-22 15:18:06.930][13][debug][connection] [./source/common/network/connection_impl.h:98] [Tags: "ConnectionId":"32898"] current connecting state: true
[2024-11-22 15:18:06.930][13][debug][client] [source/common/http/codec_client.cc:57] [Tags: "ConnectionId":"32898"] connecting
[2024-11-22 15:18:06.930][13][debug][connection] [source/common/network/connection_impl.cc:1017] [Tags: "ConnectionId":"32898"] connecting to 10.0.143.212:5601
[2024-11-22 15:18:06.930][13][debug][connection] [source/common/network/connection_impl.cc:1036] [Tags: "ConnectionId":"32898"] connection in progress
[2024-11-22 15:18:06.933][13][debug][connection] [source/common/network/connection_impl.cc:746] [Tags: "ConnectionId":"32898"] connected
[2024-11-22 15:18:06.933][13][debug][client] [source/common/http/codec_client.cc:88] [Tags: "ConnectionId":"32898"] connected
[2024-11-22 15:18:06.933][13][debug][pool] [source/common/conn_pool/conn_pool_base.cc:328] [Tags: "ConnectionId":"32898"] attaching to next stream
[2024-11-22 15:18:06.933][13][debug][pool] [source/common/conn_pool/conn_pool_base.cc:182] [Tags: "ConnectionId":"32898"] creating stream
[2024-11-22 15:18:06.933][13][debug][router] [source/common/router/upstream_request.cc:593] [Tags: "ConnectionId":"32897","StreamId":"5400320508560504825"] pool ready
[2024-11-22 15:18:06.933][13][debug][client] [source/common/http/codec_client.cc:142] [Tags: "ConnectionId":"32898"] encode complete
[2024-11-22 15:18:06.935][13][debug][connection] [source/common/network/connection_impl.cc:714] [Tags: "ConnectionId":"32898"] remote close
[2024-11-22 15:18:06.935][13][debug][connection] [source/common/network/connection_impl.cc:276] [Tags: "ConnectionId":"32898"] closing socket: 0
[2024-11-22 15:18:06.935][13][debug][client] [source/common/http/codec_client.cc:107] [Tags: "ConnectionId":"32898"] disconnect. resetting 1 pending requests
[2024-11-22 15:18:06.935][13][debug][client] [source/common/http/codec_client.cc:159] [Tags: "ConnectionId":"32898"] request reset
[2024-11-22 15:18:06.935][13][debug][router] [source/common/router/router.cc:1384] [Tags: "ConnectionId":"32897","StreamId":"5400320508560504825"] upstream reset: reset reason: connection termination, transport failure reason: 
[2024-11-22 15:18:06.935][13][debug][http] [source/common/http/filter_manager.cc:1084] [Tags: "ConnectionId":"32897","StreamId":"5400320508560504825"] Sending local reply with details upstream_reset_before_response_started{connection_termination}
[2024-11-22 15:18:06.935][13][debug][http] [source/common/http/conn_manager_impl.cc:1878] [Tags: "ConnectionId":"32897","StreamId":"5400320508560504825"] encoding headers via codec (end_stream=false):
':status', '503'
'content-length', '95'
'content-type', 'text/plain'
'date', 'Fri, 22 Nov 2024 15:18:06 GMT'

@ferdinandosimonetti (Author) commented Nov 22, 2024

Apparently, I shouldn't use the sectionName within my BackendTLSPolicy definition.

With this formulation

---
apiVersion: gateway.networking.k8s.io/v1alpha3
kind: BackendTLSPolicy
metadata:
  name: kb-enable-tls
  namespace: elastic-dev
spec:
  targetRefs:
  - group: ''
    kind: Service
    name: fh-cluster-kb-http
  validation:
    caCertificateRefs:
    - name: kb-ca
      group: ''
      kind: ConfigMap
    hostname: fh-cluster-kb-http.elastic-dev.kb.local

I am able to connect to Kibana.
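
For example, a quick check against the route hostname (GATEWAY_IP is a placeholder for the Envoy Gateway Service address, and the listener is assumed to be on 443):

# bypass DNS and point the route hostname at the gateway directly
curl -I --resolve kb-dev.xxx.yyy.com:443:${GATEWAY_IP} https://kb-dev.xxx.yyy.com/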

Additionally, I can refer directly to the Secret containing the certificates, without needing to extract the CA certificate into a ConfigMap.

---
apiVersion: gateway.networking.k8s.io/v1alpha3
kind: BackendTLSPolicy
metadata:
  name: kb-enable-tls
  namespace: elastic-dev
spec:
  targetRefs:
  - group: ''
    kind: Service
    name: fh-cluster-kb-http
  validation:
    caCertificateRefs:
    - name: fh-cluster-kb-http-certs-internal
      group: ''
      kind: Secret
    hostname: fh-cluster-kb-http.elastic-dev.kb.local

@ferdinandosimonetti (Author)

Closing the issue after solving it on my own; I hope it can be helpful to someone else.

@arkodg (Contributor) commented Nov 22, 2024

imo sectionName should work, reopening this issue since it's a bug

@arkodg added the kind/bug label and removed the triage label Nov 22, 2024
@zhaohuabing (Member) commented Nov 25, 2024

@ferdinandosimonetti Which EG version did you use for testing? It seems this was caused by the same issue as #4445, and it should already have been fixed by #4630.

@ferdinandosimonetti (Author)

Ah, sorry, I forgot to mention it. It is v1.2.1.

@ferdinandosimonetti changed the title Nov 25, 2024, correcting "TLSBackendPolicy" to "BackendTLSPolicy".
@zhaohuabing (Member) commented Nov 25, 2024

EG assumes the sectionName for BackendTLSPolicy is a port number instead of a port name, but according to the Gateway API, it should be the port name:

  • Gateway: Listener name
  • HTTPRoute: HTTPRouteRule name
  • Service: Port name
apiVersion: gateway.networking.k8s.io/v1alpha3
kind: BackendTLSPolicy
metadata:
  name: kb-enable-tls
  namespace: elastic-dev
spec:
  targetRefs:
  - group: ''
    kind: Service
    name: fh-cluster-kb-http
    sectionName: https

Since port name is an optional field in the Service spec, the Gateway API should use the port number as the sectionName for a Service, or at least support both port name and port number.
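
For reference, the sectionName in the policy above is expected to match the name of the targeted Service port, as in this abridged Service sketch:

apiVersion: v1
kind: Service
metadata:
  name: fh-cluster-kb-http
  namespace: elastic-dev
spec:
  ports:
  - name: https      # <- the port name that sectionName refers to
    port: 5601
    targetPort: 5601
    protocol: TCP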

@ferdinandosimonetti (Author) commented Nov 25, 2024 via email

@arkodg (Contributor) commented Nov 25, 2024

thanks for triaging this @zhaohuabing, imo we should continue with portName and resolve the bug in EG until the upstream spec changes

@zhaohuabing (Member)

> thanks for triaging this @zhaohuabing, imo we should continue with portName and resolve the bug in EG until the upstream spec changes

Sounds good, I will raise a PR to address it.
