Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP2 locally-originated stream resets use the NO_ERROR code #36200

Open
AdamEAnderson opened this issue Sep 18, 2024 · 3 comments · May be fixed by #37270
Open

HTTP2 locally-originated stream resets use the NO_ERROR code #36200

AdamEAnderson opened this issue Sep 18, 2024 · 3 comments · May be fixed by #37270

Comments

@AdamEAnderson
Copy link
Contributor

AdamEAnderson commented Sep 18, 2024

HTTP2 locally-originated stream resets use the NO_ERROR code: When envoy originates a stream reset on an HTTP2 codec it sends a RST_STREAM frame but erroneously uses the NO_ERROR error code, causing clients to think the stream terminated successfully

Description:
When envoy's HTTP2 codec sends a stream reset, it converts the stream reset reason into an HTTP2 error code here. Only two reset reasons are covered, and the remainder reset with NO_ERROR. From the RFC, NO_ERROR indicates:

The associated condition is not a result of an error. For example, a GOAWAY might include this code to indicate graceful shutdown of a connection.

The other reset reasons (see all here) are terminated as NO_ERROR indicating that the stream is complete.

Here's an example where I ran into this issue:

  • Upstream is HTTP1
  • Downstream is HTTP2
  • Downstream sends a GET request as HTTP2
  • Upstream gets the request headers as HTTP1
  • Upstream sends happy response headers (200 OK) with Transfer-Encoding: chunked
  • Downstream receives response headers
  • Upstream sends response data in chunks, but gets an error part way through the response body. Headers were already sent, so all it can do is reset the HTTP1 stream
  • Envoy sees this as a protocol error, HPE_INVALID_EOF_STATE
  • Envoy proxies the reset by resetting the downstream HTTP2 stream
  • However the downstream stream is reset with 0x00, NO_ERROR
  • The client sees this and thinks the response body is complete and nothing is amiss, not realizing it doesn't have a complete response

Repro steps:
Python server to provide partial chunked responses over HTTP/1.1:

import socket
import threading

def handle_client(client_socket: socket.socket):
    _ = client_socket.recv(1024).decode()
    response = "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nTransfer-Encoding: chunked\r\n\r\n5\r\nHel"
    # response = "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nTransfer-Encoding: chunked\r\n\r\n5\r\nHello\r\n0\r\n\r\n"
    client_socket.sendall(response.encode())
    client_socket.close()

def start_server(host='127.0.0.1', port=8080):
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_socket.bind((host, port))
    server_socket.listen(5)
    print(f"Serving HTTP on {host} port {port} ...")

    while True:
        client_socket, _ = server_socket.accept()
        client_handler = threading.Thread(target=handle_client, args=(client_socket,))
        client_handler.start()

if __name__ == "__main__":
    start_server()

Envoy config:

admin:
  address:
    socket_address: { address: 127.0.0.1, port_value: 19901 }
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        address: 127.0.0.1
        port_value: 10000
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          codec_type: HTTP2
          access_log:
          - name: envoy.access_loggers.stdout
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: service
  clusters:
  - name: service
    type: STATIC
    dns_lookup_family: V4_ONLY
    load_assignment:
      cluster_name: service
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 127.0.0.1
                port_value: 8080

then reach envoy via an nghttp client (settings, priority, window_update frames omitted):

with partial chunked response:

[  0.005] Connected
[  0.005] send HEADERS frame <length=38, flags=0x25, stream_id=13>
          ; END_STREAM | END_HEADERS | PRIORITY
          (padlen=0, dep_stream_id=11, weight=16, exclusive=0)
          ; Open new stream
          :method: GET
          :path: /
          :scheme: http
          :authority: localhost:10000
          accept: */*
          accept-encoding: gzip, deflate
          user-agent: nghttp2/1.63.0
[  0.045] recv (stream_id=13) :status: 200
[  0.045] recv (stream_id=13) content-type: text/plain
[  0.045] recv (stream_id=13) x-envoy-upstream-service-time: 7
[  0.045] recv (stream_id=13) date: Tue, 17 Sep 2024 23:18:26 GMT
[  0.045] recv (stream_id=13) server: envoy
[  0.045] recv HEADERS frame <length=65, flags=0x04, stream_id=13>
          ; END_HEADERS
          (padlen=0)
          ; First response header
[  0.045] recv DATA frame <length=3, flags=0x00, stream_id=13>
[  0.045] recv RST_STREAM frame <length=4, flags=0x00, stream_id=13>
          (error_code=NO_ERROR(0x00))
[  0.045] send GOAWAY frame <length=8, flags=0x00, stream_id=0>
          (last_stream_id=0, error_code=NO_ERROR(0x00), opaque_data(0)=[])

with complete response:

[  0.011] Connected
[  0.011] send HEADERS frame <length=38, flags=0x25, stream_id=13>
          ; END_STREAM | END_HEADERS | PRIORITY
          (padlen=0, dep_stream_id=11, weight=16, exclusive=0)
          ; Open new stream
          :method: GET
          :path: /
          :scheme: http
          :authority: localhost:10000
          accept: */*
          accept-encoding: gzip, deflate
          user-agent: nghttp2/1.63.0
[  0.014] recv (stream_id=13) :status: 200
[  0.014] recv (stream_id=13) content-type: text/plain
[  0.014] recv (stream_id=13) x-envoy-upstream-service-time: 1
[  0.014] recv (stream_id=13) date: Tue, 17 Sep 2024 23:22:55 GMT
[  0.014] recv (stream_id=13) server: envoy
[  0.014] recv HEADERS frame <length=65, flags=0x04, stream_id=13>
          ; END_HEADERS
          (padlen=0)
          ; First response header
[  0.014] recv DATA frame <length=5, flags=0x00, stream_id=13>
[  0.014] recv DATA frame <length=0, flags=0x01, stream_id=13>
          ; END_STREAM
[  0.014] send GOAWAY frame <length=8, flags=0x00, stream_id=0>
          (last_stream_id=0, error_code=NO_ERROR(0x00), opaque_data(0)=[])

Envoy trace logs around the event:

[2024-09-17 15:54:44.924][54449801][debug][client] [source/common/http/codec_client.cc:170] [Tags: "ConnectionId":"1"] Error dispatching received data: http/1.1 protocol error: HPE_INVALID_EOF_STATE
[2024-09-17 15:54:44.924][54449801][debug][client] [source/common/http/codec_client.cc:107] [Tags: "ConnectionId":"1"] disconnect. resetting 1 pending requests
[2024-09-17 15:54:44.924][54449801][debug][client] [source/common/http/codec_client.cc:158] [Tags: "ConnectionId":"1"] request reset
[2024-09-17 15:54:44.924][54449801][trace][main] [source/common/event/dispatcher_impl.cc:242] item added to deferred deletion list (size=1)
[2024-09-17 15:54:44.924][54449801][debug][router] [source/common/router/router.cc:1323] [Tags: "ConnectionId":"0","StreamId":"164029717455481211"] upstream reset: reset reason: protocol error, transport failure reason: 
[2024-09-17 15:54:44.924][54449801][trace][main] [source/common/event/dispatcher_impl.cc:242] item added to deferred deletion list (size=2)
[2024-09-17 15:54:44.924][54449801][trace][main] [source/common/event/dispatcher_impl.cc:242] item added to deferred deletion list (size=3)
[2024-09-17 15:54:44.924][54449801][debug][http] [source/common/http/filter_manager.cc:974] [Tags: "ConnectionId":"0","StreamId":"164029717455481211"] Resetting stream due to upstream_reset_after_response_started{protocol_error}. Prior headers have already been sent
[2024-09-17 15:54:44.925][54449801][debug][http] [source/common/http/conn_manager_impl.cc:243] [Tags: "ConnectionId":"0","StreamId":"164029717455481211"] doEndStream() resetting stream
[2024-09-17 15:54:44.925][54449801][debug][http] [source/common/http/conn_manager_impl.cc:1932] [Tags: "ConnectionId":"0","StreamId":"164029717455481211"] stream reset: reset reason: protocol error, response details: -
[2024-09-17 15:54:44.925][54449801][trace][main] [source/common/event/dispatcher_impl.cc:242] item added to deferred deletion list (size=4)
[2024-09-17 15:54:44.925][54449801][trace][misc] [source/common/event/scaled_range_timer_manager_impl.cc:60] enableTimer called on 0x11e76c2b0 for 3600000ms, min is 3600000ms
[2024-09-17 15:54:44.925][54449801][trace][http2] [source/common/http/http2/codec_impl.cc:1315] [Tags: "ConnectionId":"0"] about to send frame type=3, flags=0
[2024-09-17 15:54:44.925][54449801][trace][http2] [source/common/http/http2/codec_impl.cc:1337] [Tags: "ConnectionId":"0"] send data: bytes=13
[2024-09-17 15:54:44.925][54449801][trace][connection] [source/common/network/connection_impl.cc:529] [Tags: "ConnectionId":"0"] writing 13 bytes, end_stream false
[2024-09-17 15:54:44.925][54449801][trace][http2] [source/common/http/http2/codec_impl.cc:1208] [Tags: "ConnectionId":"0"] sent frame type=3, stream_id=1, length=4
[2024-09-17 15:54:44.925][54449801][debug][http2] [source/common/http/http2/codec_impl.cc:1237] [Tags: "ConnectionId":"0"] sent reset code=0
[2024-09-17 15:54:44.925][54449801][debug][http2] [source/common/http/http2/codec_impl.cc:1362] [Tags: "ConnectionId":"0"] stream 1 closed: 0

Desired behavior:
Envoy should use a non-zero error code when resetting HTTP2 streams. For the chunked encoding case, PROTOCOL_ERROR would seem appropriate.

I have a PR (will submit soon) that should fix this, but it will require us to make some decisions about what the right error codes are for each situation.

@zuercher
Copy link
Member

cc @alyssawilk

@alyssawilk
Copy link
Contributor

Looks like a bug. I suspect the cause is
https://github.com/envoyproxy/envoy/blob/main/source/common/http/http2/codec_impl.cc#L104
where most unknown errors are converted to NO_ERROR.
Looking at https://httpwg.org/specs/rfc7540.html#NO_ERROR
I think that INTERNAL_ERROR is a better default.

The one gotcha is if the peer sends NO_ERROR we do want to propogate that, so I'd suggest also adding a new check around here
https://github.com/envoyproxy/envoy/blob/main/source/common/http/http2/codec_impl.cc#L1480
that if the error is NO_ERROR that we set the stream reset code to RemoteResetNoError and adapt reasonToReset to handle that case.

Copy link

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale stalebot believes this issue/PR has not been touched recently label Oct 24, 2024
@zuercher zuercher added help wanted Needs help! and removed stale stalebot believes this issue/PR has not been touched recently labels Oct 25, 2024
@alyssawilk alyssawilk linked a pull request Nov 20, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants