Trace HTTPClient request execution #320
base: tracing-development
Conversation
Can one of the admins verify this patch?
I chatted with @ktoso earlier to discuss the manual context propagation, and we agreed that we probably shouldn't deprecate the "old" API accepting a
So since technically we're 0.1 and something may change... how do we want to tackle adoption here?
I was thinking to kick off a branch like `tracing` for now, so we can polish up there and once we're all confident merge into mainline? We could also tag those tracing releases; they'd follow normal releases, e.g. 1.2.2-tracing.
I don't really expect anything breaking in the core APIs, but the OpenTelemetry support which we may want to use here could still fluctuate a little bit until it's final, hmmm...
Package.swift
Outdated
```diff
     ],
     targets: [
         .target(
             name: "AsyncHTTPClient",
             dependencies: ["NIO", "NIOHTTP1", "NIOSSL", "NIOConcurrencyHelpers", "NIOHTTPCompression",
-                           "NIOFoundationCompat", "NIOTransportServices", "Logging"]
+                           "NIOFoundationCompat", "NIOTransportServices", "Logging", "Instrumentation"]
```
Can we right away go with `Tracing` and do the full thing in a single PR?
That's my intention. I've added a checklist to the PR including creating a Span. I first wanted to get the instrumentation part down and then continue with tracing, but all inside this PR.
Force-pushed from 047fbb0 to 87085d9
@swift-server-bot add to whitelist
I'd like to punt this to a side-branch for iterative development if we can.
Sure, sounds like a good approach. I can change the target branch once it's created.
I've opened up the
@ktoso The CI seems to fail because the Baggage repo cannot be cloned through the Git URL. Should we pin Tracing to 0.1.1 here in order to get the fix? (apple/swift-distributed-tracing/pull/25)
No, we need to tag a 0.1.1, I'll do that in a moment.
0.1.1 tagged, please depend on that. Thanks Cory for the development branch, sounds good 👍
Force-pushed from 87085d9 to ae7268d
@swift-server-bot test this please
Can drafts get CI validation? 🤔
Yes, they can: I think the CI isn't targeting that branch at the moment. |
Force-pushed from ae7268d to 329522c
Motivation: Currently, when either we or the server send Connection: close, we correctly do not return that connection to the pool. However, we rely on the server actually performing the connection closure: we never call close() ourselves. This is unnecessarily optimistic: a server may absolutely fail to close this connection. To protect our own file descriptors, we should make sure that any connection we do not return to the pool is closed.
Modifications: If we think a connection is closing when we release it, we now call close() on it defensively.
Result: We no longer leak connections when the server fails to close them. Fixes swift-server#324.
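The defensive-close rule above can be sketched in a few lines of plain Swift. `Connection` and `ConnectionPool` here are simplified, hypothetical stand-ins, not the actual AsyncHTTPClient types:

```swift
// Sketch (hypothetical types, not AsyncHTTPClient internals): any
// connection that is not returned to the pool is closed explicitly,
// rather than trusting the server to close it.
final class Connection {
    private(set) var isClosed = false
    func close() { isClosed = true }
}

final class ConnectionPool {
    private(set) var available: [Connection] = []

    func release(_ connection: Connection, isReusable: Bool) {
        if isReusable {
            // Healthy keep-alive connection goes back into the pool.
            available.append(connection)
        } else {
            // The server *should* close it after `Connection: close`,
            // but we close it ourselves to protect our file descriptors.
            connection.close()
        }
    }
}
```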
Motivation: Flaky tests are bad. This test is flaky because the server closes the connection immediately upon channelActive. In practice this can mean that the handshake never even gets a chance to start: by the time the SSLHandler ends up in the pipeline the connection is already dead. Heck, by the time we attempt to complete the connection, the connection might be dead.
Modifications:
- Change the shutdown to be on first read.
- Remove the disabled autoRead.
- Change the expected NIOTS failure mode to connectTimeout, which is how this manifests in NIOTS.
Result: Test is no longer flaky.
Adding the product dependency to the target by name only produces an error in Xcode 12.4. Instead, the product dependency should be given as a `.product`. Updated the README with the new format, so that new users won't stumble over this.
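The manifest change described above looks roughly like this; the package name, target name, and version below are illustrative, so check the README for the current ones:

```swift
// swift-tools-version:5.3
// Package.swift — declaring the dependency as a `.product` rather than
// by bare name, which Xcode 12.4 rejects. Names/versions are illustrative.
import PackageDescription

let package = Package(
    name: "MyApp",
    dependencies: [
        .package(url: "https://github.com/swift-server/async-http-client.git", from: "1.0.0"),
    ],
    targets: [
        .target(
            name: "MyApp",
            dependencies: [
                // The fix: reference the product explicitly.
                .product(name: "AsyncHTTPClient", package: "async-http-client"),
            ]
        ),
    ]
)
```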
Motivation: When we stream the request body, the current implementation expects that the body will finish streaming _before_ we start to receive response body parts. This is not correct: response body parts can start to arrive before we finish sending the request.
Modifications:
- Simplifies the state machine: we only care about the request being fully sent, to prevent sending body parts after .end; the response state machine is mostly ignored and the correct flow will be handled by the NIOHTTP1 pipeline
- Adds HTTPEchoHandler, which replies to each response body part
- Adds a bi-directional streaming test
Result: Closes swift-server#327
Motivation:
HTTPResponseAggregator attempts to build a single, complete response object. This necessarily means it loads the entire response payload into memory. It wants to provide this payload as a single contiguous buffer of data, and it does so by aggregating the data into a single contiguous buffer as it goes. Because ByteBuffer does exponential reallocation, the cost of doing this should be amortised constant-time, even though we do have to copy some data sometimes.
However, if this operation triggers a copy-on-write then the operation will become quadratic. For large buffers this will rapidly come to dominate the runtime.
Unfortunately, in at least Swift 5.3, Swift cannot safely see that during the body stanza the state variable is dead. Swift is not necessarily wrong about this: there's a cross-module call to ByteBuffer.writeBuffer in place and Swift cannot easily prove that that call will not lead to a re-entrant access of the `HTTPResponseAggregator` object. For this reason, during the call to `didReceiveBodyPart` there will be two copies of the body buffer alive, and so the write will CoW. This quadratic behaviour is a nasty performance trap that can become highly apparent even at quite small body sizes.
Modifications:
While Swift can't prove that the `self.state` variable is dead, we can! To that end, we temporarily set it to a different value that does not store the buffer in question. This will force Swift to drop the ref on the buffer, making it uniquely owned and avoiding the CoW. Sadly, it's extremely difficult to test for "does not CoW", so this patch does not currently come with any tests. I have experimentally verified the behaviour.
Result:
No copy-on-write in the HTTPResponseAggregator during body aggregation.
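The trick described in the modification can be illustrated with a plain Swift enum and Array instead of the real `HTTPResponseAggregator` and ByteBuffer; this is a hand-rolled sketch of the ownership reasoning, not the actual patch:

```swift
// Sketch of the CoW-avoidance trick (hypothetical types). While a
// buffer is stored inside `state` *and* bound to a local variable, its
// storage has two owners, so any write copies it. Clearing `state`
// first makes the local binding uniquely owned, so the append mutates
// in place instead of copying.
enum AggregationState {
    case buffering([UInt8])
    case mutating            // placeholder state that holds no buffer
}

struct Aggregator {
    var state = AggregationState.buffering([])

    mutating func didReceiveBodyPart(_ part: [UInt8]) {
        guard case .buffering(var buffer) = state else { return }
        state = .mutating            // drop the second reference before writing
        buffer.append(contentsOf: part)
        state = .buffering(buffer)   // store the (uniquely owned) buffer back
    }
}
```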
Motivation:
There is an awkward timing window in the TLSEventsHandler flow where it is possible for the NIOSSLClientHandler to fail the handshake on handlerAdded. If this happens, the TLSEventsHandler will not be in the pipeline, and so the handshake failure error will be lost and we'll get a generic one instead. This window can be resolved without performance penalty if we use the new synchronous pipeline operations view to add the two handlers backwards. If this is done then we can ensure that the TLSEventsHandler is always in the pipeline before the NIOSSLClientHandler, and so there is no risk of event loss.
While I'm here, AHC does a lot of pipeline modification. This has led to lengthy future chains with lots of event loop hops for no particularly good reason. I've therefore replaced all pipeline operations with their synchronous counterparts. All but one sequence was happening on the correct event loop, and for the one that may not, I've added a fast-path dispatch that should tolerate being on the wrong one. The result is cleaner, more linear code that also reduces the allocations and event loop hops.
Modifications:
- Use synchronous pipeline operations everywhere
- Change the order of adding TLSEventsHandler and NIOSSLClientHandler
Result:
Faster, safer, fewer timing windows.
Motivation:
AsyncHTTPClient attempts to avoid the problem of Happy Eyeballs making it hard to know which Channel will be returned by only inserting the TLSEventsHandler upon completion of the connect promise. Unfortunately, as this may involve event loop hops, there are some awkward timing windows in play where the connect may complete before this handler gets added. We should remove that timing window by ensuring that all channels always have this handler in place, and instead of trying to wait until we know which Channel will win, we can find the TLSEventsHandler that belongs to the winning channel after the fact.
Modifications:
- TLSEventsHandler no longer removes itself from the pipeline or throws away its promise.
- makeHTTP1Channel now searches for the TLSEventsHandler from the pipeline that was created and is also responsible for removing it.
- Better sanity checking that the proxy TLS case does not overlap with the connection-level TLS case.
Results:
Further shrinking windows for pipeline management issues.
Motivation:
Users of the HTTPClientResponseDelegate expect that the event loop futures returned from didReceiveHead and didReceiveBodyPart can be used to exert backpressure. To be fair to them, they somewhat can. However, the TaskHandler has a bit of a misunderstanding about how NIO backpressure works, and does not correctly manage the buffer of inbound data. The result of this misunderstanding is that multiple calls to didReceiveBodyPart and didReceiveHead can be outstanding at once. This would likely lead to severe bugs in most delegates, as they do not expect it. We should make things work the way delegate implementers believe it works.
Modifications:
- Added a buffer to the TaskHandler to avoid delivering data that the delegate is not ready for.
- Added a new "pending close" state that keeps track of a state where the TaskHandler has received .end but not yet delivered it to the delegate. This allows better error management.
- Added some more tests.
- Documented our backpressure commitments.
Result:
Better respect for backpressure. Resolves swift-server#348
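The buffering behaviour described above can be modelled without NIO: hold incoming parts back until the delegate signals completion for the previous one. This is an illustrative sketch with hypothetical types and callbacks standing in for event loop futures, not the actual `TaskHandler`:

```swift
// Sketch (hypothetical types): deliver at most one body part to the
// delegate at a time; further parts are buffered until the delegate's
// completion callback fires, modelling future-based backpressure.
final class BackpressureBuffer {
    private var pending: [[UInt8]] = []
    private var deliveryInFlight = false
    private let deliver: (_ part: [UInt8], _ done: @escaping () -> Void) -> Void

    init(deliver: @escaping (_ part: [UInt8], _ done: @escaping () -> Void) -> Void) {
        self.deliver = deliver
    }

    func didReceiveBodyPart(_ part: [UInt8]) {
        pending.append(part)
        pump()
    }

    private func pump() {
        guard !deliveryInFlight, !pending.isEmpty else { return }
        deliveryInFlight = true
        let part = pending.removeFirst()
        deliver(part) {
            self.deliveryInFlight = false
            self.pump()   // only now is the next buffered part delivered
        }
    }
}
```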
Force-pushed from 329522c to d68cb8f
Motivation: 5.4 is out!
Changes:
- update Dockerfile handling of rubygems
- add docker compose setup for ubuntu 20.04 and 5.4 toolchain
Motivation: test with nightly toolchain
Changes: add docker compose setup for ubuntu 20.04 and nightly toolchain
Adds support for request-specific TLS configuration: `Request(url: "https://webserver.com", tlsConfiguration: .forClient())`
Motivation:
At the moment, AHC assumes that creating a `NIOSSLContext` is both cheap and doesn't block. Neither of these two assumptions is true. To create a `NIOSSLContext`, BoringSSL will have to read a lot of certificates in the trust store (on disk) which requires a lot of ASN1 parsing and much, much more. On my Ubuntu test machine, creating one `NIOSSLContext` is about 27,000 allocations!!! To make it worse, AHC allocates a fresh `NIOSSLContext` for _every single connection_, whether HTTP or HTTPS. Yes, correct.
Modification:
- Cache NIOSSLContexts per TLSConfiguration in a LRU cache
- Don't get an NIOSSLContext for HTTP (plain text) connections
Result:
New connections should be _much_ faster in general, assuming that you're not using a different TLSConfiguration for every connection.
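A minimal version of the caching idea (reuse one expensive object per configuration, evicting the least recently used entry when the cache is full) might look like this. It is a hand-rolled sketch, not AHC's or NIOSSL's implementation:

```swift
// Sketch of an LRU cache keyed by configuration (hypothetical types).
// `makeValue` stands in for the ~27,000-allocation NIOSSLContext setup:
// it runs once per distinct key while the key stays in the cache.
final class LRUCache<Key: Hashable, Value> {
    private let capacity: Int
    private var storage: [Key: Value] = [:]
    private var order: [Key] = []   // most recently used last

    init(capacity: Int) { self.capacity = capacity }

    func value(for key: Key, makeValue: () -> Value) -> Value {
        if let cached = storage[key] {
            order.removeAll { $0 == key }
            order.append(key)       // refresh recency on a cache hit
            return cached
        }
        if storage.count >= capacity, let oldest = order.first {
            order.removeFirst()
            storage[oldest] = nil   // evict the least recently used entry
        }
        let value = makeValue()
        storage[key] = value
        order.append(key)
        return value
    }
}
```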
(swift-server#350) This PR is a result of another PR, swift-server#321. In that PR I provided an alternative structure to TLSConfiguration for when connecting with Transport Services. In this one I construct the NWProtocolTLS.Options from TLSConfiguration. It does mean a little more work whenever we make a connection, but having spoken to @weissi, he doesn't seem to think that is an issue. Also, there is no method to create a SecIdentity at the moment. We need to generate a pkcs#12 from the certificate chain and private key, which can then be used to create the SecIdentity. This should resolve swift-server#292
(swift-server#368)
Motivation:
In the vast majority of cases, we'll only ever create one and only one `NIOSSLContext`. It's therefore wasteful to keep around a whole thread doing nothing just for that. A `DispatchQueue` is absolutely fine here.
Modification:
Run the `NIOSSLContext` creation on a `DispatchQueue` instead.
Result:
Fewer threads hanging around.
Co-authored-by: Johannes Weiss <[email protected]>
Motivation:
In order to instrument distributed systems, metadata such as trace ids must be propagated across network boundaries. As HTTPClient operates at one such boundary, it should take care of injecting metadata into HTTP headers automatically using the configured instrument.
Modifications:
HTTPClient gains new method overloads accepting LoggingContext.
Result:
- New HTTPClient method overloads accepting LoggingContext
- Existing overloads accepting Logger construct a DefaultLoggingContext
- Existing methods that neither take Logger nor LoggingContext construct a DefaultLoggingContext
Fix building on macOS 12
Motivation:
Context Propagation
In order to instrument distributed systems, metadata such as trace ids
must be propagated across network boundaries through HTTP headers.
As HTTPClient operates at one such boundary, it should take care of
injecting metadata into HTTP headers automatically using the configured
instrument.
Built-in tracing
Furthermore, `HTTPClient` should create a `Span` for executed requests under the hood, so that users benefit from tracing effortlessly.
Modifications:
- New `HTTPClient` method overloads accepting `LoggingContext`
- Create a `Span` for each executed HTTP request
Result:
- New `HTTPClient` method overloads accepting `LoggingContext`
- Existing overloads accepting `Logger` construct a `DefaultLoggingContext`
- Existing methods that neither take `Logger` nor `LoggingContext` construct a `DefaultLoggingContext`