Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SWIFT-1026 Extended JSON parsing performance improvements #52

Merged

Conversation

patrickfreed
Copy link
Contributor

SWIFT-1026

This PR implements the proposed improvements to Extended JSON parsing in swift-bson as part of SWIFT-993.

The main improvements are as follows:

  • replaced Foundation's JSONDecoder with the much faster XJSONDecoder from swift-extras-json
  • decode specific BSON types based on the input JSON rather trying each successively until one works
  • Recursively build up a single document reusing the same ByteBuffer rather than separate ones for each subdocument
  • replace Foundation's base64 decoding with the much faster swift-extras-base64

Re-running the benchmarks on the spawn host, both the flat BSON and deep BSON ones are within the target time, though the full BSON one exceeds it by a little bit due to code and codewithscope parsing being slow. I figured it probably wasn't worth putting in much effort to improve this, however.

I reran the baseline again to get updated numbers.

Benchmarking results (all in seconds):

benchmark libbson based median time target time (4x) post-optimizations
flat JSON 0.793 3.172 2.513
deep JSON 0.3 1.2 1.159
full JSON 0.789 3.156 3.756


// The `BSON` is then passed through a `BSONDecoder` where it is outputted as a `T`
let bsonDecoder = BSONDecoder()
bsonDecoder.userInfo = self.userInfo
return try bsonDecoder.decode(T.self, fromBSON: bson)
}

/// Decode a `BSON` from the given extended JSON.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the methods here enable us to recursively build up the document without having to allocate many ByteBuffers for each subdocument, which is much faster. The caveat is that it is a lot less natural to do so. For scalars, we can operate pretty similarly to before, so we return those as BSON. (technically we should build up code with scope similarly, but I opted to treat it as a scalar for simplicity). For non-scalars (arrays and documents), we return them from decodeScalar as-is and then do further processing here in decodeBSONFromJSON and recursive calls in descending from appendElement.

debugDescription: "Not a valid JSON type"
))
}
internal struct JSON {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept this wrapper struct around so that we could add the ExpressibleByXLiteral and other conformances to JSONValue, which we get from ExtrasJSON.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this does make me wonder if it's worth upstreaming support for these conformances to swift-extras-json. seems like it would be really straightforward for us to make a quick PR for if they're interested. what do you think?

edit: we already talked about this on Slack

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/// how many bytes it wrote.
///
/// This may be used to build up a fresh document or a subdocument.
internal mutating func buildDocument(_ appendFunc: (inout Self) throws -> Int) throws -> Int {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method allows us build up a document or subdocument without making new ByteBuffers. See the additions and my comment in ExtendedJSONDecoder to see how this is used.

/// `BSONCodeWithScope`). Some wrapper keys are associated with multiple types (e.g. "$code" maps to both
/// `BSONCode` and `BSONCodeWithScope`). Attempt to decode each of the types returned from the map until one works
/// to find the proper decoding.
private static var wrapperKeyMap: [String: [BSONValue.Type]] = {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this allows us to use the input JSON to determine which type(s) to attempt to decode rather than just trying each in succession. Code and CodeWithScope make this a little trickier, hence it not being a 1:1 mapping.

@patrickfreed patrickfreed marked this pull request as ready for review January 8, 2021 02:19
@patrickfreed patrickfreed requested a review from kmahar January 8, 2021 02:19
@patrickfreed
Copy link
Contributor Author

So it turns out the swift-extras-base64 package might not usable for us. The base64 included in the BSON corpus tests doesn't get round-tripped properly, which seems to indicate that package is broken. I filed an issue and once that gets fixed we can re-incorporate it to regain some of those performance improvements.

Copy link
Contributor

@kmahar kmahar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good! and based on the original runtimes I had anyway you've achieved anywhere from 70-285x speedup on all of the benchmarks which is awesome. this is mainly a bunch of nits and questions

Sources/SwiftBSON/JSON.swift Outdated Show resolved Hide resolved
Sources/SwiftBSON/JSON.swift Show resolved Hide resolved
debugDescription: "Not a valid JSON type"
))
}
internal struct JSON {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this does make me wonder if it's worth upstreaming support for these conformances to swift-extras-json. seems like it would be really straightforward for us to make a quick PR for if they're interested. what do you think?

edit: we already talked about this on Slack

Sources/SwiftBSON/String+BSONValue.swift Outdated Show resolved Hide resolved
Sources/SwiftBSON/ExtendedJSONDecoder.swift Outdated Show resolved Hide resolved
Sources/SwiftBSON/ExtendedJSONDecoder.swift Outdated Show resolved Hide resolved
Sources/SwiftBSON/BSONDocument.swift Show resolved Hide resolved
Sources/SwiftBSON/ExtendedJSONDecoder.swift Outdated Show resolved Hide resolved
Tests/SwiftBSONTests/JSONTests.swift Outdated Show resolved Hide resolved
Tests/SwiftBSONTests/JSONTests.swift Show resolved Hide resolved
Copy link
Contributor

@kmahar kmahar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! nice work

@patrickfreed patrickfreed merged commit 70c0ac0 into mongodb:master Jan 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants