add Utf8JsonTransformer #2239

Merged
merged 18 commits into from
Jun 15, 2024
Conversation

@knaeckeKami knaeckeKami commented Jun 8, 2024

New Pull Request Checklist

  • I have searched for a similar pull request in the project and found none
  • I have read the Documentation
  • I have updated this branch with the latest main branch to avoid conflicts (via merge from master or rebase)
  • I have added the required tests to prove the fix/feature I'm adding
  • I have updated the documentation (if necessary)
  • I have run the tests without failures
  • I have updated the CHANGELOG.md in the corresponding package

Additional context and info (if any)

fixes #2238

This PR adds the Utf8JsonTransformer.

I made sure that it stays as close to BackgroundTransformer as possible.
The only behavior change is that it checks the Content-Length header to decide whether to switch to a background isolate for decoding.
If the Content-Length header is not set or not valid, it falls back to counting the bytes in the response.

This is because it's slightly faster to feed the response stream directly into the decoder without consolidating all the response bytes into a single Uint8List first. In case the server does not send a valid Content-Length header, the transformer will check the actual bytes from the response, as the BackgroundTransformer does.
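To illustrate the streaming path being described (a hedged sketch, not the PR's actual code — the function name and stream shape are assumptions), Dart's fused UTF-8/JSON converter can consume the byte stream directly:

```dart
import 'dart:convert';
import 'dart:typed_data';

/// Decodes a UTF-8 encoded JSON byte stream without first consolidating
/// it into a single Uint8List. Fusing utf8.decoder with json.decoder
/// also skips materializing an intermediate String.
Future<Object?> decodeJsonStream(Stream<Uint8List> byteStream) {
  // The fused converter emits exactly one decoded object per stream.
  return utf8.decoder.fuse(json.decoder).bind(byteStream).single;
}
```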

The remaining question is if we should set the Transformer as default.
IMO it would make sense to change the default transformer to

  /// The default [Transformer] that transfers requests and responses
  /// into corresponding content to send.
  @override
  Transformer transformer = Utf8JsonTransformer(contentLengthIsolateThreshold: 50 * 1024);

so all users get the improved performance by default.

@knaeckeKami knaeckeKami requested a review from a team as a code owner June 8, 2024 15:43
@ueman (Contributor) commented Jun 8, 2024

https://github.com/knaeckeKami/dio_benchmark

Resolved review threads (outdated):
dio/lib/src/transformer.dart
dio/lib/src/transformers/utf8_json_transformer.dart
    import 'package:dio/src/compute/compute.dart';

    /// A [Transformer] that has a fast path for decoding utf8-encoded JSON.
    /// If the response is utf8-encoded JSON and no custom decoder for a Request is specified, this transformer
Member:
What is the Request referencing here?

Contributor Author:

The RequestOptions. If a custom decoder is set in the RequestOptions, we respect that, but in that case we cannot use the fused decoder.
I changed the comments to clarify that.

dio/lib/src/transformers/utf8_json_transformer.dart (resolved review thread, outdated)
    /// By default, this transformer will transform responses in the main isolate,
    /// but a custom threshold can be set to switch to an isolate for large responses by passing
    /// [contentLengthIsolateThreshold].
    class Utf8JsonTransformer extends Transformer {
Member:
How about FusedTransformer?

Contributor Author:
Fine by me, changed the name.

    final shouldUseIsolate = !(contentLengthIsolateThreshold < 0) &&
        contentLength >= contentLengthIsolateThreshold;
    if (shouldUseIsolate) {
      // we can't send the stream to the isolate, so we need to decode the response bytes first
Member:
Would that lead to a memory issue compared to the previous transformer?

Contributor Author (@knaeckeKami, Jun 9, 2024):
No, the previous transformers always decode into a single Uint8List first (and in a horribly slow way: this copies byte-by-byte, potentially even creating wrapper objects for each byte).
SyncTransformer:

    final chunks = await responseBody.stream.toList();
    final responseBytes = Uint8List.fromList(chunks.expand((c) => c).toList());

It's an optimization in this Transformer that it has a path where it can avoid that.

    } else {
      if (!hasContentLengthHeader || contentLength == 0) {
        responseBytes ??= await _consolidateStream(responseBody.stream);
        // if the response is empty, return null, since the decoder would throw
Member:
Could you explain more on this case with examples? Also, comments should start with a capital letter and end with a dot:

Suggested change
// if the response is empty, return null, since the decoder would throw
// Returns `null` to let the decoder throws when the response is empty.

Contributor Author (@knaeckeKami, Jun 9, 2024):
I added a more detailed explanation as a comment.
This branch exists for backwards compatibility with the other transformers.

The old behavior was that if a server responds with a Content-Type: application/json header but an empty response body, the transformer would return null.
The fused decoder, however, would throw an exception in this case.

So, if we don't have a valid Content-Length header, we first check whether the body is empty before converting the bytes to JSON. If it's empty, we manually return null.
If we do have a Content-Length header which is > 0, we can feed the response bytes directly into the decoder, which is the most performant path.
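The fallback described above can be sketched like this (illustrative names, assuming a fused utf8/json decoder; not the PR's exact code):

```dart
import 'dart:convert';
import 'dart:typed_data';

final _fusedDecoder = utf8.decoder.fuse(json.decoder);

/// Backwards-compatible decoding: an empty body yields null instead of
/// the FormatException the fused decoder would throw on empty input.
Object? decodeConsolidatedBytes(Uint8List bytes) {
  if (bytes.isEmpty) {
    return null; // matches the old transformers' behavior for empty bodies
  }
  return _fusedDecoder.convert(bytes);
}
```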

@Reprevise (Contributor):

Can you add a jsonDecodeCallback parameter to this new transformer? I have a long-running isolate whose purpose is just to decode JSON, whereas the internal compute function spawns a new isolate per request.

@ueman (Contributor) commented Jun 13, 2024

Can you add a jsonDecodeCallback parameter to this new transformer? I have a long running isolate who's purpose is just to decode JSON whereas the internal compute function spawns a new isolate per request.

That doesn't really make sense for this specific transformer as far as I can tell. It's an interesting idea though. What benefits do you get by doing it?

@knaeckeKami (Contributor Author) commented Jun 13, 2024

We could allow passing in a jsonDecodeCallback, but to leverage the fused decoder it would need to operate on a Uint8List, not on a String like the one in BackgroundTransformer.

However, we should measure whether that even makes sense in current versions of Dart.
Spawning a new isolate got a lot cheaper recently, and an advantage of using compute() is that it can use Isolate.exit to return the decoded JSON to the calling isolate, which avoids a copy (at least on recent versions of Dart, which have isolate groups).

see https://api.dart.dev/stable/3.4.4/dart-isolate/Isolate/exit.html

If the port is a native port -- one provided by ReceivePort.sendPort or RawReceivePort.sendPort -- the system may be able to send this final message more efficiently than normal port communication between live isolates. In these cases this final message object graph will be reassigned to the receiving isolate without copying. Further, the receiving isolate will in most cases be able to receive the message in constant time.

So the question is, is spawning a new Isolate more expensive than copying the decoded map back into the main isolate?
I don't know, we could benchmark it.
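As a sketch of the Isolate.exit mechanism being discussed (hedged: Isolate.run is a dart:isolate convenience that spawns a short-lived isolate and returns its result via Isolate.exit; this is not dio's implementation):

```dart
import 'dart:convert';
import 'dart:isolate';
import 'dart:typed_data';

/// Decodes UTF-8 encoded JSON bytes in a short-lived isolate.
/// Because the isolate terminates via Isolate.exit, the decoded object
/// graph is handed back to the caller without copying (isolate groups).
Future<Object?> decodeInFreshIsolate(Uint8List bytes) {
  return Isolate.run(() => utf8.decoder.fuse(json.decoder).convert(bytes));
}
```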

@knaeckeKami (Contributor Author) commented Jun 13, 2024

FWIW, I found that a large portion of the performance improvement actually comes from the more efficient consolidation of the Stream<Uint8List> into a single Uint8List.

The SyncTransformer and the extending BackgroundTransformer do

    Uint8List.fromList(chunks.expand((c) => c).toList());

which copies every byte sequentially, probably also creating wrapper objects for each byte.

Using a ByteBuffer instead is an optimization that could also be integrated into SyncTransformer without breaking changes.
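A minimal sketch of such a consolidation, using BytesBuilder (an assumption for illustration — the PR's _consolidateStream may differ in detail):

```dart
import 'dart:async';
import 'dart:typed_data';

/// Consolidates a byte stream into a single Uint8List by appending whole
/// chunks, instead of expanding and copying the stream byte-by-byte.
Future<Uint8List> consolidateBytes(Stream<List<int>> stream) async {
  final builder = BytesBuilder(copy: false); // takes ownership of chunks
  await for (final chunk in stream) {
    builder.add(chunk);
  }
  return builder.takeBytes();
}
```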

@Reprevise (Contributor):

So the question is, is spawning a new Isolate more expensive than copying the decoded map back into the main isolate?
I don't know, we could benchmark it.

I'm just interested in the best performance possible. In my project, the size cutoff is 50 KB: anything larger gets sent to the long-running isolate for parsing. If we find spawning many isolates is better, then I'll concede.

@Reprevise (Contributor):

FWIW, I found that a large portion of the performance improvement actually comes from the more efficient consolidating of the Stream<Uint8List> to a single Uint8List

Sounds like a free win. Maybe a separate PR is in order? I think this PR should move forward as well, though; I think the performance benefits of using the fused decoder are worth it.

@knaeckeKami (Contributor Author):

Sounds like a free win. Maybe a separate PR is in order? I think this PR should move forward as well though, I think the performance benefits of using the fused decoder is worth it.

Yeah, when this is merged I'll prepare a new one that extracts the _consolidateStream function and makes SyncTransformer use it as well.

@knaeckeKami (Contributor Author):

I'm just interested in the best performance possible. In my project, the size cutoff is 50KB. Any larger than that and it gets sent to the long-running Isolate for parsing. If we find spawning many isolates is better, then I'll concede.

The question is: for which aspect do we want to optimize?
We get the least amount of CPU usage overall by decoding the JSON in the main isolate, because no data needs to be copied between isolates and no new isolates need to be spawned.

The least amount of CPU usage on the main isolate is likely achieved by spawning a new isolate each time, as the spawning of the new isolate happens on another thread and the decoded JSON does not need to be copied back into the main isolate.

With a long-running isolate, the question is whether first consolidating the byte stream on the main isolate, then copying it to another isolate, and then copying the decoded JSON map back into the main isolate is really less CPU-intensive than just decoding it in the main isolate in the first place.

But taking the https://github.com/knaeckeKami/dio_benchmark repo and adapting it to show how a long running isolate performs might be a good idea.

@ueman (Contributor) left a review:
LGTM. This is awesome. Thank you!

Code Coverage Report: Only Changed Files listed

Package                                           Base Coverage   New Coverage   Difference
dio/lib/src/compute/compute_io.dart               🔴 0%           🟠 69.81%      🟢 69.81%
dio/lib/src/transformer.dart                      🟢 100%         🟢 84%         🔴 -16%
dio/lib/src/transformers/fused_transformer.dart   🔴 0%           🟢 85.71%      🟢 85.71%
dio/lib/src/transformers/sync_transformer.dart    🟢 86.67%       🟢 100%        🟢 13.33%
Overall Coverage                                  🟢 80.9%        🟢 82.46%      🟢 1.56%

Minimum allowed coverage is 0%, this run produced 82.46%

@ueman ueman merged commit 4d310a2 into cfug:main Jun 15, 2024
3 checks passed
ueman pushed a commit that referenced this pull request Jun 16, 2024
…dTranformer (#2248)

Follow up to #2239:

The more efficient Stream<Uint8List> -> Uint8List conversion can be
backported to `SyncTransformer` & the extending `BackgroundTransformer`
@AlexV525 (Member):

The remaining question is if we should set the Transformer as default. IMO it would make sense to change the default transformer to

  /// The default [Transformer] that transfers requests and responses
  /// into corresponding content to send.
  @override
  Transformer transformer = Utf8JsonTransformer(contentLengthIsolateThreshold: 50 * 1024);

so all users get the improved performance by default.

We are not doing this currently, right? IMO Content-Length might not be specified in regular requests, so consolidating bytes will still have its place. If there are no drawbacks then we can definitely move on with the new one.

@ueman (Contributor) commented Jun 17, 2024

Exactly, that's not done yet.

@knaeckeKami (Contributor Author) commented Jun 17, 2024

We are not doing this currently, right? IMO Content-Length might not be specified in regular requests, so consolidating bytes will still have its place. If there are no drawbacks then we can definitely move on with the new one.

There are no drawbacks compared to the old Transformer, as the old transformer does the same unconditionally.
Also, I think in practice pretty much any server that returns JSON also sets Content-Length anyway.

Yes, I'll prepare a new PR for setting the default transformer.


Successfully merging this pull request may close these issues.

Opportunity to improve the json decoding performance