[Feature request] Please add synchronous access to http2 write #2899
Comments
Streams are not inherently asynchronous in the way you suggest. Unfortunately, your strategy for changing this wouldn't work, because It might be enough to call the callback early in
Do you have a self-contained benchmark that demonstrates the problem you are seeing that you can share here?
Indeed, I was confused by this line of
We don't have any callbacks on our write calls; our code looks roughly like this:

```ts
async function writeWidget(widget) {
  // chunk can be ready: actual html
  // chunk can be a delimiter: there is a nested widget to render in that slot
  const chunks = await widget.chunks;

  const writeChunk = async (chunk: WidgetChunk) => {
    if (chunk.ready) {
      httpStream.write(widget, chunk);
      return;
    }
    const {slotName} = chunk;
    // there can be nested widgets inside the one we are processing
    await writeWidget(widget.slots[slotName]);
  };

  for (const chunk of chunks) {
    await writeChunk(chunk);
  }
}
```
Yes, of course. I can guarantee that invariant in my particular application, but my solution is not good enough for the library.
I can't see why. I am not sure if there is a difference between having
Ugh, it takes a little more time than I originally imagined. I was hoping I could do it today, but no =( Still working on it.
As I said, this is about the case where there are interceptors. Interceptors can process messages asynchronously, and they are not required to process different messages consistently. So if an interceptor gets one message and starts some slow asynchronous operation with it, then gets another message and passes it along immediately, that will reorder the messages.
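To illustrate the reordering concern, here is a standalone sketch; it is not the grpc-js interceptor API, and the names are made up:

```ts
// An interceptor-like stage that handles each message independently; if one
// message needs a slow asynchronous step and the next one does not, they reach
// the wire in the wrong order.
type Forward = (msg: string) => void;

function makeStage(forward: Forward) {
  return (msg: string) => {
    const delayMs = msg === 'first' ? 50 : 0; // pretend 'first' needs a slow async lookup
    setTimeout(() => forward(msg), delayMs);
  };
}

const stage = makeStage((msg) => console.log('written:', msg));
stage('first');
stage('second');
// Prints "written: second" before "written: first": an asynchronous stage between
// the handler and the socket can reorder messages, which is why the write path
// cannot assume synchronous, in-order delivery.
```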
@murgatroid99 Thank you for your patience.
I guess I got your point. I may suggest that an interceptor should have a similar
I prepared an example: https://github.com/re-gor/grpc-js-issue. It was kinda tricky; the whole point was to exploit the line I mentioned before to make the method

Long story short, I wrote a simple client and a simple server. The server produces a bunch of chunks; at some point it buffers them, and at that point one chunk gets delayed. This is exactly the behaviour we observed in our application. We were losing quite a bit on the TTFB (Time To First Byte) metric, but a lot on Hero Element and LCP.

And today we got some actual A/B results for my nasty patch. One can see that after the release of the "patch" our grpc experiment shows better speed.

(two screenshots with the A/B metric charts)
Hi! Before I start asking, I have to say that it was my pleasure to read through the code of the grpc-node library. Splendid work, thank you!
Is your feature request related to a problem? Please describe.
tl;dr: We noticed that a synchronous API is significantly faster in our scenario.
We have a grpc-js server with a single BiDi stream method:
Our NodeJS backend receives chunks of data from an API Gateway. Once it gets a chunk, it starts rendering a template.
Once a chunk's rendering is done, the server writes it into the grpc-js response stream. Sequence Diagram of the process, if one is interested.
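For context, a minimal sketch of the shape of such a handler (the message types and the renderTemplate helper are made up for illustration; this is not our actual service):

```ts
import * as grpc from '@grpc/grpc-js';

// Illustrative message shapes, standing in for the real generated types.
type Req = { templateId: string };
type Res = { html: string };

// Stand-in for the real template rendering step.
function renderTemplate(req: Req): Res {
  return { html: `<div>${req.templateId}</div>` };
}

// BiDi streaming handler: render each incoming chunk and write the result back.
function handleChunks(call: grpc.ServerDuplexStream<Req, Res>): void {
  call.on('data', (req: Req) => {
    const rendered = renderTemplate(req);
    call.write(rendered); // the write whose latency is discussed below
  });
  call.on('end', () => call.end());
}
```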
We noticed that this server is losing a lot of client-side speed to our old plain HTTP/1.1 server in an A/B experiment.
I was digging around for about a week or two and concluded that there is only one significant difference: grpc-js uses streams as a proxy to `http2.write`, instead of the good old `Response.write` that our old server uses.

Streams are asynchronous by their nature. Once one calls `write`, the actual writing happens on `nextTick`, as far as I understand. So we end up in a situation where the writing operation stands apart from its corresponding rendering operation. Roughly, something like this happens: there are a couple of new operations before writing which were not present before. In our case there are about 700 chunks and 100 rendering operations (one rendering operation can produce several chunks), and this introduces a delay: about 280ms at the 75th percentile (p75) and 211ms at p50 for Largest Contentful Paint (LCP). Our old website's LCP is about 1000ms (p50) to 1500ms (p75), so losing 20% of speed is a lot. While LCP is generally a good metric for websites, we prefer to rely on Hero Element rendering: we track the arrival time of the most important element. According to this metric, grpc-js loses about 370ms at p75 and 300ms at p50. Again, it is about a 20% degradation =(
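A schematic sketch of the effect described above (the render and socketWrite helpers are stand-ins for our rendering step and the eventual http2 write, not real grpc-js or Node APIs):

```ts
// Pretend this is the template rendering for one chunk.
function render(i: number): string {
  return `chunk-${i}`;
}

// Stand-in for the eventual write to the http2 stream.
function socketWrite(data: string): void {
  console.log('wrote', data);
}

// Stand-in for the asynchronous hop(s) between calling write() and the data
// actually reaching the http2 stream.
function deferredWrite(data: string): void {
  setImmediate(() => socketWrite(data));
}

for (let i = 0; i < 3; i++) {
  const chunk = render(i);
  console.log('rendered', chunk);
  deferredWrite(chunk);
}
// Output: all "rendered" lines first, then all "wrote" lines. With a direct
// synchronous socketWrite(chunk) the two would interleave, chunk by chunk.
```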
Describe alternatives you've considered
I've been trying a lot of things during the last couple of weeks. First of all, using CPU profiles and code listings, I proved to myself and my colleagues that there is no reason for the degradation other than the one I described above.

The second thing I tried was chunk compression using zlib and brotli. It was a failed effort. First of all, zlib does some CPU-intensive work, which delays the moment the chunk is ready. But the most important thing is that zlib uses streams too, so I got even more nextTicks, and the result was actually worse than without compression. So we went back to compression at the balancer.
The last effort was successful, but it is a nasty one: I started using grpc-js private methods. The most important one is `sendMessage` (https://github.com/grpc/grpc-node/blob/master/packages/grpc-js/src/server-interceptors.ts#L837), which synchronously calls `this.stream.write`, which is actually `http2Response.write`. And this effort made a significant difference!
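For the record, the workaround looks roughly like the sketch below. It relies on internal details of grpc-js (the private `call` property and the `sendMessage` callback signature are assumptions on my side and may differ between versions), so it is not something I would suggest anyone copy:

```ts
import type * as grpc from '@grpc/grpc-js';

// Illustrative stand-ins for the generated request/response message types.
type Req = unknown;
type Res = unknown;

// NOT a supported API: reach into the stream's internals and call sendMessage
// directly, so the message is handed to this.stream.write without going through
// the Writable machinery of the duplex stream.
function writeChunkSync(call: grpc.ServerDuplexStream<Req, Res>, message: Res): void {
  const internalCall = (call as any).call; // private field, may break on any upgrade
  internalCall.sendMessage(message, () => {
    // completion callback (assumed signature)
  });
}
```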
We deployed this version to production yesterday (at about 21:00 our local time).
![LCP](https://private-user-images.githubusercontent.com/11293990/410069193-1c296391-4e66-4e92-b8d8-042bce6eced2.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkyOTk1MzMsIm5iZiI6MTczOTI5OTIzMywicGF0aCI6Ii8xMTI5Mzk5MC80MTAwNjkxOTMtMWMyOTYzOTEtNGU2Ni00ZTkyLWI4ZDgtMDQyYmNlNmVjZWQyLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTElMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjExVDE4NDAzM1omWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTQzNDRlN2VmYTEwOTllM2YyNzAwMjZlNjI5ODdjMGFlNzQwMDlmZWM1YTdkNWI2NjZlMmRlMGFlOWEyOTI3MTUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.cPYgOu3awQV4Di1IgqLZAcu-Ua9yn_U1sYbQOYNCOQk)
![Hero element velocity](https://private-user-images.githubusercontent.com/11293990/410069023-2ba996ce-3658-43d1-b8f6-5f6d1a8cc717.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkyOTk1MzMsIm5iZiI6MTczOTI5OTIzMywicGF0aCI6Ii8xMTI5Mzk5MC80MTAwNjkwMjMtMmJhOTk2Y2UtMzY1OC00M2QxLWI4ZjYtNWY2ZDFhOGNjNzE3LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTElMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjExVDE4NDAzM1omWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWYzMzYyNGE4MmE0YmRmM2IxMTBkMThlODU1NjY3MzhkM2YwNGM1M2IwYzc0MTEyNzkyNDVmZTk5ZDNjOGM4NWYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.bsC7eQl5LFf1lHI74CT8KIpEqgFVHE-IdxsATBmI4Ts)
As one can see, we win about 200-300ms on each metric (LCP and Hero Element). Now we are losing only about 50ms against the plain HTTP version, and that is OK: it is not that much, we accept that packing gRPC messages is not free, and we now have one more proxy in front of our server.
Describe the solution you'd like
In general I want a synchronous way of using the http2 response. For example, it could be a new method `writeSync` on the `ServerDuplexStreamImpl<RequestType, ResponseType>` class.
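Purely as an illustration of what I mean, a handler could then look something like this (`writeSync` does not exist today; its name and semantics here are only the proposal, and the rendering step is a stand-in):

```ts
import * as grpc from '@grpc/grpc-js';

type Req = { templateId: string };
type Res = { html: string };

// The proposed method, sketched as an extension of the existing stream type.
type DuplexWithSyncWrite = grpc.ServerDuplexStream<Req, Res> & {
  writeSync(msg: Res): void;
};

function handleChunks(call: DuplexWithSyncWrite): void {
  call.on('data', (req: Req) => {
    const rendered: Res = { html: `<div>${req.templateId}</div>` }; // rendering stand-in
    // Write the message to the underlying http2 response in the same tick it
    // became ready, instead of going through the stream's write machinery.
    call.writeSync(rendered);
  });
  call.on('end', () => call.end());
}
```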
Additional context
¯\_(ツ)_/¯
Thanks for reading all of this! I'm ready for your questions.