Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nginx worker goes to 100% cpu utilization with njs_stream #413

Closed
shankerwangmiao opened this issue Aug 11, 2021 · 4 comments
Closed

nginx worker goes to 100% cpu utilization with njs_stream #413

shankerwangmiao opened this issue Aug 11, 2021 · 4 comments
Assignees
Labels
Milestone

Comments

@shankerwangmiao
Copy link
Contributor

shankerwangmiao commented Aug 11, 2021

When using nginx 1.18.0 with njs 0.6.0 on a high traffic server, it can be spotted some of the nginx worker processes are utilizing 100% of cpu time.

Using lsof, tcpdump, and strace, it can be verified that the data stream is running through. Using perf to get samples from the process in question and analyze hot spot, I can find that it takes the worker 99.98% of cpu time to reach the end of the linked list *busy in https://github.com/nginx/nginx/blob/release-1.18.0/src/core/ngx_buf.c#L195 .

I analyze the core image of the worker process and confirmed that the *busy chain is really really long, piled up with buffers with the tag ngx_stream_proxy_module. Some of the chain objects are referring the same buf object, and some of the buf objects are within the range of session->upstream->upstream_buf and the others session->upstream-> downstream_buf.

I believe this issue is because in njs_stream module, buf objects from both directions are mixed and appended into the same *busy chain. Suppose ngx_stream_js_body_filter is handling a chain of one buf object from upstream and a chain of two buf objects from downstream and the first buf is not sent entirely and moved to *busy chain. If the two following buf objects from downstream are fully sent, they will still be appended to the *busy chain, and not moved to the *free chain because there is one chain containing the first buf, which is not “empty”. However, when the control flow returns to ngx_stream_proxy_process, things become different: busy chain is separate for the two direction and the two buf objects will be put into u->free and be reused. Considering u->free chain is last-in-first-out, these objects will be soon reused. Remember now there are still references from the *busy chain in ngx_stream_js_body_filter to these objects. The chain objects there may have no chance to be freed and begin to pile up.

The root problem is that modules in the chain of buf processing should have the same view whether the buf object referred by a chain is busy or not. Since the *busy chain is separated in ngx_stream_proxy_module, we should also separate them in ngx_stream_js_module.

Since applying the following patch on our production environment, the symptom has not happened again.

From 7d0a302c591ac64642bae578b0eb9e4f63b83611 Mon Sep 17 00:00:00 2001
From: Miao Wang <[email protected]>
Date: Wed, 11 Aug 2021 11:44:12 +0800
Subject: [PATCH] stream: seperate busy chain for two directions

---
 nginx/ngx_stream_js_module.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/nginx/ngx_stream_js_module.c b/nginx/ngx_stream_js_module.c
index e7b29a1..24054e9 100644
--- a/nginx/ngx_stream_js_module.c
+++ b/nginx/ngx_stream_js_module.c
@@ -49,7 +49,7 @@ typedef struct {
     ngx_buf_t              *buf;
     ngx_chain_t           **last_out;
     ngx_chain_t            *free;
-    ngx_chain_t            *busy;
+    ngx_chain_t            *busy[2];
     ngx_int_t               status;
 #define NGX_JS_EVENT_UPLOAD   0
 #define NGX_JS_EVENT_DOWNLOAD 1
@@ -611,7 +611,8 @@ ngx_stream_js_body_filter(ngx_stream_session_t *s, ngx_chain_t *in,
     if (out != NULL || dst == NULL || dst->buffered) {
         rc = ngx_stream_next_filter(s, out, from_upstream);
 
-        ngx_chain_update_chains(c->pool, &ctx->free, &ctx->busy, &out,
+        ngx_chain_update_chains(c->pool, &ctx->free,
+                                &ctx->busy[!!from_upstream], &out,
                                 (ngx_buf_tag_t) &ngx_stream_js_module);
 
     } else {
-- 
2.20.1
@xeioex xeioex added the bug label Aug 11, 2021
@xeioex
Copy link
Contributor

xeioex commented Aug 11, 2021

@shankerwangmiao

Thanks for the report with the detailed root cause tracing, will take a look.

@shankerwangmiao
Copy link
Contributor Author

shankerwangmiao commented Aug 11, 2021

I have not find a way to reliably reproduce the problem in a test environment. May be the condition is assigning an empty js function to js_filter, which does not install any event handlers, so buf objects from both directions can reach the else clause of if(event->ev != NULL) in ngx_stream_js_body_filter, where those buf objects are directly referred by created chain objects which are then appended into out and handled by ngx_stream_top_filter and then by ngx_chain_update_chains.

@xeioex xeioex self-assigned this Aug 11, 2021
@xeioex xeioex added this to the 0.6.2 milestone Aug 12, 2021
@xeioex
Copy link
Contributor

xeioex commented Aug 12, 2021

@shankerwangmiao thanks, committed.

@shankerwangmiao
Copy link
Contributor Author

BTW, I wonder if you have any idea on my patch in #392 ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants