Linux lb 5.15.0-100-generic #110-Ubuntu SMP Wed Feb 7 13:27:48 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Description
Describe the bug in full detail including expected and actual behavior.
Specify conditions that caused it. Provide the relevant part of nginx
configuration and debug log.
The bug is reproducible with the latest version of nginx
The nginx configuration is minimized to the smallest possible
to reproduce the issue and doesn't contain third-party modules
Hello,
We've started killing (sending SIGTERM to) "old" nginx workers (nginx: worker process is shutting down) because we have regular configuration changes and a lot of websocket connections.
Since we started doing this, the counters reported by stub_status have been incorrect.
nginx status
> curl localhost/nginx_status
Active connections: 65369
server accepts handled requests
1042173178 1042173178 5035465167
Reading: 0 Writing: 31968 Waiting: 5356
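The counters in this output can be cross-checked with a short script. The following is a minimal sketch, not part of the original report: the function names `parse_stub_status` and `unaccounted` are hypothetical, and it assumes (as the report does) that for HTTP-only traffic the active count should roughly equal Reading + Writing + Waiting.

```python
import re


def parse_stub_status(text):
    """Parse stub_status output into a dict of integer counters."""
    active = int(re.search(r"Active connections:[ \t]*(\d+)", text).group(1))
    # The line after "server accepts handled requests" holds three totals.
    accepts, handled, requests = map(
        int,
        re.search(r"^[ \t]*(\d+)[ \t]+(\d+)[ \t]+(\d+)[ \t]*$", text, re.M).groups(),
    )
    reading, writing, waiting = map(
        int,
        re.search(
            r"Reading:[ \t]*(\d+)[ \t]+Writing:[ \t]*(\d+)[ \t]+Waiting:[ \t]*(\d+)",
            text,
        ).groups(),
    )
    return {
        "active": active,
        "accepts": accepts,
        "handled": handled,
        "requests": requests,
        "reading": reading,
        "writing": writing,
        "waiting": waiting,
    }


def unaccounted(stats):
    """Active connections not explained by the three per-state counters."""
    return stats["active"] - (
        stats["reading"] + stats["writing"] + stats["waiting"]
    )


# Sample copied from the report above.
sample = """Active connections: 65369
server accepts handled requests
 1042173178 1042173178 5035465167
Reading: 0 Writing: 31968 Waiting: 5356
"""

stats = parse_stub_status(sample)
print(unaccounted(stats))  # 65369 - (0 + 31968 + 5356) = 28045
```

A non-zero result that keeps growing across worker kills would be consistent with the counter drift described in this issue.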
Adding up Writing and Waiting gives only 37324 instead of the 65369 "Active connections". But even 37324 is too high. The correct number should be around this:
Open 2 tabs in the browser with the URL: http://localhost:1234/.ws
Now there are 3 active connections (one of them is the request to /basic_status):
# curl localhost:1234/basic_status
Active connections: 3
server accepts handled requests
4 4 4
Reading: 0 Writing: 3 Waiting: 0
After that, reload nginx and you can see a process titled nginx: worker process is shutting down:
# systemctl reload nginx
# ps aux | grep [n]ginx
root 2233850 0.0 0.0 55372 5652 ? Ss 22:29 0:00 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
nginx 2233851 0.0 0.0 55896 6044 ? S 22:29 0:00 nginx: worker process is shutting down
nginx 2233940 0.0 0.0 55868 5400 ? S 22:30 0:00 nginx: worker process
nginx 2233941 0.0 0.0 55868 5240 ? S 22:30 0:00 nginx: worker process
# kill 2233851
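To see which old workers are still draining, the ps listing above can be filtered in a script. This is only a sketch under stated assumptions: the helper name `shutting_down_workers` is hypothetical, and it assumes the `ps aux` column layout shown above (ten fixed columns followed by the command) and the process title nginx sets for workers that are shutting down.

```python
def shutting_down_workers(ps_output):
    """Return PIDs of nginx workers whose title says they are shutting down."""
    pids = []
    for line in ps_output.splitlines():
        # ps aux prints 10 fixed columns; everything after them is the command.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        command = fields[10]
        if command.startswith("nginx: worker process is shutting down"):
            pids.append(int(fields[1]))
    return pids


# Sample lines copied from the transcript above.
ps_sample = """\
root 2233850 0.0 0.0 55372 5652 ? Ss 22:29 0:00 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
nginx 2233851 0.0 0.0 55896 6044 ? S 22:29 0:00 nginx: worker process is shutting down
nginx 2233940 0.0 0.0 55868 5400 ? S 22:30 0:00 nginx: worker process
nginx 2233941 0.0 0.0 55868 5240 ? S 22:30 0:00 nginx: worker process
"""

print(shutting_down_workers(ps_sample))  # [2233851]
```

The sketch only lists PIDs; it deliberately sends no signals.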
After we kill the old process, the websocket clients in the browser reconnect. stub_status then reports two additional connections, even though the old connections are gone:
# curl localhost:1234/basic_status
Active connections: 5
server accepts handled requests
7 7 7
Reading: 0 Writing: 5 Waiting: 0
Only the worker processes are aware of the connections and can decrement the counters on connection closure. As soon as you terminate the process, all the connection state it held is gone, and we're no longer able to guarantee a valid state of the counters.
Thank you for the feedback. According to the documentation I thought it would be ok:
Individual worker processes can be controlled with signals as well, though it is not required. The supported signals are:
TERM, INT | fast shutdown
[...]
I assumed that sending TERM would behave similarly to what happens when the worker shutdown timeout expires.
The documentation says that SIGTERM and SIGINT invoke fast shutdown but doesn't explain the difference between the "fast" and the "graceful" shutdown procedures.
Graceful shutdown closes the idle connections and waits until the remaining clients are served.
The worker_shutdown_timeout timer runs only during graceful shutdown. When it fires, the worker assumes that it's fine to delay the exit and calls the appropriate close methods for the remaining connections. That includes updating the stats and sending the necessary protocol packets (GOAWAY frame for HTTP/2 and HTTP/3).
Fast shutdown does a very minimal set of things before exiting: it runs exit_process handlers for the configured modules, logs a message and that's it. Notably, that does not include closing the connections, sending any packets or doing anything else that may block or delay the exit. All the remaining cleanup is left to the OS.
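The effect of the two shutdown paths on the counters can be illustrated with a toy model. This is a sketch, not nginx code: the `Counters` and `Worker` classes and the `scenario` function are hypothetical, and the model assumes only what is described above, namely that a worker increments a shared counter on accept, decrements it on close, and that fast shutdown exits without closing anything.

```python
class Counters:
    """Stand-in for the counters that stub_status reports (shared by workers)."""

    def __init__(self):
        self.active = 0


class Worker:
    def __init__(self, counters):
        self.counters = counters
        self.connections = set()

    def accept(self, conn_id):
        self.connections.add(conn_id)
        self.counters.active += 1

    def close(self, conn_id):
        self.connections.remove(conn_id)
        self.counters.active -= 1

    def graceful_timeout(self):
        # Graceful path: every remaining connection is closed properly,
        # so the shared counter is decremented for each one.
        for conn_id in list(self.connections):
            self.close(conn_id)

    def fast_shutdown(self):
        # Fast path: the process exits and the OS drops the sockets;
        # nobody decrements the shared counter.
        self.connections.clear()


def scenario(fast):
    """Replay the reproduction steps and return the final active count."""
    counters = Counters()
    old = Worker(counters)
    old.accept("ws-1")
    old.accept("ws-2")

    new = Worker(counters)  # worker started by the reload
    if fast:
        old.fast_shutdown()
    else:
        old.graceful_timeout()

    # Both browser tabs reconnect, then the status request itself arrives.
    new.accept("ws-1-reconnect")
    new.accept("ws-2-reconnect")
    new.accept("status-request")
    return counters.active


print(scenario(fast=True))   # 5, matching the inflated count in the report
print(scenario(fast=False))  # 3, the count expected for two tabs plus the status request
```

In this model the two stale entries never go away, because the only process that knew about those connections is gone.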
I don't think we want to change this behavior, but clarifying the consequences in the documentation could be useful.
It's reproducible, e.g. using echo.websocket.org.