When killing the child process, kill all the descendants as well #170

tomsotte · 2019-01-20T19:15:41Z

This is an attempt to fix the problem described by the issue #96.
If a child process is killed, the descendents of that child process won't be killed as well. This happens on Linux but not on Windows ¹.
The adopted solution is the "PID range hack" ² that uses the detached mode for spawning a process and then kills that child process by killing the PID group, using process.kill(-pid), effectively killing all the descendents.

Implementation

added an internal option killByPid as a reminder for the spawned child process that it will be detached and to kill it by PID
expanded and moved to a separate function the routine to kill the spawned process to killSpawned
the ChildProcess#kill method of the spawned child process will be replaced by the killSpawned routine, to kill by pid if necessary
the killSpawned routine signals that the child process has been killed, if it has been killed by the pid

I checked and all the tests pass.

This implementation also considers the issue #115 and shouldn't interfere with the detached/cleanup fix.

This is an attempt to fix the problem described by the issue sindresorhus#96. If a child process is killed, the descendents of that child process won't be killed as well. This happens on Linux but not on Windows [^1]. The adopted solution is the "PID range hack" [^2] that uses the `detached` mode for spawning a process and then kills that child process by killing the PID group, using `process.kill(-pid)`, effectively killing all the descendents. *Implementation* - added an internal option `killByPid` as a remained for the spawned child process that it will be `detached` and to kill it by PID - expanded and moved to a separate function the routine to kill the spawned process to `killSpawned` - the `ChildProcess#kill` method of the spawned child process will be replaced by the `killSpawned` routine, to kill by pid if necessary - the `killSpawned` routine signals that the child process has been killed, if it has been killed by the pid I checked and all the tests pass. This implementation also considers the issue sindresorhus#115 and shouldn't interfere with the detached/cleanup fix. [^1]: https://nodejs.org/api/child_process.html#child_process_subprocess_kill_signal [^2]: https://azimi.me/2014/12/31/kill-child_process-node-js.html

sindresorhus · 2019-03-06T17:03:08Z

index.js

+			spawned._kill(signal);
+		} else {
+			// Kills the spawned process and its descendents using the pid range hack
+			// Source: https://azimi.me/2014/12/31/kill-child_process-node-js.html


From reading the article, it looks like the PID trick only works when using {detached: true} which we cannot assume.

sindresorhus · 2019-03-06T17:03:22Z

This will needs tests.

tomsotte · 2019-03-21T16:39:32Z

The PID hack requires detached: true. My implementation tries to use it subtly but I understand it might not be the best solution.

I'm looking into simonepri/pidtree, a package similar to ps-tree suggested by the issue related to this PR. As written in the README, pidtree should be "faster, safer and provides sub-children results".

The basic implementation that comes to mind should go something like this: if it's Linux then use pidtree to collect all the PIDs related to this child_process and terminate all.
This should account for Windows, which does not need this kind treatment, since by default it should kill all descendants.
I'll try to see if macOS has the problem, but I won't be able to test on it.

tomsotte · 2019-03-22T13:32:58Z

Which one of these implementation is better?

Enhancing the kill method to kill all the descendants; the method will retain compatibility with child_process#kill and keep its sync nature; it would be more consistent with Windows and so execa would be more cross-platform; this implementation requires a function to gather all the descendants PIDs in a sync way, but could kill them later on (to not impact on performance);
Adding a new method killTree that would ensure killing all the descendants; execa would remain retro-compatible, cancel and kill methods wouldn't be touched; if implemented, should cancel use this method to ensure killing all children, like it would happen on Windows?
Changing the kill method to kill all the descendants; it would ditch it's compatibility with child_process#kill; I can implement it this way right now but this probably would require re-adapting some tests.

For the solution 1. pidtree and ps-tree are async by callbacks, which means this would require a new library to gather all the children PIDs in a sync way. Otherwise, if we kill the spawned process before collecting the children PIDs they would change their parent PID. All the impl. to gather the children PIDs use the parent PID, but if it changes it would be impossible to find them.

sindresorhus · 2019-03-25T03:49:53Z

Number 2, but .kill() and .cancel() should also kill descendants if the killTree option is set. The killTree option should mean that no matter how the process is killed, the descendants should also be killed.

sindresorhus · 2019-03-25T03:50:11Z

@ehmicky Any opinions on this?

ehmicky · 2019-03-25T10:39:37Z

I am wondering if the original problem was framed correctly because of two reasons:
a) Despite its name, kill() can also be used for IPC/messaging, not only terminating processes.
b) Some of the discussion might relate to failure termination, not only explicit signals termination / kill(). For example when a process is terminated because of some syntax error or segmentation fault: should child processes (even detached) be terminated as well or not? This question seems related to me.

I.e. for me the question is about failure termination, either external (signals / kill()) or internal (process abort, etc.). In that case what we would need to do instead would be killing descendants if the process failed, which would happen before makeError().

tomsotte · 2019-03-25T18:21:50Z

I am wondering if the original problem was framed correctly because of two reasons:
a) Despite its name, kill() can also be used for IPC/messaging, not only terminating processes.

True, so it would be better to leave that method alone and add a new one.

b) Some of the discussion might relate to failure termination, not only explicit signals termination / kill(). For example when a process is terminated because of some syntax error or segmentation fault: should child processes (even detached) be terminated as well or not? This question seems related to me.

By a quick test I've made on Linux, to sum up:

a process terminated because of syntax errors doesn't leave orphan child processes, even if they are ran detached
process.exit() or uncaught exception after a timeout do leave orphan child processes
also, ChildProcess#kill() do not terminate its descendants (but it does on Windows)

I.e. for me the question is about failure termination, either external (signals / kill()) or internal (process abort, etc.). In that case what we would need to do instead would be killing descendants if the process failed, which would happen before makeError().

On a practical note, on Linux, after a process has failed it leaves orphaned processes which can't be tracked as their PPID (which is the process PID) changes. We can't ensure killing the descendants once a process has failed.

I'm also wondering if this problem could and should be addressed by execa or not. There're other libs out there that already deals with killing the child processes tree. I believe that, in the form of a new method killTree, it is an enhancement if we treat it as "terminate the process and all it's descendants", but with other signals things could get complicated.

As a spec to follow we could compare what Windows and what Linux do and don't do well in terms of sending signals and terminating the process and it's descedants.

ehmicky · 2019-05-13T14:18:21Z

On a practical note, on Linux, after a process has failed it leaves orphaned processes which can't be tracked as their PPID (which is the process PID) changes. We can't ensure killing the descendants once a process has failed.

Yes you're right. I just checked it on Ubuntu and when a process is terminated, its still-running child processes become orphan, i.e. their ppid switch from the PID of the original parent process to the PID of /lib/systemd/systemd --user.

It does not seem to be possible to know the original parent after it has exited. Which means it can only be done while the parent is still running, which we cannot guarantee on signal termination like SIGKILL. So ideally we would make the cleanup option deep, but it seems like we cannot.

Based on that I agree with @tomsotte that the best thing to do would be to limit this PR to an explicit method (like killTree([signal])) that looks up descendants by PID/PPID and send a signal to all of them. What do you think @tomsotte?

Note: it would be nice also if that method could re-use the extra kill() options added by #208 (not merged yet).

tomsotte · 2019-05-13T16:27:59Z

For the killTree method implementation I'd use node-tree-kill under the hood. It seemed a good candidate when I researched a while ago. I'll assess this later on.

As @sindresorhus said:

.kill() and .cancel() should also kill descendants if the killTree option is set. The killTree option should mean that no matter how the process is killed, the descendants should also be killed.

What do we do about those two other ways of killing the process? Should they remain intact? Should they honour the option killTree?

Note: it would be nice also if that method could re-use the extra kill() options added by #208 (not merged yet).

Yes, I might look into adding this as well.

ehmicky · 2019-05-13T16:55:30Z

About using an option: I think we should either use an option with kill() and cancel(), or use a separate killTree() method, but not both, as it complicates the API (for the users) and the implementation (for us). Either works for me.

tomsotte · 2019-05-15T11:52:06Z

You're right on that.
I've implemented the killTree method with node-tree-kill and made a simple test as well. I'll wait for #208 to be finalized before trying to add the retry options.

I'm also thinking that this addition will require a different library, something like pidtree. This is because I need to collect all the PIDs in the tree, because the parent process may have been killed before its descendants and I wouldn't be able to find them again via the parent PID.

ehmicky · 2019-05-15T12:04:24Z

Instead of having two different strategies:

kill processes by PPID (with node-tree-kill) when no retry option is used.
find processes by PPID (with pidtree) then kill them by PID (with process.kill()) when retry option is used.

Wouldn't it be simpler to use only 2), whether retry is used or not?

node-tree-kill actually uses that second strategy under the hood for Mac and Linux. On Windows it uses taskkill /pid PID /T /F though, which is faster since it uses a single call, but I'm wondering if it's worth the extra implementation complexity?

tomsotte · 2019-05-15T16:42:53Z

Correct, it would be best to just use the 2) implementation as you said.

The method might use taskkill with the tree option for Windows, while on Linux and Mac it would run the whole tree of descendants processes and process.kill one by one.
This should cover the case where the parent is dead but some descendant is still alive, provided that we collected the PIDs tree of the parent process.

I was also looking at fkill but it spawns the kill process for each PID to kill. I'm worried about performance of spawning multiple processes and I'm not sure if it's worth using it for a whole tree of processes to kill, instead of using a more low-level process.kill.
I'll try to find some references to this and maybe find a way to benchmark it. In the meantime a first implementation might just use what I said above.

ehmicky · 2019-05-15T16:57:13Z

I agree. I actually thought of fkill as well, but I also think spawning a child process for each kill is... an overkill :)

ehmicky · 2019-06-18T14:44:28Z

I've been thinking about this PR and I actually think the following implementation might work better, let me know what you think:

const childProcess = execa('echo');
childProcess.kill('SIGTERM', {deep: true});

childProcess.kill('SIGTERM', {deep: true, forceKillAfterTimeout: 5000});

deep (default: false) would call process.kill(signal) on all descendants.

When it comes to finding all descendants, I am actually wondering whether pidtree is a good idea. The library looks good but it does not look maintained anymore, which is a problem.

I am also starting to think that this feature belongs to a separate repository. Or sending a PR to any repository that already does something similar. This could be useful outside of execa and is more about signal sending than process execution.

tomsotte · 2019-06-20T18:59:24Z

I am also starting to think that this feature belongs to a separate repository. Or sending a PR to any repository that already does something similar. This could be useful outside of execa and is more about signal sending than process execution.

The very reason I've started to look into this issue, that this PR is trying to fix, was that I was upset that on Windows the ChildProcess kill method would kill all the descendants processes and on *nix it wasn't.
On that note, yes, I agree that kill would have looked better than having a new killTree method, but I also think that it's best if kill stays in sync with the original ChildProcess implementation.

I agree with you that execa may not be the best place to add this feature.
I'll try to improve the PR on fkill instead.

EDIT: If you feel the same we can close this PR. Thanks in advance for your time spent on this and as always I've appreciated your mentoring and your insights :)

ehmicky · 2019-06-20T19:49:19Z

Yes I agree with you. I suggested using a deep option for kill() because we are already monkey-patching it and adding a options object (to force kill the child process after 5 seconds). However a separate killTree() method works as well for me, if you think it fits better. We just need to make sure the "force killing" feature can be combined with the "kill all descendants" feature.

I also agree that this is a useful feature but I do think the following might be better:

implementing in a different library (such as fkill). Hopefully without spawning one child process for each child process to kill.
then opening a new PR to integrate that library with execa.

What do you think?

sindresorhus · 2019-06-20T20:16:19Z

execa is a heavy enough dependency as it is. I don’t think we should add any more large dependencies here.

ehmicky · 2019-06-20T20:40:27Z

@sindresorhus do you think we should implement this feature directly inside execa? It might involve quite some logic and will also most likely bring some dependencies (such as finding processes by PPID).

A third solution would be to separate the logic into a new library for this specific purpose (hence smaller).

sindresorhus · 2019-06-23T13:05:19Z

I think we should open an issue on Node.js about providing this built-in.

sindresorhus · 2019-10-08T11:05:30Z

Closing per previous discussion. Would be great if someone could open that Node.js issue and comment a link to it here.

sindresorhus reviewed Mar 6, 2019

View reviewed changes

tomsotte added 2 commits March 10, 2019 18:28

Merge upstream/master

a82d9da

Merge remote-tracking branch 'upstream/master' into iss96

1725e3b

ehmicky changed the title ~~When killing the child process, kill all the descendents as well. Closes #96~~ When killing the child process, kill all the descendants as well. Closes #96 Mar 14, 2019

tomsotte changed the title ~~When killing the child process, kill all the descendants as well. Closes #96~~ When killing the child process, kill all the descendants as well Mar 21, 2019

tomsotte added 5 commits March 21, 2019 18:43

Merge remote-tracking branch 'upstream/master' into iss96

2b0cecd

Cleaned code and minimized changes

6aa5254

Refactored enhanced kill more consistently

c73e754

Added test, should fail on linux but pass on windows

d116af8

Fixed code styling

278fa7d

tomsotte mentioned this pull request May 15, 2019

Add feature to retry process killing after specified duration #208

Merged

sindresorhus closed this Oct 8, 2019

sindresorhus mentioned this pull request Oct 8, 2019

Ability to kill all descendents of the child process #96

Open

jely2002 mentioned this pull request Nov 14, 2021

yt-dlp process not exiting when stopping a download jely2002/youtube-dl-gui#237

Open

rarkins mentioned this pull request Jun 23, 2022

Interrupting npm install child process due to timeouts leaves orphaned npm processes renovatebot/renovate#16197

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

When killing the child process, kill all the descendants as well #170

When killing the child process, kill all the descendants as well #170

tomsotte commented Jan 20, 2019 •

edited

Loading

sindresorhus Mar 6, 2019

sindresorhus commented Mar 6, 2019

tomsotte commented Mar 21, 2019 •

edited

Loading

tomsotte commented Mar 22, 2019

sindresorhus commented Mar 25, 2019

sindresorhus commented Mar 25, 2019

ehmicky commented Mar 25, 2019 •

edited

Loading

tomsotte commented Mar 25, 2019 •

edited

Loading

ehmicky commented May 13, 2019 •

edited

Loading

tomsotte commented May 13, 2019 •

edited

Loading

ehmicky commented May 13, 2019

tomsotte commented May 15, 2019

ehmicky commented May 15, 2019 •

edited

Loading

tomsotte commented May 15, 2019

ehmicky commented May 15, 2019 •

edited

Loading

ehmicky commented Jun 18, 2019 •

edited

Loading

tomsotte commented Jun 20, 2019 •

edited

Loading

ehmicky commented Jun 20, 2019 •

edited

Loading

sindresorhus commented Jun 20, 2019

ehmicky commented Jun 20, 2019 •

edited

Loading

sindresorhus commented Jun 23, 2019

sindresorhus commented Oct 8, 2019

When killing the child process, kill all the descendants as well #170

When killing the child process, kill all the descendants as well #170

Conversation

tomsotte commented Jan 20, 2019 • edited Loading

Footnotes

sindresorhus Mar 6, 2019

Choose a reason for hiding this comment

sindresorhus commented Mar 6, 2019

tomsotte commented Mar 21, 2019 • edited Loading

tomsotte commented Mar 22, 2019

sindresorhus commented Mar 25, 2019

sindresorhus commented Mar 25, 2019

ehmicky commented Mar 25, 2019 • edited Loading

tomsotte commented Mar 25, 2019 • edited Loading

ehmicky commented May 13, 2019 • edited Loading

tomsotte commented May 13, 2019 • edited Loading

ehmicky commented May 13, 2019

tomsotte commented May 15, 2019

ehmicky commented May 15, 2019 • edited Loading

tomsotte commented May 15, 2019

ehmicky commented May 15, 2019 • edited Loading

ehmicky commented Jun 18, 2019 • edited Loading

tomsotte commented Jun 20, 2019 • edited Loading

ehmicky commented Jun 20, 2019 • edited Loading

sindresorhus commented Jun 20, 2019

ehmicky commented Jun 20, 2019 • edited Loading

sindresorhus commented Jun 23, 2019

sindresorhus commented Oct 8, 2019

tomsotte commented Jan 20, 2019 •

edited

Loading

tomsotte commented Mar 21, 2019 •

edited

Loading

ehmicky commented Mar 25, 2019 •

edited

Loading

tomsotte commented Mar 25, 2019 •

edited

Loading

ehmicky commented May 13, 2019 •

edited

Loading

tomsotte commented May 13, 2019 •

edited

Loading

ehmicky commented May 15, 2019 •

edited

Loading

ehmicky commented May 15, 2019 •

edited

Loading

ehmicky commented Jun 18, 2019 •

edited

Loading

tomsotte commented Jun 20, 2019 •

edited

Loading

ehmicky commented Jun 20, 2019 •

edited

Loading

ehmicky commented Jun 20, 2019 •

edited

Loading