Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements to $&here w/r/t forking #160

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

Conversation

jpco
Copy link
Collaborator

@jpco jpco commented Dec 16, 2024

First of all -- rewrite $&here so that it more closely matches the other pipefork()ing primitives in prim-io.c, rather than the REDIR()-using primitives. There are aspects of PRIM(here) which match both sets of primitives (with this PR, $&here is the only pipefork()-style primitive which uses a defer_*() function), but I think that holistically $&here "fits" better with the pipefork() style.

Plus, the pipefork() style makes it more straightforward to actually wait for the forked-off process, which we do now, no longer leaking child processes. This fixes #150.

Lastly, we now compare the doc length to PIPE_BUF and if smaller, we just write the whole doc to the new pipe without forking at all. POSIX demands a PIPE_BUF of at least 512 bytes, and most systems use 4096, but I expect that even the smaller value catches a pretty large majority of heredoc uses in modern use. (Technically PIPE_BUF doesn't even indicate pipe capacity on a system, and some shells like bash at least do an active test at build time to probe the actual capacity, but it makes for a pretty reasonable lower bound and, again, I think this lower bound catches most cases). Not all OSes (e.g., the Hurd) define a PIPE_BUF; in that case we just fall back to always forking.

See the following for the performance impact (in the optimal case, where there are no other forks involved) of removing the forks:

$ time es -c 'echo <={for (i = `{seq 1 10000}) {%read <<< herestring.^$i\n}}'
herestring.10000

real	0m4.671s
user	0m0.091s
sys	0m1.894s
$ time ./es -c 'echo <={for (i = `{seq 1 10000}) {%read <<< herestring.^$i\n}}'
herestring.10000

real	0m0.097s
user	0m0.026s
sys	0m0.071s

Even when writing to a fork/exec'd binary, the difference is fairly significant (over 20% reduction in real time):

[jpco@jpco es-fork]$ time es -c '{for (i = `{seq 1 10000}) {cat <<< herestring.^$i\n}} > /dev/null'

real	0m15.968s
user	0m8.739s
sys	0m6.845s
[jpco@jpco es-fork]$ time ./es -c '{for (i = `{seq 1 10000}) {cat <<< herestring.^$i\n}} > /dev/null'

real	0m11.789s
user	0m7.499s
sys	0m3.977s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

$&here leaks child processes
1 participant