feat: implement BASH_COMPLETION_FINALIZE_HOOKS #739

Draft
wants to merge 6 commits into
base: main
from

Conversation

akinomyoga
Collaborator

@akinomyoga akinomyoga commented Apr 12, 2022

This is an experimental implementation of hooks mentioned in #720 (comment).

This is still a stub: the design needs careful consideration, tests and documentation need to be added, etc. Nevertheless, I'd like to hear your thoughts on the current state.

  • 05b4159 This commit takes care of everything related to RETURN handling that I am currently aware of. I initially thought proper handling might be more complicated, but it ended up in this form.
  • bc6f87d This implements a function _comp_array_filter. I'm not sure whether Ville would find it reasonable to include the function in bash-completion, but I tried to keep the interface minimal (one function, _comp_array_filter) yet make the function versatile, as @calestyo would likely request. In any case, this function can be used for general array manipulation.

@akinomyoga
Collaborator Author

@calestyo Could you review it?

@calestyo
Contributor

@calestyo Could you review it?

Will take a while...

@akinomyoga akinomyoga marked this pull request as draft April 12, 2022 23:09
@calestyo
Contributor

Okay I did a "review" now,.. though admittedly it was more testing rather than reviewing (I'm really not that much of an expert with bash specific shell language features)..

Some things I've noticed about the general hook functionality:

  • the evals ... are those safe? I mean eval is always prone to code execution... what if something is completed that contains " or other shell meta characters?
  • BASH_COMPLETION_FINALIZE_CMD_HOOKS and BASH_COMPLETION_FINALIZE_HOOKS are always set, even if the feature isn't used at all... would be nice if that could be avoided
  • the "problem" with BASH_COMPLETION_FINALIZE_HOOKS is that it's always called... (which I'd assume is rarely needed), while BASH_COMPLETION_FINALIZE_CMD_HOOKS allows filtering on commands only... what would IMO be quite useful is to select based on the completion function (e.g. _known_hosts_real)... do you intend this to be done via BASH_COMPLETION_FINALIZE_HOOKS and the actual user function looking for the function in the stack?
  • when the user hook function isn't defined, it gives an error... but that may actually be nice and make sense
  • doesn't the way via trap affect all function (i.e. also ones not from bash-completion)? If so I'd be quite scared if that doesn't have any side effects... plus the function will be called everywhere, not just in cases where some users would actually need it?

Some things about _comp_array_filter:

  • It's quite impressive (and complex, at least for me ^^)... especially the evals ... I always fear that these could allow "breaking out"... imagine something that is completed which is under the control of some remote party, e.g. a git branch name,... and that somehow contains meta characters to break the quoting and then a command (rm -rf /). Sure this isn't possible.
  • It's always loaded, whether used or not. So while I very much like the idea, to have such utility function shipped with bash-completion... wouldn't it be better to place it in some /usr/share/bash-completion/utilities/array.sh or so, where people could source it from their hook functions if needed?
  • There is not really any difference from using -G and just using * in the pattern, than using it with -s, -m and -p, is there? I'd guess you just allow that, because it works, so why not, but in principle it wouldn't be needed (unlike with -F)?
  • When using -G, then the pattern a'*'b or "a*b" or similar... should also work as literal, right? So in principle one could skip -F and just tell people they'd need to quote any pattern matching notation characters?
  • Again, all those places where you use unquoted variables (like $__comp_value or $__comp_pattern* can't these be used for code injection?
  • -E didn't seem to work as documented...? I.e. not as ERE.

But as for my simple tests... all seemed to work (other than -E).

Thanks,
Chris

@calestyo
Contributor

(didn't check with your two most recent commits, but AFAICS, these anyway just rename stuff)

@calestyo
Contributor

calestyo commented Apr 13, 2022

With side effects above, I especially meant:

  • What if _comp_finalize is called (because of the trap) from any unrelated user function, that has e.g. _comp_finalize__depth overridden?
  • bash has a pretty... special way of handling local and unset... with the "unshadowing" and so on... in _comp_finalize you call unset ... what if the trapped function had its own (local) _comp_finalize__depth[-1] and you unset actually that?
  • what if the user hook is poorly written, and unsets stuff (or even multiple invocations of unset, going out through the layers of localed vars of that name) from the trapped function?

@akinomyoga
Collaborator Author

akinomyoga commented Apr 13, 2022

Thanks for your review.

@scop Sorry for the long discussion. You can just skip the section of "Bash language discussion".

Bash language discussion

First, regarding the comments on eval and code injection: I take these comments as invalid ones. The reasons are explained here. If you believe there are mistakes in the code, could you point them out more specifically rather than expressing a vague fear?

  • the evals ... are those safe? I mean eval is always prone to code execution...

When we pass uncontrolled strings (that would be picked up from something the user did not prepare) that are not intended to be executed to eval, of course, it is unsafe. If we just pass the strings that are safe, it is safe. Or, if we pass the strings that are provided for the execution, using eval is nothing different from running as a function.

  • For example, var='echo hello'; eval "$var" is safe even though you might still think it is unsafe. A corner case might be that another user makes the variable var readonly and assigns some command to that variable, but the game has been already over at the time when the other user had access to the shell variable. If we start to care about such cases, we cannot use arithmetic evaluations at all either.
  • Another case is the arguments or variables that are intended to be executed by the command/function. For example, the builtin command receives the command name $1 and arguments $2 $3... and executes it, but this is not considered a vulnerability because it is intended. If there are any problems, they are caused by the fact that someone had access to the command.
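The remark about arithmetic evaluations can be illustrated with a small sketch. As far as I know, Bash evaluates the content of a variable named in an arithmetic expression as an expression itself, including command substitutions inside array subscripts. All names below (workdir, marker, untrusted, arr) are hypothetical and not taken from the PR.

```bash
#!/usr/bin/env bash
# Sketch: arithmetic evaluation re-evaluates variable contents, so a
# command substitution hidden in an array subscript gets executed.
# Every name here is hypothetical.

workdir=$(mktemp -d)
marker=$workdir/ran

arr=(7)
# Single quotes: nothing is expanded at assignment time.
untrusted='arr[$(: > "$marker"; echo 0)]'

# The arithmetic context evaluates the *content* of "untrusted" as an
# expression, which expands the subscript and runs the substitution.
result=$(( untrusted ))

if [[ -e $marker ]]; then
    side_effect=yes
else
    side_effect=no
fi
printf 'result=%s side_effect=%s\n' "$result" "$side_effect"
rm -rf -- "$workdir"
```

This only matters when someone else can set the variable, which is the "game over" situation described above.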

Should I explain it one by one?

eval "((\${#$1[@]}))" || return 0

eval "_comp_local_indices=(\"\${!$1[@]}\")"

eval "_comp_local_value=\${$1[\$_comp_local_index]}; $_comp_local_predicate"

eval -- "((\${#$1[@]})) && $1=(\"\${$1[@]}\")"

In these lines, $1 is already checked in the preceding code that it only contains a valid variable name. $_comp_local_predicate is the argument that is passed to _comp_array_filter so that it is executed by _comp_array_filter. There are no other strings expanded in these evals.
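A minimal sketch of the "validate the name, then eval" pattern described above. The helper count_elements is hypothetical and is not the PR's code; only the name pattern is the one quoted later in this thread.

```bash
#!/usr/bin/env bash
shopt -s extglob   # the *(...) pattern below is an extended glob

# Hypothetical helper: print the number of elements of the array named
# by $1, refusing anything that is not a plain identifier.
count_elements() {
    if [[ $1 != [a-zA-Z_]*([a-zA-Z_0-9]) ]]; then
        printf 'invalid variable name: %q\n' "$1" >&2
        return 2
    fi
    # Only the validated name is pasted into the code; the escaped \$
    # defers the actual expansion until eval runs the string.
    eval "printf '%s\n' \"\${#$1[@]}\""
}

words=(alpha "beta gamma" delta)
count=$(count_elements words)

# An argument that tries to break out of the expansion is rejected
# before it ever reaches eval.
count_elements 'words[@]}"; echo injected; #' 2>/dev/null
rejected_status=$?

printf 'count=%s rejected_status=%s\n' "$count" "$rejected_status"
```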

eval -- "$_comp_local_hook"

eval -- "$_comp_local_hook"

eval -- "${_comp_finalize__original_return_trap:-trap - RETURN}"

$_comp_local_hook contains strings that have been specified to BASH_COMPLETION_FINALIZE_{,CMD_}HOOKS. This is not different from executing a function name without using eval because one could include any code in the function anyway. $_comp_finalize__original_return_trap contains the output of trap -p RETURN, so it contains a valid command. The output of trap -p RETURN contains the code of another trap handler, but those strings are the ones that were originally set, so evaluating $_comp_finalize__original_return_trap just restores the original state. There is no way to rewrite the global shell variable _comp_finalize__original_return_trap in Bash without full access to the shell, so we can assume that _comp_finalize__original_return_trap is not rewritten. If it has been rewritten to some unexpected string, the game was already over when the variable was rewritten, because one could run arbitrary commands at that point.

what if something is completed that contains " or other shell meta characters?

In particular, we do not pass any strings generated by completions to eval.

  • It's quite impressive (and complex, at least for me ^^)... especially the evals ... I always fear that these could allow "breaking out"... imagine something that is completed which is under the control of some remote party, e.g. a git branch name,... and that somehow contains meta characters to break the quoting and then a command (rm -rf /). Sure this isn't possible.

This is also the same as the previous comment: eval is not used for the strings that are generated by completions. The code you are afraid of has the structure result=$(some external command); eval "var=\$result" (note that $ is quoted), where eval runs the string var=$result. It doesn't execute var=<some expanded results>.
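The difference between the escaped and the unescaped dollar sign can be sketched as follows. The string in result is a hypothetical stand-in for text coming from an external command, such as a branch name controlled by a remote party.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for the output of some external command.
result='x"; echo INJECTED; "'

# Escaped dollar: eval receives the literal text  var=$result  and the
# expansion happens inside eval as an ordinary assignment.
eval "var=\$result"
printf 'escaped:   %s\n' "$var"

# Unescaped dollar (the pattern to avoid): the content is pasted into
# the code before eval parses it, so the embedded command runs.
output=$(eval "var2=\"$result\"" 2>/dev/null)
printf 'unescaped: %s\n' "$output"
```

In the first form the value is stored verbatim, quotes and all. In the second, the captured output is the text printed by the injected echo.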

  • Again, all those places where you use unquoted variables (like $__comp_value or $__comp_pattern* can't these be used for code injection?

No, you should carefully look at the quoting.

Also, it should be noted that the quoting is unneeded on the right-hand side of variable assignments, after the case keyword, and in conditional commands (unless it contains $* or ${arr[*]}, where some versions of Bash have bugs) because word splitting and pathname expansion are not performed in these contexts. The quoting is supposed to be unneeded in here strings as well for the same reason, but again some versions of Bash have a bug, so we actually need to quote the word of here strings if we cover older Bash versions. (But bash-completion doesn't recommend using here strings to begin with, which is documented somewhere in CONTRIBUTING or doc/styleguide.txt.)
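A small sketch of the contexts listed above, using a hypothetical value that contains spaces and a glob character. It contrasts them with an ordinary command argument, where an unquoted expansion is split.

```bash
#!/usr/bin/env bash
value='a b  *'      # spaces and a glob character

copy=$value                        # assignment: no splitting, no globbing

case $value in                     # case word: no splitting, no globbing
    'a b  *') case_result=match ;;
    *)        case_result=nomatch ;;
esac

if [[ $value == 'a b  *' ]]; then  # [[ ]]: the operand stays one word
    cond_result=match
else
    cond_result=nomatch
fi

# Contrast: an unquoted expansion used as command arguments is split.
set -f                             # disable globbing so only splitting shows
set -- $value
set +f
split_count=$#

printf '%s %s %s\n' "$case_result" "$cond_result" "$split_count"
```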

@akinomyoga
Collaborator Author

akinomyoga commented Apr 13, 2022

Bash language discussion

These are also the points or just questions that I feel are caused by the lack of your Bash knowledge or the lack of programming experience:

  • the "problem" with BASH_COMPLETION_FINALIZE_HOOKS is that it's always called... (which I'd assume is rarely needed), while BASH_COMPLETION_FINALIZE_CMD_HOOKS allows filtering on commands only... what would IMO be quite useful is to select based on the completion function (e.g. _known_hosts_real)... do you intend this to be done via BASH_COMPLETION_FINALIZE_HOOKS and the actual user function looking for the function in the stack?

If you don't want to run them for every completion, you can just keep the array BASH_COMPLETION_FINALIZE_HOOKS empty. The reason I added the array is that you requested command name patterns to avoid specifying a hook for each of ssh, scp, ... If we support these patterns, we need to test the command name against the patterns for every completion anyway. Unless we run something for every completion, we cannot support any patterns for the command names.

  • when the user hook function isn't defined, it gives an error... but that may actually be nice and make sense

Oh, really? I'm sourcing the updated bash_completion without any hooks, but I don't see any error. Could you provide me with the error message? If you are saying this without actually trying it, you want to learn that eval '' just does nothing without any errors.

  • doesn't the way via trap affect all function (i.e. also ones not from bash-completion)? If so I'd be quite scared if that doesn't have any side effects... plus the function will be called everywhere, not just in cases where some users would actually need it?

If we didn't remove the trap handler, yes, the trap affects all the callers. But I'm removing the trap in the following line so that the trap doesn't affect the callers.

eval -- "${_comp_finalize__original_return_trap:-trap - RETURN}"

As for the callees, DEBUG and RETURN traps will not be inherited unless the callee function is marked with the trace attribute by declare -ft funcname. The function marked with the trace attribute intentionally captures all the RETURN actions, so I don't think that is the problem.

  • What if _comp_finalize is called (because of the trap) from any unrelated user function, that has e.g. _comp_finalize__depth overridden?

I don't see the point. Doesn't it apply to all the functions ***? That is, What if *** is called in a wrong way? If that is considered the problem, we cannot define even a single function in bash_completion. If they know the usage of ***, they will never call the function *** in a wrong way.

  • bash has a pretty... special way of handling local and unset... with the "unshadowing" and so on... in _comp_finalize you call unset ...

If you are talking about the dynamic unset (i.e., the unset on the previous-scope variable placeholder), that doesn't apply to unsetting an array element. More precisely, unsetting an array element neither removes the variable placeholder nor puts the variable into the unset state unless the subscript is * or @, so the difference between dynamic unset and local unset is irrelevant.
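The distinction can be sketched with two hypothetical caller/callee pairs, assuming the default shell options (in particular, localvar_unset is off). Unsetting a whole variable from a callee uncovers the outer variable, while unsetting one array element leaves the caller's local array in place.

```bash
#!/usr/bin/env bash
# Hypothetical names throughout; default shopt settings are assumed.

scalar=global
arr=(G0 G1 G2)

callee_unsets_scalar()  { unset -v scalar; }
callee_unsets_element() { unset -v 'arr[1]'; }

caller_scalar() {
    local scalar=local
    callee_unsets_scalar           # dynamic unset: removes the caller's local
    seen_scalar=${scalar-unset}    # the global is visible again
}

caller_array() {
    local -a arr=(L0 L1)
    callee_unsets_element          # removes only element 1 of the local array
    seen_array="${#arr[@]}:${arr[0]}"
}

caller_scalar
caller_array
printf 'scalar=%s array=%s global_count=%s\n' \
    "$seen_scalar" "$seen_array" "${#arr[@]}"
```

The global array keeps all three elements, since the element unset acted on the local array that shadows it.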

what if the trapped function had its own (local) _comp_finalize__depth[-1] and you unset actually that?

Again, I don't see the point, for the same reason as in the comment before the previous one. Doesn't that also apply to any global variables ***? That is, What if any function changes the variable ***? If that is the problem, we cannot define even a single global variable including the configuration variables. The caller or other functions just shouldn't touch the variable _comp_finalize__depth. The variable name is chosen so as to avoid an unexpected clash (cf #539, #537, #731).

  • what if the user hook is poorly written, and unsets stuff (or even multiple invocations of unset, going out through the layers of localed vars of that name) from the trapped function?

This is also similar to the other cases. If we start to care about it, we cannot use any kind of hook functions. We might run it inside a subshell, but that means we don't allow the hook functions to modify the state of the parent shell. Or are you suggesting developing a general-purpose IPC scheme for the shell? That's overcomplicated for bash-completion and even for shell scripting itself. If you are suggesting a completely separated sandbox environment for the command execution inside the same process, you should ask Chet, though I'm pretty sure it will be rejected.


Edit: There was an oversight.

  • When using -G, then the pattern a'*'b or "a*b" or similar... should also work as literal, right? So in principle one could skip -F and just tell people they'd need to quote any pattern matching notation characters?

Yes.

@akinomyoga
Collaborator Author

And these points are something that I think we can discuss:

  • BASH_COMPLETION_FINALIZE_CMD_HOOKS and BASH_COMPLETION_FINALIZE_HOOKS are always set, even if the feature isn't used at all... would be nice if that could be avoided

Maybe. Preparing the default (empty) setup is my preference. Is there anything more than preferences?

  • It's always loaded, whether used or not. So while I very much like the idea, to have such utility function shipped with bash-completion... wouldn't it be better to place it in some /usr/share/bash-completion/utilities/array.sh or so, where people could source it from their hook functions if needed?

Yes, this is also something I thought about when implementing this function. I think we may e.g. extend the usage of _xfunc (that is going to be replaced by _comp_xfunc in #734). Currently the search path is only in completions/* for command completions, but we may consider adding helpers in the search path of the xfunc mechanism, but that is another mechanism that should be discussed in a separate PR.

  • There is not really any difference from using -G and just using * in the pattern, than using it with -s, -m and -p, is there? I'd guess you just allow that, because it works, so why not, but in principle it wouldn't be needed (unlike with -F)?

Yes, that is just for completeness and consistency with -F. I have actually even considered having -E automatically surround the regex with ^(...) (-p), (...)$ (-s), or ^(...)$ (default), and leave it unsurrounded with -m, but I felt that was too much. But I still think that is also one possible design.
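The anchoring idea can be sketched as below. anchor_regex and its mode names are hypothetical and not the PR's interface; the point of the sketch is why the regex is wrapped in a group before an anchor is added.

```bash
#!/usr/bin/env bash
# Hypothetical helper: wrap an ERE in a group and add anchors.
anchor_regex() {
    case $1 in
        prefix)   printf '%s\n' "^($2)" ;;
        suffix)   printf '%s\n' "($2)\$" ;;
        exact)    printf '%s\n' "^($2)\$" ;;
        anywhere) printf '%s\n' "$2" ;;
        *)        return 2 ;;
    esac
}

# Without the group, ^ binds only to the first alternative.
naive='^foo|bar'
grouped=$(anchor_regex prefix 'foo|bar')    # ^(foo|bar)

if [[ xbar =~ $naive ]]; then naive_result=match; else naive_result=nomatch; fi
if [[ xbar =~ $grouped ]]; then grouped_result=match; else grouped_result=nomatch; fi

printf 'naive=%s grouped=%s\n' "$naive_result" "$grouped_result"
```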

  • -E didn't seem to work as documented...? I.e. not as ERE.

Thanks for pointing it out. That is just a mistake. I actually have already fixed it in 8809534c (Yeah, I have mixed the changes in one commit, but I will squash all of these changes at some point, so I have just become lazy here).

@akinomyoga
Collaborator Author

akinomyoga commented Apr 13, 2022

For the eval-related matter, to avoid confusion like this, we could use name references if we raised the required Bash version to 4.3+, though that still wouldn't eliminate every occurrence of eval.
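A minimal sketch of what the name-reference variant might look like, assuming Bash 4.3 or later. The function remove_empty and its prefixed local names are hypothetical. A predicate passed as a string would still need eval, which matches the caveat above.

```bash
#!/usr/bin/env bash
# Requires Bash 4.3+ for "local -n". Hypothetical helper: drop empty
# elements from the array whose name is given in $1, without eval.
remove_empty() {
    # Prefixed names reduce the chance of clashing with the caller's
    # array name (a nameref to a same-named local would be circular).
    local -n _re_target=$1
    local -a _re_kept=()
    local _re_item
    for _re_item in "${_re_target[@]}"; do
        if [[ $_re_item ]]; then
            _re_kept+=("$_re_item")
        fi
    done
    _re_target=("${_re_kept[@]}")
}

words=(alpha "" "beta gamma" "")
remove_empty words
printf 'count=%s last=%s\n' "${#words[@]}" "${words[1]}"
```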


  • There is not really any difference from using -G and just using * in the pattern, than using it with -s, -m and -p, is there? I'd guess you just allow that, because it works, so why not, but in principle it wouldn't be needed (unlike with -F)?

Edit: d06a9b4 I have added -x for the exact matching and experimental support for -psmx with -E. The default matching with -F is now changed to -m. This is just experimental support, so we can drop the commit at any time.

  • BASH_COMPLETION_FINALIZE_CMD_HOOKS and BASH_COMPLETION_FINALIZE_HOOKS are always set, even if the feature isn't used at all... would be nice if that could be avoided

Edit: f1f454b I have decided to just give the associative/indexed array attributes to the configuration variables BASH_COMPLETION_FINALIZE{,_CMD}_HOOKS.

@calestyo
Contributor

I take these comments as invalid ones.

I hadn't said that there were injection holes,… I’ve asked whether there could be! ;-)

could you point them out more specifically rather than expressing a vague fear?

Well I didn't mean to spread FUD, just wanted to point out, whether this has been checked thoroughly.

When we pass uncontrolled strings (that would be picked up from something the user did not prepare) that are not intended to be executed to eval, of course, it is unsafe.

Sure, and these were the things I had in mind. I mean especially the words that are completed may not be under the control of the user (e.g. git branch/tag names and similar).

If we just pass the strings that are safe, it is safe.

Of course

Or, if we pass the strings that are provided for the execution, using eval is nothing different from running as a function.

Unless of course, it would contain a string from the first kind (uncontrolled strings).

A corner case might be that another user makes the variable var readonly and assigns some command to that variable, but the game has been already over at the time when the other user had access to the shell variable.

Quite smart... I wouldn't even have thought about that. The read-only status is never "inherited" like when calling su or so, or is it? Because then it might be abused. But other than that I'd say you're right and if an attacker can set such a variable read-only, it's anyway already too late.

If we start to care about such cases, we cannot use arithmetic evaluations at all either.

What could happen there?

For example, the builtin command receives the command name $1 and arguments $2 $3... and executes it, but this is not considered a vulnerability because it is intended.

Sure, unless of course, $1 would be an uncontrolled string.

I did the validation again... first for _comp_array_filter():

case $_comp_local_opt in
[EFG]) _comp_local_pattype=$_comp_local_opt ;;
[psmx]) _comp_local_anchoring=$_comp_local_opt ;;
[rC]) _comp_local_flags=$_comp_local_opt$_comp_local_flags ;;

IFS should be no problem here, as there's no field splitting in assignments. So good.

shift $((OPTIND - 1))

Isn't that arithmetic expansion subject to field splitting? So in principle, if one passes many options such as -E so that the value gets more digits, and IFS contains one of those digits, the result could be split into multiple fields, right? So I'd quote that.

if (($# != 2)); then

I'm too blind to find what happens with ((…)),… is the $# in that subject to field splitting (thus unusual IFS could cause troubles again, when it's not double quoted)?
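The two questions above can be checked with a short sketch. As far as I know, the result of an unquoted $((...)) is subject to word splitting like any other unquoted expansion, while nothing inside the ((...)) command is split. The values are hypothetical; IFS is set to a digit on purpose.

```bash
#!/usr/bin/env bash
old_ifs=$IFS
IFS=0                        # deliberately unusual: a digit as separator

set -- $((100 + 1))          # unquoted: 101 is split at the digit 0
unquoted_count=$#

set -- "$((100 + 1))"        # quoted: stays one word
quoted_count=$#

# Inside ((...)) the expansion of $# is not split, so the test is
# unaffected by IFS.
if (($# == 1)); then
    arith_result=ok
else
    arith_result=broken
fi

IFS=$old_ifs
printf 'unquoted=%s quoted=%s arith=%s\n' \
    "$unquoted_count" "$quoted_count" "$arith_result"
```

So quoting the argument of shift is the defensive choice, while the ((...)) tests need no quoting.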

elif [[ ! $_comp_local_pattype && $1 == value ]]; then

Doesn't that already protect from any shenanigans with read-only vars?

eval "((\${#$1[@]}))" || return 0

I see why it's safe with respect to the use of $1,... but similar to above, I don't understand whether or not an unusual IFS could cause troubles (I mean the IFS in the subshell that does the eval - the outer thing is quoted, so there cannot be any field splitting, but the inner is not? Don't think this could be used for an attack, but perhaps a syntax error?).

I'd guess here IFS could cause troubles, as the inner expansion is quoted:

eval "_comp_local_indices=(\"\${!$1[@]}\")"

But why are you doing it that way? To support associative arrays?

elif [[ $1 != [a-zA-Z_]*([a-zA-Z_0-9]) ]]; then

AFAIU, bash’s [[…]] is not subject to field splitting, right? So again, no problem if IFS would contain characters in the variable name.

Same for:

p) _comp_local_predicate='[[ $_comp_local_value =~ ^($_comp_local_pattern) ]]' ;;

and similar uses of [[…]].

Which is, I guess, why it's safe to use e.g. $_comp_local_value which AFAICS would contain uncontrolled strings (e.g. the completion words) later in the eval, right? And also because AFAIU, in [[…]] an unquoted $var is still one field, even if empty/unset ... so it's always e.g. 3 fields here and thus clear what bash does, right?

if declare -F "$2" &>/dev/null; then
_comp_local_predicate="$2 \"\$_comp_local_value\""
else
_comp_local_predicate="local value=\$_comp_local_value; $2"
fi

Couldn't you use value right away as variable name therefore saving the extra assignment of local value=\$_comp_local_value (though not really needed).
Apart from that... value would be local, so if $2 is a non-special built-in command (i.e. either a regular built-in or non-built-in)... wouldn't that require an additional export value to get the variable into the command?

I mean $2 could be any shell command in that case,... or that's what you want, I guess? If you'd just allow a simple command you could make the assignment part of that.
But I guess allowing not just simple commands is better.
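The visibility question can be sketched with hypothetical names. A local variable is seen by shell functions called from the same scope (dynamic scoping), but not by an external process unless it is exported or passed as a temporary assignment on the command.

```bash
#!/usr/bin/env bash
unset -v value                  # make sure nothing is inherited

# Hypothetical predicate implemented as a shell function.
show_value() { printf '%s\n' "${value-unset}"; }

run_predicates() {
    local value=from_caller
    # A called function sees the caller's local.
    in_function=$(show_value)
    # A child process does not: the local is not exported.
    in_child=$(bash -c 'printf "%s\n" "${value-unset}"')
    # A temporary assignment on a simple command exports it for that
    # command only.
    in_child_env=$(value=$value bash -c 'printf "%s\n" "${value-unset}"')
}

run_predicates
printf '%s %s %s\n' "$in_function" "$in_child" "$in_child_env"
```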

local _comp_local_indices _comp_local_index _comp_local_value
eval "_comp_local_indices=(\"\${!$1[@]}\")"
for _comp_local_index in "${_comp_local_indices[@]}"; do
eval "_comp_local_value=\${$1[\$_comp_local_index]}; $_comp_local_predicate"
case $? in
"$_comp_local_expected_status") continue ;;
[01])
unset -v "$1[\$_comp_local_index]"

I had written before, that I wonder whether the unset is safe... but I guess it probably is, cause it's anyway on an array. Or what would happen, if $2 is a shell command that by chance also unsets the very same element, and then the function repeats that?

*)
printf 'bash_completion: %s: %s\n' "$FUNCNAME" \
"the filter condition broken '${_comp_local_pattype:+-$_comp_local_pattype }$2'" >&2
return 2
;;

First, is there an "is" missing in the error message?
Second, couldn't it be, that e.g. the first few times, the command (or shell function) works and returns 0 or 1 and then suddenly it stops working... and gives another exit status.
Then you'd still return while the original array might have already been modified... and the compactification is also not performed?

In these lines, $1 is already checked in the preceding code that it only contains a valid variable name.

Yeah,... I... kinda missed that ^^

$_comp_local_predicate is the argument that is passed to _comp_array_filter so that it is executed by _comp_array_filter. There are no other strings expanded in these evals.

Well $2 is.. but that's also user controlled... so should be ok.

@calestyo
Contributor

calestyo commented Apr 14, 2022

Oh and some more about _comp_array_filter() which are however not really part of evaluation but rather just matter of personal taste, I guess:

  • I probably wouldn't even offer the anchoring, but leave that as an exercise to the user (i.e. telling that EREs are unanchored, and globs need to be quoted in order to be taken literally). That would also make -F unneeded. But as said, purely a matter of taste.

  • I personally would think, that for -F -x should be the default... if one has an exact string one typically wants exact matches and not "anywhere in the string" matches.

  • Conversely, for -G I'd use -m per default.

  • Since your function documentation is pretty nice and exact, I'd further add that -E, -F and -G are neither mutually exclusive nor resolved by "last one wins"; instead, the first one in the documented order wins. Same for the anchoring options.

  • # -p Combined with -EFG, it performs the prefix matching.

    in these I'd write something like either one of -EFG to make it more clear, that you don't have to give all of them... okay should be anyway clear, but would look better.

  • It may also make sense to write "Ignored otherwise." for the non--EFG case.

@calestyo
Contributor

Now for the other functions:

eval -- "$_comp_local_hook"

eval -- "$_comp_local_hook"

eval -- "${_comp_finalize__original_return_trap:-trap - RETURN}"

These were clear before... You rightly assume that I got mostly scared by:

eval "_comp_local_value=\${$1[\$_comp_local_index]}; $_comp_local_predicate"

(and also by the [[…]] with unquoted variables)

but as you say:

In particular, we do not pass any strings generated by completions to eval.

That doesn't really contain the contents.

((${#_comp_finalize__depth[@]})) || return 0

Same as above and for all cases of ((…)) with unquoted parameter expansions in them... is that possibly affected by IFS?

Again, all those places where you use unquoted variables (like $__comp_value or $__comp_pattern* can't these be used for code injection?

No, you should carefully look at the quoting.

The main point I had been missing here was that [[...]] doesn't do field splitting... I guess that's why it's safe.

for _comp_local_hook in "${BASH_COMPLETION_FINALIZE_HOOKS[@]}"; do

Shouldn't _comp_local_hook be made local? You do it in the if above, but if that's not entered?

These are also the points or just questions that I feel are caused by the lack of your Bash knowledge or the lack of programming experience:

Well the former might be the case... not sure about the latter, but it was around 02:00 in the night when I reviewed it first and I guess I was a bit too tired after the Fantastic Beasts movie in the cinema (not that the movie would have been tiring ^^).

If you don't want to run them for every completion, you can just keep the array BASH_COMPLETION_FINALIZE_HOOKS empty. The reason I added the array is that you requested command name patterns to avoid specifying a hook for each of ssh, scp

Well it's clear that one can leave it empty. What I meant is, how do you intend to get a user hook that is called whenever _known_hosts_real (i.e. anything completes hostnames) returns?

I'd assume your intention is to use BASH_COMPLETION_FINALIZE_HOOKS and look at FUNCNAME?

when the user hook function isn't defined, it gives an error... but that may actually be nice and make sense

Oh, really? I'm sourcing the updated bash_completion without any hooks, but I don't see any error. Could you provide me with the error message? If you are saying this without actually trying it, you want to learn that eval '' just does nothing without any errors.

What I meant was the following, but I guess I simply misunderstood what you meant with command (i.e. any real shell command and not just a simple command):

$ BASH_COMPLETION_FINALIZE_CMD_HOOKS=(ssh myfunc)
$ ssh <TAB>bash: myfunc: command not found

So I just meant to handle it gracefully (i.e. silently), when the function/program didn't exist... but since it's any shell command... I guess there's nothing we can do about it.
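Since, per the discussion above, a hook is an arbitrary shell command that gets evaluated, a user who wants the silent behavior could put the guard into the hook string itself. This is a sketch with hypothetical names (hook, my_hook_func); it evaluates the string directly instead of going through bash-completion.

```bash
#!/usr/bin/env bash
# Hypothetical hook string: call my_hook_func only if it is defined.
hook='if declare -F my_hook_func >/dev/null; then my_hook_func; fi'

# While the function is missing, the hook is a silent no-op.
eval -- "$hook"
status_missing=$?

# Once the function exists, the same hook string calls it.
my_hook_func() { hook_ran=yes; }
hook_ran=no
eval -- "$hook"

printf 'status_missing=%s hook_ran=%s\n' "$status_missing" "$hook_ran"
```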

If we didn't remove the trap handler, yes, the trap affects all the callers. But I'm removing the trap in the following line so that the trap doesn't affect the callers.

So I guess that also means that _init_completion() runs on every completion and not just once? Forgive me my ignorance about the inner workings of bash-completion O:-) .

And I blindly assume that it's then also guaranteed that _comp_finalize() does run, when completing... in order to reset the trap?!

I don't see the point. Doesn't it apply to all the functions ***? That is, What if *** is called in a wrong way? If that is considered the problem, we cannot define even a single function in bash_completion. If they know the usage of ***, they will never call the function *** in a wrong way.

I somehow thought I had seen (when trying things out) that any function (not just completion functions) was trapped on return and had _comp_finalize() called. Not sure what I did there or what I misinterpreted... at least I couldn't reproduce this anymore and saw it just running for completion functions (i.e. such called within a "completion context").

If you are talking about the dynamic unset (i.e., the unset on the previous-scope variable placeholder), that doesn't apply to unsetting an array element. More precisely, unsetting an array element neither removes the variable placeholder nor puts the variable into the unset state unless the subscript is * or @, so the difference between dynamic unset and local unset is irrelevant.

Answers my question from before... so I guess that should also be completely fine.

what if the user hook is poorly written, and unsets stuff (or even multiple invocations of unset, going out through the layers of localed vars of that name) from the trapped function?

This is also similar to the other cases. If we start to care about it, we cannot use any kind of hook functions. We might run it inside a subshell, but that means we don't allow the hook functions to modify the state of the parent shell. Or are you suggesting developing a general-purpose IPC scheme for the shell? That's overcomplicated for bash-completion and even for shell scripting itself. If you are suggesting a completely separated sandbox environment for the command execution inside the same process, you should ask Chet, though I'm pretty sure it will be rejected.

I was rather thinking of some general words in the documentation that describes the whole feature, saying that care must be taken with the shell commands used as hooks, that internal variables should all be local, and that unset should only be called on internal variables that were declared local in the same hook command, unless one knows what one's doing.

@calestyo
Contributor

BASH_COMPLETION_FINALIZE_CMD_HOOKS and BASH_COMPLETION_FINALIZE_HOOKS are always set, even if the feature isn't used at all... would be nice if that could be avoided

Maybe. Preparing the default (empty) setup is my preference. Is there anything more than preferences?

No that was just about personal taste... seems you've already changed that.

Yes, this is also something I thought about when implementing this function. I think we may e.g. extend the usage of _xfunc (that is going to be replaced by _comp_xfunc in #734). Currently the search path is only in completions/* for command completions, but we may consider adding helpers in the search path of the xfunc mechanism, but that is another mechanism that should be discussed in a separate PR.

But then please give some example like tutorials. E.g. I didn't even know about _xfunc() until you've mentioned it and while all these functions do have some documentation, it's kinda difficult for newcomers to get the full picture of what is there and how things should be used. Or at least I haven't gotten that so far ^^

Yes, that is just for completeness and consistency with -F. I have actually even thought it for -E to automatically surround the regex by ^(...) (-p), (...)$ (-s), ^(...)$ (default), and not surround with -m, but I have just felt that's too much. But I still think that is also one possible design.

Well I personally, but again just a matter of taste, wouldn't do any auto-anchoring. That's how BREs/EREs/PCREs work by themselves. OTOH, patterns, by themselves, are anchored so I would keep it like that.
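The anchoring difference mentioned here can be checked directly in `[[ ... ]]`. The following is a small sketch; the helper names `matches_glob` and `matches_ere` are made up for illustration and are not part of bash-completion.

```bash
# Hypothetical helpers: the unquoted $2 is used as a pattern in both.
matches_glob() { [[ $1 == $2 ]]; }   # glob: must match the whole string
matches_ere() { [[ $1 =~ $2 ]]; }    # ERE: matches anywhere unless anchored

s=hello-world
if matches_glob "$s" 'world'; then echo "glob world: match"; else echo "glob world: no match"; fi
if matches_glob "$s" '*world'; then echo "glob *world: match"; else echo "glob *world: no match"; fi
if matches_ere "$s" 'world'; then echo "ERE world: match"; else echo "ERE world: no match"; fi
if matches_ere "$s" '^world$'; then echo "ERE ^world\$: match"; else echo "ERE ^world\$: no match"; fi
```

A glob needs explicit `*` to behave like a substring match, while an ERE needs explicit `^` and `$` to behave like an exact match.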

For the eval-related matter, to avoid confusion like this, we can use name references if we could update the requirements on the Bash version to 4.3+, yet that doesn't yet eliminate all the occurrence of eval.

I think I'm fine now with all the evals, so unless there's any great benefit in terms of performance or so.. just leave it from my side.

Edit: f1f454b I have decided to just give the associative/indexed array attributes to the configuration variables BASH_COMPLETION_FINALIZE{,_CMD}_HOOKS.

Not sure whether that's really better... because now, these don't show up when e.g. completing $ or in set yet they are still already there... might be confusing to people... but again probably also just a matter of taste.

@akinomyoga
Collaborator Author

akinomyoga commented Apr 14, 2022

I take these comments as invalid ones.

I hadn't said that there were injection holes,… I’ve asked whether there could be! ;-)

OK, then that's fine!

I think I need to enclose these Bash lectures inside <details> ... </details> so that Ville can skip them. I'll also reorganize my previous comments using <details> ... </details>. I also recommend you reorganize your previous comments not to bother Ville with reading long Bash language discussions.

@calestyo Here's my template for the Bash language discussion.

<details><summary>Bash language discussion</summary>

Note: We need an empty line after the `<details>` line to use GFM inside `<details></details>`

</details>

Some Bash language discussion

Quite smart... I wouldn't even have thought about that. The read-only status is never "inherited" like when calling su or so, or is it?

Right, it's never inherited. The readonly attribute is just a concept of the shell, not one defined by the operating system.

If we start to care about such cases, we cannot either use arithmetic evaluations at all.

What could happen there?

See this:

$ function f { local i=$1; while ((--i)); do echo $i; done; }
$ f 10
9
8
7
6
5
4
3
2
1
$ declare -gr i='a[$(echo This is Attacker >&2)]'
$ f 10
bash: local: i: readonly variable
This is Attacker
bash: i: readonly variable
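The session above relies on two properties of arithmetic evaluation: the value of a variable is itself evaluated as an expression, and an array subscript inside that expression undergoes command substitution. A minimal sketch of the same mechanism with a harmless payload follows; the names `demo_arith_eval` and `dummy` are hypothetical.

```bash
# Illustration only: `demo_arith_eval` and `dummy` are made-up names.
demo_arith_eval() {
    # The value is an arithmetic expression whose array subscript
    # contains a command substitution.
    local n='dummy[$(echo "side effect" >&2; echo 0)]'
    # Evaluating `n` evaluates its value as an expression, which runs the
    # command substitution. `|| true` hides the status of a zero result.
    ((n)) || true
}

demo_arith_eval
```

The message on stderr shows that merely reading `n` in an arithmetic context executed the embedded command.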

I'm too blind to find what happens with ((…)),… is the $# in that subject to field splitting (thus unusual IFS could cause troubles again, when it's not double quoted)?

Word splitting doesn't occur inside the arithmetic command ((...)).

eval "((\${#$1[@]}))" || return 0

I see why it's safe with respect to the use of $1,... but similar to above, I don't understand whether or not an unusual IFS could cause troubles (I mean the IFS in the subshell that does the eval - the outer thing is quoted, so there cannot be any field splitting, but the inner is not? Don't think this could be used for an attack, but perhaps a syntax error?).

(( ... )) is not a subshell but an arithmetic command where word splitting doesn't happen.
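To check this locally, here is a small sketch modeled on the quoted `eval` line. The function name `array_nonempty` and the deliberately hostile `IFS` are assumptions made for the demonstration, not code from the PR.

```bash
# Hypothetical helper modeled on the quoted eval line.
array_nonempty() {
    # Accept only a plain variable name before handing it to eval.
    [[ $1 =~ ^[a-zA-Z_][a-zA-Z_0-9]*$ ]] || return 2
    # Deliberately hostile IFS: every digit is a field separator.
    local IFS=0123456789
    # The expansion inside ((...)) is not word-split, so the element
    # count is evaluated as a single number regardless of IFS.
    eval "((\${#$1[@]}))"
}

hosts=(a b c d e f g h i j k)   # 11 elements
none=()
if array_nonempty hosts; then echo "hosts: non-empty"; fi
if ! array_nonempty none; then echo "none: empty"; fi
```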

eval "_comp_local_indices=(\"\${!$1[@]}\")"

But why are you doing it that way? To support associative arrays?

The indexed array in Bash is sparse, i.e., the indices are not necessarily 0..${#arr[@]}. See this:

$ arr[10]=1
$ declare -p arr
declare -a arr=([10]="1")
$ echo ${#arr[@]}
1
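Because of this sparseness, a loop that counts from 0 to the element count would visit indices that don't exist. A short sketch of iterating over the actual indices instead (the array name `sparse` is just an example):

```bash
sparse=([2]=two [10]=ten)

# Counting from 0 to ${#sparse[@]}-1 would visit indices 0 and 1,
# neither of which exists. Expanding the index list visits the real ones.
visited=()
for i in "${!sparse[@]}"; do
    visited+=("$i:${sparse[i]}")
done
printf '%s\n' "${visited[@]}"
```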

elif [[ $1 != [a-zA-Z_]*([a-zA-Z_0-9]) ]]; then

AFAIU, bash’s [[…]] is not subject to field splitting, right?

Right, and that's the motivation for introducing the conditional command [[ ... ]] to replace [ ... ].

Which is, I guess, why it's safe to use e.g. $_comp_local_value which AFAICS would contain uncontrolled strings (e.g. the completion words) later in the eval, right? And also because AFAIU, in [[…]] an unquoted $var is still one field, even if empty/unset ... so it's always e.g. 3 fields here and thus clear what bash does, right?

Right, and again that's the raison d'être for [[ ... ]].
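A small sketch contrasting the two commands with an unquoted variable; the variable names are only for illustration.

```bash
value='two words'

# [[ ... ]] does not word-split $value, so this is a three-word comparison.
if [[ $value == 'two words' ]]; then new_result=match; else new_result=nomatch; fi

# [ ... ] receives "two" and "words" as separate arguments and fails
# with a usage error instead of comparing the strings.
if [ $value = 'two words' ] 2>/dev/null; then old_result=match; else old_result=nomatch; fi

echo "[[ ]]: $new_result, [ ]: $old_result"
```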

But I guess allowing not just simple commands is better.

Currently, it does allow any commands including the ones that are not simple commands.

I had written before, that I wonder whether the unset is safe... but I guess it probably is, cause it's anyway on an array. Or what would happen, if $2 is a shell command that by chance also unsets the very same element, and then the function repeats that?

Bash first finds array for unset 'array[key]' and, once it finds array, it will never search another array even if key is not found in the first-found array. This is trivial to check; you should try this trivial stuff in your local environment by yourself before asking someone for it.

Oh and some more about _comp_array_filter() which are however not really part of evaluation but rather just matter of personal taste, I guess:

  • I probably wouldn't even offer the anchoring, but leave that as an exercise to the user (i.e. telling that EREs are unanchored, and globs need to be quoted in order to be taken literally). That would also make -F unneeded.

If the string to search is stored in the variable, quoting the value isn't that easy. In a naive implementation, one needs to check the characters one by one in a loop. Or perform escaping in the loop of special characters that need to be quoted. Or use patsub_replacement of the coming Bash-5.2. In that sense, it'd be still nice to support -F in my opinion.

((${#_comp_finalize__depth[@]})) || return 0

Same as above and for all cases of ((…)) with unquoted parameter expansions in them... is that possibly affected by IFS?

No. This is the third time you ask about the word splitting inside the arithmetic command (( ... )), but the word splitting doesn't happen in the arithmetic command. You could try it by yourself before asking it three times.


shift $((OPTIND - 1))

Isn't that arithmetic expansion subject to field splitting? So in principle, if one uses like many -E and so that the value gets more digits and IFS is such digit, it could be split to multiple fields, right? So I'd quote that.

Ah, OK. Good point. I haven't thought about IFS being some digit characters. I'll fix it later. (edit: fixed 40a204e)
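The splitting of an unquoted arithmetic expansion can be reproduced with a digit in `IFS`. This is a sketch with made-up helper names (`count_args`, `demo_shift_splitting`), independent of the PR's code.

```bash
# Hypothetical helpers for the demonstration.
count_args() { echo "$#"; }

demo_shift_splitting() {
    local IFS=0
    count_args $((100 + 1))     # unquoted: "101" is split at the "0"
    count_args "$((100 + 1))"   # quoted: always a single field
}

demo_shift_splitting
```

Quoting, as in `shift "$((OPTIND - 1))"`, keeps the result as one word whatever `IFS` contains.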

elif [[ ! $_comp_local_pattype && $1 == value ]]; then

Doesn't that already protect from any shenanigans with read-only vars?

What do you mean? The above line in the code is unrelated to the protection from external readonly variables.

if declare -F "$2" &>/dev/null; then
_comp_local_predicate="$2 \"\$_comp_local_value\""
else
_comp_local_predicate="local value=\$_comp_local_value; $2"
fi

Couldn't you use value right away as variable name therefore saving the extra assignment of local value=\$_comp_local_value (though not really needed).

That's also possible, but I would like to keep it possible to specify value to $1 (target array name) as far as it is not used for the user-supplied predicate.

Apart from that... value would be local, so if $2 is a non-special built-in command (i.e. either a regular built-in or non-built-in)... wouldn't that require an additional export value to get the variable into the command?

I haven't assumed that someone would specify the command that needs external commands because spawning a process for each element has too much cost. The shell command that I had in my mind is a builtin command/construct or a shell function where the local variables are available through the dynamic scoping. If one wants to spawn an external command, one can include export value in $2 or call the external command with tempenv (i.e., value=$value cmd args...). (edit: I have updated to accept external command as the command name and also make value an exported variable 72ddb92)
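The visibility rules described here can be sketched as follows. All function names are hypothetical, and `"$BASH" -c` merely stands in for an arbitrary external command.

```bash
# Start from a clean slate so an inherited `value` cannot skew the demo.
unset -v value

# A shell function sees the caller's local through dynamic scoping.
show_value() { printf '%s\n' "${value-unset}"; }
via_function() { local value=inner; show_value; }

# A separate process only sees variables that are in its environment.
via_external() { local value=inner; "$BASH" -c 'printf "%s\n" "${value-unset}"'; }
via_exported() { local -x value=inner; "$BASH" -c 'printf "%s\n" "${value-unset}"'; }
via_tempenv() { local value=inner; value=$value "$BASH" -c 'printf "%s\n" "${value-unset}"'; }

via_function
via_external
via_exported
via_tempenv
```

Only the plain `local` combined with an external process loses the value; exporting the local or passing it as a temporary environment assignment both make it visible.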

*)
printf 'bash_completion: %s: %s\n' "$FUNCNAME" \
"the filter condition broken '${_comp_local_pattype:+-$_comp_local_pattype }$2'" >&2
return 2
;;

First, is there an is missing in the error message?

Well, that's intentional; I didn't intend to make it a complete sentence when I have written that (although making it a complete sentence is also another valid choice of course). That's just the same as

$ aaaaaa
bash: aaaaaa: command not found

Second, couldn't it be, that e.g. the first few times, the command (or shell function) works and returns 0 or 1 and then suddenly it stops working... and gives another exit status. Then you'd still return while the original array might have already been modified... and the compactification is also not performed?

Actually, I intend that the predicate always returns 0 or 1 unless there is a bug in the code. The error message is just to detect the wrong usage (note that exit status 2 typically means the wrong usage, i.e., a bug at the caller side). I wouldn't like to ensure any well-defined status after something went south by bugs, but if I were to support any, I would recover the original contents of the array rather than keeping the already half-modified contents. Touching (i.e., compactifying) the modified array after detecting exceptions seems to make the debugging harder, so I currently don't feel it's a good idea.

Would you like to have a mechanism to cancel the modification in the middle of the loop? Maybe I can support it with another special exit status e.g. 3, 27, etc. In that case, I think we shouldn't output any error messages. (edit: I have supported the exit status 27 7d7b961)
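To make the exit-status protocol concrete, here is a minimal sketch. It is not the PR's implementation: the name `sketch_filter` is made up, it uses a nameref (Bash 4.3+) instead of `eval`, it builds a new array rather than unsetting elements in place, and it assumes that predicate status 0 keeps an element while 1 drops it.

```bash
# Hypothetical sketch, not the PR's code. Requires Bash 4.3+ for namerefs.
# Assumed convention: predicate status 0 keeps an element and 1 drops it.
sketch_filter() {
    local -n _sketch_ref=$1
    local _sketch_pred=$2 _sketch_elem _sketch_status
    local -a _sketch_kept=()
    for _sketch_elem in "${_sketch_ref[@]}"; do
        _sketch_status=0
        "$_sketch_pred" "$_sketch_elem" || _sketch_status=$?
        case $_sketch_status in
            0) _sketch_kept+=("$_sketch_elem") ;;
            1) ;;
            27) return 27 ;; # cancel: the caller's array was never touched
            *)
                printf 'sketch_filter: predicate returned an unexpected status\n' >&2
                return 2
                ;;
        esac
    done
    # Assign only after every element was processed successfully.
    _sketch_ref=("${_sketch_kept[@]}")
}

not_example() { [[ $1 != example* ]]; }
words=(example.com alpha example.org beta)
sketch_filter words not_example
printf '%s\n' "${words[@]}"
```

Because the target array is assigned only at the end, both the cancel status and an unexpected status leave the original contents intact, which sidesteps the question of a half-modified array.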

  • I personally would think, that for -F -x should be the default...

I don't feel that's very useful. The resulting array just contains multiple same strings. For example, that's not the default of grep.

  • Conversely, for -G I'd use -m per default.

Hmm, I'm again against the idea. Glob patterns in the shell always match the whole string, e.g., in pathname expansions and on the right-hand side of [[ xx == pat ]].

  • Since your function documentation is pretty nice and exact, I'd further add that: -E, -F and -G are neither mutually exclusive, nor last one wins or so... but the first of and in those order as documented wins.

Maybe I miss your point, but with the current implementation, these are mutually exclusive, and the last-specified one does win. (edit: I have updated the description 1eb7a6e)

  • # -p Combined with -EFG, it performs the prefix matching.

    in these I'd write something like either one of -EFG to make it more clear, that you don't have to give all of them...

Thanks for the suggestion, I'd modify it in that way. (edit: updated 1eb7a6e)

  • It may also make sense to write "Ignored otherwise." for the non--EFG case.

I just feel it's verbose. I generally prefer concise, necessary and sufficient description.

for _comp_local_hook in "${BASH_COMPLETION_FINALIZE_HOOKS[@]}"; do

Shouldn't _comp_local_hook be made local? You do it in the if above, but if that's not entered?

Ah, right. This is an oversight in the change c8ec2ca. (edit: ba1f944)
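The effect of a missing `local` on a loop variable can be seen in a few lines. The function and variable names below are hypothetical.

```bash
# Hypothetical functions: one forgets `local`, the other declares it.
run_hooks_leaky() { for hook_name in one two; do :; done; }
run_hooks_clean() { local hook_name; for hook_name in one two; do :; done; }

unset -v hook_name
run_hooks_clean
after_clean=${hook_name-unset}
run_hooks_leaky
after_leaky=${hook_name-unset}
echo "after clean: $after_clean, after leaky: $after_leaky"
```

Without the declaration, the last loop value stays behind in the caller's scope.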

Well it's clear that one can leave it empty. What I meant is, how do you intend to get a user hook that is called whenever _known_hosts_real (i.e. anything completes hostnames) returns?

That's not the scope of this PR as I have replied in the original issue. To put it explicitly, this PR doesn't provide the feature of modifying the behavior of _known_hosts_real.

I'd assume your intention is to use BASH_COMPLETION_FINALIZE_HOOKS and look at FUNCNAME?

No, _comp_finalize is only called for the functions that set the RETURN hook by calling _init_completion so will not be called for _known_hosts_real.

when the user hook function isn't defined, it gives an error... but that may actually be nice and make sense

Oh, really? I'm sourcing the updated bash_completion without any hooks, but I don't see any error. Could you provide me with the error message? If you are saying this without actually trying it, you want to learn that eval '' just does nothing without any errors.

What I meant was the following, [...] So I just meant to handle it gracefully (i.e. silently), when the function/program didn't exist...

I wouldn't like a silent failure unless the failure is beyond the user's control. In the present case, the user can just register the hook only when the command exists.

And I blindly assume that it's then also guaranteed that _comp_finalize() does run, when completing... in order to reset the trap?!

It's guaranteed unless some functions set their own RETURN trap and forget to reset it before it returns. Or unless some functions cause fatal error and the entire function call chain is canceled, which may happen with nounset, failglob, etc. But I'd say all of these cases are the fault of the function. One case that might be a problem is that the top-level external completion function that calls _init_completion also sets its own RETURN trap. Anyway, even if it failed to clear the RETURN trap, it will finally be cleared when _comp_finalize is called next time for the return from the top-level function call, so I don't think there is an actual impact. (edit: I have added the checks for the caller name in case something is wrong e2c4a54)

I would have rather thought about some general words in the documentation that describes that whole feature, telling that care must be used on the shell commands used as hooks, that internal variables should all be local and that unset should only be called on such internal ones that had been localed in the same hook command, unless one knows what one's doing.

I don't know, but that seems like general points that we need to care about when writing shell functions that would work with other frameworks but not the special points we need to be aware of in writing these hooks.

Yes, this one is also what I thought in implementing this function. I think we may e.g. extend the usage of _xfunc (that is going to be replaced by _comp_xfunc in #734). Currently the search path is only in completions/* for command completions, but we may consider adding helpers in the search path of the xfunc mechanism, but that is another mechanism that should be discussed in a separate PR.

But then please give some example like tutorials.

This is just an idea that I had. I just wrote it in case you are interested enough so that you could take a glance at the implementation of _xfunc, which is just a function with several lines. Looking at the actual picture is worth more than a thousand words. If you are not interested enough to do that, that's fine for me, so you can forget about the _xfunc matter.

edit: I have changed the function name so that it can be used through _comp_xfunc fd5526b. Now the function has been moved to completions/ARRAY (where ARRAY is a tentative choice) and can be called with, e.g.,

_comp_xfunc ARRAY filter -mF arr hello

Edit: f1f454b I have decided to just give the associative/indexed array attributes to the configuration variables BASH_COMPLETION_FINALIZE{,_CMD}_HOOKS.

Not sure whether that's really better... because now, these don't show up when e.g. completing $ or in set yet they are still already there... might be confusing to people... but again probably also just a matter of taste.

What would you suggest then? The reason that I kept the attribute is that, otherwise, the users need to set it by themselves every time they set the hook to the associative array BASH_COMPLETION_FINALIZE_CMD_HOOKS. If the user forgets it, the command will be interpreted as an index after the arithmetic evaluation (that would probably be just 0). For the indexed array BASH_COMPLETION_FINALIZE_HOOKS, I actually think we do not have to worry about that, but it's good to set the attribute just for symmetry.
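The pitfall described here is easy to reproduce. In this sketch, `plain_hooks`, `assoc_hooks`, and `my_ssh_hook` are placeholder names rather than the PR's variables.

```bash
unset -v plain_hooks assoc_hooks ssh

# Without the attribute, the subscript is evaluated arithmetically:
# the unset name `ssh` evaluates to 0, so the entry lands at index 0.
plain_hooks[ssh]='my_ssh_hook'

# With the associative attribute, the subscript is kept as a string key.
declare -A assoc_hooks
assoc_hooks[ssh]='my_ssh_hook'

declare -p plain_hooks assoc_hooks
```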

@akinomyoga akinomyoga force-pushed the BASH_COMPLETION_FINALIZE_HOOKS branch 2 times, most recently from d65cb92 to 7b48387 Compare April 17, 2022 22:28
@calestyo
Contributor

Bash language discussion

What do you mean? The above line in the code is unrelated to the protection from external readonly variables.

I got the wrong line of code, I had meant this:

elif [[ $1 == @(_comp_local_*|OPTIND|OPTARG|OPTERR) ]]; then

But anyway... that of course doesn't protect (from read-only issues) either.

Right, it's never inherited. The readonly attribute is just a concept of the shell, not one defined by the operating system.

Sure, but I mean a shell could choose to inherit it e.g. via using non-valid POSIX variable names in the environment with which another shell is invoked. IIRC that's what bash does when exporting functions.

Anyway.. I guess the point about read-only is settled.

That's also possible, but I would like to keep it possible to specify value to $1 (target array name) as far as it is not used for the user-supplied predicate.

Fair enough.

I haven't assumed that someone would specify the command that needs external commands because spawning a process for each element has too much cost. The shell command that I had in my mind is a builtin command/construct or a shell function where the local variables are available through the dynamic scoping.
If one wants to spawn an external command, one can include export value in $2 or call the external command with tempenv (i.e., value=$value cmd args...). (edit: I have updated to accept external command as the command name and also make value an exported variable 72ddb92)

You do really make this quite perfectionist. 😊👍

Well, that's intentional; I didn't intend to make it a complete sentence when I have written that (although making it a complete sentence is also another valid choice of course). That's just the same as

Okay, but then I'd suggest to drop the the before filter… otherwise it sounds a bit strange.

Actually, I intend that the predicate always returns 0 or 1 unless there is a bug in the code. The error message is just to detect the wrong usage (note that exit status 2 typically means the wrong usage, i.e., a bug at the caller side). I wouldn't like to ensure any well-defined status after something went south by bugs, but if I were to support any, I would recover the original contents of the array rather than keeping the already half-modified contents. Touching (i.e., compactifying) the modified array after detecting exceptions seems to make the debugging harder, so I currently don't feel it's a good idea.

Well I have no particular expectations… just noted it and wanted to see what you think of it.

Would you like to have a mechanism to cancel the modification in the middle of the loop? Maybe I can support it with another special exit status e.g. 3, 27, etc. In that case, I think we shouldn't output any error messages. (edit: I have supported the exit status 27 7d7b961)

What I'd perhaps do is to define exit statuses like the following:

  • 0/1 as is
  • 2 error, any modifications to the array that already took place are kept, but the function is left
  • any others are undefined and explicitly reserved for future use

That would give you the freedom to add any further future functionality, should it ever be needed, without causing possible breakage.
Your use of 27 is of course nice.

Maybe I miss your point, but with the current implementation, these are mutually exclusive, and the last-specified one does win.

With "mutually exclusive" I meant in the sense, that you don't get an error if more than one is specified.
Your commit 1eb7a6e fixes this nicely.

That's not the scope of this PR as I have replied in the original issue. To put it explicitly, this PR doesn't provide the feature of modifying the behavior of _known_hosts_real.

No, _comp_finalize is only called for the functions that set the RETURN hook by calling _init_completion so will not be called for _known_hosts_real.

Do you see any other "simple" way of getting what I desire (within the currently proposed framework), other than by setting a hook for any command that might do hostname completion?

Also, if the hook kicks in just based on the command, e.g. one that does exclusion of completions, might also just exclude such that match the string, but are perhaps from a completely different completed part of the command.

For example if I'd set for ssh a hook that filters out ^example$, my understanding is that it wouldn't just do so for the hostname, but also e.g. for -l, right?

Probably not such a big problem for me, as hostnames give rather typical patterns... but still.

I mean that was why I always thought it was nice to also do this on a per function base.

If you are not interested enough to do that, that's fine for me, so you can forget about the _xfunc matter.
edit: I have changed the function name so that it can be used through _comp_xfunc fd5526b. Now the function has been moved to completions/ARRAY (where ARRAY is a tentative choice) and can be called with, e.g.,

I actually think it's really nice. What I meant was rather that any other normal (possibly interested) user (who didn't follow this ticket) of this will likely not search for the goodies in the code - that's why some small documentation or tutorial could be nice "How to hook into completions?" which then give the example of:

_comp_xfunc ARRAY filter -mF arr hello

What would you suggest then? The reason that I kept the attribute is that, otherwise, the users need to set it by themselves every time they set the hook to the associative array BASH_COMPLETION_FINALIZE_CMD_HOOKS. If the user forgets it, the command will be interpreted as an index after the arithmetic evaluation (that would probably be just 0). For the indexed array BASH_COMPLETION_FINALIZE_HOOKS, I actually think we do not have to worry about that, but it's good to set the attribute just for symmetry.

Well I guess this is also just a matter of personal taste, so I leave that up to you as the developer of this.
I could have also gone by requiring the user to set the attributes (which would also make sure that the user actually realises what he's working with).
But I can really live with either way.

@akinomyoga
Collaborator Author

Okay, but then I'd suggest to drop the the before filter… otherwise it sounds a bit strange.

What I'd perhaps do is to define exit statuses like the following:

  • 0/1 as is
  • 2 error, any modifications to the array that already took place are kept, but the function is left
  • any others are undefined and explicitly reserved for future use

That would give you the freedom to add any further future functionality, should it ever be needed, without causing possible breakage.

Thanks, I have updated the descriptions and the error message. e5ccbd7


Do you see any other "simple" way of getting what I desire (within the currently proposed framework), other than by setting a hook for any command that might do hostname completion?

I don't think there is a way that you would consider simple. I think the simplest way in this situation is actually to replace _known_hosts_real by overwriting it with a version defined in the user's .bashrc file.

If you want to track the upstream definition, you may use the approach mentioned in #720 (comment). Or, if you are using ble.sh, maybe you can borrow the pre-defined utility of this trick, e.g.,

ble/function#advice after _known_hosts_real "_comp_xfunc ARRAY filter -E COMPREPLY '^example'"

Actually, ble.sh already overwrites _parse_help, _longopt, and _parse_usage in bash_completion using this utility.
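For readers without ble.sh, one generic way to wrap an existing function while still tracking its current definition is to copy it under a new name with `declare -f`. This is only a sketch: it is neither ble.sh's implementation nor necessarily the approach referenced in #720, and `demo_hosts` is a hypothetical stand-in for a helper that fills COMPREPLY.

```bash
# Hypothetical stand-in for a completion helper that fills COMPREPLY.
demo_hosts() { COMPREPLY=(example.com alpha.test example.org beta.test); }

# Copy the current definition under a new name (orig_demo_hosts) ...
eval "orig_$(declare -f demo_hosts)"

# ... then redefine the function to post-process the original's result.
demo_hosts() {
    orig_demo_hosts "$@"
    local i
    for i in "${!COMPREPLY[@]}"; do
        [[ ${COMPREPLY[i]} == example* ]] && unset -v 'COMPREPLY[i]'
    done
    COMPREPLY=("${COMPREPLY[@]}")   # compact the indices again
}

demo_hosts
printf '%s\n' "${COMPREPLY[@]}"
```

The wrapper has to be installed after the original function is defined, which matters for functions that are loaded lazily.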

Also, if the hook kicks in just based on the command, e.g. one that does exclusion of completions, might also just exclude such that match the string, but are perhaps from a completely different completed part of the command.

For example if I'd set for ssh a hook that filters out ^example$, my understanding is that it wouldn't just do so for the hostname, but also e.g. for -l, right?

Right, so the PR doesn't exactly satisfy your needs in this sense.

I actually think it's really nice. What I meant was rather that any other normal (possibly interested) user (who didn't follow this ticket) of this will likely not search for the goodies in the code - that's why some small documentation or tutorial could be nice

Yes, I think we can add documentation for this stuff if this would finally go into bash_completion.

@calestyo
Contributor

Thanks, I have updated the descriptions and the error message. e5ccbd7

# The other exit status is reserved and cancel the array filtering with an

That reads a bit unusual... exit statuses should be all other ones, so either Any other exit status or The other exit statuses.

Bash language discussion

I cannot test that right now since I'm away a few days over Easter... but...

I don't think there is a way that you would consider simple.

Maybe I just miss something, but you set up the return trap in _init_completion(), which is e.g. called by _ssh() or _ping(), and the restoration of the old return trap happens when these (_ssh() or _ping()) are left, right?

These however also call _known_hosts_real()... so shouldn't _comp_finalize() be called also when e.g. _known_hosts_real() (as called from e.g. _ssh()) returns?

If so, wouldn't it be possible to use the user hook from BASH_COMPLETION_FINALIZE_HOOKS and check for the parent function in FUNCNAME and thereby get what I want?

Right, so the PR doesn't exactly satisfy your needs in this sense.

I still think it should get merged, because it does satisfy some of my use cases.

@calestyo
Contributor

If you want to track the upstream definition, you may use the approach mentioned in #720 (comment).

That would of course work, too.

@akinomyoga
Collaborator Author

akinomyoga commented Apr 19, 2022

That reads a bit unusual... exit statuses should be all other ones, so either Any other exit status or The other exit statuses.

Thanks, I'll later modify it. (edit: fixed d8bccad)

Bash language discussion

These however also call _known_hosts_real()... so shouldn't _comp_finalize() be called also when e.g. _known_hosts_real() (as called from e.g. _ssh()) returns?

No, it's not called, as I explained in the following reply. _comp_finalize is called only twice per completion: on the return of _init_completion (which _comp_finalize ignores) and on the return of the caller of _init_completion.

quoted from #739 (comment)

As for the callees, DEBUG and RETURN traps will not be inherited unless the callee function is marked with the trace attribute by declare -ft funcname. The function marked with the trace attribute intentionally captures all the RETURN actions, so I don't think that is the problem.
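The inheritance rule can be observed with a small experiment. The function names below are hypothetical stand-ins: `install_trap` plays the role of a function that sets the trap on behalf of its caller, and `callee` is an ordinary function without the trace attribute.

```bash
return_log=()

callee() { :; }
# Stand-in for a function that installs the trap on behalf of its caller.
install_trap() { trap 'return_log+=("${FUNCNAME[0]}")' RETURN; }
top_level() {
    install_trap
    callee
}

top_level
trap - RETURN   # clean up so that later function returns are unaffected
printf 'RETURN trap ran for: %s\n' "${return_log[@]}"
```

The log should list `install_trap` and `top_level` but not `callee`, which corresponds to the two calls per completion described above.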

I still think it should get merged, because it does satisfy some of my use cases.

It's technically possible by giving all the functions the trace attribute or by running Bash in the debug mode (extdebug or functrace), but I don't think it is a good design to affect all the returns in the call chain. It will make the entire processing slower. It will fail when another function wants to use the RETURN trap for another purpose. The affected range is too large compared to what to be achieved.

@akinomyoga akinomyoga force-pushed the BASH_COMPLETION_FINALIZE_HOOKS branch 2 times, most recently from 0b334d7 to 59e69cb Compare April 21, 2022 01:51
@calestyo
Contributor

It's technically possible by giving all the functions the trace attribute or by running Bash in the debug mode (extdebug or functrace), but I don't think it is a good design to affect all the returns in the call chain. It will make the entire processing slower. It will fail when another function wants to use the RETURN trap for another purpose. The affected range is too large compared to what to be achieved.

Hmm I see. And AFAIU, you didn't like the idea of e.g. manually calling a hook from functions of interest?!

@akinomyoga
Collaborator Author

Let me comment on the current situation of this technical experiment on the RETURN trap.

The approach by the RETURN trap still has the issue reported by #786. In particular, the RETURN trap may be skipped by SIGINT as described in #786 (comment). A possible workaround by an explicit trap ... INT was suggested in #786 (comment). I also confirmed that there seem to be no issues with the other signals that can possibly be caused by user input. I later attempted to implement the idea by trap ... INT in 93c2a86, but it turned out that Bash's handling of trap ... INT in completion functions is broken, as I explained in #827 (comment). I have submitted a patch to the upstream Bash, and the fix was recently applied to Bash's devel branch. However, I think it will not be applied to older versions of Bash.

@akinomyoga
Collaborator Author

The approach by the RETURN trap still has the issue reported by #786.

I'm not sure if it is robust, but I posted a possible solution in #786 (comment).

@calestyo
Contributor

Any expectation whether and when any of this might get merged?

@akinomyoga
Collaborator Author

akinomyoga commented Mar 22, 2023

We need to discuss and reach a consensus.

As for the "finalize" hook, I think we are positive about merging it as long as the abovementioned problem is fixed in a robust way with sufficient testing. The bugfix I submitted to Bash would be in Bash 5.3, but it will take at least ten years for Bash <= 5.2 to go away from the market, so we need an additional mechanism to work around the problem. Currently, I feel the third solution in #786 (comment) would be the cleanest, but I haven't carefully tested it.

As for _comp_xfunc_ARRAY_filter, I honestly think it is not needed for bash-completion, though we might think of creating a separate Bash library that contains the function.

@calestyo
Contributor

calestyo commented Apr 5, 2023

As for _comp_xfunc_ARRAY_filter, I honestly think it is not needed for bash-completion, though we might think of creating a separate Bash library that contains the function.

But I guess that would make it in practise mostly useless or better said: much harder to use.

bash-completion is shipped basically everywhere, while such a separate library would not.

@akinomyoga
Collaborator Author

As for _comp_xfunc_ARRAY_filter, I honestly think it is not needed for bash-completion, though we might think of creating a separate Bash library that contains the function.

But I guess that would make it in practise mostly useless or better said: much harder to use.

I didn't mean to create a Bash library that contains only one function ARRAY_filter. You can create a general utility library and you can include the function as a part of an array library.

In that sense, even if we would include it in bash-completion, it is strange that only a single utility function, for a very specific purpose and unused by bash-completion itself, is provided as a part of bash-completion. People wouldn't expect bash-completion to have general utility functions and wouldn't try to find it in the first place.

Some people might be careful finding the utility possibly in the documentation and use it if bash-completion offers it, but I would expect such people would more easily find the utility if the utility would be shipped in an appropriate project that aims at providing utilities. bash-completion doesn't aim at being a general Bash library, and only contains the functions that bash-completion would use by itself.

bash-completion is shipped basically everywhere, while such a separate library would not.

This sounds as if the reason to include the array function in bash-completion is to borrow the ubiquity of bash-completion. The function is not specifically related to bash-completion, and bash-completion is not a repository for distributing random scripts.

scop and others added 6 commits May 13, 2024 12:53
- deal with nested completion initializations
- rename the function `_comp_{return_hook => finalize}`
- add an associative array `BASH_COMPLETION_FINALIZE_CMD_HOOKS`
  originally suggested as `_comp_return_hooks` in Ref. [1]
- add a new array `BASH_COMPLETION_FINALIZE_HOOKS`

[1] scop#720 (comment)
@akinomyoga akinomyoga force-pushed the BASH_COMPLETION_FINALIZE_HOOKS branch from 21c1a88 to 8d41bf5 Compare May 13, 2024 03:53