Fix KEYBD trap for multibyte characters (re: 4886463)
The KEYBD trap should now be fully functional for UTF-8 and other
multibyte locales. Thanks to Johnothan King for finding this fix!

Analysis: The KEYBD trap code processes character code points
stored in e_lbuf by ed_read(). But shell variables store bytes, not
characters. So, in UTF-8 locales for example, the Unicode code
points need to be converted to multibyte UTF-8 encoding. The
conversion is needed both to calculate the length in bytes of each
encoded character (which fixes the corruption issue) and so that
keytrap() can store the character's UTF-8 representation in
${.sh.edchar}.
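
As a rough illustration of the conversion described in the analysis, here is a minimal sketch of encoding one code point as UTF-8 and returning its length in bytes. The name utf8_encode is hypothetical and chosen for this example only; it is not ksh's mbconv, which may behave differently and also handles locales other than UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper for illustration only (not ksh's mbconv).
 * Encodes one Unicode code point as UTF-8 into out, which must have
 * room for 4 bytes. Returns the number of bytes written, or 0 if cp
 * is a surrogate or lies beyond U+10FFFF.
 */
size_t utf8_encode(uint32_t cp, unsigned char *out)
{
	assert(out != NULL);
	if (cp < 0x80) {
		/* ASCII: one byte, identical to the code point */
		out[0] = (unsigned char)cp;
		return 1;
	}
	if (cp < 0x800) {
		out[0] = (unsigned char)(0xC0 | (cp >> 6));
		out[1] = (unsigned char)(0x80 | (cp & 0x3F));
		return 2;
	}
	if (cp < 0x10000) {
		if (cp >= 0xD800 && cp <= 0xDFFF)
			return 0;	/* surrogates cannot be encoded */
		out[0] = (unsigned char)(0xE0 | (cp >> 12));
		out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[2] = (unsigned char)(0x80 | (cp & 0x3F));
		return 3;
	}
	if (cp <= 0x10FFFF) {
		out[0] = (unsigned char)(0xF0 | (cp >> 18));
		out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
		out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[3] = (unsigned char)(0x80 | (cp & 0x3F));
		return 4;
	}
	return 0;
}
```

With this sketch, utf8_encode(0x20AC, buf) writes the three bytes E2 82 AC and returns 3. The returned length is the kind of per-character byte count the analysis refers to.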

src/cmd/ksh93/edit/edit.c: ed_getchar():
- Remove the workaround from the referenced commit.
- Use mbconv to convert input code points to bytes before adding
  them to inbuff, a char array that is passed on to keytrap().
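
To see why code points cannot simply be copied into a char array, consider what a plain assignment to a single byte does in C. The following is a general sketch, not the ksh code; store_as_single_byte is a hypothetical name, and 8-bit chars are assumed.

```c
/*
 * Hypothetical helper for illustration only.
 * Stores a code point in one byte, the way a plain assignment to an
 * unsigned char would. Assuming 8-bit chars, the conversion keeps
 * only the low 8 bits, so every code point above 255 loses data.
 */
unsigned char store_as_single_byte(unsigned int cp)
{
	return (unsigned char)cp;
}
```

Under that assumption, U+0141 (Ł) collapses to 0x41, the same byte as ASCII 'A', and U+20AC (€) collapses to 0xAC, which is not a valid UTF-8 sequence on its own. Converting each code point to its multibyte encoding first avoids both problems.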

Related:  https://bugzilla.redhat.com/show_bug.cgi?id=1503922
Related:  att#197
Related:  #307
Resolves: #460
Co-authored-by: Johnothan King <[email protected]>
McDutchie and JohnoKing committed Dec 30, 2024
1 parent b769451 commit dfd3ca9
Showing 4 changed files with 10 additions and 11 deletions.
6 changes: 6 additions & 0 deletions NEWS
@@ -2,6 +2,12 @@ This documents significant changes in the 1.0 branch of ksh 93u+m.
 For full details, see the git log at: https://github.com/ksh93/ksh/tree/1.0
 Uppercase BUG_* IDs are shell bug IDs as used by the Modernish shell library.
 
+2024-12-30:
+
+- The KEYBD trap should now be fully functional for multibyte characters
+  (for example, non-Latin characters in UTF-8 locales). This fixes a bug
+  inherited from AT&T and worked around on 2022-02-12.
+
 2024-12-25:
 
 - The dirname path-bound built-in now accepts multiple operands.
9 changes: 3 additions & 6 deletions src/cmd/ksh93/edit/edit.c
@@ -836,14 +836,11 @@ int ed_getchar(Edit_t *ep,int mode)
 {
 	if(mode<=0 && -c == ep->e_intr)
 		killpg(getpgrp(),SIGINT);
-	if(mode<=0 && sh.st.trap[SH_KEYTRAP]
-	/* workaround for <https://github.com/ksh93/ksh/issues/307>:
-	 * do not trigger KEYBD for non-ASCII in multibyte locale */
-	&& (!mbwide() || c > -128))
+	if(mode<=0 && sh.st.trap[SH_KEYTRAP])
 	{
 		ep->e_keytrap = 1;
-		n=1;
-		if((readin[0]= -c) == ESC)
+		n = mbconv(readin, -c);
+		if(n==1 && readin[0]==ESC)
 		{
 			while(1)
 			{
2 changes: 1 addition & 1 deletion src/cmd/ksh93/include/version.h
@@ -18,7 +18,7 @@
 #include <ast_release.h>
 #include "git.h"
 
-#define SH_RELEASE_DATE "2024-12-25" /* must be in this format for $((.sh.version)) */
+#define SH_RELEASE_DATE "2024-12-30" /* must be in this format for $((.sh.version)) */
 /*
  * This comment keeps SH_RELEASE_DATE a few lines away from SH_RELEASE_SVER to avoid
  * merge conflicts when cherry-picking dev branch commits onto a release branch.
4 changes: 0 additions & 4 deletions src/cmd/ksh93/sh.1
@@ -9513,10 +9513,6 @@ Thus, a trap on
 .B CHLD
 won't be executed until the foreground job terminates.
 .PP
-In locales that use a multibyte character set such as UTF-8, the
-.B KEYBD
-trap is only triggered for ASCII characters (1-127).
-.PP
 It is a good idea to leave a space after the comma operator in
 arithmetic expressions to prevent the comma from being interpreted
 as the decimal point character in certain locales.
