Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(windows): core needs to preprocess U+000D U+000A to U+000A in context before any other processing #10471

Open
mcdurdin opened this issue Jan 23, 2024 · 7 comments
Assignees
Milestone

Comments

@mcdurdin
Copy link
Member

mcdurdin commented Jan 23, 2024

see EDIT:
I am anticipating that this will become a problem soon (encountered while working on the keyboard debugger).

I think this belongs under LDML keyboardprocessor. On Windows at least, \r\n should always be deleted as a block for K_BKSP.

We may need to special-case this, testing for the presence of this pair at the end of the context and requesting 2 back-deletions rather than 1.

(Note: I think this will become visible with the move from action queue to action struct in #10441, and become more obvious after #10415 is implemented.)

EDIT:

Principle -- Engine MUST preprocess context from compliant apps to convert \r\n to \n before supplying to Core, and then when emitting into compliant apps, do the inverse, \n to \r\n. Note the Keyman Developer debugger also needs to consider doing this.

See PR #10697 Which implements the proposed algorithm in the engine but we want it in the core.

We track what the engine gives the core input context, and give the same pattern back. This then covers all combinations of \r \n \r\n.

Test cases will be needed around buffer limits (if the algorithm causes truncation).
Also testing the developer debugger vs the built and installed keyboard should be tested to make the developer debugger exhibits the same behaviour.
Testing on each platform also (linux, windows macos)

@keymanapp-test-bot keymanapp-test-bot bot added bug core/ Keyman Core labels Jan 23, 2024
@mcdurdin mcdurdin added this to the A17S31 milestone Jan 23, 2024
@mcdurdin
Copy link
Member Author

@rc-swag assigning to you but happy to discuss on who owns.

@srl295
Copy link
Member

srl295 commented Jan 26, 2024

Q: how does kmx handle this?

Is this going to be an issue for authors? that is, will keyboards see A\nB on some platforms and A\r\nB on others? dare I say, almost a normalization issue

@rc-swag
Copy link
Contributor

rc-swag commented Jan 28, 2024

I don't follow when will there be a need to backspace over a \r\n? Internally in the core the context will be invalidated on a "carriage return". On the platform side for Context-aware/Compliant apps the set_if_needed will only have the context up to the start of line.

@mcdurdin mcdurdin changed the title bug(core): LDML keyboardprocessor needs to delete U+000D U+000A as a unit on Windows feat(windows): engine needs to preprocess U+000D U+000A to U+000A before passing into Core Jan 29, 2024
@keymanapp-test-bot keymanapp-test-bot bot added feat windows/ and removed bug core/ Keyman Core labels Jan 29, 2024
@mcdurdin
Copy link
Member Author

mcdurdin commented Jan 29, 2024

Note: see edit in OP for change in perspective post discussion with Ross. Keeping the same issue for now.

Principle -- preprocess context from compliant apps to convert \r\n to \n, and then when emitting into compliant apps, do the inverse, \n to \r\n.

@mcdurdin
Copy link
Member Author

Q: how does kmx handle this?

Is this going to be an issue for authors? that is, will keyboards see A\nB on some platforms and A\r\nB on others? dare I say, almost a normalization issue

Yes we had the same discussion. Resolution is to normalize by Engine before passing into Core, given it applies mainly to Windows. Then keyboard authors will only ever see \n.

@mcdurdin
Copy link
Member Author

Internally in the core the context will be invalidated on a "carriage return". On the platform side for Context-aware/Compliant apps the set_if_needed will only have the context up to the start of line.

The context may include a \n -- some apps start a new context on a new para, others treat the entire text buffer as a single unit.

@mcdurdin
Copy link
Member Author

Note: we could add a hint to the keyboard compiler in the future to note that 0x0D will never be seen in context (noting that more work needs to be done on KMW for this to be the case).

@mcdurdin mcdurdin modified the milestones: A17S31, B17S1 Feb 3, 2024
@mcdurdin mcdurdin modified the milestones: B17S1, B17S2 Feb 17, 2024
@mcdurdin mcdurdin modified the milestones: B17S2, B17S3 Mar 3, 2024
@darcywong00 darcywong00 modified the milestones: B17S3, B17S4 Mar 16, 2024
@rc-swag rc-swag modified the milestones: B17S4, 18.0 Mar 20, 2024
@rc-swag rc-swag changed the title feat(windows): engine needs to preprocess U+000D U+000A to U+000A before passing into Core feat(windows): core needs to preprocess U+000D U+000A to U+000A in context before any other processing Mar 22, 2024
@mcdurdin mcdurdin modified the milestones: 18.0, A18S9 Apr 29, 2024
@darcywong00 darcywong00 modified the milestones: A18S9, A18S10 Aug 31, 2024
@darcywong00 darcywong00 modified the milestones: A18S10, A18S11 Sep 14, 2024
@darcywong00 darcywong00 modified the milestones: A18S11, A18S12 Sep 28, 2024
@darcywong00 darcywong00 modified the milestones: A18S12, A18S13 Oct 11, 2024
@darcywong00 darcywong00 modified the milestones: A18S13, A18S14 Oct 26, 2024
@darcywong00 darcywong00 modified the milestones: A18S14, A18S15 Nov 9, 2024
@rc-swag rc-swag modified the milestones: A18S15, A18S20 Nov 11, 2024
@rc-swag rc-swag modified the milestones: A18S20, 19.0 Nov 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: No status
Development

Successfully merging a pull request may close this issue.

4 participants