Mutated vowels are not rendered correctly while entering text in a UTF-8 t3 file using the curses interface. #4

Open
goto40 opened this issue Jul 16, 2019 · 1 comment

Comments

goto40 commented Jul 16, 2019

I use https://github.com/mikawa/G-TADS to create a German TADS project.
I installed the 'de_de' files from there.

I created an example project with

  • a utf8 charset and
  • a starting room whose description contains umlauts and ß (ÄÖÜäöüß)

beispiel.t

#charset "utf8"
//...
#include <adv3.h>
#include <de_de.h>
//...
startRoom: Room 'Startraum'
    "Das ist der Startraum. Umlaute sind ÄÖÜ äöü ß etc... Gib 'über' ein..."
;

Then I compile my project and run the t3 file:

frob -k utf8 beispiel.t3

The output is OK: the umlauts are rendered correctly in the curses interface.

However, if I type an umlaut, I get "??" in its place (e.g. '??ber' instead of 'über'). The input is interpreted correctly; only the echoed input text is wrong.

Note: the plain interface works as expected.

Owner

realnc commented Jul 26, 2019

The part of the TADS base code that deals with character-oriented displays (osgen3.c) does not actually support Unicode. It always assumes that text consists of one byte (8 bits) per character. UTF-8 output more or less works only because it can be passed through the various osgen3 text functions as if it were 8-bit text. But these functions do not treat it correctly, even if it appears to display mostly correctly. They get the text length wrong, so there are some glitches even on output.
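
To illustrate the length mismatch, here is a rough Python sketch. It is not code from osgen3.c, and the helper name `byte_and_char_lengths` is made up for this example. It only shows that counting bytes overstates the number of characters once umlauts are involved:

```python
def byte_and_char_lengths(text: str) -> tuple[int, int]:
    """Return (UTF-8 byte count, code point count) for text."""
    encoded = text.encode("utf-8")
    return len(encoded), len(text)


# ASCII-only text: one byte per character, so both counts agree.
print(byte_and_char_lengths("uber"))        # (4, 4)

# U+00FC (u with diaeresis) takes two bytes in UTF-8, so a routine that
# assumes one byte per character sees five "characters" instead of four.
print(byte_and_char_lengths("\u00fcber"))   # (5, 4)
```

Any code that derives a cursor position or line width from the byte count will be off by one for every two-byte character in the line.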

Input, however, is where this problem really shows, because each single character matters and getting one character wrong results in the rest not being parsed correctly.
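
The '??ber' symptom from the report can be modelled with another small Python sketch. This is a hypothetical model, not the actual osgen3 logic: it assumes the echo routine handles each byte as its own character and shows anything outside printable ASCII as '?'.

```python
def echo_bytewise(raw: bytes) -> str:
    """Echo input the way a one-byte-per-character display might.

    Assumption for illustration only: every byte outside printable
    ASCII is replaced by '?'.
    """
    return "".join(chr(b) if 0x20 <= b < 0x7F else "?" for b in raw)


typed = "\u00fcber".encode("utf-8")   # the two bytes 0xC3 0xBC, then b"ber"
print(echo_bytewise(typed))           # ??ber
```

Under this assumption the two bytes of the umlaut produce two placeholders, which matches the "??" in the report, while the bytes themselves are still valid UTF-8.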

I'm not sure what to do about this. I'd need to implement a Unicode-aware replacement for osgen3. I have no immediate plans for that, so this bug will probably remain open for a while.
