Displaying Multibyte Characters #32

Open
Comamoca opened this issue May 25, 2022 · 13 comments

@Comamoca

Hello! I am creating a TUI application that displays Japanese using illwill.
However, I am having a problem with displaying Japanese characters with illwill, and I am wondering what to do about it.

  • How can I display Japanese and other multibyte characters in illwill?
  • Will illwill support multibyte characters in the future?

I would appreciate answers to the two questions above.
Thank you!

@johnnovak
Owner

johnnovak commented May 26, 2022

What platform are you using? On Windows, multibyte character support is problematic in the standard command prompt. It should work a lot better on Linux and macOS.

Internally, illwill stores every character as a Unicode code point, so things should just work. However, I haven't tested it specifically with languages that use multibyte characters heavily.

I suspect that what you're experiencing is a console limitation.

Answering your questions:

  1. I don't know, I have no idea how Japanese characters work.
  2. It should already handle multibyte characters. I need more info on what problems you're experiencing and what setup you have. Posting a test case here would help.

@Comamoca
Author

Thanks for your reply!

I am using Linux (Manjaro).

Below are a screenshot of the problem and the code that produces it.
The display corruption occurs when I run the code below in WezTerm.
I have tried running the same code in other terminals such as Xfce Terminal, LXTerminal, and Kitty, and they all produce the same result.

Screenshot
scshot_2022-05-28_04-29-20

Source Code

import illwill
import os
import json
import strformat
import strutils


# ------------------------------------
# Process to prepare content
# ------------------------------------


# Draw the preview area on the right side of the screen
proc previewArea(tb: var TerminalBuffer, content: string) =
  let width = toInt(tb.width / 3)

  tb.setForegroundColor(fgYellow)
  tb.drawRect(tb.width - width - 1, 0, tb.width-1, tb.height-1)
  tb.write(tb.width - width, 1, content)

# Draw the selection menu on the left side of the screen
proc selectArea(tb: var TerminalBuffer, y: int, pos: int, datas: seq[Results]) =
  for i, data in datas:
    if i == pos:
      tb.setForegroundColor(fgBlack, true)
      tb.setBackgroundColor(bgGreen)
      tb.write(2, y+i+1, data.title)
    else:
      tb.write(2, y+i+1, data.title)
    tb.resetAttributes()


proc exitProc() {.noconv.} =
  illwillDeinit()
  showCursor()
  quit(0)

illwillInit(fullscreen = true)
setControlCHook(exitProc)
hideCursor()

# cursor position
var pos: int

while true:
  var tb = newTerminalBuffer(terminalWidth(), terminalHeight())
  var key = getKey()

  tb.selectArea(0, pos, results)
  tb.previewArea(results[pos].content)

  case key
  of Key.J:
    pos = pos + 1
    if results.len <= pos:
      pos = 0
  of Key.K:
    pos = pos - 1
    if pos < 0:
      pos = results.len - 1
      continue
  of Key.Escape, Key.Q: exitProc()
  of Key.Enter:
    exitProc()
    echo pos
  else: discard

  tb.display()
  sleep(20)

@johnnovak
Owner

So judging by your screenshot, I think what's happening is that the Japanese characters physically seem to take up the width of two Latin characters, but they're encoded as multiple UTF-8 code points, most likely not two codepoints, but perhaps more?
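
One way to check this guess is to count bytes and code points with Nim's std/unicode. The snippet below is only an illustrative sketch and is not part of illwill; the variable names are made up, and it assumes the source file is saved as UTF-8. It suggests that each kana is a single code point stored as three UTF-8 bytes, so the extra width comes from the terminal drawing the glyph across two columns rather than from extra code points.

```nim
import std/unicode

# One hiragana character next to one Latin character (hypothetical samples).
let kana = "い"
let latin = "i"

# len counts bytes of the UTF-8 encoding; runeLen counts code points.
echo kana.len, " byte(s), ", kana.runeLen, " code point(s)"
echo latin.len, " byte(s), ", latin.runeLen, " code point(s)"

# Iterating with runes yields one Rune per code point, which matches
# the idea of storing one character per screen cell.
for r in "いろは".runes:
  echo r, " -> U+", int(r)
```

If each code point is written into exactly one cell but the terminal draws it two columns wide, everything to its right on that row shifts, which matches the broken borders in the screenshot.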

Like I said, I don't speak Japanese and know very little about the Japanese language, the symbols, and how the symbols are encoded on computers. I just found this page and there seems to be a lot of complexity regarding Japanese encodings:

https://www.sljfaq.org/afaq/encodings.html

I'm somewhat interested in getting to the bottom of this as it might affect not just Japanese but other non-Latin languages as well. But you'll need to provide a program that I can compile and execute — the above program you posted does not compile; I'm guessing it's a part of a larger program, and it doesn't output any Japanese characters, so it doesn't allow me to reproduce the issue visually on my computer.

Please don't assume anything, and provide all the following:

  1. What character encoding are you using in your terminal (is it UTF-8? is it something else that you need to use to display Japanese characters correctly?)
  2. A self-contained Nim program that I can compile and run that outputs some lines of text in Japanese that demonstrates the alignment problem.
  3. If you put the same few lines of text into a textfile, and you display it in the console with echo in the same terminal, or you open the textfile in Vim (in the same terminal), do the alignment issues still happen? A textfile like that would be very useful for testing.

@Comamoca
Author

The encoding used in the terminal is UTF-8.
In addition, Japanese fonts must be installed to correctly display Japanese characters on the terminal.
The font used on my terminal is UDEV-Gothic, but if you are using a Linux distribution or similar, noto-fonts-cjk is the easiest to install. There is probably no difference in display between the different fonts.

This program reads a file named test.txt from the current directory and displays it on the left side of the screen, with a rectangle that imitates a preview window on the right side.
Please place the text file below in the same directory and run it.

2022-05-30_02-47-15

The screenshot below shows the same text file opened in Vim. There I have not encountered any problems displaying Japanese.
2022-05-30_02-49-36

Program

import illwill
import os
import strutils

proc loadTxtFile(path: string): seq[string] =
  # Read the whole file at `path` and split it into lines.
  var f: File = open(path, FileMode.fmRead)
  defer: close(f)
  return f.readAll().split("\n")


proc drawText(tb: var TerminalBuffer, texts: seq[string]) =
  for idx, text in texts:
    tb.write(0, terminalHeight()-idx, text)

proc exitProc() {.noconv.} =
  illwillDeinit()
  showCursor()
  quit(0)

illwillInit(fullscreen = true)
setControlCHook(exitProc)
hideCursor()

var pos: int

var texts = loadTxtFile("test.txt")
while true:
  var tb = newTerminalBuffer(terminalWidth(), terminalHeight())
  var key = getKey()

  tb.drawText(texts)
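  # Keep the rectangle inside the buffer: the last valid column and row are width-1 and height-1.
  tb.drawRect(terminalWidth() div 2, 0, terminalWidth() - 1, terminalHeight() - 1)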

  case key
  of Key.Escape, Key.Q: exitProc()
  of Key.Enter:
    exitProc()
    echo pos
  else: discard

  tb.display()
  sleep(20)

Text file used for loading. (Save the file as test.txt.)

いろはにほへと ちりぬるを
わかよたれそ  つねならむ
うゐのおくやま けふこえて
あさきゆめみし ゑひもせすん

色は匂へど 散りぬるを
我が世誰そ 常ならむ
有為の奥山 今日越えて
浅き夢見じ 酔ひもせず

@johnnovak
Owner

Cheers, I'm busy with other stuff now, I'll have a look at this at some point.

@forthlee

Try modifying displayFull() in illwill.nim:

proc displayFull(tb: TerminalBuffer) =
  let widthTable = [
    (126,  1), (159,  0), (687,   1), (710,  0), (711,  1),
    (727,  0), (733,  1), (879,   0), (1154, 1), (1161, 0),
    (4347,  1), (4447,  2), (7467,  1), (7521, 0), (8369, 1),
    (8426,  0), (9000,  1), (9002,  2), (11021, 1), (12350, 2),
    (12351, 1), (12438, 2), (12442,  0), (19893, 2), (19967, 1),
    (55203, 2), (63743, 1), (64106,  2), (65039, 1), (65059, 0),
    (65131, 2), (65279, 1), (65376,  2), (65500, 1), (65510, 2),
    (120831, 1), (262141, 2), (1114109, 1)
  ]

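  # Return the display width (0, 1, or 2 columns) of code point c.
  # Each widthTable entry is (last code point of a range, width of that range).
  proc getWidth(c: int): int =
    if c == 0xe or c == 0xf:
      # Shift Out / Shift In control characters take up no space.
      return 0
    for (num, wid) in widthTable:
      if c <= num:
        return wid
    return 1

  var buf = ""
  # Number of blank cells still to be dropped to make up for wide characters.
  var skipNo = 0

  proc flushBuf() =
    if buf.len > 0:
      put buf
      buf = ""

  for y in 0..<tb.height:
    setPos(0, y)
    for x in 0..<tb.width:
      let c = tb[x,y]
      if c.bg != gCurrBg or c.fg != gCurrFg or c.style != gCurrStyle:
        flushBuf()
        setAttribs(c)
      var cc = $c.ch
      # A double-width character uses one extra column, so remember to
      # drop one of the blank cells that follow it.
      skipNo += (getWidth(cc.runeAt(0).int) - 1)
      if cc == " " and skipNo>0:
        cc = ""
        skipNo -= 1
      buf &= cc

    flushBuf()
    skipNo = 0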
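
The core idea of the patch above can be tried outside illwill. The following is a minimal sketch under stated assumptions, not the actual illwill or Urwid code: `wideRanges`, `cellWidth`, `displayWidth`, and `renderRow` are hypothetical names, and the range list is deliberately incomplete (a real implementation would follow the full Unicode East Asian Width data). It models one buffer row as a sequence of runes and drops one blank cell for each double-width rune, so the rendered row should still occupy as many columns as there are cells.

```nim
import std/unicode

# Hypothetical, incomplete list of (first, last) code points drawn two columns wide.
const wideRanges = [
  (0x1100, 0x115F),  # Hangul Jamo initial consonants
  (0x2E80, 0x303E),  # CJK radicals and punctuation
  (0x3041, 0x33FF),  # Hiragana, Katakana, CJK compatibility
  (0x4E00, 0x9FFF),  # CJK Unified Ideographs
  (0xAC00, 0xD7A3),  # Hangul syllables
  (0xFF01, 0xFF60),  # Fullwidth forms
]

# Number of terminal columns a single rune is assumed to occupy.
proc cellWidth(r: Rune): int =
  result = 1
  let cp = int(r)
  for rng in wideRanges:
    if cp >= rng[0] and cp <= rng[1]:
      return 2

# Total number of columns a string is assumed to occupy.
proc displayWidth(s: string): int =
  for r in s.runes:
    result += cellWidth(r)

# Turn one row of cells into the string sent to the terminal.
# Every wide rune adds one pending skip; each pending skip removes
# one later blank cell.
proc renderRow(cells: seq[Rune]): string =
  var skip = 0
  for r in cells:
    skip += cellWidth(r) - 1
    if r == Rune(0x20) and skip > 0:
      dec skip
    else:
      result.add r

# Two kana, four blank cells, then a border character: 7 cells in total.
let row = toRunes("いろ    |")
let rendered = renderRow(row)
echo rendered
doAssert displayWidth(rendered) == row.len
```

One limitation of this approach is that the compensation only happens at the next blank cell, so a wide character followed directly by other text shifts that text until a blank is reached. Reserving a placeholder cell at write time would avoid this, but that is a different design from the patch above.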

@Comamoca
Author

I tried that, and I got the expected behavior.
Thank you!

image

Comamoca reopened this on Nov 27, 2022
@Comamoca
Author

@forthlee
I am thinking of using this code to create a Japanese (and also Chinese, Korean, etc.) fork of illwill.
Would it be ok to include the code you provided in the fork?
Thank you.

@forthlee

The code is based on Urwid, which is licensed under the GNU Lesser General Public License v2.1.

@johnnovak
Owner

> @forthlee I am thinking of using this code to create a Japanese (and also Chinese, Korean, etc.) fork of illwill. Would it be ok to include the code you provided in the fork? Thank you.

Feel free to raise a PR if you think Japanese/Asian languages can be supported unobtrusively.

@Comamoca
Author

Comamoca commented Feb 4, 2023

@forthlee

Thanks!
I have spring break, so I want to work on this issue.

@Comamoca
Author

Comamoca commented Feb 4, 2023

@johnnovak

I will consider sending a PR once I have made some progress.
I would appreciate your help then. 😉

@johnnovak
Owner

> @johnnovak
>
> I will consider sending a PR once I have made some progress. I would appreciate your help then. 😉

Sure thing. Good luck! 😎


3 participants