This repository has been archived by the owner on Jul 22, 2022. It is now read-only.

Handle independent H-W VOICED SOUND MARK
When an independent half-width voiced sound mark exists,
e.g. U+FF9E HALFWIDTH KATAKANA VOICED SOUND MARK,
it becomes the full-width voiced sound mark
(U+309B) or a U+0022 double quote.

Fixes #115 infinite loop

Signed-off-by: Hiroshi Miura <[email protected]>
miurahr committed Feb 7, 2021
1 parent 976aec4 commit 7568000
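
As a quick reference for the code points named in the commit message, the following minimal sketch prints their official Unicode names using only the standard library. The helper name describe is hypothetical and is not part of pykakasi.

```python
import unicodedata

HALFWIDTH_VOICED_MARK = "\uFF9E"  # input character named in the commit message
FULLWIDTH_VOICED_MARK = "\u309B"  # what it becomes in hiragana/katakana output
DOUBLE_QUOTE = "\u0022"           # what it becomes in romanized output


def describe(ch):
    """Return a 'U+XXXX NAME' label for a single character (hypothetical helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    for ch in (HALFWIDTH_VOICED_MARK, FULLWIDTH_VOICED_MARK, DOUBLE_QUOTE):
        print(describe(ch))
```

U+FF9E is a spacing character in its own right, so it can appear in text without a preceding kana to attach to, which is the case this commit handles.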
Showing 4 changed files with 18 additions and 2 deletions.
3 changes: 2 additions & 1 deletion src/data/halfkana.utf8
@@ -91,4 +91,5 @@
 「 「
 」 」
 。 。
-、 、
+、 、
+\u309B \uFF9E
1 change: 1 addition & 0 deletions src/data/hepburnhira.utf8
@@ -248,3 +248,4 @@ wi \U0001b150
 we \U0001b151
 wo \U0001b152
 ;;
+" ゛
8 changes: 7 additions & 1 deletion src/pykakasi/scripts.py
@@ -152,7 +152,13 @@ def convert_h(self, text):
                 if length > 0:
                     max_len += length
                     x += length
-                    Hstr = Hstr + chr(ord(kstr) - self._diff)
+                    if ord(kstr) == 0x309B:
+                        Hstr = Hstr + kstr
+                    else:
+                        Hstr = Hstr + chr(ord(kstr) - self._diff)
+                else:
+                    max_len += 1
+                    x += 1  # skip unknown character (issue #115)
             else:  # pragma: no cover
                 break
         return (Hstr, max_len)
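
The change above can be illustrated with a small standalone sketch. It is not the pykakasi implementation: the table HALF_TO_FULL, the helper lookup, and the function half_kana_to_hiragana are hypothetical, and the offset 0x60 is an assumption about what self._diff holds (the distance between the katakana and hiragana blocks). The sketch shows the two points of the fix: the scan position always advances, even when nothing matches, and U+309B is passed through instead of being shifted by the offset.

```python
# Assumption: self._diff is the katakana-to-hiragana code point offset.
KATA_HIRA_OFFSET = ord("\u30A1") - ord("\u3041")  # 0x60

# Hypothetical table: half-width input -> full-width katakana (or mark).
HALF_TO_FULL = {
    "\uFF76": "\u30AB",        # half-width KA -> full-width KA
    "\uFF76\uFF9E": "\u30AC",  # half-width KA + voiced mark -> full-width GA
    "\uFF6F": "\u30C3",        # half-width small TU -> full-width small TU
    "\uFF9E": "\u309B",        # standalone voiced mark -> full-width mark
}
MAX_KEY_LEN = max(len(key) for key in HALF_TO_FULL)


def lookup(text, pos):
    """Return (converted, consumed) for the longest match at pos, or ("", 0)."""
    for size in range(MAX_KEY_LEN, 0, -1):
        chunk = text[pos:pos + size]
        if len(chunk) == size and chunk in HALF_TO_FULL:
            return HALF_TO_FULL[chunk], size
    return "", 0


def half_kana_to_hiragana(text):
    """Convert half-width kana to hiragana, skipping anything unknown."""
    out = []
    pos = 0
    while pos < len(text):
        full, length = lookup(text, pos)
        if length > 0:
            pos += length
            if ord(full) == 0x309B:
                # The voiced sound mark is not a katakana letter, so shifting
                # it by the offset would produce an unrelated character.
                out.append(full)
            else:
                out.append(chr(ord(full) - KATA_HIRA_OFFSET))
        else:
            # Always advance; without this step the loop never terminates
            # on input that has no table entry.
            pos += 1
    return "".join(out)
```

For example, half_kana_to_hiragana("\uFF9E\uFF6F") keeps the leading mark as U+309B and converts the small TU to its hiragana form, while a character with no table entry is dropped rather than stalling the scan.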
8 changes: 8 additions & 0 deletions tests/test_pykakasi_issues.py
@@ -139,3 +139,11 @@ def test_issue114():
     assert result[0]['hepburn'] == 'omotta'
     assert result[2]['hepburn'] == 'itta'
     assert result[4]['hepburn'] == 'itta'
+
+
+def test_issue115():
+    kks = pykakasi.kakasi()
+    result = kks.convert('ﾞっ、')  # \uFF9E
+    assert result[0]['hira'] == '\u309Bっ、'
+    assert result[0]['kana'] == '\uFF9Eッ、'
+    assert result[0]['hepburn'] == '"tsu,'
