Commit

Clean up test contents
polm committed Dec 20, 2024
1 parent 0550ebd commit 61fdccc
Showing 1 changed file with 2 additions and 5 deletions.
7 changes: 2 additions & 5 deletions cutlet/test/test_basic.py
@@ -56,7 +56,6 @@
("私はテストです", "Watakushi wa test desu"), # issue #4, 私 -> 代名詞
("《月》", "(gatsu)"), # issue #7, unfamiliar punctuation
("2 【電子版特典付】", "2 [denshi ban tokutentsuke]"), # issue #7
# This looks weird but MeCab tokenizes at alpha-num barriers
("cutlet23", "Cutlet23"),
# Test some kana unks - issue #8
("アマガミ Sincerely Your S シンシアリーユアーズ", "Amagami Sincerely Your S shinshiariiyuaazu"),
@@ -89,14 +88,12 @@
# don't add spaces around apostrophe if it wasn't there
("McDonald's", "McDonald's"),
("Text McDonald's text", "Text McDonald's text"),
# Following are quote weirdness. Not good but hard to fix.
# An issue is that ," or .' is a single token.
("It's 'delicious.'", "It's 'delicious.'"),
('"Hello," he said.', '"Hello," he said.'),
# this is a very strange typo
("アトランテッィク", "Atoranteku"),
-# odoriji. Note at this point these rarely work properly, they mainly
-# don't blow up.
+# odoriji. Note at this point these rarely work properly, these mainly test
+# that they don't blow up.
("くゞる", "Kuguru"), # note this is actually in unidic-lite
("くヽる", "Ku ru"),
("今度クヾペへ行こう", "Kondo kugupe e ikou"), # made up word
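
The entries in this diff are (input, expected romaji) pairs. As a minimal sketch of how such a table can be checked, the snippet below runs a romanizer over a list of pairs and collects mismatches. The names `find_failures`, `identity_romanize`, and `PASSTHROUGH_CASES` are hypothetical and do not come from the repository; the stand-in romanizer simply returns its input so the sketch runs without MeCab installed, which is why only the ASCII passthrough cases from the diff are used. With the real library, one would presumably pass something like `cutlet.Cutlet().romaji` instead.

```python
from typing import Callable, Iterable, List, Tuple

# (input, expected) pairs copied from the diff above. Only ASCII passthrough
# cases are included so this sketch does not need MeCab or cutlet.
PASSTHROUGH_CASES: List[Tuple[str, str]] = [
    ("McDonald's", "McDonald's"),
    ("Text McDonald's text", "Text McDonald's text"),
    ("It's 'delicious.'", "It's 'delicious.'"),
    ('"Hello," he said.', '"Hello," he said.'),
]


def find_failures(
    romanize: Callable[[str], str],
    cases: Iterable[Tuple[str, str]],
) -> List[Tuple[str, str, str]]:
    """Return (input, expected, actual) for every pair that does not match."""
    failures = []
    for text, expected in cases:
        actual = romanize(text)
        if actual != expected:
            failures.append((text, expected, actual))
    return failures


def identity_romanize(text: str) -> str:
    # Hypothetical stand-in, NOT cutlet's API: returns the input unchanged.
    return text


if __name__ == "__main__":
    # An empty list means every pair matched.
    print(find_failures(identity_romanize, PASSTHROUGH_CASES))
```

Collecting all mismatches instead of stopping at the first one makes it easier to see which pairs a tokenizer change affects.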