This repository has been archived by the owner on Aug 23, 2022. It is now read-only.

Transliteration #87

Open
Klaranth opened this issue Jun 12, 2021 · 2 comments
Labels
enhancement New feature or request

Comments

@Klaranth

@S-S-X
For pandorabot: https://discord.com/channels/513329453741637637/668090931861258250/729456054001467443

@BuckarooBanzay
I tried the Node.js variant (https://www.npmjs.com/package/transliteration), but it does not handle some UTF-8 characters:
console.log(transliteration.transliterate('𝒟𝒶𝓃𝒹𝑒')); just prints out "".
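For stylized letters like the ones in this example, a possible pre-pass is Unicode compatibility normalization (NFKC), which is built into JavaScript as String.prototype.normalize. The sketch below is not part of the transliteration package; foldCompat is a hypothetical helper name, and the code point escapes are an assumed reading of the characters in the example string (mathematical script D, a, n, d and mathematical italic e).

```javascript
// Hypothetical helper: fold compatibility characters (for example
// mathematical script letters) to their plain equivalents.
function foldCompat(text) {
  return text.normalize('NFKC');
}

// Assumed code points for the example string, written as escapes
// so the snippet does not depend on file encoding.
const fancy = '\u{1D49F}\u{1D4B6}\u{1D4C3}\u{1D4B9}\u{1D452}';
const folded = foldCompat(fancy);

console.log(folded); // prints "Dande"
```

NFKC only covers characters that have a compatibility mapping, so it would complement a transliteration table rather than replace one.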

@S-S-X
Well, I tried that Lua transliterator out of the box: print(t:transliteration_get('𝒟𝒶𝓃𝒹𝑒')), and that printed me??me??me??me??me?? to the console.
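One possible reason for getting several output characters per input letter is that a library is processing bytes or code units rather than whole code points. This is an assumption and was not verified against that Lua library. The sketch below only shows how differently one such letter is counted depending on the unit used.

```javascript
// U+1D49F MATHEMATICAL SCRIPT CAPITAL D, written as an escape.
const scriptD = '\u{1D49F}';

// Number of UTF-16 code units (what String.length reports).
const utf16Units = scriptD.length;
// Number of Unicode code points (string iteration is code point aware).
const codePoints = [...scriptD].length;
// Number of bytes when encoded as UTF-8.
const utf8Bytes = new TextEncoder().encode(scriptD).length;

console.log({ utf16Units, codePoints, utf8Bytes });
// prints { utf16Units: 2, codePoints: 1, utf8Bytes: 4 }
```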

The character lists seem to be specialized and pretty short for most transliteration projects around the web, but probably someone has combined that stuff already... maybe.. just can't find it.. or it does not exist.

It seems that at least Google knows how to transliterate those.

Somewhat related:
mt-mods/beerchat#38
#39
#16

BuckarooBanzay added the enhancement label Jun 14, 2021
@BuckarooBanzay
Owner

This looks promising too: https://github.com/kshetline/unidecode-plus

> unidecode('Café 北京, 😀😁😇😈😱', { smartSpacing: true })
'Cafe Bei Jing, :-) :-D O:-) >:-) =:-O'

@S-S-X

S-S-X commented Oct 12, 2021

I looked around a bit, and that one indeed seems like an attempt to answer my earlier question:

someone has combined that stuff already... maybe.. just cant find it.. or it does not exist

It is nowhere near complete and a lot will be missing, but its approach to transliteration seems to be the best fit for our use case. From the README.md of unidecode-plus:

Some of the transliterations go for matching the shape of characters rather than their pronunciation
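To illustrate what shape-based matching means, here is a minimal sketch with a tiny look-alike table. It is not taken from unidecode-plus; SHAPE_MAP and foldByShape are hypothetical names, and a real table would be far larger.

```javascript
// Hypothetical look-alike table: maps characters to the ASCII letter
// they resemble, not the one they sound like.
const SHAPE_MAP = new Map([
  ['\u0420', 'P'], // CYRILLIC CAPITAL LETTER ER: sounds like "r", looks like P
  ['\u041D', 'H'], // CYRILLIC CAPITAL LETTER EN: sounds like "n", looks like H
  ['\u0430', 'a'], // CYRILLIC SMALL LETTER A
  ['\u0443', 'y'], // CYRILLIC SMALL LETTER U: sounds like "u", looks like y
]);

// Replace each mapped character; leave everything else unchanged.
function foldByShape(text) {
  let out = '';
  for (const ch of text) { // for...of iterates by code point
    out += SHAPE_MAP.has(ch) ? SHAPE_MAP.get(ch) : ch;
  }
  return out;
}

const shaped = foldByShape('\u0420\u0430\u0443'); // Cyrillic "Рау"
console.log(shaped); // prints "Pay"
```

A pronunciation-based transliteration would render the same three Cyrillic letters as "Rau", which is why the two approaches give different results for the same input.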

Also, the code is under the MIT license but the data is under a different license. I'm not familiar with the Perl license:

Note that all the files named 'x??.js' in data are originally derived directly from equivalent Perl files, distributed under the Perl license, not the BSD or MIT licenses.
