This repository has been archived by the owner on Jun 15, 2024. It is now read-only.

Easy examples yield funny results #69

Open
barney-bv opened this issue May 24, 2022 · 0 comments

Code to reproduce:

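import gcld3

# sample_text was undefined in the snippet; set it to any of the samples listed below
sample_text = "tus ojos me hace sentir"

detector = gcld3.NNetLanguageIdentifier(min_num_bytes=0, max_num_bytes=10000)
results = detector.FindTopNMostFreqLangs(text=sample_text, num_langs=2)
print(sample_text)
for result in results:
    # Each result carries the language code, a reliability flag, the model
    # probability, and the proportion of the text assigned to that language
    print(result.language, result.is_reliable, result.probability, result.proportion)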

Weird results:
tus ojos me hace sentir
lt True 0.786892831325531 1.0 # 🤖😬🤣
und False 0.0 0.0

sin red y voy a mil
af True 0.8103252649307251 1.0 # y is not in Afrikaans
und False 0.0 0.0

yo te veo pero tu no ves
ja-Latn True 0.9469742178916931 1.0 # Japanese, really? These are the most basic Spanish words
und False 0.0 0.0

aunque no me veas, mirame
de True 0.9972571730613708 1.0 # no and me are very simple words that are not German
und False 0.0 0.0

esta al reves
eo True 0.7365820407867432 1.0 # in Esperanto there's no word ending with -es
und False 0.0 0.0

aunque no veas
de True 0.9875902533531189 1.0 # no is a very simple word that is not German
und False 0.0 0.0
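
Possible workaround (untested against the model, so only a sketch): all of the samples above are very short, so one thing to try is joining related lines into a single text and running the detector once on the whole. The helper name `detect_joined` and the `min_bytes=50` threshold are my own assumptions, not part of gcld3; the only gcld3 calls used are `NNetLanguageIdentifier` and `FindLanguage`. I am not claiming this produces the right answer for these lines.

```python
from typing import Callable, Iterable, Optional


def detect_joined(lines: Iterable[str],
                  find_language: Callable[[str], object],
                  min_bytes: int = 50) -> Optional[object]:
    """Join short lines into one text and run the detector once on it.

    Returns None when even the joined text is shorter than min_bytes.
    min_bytes is a hypothetical cut-off, not a value recommended by gcld3.
    """
    joined = " ".join(line.strip() for line in lines if line.strip())
    if len(joined.encode("utf-8")) < min_bytes:
        return None
    return find_language(joined)


# The six samples from this issue
samples = [
    "tus ojos me hace sentir",
    "sin red y voy a mil",
    "yo te veo pero tu no ves",
    "aunque no me veas, mirame",
    "esta al reves",
    "aunque no veas",
]

try:
    import gcld3
except ImportError:
    gcld3 = None  # lets the helper be exercised without the package installed

if gcld3 is not None:
    detector = gcld3.NNetLanguageIdentifier(min_num_bytes=0, max_num_bytes=10000)
    result = detect_joined(samples, lambda text: detector.FindLanguage(text=text))
    if result is not None:
        print(result.language, result.is_reliable, result.probability)
```

Passing the detector in as a callable keeps the helper independent of gcld3, so the same guard can be reused with another detector for comparison.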
