Benchmark #1

Open
ZJaume opened this issue Apr 24, 2024 · 2 comments

Owner

ZJaume commented Apr 24, 2024

Running on 5,000 random sentences from OpenLID:

| method | time (s) |
| --- | --- |
| fasttext lid201 | 0.89 |
| HeLI OTS | 9.65 |
| heli-otr | 7.83 |
| + Lang Enum | 3.98 |
| + Fnv hash model | 3.22 |
| + Fnv hash identifier | 2.84 |
| + Patricia tree | 7.92 |
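
For readers wondering what the "+ Lang Enum" row could look like in practice, here is a minimal sketch, not the actual heli-otr code. It assumes language labels are replaced by a small `Copy` enum so that per-language scores live in a fixed array instead of a map keyed by strings. The `Lang` variants, the `Scores` type, and the "highest score wins" rule are all hypothetical and only serve the illustration.

```rust
/// Hypothetical language set; a real model would have many more variants.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum Lang {
    Cat = 0,
    Eng = 1,
    Spa = 2,
}

impl Lang {
    /// Number of variants, used to size the score array.
    pub const COUNT: usize = 3;
    /// All variants, in discriminant order.
    pub const ALL: [Lang; Lang::COUNT] = [Lang::Cat, Lang::Eng, Lang::Spa];

    /// Dense index of this language (its discriminant).
    pub fn index(self) -> usize {
        self as usize
    }
}

/// Per-language accumulator: one slot per variant, no hashing and no
/// string comparison on the hot path.
pub struct Scores([f32; Lang::COUNT]);

impl Scores {
    pub fn new() -> Self {
        Scores([0.0; Lang::COUNT])
    }

    /// Add `value` to the running score of `lang`.
    pub fn add(&mut self, lang: Lang, value: f32) {
        self.0[lang.index()] += value;
    }

    /// Language with the highest score (assumption: higher is better;
    /// ties resolve to the lowest index).
    pub fn best(&self) -> Lang {
        let mut best = 0;
        for i in 1..Lang::COUNT {
            if self.0[i] > self.0[best] {
                best = i;
            }
        }
        Lang::ALL[best]
    }
}
```

The point of the sketch is that `lang.index()` is a plain integer cast, so updating a score is an array write rather than a map lookup.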

Hashing functions comparison:

| method | time (s) |
| --- | --- |
| fnv | 2.84 |
| seahash | 3.46 |
| highway march=native | 4.00 |
| murmur2 | 3.40 |
| murmur3 | 4.15 |
| xxhash | 3.66 |
| ahash* | 2.73 |
| wyhash | 2.75 |
| wyhash2 | 2.71 |

\* Output is not stable across different computers.
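
As a reference point for the `fnv` row, this is a minimal sketch of 64-bit FNV-1a using only the standard library. It is not the crate used in the benchmark; the names `fnv1a_64`, `FnvHasher`, and `FnvHashMap` are chosen here for illustration. The constants are the standard 64-bit FNV offset basis and prime, so the output depends only on the input bytes and is the same on every machine.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Standard 64-bit FNV offset basis and prime.
const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// One-shot FNV-1a over a byte slice: XOR each byte in, then multiply.
pub fn fnv1a_64(bytes: &[u8]) -> u64 {
    bytes.iter().fold(FNV_OFFSET_BASIS, |hash, &b| {
        (hash ^ u64::from(b)).wrapping_mul(FNV_PRIME)
    })
}

/// Streaming variant so the hash can back a `HashMap`.
pub struct FnvHasher(u64);

impl Default for FnvHasher {
    fn default() -> Self {
        FnvHasher(FNV_OFFSET_BASIS)
    }
}

impl Hasher for FnvHasher {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 = (self.0 ^ u64::from(b)).wrapping_mul(FNV_PRIME);
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// `HashMap` that hashes keys with FNV-1a instead of the default SipHash.
pub type FnvHashMap<K, V> = HashMap<K, V, BuildHasherDefault<FnvHasher>>;
```

A map for n-gram lookups could then be created with `FnvHashMap::<String, f32>::default()`. FNV does very little work per byte, which tends to suit short keys such as n-grams.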

ZJaume changed the title from "Bernchmark" to "Benchmark" on Apr 29, 2024
Owner Author

ZJaume commented Jun 25, 2024

Model loading time:

| method | time (s) |
| --- | --- |
| fasttext lid193 | 0.53 |
| heli OTS | 7.2 |
| heli-otr bincode | 4.3 |
| heli-otr rkyv | 2.0 |
| heli-otr bitcode | 0.92 |
| + separated ngram files | 0.67 |

Owner Author

ZJaume commented Jun 25, 2024

Now running with 100k sentences, since 5k seems to be too few.

| method | time (s) |
| --- | --- |
| CLD2 | 1.12 |
| HeLI-OTS | 60.37 |
| lingua all high preloaded | 56.29 |
| lingua all low preloaded | 23.34 |
| fasttext lid193 | 8.44 |
| heli-otr wyhash + static scorers | 5.28 |
| + bitcode | 4.72 |
| + vec<lang, prob> | 2.40 |
| + early char count | 2.33 |
| + rayon 32thread | 0.90 |
| + score_lang vectorized | 2.09 |
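
The "+ rayon 32thread" row spreads sentences across worker threads. The sketch below shows the same idea with `std::thread::scope` from the standard library instead of rayon, so it stays self-contained. It is an illustration only: `identify` is a hypothetical stand-in that counts whitespace-separated tokens, not the real identifier, and `identify_parallel` is a name invented here.

```rust
use std::thread;

/// Hypothetical stand-in for per-sentence identification work:
/// it just counts whitespace-separated tokens.
fn identify(sentence: &str) -> usize {
    sentence.split_whitespace().count()
}

/// Split `sentences` into roughly equal chunks, process each chunk on its
/// own scoped thread, and return the results in the original order.
pub fn identify_parallel(sentences: &[String], threads: usize) -> Vec<usize> {
    let threads = threads.max(1);
    // Ceiling division; at least 1 because `chunks(0)` would panic.
    let chunk_size = ((sentences.len() + threads - 1) / threads).max(1);

    thread::scope(|scope| {
        // Spawn every worker first so the chunks run concurrently.
        let handles: Vec<_> = sentences
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk.iter().map(|s| identify(s)).collect::<Vec<usize>>()
                })
            })
            .collect();

        // Join in spawn order, which preserves the input order.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Each sentence is scored independently, so no locking is needed, and the workers only share read-only data. With rayon the chunking and joining would be handled by a parallel iterator instead of being written by hand.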
