setting automatically the location of the languages #16

GoogleCodeExporter opened this issue Aug 21, 2015 · 1 comment

Comments

@GoogleCodeExporter

Features:
1. Use the default language if it exists.
2. Offer to download it if it does not exist.
3. There must be a menu for choosing whether to download.
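
A minimal sketch of the decision in points 1 and 2, in Python. The function name `resolve_language` and the returned action strings are hypothetical and only illustrate the flow; they are not taken from Lector's code.

```python
def resolve_language(default_code, installed_codes):
    """Decide what to do with the default language.

    Returns ("use", code) when the default language is installed,
    otherwise ("offer_download", code) so the UI can ask the user.
    """
    if default_code in installed_codes:
        return ("use", default_code)
    return ("offer_download", default_code)
```

The caller would show the download menu from point 3 only when the first element of the result is `"offer_download"`.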

Original issue reported on code.google.com by [email protected] on 21 Nov 2008 at 12:55

@GoogleCodeExporter

Moreover, the available tesseract languages should be autodetected. On startup,
Lector will check for the required files and show all installed languages in the
left panel switch.

These files are stored in the /usr/share/tesseract/tessdata/ directory, 8 for
each language (???.DangAmbigs  ???.inttemp  ???.pffmtable  ???.user-words
???.freq-dawg  ???.normproto  ???.unicharset  ???.word-dawg), where ??? is the
language code from [1]. Also, there were requests for detection of digits 0-9 only.
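
The startup check described above could be sketched roughly as follows. This is only an illustration: the function name `installed_languages` is hypothetical, and the rule that a language counts as installed only when all eight files are present is an assumption drawn from this comment, not from Lector's code.

```python
import os

# The eight per-language file suffixes listed in the comment above.
REQUIRED_SUFFIXES = frozenset([
    "DangAmbigs", "inttemp", "pffmtable", "user-words",
    "freq-dawg", "normproto", "unicharset", "word-dawg",
])


def installed_languages(tessdata_dir="/usr/share/tesseract/tessdata/"):
    """Return the sorted language codes that have all eight required files."""
    try:
        names = os.listdir(tessdata_dir)
    except OSError:
        # Directory missing or unreadable: report no languages.
        return []

    found = {}
    for name in names:
        # Split "eng.inttemp" into code "eng" and suffix "inttemp".
        code, sep, suffix = name.partition(".")
        if sep and suffix in REQUIRED_SUFFIXES:
            found.setdefault(code, set()).add(suffix)

    # Assumption: a language is installed only if no file is missing.
    return sorted(code for code, suffixes in found.items()
                  if suffixes >= REQUIRED_SUFFIXES)
```

The returned list could then be used to fill the language switch in the left panel.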

I include a file extracted from [1], containing the languages in the format
  cze     Czech      Čeština
  deu     German     Deutsch
and two additional files containing the code along with only the original or
only the English name.
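
A file in the three-column format shown above could be read with something like the following Python sketch. The attachment itself is not reproduced here, so the separator rule (tabs or runs of two or more spaces between columns) is an assumption based on the two sample lines, and `parse_language_table` is a hypothetical name.

```python
import re


def parse_language_table(text):
    """Map each language code to an (English name, native name) pair."""
    table = {}
    for line in text.splitlines():
        # Assumed separator: tabs or runs of two or more whitespace characters,
        # so that names containing a single space stay in one column.
        fields = re.split(r"\t+|\s{2,}", line.strip())
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        code, english, native = fields
        table[code] = (english, native)
    return table
```

With such a table, the language switch could show a readable name instead of the bare code.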
____
[1]: http://en.wikipedia.org/wiki/List_of_ISO_639-2_codes

Original comment by [email protected] on 10 Feb 2009 at 9:25

Attachments:
