-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Syriac to Reader #6
base: master
Are you sure you want to change the base?
Conversation
I would have liked to make each text configurable for Hebrew vs Syriac, to allow having texts of both languages in the same PDF, maybe a future PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much! 😊 This is something I never thought about, but I think it will be useful for many people.
I have mostly some requests to keep the code as I would expect it to be, so that it is easier to maintain for me. There are 23 comments, but many are very small and the same. When this is done, I will refactor the tex templates. They contain some duplicate code now, which can be reduced by setting the language as a macro.
Unless you prefer I don't, I can add your name as a contributor to the readme and index.html.
If you ever want to add the option to add passages in different languages, we should first refactor the UI. Please sync with me first then, as I already have some ideas for this.
@@ -1,7 +1,6 @@ | |||
*.aux | |||
*.log | |||
*.tex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be ignored, because the program generates these, and files like pre.tex
should be excluded with a !
line immediately below.
apt-get upgrade -qq &&\ | ||
apt-get install -qq $INSTALL_PACKAGES $PACKAGES &&\ | ||
apt-get install -qq build-essential curl fontconfig perl python3-dev python3-setuptools subversion libffi-dev libfontconfig1 python3 tar &&\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason all this is done is that the final image is smaller. The apt-get install
and apt-get remove && rm -rf /var/lib/apt/lists/*
should be in the same RUN
command, so that the packages that are only needed temporarily are not stored in the Docker cache.
You are right that this makes development a bit slow; I usually change the Dockerfile locally but don't commit the changes. Nowadays it is also possible to create a multi-stage build, so you could have a look if you can get that working. If you don't want to spend time on it, please just undo the changes here.
COPY NotoSansSyriac-Regular.ttf /usr/local/share/fonts/syriac_2.ttf | ||
COPY NotoSerif-Regular.ttf /usr/local/share/fonts/greek.ttf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of things:
- If possible I would prefer to download the fonts (as is done with the Hebrew one); this makes sure that we always have the latest version.
- Please keep original font names (i.e.
NotoSansSyriac-Regular.ttf
instead ofsyriac_2.ttf
). - Does Noto Serif Regular have all the required glyphs? If I enter the LXX of Gen 1:1 on https://fonts.google.com/noto/specimen/Noto+Serif/tester I see empty boxes for the first letter of the first and third word. Why not use SBL Greek, as we are using SBL Hebrew already?
- At the bottom of index.html there is a note about the fonts that are being used; please add these.
from minitf import gather_context | ||
|
||
VERSE_NODES = dict() | ||
|
||
def gather_chapter(api, book, chap): | ||
def gather_chapter(api, book, chap, lang): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make lang
the second argument, so that the order matches the one in the VERSE_NODES
hierarchy.
@@ -32,40 +32,70 @@ def gather_chapter(api, book, chap): | |||
node = next_verse[0] | |||
return nodes | |||
|
|||
def gather_book(api, book): | |||
def gather_book(api, book, lang): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please swap book
and lang
.
tex.write('\n\n' + templates['pretext']) | ||
if lang == "hebrew": | ||
tex.write('\n\n' + templates['pretext']) | ||
elif lang == "syriac": | ||
tex.write('\n\n' + templates['pretext_syr']) | ||
elif lang == "greek": | ||
tex.write('\n\n' + templates['pretext_grk']) | ||
tex.write('\n'.join(text)) | ||
tex.write('\n' + templates['posttext']) | ||
if lang == "hebrew": | ||
tex.write('\n' + templates['posttext']) | ||
elif lang == "syriac": | ||
tex.write('\n' + templates['posttext_syr']) | ||
elif lang == "greek": | ||
tex.write('\n' + templates['posttext_grk']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you rename the files to pretext_LANGUAGE.tex
, you can simply use templates['pretext_' + lang]
here.
try: | ||
api = load_data(passage) | ||
text, words = get_passage_and_words(passage, api) | ||
except: | ||
raise ValueError('Could not find reference "{}"'.format(passage_text)) | ||
api = load_data(passage, lang) | ||
text, words = get_passage_and_words(passage, api, lang) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change?
@@ -95,17 +93,17 @@ def fix_gloss(gloss): | |||
return 'I' | |||
return re.sub(r'<(.*)>', r'\\textit{\1}', gloss) | |||
|
|||
def get_passage_and_words(passage, api, separate_chapters=True, verse_nos=True): | |||
def get_passage_and_words(passage, api, lang, separate_chapters=True, verse_nos=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you make the order lang, api, passage
? (I see now that the old order is incorrect, too.)
generate(passages, | ||
include_voca is not None and include_voca, | ||
combine_voca is not None and combine_voca, | ||
clearpage_before_voca is not None and clearpage_before_voca, | ||
text_size is not None and int(text_size[0]) > 0, | ||
text_size is not None and int(text_size[0]) > 1, | ||
tex, None if fmt == 'tex' else pdf, | ||
TEMPLATES, quiet=True) | ||
TEMPLATES, lang, quiet=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use lang[0]
here (it is an artifact of the server interface that we get a list with one element, and generate
should be agnostic to the interface).
@@ -300,7 +329,7 @@ def main(): | |||
templates['postvoca'] = args.post_voca_tex.read() | |||
generate(args.passages, | |||
args.include_voca, args.combine_voca, args.clearpage_before_voca, | |||
args.large_text, False, args.tex, args.pdf, templates) | |||
args.large_text, False, args.tex, args.pdf, args.lang, templates) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work? The p_misc
parser did not get a lang
option, so I would expect args.lang
to always be None
. I would expect to be able to do something like (cf the readme):
./hebrewreader.py --lang greek --pdf genesis.pdf Genesis
This PR adds Syriac Peshitta to the Reader Generator. Unfortunately, the Syriac TF data doesn't contain the gloss attribute, so there's not yet a way to add a vocabulary.
Also, I've modified the Dockerfile a bit to make rebuilding the container faster by separating the setup stuff from the file data so that when file data is changed, the container can use cached layers for setup and only load in the changed files.