-
-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ignore a note that occurs right after \id marker #203
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 4 files at r1, all commit messages.
Reviewable status: 3 of 4 files reviewed, 1 unresolved discussion (waiting on @ddaspit)
src/SIL.Machine/Corpora/UsfmTokenizer.cs
line 393 at r1 (raw file):
if ( usfm[usfm.Length - 1] == ' ' && ((prevToken != null && prevToken.ToUsfm().Trim() != "") || !tokensHaveWhitespace)
What's the point of this logic here? I'm not quite getting it.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #203 +/- ##
==========================================
+ Coverage 67.31% 67.32% +0.01%
==========================================
Files 441 441
Lines 35001 35021 +20
Branches 4695 4700 +5
==========================================
+ Hits 23560 23579 +19
Misses 10352 10352
- Partials 1089 1090 +1 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 4 files at r1.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @ddaspit)
src/SIL.Machine/Corpora/UsfmTokenizer.cs
line 393 at r1 (raw file):
Previously, Enkidu93 (Eli C. Lowry) wrote…
What's the point of this logic here? I'm not quite getting it.
Sorry, I haven't approved this yet because I was hoping to get a better handle on this logic here. Could you explain what's going on, @ddaspit ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @Enkidu93)
src/SIL.Machine/Corpora/UsfmTokenizer.cs
line 393 at r1 (raw file):
Previously, Enkidu93 (Eli C. Lowry) wrote…
Sorry, I haven't approved this yet because I was hoping to get a better handle on this logic here. Could you explain what's going on, @ddaspit ?
This checks to see if we need to strip off the space at the end before a newline. There are two cases that we have to handle:
- The tokens contain whitespace. This occurs when we are trying to preserve the whitespace from the original USFM.
- The tokens do not contain whitespace. This occurs when we want to normalize the whitespace.
This logic is preserved from the original USFM parser code in Paratext.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @ddaspit)
src/SIL.Machine/Corpora/UsfmTokenizer.cs
line 393 at r1 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
This checks to see if we need to strip off the space at the end before a newline. There are two cases that we have to handle:
- The tokens contain whitespace. This occurs when we are trying to preserve the whitespace from the original USFM.
- The tokens do not contain whitespace. This occurs when we want to normalize the whitespace.
This logic is preserved from the original USFM parser code in Paratext.
Thank you!
This change is