Scope of \rem #130
The important bit about \rem being a paragraph marker is that it begins
after a newline. The function of \rem is to disable parsing until the next
newline.
The comment about it being a paragraph is to suggest it does not belong mid
sentence, and that there is no well-formed \rem* end-tag.
Your understanding seems correct to me,
1. \rem disables USFM parsing until the next end-of-line tag.
but with the further understanding that
1. \rem must begin after a newline/return
2. there is no such thing as a \rem ... \rem* character tag.
3. there is no multi-line \rem tag. If the comment spans multiple lines,
multiple \rem tags are required.
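The rules above can be illustrated with a minimal Python sketch. It is not taken from any existing USFM tool: the name `split_remarks` and the pattern `REM_LINE` are hypothetical, and the sketch assumes a remark is any line that starts with \rem followed by whitespace or end of line.

```python
import re

# Assumption: a remark starts at the beginning of a line with \rem followed
# by whitespace or end of line. Leading indentation is not handled here.
REM_LINE = re.compile(r"\\rem(?:[ \t]+|$)")

def split_remarks(usfm_text):
    """Return (remarks, parseable_lines) for a USFM string.

    Everything after \\rem up to the end of that line is kept verbatim as
    remark text, so markers inside it are never parsed. A \\rem that appears
    mid-line is left alone, since rule 1 requires it to follow a newline.
    """
    remarks, parseable = [], []
    for line in usfm_text.splitlines():
        match = REM_LINE.match(line)
        if match:
            remarks.append(line[match.end():])
        else:
            parseable.append(line)
    return remarks, parseable
```

Because each line is handled on its own, a comment that spans several lines needs one \rem per line, as rule 3 says.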
…On Mon, Oct 25, 2021 at 1:49 AM mhosken ***@***.***> wrote:
The specification document gives no indication of the range of coverage of
a \rem. The stylesheet specifies it as being a paragraph marker. But my
impression is that \rem covers everything up to the next newline.
If the specification is that it is a paragraph marker, then it is
impossible to put \rem around any paragraph marker and the content must be
a well formed paragraph (with appropriate closing markers where necessary).
If the specification is everything up to the next newline, then any marker
etc. can be remarked away including paragraph markers and the like. My
impression is that this is how most people think of it.
All clarification welcomed.
A bit of additional input: As noted, an attempt has been made to describe whitespace in USFM. In a well-formed USFM document all paragraph markers should be preceded by a newline -- but a newline does not indicate the end of the current paragraph or the start of a new paragraph. The whitespace notes indicate that USFM considers space (U+0020), tab (U+0009), and newline characters to be whitespace, and that multiple whitespace characters within the body text of a paragraph are not significant and should be normalized when trying to produce a 'well-formed' document (vs just valid USFM).
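As a rough illustration of that normalization, here is a minimal Python sketch. The function name `normalize_whitespace` is hypothetical, and the sketch makes a simplifying assumption that is not in the whitespace notes: every backslash at the start of a line is treated as a marker that should keep its own line.

```python
import re

def normalize_whitespace(usfm_text):
    """Collapse insignificant whitespace in a USFM string (sketch only)."""
    # Treat all line-ending styles as a plain newline.
    text = usfm_text.replace("\r\n", "\n").replace("\r", "\n")
    # Assumption: a newline that is not followed by a backslash is a stray
    # line ending inside body text, so it becomes a single space.
    text = re.sub(r"\n(?!\\)", " ", text)
    # Runs of spaces and tabs are not significant; collapse them to one space.
    text = re.sub(r"[ \t]+", " ", text)
    return text
```

Note that under this sketch a \rem line followed by a line with no leading marker would be joined to it, which is exactly the kind of interaction the scope question in this issue is about.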
So... to summarise:
(a) Is badly formed USFM, |
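My own 2 cents is that PTXprint's interpretation (that \rem kills everything until the new line) should be considered the correct one, and that the USFM documentation should be updated to clarify that.
If that's the case, then either \rem becomes a new, unique kind of marker, or it's a paragraph marker but with some significant caveats (like the fact that other paragraph markers can occur on the same line and don't have to start a new line in that case).
If the above is not true, then the result is that there is no way within USFM to have a comment in the text which has a meta-reference to a paragraph marker. E.g.:
\rem The following \p marker was changed to \m because...
I find these sorts of comments particularly helpful when creating Bible Modules.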
I agree with the sentiment to have clarification that \rem is terminated at
the first end of the actual paragraph, regardless of tags that might appear
within the remark.
However, I regularly scrub backslashes out of \rem lines and replace the
USFM tag with bracketed information: \p becomes [p] within the comment.
This is done in USFM, not on conversion. That is, any backslash within a
"(^\\rem [^\n]+)\\(\w+)" search is an error, and replaced with $1 [$2]. I
do this specifically to limit overprocessing the file on conversion. This
check occurs after Jeff's described "processing into well formed USFM
whitespace" which means stray line endings with no tag following them are
already replaced with a single space.
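The scrub described above might look like the following Python sketch. The pattern is the one quoted in the comment; the function name `scrub_rem_markers` is hypothetical, and the replacement is written as `\1[\2]` rather than `$1 [$2]` so that no extra space is added, which is a choice of this sketch rather than the commenter's exact rule.

```python
import re

# Pattern quoted in the comment, compiled with MULTILINE so that ^ matches
# at the start of every line, not only at the start of the file.
REM_MARKER = re.compile(r"(^\\rem [^\n]+)\\(\w+)", re.MULTILINE)

def scrub_rem_markers(usfm_text):
    """Rewrite \\marker as [marker] inside \\rem lines (sketch only)."""
    previous = None
    # [^\n]+ is greedy, so each pass rewrites only the last remaining marker
    # on a \rem line. Repeat until a pass changes nothing.
    while previous != usfm_text:
        previous = usfm_text
        usfm_text = REM_MARKER.sub(r"\1[\2]", usfm_text)
    return usfm_text
```

One property of the quoted pattern worth knowing: `[^\n]+` needs at least one character between `\rem ` and the backslash, so a marker placed immediately after `\rem ` is not matched.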
My USFM parser is extensible, meaning any backslash it finds following a
newline is treated like a paragraph tag, whether it has a style for it or
not. And any tag it finds midline that hasn't already been processed is
treated like a character style. This limits the entire USFM->XML filter to
< 200 replacements, and that's for all 800ish tags that the USFM manual(s)
imply.
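The position-based rule in the previous paragraph can be sketched in Python as follows. This is not the commenter's parser, which is not shown in the thread: `classify_markers` and `MARKER` are hypothetical names, and the sketch ignores the "hasn't already been processed" condition.

```python
import re

# A marker is a backslash followed by word characters, optionally closed
# with * (as in \nd*). Assumption: no stylesheet lookup is performed.
MARKER = re.compile(r"\\(\w+\*?)")

def classify_markers(usfm_text):
    """Label each marker by position alone (sketch only).

    A marker at the very start of a line is labelled "paragraph";
    any marker found later in the line is labelled "character".
    """
    results = []
    for line in usfm_text.split("\n"):
        for match in MARKER.finditer(line):
            kind = "paragraph" if match.start() == 0 else "character"
            results.append((match.group(1), kind))
    return results
```

Under this rule a \p inside a \rem line would be labelled as a character marker, which is why scrubbing backslashes out of remarks beforehand keeps such a parser from processing them.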
So "well-formed" PSFM in your case would look like
\rem The following [p] marker was changed to [m] because...
Which to me doesn't affect readability, but does ensure the tags don't
escape into print.
…On Wed, Aug 23, 2023 at 6:04 AM mnjames ***@***.***> wrote:
My own 2 cents is that PTXprint's interpretation (that \rem kills
everything until the new line) should be considered the correct one, and
that the USFM documentation should be updated to clarify that.
If that's the case, then either \rem becomes a new, unique kind of
marker, or it's a paragraph marker but with some significant caveats (like
the fact that other paragraph markers can occur on the same line and don't
have to start a new line in that case).
If the above is not true, then the result is that there is *no way*
within USFM to have a comment in the text which has a meta-reference to a
paragraph marker. E.g.:
\rem The following \p marker was changed to \m because...
I find these sorts of comments particularly helpful when creating Bible
Modules.
The specification document gives no indication of the range of coverage of a \rem. The stylesheet specifies it as being a paragraph marker. But my impression is that \rem covers everything up to the next newline.
If the specification is that it is a paragraph marker, then it is impossible to put \rem around any paragraph marker and the content must be a well formed paragraph (with appropriate closing markers where necessary). If the specification is everything up to the next newline, then any marker etc. can be remarked away including paragraph markers and the like. My impression is that this is how most people think of it.
PTXprint currently treats it as a paragraph marker with no consideration of the impact of a newline.
All clarification welcomed.
If we say that the \rem marker scope is a single line, then that raises the question of whether there are other markers that are similarly scoped (\toc#, \h). But that is a wider debate with each one being taken case by case.