Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add handling of double underscores in generated html #160

Merged
merged 11 commits into from
Sep 24, 2023

Conversation

benjub
Copy link
Collaborator

@benjub benjub commented Aug 20, 2023

Almost certainly buggy since I don't know the codebase nor the language. Also, probably inefficient since I copied code from a more complex case than just replacing __ with _. This is more like pseudocode than anything else, actually, but I wanted to get the ball rolling since the need for this feature was emphasized in metamath/set.mm#3389 (comment) by @avekens (i.e., we want to display command names like MINIMIZE_WITH and TRACE_BACK without triggering italics or subscripts in the generated html).

Help would be appreciated (it may be simpler to start another PR). Maybe from @digama0 or @wlammen or @tirix ?

@tirix
Copy link

tirix commented Aug 20, 2023

I've added some comments, but that's a lot of different changes.
I could not propose changes directly in the repository (maybe I don't have the rights for that).
What I came up with is:

      /* Double-underscore handling: double-underscores are "escaped
      underscores", so replace a double-underscore with a single
      underscore and do not modify italic or subscript. */
      if (cmt[pos1] == '_') {
            if (g_htmlFlag) {  /* HTML */
              let(&cmt, cat(left(cmt, pos1), /* Skip (delete) "_" */
                  right(cmt, pos1 + 2), NULL));
              let(&cmtMasked, cat(left(cmtMasked, pos1), /* Skip (delete) "_" */
                  right(cmtMasked, pos1 + 2), NULL));
              pos1 ++; /* Adjust for 1 extra char '_' */
            } else {  /* LaTeX */
              let(&cmt, cat(left(cmt, pos1 - 1),  /* Skip (delete) "_" */
                  "\\texttt{\\_}",
                  right(cmt, pos1 + 2), NULL));
              let(&cmtMasked, cat(left(cmtMasked, pos1 - 1),  /* Skip (delete) "_" */
                  "\\texttt{\\_}",
                  right(cmtMasked, pos1 + 2), NULL));
              pos1 = pos1 + 11; /* Adjust for 11 extra chars "\texttt{\_}" */
            }
        continue;
      } /* End of double-underscore handling */

I've tested it successfully with HTML mode, but it does not work for TEX as the whole thing seems to be skipped at line 1941 with a test on the g_htmlFlag global variable.

@digama0
Copy link
Member

digama0 commented Aug 20, 2023

needs a test

@benjub
Copy link
Collaborator Author

benjub commented Sep 2, 2023

Added Thierry's fix and tests.

@benjub
Copy link
Collaborator Author

benjub commented Sep 2, 2023

As for the tex: the if (g_htmlFlag) at Line 1970 of mmwtex.c is inside the if (g_htmlFlag != 0 && at Line 1941, and there semms to be no side effects affecting g_htmlFlag in between, so the else branch beginning Line 1976 is probably useless. (Same thing for Lines 2013/2024, and 2049/2057, which are not part of this PR.)

src/mmwtex.c Outdated
pos1 ++; /* Adjust for 1 extra char '_' */
} else { /* LaTeX */
let(&cmt, cat(left(cmt, pos1 - 1),
"\texttt{\_}",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My compiler did not like this: \t is a tab, \_ is an unknown escape.
You have to escape the backslashes, as in "\\texttt{\\_}"

src/mmwtex.c Outdated
Comment on lines 1972 to 1975
let(&cmt, cat(left(cmt, pos1),
"",
seg(cmt, pos1 + 1, pos1 + 2), /* Skip (delete) "_" */
"", right(cmt, pos1 + 2), NULL));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cat is a concatenation.
Here you concatenate:

  • a left part,
  • an empty string (?)
  • additional characters which include the second underscore and the character right after it,
  • another empty string (?)
  • the right part

That seems to be too much. We only need:

  • a left part, including one underscore,
  • the right part, after the second underscore.

src/mmwtex.c Outdated
/* Double-underscore handling: double-underscores are "escaped
underscores", so replace a double-underscore with a single
underscore and do not modify italic or subscript. */
if (cmt[pos1 + 1] == '_') {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you have to use cmt[pos1] here, without the +1.

@tirix
Copy link

tirix commented Sep 4, 2023

As for the tex: the if (g_htmlFlag) at Line 1970 of mmwtex.c is inside the if (g_htmlFlag != 0 && at Line 1941, and there semms to be no side effects affecting g_htmlFlag in between, so the else branch beginning Line 1976 is probably useless. (Same thing for Lines 2013/2024, and 2049/2057, which are not part of this PR.)

That one is probably best left aside for now. This is not directly related with this PR.
One would have to understand if annotations are handled somewhere else for LaTeX (and those cases don't need to be handled here) or not (and special care has to be taken for LaTeX in that case too).

@benjub
Copy link
Collaborator Author

benjub commented Sep 4, 2023

Reverted the last commit, as @tirix suggested. This looks ready to me. Better reviewed as a whole, since there was a commit revert. @digama0 ?

@benjub benjub requested review from tirix and digama0 September 4, 2023 21:09
Copy link

@tirix tirix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@benjub
Copy link
Collaborator Author

benjub commented Sep 9, 2023

@digama0 does it look good to you ?

By the way: if we keep tests/.gitignore, we should add *.produced to it and remove it from .gitignore, or else we only keep a single ./gitignore at the root and put everything there.

@benjub
Copy link
Collaborator Author

benjub commented Sep 23, 2023

@digama0 does it look good to you ?

@digama0
Copy link
Member

digama0 commented Sep 24, 2023

LGTM, I didn't check the replacement logic in detail but I'm happy to follow up with bugfixes if we find any issues later. This should be replicated in metamath-knife if it hasn't been done already.

@digama0 digama0 merged commit 8c6a5a5 into metamath:master Sep 24, 2023
2 checks passed
@benjub benjub deleted the underscore branch September 24, 2023 09:07
@benjub
Copy link
Collaborator Author

benjub commented Sep 24, 2023

This should be replicated in metamath-knife if it hasn't been done already.

Not to my knowledge... and I'm not that fluent in Rust. Maybe @tirix knows/can add it ?

@tirix
Copy link

tirix commented Sep 24, 2023

Maybe @tirix knows/can add it ?

The first part is in metamath/metamath-knife#127.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants