✍️ LaTeX/JATS improvements for Proofs #694

Merged
merged 3 commits into from
Oct 22, 2023

rowanc1 (Member) commented Oct 20, 2023

This parses algorithm environments in LaTeX and adds the serializers for JATS. There are also a number of LaTeX parsing improvements.

Comment on lines 40 to 45
/**
* Line is, e.g., a line in an algorithm and can be numbered as well as indented.
* Otherwise this works the same as a paragraph, ideally with tighter styling.
* The Line is used in Algorithms (e.g. when parsing from LaTeX)
*/
export type Line = Parent & { type: 'line'; indent?: number; enumerator?: string };
Member Author

@fwkoch this is probably the most controversial change in the PR. I have added a new node of type Line which can have indentation and line numbers. This is to support the algorithm parsing; falling back to a div here is totally fine in the current theme.

Collaborator

Cool, my only thoughts are:

  • The name Line conflicts with line in unist positions; maybe slightly confusing that it does not correspond to a "line" in the source. But that's probably ok? Not sure what else we'd call it, Step? Meh - something more specific like AlgorithmLine? (This would be similar to DefinitionTerm which can be used in places other than just definitions.)
  • Why not just add indent and enumerator as optional attributes on Paragraph? We overload most of the other default node types with extra attributes. That would make fall-back behaviour more straightforward (just ignore the attributes)... I dunno, maybe then it feels like we are going down a path of "you can enumerate everything" which feels... complicated (as opposed to having intentionality about what may be enumerated).

Overall, though, this is probably fine?

Member Author

I did start there, adding attributes to paragraph, and went back and forth. I wanted the JATS to have a "specific-use" identifier and the UI to have a special style, and this seemed special enough for its own type. A div is also probably a more appropriate fallback based on style than a paragraph in this case?

At some point we should do an assessment of these types and do a clean up. I think that we can collapse a few things like admonitionTitle and proof --> statement perhaps.

I changed this to algorithmLine based on your first point!
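
To make the thread above concrete, here is a minimal sketch of what an algorithmLine node and a div-style fallback could look like. The `AstNode` shape, the `toText` helper, the `renderLineAsDiv` function, the `line-number` class, and the 2em-per-level padding are all assumptions for illustration and are not taken from the PR.

```typescript
// Hypothetical, simplified node shapes; the real spec types carry more fields.
type AstNode = { type: string; value?: string; children?: AstNode[] };

type AlgorithmLine = AstNode & {
  type: 'algorithmLine';
  indent?: number; // nesting depth of the line inside the algorithm
  enumerator?: string; // line number, stored as a string
  children: AstNode[];
};

// Hypothetical helper: collect the plain text of a node and its children.
function toText(node: AstNode): string {
  if (node.value !== undefined) return node.value;
  return (node.children ?? []).map(toText).join('');
}

// Fallback rendering as a div: the optional line number comes first and the
// indent becomes left padding. No HTML escaping is done in this sketch.
function renderLineAsDiv(line: AlgorithmLine): string {
  const pad = (line.indent ?? 0) * 2; // 2em per indent level, an arbitrary choice
  const num = line.enumerator
    ? `<span class="line-number">${line.enumerator}</span> `
    : '';
  return `<div style="padding-left:${pad}em">${num}${toText(line)}</div>`;
}
```

A renderer that does not know about the node could ignore `indent` and `enumerator` entirely and still emit a plain div, which is the fallback behaviour discussed above.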

fwkoch (Collaborator) left a comment

I skimmed over a lot of the tex macro, etc, additions (looks like there is decent test coverage, but it doesn't quite get it all - not a big deal, tons of repetition...).

Overall, looks good, left my comments on Line but fine with me to go ahead here.

paragraphs.forEach((p, i) => {
const l = p as unknown as Line;
l.type = 'line';
l.enumerator = String(i + 1);
Collaborator

This enumeration is at a granular level, without the customization allowed for other enumerators. Again, probably fine but it means the enumerator attribute works in different ways.

Maybe, though, having all the normal enumerator customization here is also the long-term goal.

Member Author

I did start out with this being called "number", but also felt like we should use common terminology. Certainly a bit different than the other enumerators, with this one being much more granular.
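
As a rough illustration of the granular enumeration discussed here, the following sketch is a self-contained version of the numbering step in the excerpt above. The `LineNode` type and the `numberLines` function are hypothetical names, and the node type uses the renamed algorithmLine; the actual transform in the PR may differ.

```typescript
// Hypothetical minimal node shape for this sketch.
type LineNode = {
  type: string;
  children?: LineNode[];
  indent?: number;
  enumerator?: string;
};

// Convert paragraph nodes in place into algorithm lines and give each one a
// simple 1-based line number, stored as a string like other enumerators.
function numberLines(paragraphs: LineNode[]): LineNode[] {
  paragraphs.forEach((p, i) => {
    p.type = 'algorithmLine';
    p.enumerator = String(i + 1);
  });
  return paragraphs;
}
```

Unlike enumerators on figures or equations, nothing here is configurable: the number is just the position of the line within its algorithm.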

const label = texToText(labelNode);
const countWith = texToText(x) || undefined;
const countAfter = texToText(y) || undefined;
state.data.theorems[name] = { label, countWith, countAfter };
Collaborator

I feel like I'm missing where these theorems stored on data are actually consumed... 🤔 Maybe they just aren't yet...?

Member Author

Correct, they are not consumed yet. When we start enumerating theorems in a custom way, this count information will likely be used.
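
One way the stored definitions might eventually be consumed is sketched below. It assumes that `countWith` names another theorem-like environment whose counter is shared (similar to the optional "numbered like" argument of LaTeX's \newtheorem) and that `countAfter` names a parent counter such as a section. Both readings, along with the `TheoremDef` type and the `resolveCounter` function, are assumptions and not something the PR implements.

```typescript
// Hypothetical shape mirroring the object stored on state.data.theorems.
type TheoremDef = { label: string; countWith?: string; countAfter?: string };

// Follow countWith links to find the environment whose counter is shared.
// A set of visited names guards against accidental cycles.
function resolveCounter(theorems: Record<string, TheoremDef>, name: string): string {
  const seen = new Set<string>();
  let current = name;
  while (theorems[current]?.countWith && !seen.has(current)) {
    seen.add(current);
    current = theorems[current].countWith as string;
  }
  return current;
}
```

Under these assumptions, a lemma declared to count with theorem would resolve to the theorem counter, so both environments would draw from one shared sequence.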

rowanc1 (Member, Author) commented Oct 22, 2023

Thanks for the review @fwkoch. I am going to move ahead with this being a separate type, since it is different enough from a paragraph in the LaTeX and JATS implementations. I am not totally happy with it, but I think this is something we can back out of in the future when/if we get to a better solution.

rowanc1 merged commit 417efdc into main on Oct 22, 2023
3 checks passed
rowanc1 deleted the feat/tex-improvements branch on October 22, 2023 17:25