Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement @whitespace setter. #42

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from
Draft

Implement @whitespace setter. #42

wants to merge 3 commits into from

Conversation

kaj
Copy link
Owner

@kaj kaj commented Jan 25, 2019

As suggested in #29. A keyword @whitespace can be used to switch whitespace handling between the tree modes as-is, compact, and removed.

  • Implement the functionality. Set default mode to compact and update tests.
  • Decide on the exact syntax; should some keyword change? Should @ws be allowed as an alias for @whitespace? Should the removed alternative be called trim? Or some other word change?
  • Add documentation and more tests / examples.
  • Anything else?

@kaj
Copy link
Owner Author

kaj commented Jan 25, 2019

Currently, compact mode can lead to two consecutive space/newline characters, where one comes from the outside of a block and one from the inside. I don't really know how to fix that, but maybe it doesn't matter that much?

@kornelski , any thoughts on this or the naming / syntax questions?

@kornelski
Copy link
Contributor

I like the @ws shortcut.

trim is used to mean start/end stripping, not all of whitespace, so it doesn't fit.

as-is could be preserve, to borrow from XML (and then removed -> remove).

You wanted to be markup-agnostic. As such, compact may not be a safe default, as it'll mangle the <pre> blocks.

@kornelski
Copy link
Contributor

kornelski commented Jan 25, 2019

To support collapsing of whitespace around @ws(compact), you'd need to do it as a post-processing pass on the AST, not in the parser itself. You'd need to inform the collapsing function whether previous block ends with whitespace, or next one starts with whitespace.

@kaj
Copy link
Owner Author

kaj commented Jan 25, 2019

I like the @ws shortcut.

Yes, me to. So I'll add that.

trim is used to mean start/end stripping, not all of whitespace, so it doesn't fit.

True. Technically, it is "compact, then trim". The suggestion was mainly to supply that the alternative does not remove all whitespace, only whitespace that is considered irrelevant.

Which makes me think of one thing that is probably a bug: whitespace between words is noralized, not removed, so one other becomes one other, but whitespace between items is removed, so one @n other becomes e.g. one17other. This seems a bit strange to me, should we consider this behaviour correct? Or should whitespace around expressions be normalized while whitespace around blocks is removed?

as-is could be preserve, to borrow from XML (and then removed -> remove).

Yes, preserve sounds better than as-is, and in that case removed should be remove, unless we find another better word. And compact can remain, unless normalize is a better word for that behaviour?

You wanted to be markup-agnostic. As such, compact may not be a safe default, as it'll mangle the <pre> blocks.

Yes, but I don't think that is a problem. Regarding <pre> blocks, that is pretty much the reason to have the preserve alternative at all, and regarding different file formats with different requirements, the default mode could be different for different formats. compact is just the default for .html templates.

To support collapsing of whitespace around @ws(compact), you'd need to do it as a post-processing pass on the AST, not in the parser itself. You'd need to inform the collapsing function whether previous block ends with whitespace, or next one starts with whitespace.

Yes, that's true (or technically, it would be enough to do a pass for this one level up in the tree). I was just thinking about existing blocks (@if, etc) inside an @ws(compact), which would possible to do in the current pass. But yes, I guess leading (trailing) whitespace should be removed in a compact block that is preceded (followed) by whitespace.

@kornelski
Copy link
Contributor

@ws(compact) shouldn't remove whitespace around expressions — that would be a huge pain to work with. Even collapsing around an inline @if is problematic: foo @if x {1} else {2} bar.

It seems fine to remove /{\n +/ and / +\n}/, as this whitespace is for indentation, but whitespace before @ and after } may be important to keep.

@kaj
Copy link
Owner Author

kaj commented Jan 27, 2019

Sorry, I think I was a bit unclear above. This paragraph:

Which makes me think of one thing that is probably a bug: whitespace between words is noralized, not removed, so one other becomes one other, but whitespace between items is removed, so one @n other becomes e.g. one17other. This seems a bit strange to me, should we consider this behaviour correct? Or should whitespace around expressions be normalized while whitespace around blocks is removed?

... refers to the beviour in a @ws(remove) block, e.g. the difference in the result of these templates:

<p>@ws(remove) {
  one two three
}</p>

or

<p>@ws(remove) {
  one @n three
}</p>

The first would (currently) result in <p>one two three</p> and the latter in <p>onetwothree</p> (if the n variable contains "two"). So my question is if we should fix this by handling expressions different from blocks? Or should @ws(remove) { ... } actually remove all (non-escaped) whitespace, even between words?

In @ws(compact) mode one whitespace is allways preserved where there is whitespace.

@kornelski
Copy link
Contributor

kornelski commented Jan 27, 2019

@if contributors_as_a_team {(}<a href="@repo.contributors_http_url()">@if contributors == 100 {over }@contributors contributors</a>@if contributors_as_a_team {)}@if !owners.is_empty() {.}

I'd probably want to format like:

@if contributors_as_a_team {(}
<a href="@repo.contributors_http_url()">
  @if contributors == 100 {over }
  @contributors contributors
</a>
@if contributors_as_a_team {)}
@if !owners.is_empty() {.}

So in this case it'd work for me if whitespace runs containing \n were completely removed, but not other inline whitespace.

kaj added 3 commits March 16, 2019 13:11
As suggested in #29.  A keyword `@whitespace` can be used to switch
whitespace handling between the tree modes `as-is`, `compact`, and
`removed`.

- [x] Implment the functionality.  Set default mode to `compact` and
  update tests.
- [ ] Decide on the exact syntax; should some keyword change?  Should
  `@ws` be allowed as an alias for `@whitespace`?  Should the
  `removed` alternative be called `trim`?  Or some other word change?
- [ ] Add documentation and more tests / examples.
- [ ] Anything else?
Only spaces (ascii 32), tabs, newlines and carriage returns are
considered whitespace.  Other variants (such as no-break space and
spaces of specific width) will probably only appear in templates if
they are actually wanted in output.
@kaj
Copy link
Owner Author

kaj commented Jul 10, 2019

Hmm ... I think it would be rather strange if any sequence of whicespace and tab characters resulted in more whitespace than any such sequence that also contained a newline.

@kaj
Copy link
Owner Author

kaj commented Jul 10, 2019

Maybe we should instead do something like what is done in rust strings: If there is backslash (or possibly some other marker) as a last character before a newline, then the backslash, the newline and any indenting whitespace characters is removed?

@kornelski
Copy link
Contributor

Interesting. That could work.

@kaj kaj marked this pull request as draft January 21, 2023 11:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants