Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v1.3.4 #308

Merged
merged 4 commits into from
Feb 3, 2025
Merged

v1.3.4 #308

merged 4 commits into from
Feb 3, 2025

Conversation

vincerubinetti
Copy link
Collaborator

@vincerubinetti vincerubinetti commented Jan 31, 2025

Closes #305

This explanation is a bit hand-wavy, because I still can't follow the logic exactly. I find it hard to debug the flow of things in Jekyll because, while there's a --trace flag, the trace seems to be just the under-the-hood Ruby code, and not e.g. "you used this _include in this markdown file on this line, which then used this other _include, which then ran this Ruby filter, etc.".

But in general:

Certain nested content (I think within the post-excerpt hidden data-search attribute) ended up making its way to relative_url in content.html (which should just be a url), but only in certain rare cases where the peculiarity of the regex matching in the section component didn't prevent it.

  • change regex matching of section params in content.html to non-greedy, as they should've always been
  • change string pattern of passing section params to content.html from param: value; to <param>value</param>. this should reduce the chances of incorrect parsing, because one of the params is a URL, which can have ; characters in them, which would confuse the old format.
  • fix unrelated bug where home page tab title looks like | Lab Website Template when it should just be Lab Website Template because the page title is ""
  • strip all html from post-excerpt data-search attribute, as it always should've been. is part of the main bug fix of this PR, but also will reduce false-positive search matches, because without it, data-search contains e.g. single p characters leftover from <p> tags. now data-search should truly only be the content of the post.

While not fully understanding the full flow of the root issue here, I think I've added enough safety mechanisms in enough spots to prevent similar issues from occurring in the future.

New template version checklist:

  • I have updated CITATION and CHANGELOG as appropriate.
  • I have updated lab-website-template-docs as appropriate.
  • I have checked the testbed as appropriate.

Copy link

github-actions bot commented Jan 31, 2025

PR Preview Action v1.4.7
Preview removed because the pull request was closed.
2025-02-03 20:23 UTC

@falquaddoomi
Copy link
Contributor

falquaddoomi commented Feb 3, 2025

FWIW, I was able to replicate your bug from #305 and this PR does seem to fix it.

Disclaimer, I don't know very much about Jekyll or Ruby, but I think I was able to kind of trace the source of the original issue. Given your trace from that issue, I assumed that the contents of section.html were somehow ending up getting passed to a URL parser, and sure enough in https://github.com/greenelab/lab-website-template/blob/main/_includes/content.html#L27 there's a part where image is passed to relative_url.

If you remove the relative_url | uri_escape parts, rebuild the site (which should proceed now without the exception), and search for --image: url( in _site, you should see strings that look like <section class="background" data-size="page" style="--image: url('; dark: ; size: ')">. I assume relative_url was trying to parse whatever came first as a scheme (i.e., "http://", "ftp://", etc.) and "; dark" clearly isn't one.

I have to agree with you that it isn't easy to debug things in Jekyll. Sorry for the redundant advice if you're already doing this, but I learned from trying to debug your issue that looking at incomplete output in _site, specifically by throwing <pre label="some-string">{{ var | inspect }}</pre> into templates and then searching for the <pre label="..."> bit, helped a bit.

As for the source of the issue, again I'm not very familiar with Jekyll, but it may have something to do with unexpected newlines being present in your input. In most environments the . regex character class doesn't match newlines by default (you typically enable a "newline mode" flag for the regex parser if you do want it to include them), so it may have messed up something that you were trying to parse out.

@vincerubinetti vincerubinetti merged commit d515175 into main Feb 3, 2025
4 checks passed
@vincerubinetti vincerubinetti deleted the v1.3.4 branch February 3, 2025 20:22
@vincerubinetti
Copy link
Collaborator Author

The pre is a nice trick but yeah I've been using inspect, but unfortunately it only works if the site is able to build. I was also trying to hack the jekyll source code to console log some information but that ended up just being a time sink. It could also be newlines as you said. Unfortunately I can't afford to spend more time on this, so the exact flow of code and how it cause the exact conditions to manifest will have to remain a mystery.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Invalid scheme format: '; dark' in /usr/src/app/_layouts/default.html
2 participants