FIX Don't strip <header> tag from HTMLValue #11302

Merged

Conversation

GuySartorelli
Member

@GuySartorelli GuySartorelli commented Jul 7, 2024

Description

Stops HTMLValue from stripping the <header> tag.

Manual testing steps

See linked issue

Issues

- $content = preg_replace('#</?(html|head|body)[^>]*>#si', '', $content);
+ $content = preg_replace('#</?(html|head|body)(\s[^>]*)?>#si', '', $content);
Member Author

Went for the broader approach here rather than the suggestion in the issue - HTML6 could introduce new tags like <heading>, or someone could be using a front-end JS library that consumes custom tags, and the suggested pattern would otherwise strip those.
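For readers following along, here is a minimal sketch of what the broader pattern does. The function name `stripWrapperTagsBroad` and the sample markup are hypothetical and only for illustration; the regex itself is the one from the diff above.

```php
<?php
// Hypothetical helper for illustration only; not part of the PR.
function stripWrapperTagsBroad(string $content): string
{
    // Match html/head/body only when the tag name is followed directly by
    // whitespace (and then attributes) or by the closing ">".
    // Longer names such as <header> or <heading> therefore do not match.
    return preg_replace('#</?(html|head|body)(\s[^>]*)?>#si', '', $content);
}

// Hypothetical sample input.
$sample = '<html><head></head><body class="x"><header>Hi</header><heading>H</heading></body></html>';
$result = stripWrapperTagsBroad($sample);

echo $result, PHP_EOL; // <header>Hi</header><heading>H</heading>
```

With this pattern any tag whose name merely starts with `html`, `head`, or `body` is left alone, which is why it would also preserve custom or future tags.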

Member Author

UPDATE: I've reverted to the suggestion from the issue because it doesn't hit the CI failure my attempt did. Specifically, there was an edge case (already covered by a unit test, which is what failed) where some really weird invalid HTML could result in multiple <body> tags existing in the HTML.

The HTMLValue implementation relies on exactly one <body> tag being present, so that's a non-starter. People who want custom weird tags will have to name them something else, and we'll just have to be aware of changes when future versions of HTML roll around.
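A minimal sketch of the trade-off described above, using the negative-lookahead pattern that was merged. The function name `stripWrapperTagsLookahead` and the sample markup are hypothetical; this does not reproduce the failing unit test, only which tags the pattern strips.

```php
<?php
// Hypothetical helper for illustration only; not part of the PR.
function stripWrapperTagsLookahead(string $content): string
{
    // "head(?!er)" matches "head" unless it is immediately followed by "er",
    // so <header> survives. Any other tag name starting with html, head, or
    // body is still stripped, because [^>]* consumes the rest of the tag.
    return preg_replace('#</?(html|head(?!er)|body)[^>]*>#si', '', $content);
}

// Hypothetical sample input.
$sample = '<html><head></head><body><header>Hi</header><heading>H</heading></body></html>';
$result = stripWrapperTagsLookahead($sample);

echo $result, PHP_EOL; // <header>Hi</header>H
```

Note that the hypothetical `<heading>` tag is removed while its text content stays, which is the limitation accepted in the comment above.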

@GuySartorelli
Member Author

Prefer-lowest failures are expected - that's happening all over the place. I suspect that's what silverstripe/gha-ci#132 is intended to fix.

Member

@emteknetnz emteknetnz left a comment

Solution looks fine, though there are failing unit tests.

@GuySartorelli GuySartorelli force-pushed the pulls/5.2/no-strip-header branch from 30e7959 to e68b65c Compare July 9, 2024 00:15
@GuySartorelli GuySartorelli force-pushed the pulls/5.2/no-strip-header branch from e68b65c to f46c0d8 Compare July 9, 2024 00:24
@GuySartorelli
Member Author

See #11302 (comment)

@emteknetnz emteknetnz merged commit c13ec34 into silverstripe:5.2 Jul 9, 2024
15 checks passed
@emteknetnz emteknetnz deleted the pulls/5.2/no-strip-header branch July 9, 2024 01:18
@@ -32,7 +32,7 @@ public function __construct($fragment = null)
      */
     public function setContent($content)
     {
-        $content = preg_replace('#</?(html|head|body)[^>]*>#si', '', $content);
+        $content = preg_replace('#</?(html|head(?!er)|body)[^>]*>#si', '', $content);
Contributor

@michalkleiner michalkleiner Jul 12, 2024

Just a comment here — I know it's cooler to use more regex foo, but imho here it would be much easier for anyone reading the list of tags if it was explicitly listed as html|head|body|header.

3 participants