HTML API: Fix - avoid calling subclass method while internally scanning in Tag Processor #5475

dmsnell · 2023-10-12T21:49:50Z

After modifying tags in the HTML API, the Tag Processor backs up to before the tag being modified and then re-parses its attributes. This saves on the code-complexity involved in applying updates, which have already been transformed to "lexical updates" by the time they are applied.

In order to do that, get_updated_html() calls next_tag() to reuse its logic. Unfortunately, as a public method, subclasses may change the behavior of that method, and the HTML Processor does just this. It maintains an HTML stack of open elements and when the Tag Processor calls this method to re-scan a tag and its attributes, it leads to a broken stack.

To fix this, this patch replaces the call to next_tag() with a more appropriate reapplication of its internal parsing logic to rescan the tag name and its attributes. Given the limited nature of what's occurring in get_updated_html() this should bring with it certain guarantees that no HTML structure is being changed (that structure will only be changed by subclasses like the HTML Processor).
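The failure mode and the fix can be illustrated with a toy model. This is not the WordPress code: the classes `TagScanner` and `StackTrackingScanner`, the regex, and the helpers `_parse_tag_at`, `_insert_attr`, `add_class_buggy`, and `add_class_fixed` are all hypothetical, chosen only to show why an internal rescan should not go through an overridable public method.

```python
import re

# Hypothetical, simplified tag-name matcher (opening tags only).
TAG = re.compile(r"<([a-zA-Z][a-zA-Z0-9]*)")


class TagScanner:
    """Toy stand-in for a tag scanner; not the real Tag Processor."""

    def __init__(self, html):
        self.html = html
        self.pos = 0
        self.tag_start = None
        self.tag_name = None

    def _parse_tag_at(self, start):
        # Internal parsing step: only touches the scanner's own cursor state.
        match = TAG.match(self.html, start)
        if match is None:
            return False
        self.tag_start = start
        self.tag_name = match.group(1).upper()
        self.pos = match.end()
        return True

    def next_tag(self):
        # Public and overridable: advance to the next opening tag.
        start = self.html.find("<", self.pos)
        while start != -1:
            if self._parse_tag_at(start):
                return True
            start = self.html.find("<", start + 1)
        return False

    def _insert_attr(self, name):
        # Insert a class attribute right after the current tag's name.
        at = self.tag_start + 1 + len(self.tag_name)
        self.html = self.html[:at] + f' class="{name}"' + self.html[at:]

    def add_class_buggy(self, name):
        self._insert_attr(name)
        self.pos = self.tag_start
        self.next_tag()  # dispatches to any subclass override

    def add_class_fixed(self, name):
        self._insert_attr(name)
        self._parse_tag_at(self.tag_start)  # internal rescan only


class StackTrackingScanner(TagScanner):
    """Subclass that records every tag it visits, like a stack of open elements."""

    def __init__(self, html):
        super().__init__(html)
        self.open_elements = []

    def next_tag(self):
        found = super().next_tag()
        if found:
            self.open_elements.append(self.tag_name)
        return found


def visit_two_tags_then(method_name):
    scanner = StackTrackingScanner("<div><p>hi</p></div>")
    scanner.next_tag()  # DIV
    scanner.next_tag()  # P
    getattr(scanner, method_name)("x")
    return scanner
```

In this sketch, `visit_two_tags_then("add_class_buggy")` records the `P` tag twice because the rescan goes through the overridden `next_tag()`, while `visit_two_tags_then("add_class_fixed")` leaves the subclass's bookkeeping untouched. Both produce the same updated HTML.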

This patch resolves an issue discovered by @adamziel during testing of the HTML Processor.

There is no evidence that this has a measurable impact on performance.

dmsnell · 2023-10-12T23:34:45Z

cc: @adamziel @westonruter @ockham

adamziel

Hey this is better than the previous implementation ❤️

SergeyBiryukov · 2023-10-16T14:01:02Z

Thanks for the PR! Merged in r56941.

dmsnell · 2023-10-16T17:49:57Z

thanks @SergeyBiryukov!

andrewserong · 2023-10-17T04:25:33Z

I'm not sure if it's related, but Gutenberg trunk PHP tests have started failing, in particular WP_Directive_Processor tests:

There were 2 failures:

1) WP_Directive_Processor_Test::test_set_inner_html_subsequent_updates_on_the_same_tag_work
Failed asserting that two strings are identical.
--- Expected
+++ Actual
@@ @@
-'<div>outside</div><section>This is the even newer section content.</section>'
+'<This is the even newer section content.</section>'

/var/www/html/wp-content/plugins/gutenberg/phpunit/experimental/interactivity-api/class-wp-directive-processor-test.php:90
phpvfscomposer:///var/www/html/wp-content/plugins/gutenberg/vendor/phpunit/phpunit/phpunit:60

2) WP_Directive_Processor_Test::test_set_inner_html_preceded_by_set_attribute_works
Failed asserting that two strings are identical.
--- Expected
+++ Actual
@@ @@
-'<div>outside</div><section id="thesection">This is the new section content.</section>'
+'<This is the new section content.</section>'

/var/www/html/wp-content/plugins/gutenberg/phpunit/experimental/interactivity-api/class-wp-directive-processor-test.php:108
phpvfscomposer:///var/www/html/wp-content/plugins/gutenberg/vendor/phpunit/phpunit/phpunit:60

Could that be related to this change, since WP_Directive_Processor extends WP_HTML_Tag_Processor?

There's also a bit of discussion over in the #core-editor Slack channel: https://wordpress.slack.com/archives/C02QB2JS7/p1697514694299429?thread_ts=1697471144.510409&cid=C02QB2JS7

Fixes a bug introduced in WordPress#5475. When applying updates to HTML, WordPress#5475 left out one step: updating the position of the end of the current tag. This made it possible to create bookmarks whose end position was null or earlier than their start position. This in turn broke the Directive Processor in Gutenberg when the changes were backported from Core into Gutenberg. In this patch, after applying updates, the HTML document is scanned fully to the end of the current tag, updating the internal pointer to its end, so that nothing else will be broken or misaligned.
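A minimal sketch of why the end-of-tag pointer has to be recomputed after an update. The function `apply_insert` is hypothetical and deliberately simplistic (it treats the first `>` after the tag start as the end of the tag and ignores `>` inside quoted attribute values); it is not the Tag Processor's implementation.

```python
def apply_insert(html, tag_start, insert_at, text):
    """Insert `text` into `html`, then rescan from `tag_start` to the tag's end.

    Returns the updated HTML and the (start, end) span of the current tag.
    Toy model: the first '>' after `tag_start` is taken as the end of the tag.
    """
    updated = html[:insert_at] + text + html[insert_at:]
    # Rescan instead of reusing the old end offset, which is now stale.
    tag_end = updated.index(">", tag_start) + 1
    return updated, (tag_start, tag_end)


html = "<div>outside</div><section>old</section>"
tag_start = html.index("<section")
stale_end = html.index(">", tag_start) + 1  # end offset before the update
insert_at = tag_start + len("<section")

updated, (start, end) = apply_insert(html, tag_start, insert_at, ' id="thesection"')
```

Slicing `updated[start:end]` yields the whole opening tag, whereas slicing with `stale_end` cuts the tag off partway through the new attribute. A bookmark built from the stale offset would be misaligned in the same way.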

dmsnell changed the title ~~HTML API: Fix - unwind stack when modifying tag.~~ HTML API: Fix - avoid calling subclass method while internally scanning in Tag Processor Oct 12, 2023

dmsnell marked this pull request as ready for review October 12, 2023 23:24

dmsnell force-pushed the html-api/fix-problems-found-during-testing branch from 838f4f4 to 7aebc80 Compare October 12, 2023 23:33

adamziel approved these changes Oct 13, 2023

dmsnell added 2 commits October 13, 2023 14:23

Add test codifying set_attribute bug modifying breadcrumbs.

eea28e5

Call internal parsing methods instead of public next_tag() after updating HTML.

3026215

dmsnell force-pushed the html-api/fix-problems-found-during-testing branch from 7aebc80 to 3026215 Compare October 13, 2023 21:23

SergeyBiryukov closed this Oct 16, 2023

dmsnell deleted the html-api/fix-problems-found-during-testing branch October 16, 2023 17:50

dmsnell mentioned this pull request Oct 17, 2023

Fix: Directive processor failing on updated HTML API WordPress/gutenberg#55404

Merged

dmsnell mentioned this pull request Oct 17, 2023

HTML API: Scan to end of tag when getting updated HTML output. #5506

Closed
