FEATURE: Speedup content cache flush by using cte in findAncestorNodeAggregateIds #5261

Merged (17 commits) on Nov 4, 2024

Conversation

mhsdesign
Member

@mhsdesign mhsdesign commented Sep 24, 2024

Followup to #5221

ContentGraph::findParentNodeAggregates becomes slower on bigger datasets. Because it is executed a great many times during a cr:replay, the cost adds up very quickly. #5268 will improve the query itself, but this PR introduces ContentGraph::findAncestorNodeAggregateIds to make this operation as fast as possible.

  • Adds comments for the CacheFlushingStrategy strategies

  • Introduces ContentGraphInterface::findAncestorNodeAggregateIds, which uses a native SQL CTE to speed up cache flushing (see comment)

  • Moves the test which creates an illegal state to the content graph package and uses native SQL to create that state, so that no catch-up hooks run. Previously we needed to handle the case of infinite loops to avoid crashing:

    Prevent infinite loops
    NOTE: Normally, the content graph cannot contain cycles. However, during the
    testcase "Features/ProjectionIntegrityViolationDetection/AllNodesAreConnectedToARootNodePerSubgraph.feature"
    and in case of bugs, it could actually have cycles.
    The content cache catchup hook leverages this method and would otherwise hang in an endless loop.
    That's why we track the seen NodeAggregateIds to be sure we don't traverse them multiple times.

  • Fixes a bug where a replay was not possible because the workspace was "missing" and no content graph existed
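
For illustration, the ancestor lookup via a recursive CTE can be sketched as follows. This is a minimal sketch and not the query from this PR: it uses SQLite through Python's `sqlite3` module, and the `edge` table, its `parent_id` / `child_id` columns and the `find_ancestor_ids` helper are hypothetical names chosen for the example. It also shows why the seen-ids tracking is not needed inside such a query: with `UNION` (instead of `UNION ALL`) duplicate rows are discarded, so the recursion ends even when the data contains a cycle.

```python
import sqlite3


def find_ancestor_ids(conn, start_id):
    """Return the ids of all ancestors of start_id as a set (no ordering)."""
    rows = conn.execute(
        """
        WITH RECURSIVE ancestors(id) AS (
            -- anchor member: the direct parents of the entry node
            SELECT parent_id FROM edge WHERE child_id = ?
            -- UNION (not UNION ALL) drops rows already produced,
            -- so a cyclic graph cannot cause endless recursion
            UNION
            -- recursive member: the parents of everything found so far
            SELECT e.parent_id
            FROM edge e
            JOIN ancestors a ON e.child_id = a.id
        )
        SELECT id FROM ancestors
        """,
        (start_id,),
    ).fetchall()
    return {row[0] for row in rows}


def make_graph(edges):
    """Create an in-memory database holding the given (parent, child) edges."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE edge (parent_id TEXT NOT NULL, child_id TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO edge VALUES (?, ?)", edges)
    return conn


# A regular tree: root -> a -> b -> c, plus a sibling branch a -> d
tree = make_graph([("root", "a"), ("a", "b"), ("b", "c"), ("a", "d")])
tree_ancestors = find_ancestor_ids(tree, "c")

# An illegal state with a cycle: x -> y -> x
cyclic = make_graph([("x", "y"), ("y", "x")])
cyclic_ancestors = find_ancestor_ids(cyclic, "x")
```

The whole traversal happens in one round trip to the database, instead of one parent query per level issued from application code.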

Upgrade instructions

Review instructions

Checklist

  • Code follows the PSR-2 coding style
  • Tests have been created, run and adjusted as needed
  • The PR is created against the lowest maintained branch
  • Reviewer - PR Title is brief but complete and starts with FEATURE|TASK|BUGFIX
  • Reviewer - The first section explains the change briefly for change-logs
  • Reviewer - Breaking Changes are marked with !!! and have upgrade-instructions

@mhsdesign mhsdesign marked this pull request as draft September 24, 2024 19:19
@mhsdesign mhsdesign marked this pull request as ready for review September 25, 2024 11:43
@mhsdesign mhsdesign marked this pull request as draft September 25, 2024 17:33
@mhsdesign mhsdesign changed the title from "TASK: content cache flusher followup" to "FEATURE: Speedup content cache flush by using cte in findAncestorNodeAggregateIds" Sep 25, 2024
@mhsdesign mhsdesign marked this pull request as ready for review September 26, 2024 09:17
In the case of the content cache flusher we do not care about the order. Ordering by parentnodeanchor and position (for siblings) is slower and not even correct in all situations, as the parentnodeanchor is just an auto-increment value without meaning.
@nezaniel nezaniel self-requested a review October 23, 2024 20:54
Member

@nezaniel nezaniel left a comment


minimal naming stuff, looks fine otherwise (and is a lot faster in the non-trivial test cases)

@dlubitz
Contributor

dlubitz commented Oct 29, 2024

Looks fine to me by reading. I left two comments.

`cn` should actually be named `ch` as it's joined as the child hierarchy relation
@mhsdesign mhsdesign force-pushed the task/contentCacheFlusher-followup branch from 3a6a28f to 2dbd3a0 November 4, 2024 10:28
mhsdesign and others added 2 commits November 4, 2024 11:30
The query could be optimized a bit more (at least for deeper trees) if we join the node table only after we have built the hierarchy.

Co-authored-by: Denny Lubitz <[email protected]>
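
The idea from this commit message, walking the hierarchy on anchor ids alone and joining the node table only once at the end, could look roughly like the sketch below. It is an assumption-laden illustration, not the merged query: it runs on SQLite via Python's `sqlite3`, and the `node` and `hierarchy` tables, their columns and the `find_ancestor_aggregate_ids` helper are hypothetical stand-ins for the real projection schema. In line with the earlier commit note about ordering, the result is returned as an unordered set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- hypothetical, simplified schema for this sketch only
    CREATE TABLE node (anchor INTEGER PRIMARY KEY, aggregate_id TEXT NOT NULL);
    CREATE TABLE hierarchy (
        parent_anchor INTEGER NOT NULL,
        child_anchor INTEGER NOT NULL
    );
    INSERT INTO node VALUES (1, 'root'), (2, 'site'), (3, 'page'), (4, 'text');
    INSERT INTO hierarchy VALUES (1, 2), (2, 3), (3, 4);
    """
)


def find_ancestor_aggregate_ids(conn, entry_aggregate_id):
    """Return the aggregate ids of all ancestors of the entry node as a set."""
    rows = conn.execute(
        """
        WITH RECURSIVE ancestor_anchors(anchor) AS (
            -- anchor member: resolve the entry node once, take its parents
            SELECT h.parent_anchor
            FROM node n
            JOIN hierarchy h ON h.child_anchor = n.anchor
            WHERE n.aggregate_id = ?
            UNION
            -- recursive member: touches the hierarchy table only
            SELECT h.parent_anchor
            FROM ancestor_anchors a
            JOIN hierarchy h ON h.child_anchor = a.anchor
        )
        -- the node table is joined once, after the hierarchy has been built;
        -- no ORDER BY, because the caller does not depend on the order
        SELECT DISTINCT n.aggregate_id
        FROM ancestor_anchors a
        JOIN node n ON n.anchor = a.anchor
        """,
        (entry_aggregate_id,),
    ).fetchall()
    return {row[0] for row in rows}


ancestors_of_text = find_ancestor_aggregate_ids(conn, "text")
```

Keeping the node table out of the recursive member means each recursion step joins one table instead of two, which is where deeper trees would be expected to benefit.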
Member

@bwaidelich bwaidelich left a comment


+1 by reading, thanks!

@mhsdesign mhsdesign merged commit 51d5e4e into 9.0 Nov 4, 2024
10 checks passed
@mhsdesign mhsdesign deleted the task/contentCacheFlusher-followup branch November 4, 2024 13:33