Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unpublishing the parent page does not include the children #88

Closed
ssmarco opened this issue Sep 14, 2023 · 8 comments
Closed

Unpublishing the parent page does not include the children #88

ssmarco opened this issue Sep 14, 2023 · 8 comments
Labels
bug Something isn't working

Comments

@ssmarco
Copy link

ssmarco commented Sep 14, 2023

In the CMS, unpublishing the parent page also unpublishes the children automatically.
However, the same behaviour does not seem to sync when it comes to removing them from Elastic.
In this situation, the search results from Elastic will include the supposedly unpublished content. This looks like it affects version 2 and 3 but not tried on v1

Here are steps below to replicate this issue.

  • Create a parent and child page then publish them
  • Make sure to run reindex if need in order to get them to Elastic index
  • Once these pages are indexed in Elastic; we unpublish the parent page.

image

  • Check the queued jobs section to observe that only the parent page is queued to be removed and completed.

image

  • Once the queued job finishes, the API log from Elastic confirmed that only the parent page is removed

image

  • From the documents section in Elastic dashboard the child page is still there.
    image
@chrispenny chrispenny added the bug Something isn't working label Sep 17, 2023
@chrispenny
Copy link
Collaborator

Thanks, @ssmarco!

Adding a note for when we implement a fix for this:

SiteTree has a config enforce_strict_hierarchy, which is what controls the unpublishing action on the child pages when a parent is unpublished. We should check that this is still set to true before we purge any child pages during this parent page unpublishing action.

ssmarco pushed a commit to ssmarco/silverstripe-search-service that referenced this issue Sep 28, 2023
Issue [silverstripe#88](silverstripe#88)

Uses `SiteTree::enforce_strict_hierarchy` config in adding child pages in getting the dependent documents for indexing
ssmarco pushed a commit to ssmarco/silverstripe-search-service that referenced this issue Sep 28, 2023
Issue [silverstripe#88](silverstripe#88) Uses `SiteTree::enforce_strict_hierarchy` config in getting the dependent documents for indexing child pages.
ssmarco pushed a commit to ssmarco/silverstripe-search-service that referenced this issue Sep 28, 2023
Issue [silverstripe#88](silverstripe#88) Uses `SiteTree::enforce_strict_hierarchy` config in getting the dependent documents for indexing child pages.
ssmarco pushed a commit to ssmarco/silverstripe-search-service that referenced this issue Oct 5, 2023
Issue [silverstripe#88](silverstripe#88) Uses `SiteTree::enforce_strict_hierarchy` config in getting the dependent documents for indexing child pages.
@GuySartorelli
Copy link
Member

Can confirm that this is an issue - though probably one that should be fixed in framework or versioned instead.
I think this module is doing the right thing by performing the "removeFromIndexes" behaviour in its onAfterUnpublish() extension hook.

The problem is really that child pages which get unpublished don't get that extension hook called on them, which is most likely down to the convoluted way cascading publish/unpublish events end up getting performed.

@ssmarco
Copy link
Author

ssmarco commented Feb 11, 2024

Thanks @GuySartorelli
Agree with your point. I did some investigation as well during the time of this PR, and I thought that the approach of cascade unpublishing of children pages was for performance reasons hence onAfterUnPublish() was not called on each child page.

@michalkleiner
Copy link

The content of the index should basically contain the same pages as a public user can see/navigate to, so we should keep them in sync. I guess during query time there's a canView check done for each result so we won't present unwanted content, but it'd be best if it even wasn't in the index in the first place.

@chrispenny
Copy link
Collaborator

I guess during query time there's a canView check done

It depends which search implementation you're using. If you're using (say) the JS libraries supplied by Elastic, then your client side is communicating directly with your Elastic engine, so there's no canView (or any other Silverstripe framework) interaction going on.

We had a similar issue to this with the static publisher module (child pages not having their cache purged if the parent is unpublished). The solution there was to fetch and purge child pages as part of the un-publish that was actioned on the parent. Noting again that this only happens if enforce_strict_hierarchy is set to true.

@GuySartorelli
Copy link
Member

GuySartorelli commented Feb 11, 2024

Even if you do the search request from PHP and then perform a canView() check before sending results to the frontend, you'll end up with the same problems with pagination that you get elsewhere with Silverstripe CMS (i.e. your page 1 shows results "1-10 of 20", but only 3 pages can be seen because the others fail permission checks)

@GuySartorelli
Copy link
Member

I did some investigation as well during the time of this PR, and I thought that the approach of cascade unpublishing of children pages was for performance reasons hence onAfterUnPublish() was not called on each child page.

That's probably accurate, yeah. It'd probably be worth checking what sort of performance hit you'd be taking by calling the hook on each child, but I can't imagine it'd be too bad in most cases.

blueo added a commit that referenced this issue Apr 12, 2024
* FIX Unpublising parent pages should include child pages
fixes Issue [#88](#88) Uses `SiteTree::enforce_strict_hierarchy` config in getting the dependent documents for indexing child pages.

* Bugfix `DataObjectDocument::getDependentDocuments` to get dependencies via `has_one` relationships
---------

Co-authored-by: Bernard Hamlin <[email protected]>
@blueo
Copy link
Collaborator

blueo commented Apr 12, 2024

fixed by #89

@blueo blueo closed this as completed Apr 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

5 participants