mechanism to temporarily prevent text retrieval #396

Open
hi-ko opened this issue Apr 4, 2022 · 0 comments
The content tracker still needs to run synchronously. We need a mechanism to temporarily prevent text retrieval in order to avoid scalability issues and timeouts (caused by async transactions), especially when we already know they are long-running, as with OCR.

In the old, sync transformer framework it was possible to fake such a feature by setting cm:isContentIndexed=false, which prevented the node from being picked up from the repository before it had been transformed, and then removing that aspect later once the text transformation was available.

#395 / SEARCH-2974 breaks this old "feature". So either we get a new feature to postpone text retrieval, or the isContentIndexed mechanism has to work again as expected, e.g.:

  • if a behavior adds the cm:isContentIndexed=false property to a new node, this must not result in an empty index doc
  • when the aspect is removed or isContentIndexed is set to true later, the text should be indexed
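
To make the two expectations above concrete, here is a minimal toy model in Python. It is only a sketch of the requested semantics, not Alfresco or Search Services code: `Node`, `ToyIndex`, and `track` are hypothetical names, and `is_content_indexed` merely stands in for cm:isContentIndexed. It also assumes that "must not result in an empty index doc" means no content entry is written at all while the flag is false.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a repository node."""
    node_id: str
    text: Optional[str] = None       # extracted text; None until the transformation is done
    is_content_indexed: bool = True  # stands in for cm:isContentIndexed


class ToyIndex:
    """Hypothetical content index: maps node id to indexed text."""

    def __init__(self) -> None:
        self.docs: Dict[str, str] = {}

    def track(self, node: Node) -> str:
        """Simulate one content tracker pass over a single node."""
        if not node.is_content_indexed:
            # Expectation 1: a postponed node must not yield an empty index doc.
            return "skipped"
        if node.text is None:
            # Text not available yet; nothing is written, retry on a later pass.
            return "pending"
        # Expectation 2: once the flag is true (or removed), the text is indexed.
        self.docs[node.node_id] = node.text
        return "indexed"


index = ToyIndex()
node = Node("n1", is_content_indexed=False)

first_pass = index.track(node)    # flag is false: node is skipped
docs_after_first = dict(index.docs)

node.text = "text produced by OCR"
node.is_content_indexed = True    # flag flipped once the text is available
second_pass = index.track(node)
```

In this model the first pass leaves the index untouched, and the second pass indexes the text after the flag is flipped, which is the behavior the two bullets ask for.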