Multireader Support in Searcher Manager #13976

Open
wants to merge 4 commits into
base: main

Shibi-bala
Contributor

Description

Copied from #13975

I'd like to use MultiReader inside my SearcherManager, but currently only DirectoryReader is supported. I'm not sure why it was scoped this way initially; it seems specific to a particular use case: https://issues.apache.org/jira/browse/LUCENE-6087.

Anyway, this should be a one-line change to this constructor signature:

public SearcherManager(DirectoryReader reader, SearcherFactory searcherFactory)

I found a relevant Stack Overflow question here:
https://stackoverflow.com/questions/49817453/searchermanager-and-multireader-in-lucene

@mikemccand
Member

Thanks @Shibi-bala -- I agree it's odd it was scoped to just DirectoryReader -- any IndexReader should work as long as it can openIfChanged on itself.

I think English.java (from Lucene's test-framework) was maybe deleted long ago? Maybe simplify the test to not bother with English words... just Integer.toString(i) or "" + i should be fine?

Also, please revert the wildcard import (import org.apache.lucene.index.*) -- I think our style checker (jtidy/spotless) will be unhappy with that.

@vigyasharma
Contributor

> any IndexReader should work as long as it can openIfChanged on itself.

Does MultiReader implement openIfChanged()? I see a check in SearcherManager#refreshIfNeeded() that asserts the reader is a DirectoryReader instance. This method is invoked by the ReferenceManager base class whenever maybeRefresh() is called.
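To make the question concrete: as far as I know, `openIfChanged` is a static method on `DirectoryReader` that returns `null` when nothing has changed, and `MultiReader` has no counterpart. Below is a rough sketch of what refreshing the sub-readers by hand might look like. It assumes lucene-core (9.x or later) on the classpath; `MultiReaderRefreshSketch`, `refreshAll`, `addDoc`, and `demo` are hypothetical names for illustration and are not part of Lucene or of this PR.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.MultiReader;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

final class MultiReaderRefreshSketch {

  /** Hypothetical helper: swaps in a newer reader for each stale slot; true if any slot changed. */
  static boolean refreshAll(DirectoryReader[] readers) throws IOException {
    boolean changed = false;
    for (int i = 0; i < readers.length; i++) {
      // Static method on DirectoryReader; returns null when the index is unchanged.
      DirectoryReader newer = DirectoryReader.openIfChanged(readers[i]);
      if (newer != null) {
        readers[i].close();
        readers[i] = newer;
        changed = true;
      }
    }
    return changed;
  }

  /** Appends one document and commits by closing the writer. */
  static void addDoc(Directory dir, String id) throws IOException {
    try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig())) {
      Document doc = new Document();
      doc.add(new StringField("id", id, Field.Store.NO));
      writer.addDocument(doc);
    }
  }

  /** Returns {numDocs before refresh, numDocs after refresh, 1 if anything changed else 0}. */
  static int[] demo() {
    try (Directory a = new ByteBuffersDirectory();
        Directory b = new ByteBuffersDirectory()) {
      addDoc(a, "a1");
      addDoc(b, "b1");
      DirectoryReader[] readers = {DirectoryReader.open(a), DirectoryReader.open(b)};

      int before;
      // closeSubReaders=false: the MultiReader only takes its own reference on each sub-reader.
      try (MultiReader multi = new MultiReader(readers, false)) {
        before = multi.numDocs();
      }

      addDoc(b, "b2"); // only the second index changes
      boolean changed = refreshAll(readers);

      int after;
      try (MultiReader multi = new MultiReader(readers, false)) {
        after = multi.numDocs();
      }
      for (DirectoryReader reader : readers) {
        reader.close();
      }
      return new int[] {before, after, changed ? 1 : 0};
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The point of the sketch is that the composite has to be rebuilt from refreshed sub-readers; nothing here refreshes the `MultiReader` object itself.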

@jpountz
Contributor

jpountz commented Nov 13, 2024

To add to @vigyasharma's point, I have been wondering if we should remove SearcherManager and encourage users to use ReaderManager. IndexSearcher is cheap to create, and there are already some reasons why it may be interesting to create a different IndexSearcher per query, e.g. to tune parallelism based on current load, or to configure a timeout on the IndexSearcher.
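A minimal sketch of that per-query pattern, assuming lucene-core 9.x or later (where, to my knowledge, `IndexSearcher#setTimeout` and `QueryTimeoutImpl` are available). `PerQuerySearcherSketch`, `countWithTimeout`, and `demo` are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.QueryTimeoutImpl;
import org.apache.lucene.index.ReaderManager;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

final class PerQuerySearcherSketch {

  /** Acquires a reader, builds a throwaway IndexSearcher with its own timeout, then releases. */
  static int countWithTimeout(ReaderManager manager, Query query, long timeoutMillis)
      throws IOException {
    DirectoryReader reader = manager.acquire();
    try {
      IndexSearcher searcher = new IndexSearcher(reader); // one searcher per query
      searcher.setTimeout(new QueryTimeoutImpl(timeoutMillis)); // per-query setting
      return searcher.count(query);
    } finally {
      manager.release(reader); // always hand the reference back to the manager
    }
  }

  /** Indexes ids "0".."2" and counts the documents whose id is "1". */
  static int demo() {
    try (Directory dir = new ByteBuffersDirectory()) {
      try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig())) {
        for (int i = 0; i < 3; i++) {
          Document doc = new Document();
          doc.add(new StringField("id", Integer.toString(i), Field.Store.NO));
          writer.addDocument(doc);
        }
      }
      try (ReaderManager manager = new ReaderManager(dir)) {
        return countWithTimeout(manager, new TermQuery(new Term("id", "1")), 1_000L);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The same shape would apply to tuning parallelism: the `IndexSearcher(IndexReader, Executor)` constructor could be used in place of the single-argument one, with the executor picked per query.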

Maybe MultiReader falls in the same bucket. Presumably, the MultiReader is made from multiple DirectoryReaders, so maybe the application code should not create a SearcherManager that works with a MultiReader but instead manage two (or more, one per Directory) ReaderManagers and dynamically create a MultiReader and then an IndexSearcher on top of this MultiReader on every search?
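Here is a hedged sketch of that suggestion, not a definitive implementation: one `ReaderManager` per `Directory`, with a short-lived `MultiReader` and `IndexSearcher` assembled for each search. It assumes lucene-core 9.x or later; `MultiManagerSearchSketch`, `countAcross`, `indexIds`, and `demo` are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.MultiReader;
import org.apache.lucene.index.ReaderManager;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

final class MultiManagerSearchSketch {

  /** Acquires one reader per manager, searches a temporary MultiReader, then releases all. */
  static int countAcross(List<ReaderManager> managers, Query query) throws IOException {
    List<DirectoryReader> acquired = new ArrayList<>();
    try {
      for (ReaderManager manager : managers) {
        acquired.add(manager.acquire());
      }
      // closeSubReaders=false: closing the MultiReader drops only its own references,
      // so the managers keep ownership of the underlying readers.
      try (MultiReader multi = new MultiReader(acquired.toArray(new IndexReader[0]), false)) {
        return new IndexSearcher(multi).count(query);
      }
    } finally {
      for (int i = 0; i < acquired.size(); i++) {
        managers.get(i).release(acquired.get(i));
      }
    }
  }

  /** Appends the given ids as documents and commits by closing the writer. */
  static void indexIds(Directory dir, String... ids) throws IOException {
    try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig())) {
      for (String id : ids) {
        Document doc = new Document();
        doc.add(new StringField("id", id, Field.Store.NO));
        writer.addDocument(doc);
      }
    }
  }

  /** Returns the hit count for id "1" before and after the second index gains a document. */
  static int[] demo() {
    try (Directory d1 = new ByteBuffersDirectory();
        Directory d2 = new ByteBuffersDirectory()) {
      indexIds(d1, "0", "1");
      indexIds(d2, "1", "2");
      try (ReaderManager m1 = new ReaderManager(d1);
          ReaderManager m2 = new ReaderManager(d2)) {
        Query query = new TermQuery(new Term("id", "1"));
        int before = countAcross(List.of(m1, m2), query);
        indexIds(d2, "1");
        m2.maybeRefresh(); // each manager refreshes independently
        int after = countAcross(List.of(m1, m2), query);
        return new int[] {before, after};
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

With this arrangement each index refreshes on its own schedule, and the composite view only lives for the duration of a single search.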

4 participants