Can Lucene support selective preloading of just slices/files within a CFS file? #13967

Open
gautamworah96 opened this issue Oct 30, 2024 · 2 comments

Comments

@gautamworah96
Contributor

Description

At Amazon Product Search, we warm up our service by preloading vector files into RAM, using the mmapDir.setPreload API.
However, when vector files get compacted into a .cfs file, they are no longer paged into RAM, because setPreload is applied on the basis of the .vec and .veq extensions.

This would also help with lexical search files that get compacted into .cfs files and need preloading for service warm-up.
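
To illustrate the mismatch described above, here is a minimal JDK-only sketch of an extension-based preload rule. The class and method names (PreloadByExtension, shouldPreload) and the sample file names are hypothetical. The predicate accepted by setPreload in Lucene also receives an IOContext, which is left out here so the snippet runs without Lucene on the classpath.

```java
import java.util.Set;

public class PreloadByExtension {
    // Extensions the warm-up is keyed on, as named in the description.
    static final Set<String> VECTOR_EXTENSIONS = Set.of("vec", "veq");

    // Hypothetical helper: true when the file name ends in one of the vector extensions.
    static boolean shouldPreload(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot >= 0 && VECTOR_EXTENSIONS.contains(fileName.substring(dot + 1));
    }

    public static void main(String[] args) {
        // Illustrative segment file names; the .cfs file stands in for a compound
        // file that contains the vector data after compaction.
        for (String name : new String[] {"_0.vec", "_0.veq", "_0.cfs"}) {
            System.out.println(name + " -> " + shouldPreload(name));
        }
    }
}
```

The rule only ever sees the outer file name, so a compound file is skipped even when it contains vector data.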

@mikemccand
Member

One simple workaround is to disable compound file format (.cfs files).

@uschindler
Contributor

In general, I would recommend allowing that, too.

We are currently planning how to make CFS files and madvise work together better; this has already improved in Lucene 10.

IMHO, preloading is no longer a good idea when we use madvise correctly (WILL_NEED). Preloading should be replaced by that (the JVM does the same behind the scenes). We already have support for this in CFS files.
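
For intuition only, the idea of touching just one slice of a larger file can be sketched with plain JDK calls. This is not Lucene's implementation: the class SlicePreload, the method preloadSlice, and the offset and length values are hypothetical, and MappedByteBuffer.load() is only a best-effort request to bring the mapped region into physical memory.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SlicePreload {
    // Hypothetical helper: map only [offset, offset + length) of the file and
    // ask the OS to page that region in, leaving the rest of the file untouched.
    static MappedByteBuffer preloadSlice(Path file, long offset, long length) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, offset, length);
            buf.load(); // best effort; the mapping stays valid after the channel is closed
            return buf;
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a compound file: 64 KiB of zeros with one marker byte.
        Path tmp = Files.createTempFile("fake", ".cfs");
        tmp.toFile().deleteOnExit();
        byte[] data = new byte[64 * 1024];
        data[4096] = 42;
        Files.write(tmp, data);

        // Pretend the sub-file of interest lives at offset 4096 and is 8192 bytes long.
        MappedByteBuffer slice = preloadSlice(tmp, 4096, 8192);
        System.out.println(slice.capacity() + " bytes mapped, first byte = " + slice.get(0));
    }
}
```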

3 participants