How to configure TieredMergePolicy for very low segment count? #14004

Open · jpountz opened this issue Nov 20, 2024 · 4 comments

jpountz (Contributor) commented Nov 20, 2024

Description

I have been experimenting with configuring TieredMergePolicy to keep the segment count very low:

  • segsPerTier = 2
  • floorSegmentSize = 512MB

This typically helps if you run queries that have a high per-segment overhead (vector search, multi-term queries) and have a low indexing throughput (especially if indexing and search run on separate hardware so that merges don't disturb searches).
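
For reference, here is a minimal sketch of what this configuration looks like through Lucene's TieredMergePolicy and IndexWriterConfig setters (the analyzer, index path, and class name are just placeholders):

```java
import java.nio.file.Paths;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.TieredMergePolicy;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class LowSegmentCountConfig {
  public static IndexWriter openWriter() throws Exception {
    // Aim for a very low segment count: few segments per tier, large floor.
    TieredMergePolicy mergePolicy = new TieredMergePolicy();
    mergePolicy.setSegmentsPerTier(2);   // segsPerTier = 2
    mergePolicy.setFloorSegmentMB(512);  // floorSegmentSize = 512MB

    IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
    config.setMergePolicy(mergePolicy);

    Directory dir = FSDirectory.open(Paths.get("/tmp/index")); // placeholder path
    return new IndexWriter(dir, config);
  }
}
```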

Interestingly, an index that is less than 1GB can still have 10 segments with the above merge policy because of the constraint to not run merges where the resulting segment is less than 50% bigger than the biggest input segment. E.g. consider the following segment sizes: 100kB, 300kB, 800kB, 2MB, 5MB, 12MB, 30MB, 70MB, 150MB, 400MB. There is no pair of segments where the sum is more than 50% bigger than the max input segment.
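
As a quick arithmetic check of that claim (standalone code, not Lucene's actual merge selection; the 1.5x factor encodes the 50%-bigger constraint described above), no adjacent pair of these sizes qualifies, and pairing a segment with anything smaller than its direct neighbor only lowers the sum:

```java
// Standalone sanity check of the example above: with a 2-segment merge factor,
// a merge only runs if the merged size exceeds 1.5x the biggest input.
public class PairMergeCheck {
  public static void main(String[] args) {
    double[] sizesMB = {0.1, 0.3, 0.8, 2, 5, 12, 30, 70, 150, 400};
    for (int i = 0; i + 1 < sizesMB.length; i++) {
      double merged = sizesMB[i] + sizesMB[i + 1];
      double required = 1.5 * sizesMB[i + 1];
      // e.g. 150 + 400 = 550MB, below the 600MB needed to justify the merge
      System.out.printf("%.1f + %.1f = %.1f MB (needs > %.1f MB): %s%n",
          sizesMB[i], sizesMB[i + 1], merged, required,
          merged > required ? "mergeable" : "skipped");
    }
  }
}
```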

I have a bias against removing this constraint, since containing write amplification is important to avoid running into quadratic merging, but I wonder if there are other ways we could further reduce the number of segments.

For instance, TieredMergePolicy automatically takes the min of maxMergeAtOnce and numSegsPerTier as a merge factor, but it's not clear to me why this is important. If the merge policy allowed merges to have between 2 and 10 segments in the above example, it could find merges in the described segment structure, and this would likely achieve lower write amplification for the same segment count?
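
To illustrate, here is a rough standalone sketch (not TieredMergePolicy's actual candidate selection) of how allowing a variable number of segments per merge finds a valid candidate in the example above, even though no 2-segment merge passes the 50% check:

```java
// Standalone sketch: grow a merge from the smallest segments upward and stop
// as soon as the merged size exceeds 1.5x the largest input, i.e. the
// write-amplification guard described above is satisfied.
public class VariableSizeMergeSketch {
  public static void main(String[] args) {
    double[] sizesMB = {0.1, 0.3, 0.8, 2, 5, 12, 30, 70, 150, 400}; // sorted ascending
    double sum = 0;
    for (int count = 1; count <= sizesMB.length; count++) {
      double largestInput = sizesMB[count - 1];
      sum += largestInput;
      if (count >= 2 && sum > 1.5 * largestInput) {
        // With these sizes this fires at count = 4: 3.2MB > 1.5 * 2MB = 3MB,
        // a merge that a fixed 2-segment merge factor would never consider.
        System.out.printf("Merge the %d smallest segments: %.1f MB > %.1f MB%n",
            count, sum, 1.5 * largestInput);
        break;
      }
    }
  }
}
```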

Other ideas?

jpountz (Contributor, Author) commented Nov 20, 2024

Interestingly, it looks like PR #266 would do what I'm looking for, though it was aimed at solving a different problem (!)

jpountz (Contributor, Author) commented Nov 20, 2024

I confirmed that #266 seems to help for this case. It does find merges to run with the above example (100kB, 300kB, 800kB, 2MB, 5MB, 12MB, 30MB, 70MB, 150MB, 400MB). And when I run our simulation tests in BaseMergePolicyTestCase, it ends up with a similar number of segments but at a lower write amplification.

jpountz (Contributor, Author) commented Nov 22, 2024

I ran the IndexGeoNames benchmark with 1 indexing thread, SerialMergeScheduler, 10k buffered docs, a 100MB floor segment size, and 2 segments per tier. This made the total indexing time go from 275s to 250s with #266, which suggests that the reduced write amplification is indeed helping, despite flushing a single segment at a time.
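
For context, this is roughly the writer configuration that benchmark run corresponds to (a sketch; the analyzer is a placeholder and the rest mirrors the settings listed above):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.SerialMergeScheduler;
import org.apache.lucene.index.TieredMergePolicy;

public class GeoNamesBenchConfig {
  public static IndexWriterConfig newConfig() {
    IndexWriterConfig iwc = new IndexWriterConfig(new StandardAnalyzer());
    iwc.setMergeScheduler(new SerialMergeScheduler());            // merges run on the indexing thread
    iwc.setMaxBufferedDocs(10_000);                               // flush every 10k buffered docs
    iwc.setRAMBufferSizeMB(IndexWriterConfig.DISABLE_AUTO_FLUSH); // flush by doc count only

    TieredMergePolicy tmp = new TieredMergePolicy();
    tmp.setFloorSegmentMB(100); // 100MB floor segment size
    tmp.setSegmentsPerTier(2);  // 2 segments per tier
    iwc.setMergePolicy(tmp);
    return iwc;
  }
}
```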

mikemccand (Member) commented

Interestingly, an index that is less than 1GB can still have 10 segments with the above merge policy because of the constraint to not run merges where the resulting segment is less than 50% bigger than the biggest input segment. E.g. consider the following segment sizes: 100kB, 300kB, 800kB, 2MB, 5MB, 12MB, 30MB, 70MB, 150MB, 400MB. There is no pair of segments where the sum is more than 50% bigger than the max input segment.

Hmm, shouldn't `floorSegmentSize=512MB` mean all of these segments are considered the "same size" (level) and all mergeable? Or does that 50% check trump the flooring?

For instance, TieredMergePolicy automatically takes the min of maxMergeAtOnce and numSegsPerTier as a merge factor, but it's not clear to me why this is important. If the merge policy allowed merges to have between 2 and 10 segments in the above example, it could find merges in the described segment structure, and this would likely achieve lower write amplification for the same segment count?

I think it did this in order to aim for an index geometry over time that has logarithmically sized levels (ish), where segment sizes tend to cluster into these close-ish levels? It mimics the behavior of the more carefully defined LogDoc/ByteSizeMergePolicy. But this doesn't seem intrinsic/important -- +1 to allow it to pick a range of segments to merge at once?

"Optimal" merging is hard! I wish we had the perfect merge policy that simply took as input what amortized write amplification is acceptable during indexing, and given that budget would aim to maximize search performance (by some approximate measure)... for apps using NRT segment replication, where one JVM indexes and N JVMs search (physically/virtually different instances) the efficiently incrementally replicated index, they would tolerate possibly very high write amplification (Amazon Product Search falls into this case). But for other apps that are indexing and searching on the same node, there's likely much less tolerance in burning so much CPU/IO to eek out a slightly more efficient index for searching ...
