Pruning of estimating the point value count since BooleanScorerSupplier #13554

kkewwei · 2024-07-09T15:02:24Z

Description

In #13199, we added isEstimatedPointCountGreaterThanOrEqualTo so that point count estimation can be pruned dynamically. However, many functions still call estimatePointCount directly, so dynamic pruning is not used there.

lucene/lucene/core/src/java/org/apache/lucene/index/PointValues.java

Line 387 in 295c5d3

public final long estimatePointCount(IntersectVisitor visitor) {
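To illustrate the difference between a full estimate and a bounded check, here is a minimal, self-contained sketch on a toy tree. It does not use Lucene's classes: `Node`, `estimate`, and `isEstimateAtLeast` are hypothetical names, and the traversal only mimics the idea of stopping once a threshold is reached, not the actual BKD logic in `PointValues`.

```java
import java.util.List;

public class BoundedEstimateSketch {
  /** Toy tree node (hypothetical): a leaf carries a point count, an inner node carries children. */
  static final class Node {
    final long leafCount;
    final List<Node> children;

    Node(long leafCount) {
      this.leafCount = leafCount;
      this.children = List.of();
    }

    Node(List<Node> children) {
      this.leafCount = 0;
      this.children = children;
    }
  }

  /** Number of nodes visited since the counter was last reset; for demonstration only. */
  static int visited;

  /** Unbounded estimate: always walks the whole tree. */
  static long estimate(Node node) {
    visited++;
    long sum = node.leafCount;
    for (Node child : node.children) {
      sum += estimate(child);
    }
    return sum;
  }

  /** Bounded check: stops descending as soon as the running total reaches the threshold. */
  static boolean isEstimateAtLeast(Node root, long threshold) {
    return countUpTo(root, threshold) >= threshold;
  }

  private static long countUpTo(Node node, long threshold) {
    visited++;
    long sum = node.leafCount;
    for (Node child : node.children) {
      if (sum >= threshold) {
        break; // enough points seen, skip the remaining subtrees
      }
      sum += countUpTo(child, threshold - sum);
    }
    return sum;
  }

  public static void main(String[] args) {
    Node root = new Node(List.of(new Node(100), new Node(100), new Node(100), new Node(100)));

    visited = 0;
    long total = estimate(root);
    System.out.println("full estimate=" + total + ", visited=" + visited);

    visited = 0;
    boolean atLeast = isEstimateAtLeast(root, 150);
    System.out.println("at least 150? " + atLeast + ", visited=" + visited);
  }
}
```

In this toy setup the bounded check skips the last two leaves, because the first two already exceed the threshold. Callers that only need to compare the count against a known bound can avoid the full walk in the same way.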

One idea is to start pruning from BooleanScorerSupplier:

lucene/lucene/core/src/java/org/apache/lucene/search/BooleanScorerSupplier.java

Line 318 in 295c5d3

long leadCost =

For example:

    // cost(long) is the proposed (hypothetical) overload that takes an upper bound for pruning.
    final long mustCost =
        subs.get(Occur.MUST).stream().mapToLong(s -> s.cost(Long.MAX_VALUE)).min().orElse(Long.MAX_VALUE);
    final long leadCost =
        subs.get(Occur.FILTER).stream().mapToLong(s -> s.cost(mustCost)).min().orElse(mustCost);
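A self-contained sketch of this proposal follows. `BoundedCostSupplier` and its `cost(long upperBound)` method are hypothetical stand-ins, not part of Lucene's `ScorerSupplier`. The clause costs are toy values, and estimation is simulated in fixed-size steps so that the saved work is visible.

```java
import java.util.List;

public class LeadCostSketch {
  /** Hypothetical stand-in for a scorer supplier whose cost estimate can be cut short. */
  static final class BoundedCostSupplier {
    private final long trueCost;
    int steps; // how many estimation steps were performed, for demonstration

    BoundedCostSupplier(long trueCost) {
      this.trueCost = trueCost;
    }

    /**
     * Simulates an incremental estimate in steps of at most 10. Returns the exact cost when it
     * is below upperBound; otherwise stops early and returns some value >= upperBound.
     */
    long cost(long upperBound) {
      long seen = 0;
      while (seen < trueCost && seen < upperBound) {
        seen += Math.min(10, trueCost - seen);
        steps++;
      }
      return seen;
    }
  }

  /** Computes the smallest cost, passing the best cost so far as the bound for the next clause. */
  static long leadCost(List<BoundedCostSupplier> must, List<BoundedCostSupplier> filter) {
    long leadCost = Long.MAX_VALUE;
    for (BoundedCostSupplier s : must) {
      leadCost = Math.min(leadCost, s.cost(leadCost));
    }
    for (BoundedCostSupplier s : filter) {
      leadCost = Math.min(leadCost, s.cost(leadCost));
    }
    return leadCost;
  }

  public static void main(String[] args) {
    BoundedCostSupplier cheap = new BoundedCostSupplier(30);
    BoundedCostSupplier expensive = new BoundedCostSupplier(1000);
    long lead = leadCost(List.of(cheap), List.of(expensive));
    System.out.println("leadCost=" + lead);
    System.out.println("steps: cheap=" + cheap.steps + ", expensive=" + expensive.steps);
  }
}
```

In this sketch the expensive clause stops after 3 steps instead of 100, because the cheap clause already set the bound to 30. The saving depends on clause order: a cheap clause evaluated first gives later clauses a tighter bound.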

If this seems like a good idea, I'd be happy to implement it.


jpountz · 2024-07-10T07:00:21Z

The idea makes sense to me, but I worry that it wouldn't look good API-wise. I also imagine that the gains would be lower than in #13199 since Weight#scorerSupplier is called one time per segment while comparators used to estimate the point count multiple times per segment.

kkewwei · 2024-07-10T13:28:42Z

@jpountz, thank you for the reply.

I will run a benchmark to see whether it is useful.

kkewwei added the type:enhancement label Jul 9, 2024

github-project-automation bot added this to OpenSearch Lucene & Core Performance Tracking Jul 9, 2024

github-project-automation bot moved this to Open in OpenSearch Lucene & Core Performance Tracking Jul 9, 2024

kkewwei changed the title ~~Pruning of estimating the point value count from BooleanScorerSupplier~~ Pruning of estimating the point value count since BooleanScorerSupplier Jul 9, 2024

kkewwei linked a pull request Nov 12, 2024 that will close this issue

Pruning of estimating the point value count in BooleanScorerSupplier #13988

Open
