Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Misc cleanups to TopScoreDocCollector #13935

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

original-brownbear
Copy link
Member

We can be a little less verbose for one big conditional and can optimize a few obvious spots in the logic (the hit comparison change is particularly helpful here I think).
For wikimedium this is a small but clear win:

                            TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
                          IntNRQ       88.51     (14.0%)       88.65     (14.7%)    0.2% ( -24% -   33%) 0.960
                      AndHighMed      173.56      (3.9%)      174.72      (3.8%)    0.7% (  -6% -    8%) 0.437
            BrowseDateSSDVFacets        1.25      (6.1%)        1.26      (8.7%)    0.7% ( -13% -   16%) 0.672
                      OrHighHigh       85.29      (9.2%)       86.10      (8.1%)    0.9% ( -14% -   20%) 0.624
            HighTermTitleBDVSort       27.17      (5.6%)       27.49      (6.2%)    1.2% ( -10% -   13%) 0.367
                       OrHighMed      277.10      (4.5%)      280.71      (4.6%)    1.3% (  -7% -   10%) 0.199
         AndHighMedDayTaxoFacets       61.39      (2.4%)       62.32      (3.2%)    1.5% (  -3% -    7%) 0.016
     BrowseRandomLabelSSDVFacets        3.33      (4.5%)        3.38      (2.8%)    1.5% (  -5% -    9%) 0.071
                      AndHighLow     1988.25      (4.8%)     2018.70      (4.4%)    1.5% (  -7% -   11%) 0.133
                        PKLookup      240.84      (1.8%)      245.08      (1.8%)    1.8% (  -1% -    5%) 0.000
                     AndHighHigh       52.61      (5.6%)       53.57      (4.8%)    1.8% (  -8% -   12%) 0.114
               HighTermTitleSort       69.04      (3.8%)       70.37      (4.3%)    1.9% (  -5% -   10%) 0.034
                        HighTerm      479.90      (6.3%)      489.18      (5.7%)    1.9% (  -9% -   14%) 0.150
                 LowSloppyPhrase       86.07     (11.0%)       87.75     (11.3%)    2.0% ( -18% -   27%) 0.431
            MedTermDayTaxoFacets       21.43      (3.6%)       21.85      (3.6%)    2.0% (  -5% -    9%) 0.015
                          Fuzzy2       85.21      (3.2%)       86.98      (3.9%)    2.1% (  -4% -    9%) 0.010
                      TermDTSort      177.37      (6.9%)      181.13      (6.8%)    2.1% ( -10% -   17%) 0.169
                         LowTerm      720.26      (5.4%)      735.66      (5.0%)    2.1% (  -7% -   13%) 0.068
                       MedPhrase      332.87      (5.3%)      340.01      (5.7%)    2.1% (  -8% -   13%) 0.082
                HighSloppyPhrase        5.88      (8.2%)        6.02     (10.3%)    2.4% ( -14% -   22%) 0.242
             LowIntervalsOrdered       99.26      (6.0%)      101.70      (5.5%)    2.5% (  -8% -   14%) 0.056
                         Prefix3      424.88      (4.6%)      435.41      (4.1%)    2.5% (  -5% -   11%) 0.011
        AndHighHighDayTaxoFacets       18.44      (3.8%)       18.91      (3.5%)    2.5% (  -4% -   10%) 0.002
                   OrHighNotHigh      434.36      (5.4%)      445.63      (5.0%)    2.6% (  -7% -   13%) 0.026
                      HighPhrase      287.40      (5.3%)      294.89      (6.4%)    2.6% (  -8% -   15%) 0.047
             MedIntervalsOrdered       39.67      (3.9%)       40.70      (3.7%)    2.6% (  -4% -   10%) 0.002
                       OrHighLow      688.03      (4.4%)      706.76      (4.0%)    2.7% (  -5% -   11%) 0.003
                    OrHighNotMed      523.00      (6.5%)      537.25      (7.3%)    2.7% ( -10% -   17%) 0.076
                          Fuzzy1       79.64      (2.1%)       81.81      (1.9%)    2.7% (  -1% -    6%) 0.000
           BrowseMonthSSDVFacets        4.46      (7.0%)        4.59      (5.8%)    2.8% (  -9% -   16%) 0.055
                     LowSpanNear      145.66      (2.2%)      149.71      (2.0%)    2.8% (  -1% -    7%) 0.000
                       LowPhrase      181.90      (3.7%)      187.03      (4.5%)    2.8% (  -5% -   11%) 0.002
               HighTermMonthSort     1467.83      (6.2%)     1510.29      (6.0%)    2.9% (  -8% -   16%) 0.034
                    OrNotHighMed      552.11      (6.8%)      568.79      (6.4%)    3.0% (  -9% -   17%) 0.040
                         Respell       35.74      (1.4%)       36.83      (2.0%)    3.0% (   0% -    6%) 0.000
          OrHighMedDayTaxoFacets        8.67      (5.7%)        8.94      (5.8%)    3.1% (  -7% -   15%) 0.015
                 MedSloppyPhrase        9.79      (6.6%)       10.10      (8.5%)    3.2% ( -11% -   19%) 0.062
                    OrNotHighLow     1115.78      (5.6%)     1151.15      (5.2%)    3.2% (  -7% -   14%) 0.009
                     MedSpanNear       84.96      (5.0%)       87.69      (4.7%)    3.2% (  -6% -   13%) 0.003
            HighIntervalsOrdered       38.24      (5.5%)       39.48      (5.0%)    3.2% (  -6% -   14%) 0.006
     BrowseRandomLabelTaxoFacets        4.34      (5.2%)        4.49      (5.1%)    3.3% (  -6% -   14%) 0.005
                   OrNotHighHigh      316.03      (5.8%)      326.59      (5.8%)    3.3% (  -7% -   15%) 0.010
                         MedTerm      628.56      (8.6%)      650.22      (7.2%)    3.4% ( -11% -   21%) 0.051
                        Wildcard      187.07      (4.3%)      193.66      (4.3%)    3.5% (  -4% -   12%) 0.000
                    HighSpanNear       18.01      (5.0%)       18.67      (4.9%)    3.6% (  -5% -   14%) 0.001
                    OrHighNotLow      580.06      (6.0%)      601.25      (7.4%)    3.7% (  -9% -   18%) 0.015
           BrowseMonthTaxoFacets       12.43     (26.2%)       12.89     (27.5%)    3.7% ( -39% -   77%) 0.538
       BrowseDayOfYearTaxoFacets        5.21      (9.6%)        5.40     (11.5%)    3.7% ( -15% -   27%) 0.115
           HighTermDayOfYearSort      322.24      (8.0%)      334.40      (9.7%)    3.8% ( -12% -   23%) 0.057
            BrowseDateTaxoFacets        5.11      (9.7%)        5.31     (11.6%)    4.0% ( -15% -   27%) 0.094
       BrowseDayOfYearSSDVFacets        4.47      (4.1%)        4.65      (6.9%)    4.2% (  -6% -   15%) 0.001

We can be a little less verbose for one big conditional and
optimize a few obvious spots in the logic.
Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks good to me. FWIW the only change that may affect benchmark results in my opinion in the one to the comparison logic, changes to prePopulate only run once per query and changes to the collector only affect paginated queries, which you none of your tasks exercise.

Copy link

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

@github-actions github-actions bot added the Stale label Nov 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants