The custom `MatchHashesAndScoreQuery` currently uses a counter that allocates an array of shorts with one entry for every document in the segment to track the number of matches per doc, then iterates over that array twice to get the top k docs. This is substantially faster than any sort of hashmap I've found, including primitive hash maps. I've tried hppc, hppcrt, and fastutil, and all of them are at least 2x as slow (e.g., a segment with 1.1M docs gets 40 q/s with arrays, 20 q/s with hashmaps). I figure this kind of array setup won't scale forever, but I don't want to change it until there's some comparably fast alternative.
By far the fastest primitive hashmap implementation I've found is https://github.com/leventov/Koloboke. It contains only maps and sets, and although the original website is down, the code quality is still there.