Commit
Optimize mapSubscript by caching input keys in a hash map. (facebookincubator#7191)

Summary:
Pull Request resolved: facebookincubator#7191

There are cases where the mapSubscript function receives the same (base) map over and over again. In that case, searching the same map again and again is redundant.

This optimization makes mapSubscript a stateful function: if it sees that the same base map is being provided, it caches a materialized version of the input.

The shadowed query e2e runtime drops from 1.50 hours to 16.48 min.

This optimization is enabled for non-bool primitive types only.

A benchmark is added. The function itself becomes much faster. The speedup depends on the number of batches and the size of the base vector. In the production case we have one map with 80k entries, so the speedup for the function is extremely high.

```
============================================================================
[...]hmarks/ExpressionBenchmarkBuilder.cpp     relative  time/iter   iters/s
============================================================================
INTEGER_10000_1##subscript                                 25.15ms     39.76
INTEGER_10000_1##subscriptNocaching                          3.48s   287.58m
INTEGER_10000_1000##subscript                              81.40ms     12.29
INTEGER_10000_1000##subscriptNocaching                       3.30s   303.07m
INTEGER_1000_1##subscript                                  23.00ms     43.48
INTEGER_1000_1##subscriptNocaching                        319.80ms      3.13
INTEGER_1000_1000##subscript                               36.41ms     27.46
INTEGER_1000_1000##subscriptNocaching                     372.11ms      2.69
INTEGER_100_1##subscript                                   22.45ms     44.53
INTEGER_100_1##subscriptNocaching                          52.40ms     19.08
INTEGER_100_1000##subscript                                27.77ms     36.01
INTEGER_100_1000##subscriptNocaching                       57.05ms     17.53
INTEGER_10_1##subscript                                    23.65ms     42.28
INTEGER_10_1##subscriptNocaching                           22.81ms     43.83
INTEGER_10_1000##subscript                                 24.18ms     41.36
INTEGER_10_1000##subscriptNocaching                        23.94ms     41.78
VARCHAR_10000_1##subscript                                 62.20ms     16.08
VARCHAR_10000_1##subscriptNocaching                          4.77s   209.59m
VARCHAR_10000_1000##subscript                             155.07ms      6.45
VARCHAR_10000_1000##subscriptNocaching                       7.21s   138.77m
VARCHAR_1000_1##subscript                                  55.51ms     18.01
VARCHAR_1000_1##subscriptNocaching                        483.55ms      2.07
VARCHAR_1000_1000##subscript                               90.37ms     11.07
VARCHAR_1000_1000##subscriptNocaching                     584.56ms      1.71
VARCHAR_100_1##subscript                                   53.77ms     18.60
VARCHAR_100_1##subscriptNocaching                          69.78ms     14.33
VARCHAR_100_1000##subscript                                66.42ms     15.06
VARCHAR_100_1000##subscriptNocaching                       87.73ms     11.40
VARCHAR_10_1##subscript                                    31.04ms     32.21
VARCHAR_10_1##subscriptNocaching                           33.17ms     30.14
VARCHAR_10_1000##subscript                                 33.75ms     29.63
VARCHAR_10_1000##subscriptNocaching                        35.62ms     28.07
```

Reviewed By: oerling

Differential Revision: D50544250

fbshipit-source-id: 6a901473a2cf018c290cbbbe3b73b13ab22c7bfa
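
For illustration only, the caching idea described in the summary can be sketched roughly as follows. This is not the code from this commit and does not use the Velox API: `BaseMap`, `CachingSubscript`, `lookup`, and `rebuilds` are hypothetical names, the base map is modeled as a plain vector of key/value pairs, and "same base map" is assumed to mean pointer identity.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <optional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a base map: a flat list of key/value pairs.
using BaseMap = std::vector<std::pair<int64_t, int64_t>>;

// Illustrative stateful subscript helper (assumed design, not Velox code).
class CachingSubscript {
 public:
  // Returns the value stored under `key`, or nullopt if the key is absent.
  std::optional<int64_t> lookup(
      const std::shared_ptr<const BaseMap>& base,
      int64_t key) {
    assert(base != nullptr);
    if (base != cachedBase_) {
      // A different base map arrived: materialize it into a hash map once.
      index_.clear();
      index_.reserve(base->size());
      for (const auto& [k, v] : *base) {
        index_.emplace(k, v); // emplace keeps the first occurrence of a key
      }
      cachedBase_ = base;
      ++rebuilds_;
    }
    // Same base map as last time: a hash lookup replaces the linear search.
    auto it = index_.find(key);
    if (it == index_.end()) {
      return std::nullopt;
    }
    return it->second;
  }

  // Number of times the hash map was (re)built; useful for checking reuse.
  int rebuilds() const {
    return rebuilds_;
  }

 private:
  // Holding a shared_ptr keeps the cached map alive, so comparing pointers
  // cannot be fooled by a new map reusing a freed address.
  std::shared_ptr<const BaseMap> cachedBase_;
  std::unordered_map<int64_t, int64_t> index_;
  int rebuilds_ = 0;
};
```

Under these assumptions, repeated lookups against the same base map pay the materialization cost once, which is consistent with the benchmark showing the largest gains for large base maps and little difference for maps with 10 entries.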