
Cache profiling (WIP) #188

Draft · wants to merge 9 commits into master
Conversation

@mrflip (Contributor) commented Aug 7, 2022

Still working on this, but here are some benchmarks for the cache classes.

Results so far: LRUCache is about 1.5-2x as fast as LRUMap across all initial tries, and the difference between LRUCache and LRUCacheWithDelete (or LRUMap and LRUMapWithDelete) is small.

flat writes, 1x gentle read spread LRUCache x 27.93 ops/sec ±1.28% (50 runs sampled)
flat writes, 1x gentle read spread LRUCacheWithDelete x 24.34 ops/sec ±2.58% (44 runs sampled)
flat writes, 1x gentle read spread LRUMap x 12.61 ops/sec ±1.83% (36 runs sampled)
flat writes, 1x gentle read spread LRUMapWithDelete x 11.42 ops/sec ±4.76% (33 runs sampled)
flat writes, 1x gentle read spread LRUCache x 21.32 ops/sec ±1.21% (39 runs sampled)

Random-order accesses (each a get followed by a set) of 60,000 distinct values.
The top 30k of values occur ~70% of the time and the top 10k values 33% of the time.

individual get then set, sharp spread LRUCache x 2,192,917 ops/sec ±1.47% (92 runs sampled)
individual get then set, sharp spread LRUCacheWithDelete x 2,348,333 ops/sec ±1.19% (89 runs sampled)
individual get then set, sharp spread LRUMap x 1,560,643 ops/sec ±2.33% (80 runs sampled)
individual get then set, sharp spread LRUMapWithDelete x 1,408,341 ops/sec ±6.98% (70 runs sampled)
individual get then set, sharp spread LRUCache x 2,331,573 ops/sec ±1.57% (86 runs sampled)

individual get then set, flat spread LRUCache x 3,931,479 ops/sec ±1.92% (85 runs sampled)
individual get then set, flat spread LRUCacheWithDelete x 4,175,905 ops/sec ±1.28% (86 runs sampled)
individual get then set, flat spread LRUMap x 2,242,931 ops/sec ±3.71% (76 runs sampled)
individual get then set, flat spread LRUMapWithDelete x 2,541,944 ops/sec ±3.40% (79 runs sampled)
individual get then set, flat spread LRUCache x 3,748,182 ops/sec ±2.24% (84 runs sampled)

Pre-loaded 30k capacity caches, random-order reads (no writes) of 42,000 distinct values.
The top 30k of values occur ~97% of the time and the top 10k values 75% of the time.

read-only sharp spread LRUCache x 112 ops/sec ±1.28% (72 runs sampled)
read-only sharp spread LRUCacheWithDelete x 100 ops/sec ±1.62% (73 runs sampled)
read-only sharp spread LRUMap x 59.71 ops/sec ±2.01% (62 runs sampled)
read-only sharp spread LRUMapWithDelete x 59.43 ops/sec ±4.16% (61 runs sampled)
read-only sharp spread LRUCache x 102 ops/sec ±3.10% (75 runs sampled)

Pre-loaded 30k capacity caches, random-order reads (no writes) of 60,000 distinct values.
The top 30k of values occur ~70% of the time and the top 10k values 33% of the time.

read-only gentle spread LRUCache x 90.15 ops/sec ±2.23% (73 runs sampled)
read-only gentle spread LRUCacheWithDelete x 86.24 ops/sec ±1.25% (74 runs sampled)
read-only gentle spread LRUMap x 57.07 ops/sec ±1.77% (60 runs sampled)
read-only gentle spread LRUMapWithDelete x 62.46 ops/sec ±1.45% (65 runs sampled)
read-only gentle spread LRUCache x 94.03 ops/sec ±1.14% (76 runs sampled)

I have added the ubiquitous benchmark.js library to power this, if that's OK.

@Yomguithereal I fumbled together a method that gives back a Pareto-distributed random integer, or at least fakes it well enough to serve this benchmark, but I suspect you will have good advice on how to do it right. If this isn't built into pandemonium, it would be nice to have a few such distributions in the toolbag.
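For reference, a fake-Pareto integer sampler of the kind described can be sketched in a few lines (the names and the shaping exponent are illustrative only, not the PR's actual code):

```javascript
// Hypothetical sketch, not the PR's code: fake a Pareto-like spread of
// integer keys by inverse-transform sampling. Raising a uniform draw to
// a power alpha > 1 concentrates mass on the low-numbered keys.
function makeSkewedIntSampler(rng, alpha, max) {
  return function sample() {
    const u = rng(); // uniform in [0, 1)
    return Math.min(Math.floor(max * Math.pow(u, alpha)), max - 1);
  };
}

// With alpha = 3 over 60,000 keys, a bit over half the draws land in
// the lowest 10,000 keys, similar to the "sharp spread" used above.
const sampleKey = makeSkewedIntSampler(Math.random, 3, 60000);
```

Tuning alpha trades off how "sharp" the spread is: alpha = 1 is flat, larger values pile more reads onto the hottest keys.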

@Yomguithereal (Owner)

Hello @mrflip,

Have you checked this: https://github.com/dominictarr/bench-lru? Are you only trying to assess whether the cache is faster than the map and what is the impact of deletion here?

Note that Map is sometimes faster on some Node versions when handling specific keys, such as long strings.

@Yomguithereal (Owner)

> If this isn't built into pandemonium it would be nice to have a few such distributions in the toolbag.

Why not. What would the API look like? Something taking a rng and some alpha params and returning a distributed rng? At some point it might become too statistically involved for pandemonium, which is mostly about algorithms, and better suit another lib such as simple-statistics.

@mrflip (Contributor, Author) commented Aug 8, 2022

This is primarily in service of the later PR for TTL-expiring the cache, and of knowing the tradeoffs of Cache vs Map, not so much of comparisons to other libraries (I found out about this lib via that suite). Mixed in with that PR (I'll pull it back to here) is a shell script that runs any of the benchmarks with the Node profiler turned on.
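A wrapper script of the kind described might look roughly like this (a sketch only; the filenames and layout are assumptions, not the PR's actual script):

```shell
#!/bin/sh
# Hypothetical sketch, not the PR's script: run one benchmark file
# under V8's sampling profiler, then turn the isolate log into a
# human-readable report.
node --prof "$1"                                    # e.g. bench/lru-cache.js
node --prof-process isolate-*-v8.log > profile.txt  # summarize ticks
rm -f isolate-*-v8.log                              # clean up the raw log
```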

I'll make the test exercise long strings, shortish strings and number keys.

Do you have any concerns about adding benchmark.js as a dev dependency? I'd also recommend adding the chai library -- it makes tests more explanatory and beautiful to read. I was surprised by how many more, and deeper, tests we wrote after adopting it. For just one example, it becomes very pleasant to add type guards like expect(result).to.include.keys([...]) or expect(arr).to.be.an('array').with.length(5), which give much clearer errors than the runtime exceptions you hit when you access that wrong-typed return.

mrflip added 7 commits on August 8, 2022 18:14
* LRUCache and family: .inspect limits its output, showing the youngest items, an ellipsis, and the oldest item. Options allow dumping the raw object or controlling the size of the output (and the number of items a console.log will mindlessly iterate over).
* LRUCache and family all have inspect wired up to the magic 'nodejs.util.inspect.custom' symbol property that drives console.log output.
* LRUCache and family all have a summaryString method returning e.g. 'LRUCache[8/200]' for a cache with size 8 and capacity 200, wired to the magic Symbol.toStringTag property that drives string interpolation (partially addresses Yomguithereal#129).
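The Symbol.toStringTag wiring from the last bullet can be illustrated with a minimal stand-in class (a sketch, not the PR's implementation):

```javascript
// Minimal stand-in, not the PR's code: a cache-like object whose string
// tag reports size and capacity, as the commit notes describe.
class DemoCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.size = 0;
  }
  summaryString() {
    return `LRUCache[${this.size}/${this.capacity}]`;
  }
  // Object.prototype.toString consults this symbol property.
  get [Symbol.toStringTag]() {
    return this.summaryString();
  }
}

const cache = new DemoCache(200);
cache.size = 8;
// String(cache) -> '[object LRUCache[8/200]]'
```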
@mrflip force-pushed the CacheProfiling branch 4 times, most recently from 22cea6b to a2d5172 on August 9, 2022 02:29.