This adds time-to-keep expiration to the LRUCacheWithDelete.
Alongside the other ledgers of the backing cache, every write operation saves its update time in a fixed-size array. Regular operations incur no other speed or memory overhead; most importantly, there is no age verification on read. The caller must periodically call the expire method, which evicts items older than the time-to-keep.
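To make the shape of the idea concrete, here is a minimal standalone sketch. It is not the code in this PR and not the library's internals: the class name `ExpiringLruSketch`, the Map-based LRU ordering, and the injected `now` clock are all assumptions made for illustration. It shows the three properties described above: a timestamp written on every `set`, no age check in `get`, and an explicit `expire` sweep.

```javascript
// Illustrative sketch only; names and structure are hypothetical.
class ExpiringLruSketch {
  constructor(capacity, ttk, now = Date.now) {
    this.ttk = ttk;                          // time-to-keep, in clock units
    this.now = now;                          // injectable clock (assumption)
    this.slots = new Map();                  // key -> slot, oldest-used first
    this.values = new Array(capacity);
    this.ages = new Float64Array(capacity);  // the fixed-size age ledger
    this.free = [];
    for (let i = capacity - 1; i >= 0; i--) this.free.push(i);
  }

  set(key, value) {
    let slot = this.slots.get(key);
    if (slot !== undefined) {
      this.slots.delete(key);                // re-inserted below as most recent
    } else {
      if (this.free.length === 0) {
        // Cache is full: evict the least recently used key.
        this.delete(this.slots.keys().next().value);
      }
      slot = this.free.pop();
    }
    this.slots.set(key, slot);
    this.values[slot] = value;
    this.ages[slot] = this.now();            // the only extra work on write
  }

  get(key) {
    const slot = this.slots.get(key);
    if (slot === undefined) return undefined;
    this.slots.delete(key);                  // bump to most recently used
    this.slots.set(key, slot);
    return this.values[slot];                // note: no age check on read
  }

  delete(key) {
    const slot = this.slots.get(key);
    if (slot === undefined) return false;
    this.slots.delete(key);
    this.values[slot] = undefined;
    this.free.push(slot);
    return true;
  }

  // Evict everything written more than ttk ago; returns the eviction count.
  expire() {
    const cutoff = this.now() - this.ttk;
    let evicted = 0;
    for (const [key, slot] of Array.from(this.slots)) {
      if (this.ages[slot] < cutoff) {
        this.delete(key);
        evicted++;
      }
    }
    return evicted;
  }
}
```

With a fake clock, a record older than the time-to-keep is still returned by `get` until the next `expire` call removes it, which is exactly the tradeoff this design accepts.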
Our use case is a 15-minute time-to-keep on caches of 64k capacity, where we can tolerate a modest slop in expiration and small periodic expiry sweeps. The benchmark script demonstrates a 1M-element cache with a 10s ttk being expired every 2 seconds, spending 50-60ms on each expiry sweep. I don't know what use case would need those extremes but there you go.
Limitations:
- The time-to-live (maximum age of any record returned by a read operation) has only the weak guarantee of `(time-to-keep + maximum-delay-between-expires)`.
- The expire operation must be done as a whole; this does not offer to run it in a separate thread or make any attempt to be thread safe. However, the benchmarks in `benchmark/lru-cache` show that a full delete of every item in a 30,000-element cache runs in less than 10 ms on a 2019 MacBook Pro.
- Having two expire operations scheduled in the same thread should be harmless but would give no speedup, so if you found yourself in a situation where the expire was taking significant time it could be big trouble.
- Due to floating-point shenanigans, a custom clock returning fractional times may behave unexpectedly.
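The weak time-to-live bound in the first limitation can be checked with a little arithmetic. The sketch below assumes sweeps fire at exact multiples of a fixed interval and that a sweep evicts a record only when its age is strictly greater than the time-to-keep; `ageAtEviction` is a hypothetical helper, and the one-minute sweep interval is an assumed figure (only the 15-minute time-to-keep comes from our use case).

```javascript
// Age (in ms) at which a record written at `writtenAt` is finally evicted,
// assuming sweeps run at times 0, sweepEvery, 2 * sweepEvery, ...
function ageAtEviction(writtenAt, ttk, sweepEvery) {
  // First sweep whose time is strictly later than writtenAt + ttk.
  const k = Math.floor((writtenAt + ttk) / sweepEvery) + 1;
  return k * sweepEvery - writtenAt;
}

const TTK = 15 * 60 * 1000;    // 15-minute time-to-keep
const SWEEP_EVERY = 60 * 1000; // assumed: sweep once a minute

// Worst case: written exactly on a sweep boundary.
const worst = ageAtEviction(0, TTK, SWEEP_EVERY);          // 960000 ms = 16 minutes
// Near-best case: written just before a sweep boundary.
const best = ageAtEviction(59999, TTK, SWEEP_EVERY);       // 900001 ms
```

Under these assumptions a record can be readable for up to `TTK + SWEEP_EVERY`, so tightening the effective time-to-live means sweeping more often rather than lowering the time-to-keep alone.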
Alternatives considered and discarded:
Potential Opportunities for improvement:
- Reduce the memory footprint of the age ledger by chunking the timestamps to bytes or words. In the case of a byte (256 age bins), a 10-minute ttk, and otherwise default parameters, an expire operation would discard everything older than (ttk - ttk/64) (guaranteeing the ttk but reaping about an additional 1.5% of records). Expire must be called at least once every `2 * ttk` (20 minutes) or the newest records would be indistinguishable from the oldest records. (Everything would work, but it would be a damn shame for cache efficiency.) A word-sized (64k bins) ledger makes the tradeoffs minimal, but offering both choices shouldn't be a problem.
- For every record it expires, we independently call delete, which doctors the read-age linked list. It may make more sense to walk the linked list. I stubbed that out but would value guidance on how to do surgery on the list as it is traversed.
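One possible reading of the byte-sized ledger idea, sketched under assumptions: timestamps are quantized into bins of width ttk/64 and stored modulo 256, and a record is reaped once it is 64 or more bins old. `makeAgeBinner` and `BINS_PER_TTK` are hypothetical names, nothing like this exists in the PR yet, and the exact wraparound limit of a real implementation may differ from this sketch.

```javascript
const BINS_PER_TTK = 64; // assumed bin resolution: ttk / 64 per bin
const BIN_COUNT = 256;   // one byte per record

function makeAgeBinner(ttk) {
  const binWidth = ttk / BINS_PER_TTK;
  return {
    // Quantize a timestamp to a byte-sized bin number (wraps every 256 bins).
    bin(t) {
      return Math.floor(t / binWidth) % BIN_COUNT;
    },
    // Age in bins, computed modulo 256 so wraparound is handled.
    ageInBins(recordBin, nowBin) {
      return (nowBin - recordBin + BIN_COUNT) % BIN_COUNT;
    },
    // A record 64 or more bins old may exceed ttk, so it must be reaped.
    isExpired(recordBin, nowBin) {
      return this.ageInBins(recordBin, nowBin) >= BINS_PER_TTK;
    },
  };
}

const binner = makeAgeBinner(10 * 60 * 1000); // 10-minute ttk, 9375 ms bins
const ledger = new Uint8Array(4);             // one byte per cache slot
ledger[0] = binner.bin(0);                    // record written at t = 0
```

Because only the bin number is stored, a record written late in its bin is reaped when it is just over (ttk - ttk/64) old, which is the small over-reaping described above. In this sketch a record that sits unswept for a full 256 bins (4 × ttk) aliases back to age zero, which is why sweeps cannot be skipped for long.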
The commit history here is a hot mess that includes and crosses with the other PR on profiling. I'll reorganize with squashed commits once I button things up.