Commit
use F14VectorSet in ThreadLocalStatsT to optimize aggregate
Summary: `ThreadLocalStatsT::aggregate()` iterates over the set of stats. The fastest kind of container to loop over is a contiguous one, such as a vector-set, which `F14VectorSet` is. Modifications to the set are rare and their cost is marginal. Iteration is infrequent, but it runs on a schedule and its cost scales with the size of the set.

When `aggregate()` calls `reset()` on each stat in the set, the cost of each `reset()` can dwarf the cost of `iterator::operator++()`. But if `reset()` checks whether there are any updates to extract before doing the extraction, as `TLStatsThreadSafe::TimeSeriesType` does, then a `reset()` that does nothing likely costs about as much as the `iterator::operator++()` that advances to that stat.

Some applications have many outstanding stat objects in the set, most of which are dead. One example is a sharded application in which each process owns a set of shards, shards move between processes frequently, and there are many per-shard counters. In such cases, reducing the CPU cost of `iterator::operator++()` can be beneficial, and selecting a vector-set does this.

Reviewed By: a-square

Differential Revision: D66591772

fbshipit-source-id: 15cf7b4c710dae0fbdf3f95bc2aca670d66669c9
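The trade-off described in the summary can be illustrated with a minimal, self-contained sketch. It is not the folly implementation and does not use `F14VectorSet`: `Counter`, `VectorSet`, and the free function `aggregate()` are hypothetical stand-ins, and the toy set uses a linear search for membership where `F14VectorSet` uses a hash index. The only properties it is meant to show are that the elements sit in contiguous storage, so advancing the iterator is a pointer increment, and that a `reset()` with nothing to extract returns early.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stat: accumulates updates and hands them off on reset().
class Counter {
 public:
  void add(std::int64_t delta) {
    pending_ += delta;
    dirty_ = true;
  }

  // Returns the pending value and clears it. When nothing has been
  // recorded since the last reset(), this is a cheap early-out.
  std::int64_t reset() {
    if (!dirty_) {
      return 0;
    }
    dirty_ = false;
    std::int64_t out = pending_;
    pending_ = 0;
    return out;
  }

 private:
  std::int64_t pending_ = 0;
  bool dirty_ = false;
};

// Toy vector-backed set of stat pointers (an assumption for illustration).
// Storage is contiguous; insert and erase are O(n), which is acceptable
// only because modifications are assumed to be rare.
class VectorSet {
 public:
  bool insert(Counter* c) {
    assert(c != nullptr);
    if (std::find(items_.begin(), items_.end(), c) != items_.end()) {
      return false;  // already present
    }
    items_.push_back(c);
    return true;
  }

  bool erase(Counter* c) {
    auto it = std::find(items_.begin(), items_.end(), c);
    if (it == items_.end()) {
      return false;
    }
    *it = items_.back();  // swap-remove; element order is not preserved
    items_.pop_back();
    return true;
  }

  std::size_t size() const { return items_.size(); }
  auto begin() const { return items_.begin(); }
  auto end() const { return items_.end(); }

 private:
  std::vector<Counter*> items_;
};

// Walks every stat and sums whatever each reset() extracts.
inline std::int64_t aggregate(const VectorSet& stats) {
  std::int64_t total = 0;
  for (Counter* c : stats) {
    total += c->reset();
  }
  return total;
}
```

In this sketch, a set that holds mostly idle counters spends each loop step on one pointer increment plus one branch on `dirty_`, which is the situation where the container's iteration cost becomes a meaningful share of the total.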