Tree-based APC storage engine #37
Conversation
Signed-off-by: Kreiner <[email protected]>
I would prefer to make this a dedicated Storage type (and deprecate the old one) because of the changed Storage layout. Changing it in place would need a new major release, which I would like to avoid. Just out of curiosity, do you have any benchmarks? I would like to see the difference between the old implementation and the new implementation when adding new keys (with N keys already inserted). Otherwise, the code looks fine at first glance, really good job! Thank you!
Hey @thedeacon, ping in case you missed my comment :)
Hey @LKaemmerling I'm still alive!
Update: I have created a new, clean PR #72 which supersedes this PR. I will keep this PR open for a little while so the existing discussion is not lost, but new discussion should happen on PR #72.
As #72 will be merged, I will close this PR :)
Initial pass at implementing the directed acyclic graph described in #36. This code also contains an upper bound on how many failed calls to apcu_cas() in a tight loop will be allowed before an exception is thrown, to prevent an infinite loop arising from some unforeseen APCu misbehavior. I'm not strongly attached to this loop-catcher for the general case if anybody is opposed, but for my needs it was important to have as a defensive-coding pattern.

The DAG is a tree structure, starting with a root node that contains a serialized array of all the metadata keys stored in APCu. Each metadata key contains the typical metadata (name, help, metric type) as well as an array of labelNames associated with the metric being measured and, if the metric is a histogram, an array of buckets. The metadata APCu key names end with ":meta".
For each labelNames item, a ":label" APCu key is created, listing all labels seen for that labelName. Then, for each permutation of labels, a ":value" APCu key is created once the label-tuple contains a value. Similarly, for each histogram bucket assigned to a label-tuple, a bucket-specific ":value" APCu key is created once the bucket contains a value. This also applies to the special buckets "sum" and "+Inf", which are not stored as part of the metric's metadata APCu key.
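A rough sketch of what this key-naming scheme could look like (the prefix and helper names here are illustrative assumptions, not the PR's actual identifiers):

```php
<?php
// Illustrative key-naming sketch; prefix and helper names are assumptions.
const APC_PREFIX = 'prom';

/** Metadata node, e.g. "prom:counter:app_requests_total:meta" */
function metaKey(string $type, string $name): string
{
    return APC_PREFIX . ":$type:$name:meta";
}

/** One ":label" key per labelName, listing the labels seen for it. */
function labelKey(string $type, string $name, string $labelName): string
{
    return APC_PREFIX . ":$type:$name:$labelName:label";
}

/** One ":value" key per label-tuple (and per bucket for histograms). */
function valueKey(string $type, string $name, array $labelValues, ?string $bucket = null): string
{
    $suffix = $bucket === null ? '' : ":$bucket";
    return APC_PREFIX . ":$type:$name:" . json_encode($labelValues) . $suffix . ':value';
}
```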
The described structure allows all APCu keys containing values to be enumerated quickly: read the root node, load each metadata APCu key it references, and programmatically generate every ":label" and ":value" APCu key that could exist based on the contents of that metadata key. Wiping all data follows a similar pattern: enumerate all keys that contain data and delete them, starting at the leaf nodes and working back toward the root, deleting the root node last of all.
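In pseudocode-ish PHP, the wipe walk could look like the following sketch (rootKey(), enumerateValueKeys(), and enumerateLabelKeys() are hypothetical helpers standing in for the key layout described above):

```php
<?php
// Sketch of the wipe pattern: delete leaves first, root node last.
function wipeStorage(): void
{
    $root = apcu_fetch(rootKey());      // array of all ":meta" key names
    if ($root === false) {
        return;                         // nothing stored yet
    }

    foreach ($root as $metaKeyName) {
        $meta = apcu_fetch($metaKeyName); // name, help, type, labelNames, buckets
        if ($meta === false) {
            continue;
        }

        // Delete every ":value" and ":label" key that could exist for this metric.
        foreach (enumerateValueKeys($meta) as $valueKeyName) {
            apcu_delete($valueKeyName);
        }
        foreach (enumerateLabelKeys($meta) as $labelKeyName) {
            apcu_delete($labelKeyName);
        }

        apcu_delete($metaKeyName);      // then the metadata node itself
    }

    apcu_delete(rootKey());             // root node goes last
}
```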
Note there is a small race condition in addItemToKey(): between reading the array and writing it back, another thread could add an element to the contents of the key. For my purposes this is acceptable, because if a metadata pointer gets deleted from the root node, for instance, the next thread to write to that metric will re-create the missing array pointer and will likely succeed. When hundreds of threads per second are writing metrics, the window in which this pointer is missing will be exceedingly small, and when traffic is light, chances are good that the race will not be triggered at all. However, it could be improved. I'm still thinking about ways to serialize access to the critical part of that function; ideas are welcome!
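One possible direction, offered only as a sketch and not as part of this PR: serialize the read-modify-write with a short-lived lock key created via apcu_add(), which is atomic. The function name, spin count, and TTL below are illustrative assumptions.

```php
<?php
// Sketch of an apcu_add()-based lock around the read-modify-write.
function addItemToKeyLocked(string $key, string $item): void
{
    $lockKey = $key . ':lock';

    // apcu_add() succeeds for exactly one caller when the lock key is absent.
    // The 1-second TTL keeps a crashed writer from wedging the key forever.
    for ($i = 0; $i < 1000 && !apcu_add($lockKey, 1, 1); $i++) {
        usleep(100); // brief back-off before retrying
    }

    try {
        $items = apcu_fetch($key);
        if ($items === false) {
            $items = [];
        }
        if (!in_array($item, $items, true)) {
            $items[] = $item;
            apcu_store($key, $items);
        }
    } finally {
        apcu_delete($lockKey);
    }
}
```

The trade-off is that every writer now contends on the lock key, so the lock-free apcu_cas() path may still be preferable for hot counters; this only narrows the window for the array-mutation case.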