Move epoch garbage collection to flize #95
Some more on the issue with the I've seen this panic from As I didn't (or at least didn't intend to) touch the underlying logic of the map implementation, in particular soundness of |
cc @xacrimon for both concurrent maps and |
There seems to also be some work left with updated trait bounds for non-default features, so I'll add that to the list. |
I've recently made a change which may fix a bug related to invalid advancing of the epoch. You may want to retry the tests with master if you think flize may be the issue. You may also want to try turning off default features for flize. |
Gave that a quick try, master + no features still triggers the error. |
I think I may have found a bug in flize now that I am looking through the code. Seems to be very rare but plausible in some conditions. I will be opening an issue and hopefully I can fix it today or tomorrow. |
Created issue at flize with link https://github.com/xacrimon/flize/issues/69. I will reply here once progress is made. |
This is really cool — thank you for putting in the effort! ❤️ I'm very swamped with finishing up my thesis at the moment, but will get back to this once my schedule frees up a little. |
@domenicquirl I've now published v4.2.0 "Blaze It" which contains fixes for a few bugs previously identified. |
After a quick check with the new version (nice release name 😉) the panics still occur with the |
While going through the comments again I noticed that, since |
I've released 4.2.1 which fixes yet another bug. Encouraging not using the global collector and using structure specific ones to be able to lift those static bounds is one of the major points. Flize should be easy to use without a global collector. |
Working on yet another patch now |
Hm, I don't get these clippy warnings locally. But I've made note of them and will go through them another time. @xacrimon I updated to current The miri failure is also due to using Finally, current bench since who doesn't like more numbers: INSERT (RwLockStd, Contrie, Flurry, DashMapV3), READ (RwLockStd, Contrie, Flurry, DashMapV3), UPDATE (RwLockStd, Contrie, Flurry, DashMapV3) |
The miri failure can be fixed in testing by disabling the fast-barrier feature so it doesn't attempt to use the accelerated OS barriers strategy. I'll take a look at ASAN/LSAN but are you recompiling std when getting those warnings? |
I've also added accelerated barriers for macOS now which may be of interest. |
Hm, this would be a separate |
It's done by passing |
Ok so idk why (a) I am not getting half the clippy warnings even with nightly and (b) it shows them one after the other where I do one |
Eek. I've created the |
I tried it out locally and it mostly works fine. Will run benchmarks again when I can do so on the same machine in idle. In one of the test runs I got a spurious |
Still a lot of complaints it seems :/ |
Now that #102 is merged I think this can be closed, unless you're still planning on pursuing this @domenicquirl? |
Yes, I'll close it. One of the primary motivations of this PR was to make It's also been quite some time since last working on this, and part of why this PR stalled was that there were sanitizer errors with the Thank You for coming up with an even better way (and continuing to iterate and improve on it in #80)! |
After chatting with Acrimon on Discord, we added the missing features (mostly unprotected shields) to `flize`, making it possible to replace `crossbeam_epoch` with it in `flurry`. The main motivation behind this change is to overcome some of our performance issues that seem to be related to garbage collection (see #50, jonhoo/bustle#2). There's some amount of stuff going on here, but I tried to keep the code as familiar as possible. Preliminary evaluation:
Disadvantages
- `flize`'s `Shield` is a trait with an associated lifetime bound, which permeates into a lot of our types and functions now, making it harder to understand and maintain them (in particular for new contributors). For example, `NodeIter<'g, K, V>` is now `NodeIter<'m, 'g, K, V, SH>`, where `SH: flize::Shield<'m>`. Or, for a function example, my now favourite function (guard lifetimes elided):
- `flize` doesn't have the concept of a global default collector. `MapRef` and `SetRef` cannot be constructed `with_guard` anymore
- `ebr` module to extend `flize` to our needs, which is some additional code to maintain
- `Debug` impls (though that can be fixed very easily by implementing `Debug` for `flize` types, mainly atomics)
Advantages
- `Owned` Type: `flize` only uses `Atomic` and `Shared`. This removes a lot of `Owned::new(...).into_shared(&guard)`, which almost always happens together in `flurry`, and has proven to be quite ergonomic.
- `disallow_evil`).
- `flize` is fast, which means `flurry` is now also fast! The map now has usable performance comparable to other concurrent maps (see bench results below)
Overall, even if the `flize` implementation is slightly less ergonomic than the current `crossbeam` one, this change is a massive win in my eyes. I'd much rather have a library which is a valid alternative for real use than a "useless" one which is easier to maintain. Big air-quotes here, as I am very well aware that this started as a learning project, given that it did for me as well. But even for that, it always made work done on `flurry` feel unrewarding in some respect, just because "it's not like anyone will ever use this anyways".
Benchmarks
As mentioned above, running (the first version of) https://github.com/xacrimon/conc-map-bench indicates major performance improvements for most tasks. Our main slow points still seem to be inserts in general and single-thread overhead, but for other tasks and higher thread counts see for yourself:
Bench results on my machine
INSERT
-- RwLockStd
25165824 operations across 1 thread(s) in 4.1842974s; time/op = 165ns
25165824 operations across 2 thread(s) in 4.5482347s; time/op = 179ns
25165824 operations across 3 thread(s) in 4.8963708s; time/op = 193ns
25165824 operations across 4 thread(s) in 4.9127281s; time/op = 194ns
25165824 operations across 5 thread(s) in 5.0914631s; time/op = 201ns
25165824 operations across 6 thread(s) in 5.1736532s; time/op = 204ns
25165824 operations across 7 thread(s) in 5.3755057s; time/op = 212ns
25165824 operations across 8 thread(s) in 5.6824829s; time/op = 225ns
-- CHashMap
25165824 operations across 1 thread(s) in 5.5475931s; time/op = 219ns
25165824 operations across 2 thread(s) in 4.1545495s; time/op = 164ns
25165824 operations across 3 thread(s) in 3.6631931s; time/op = 145ns
25165824 operations across 4 thread(s) in 3.5636953s; time/op = 141ns
25165824 operations across 5 thread(s) in 3.8907204s; time/op = 154ns
25165824 operations across 6 thread(s) in 3.7126855s; time/op = 147ns
25165824 operations across 7 thread(s) in 3.7939694s; time/op = 150ns
25165824 operations across 8 thread(s) in 3.6153663s; time/op = 143ns
-- Contrie
25165824 operations across 1 thread(s) in 19.3256565s; time/op = 766ns
25165824 operations across 2 thread(s) in 9.8763793s; time/op = 391ns
25165824 operations across 3 thread(s) in 6.5838239s; time/op = 261ns
25165824 operations across 4 thread(s) in 5.4415494s; time/op = 215ns
25165824 operations across 5 thread(s) in 4.9086136s; time/op = 194ns
25165824 operations across 6 thread(s) in 5.0110217s; time/op = 198ns
25165824 operations across 7 thread(s) in 4.4787998s; time/op = 177ns
25165824 operations across 8 thread(s) in 5.0837114s; time/op = 201ns
-- Flurry
25165824 operations across 1 thread(s) in 11.0327772s; time/op = 438ns
25165824 operations across 2 thread(s) in 4.9960757s; time/op = 197ns
25165824 operations across 3 thread(s) in 3.5890423s; time/op = 142ns
25165824 operations across 4 thread(s) in 2.6178409s; time/op = 103ns
25165824 operations across 5 thread(s) in 2.4521642s; time/op = 96ns
25165824 operations across 6 thread(s) in 2.350806s; time/op = 92ns
25165824 operations across 7 thread(s) in 2.1080351s; time/op = 83ns
25165824 operations across 8 thread(s) in 2.1402698s; time/op = 84ns
-- DashMapV3
25165824 operations across 1 thread(s) in 4.9713333s; time/op = 196ns
25165824 operations across 2 thread(s) in 3.0867169s; time/op = 122ns
25165824 operations across 3 thread(s) in 2.4237726s; time/op = 95ns
25165824 operations across 4 thread(s) in 2.1705871s; time/op = 85ns
25165824 operations across 5 thread(s) in 1.8898701s; time/op = 74ns
25165824 operations across 6 thread(s) in 1.8302189s; time/op = 71ns
25165824 operations across 7 thread(s) in 1.9578536s; time/op = 77ns
25165824 operations across 8 thread(s) in 2.5197997s; time/op = 99ns
READ
-- RwLockStd
25165824 operations across 1 thread(s) in 2.8450142s; time/op = 112ns
25165824 operations across 2 thread(s) in 2.6005495s; time/op = 102ns
25165824 operations across 3 thread(s) in 2.9530743s; time/op = 116ns
25165824 operations across 4 thread(s) in 3.7836839s; time/op = 150ns
25165824 operations across 5 thread(s) in 3.1795386s; time/op = 126ns
25165824 operations across 6 thread(s) in 3.3245617s; time/op = 131ns
25165824 operations across 7 thread(s) in 3.9840663s; time/op = 158ns
25165824 operations across 8 thread(s) in 3.6619738s; time/op = 145ns
-- CHashMap
25165824 operations across 1 thread(s) in 6.1441085s; time/op = 243ns
25165824 operations across 2 thread(s) in 4.1612992s; time/op = 164ns
25165824 operations across 3 thread(s) in 3.2587518s; time/op = 129ns
25165824 operations across 4 thread(s) in 3.0584123s; time/op = 121ns
25165824 operations across 5 thread(s) in 3.4522208s; time/op = 136ns
25165824 operations across 6 thread(s) in 3.4502188s; time/op = 136ns
25165824 operations across 7 thread(s) in 3.5243859s; time/op = 139ns
25165824 operations across 8 thread(s) in 3.2542338s; time/op = 129ns
-- Contrie
25165824 operations across 1 thread(s) in 4.4747205s; time/op = 176ns
25165824 operations across 2 thread(s) in 2.4922609s; time/op = 98ns
25165824 operations across 3 thread(s) in 1.6747643s; time/op = 65ns
25165824 operations across 4 thread(s) in 1.5041451s; time/op = 59ns
25165824 operations across 5 thread(s) in 1.0541919s; time/op = 41ns
25165824 operations across 6 thread(s) in 925.9906ms; time/op = 36ns
25165824 operations across 7 thread(s) in 933.8123ms; time/op = 37ns
25165824 operations across 8 thread(s) in 1.0068524s; time/op = 39ns
-- Flurry
25165824 operations across 1 thread(s) in 2.5889212s; time/op = 102ns
25165824 operations across 2 thread(s) in 1.7753731s; time/op = 69ns
25165824 operations across 3 thread(s) in 1.6216983s; time/op = 63ns
25165824 operations across 4 thread(s) in 1.4349656s; time/op = 56ns
25165824 operations across 5 thread(s) in 1.4070166s; time/op = 55ns
25165824 operations across 6 thread(s) in 1.0889074s; time/op = 42ns
25165824 operations across 7 thread(s) in 1.068765s; time/op = 41ns
25165824 operations across 8 thread(s) in 1.065837s; time/op = 41ns
-- DashMapV3
25165824 operations across 1 thread(s) in 2.6373341s; time/op = 104ns
25165824 operations across 2 thread(s) in 1.6152775s; time/op = 63ns
25165824 operations across 3 thread(s) in 1.2954522s; time/op = 50ns
25165824 operations across 4 thread(s) in 1.2461302s; time/op = 48ns
25165824 operations across 5 thread(s) in 1.1239509s; time/op = 43ns
25165824 operations across 6 thread(s) in 1.1168707s; time/op = 43ns
25165824 operations across 7 thread(s) in 1.2209516s; time/op = 47ns
25165824 operations across 8 thread(s) in 1.1155557s; time/op = 43ns
UPDATE
-- RwLockStd
25165824 operations across 1 thread(s) in 3.1979387s; time/op = 126ns
25165824 operations across 2 thread(s) in 3.6861083s; time/op = 146ns
25165824 operations across 3 thread(s) in 3.3732953s; time/op = 133ns
25165824 operations across 4 thread(s) in 3.3740196s; time/op = 133ns
25165824 operations across 5 thread(s) in 3.5900936s; time/op = 142ns
25165824 operations across 6 thread(s) in 3.7601384s; time/op = 149ns
25165824 operations across 7 thread(s) in 3.8402625s; time/op = 152ns
25165824 operations across 8 thread(s) in 3.9234121s; time/op = 155ns
-- CHashMap
25165824 operations across 1 thread(s) in 5.5699635s; time/op = 220ns
25165824 operations across 2 thread(s) in 3.7534172s; time/op = 148ns
25165824 operations across 3 thread(s) in 3.340075s; time/op = 132ns
25165824 operations across 4 thread(s) in 3.5080474s; time/op = 139ns
25165824 operations across 5 thread(s) in 3.3624801s; time/op = 133ns
25165824 operations across 6 thread(s) in 3.5680631s; time/op = 141ns
25165824 operations across 7 thread(s) in 3.6638752s; time/op = 145ns
25165824 operations across 8 thread(s) in 3.5436977s; time/op = 140ns
-- Contrie
25165824 operations across 1 thread(s) in 8.0046352s; time/op = 317ns
25165824 operations across 2 thread(s) in 4.365312s; time/op = 172ns
25165824 operations across 3 thread(s) in 2.9479402s; time/op = 116ns
25165824 operations across 4 thread(s) in 2.2043506s; time/op = 87ns
25165824 operations across 5 thread(s) in 2.1326051s; time/op = 84ns
25165824 operations across 6 thread(s) in 1.5834109s; time/op = 62ns
25165824 operations across 7 thread(s) in 1.439043s; time/op = 56ns
25165824 operations across 8 thread(s) in 1.5088232s; time/op = 59ns
-- Flurry
25165824 operations across 1 thread(s) in 3.8062255s; time/op = 151ns
25165824 operations across 2 thread(s) in 2.4032358s; time/op = 95ns
25165824 operations across 3 thread(s) in 1.8500285s; time/op = 72ns
25165824 operations across 4 thread(s) in 1.5584881s; time/op = 61ns
25165824 operations across 5 thread(s) in 1.4674148s; time/op = 57ns
25165824 operations across 6 thread(s) in 1.3686862s; time/op = 53ns
25165824 operations across 7 thread(s) in 1.4423715s; time/op = 56ns
25165824 operations across 8 thread(s) in 1.4779185s; time/op = 57ns
-- DashMapV3
25165824 operations across 1 thread(s) in 3.1669407s; time/op = 125ns
25165824 operations across 2 thread(s) in 2.2521689s; time/op = 89ns
25165824 operations across 3 thread(s) in 1.5633163s; time/op = 61ns
25165824 operations across 4 thread(s) in 1.4805222s; time/op = 58ns
25165824 operations across 5 thread(s) in 1.4239066s; time/op = 55ns
25165824 operations across 6 thread(s) in 1.3356814s; time/op = 52ns
25165824 operations across 7 thread(s) in 1.331327s; time/op = 52ns
25165824 operations across 8 thread(s) in 1.816247s; time/op = 71ns
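One rough way to read the INSERT numbers above is to compare each map's 1-thread and 8-thread totals. The sketch below does that arithmetic for a few rows copied from the results; the `Sample` struct and the `speedup`/`efficiency` helpers are names made up for this illustration and are not part of conc-map-bench.

```rust
/// One measurement from the listing above: total wall time for a thread count.
/// (Hypothetical helper type, only for this illustration.)
struct Sample {
    threads: u32,
    seconds: f64,
}

/// How many times faster `scaled` finished the same number of operations.
fn speedup(baseline: &Sample, scaled: &Sample) -> f64 {
    baseline.seconds / scaled.seconds
}

/// Speedup divided by the factor of added threads (1.0 would be perfect scaling).
fn efficiency(baseline: &Sample, scaled: &Sample) -> f64 {
    let thread_factor = scaled.threads as f64 / baseline.threads as f64;
    speedup(baseline, scaled) / thread_factor
}

fn main() {
    // (map, 1-thread seconds, 8-thread seconds), copied from the INSERT results.
    let rows = [
        ("RwLockStd", 4.1842974, 5.6824829),
        ("Flurry", 11.0327772, 2.1402698),
        ("DashMapV3", 4.9713333, 2.5197997),
    ];

    for (name, one, eight) in rows {
        let base = Sample { threads: 1, seconds: one };
        let scaled = Sample { threads: 8, seconds: eight };
        println!(
            "{name}: speedup {:.2}x, efficiency {:.2}",
            speedup(&base, &scaled),
            efficiency(&base, &scaled)
        );
    }
}
```

A speedup below 1.0 (as for `RwLockStd` here) means the 8-thread run took longer than the single-threaded one. A high speedup combined with a slow 1-thread time matches the single-thread overhead mentioned above.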
Roadmap for remaining work/discussion:
- `concurrent_associate` test to fail (was on the `flize` side)
- `SharedExt::into_box` should be `unsafe`: Originally I thought I'd create a wrapper struct around `flize::Shared` which then always heap allocates new shared objects. However, the shared pointers are passed as a parameter a lot, making `Deref` not play out as nicely as I hoped from a code perspective. So instead, the current code uses an extension trait, which makes it technically possible to construct `Shared`s to other-than-heap data and then call `into_box`. This is not public interface, but may be a consideration also for other developers.
- Maybe add `Debug` impls to `flize` first and remove the manual impls in this PR
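To make the `into_box` concern from the roadmap concrete, here is a minimal, std-only sketch of the extension-trait approach. The `Shared` type below is a hypothetical stand-in written for this example, not `flize::Shared`, and the method names `from_ptr`, `as_ptr` and `boxed` are assumptions. Only the general shape (an extension trait with an `unsafe fn into_box`) follows the description above.

```rust
/// Hypothetical stand-in for a foreign shared-pointer type.
/// This is NOT flize's real `Shared`; it only wraps a raw pointer.
struct Shared<T> {
    ptr: *mut T,
}

impl<T> Shared<T> {
    /// Wraps any raw pointer, whether or not it points to the heap.
    fn from_ptr(ptr: *mut T) -> Self {
        Shared { ptr }
    }

    fn as_ptr(&self) -> *mut T {
        self.ptr
    }
}

/// Extension trait adding heap-allocation helpers to the foreign type.
trait SharedExt<T> {
    /// Heap-allocates `value` and returns a pointer to it.
    fn boxed(value: T) -> Self;

    /// Reclaims ownership of the pointee.
    ///
    /// # Safety
    /// The pointer must originate from `boxed` (i.e. from `Box::into_raw`)
    /// and must not be used by anyone else afterwards.
    unsafe fn into_box(self) -> Box<T>;
}

impl<T> SharedExt<T> for Shared<T> {
    fn boxed(value: T) -> Self {
        Shared::from_ptr(Box::into_raw(Box::new(value)))
    }

    unsafe fn into_box(self) -> Box<T> {
        // SAFETY: upheld by the caller, see the trait documentation.
        unsafe { Box::from_raw(self.as_ptr()) }
    }
}

fn main() {
    let shared = Shared::boxed(String::from("node"));
    // SAFETY: `shared` came from `boxed` and is consumed by this call.
    let owned: Box<String> = unsafe { shared.into_box() };
    println!("{owned}");

    // The extension trait cannot prevent this: a `Shared` to stack data.
    let mut on_stack = 5u32;
    let not_heap = Shared::from_ptr(&mut on_stack as *mut u32);
    // Calling `not_heap.into_box()` would be undefined behaviour, which is
    // why the method is marked `unsafe` in this sketch.
    let _ = not_heap.as_ptr();
}
```

A wrapper struct that only exposes `boxed` as a constructor would rule out the stack-pointer case at the type level, at the cost of the `Deref` friction described in the roadmap item.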