Allow optimistic mirror of kbuckets with read lock only #229

emhane · 2024-01-22T19:48:12Z

Description

allows optimistic and less blocking interaction with discv5

Notes & open questions

Change checklist

Self-review
Documentation updates if relevant
Tests if relevant

jxs

Hi Emilia, doesn't the kbuckets function cover these usecases?

emhane · 2024-01-25T19:30:47Z

Hi Emilia, doesn't the kbuckets function cover these usecases?

hey :) one of them, yes true, NodeStatus of entries would get cloned on cloning kbucktes. still, what else gets cloned? I want to return as little mem as possible from discv5 instance. so the node_ids only call is essential.

also it occurred to me, a try_r and try_w method in the api would be super useful. then we can implement async locks in the app using discv5. maybe you would like to implement this?

jxs · 2024-01-26T17:34:31Z

hey :) one of them, yes true, NodeStatus of entries would get cloned on cloning kbucktes. still, what else gets cloned? I want to return as little mem as possible from discv5 instance. so the node_ids only call is essential.

but table_entries_rlock also clones the entry, with the added disadvantage that you are iterating again just to collect after, to probably iterate again after right?

Regarding little as memory, a clone of KBucketsTable is returned with kbuckets which can last as little as a scope to be iterated and gathered the data required, without the double iteration. Can I ask you the reason for the memory constraints? To avoid the double iteration you can follow this technique and return impl Iterator<Item =NodeId>

emhane · 2024-01-28T17:21:28Z

hey :) one of them, yes true, NodeStatus of entries would get cloned on cloning kbucktes. still, what else gets cloned? I want to return as little mem as possible from discv5 instance. so the node_ids only call is essential.

but table_entries_rlock also clones the entry, with the added disadvantage that you are iterating again just to collect after, to probably iterate again after right?

but it only clones the node id (32 bytes) and not the enr (up to 300 bytes encoded length) for each entry

Regarding little as memory, a clone of KBucketsTable is returned with kbuckets which can last as little as a scope to be iterated and gathered the data required, without the double iteration. Can I ask you the reason for the memory constraints? To avoid the double iteration you can follow this technique and return impl Iterator<Item =NodeId>

I want to use this api in reth, using minimal resources is baseline for all client development.

emhane · 2024-01-28T17:30:41Z

src/discv5.rs

+    ///     .map(|entry| *entry.node.key.preimage())
+    ///     .collect::<Vec<_>>());
+    /// ```
+    pub fn with_kbuckets<F, T>(&self, f: F) -> T


replaced the methods for this one which covers all cases

AgeManning · 2024-01-29T19:47:48Z

The idea is to not Clone the kbuckets table right?

I was thinking about this, and the data structure for kbuckets ins't great. But I estimate in most cases, we usually have a couple of buckets full, then the rest dying off. So we probably have like 50 ENRs in a table. If they are maxed out that's like 300 bytes per ENR which is like 15kb per table. So worst case we are prob copying 15kb in memory, which isn't too bad imo, but I can see the desire to avoid the clone.

I think in this case, an easy solution would be just to expose self.kbuckets as an Arc<RwLock> right? Because then you could do whatever you like to the Arc?

I think the original problem with exposing this, is that the public API is then tagged to the Arc and the exact version of parking_lot that exposes RwLock, so we'd have to re-export parking_lot. As this current PR exposes this now anyway, maybe we just expose self.kbuckets?

Will that satisfy all the requirements you need?

emhane · 2024-01-29T20:44:11Z

The idea is to not Clone the kbuckets table right?

I was thinking about this, and the data structure for kbuckets ins't great. But I estimate in most cases, we usually have a couple of buckets full, then the rest dying off. So we probably have like 50 ENRs in a table. If they are maxed out that's like 300 bytes per ENR which is like 15kb per table. So worst case we are prob copying 15kb in memory, which isn't too bad imo, but I can see the desire to avoid the clone.

I think in this case, an easy solution would be just to expose self.kbuckets as an Arc right? Because then you could do whatever you like to the Arc?

I think the original problem with exposing this, is that the public API is then tagged to the Arc and the exact version of parking_lot that exposes RwLock, so we'd have to re-export parking_lot. As this current PR exposes this now anyway, maybe we just expose self.kbuckets?

Will that satisfy all the requirements you need?

totally agree, I changed the code in this pr to do that yesterday

AgeManning · 2024-01-30T07:19:30Z

Yeah, so what about something like this?:

    /// Returns the routing table of the discv5 service
    pub fn kbuckets_arc(&self) -> Arc<RwLock<KBucketsTable<NodeId, Enr>> {
        self.kbuckets.clone()
    }

emhane · 2024-02-01T18:55:26Z

Yeah, so what about something like this?:

    /// Returns the routing table of the discv5 service
    pub fn kbuckets_arc(&self) -> Arc<RwLock<KBucketsTable<NodeId, Enr>> {
        self.kbuckets.clone()
    }

the code I have made is essentially the same thing, besides it's safer, since any lock taken is at least dropped at the end of closure.

AgeManning · 2024-02-08T01:55:34Z

Yeah, this api has a few downsides:

It exposes the specific version of parking_lot and Arc the library is using
It allows the user to potentially create dead-locks

I don't mind adding it in, if you really need it.

The downside of putting a closure into the API, is that its now limited to a FnOnce. If another user comes along that needs something more generic, Fn FnMut etc, then we'd have to create more APIs. Where-as just giving them the Arc, they can do with whatever they want with it. Its up to them to not cause deadlocks by holding locks imo.

I'm not sure other people should use this for the downsides above. But if you need it in reth, happy to add it in.

emhane · 2024-02-08T09:28:36Z

Yeah, this api has a few downsides:

It exposes the specific version of parking_lot and Arc the library is using

the same goes for the Enr type used by discv5, had some issues due to this. would have liked to set another const generic than the CombinedKey type, will make an issue to use this as default rather than the only type supported by discv5.

sure it exposes the specific parking lot version. this api allows to call try_read and try_write, I want that so I can make an async wrapper for it.

It allows the user to potentially create dead-locks

yes, In the docs I added a note on the user's responsibility to drop the lock. the risk is in practice minimised by locking in a closure's scope like this in my experience.

I don't mind adding it in, if you really need it.

it would be very useful to have that flexibility. I could even feature gate it if you like, have some feature called "dev-mode", wdyt?

The downside of putting a closure into the API, is that its now limited to a FnOnce. If another user comes along that needs something more generic, Fn FnMut etc, then we'd have to create more APIs. Where-as just giving them the Arc, they can do with whatever they want with it. Its up to them to not cause deadlocks by holding locks imo.

to take a read or write lock we just need a ref to self FnOnce. why would we need FnMut or Fn here?

I'm not sure other people should use this for the downsides above. But if you need it in reth, happy to add it in.

that would be great, thanks.

Allow optimistic mirror of kbuckets with read lock only

a29538d

emhane force-pushed the rlock-on-kbuckets-in-api-call branch from e1cc1b9 to a29538d Compare January 22, 2024 20:10

jxs reviewed Jan 25, 2024

View reviewed changes

emhane added 2 commits January 28, 2024 17:36

Publicly expose kbuckets as closure

ccf5979

Add code example to docs

8748e15

emhane commented Jan 28, 2024

View reviewed changes

fixup! Add code example to docs

38cbfcf

emhane requested a review from jxs January 28, 2024 17:34

emhane requested a review from AgeManning February 17, 2024 01:22

AgeManning approved these changes Feb 21, 2024

View reviewed changes

AgeManning merged commit 3c0d3d7 into sigp:master Feb 21, 2024
6 checks passed

emhane deleted the rlock-on-kbuckets-in-api-call branch March 2, 2024 23:43

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow optimistic mirror of kbuckets with read lock only #229

Allow optimistic mirror of kbuckets with read lock only #229

emhane commented Jan 22, 2024 •

edited

Loading

jxs left a comment

emhane commented Jan 25, 2024 •

edited

Loading

jxs commented Jan 26, 2024

emhane commented Jan 28, 2024

emhane Jan 28, 2024

AgeManning commented Jan 29, 2024

emhane commented Jan 29, 2024

AgeManning commented Jan 30, 2024

emhane commented Feb 1, 2024

AgeManning commented Feb 8, 2024

emhane commented Feb 8, 2024

Allow optimistic mirror of kbuckets with read lock only #229

Allow optimistic mirror of kbuckets with read lock only #229

Conversation

emhane commented Jan 22, 2024 • edited Loading

Description

Notes & open questions

Change checklist

jxs left a comment

Choose a reason for hiding this comment

emhane commented Jan 25, 2024 • edited Loading

jxs commented Jan 26, 2024

emhane commented Jan 28, 2024

emhane Jan 28, 2024

Choose a reason for hiding this comment

AgeManning commented Jan 29, 2024

emhane commented Jan 29, 2024

AgeManning commented Jan 30, 2024

emhane commented Feb 1, 2024

AgeManning commented Feb 8, 2024

emhane commented Feb 8, 2024

emhane commented Jan 22, 2024 •

edited

Loading

emhane commented Jan 25, 2024 •

edited

Loading