Only sync with top peers #2983

thomash-acinq · 2025-01-17T15:02:48Z

By default we sync with every peer when reconnecting, which can be a lot. We now only sync on reconnection with our top peers (by capacity of our shared channels).

t-bast

Unfortunately moving this logic entirely to the Router doesn't work, because it doesn't take into account our private channels (which are added to the privateChannels map after being reconnected). It doesn't even take into account our new channels, because they're added to the publicChannels list only after being announced, so when a node starts without any channel, it won't sync the graph at all with anyone with your change because its new channels will only be taken into account after a node restart.

I think we must start with the channels information we have at the initialization of the Switchboard: that's where we can reliably extract the list of our top peers (including public and private channels) and add our sync whitelist to it to create a list of peers we want to sync with.

There is one additional thing that must be handled: if we don't have enough peers with channels to reach our router.sync.peer-limit, we must sync with new peers once we have a channel with them that becomes ready. That is very important to handle the case of a new node being started: initially it doesn't have any peer, but once it has connected to remote node and opened a channel with them, we want to immediately sync the graph.

I think that one way of achieving that is to keep the simplifications you made to Peer and PeerConnection (where we don't check any rate-limit) but send the SendChannelQuery message (or a "wrapped" version of it) to the switchboard instead of the router (to do that, the peer connection can send to its peer parent which can relay to its switchboard parent): then the switchboard can decide whether to forward to the router or drop, depending on whether we've exceeded the rate limit or not. The switchboard can thus update the list of peers we're syncing with dynamically when we initially didn't have enough peers with channels to fill our sync.peer-limit. What do you think?

eclair-core/src/main/scala/fr/acinq/eclair/NodeParams.scala

docs/release-notes/eclair-vnext.md

By default we sync with every peer when reconnecting, which can be a lot. We now only sync on reconnection with our top peers (by capacity of our shared channels).

t-bast

Nice, this is indeed much simpler and seems to be working well!

One interesting thing to note is that even when we've already reached our peer limit, we will sync with new peers once we have a first channel created with them (in the event handler for ChannelIdAssigned in Peer.scala). This is a good thing, because we're only syncing once, and will re-sync on reconnection only if our peersToSyncWith wasn't full.

One tiny drawback is that if we create channels with a new peer that should replace an existing peer in our top capacity list, this will only be discovered when restarting the node. But it's fine, the important thing is that we sync with a subset of peers, it doesn't have to perfectly match our top capacity peers all the time, it's not worth the extra complexity.

t-bast · 2025-01-29T13:24:40Z

eclair-core/src/test/scala/fr/acinq/eclair/io/SwitchboardSpec.scala

@@ -116,13 +116,13 @@ class SwitchboardSpec extends TestKitBaseClass with AnyFunSuiteLike {
  }

  test("sync if no whitelist is defined and peer has channels") {
-    val nodeParams = Alice.nodeParams.copy(syncWhitelist = Set.empty)
+    val nodeParams = Alice.nodeParams.copy(routerConf = Alice.nodeParams.routerConf.copy(syncConf = Alice.nodeParams.routerConf.syncConf.copy(whitelist = Set.empty)))


If you import com.softwaremill.quicklens.ModifyPimp you can greatly simplify the changes in this file, it will let you do something like:

val nodeParams = Alice.nodeParams.modify(_.routerConf.syncConf.whitelist).setTo(Set.empty)

We're using this library in tests in a few other places.

t-bast · 2025-01-29T13:27:26Z

eclair-core/src/test/scala/fr/acinq/eclair/io/SwitchboardSpec.scala

    sendFeatures(nodeParams, Nil, remoteNodeId, nodeParams.features.initFeatures(), expectedSync = true)
  }

  test("don't sync if whitelist doesn't contain peer") {
-    val nodeParams = Alice.nodeParams.copy(syncWhitelist = Set(randomKey().publicKey, randomKey().publicKey, randomKey().publicKey))
+    val nodeParams = Alice.nodeParams.copy(routerConf = Alice.nodeParams.routerConf.copy(syncConf = Alice.nodeParams.routerConf.syncConf.copy(peerLimit = 0, whitelist = Set(randomKey().publicKey, randomKey().publicKey, randomKey().publicKey))))


There is no test of the new behavior, this is important to add. We absolutely must add tests whenever we implement new behavior. Can you test that:

when peerLimit is set without a whitelist, we only sync with our top peers and don't sync with others

when peerLimit and a whitelist is set, we sync with our top-capacity peers AND the whitelisted peers even if they're not part of our top-capacity peers

t-bast · 2025-01-29T13:28:45Z

eclair-core/src/main/scala/fr/acinq/eclair/io/Switchboard.scala

+      val peerCapacities = channels.map {
+        case channelData: ChannelDataWithoutCommitments => (channelData.remoteNodeId, 0L)
+        case channelData: ChannelDataWithCommitments => (channelData.remoteNodeId, channelData.commitments.capacity.toLong)
+      }.groupMapReduce[PublicKey, Long](_._1)(_._2)(_ + _)
+      val topCapacityPeers = peerCapacities.toSeq.sortWith { case ((_, c1), (_, c2)) => c1 > c2 }.take(nodeParams.routerConf.syncConf.peerLimit).map(_._1).toSet


Can you move this above the log.info line? It's important for readability to call unstashAll right before context.become, and we'd like the log line's timestamp to match the end of the init phase.

Also, I find the custom sort harder to read than using the default one and taking the last elements:

val topCapacityPeers = peerCapacities.toSeq.sortBy(_._2).takeRight(nodeParams.routerConf.syncConf.peerLimit).map(_._1).toSet

Nice usage of groupMapReduce, I didn't know we had that function in the standard library and it's quite useful!

thomash-acinq marked this pull request as ready for review January 20, 2025 15:56

thomash-acinq requested a review from t-bast January 21, 2025 13:00

t-bast requested changes Jan 22, 2025

View reviewed changes

eclair-core/src/main/scala/fr/acinq/eclair/NodeParams.scala Outdated Show resolved Hide resolved

docs/release-notes/eclair-vnext.md Show resolved Hide resolved

thomash-acinq force-pushed the sync-whitelist branch 4 times, most recently from 75fac18 to 352a786 Compare January 27, 2025 16:06

thomash-acinq added 3 commits January 27, 2025 17:47

Only sync with top peers

0d2c04c

By default we sync with every peer when reconnecting, which can be a lot. We now only sync on reconnection with our top peers (by capacity of our shared channels).

documentation

625854b

Test

d6d495b

thomash-acinq force-pushed the sync-whitelist branch from 352a786 to 8ff8186 Compare January 27, 2025 16:48

move logic back to switchboard

b35f864

thomash-acinq force-pushed the sync-whitelist branch from 8ff8186 to b35f864 Compare January 28, 2025 10:25

thomash-acinq requested a review from t-bast January 28, 2025 10:33

t-bast reviewed Jan 29, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Only sync with top peers #2983

Only sync with top peers #2983

thomash-acinq commented Jan 17, 2025

t-bast left a comment

t-bast left a comment

t-bast Jan 29, 2025

t-bast Jan 29, 2025

t-bast Jan 29, 2025

Only sync with top peers #2983

Are you sure you want to change the base?

Only sync with top peers #2983

Conversation

thomash-acinq commented Jan 17, 2025

t-bast left a comment

Choose a reason for hiding this comment

t-bast left a comment

Choose a reason for hiding this comment

t-bast Jan 29, 2025

Choose a reason for hiding this comment

t-bast Jan 29, 2025

Choose a reason for hiding this comment

t-bast Jan 29, 2025

Choose a reason for hiding this comment