Add support for using last_funding_locked tlv in channel_reestablish #3007
base: master
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

@@            Coverage Diff             @@
##           master    #3007      +/-   ##
==========================================
+ Coverage   86.02%   86.04%    +0.01%
==========================================
  Files         228      228
  Lines       20452    20816      +364
  Branches      846      865       +19
==========================================
+ Hits        17594    17911      +317
- Misses       2858     2905       +47
force-pushed from 34209f2 to 5cd1827
Some things to note during review:
This TLV is relevant for splice funding txs only.

When the `my_current_funding_locked_txid` TLV confirms the latest funding tx, we prune previous funding transactions, similarly to receiving `splice_locked` from our peer for that txid.

When we receive `your_last_funding_locked_txid` that does not match our latest confirmed funding tx, we know our peer did not receive our last `splice_locked` and retransmit it.

For public channels, nodes also retransmit `splice_locked` after `channel_reestablish` if they have not received `announcement_signatures` for the latest confirmed funding tx. This is needed to prompt our peer to also retransmit their own `splice_locked` and `announcement_signatures`.

For public channels, nodes respond to `splice_locked` with their own `splice_locked` if it has already been received, but only once. If any `announcement_signatures` have been sent, we do not do this, to prevent exchanging an endless loop of `splice_locked` messages.

These changes ensure both sides will have exchanged `splice_locked` after a disconnect, which will be relevant for simple taproot channels to exchange nonces.

If the `last_funding_locked` TLV is not set, nodes always send `splice_locked` on reconnect to preserve the previous behavior for retransmitting `splice_locked`.

Note: the previous behavior was susceptible to a race condition if one node sent a channel update after `channel_reestablish`, but before receiving `splice_locked` from a peer that had confirmed the latest funding tx while offline.

cf. lightning/bolts#1223
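As a rough sketch of how the sender side could fill these TLVs (the TLV case-class names below are assumptions for illustration, not necessarily the ones used in this PR):

// Hedged sketch: build the last_funding_locked TLVs for channel_reestablish from the latest
// funding tx we locked locally and the latest one we know our peer locked.
// MyCurrentFundingLockedTlv / YourLastFundingLockedTlv are assumed names.
val lastFundingLockedTlvs: Set[ChannelReestablishTlv] =
  commitments.lastLocalLocked_opt.map(c => ChannelReestablishTlv.MyCurrentFundingLockedTlv(c.fundingTxId)).toSet ++
    commitments.lastRemoteLocked_opt.map(c => ChannelReestablishTlv.YourLastFundingLockedTlv(c.fundingTxId)).toSet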
force-pushed from 2633d17 to c1ebbbc
In c1ebbbc I modified the PR to account for legacy …
Resolved review comments (outdated) on:
- eclair-core/src/main/scala/fr/acinq/eclair/wire/protocol/ChannelTlv.scala
- eclair-core/src/main/scala/fr/acinq/eclair/wire/protocol/LightningMessageTypes.scala
- eclair-core/src/main/scala/fr/acinq/eclair/channel/Commitments.scala
- eclair-core/src/main/scala/fr/acinq/eclair/channel/Commitments.scala
- eclair-core/src/main/scala/fr/acinq/eclair/channel/Commitments.scala
- eclair-core/src/main/scala/fr/acinq/eclair/channel/fsm/Channel.scala
Note: the special cases for the funding tx without locked remote commit will be simplified in another PR.
…l_ready
When reconnecting in the Normal state, if a legacy channel does not have its remote funding status set to `Locked`, set and store it to migrate older channels. After reconnecting in other states, the remote funding status will be set to `Locked` and stored when receiving `channel_ready`, or deferred if still waiting for the funding tx to be confirmed locally.
Ensure we do not respond to the splice_locked we sent after channel_reestablish with the same splice_locked. There can only be one signed channel announcement and so only one corresponding last splice_locked that could be re-sent after channel_reestablish.
force-pushed from 4fc3cd2 to c2ad77d
// retransmit splice_locked to avoid a loop.
// NB: It is important both nodes retransmit splice_locked after reconnecting to ensure new Taproot nonces
// are exchanged for channel announcements.
val spliceLocked_opt = d.lastAnnouncement_opt.collect {
  case ann if announcementSigsSent.isEmpty && commitment.shortChannelId_opt.contains(ann.shortChannelId) =>
  case ann if !spliceLockedSentAfterReestablish.contains(ann.shortChannelId) && commitment.shortChannelId_opt.contains(ann.shortChannelId) =>
    spliceLockedSentAfterReestablish = Some(ann.shortChannelId)
Do you think this is necessary? Can `splice_locked` be sent again if it has already been sent? I thought it was safer to set `spliceLockedSentAfterReestablish`, but we could take it out and the loop after `channel_reestablish` will still not occur.
> Can `splice_locked` be sent again if it has already been sent?

Yes it can, this is harmless. I think we should simplify this once you've introduced the `spliceLockedSent` map as requested in my other comment, to do the following when we receive `splice_locked`:

- check the `spliceLockedSent` map: if it contains this `scid`, we don't need to send anything (we may have already re-sent `splice_locked` on reestablish)
- otherwise, check if the local funding status is `Locked`, and if it is, retransmit our `splice_locked`
- that means we may unnecessarily retransmit (if their `channel_reestablish` said that they already received it), but it's harmless, simplifies the logic (no need to look at the announcement at all), and shouldn't happen often (see the sketch below)
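A minimal sketch of that receive path, assuming a `spliceLockedSent` map keyed by funding txid; the helper name and signature are illustrative, not the actual eclair code:

def maybeEchoSpliceLocked(channelId: ByteVector32,
                          received: SpliceLocked,
                          spliceLockedSent: Map[TxId, SpliceLocked],
                          commitment_opt: Option[Commitment]): Option[SpliceLocked] = {
  if (spliceLockedSent.contains(received.fundingTxId)) {
    // we already (re)sent splice_locked for this funding tx, e.g. on reestablish: nothing to do
    None
  } else {
    commitment_opt.collect {
      // the funding tx is also locked on our side: retransmit our splice_locked
      // (possibly redundant if their channel_reestablish said they received it, but harmless)
      case c if c.localFundingStatus.isInstanceOf[LocalFundingStatus.Locked] => SpliceLocked(channelId, c.fundingTxId)
    }
  }
}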
val lastLocalLocked_opt: Option[Commitment] = active.filter(_.localFundingStatus.isInstanceOf[LocalFundingStatus.Locked]).sortBy(_.fundingTxIndex).lastOption
val lastRemoteLocked_opt: Option[Commitment] = active.filter(c => c.remoteFundingStatus == RemoteFundingStatus.Locked).sortBy(_.fundingTxIndex).lastOption
Can you move this with the other `val`s at the beginning of the class definition?
val commitments1 = commitments.modify(_.remoteNextCommitInfo).setTo(Right(channelReady.nextPerCommitmentPoint))
// Set the remote status for all initial funding commitments to Locked. If there are RBF attempts, only one will be confirmed locally.
val commitments2 = commitments1.copy(active = commitments.active.collect { case c if c.fundingTxIndex == 0 => c.copy(remoteFundingStatus = RemoteFundingStatus.Locked) })
The `collect` seems fishy here, because it drops other commitments. Even though it's safe with the way we use it, it's not the responsibility of this function to drop the other attempts, that is done when we locally see one of the funding txs confirm. It would be better to use a `map`, and cleaner to do all changes in a single call:

val commitments1 = commitments.copy(
  // Set the remote status for all initial funding commitments to Locked. If there are RBF attempts, only one can be confirmed locally.
  active = commitments.active.map {
    case c if c.fundingTxIndex == 0 => c.copy(remoteFundingStatus = RemoteFundingStatus.Locked)
    case c => c
  },
  remoteNextCommitInfo = Right(channelReady.nextPerCommitmentPoint)
)
@@ -116,6 +118,7 @@ class WaitForChannelReadyStateSpec extends TestKitBaseClass with FixtureAnyFunSu
  val listener = TestProbe()
  alice.underlying.system.eventStream.subscribe(listener.ref, classOf[ChannelOpened])
  bob2alice.forward(alice)
  assert(alice.stateData.asInstanceOf[DATA_NORMAL].commitments.active.head.remoteFundingStatus == RemoteFundingStatus.Locked)
There is a race here, `forward` doesn't wait for the message to be fully handled, so you should replace `assert` with `awaitCond` (same below).

Note that we don't need to do the same change in your updates to `WaitForDualFundingReadyStateSpec` because we always start with `awaitCond(xxx.state == NORMAL)` before this check, which ensures that we've done the transition already when we reach the `assert`.
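For instance, the new check in `WaitForChannelReadyStateSpec` would look something like this (probe and field names taken from the diff above):

bob2alice.forward(alice)
// forward() returns before Alice's actor has processed the message, so poll for the
// resulting state change instead of asserting immediately
awaitCond(alice.stateData.asInstanceOf[DATA_NORMAL].commitments.active.head.remoteFundingStatus == RemoteFundingStatus.Locked)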
stay() using d.copy(commitments = d.commitments.copy(active = d.commitments.active.collect {
  case c if c.fundingTxIndex == 0 => c.copy(remoteFundingStatus = RemoteFundingStatus.Locked)
}))
At this point using `collect` is really dangerous: it will drop every other commitment (e.g. pending splices)! You must use `map` here to ensure you don't modify the other commitments:

stay() using d.copy(commitments = d.commitments.copy(active = d.commitments.active.map {
  case c if c.fundingTxIndex == 0 => c.copy(remoteFundingStatus = RemoteFundingStatus.Locked)
  case c => c
}))
// we keep track of the splice_locked we sent after channel_reestablish to avoid sending it again
private var spliceLockedSentAfterReestablish: Option[RealShortChannelId] = None
We will need a `Map` of the `splice_locked` messages we sent to keep track of our local nonces for taproot, so this is a good opportunity to do it right now, otherwise we'll need to change this code anyway...

Can you turn this into `private var spliceLockedSent = Map.empty[RealShortChannelId, SpliceLocked]`? And implement a mechanism to prune the oldest `RealShortChannelId`s when we exceed 10 elements, like we do for announcement signatures?
Shouldn't we use `TxId` as the key instead of `RealShortChannelId` because we might have sent `SpliceLocked` for a zero-conf channel?
Right, that's probably what we need here!
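A minimal sketch of such a bounded map, keyed by `TxId` as discussed above; the pruning below is a simple size cap and may differ from what eclair actually does for announcement signatures:

private var spliceLockedSent: Map[TxId, SpliceLocked] = Map.empty

private def rememberSpliceLockedSent(spliceLocked: SpliceLocked): Unit = {
  spliceLockedSent += (spliceLocked.fundingTxId -> spliceLocked)
  if (spliceLockedSent.size > 10) {
    // keep the map bounded; a real implementation would drop the oldest entry,
    // e.g. by tracking the funding tx index of each splice_locked we sent
    spliceLockedSent -= spliceLockedSent.keys.head
  }
}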
@@ -2227,13 +2245,19 @@ class Channel(val nodeParams: NodeParams, val wallet: OnChainChannelFunder with
      }
    case _ => Set.empty
  }
  val lastFundingLockedTlvs: Set[ChannelReestablishTlv] =
    if (d.commitments.params.remoteParams.initFeatures.hasFeature(Features.SplicePrototype) && d.commitments.params.localParams.initFeatures.hasFeature(Features.SplicePrototype)) {
We don't need to gate this on `init` features, we're using odd TLVs which should be ignored by nodes that don't understand them. It's simpler to always send those TLVs, and they may be used outside of splicing by lnd folks for taproot channel announcements (which will potentially require `channel_ready` retransmission).

It's only helpful to look at `init` features when we receive `channel_reestablish`, because if our peer supports splicing and does NOT set `your_last_funding_locked`, that means we MUST retransmit `channel_ready` or `splice_locked`, regardless of other checks around the `commitment_number`.

By the way, we are missing `channel_ready` retransmission based on those TLVs: the condition line 2350 must be updated to always retransmit `channel_ready` if `d.commitments.params.remoteParams.initFeatures.hasFeature(Features.SplicePrototype) && channelReestablish.yourLastFundingLocked_opt.isEmpty`.
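A sketch of that retransmission condition (variable names follow the comment above; the surrounding handler is assumed):

// If our peer supports splicing but did not set your_last_funding_locked in channel_reestablish,
// retransmit channel_ready regardless of the usual commitment_number checks.
val retransmitChannelReady =
  d.commitments.params.remoteParams.initFeatures.hasFeature(Features.SplicePrototype) &&
    channelReestablish.yourLastFundingLocked_opt.isEmpty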
// Prune previous funding transactions and RBF attempts if we already sent splice_locked for the last funding
// transaction confirmed by our counterparty; we either missed their splice_locked or it confirmed while disconnected.
val commitments1: Commitments = channelReestablish.myCurrentFundingLocked_opt.flatMap(myCurrentFundingLocked =>
  d.commitments.updateRemoteFundingStatus(myCurrentFundingLocked, d.lastAnnouncedFundingTxId_opt).toOption.map(_._1))
  .getOrElse(d.commitments)
I find it dangerous to have multiple places where we update the commitments and rely on numbered variable names, it requires scanning the whole block of code to ensure we're using the right version... Can you instead move this change below, where we do the `discardUnsignedUpdates`, and chain the two updates like this:
// Prune previous funding transactions and RBF attempts if we already sent splice_locked for the last funding
// transaction that is also locked by our counterparty; we either missed their splice_locked or it confirmed
// while disconnected.
val commitments1: Commitments = channelReestablish.myCurrentFundingLocked_opt
.flatMap(remoteFundingTxLocked => d.commitments.updateRemoteFundingStatus(remoteFundingTxLocked, d.lastAnnouncedFundingTxId_opt).toOption.map(_._1))
.getOrElse(d.commitments)
// We then clean up unsigned updates that haven't been received before the disconnection.
.discardUnsignedUpdates()
// 1) If they did not receive our last splice_locked;
// 2) or the last splice_locked they received is different from what we sent;
// 3) or (public channels only) we have not received their announcement_signatures for our latest locked commitment.
This is quite confusing, because the second statement is contained in the first one: if the last `splice_locked` they received doesn't match what we sent, it means they did not receive our last `splice_locked`?

Suggested change:

// 1) If they did not receive our last splice_locked
// 2) or if this is a public channel and we have not created the channel_announcement yet
.collect { case c if !channelReestablish.yourLastFundingLocked_opt.contains(c.fundingTxId) ||
  (commitments1.announceChannel && d.lastAnnouncement_opt.forall(ann => !c.shortChannelId_opt.contains(ann.shortChannelId))) =>
Those conditions are quite hard to parse, it's helpful to decompose them with comments:
val spliceLocked = commitments1.lastLocalLocked_opt match {
case None => None
// We only send splice_locked for splice transactions.
case Some(c) if c.fundingTxIndex == 0 => None
case Some(c) =>
// If our peer has not received our splice_locked, we retransmit it.
val notReceivedByRemote = !channelReestablish.yourLastFundingLocked_opt.contains(c.fundingTxId)
// If this is a public channel and we haven't announced the splice, we retransmit our splice_locked and
// will exchange announcement_signatures afterwards.
val notAnnouncedYet = commitments1.announceChannel && d.lastAnnouncement_opt.forall(ann => !c.shortChannelId_opt.contains(ann.shortChannelId))
if (notReceivedByRemote || notAnnouncedYet) {
log.debug("re-sending splice_locked for fundingTxId={}", c.fundingTxId)
spliceLockedSentAfterReestablish = c.shortChannelId_opt
Some(SpliceLocked(d.channelId, c.fundingTxId))
} else {
None
}
}
I finished reviewing the tests, they're really good!
alice2bob.forward(bob)
alice2bob.expectNoMessage(100 millis)
assert(bob2alice.expectMsgType[SpliceLocked].fundingTxId == spliceTx.txid)
assert(bob2alice.expectMsgType[SpliceLocked].fundingTxId == spliceTx.txid) // Bob resends `splice_locked` in response to Alice's `splice_locked` after channel_reestablish.
In order to verify that this is really in response to Alice's `splice_locked`, we should add `bob2alice.expectNoMessage(100 millis)` right after `reconnect(f)`.
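Concretely, at the start of the test (helper names taken from the surrounding spec):

reconnect(f)
bob2alice.expectNoMessage(100 millis) // per the suggestion above: nothing from Bob before Alice's splice_locked is delivered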
// Because `your_last_funding_locked_txid` from Bob matches the last `splice_locked` txid sent by Alice, there is no need
// for Alice to resend `splice_locked`. Alice processes the `my_current_funding_locked` from Bob as if she received
// `splice_locked` from Bob and prunes the initial funding commitment.
assert(alice.stateData.asInstanceOf[ChannelDataWithCommitments].commitments.active.size == 1)
There is a race condition here between the test code and Alice's handler for the `channel_reestablish` message:

awaitCond(alice.stateData.asInstanceOf[ChannelDataWithCommitments].commitments.active.size == 1)
Resolved review comment on:
- eclair-core/src/test/scala/fr/acinq/eclair/channel/states/e/NormalSplicesStateSpec.scala
// If either node receives `splice_locked` again before reconnecting, reply with `announcement_signatures` and not
// `splice_locked` to avoid a loop.
alice2bob.forward(bob, aliceSpliceLocked)
bob2alice.forward(alice, bobSpliceLocked)
alice2bob.expectMsgType[AnnouncementSignatures]
alice2bob.forward(bob)
bob2alice.expectMsgType[AnnouncementSignatures]
bob2alice.forward(alice)
alice2bob.expectNoMessage(100 millis)
bob2alice.expectNoMessage(100 millis)
Interesting, that's a very good test because this is not the expected behavior! We've already sent `announcement_signatures` so we definitely should not re-send it here, which means we're missing a check on `announcementSigsSent` in the handler for the remote `SpliceLocked` message in `Channel.scala`. I didn't notice it reading the code, it's nice that a test will catch this invalid behavior now ;)
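A hedged sketch of that missing check in the remote `splice_locked` handler (names follow the discussion, not the actual `Channel.scala` code; `createAnnouncementSigs` is a hypothetical helper):

// only (re)send announcement_signatures if we haven't already sent them for this scid
val annSigs_opt = commitment.shortChannelId_opt.collect {
  case scid if !announcementSigsSent.contains(scid) => createAnnouncementSigs(d, scid) // hypothetical helper
}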
// `splice_locked` to avoid a loop.
alice2bob.forward(bob, aliceSpliceLocked)
bob2alice.forward(alice, bobSpliceLocked)
alice2bob.expectMsgType[AnnouncementSignatures]
Same here, this should instead be `alice2bob.expectNoMessage(100 millis)` and `bob2alice.expectNoMessage(100 millis)`.