Update fee after channel becomes active #8876

Open
wants to merge 1 commit into
base: master

@rkfg rkfg commented Jun 28, 2024

Change Description

See #8790; the PR was recreated for several reasons. The fee update timer is now initially set to zero so that the fee update check runs right after the channel becomes active. The next update interval is random (between 10 and 60 minutes), as usual.

Steps to Test

Start with a low commitment fee, raise the estimated fee, and reconnect the peer. The commitment fee should update.

Pull Request Checklist

Testing

  • Your PR passes all CI checks.
  • Tests covering the positive and negative (error paths) are included.
  • Bug fixes contain tests triggering the bug to prevent regressions.

Code Style and Documentation

📝 Please see our Contribution Guidelines for further guidance.

Contributor

coderabbitai bot commented Jun 28, 2024

Important

Review skipped

Auto reviews are limited to specific labels.

Labels to auto review (1)
  • llm-review


@rkfg rkfg marked this pull request as ready for review June 28, 2024 07:58
Member

Roasbeef commented Jul 3, 2024

Approved CI run.

Author

rkfg commented Jul 4, 2024

This is very interesting: some lock-up is triggered if this timer is set to a small value. Couldn't that be a sign of a bigger issue somewhere? If it's set to 10-60 minutes, as it is now, the tests work fine. If I set it to less than 10 minutes, they hang. Could it be a race condition between the tests?

Author

rkfg commented Jul 4, 2024

So the actual failing test is TestChannelLinkCancelFullCommitment; the fee update there blocks it from progressing. If I change the initial 0 to time.Second, the test starts running, but since it takes a long time to settle all the HTLCs, the fee update times out with "failing link: unable to complete dance with error: remote unresponsive" (I enabled log output for this test to see what's wrong). Normally that shouldn't be a problem, I suppose: if we ever need to settle ≈483 HTLCs and then update the fee, and the update fails because it took us longer than pending-commit-interval to finish, the link would be dropped. Then we should reconnect and continue normally. So one solution would be to fix the test so that the update doesn't get in the way and break the link, like this:

	n.aliceChannelLink.updateFeeTimer.Reset(time.Minute * 10)
	n.bobChannelLink.updateFeeTimer.Reset(time.Minute * 10)

as well as setting the original PR line to l.updateFeeTimer = time.NewTimer(time.Second * 1). In this case the test passes and doesn't hang indefinitely. I'll check the other unit tests; if too many of them need such a fix, maybe it wasn't a good idea in the first place.

Author

rkfg commented Jul 4, 2024

Many tests seem to rely on the fee update never happening unless explicitly requested. An unexpected update either breaks their expectations or breaks the link. Since the tests usually complete in under 2 minutes and the default fee update interval is at least 10 minutes, those tests have never broken before. I'm not sure how to fix this reliably. I'd like the fee update to happen 1-10 seconds after the link is established, but then too many tests break, so I've set it to 1 minute for now.

Another way could be patching the mock links like this:

diff --git a/htlcswitch/test_utils.go b/htlcswitch/test_utils.go
index 9a72197ec..70e8e71f0 100644
--- a/htlcswitch/test_utils.go
+++ b/htlcswitch/test_utils.go
@@ -1185,7 +1185,7 @@ func (h *hopNetwork) createChannelLink(server, peer *mockServer,
                        }
                }
        }()
-
+       link.(*channelLink).updateFeeTimer.Reset(time.Minute * 10)
        return link, nil
 }

and then adding a separate test to check that this short interval works as expected. Please tell me which solution is better for the project.

Member

@yyforyongyu yyforyongyu left a comment


One alternative is to move this chunk into a new method, say handleUpdateFee:

lnd/htlcswitch/link.go

Lines 1323 to 1365 in 71ba355

// If we're not the initiator of the channel, don't we
// don't control the fees, so we can ignore this.
if !l.channel.IsInitiator() {
	continue
}

// If we are the initiator, then we'll sample the
// current fee rate to get into the chain within 3
// blocks.
netFee, err := l.sampleNetworkFee()
if err != nil {
	l.log.Errorf("unable to sample network fee: %v",
		err)
	continue
}

minRelayFee := l.cfg.FeeEstimator.RelayFeePerKW()

newCommitFee := l.channel.IdealCommitFeeRate(
	netFee, minRelayFee,
	l.cfg.MaxAnchorsCommitFeeRate,
	l.cfg.MaxFeeAllocation,
)

// We determine if we should adjust the commitment fee
// based on the current commitment fee, the suggested
// new commitment fee and the current minimum relay fee
// rate.
commitFee := l.channel.CommitFeeRate()
if !shouldAdjustCommitFee(
	newCommitFee, commitFee, minRelayFee,
) {
	continue
}

// If we do, then we'll send a new UpdateFee message to
// the remote party, to be locked in with a new update.
if err := l.updateChannelFee(newCommitFee); err != nil {
	l.log.Errorf("unable to update fee rate: %v",
		err)
	continue
}

Then call this method once right before we enter the for loop,

lnd/htlcswitch/link.go

Lines 1266 to 1268 in 71ba355

for {
	// We must always check if we failed at some point processing

This way we can avoid touching the updateFeeTimer, though I'm not sure whether the tests would be easier this way.
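A rough sketch of that shape, using a simplified stand-in for the link rather than the real channelLink: feeLink, its fields, and the equality check that replaces shouldAdjustCommitFee are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// feeLink is a hypothetical, stripped-down stand-in for the channel link.
type feeLink struct {
	isInitiator bool
	commitFee   int64                 // current commitment fee rate
	sampleFee   func() (int64, error) // stand-in for the network fee sampler
	sent        []int64               // fee updates "sent" to the peer
}

// handleUpdateFee holds the logic that previously lived in the timer
// case. Early returns take the place of the continue statements.
func (l *feeLink) handleUpdateFee() {
	if !l.isInitiator {
		return
	}

	newFee, err := l.sampleFee()
	if err != nil {
		fmt.Println("unable to sample network fee:", err)
		return
	}

	// Simplified placeholder for the real adjustment check.
	if newFee == l.commitFee {
		return
	}

	l.commitFee = newFee
	l.sent = append(l.sent, newFee)
}

// run checks the fee once before entering the loop, then on every tick.
func (l *feeLink) run(tick <-chan time.Time, quit <-chan struct{}) {
	l.handleUpdateFee()

	for {
		select {
		case <-tick:
			l.handleUpdateFee()
		case <-quit:
			return
		}
	}
}

func main() {
	l := &feeLink{
		isInitiator: true,
		commitFee:   253,
		sampleFee:   func() (int64, error) { return 2500, nil },
	}

	quit := make(chan struct{})
	close(quit) // stop right after the initial check

	// A nil tick channel never fires, so only the initial check runs.
	l.run(nil, quit)
	fmt.Println(l.sent) // prints [2500]
}
```

With this layout the timer keeps its usual random schedule, and the extra check at startup is an ordinary method call.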

Author

rkfg commented Jul 4, 2024

Yes, I tried this before and the tests failed in the same way. Changing the timer period is better in this case because we can delay the fee update if needed. Currently the update period is 10-60 minutes and all tests are faster than that, so no test accounts for a random fee update in the middle of the run (in production it can of course happen in a similar scenario). If we make a fee update soon after a channel becomes active, the test gets an unexpected update and misbehaves. Either many tests should be fixed (other issues could be discovered along the way, but I don't know) or we pretend nothing changed and test the short update period separately.

-	l.updateFeeTimer = time.NewTimer(l.randomFeeUpdateTimeout())
+	// this timer will fire in one minute after the channel is ready
+	// and it will reset to a random interval after that
+	l.updateFeeTimer = time.NewTimer(time.Minute)
Member

Linking back to the issue:

"Open the app, do any payments, close it in under 10 minutes"

Though less likely, wouldn't we still have this issue if the following happened?

"Open the app, do any payments, close it in under 1 minute"

@@ -856,6 +856,10 @@ func TestChannelLinkCancelFullCommitment(t *testing.T) {
 	n := newTwoHopNetwork(
 		t, channels.aliceToBob, channels.bobToAlice, testStartingHeight,
 	)
+	// This test takes a long time to finish and the link breaks if the fee
+	// update times out during it, blocking the test indefinitely
+	n.aliceChannelLink.updateFeeTimer.Reset(time.Minute * 10)
Member

This mitigation works, though it may introduce flakiness in the unit tests, since we now rely on them finishing in under 1 minute. We should still give the "send out a fee update immediately on startup" approach a shot, and I'll help figure out how best to handle the tests.

Author

Many tests break if we immediately send an update. If you're up to the task then sure, this change is trivial enough. But I still think it's better to set the timer to 0 instead (achieving the same effect without forcing it in all cases), reset it to 10 minutes for the tests, and write a separate test for this instant update. I'll take a look in a day or two!

Author

@yyforyongyu Hi, maybe you could take this over? I really don't have time now it seems...

Member

sure, thanks for all the input so far!

@yyforyongyu yyforyongyu self-assigned this Aug 20, 2024
@lightninglabs-deploy

@rkfg, remember to re-request review from reviewers when ready

@yyforyongyu
Member

!lightninglabs-deploy mute
