fix(iroh): Queue sent datagrams longer (#3129)
## Description

While the connection to the relay server is still being established,
datagrams queued for sending are already being dropped.  When the
connection is finally established they are no longer there to be sent,
and depending on scheduling luck connections will often fail.
Extending this timeout makes establishing relay-only connections much
more reliable.
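
The effect of the timeout can be illustrated with a minimal sketch. This is not the code in `relay_actor.rs`: `PendingDatagrams`, `drain_deliverable` and `delivered_after` are hypothetical names, and the model assumes that anything older than the timeout is dropped at the moment the relay connection comes up.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Hypothetical send queue: datagrams wait here while the relay connection
/// is still being established. Not the actual iroh implementation.
struct PendingDatagrams {
    max_age: Duration,
    queue: VecDeque<(Instant, Vec<u8>)>,
}

impl PendingDatagrams {
    fn new(max_age: Duration) -> Self {
        Self { max_age, queue: VecDeque::new() }
    }

    /// Queues a datagram together with the time it was queued.
    fn push(&mut self, now: Instant, datagram: Vec<u8>) {
        self.queue.push_back((now, datagram));
    }

    /// Called once connected: empties the queue, drops every datagram older
    /// than `max_age` and returns the ones that can still be sent.
    fn drain_deliverable(&mut self, now: Instant) -> Vec<Vec<u8>> {
        let max_age = self.max_age;
        self.queue
            .drain(..)
            .filter(|(queued_at, _)| now.duration_since(*queued_at) <= max_age)
            .map(|(_, datagram)| datagram)
            .collect()
    }
}

/// Number of datagrams that survive when connecting takes `connect_delay`.
fn delivered_after(max_age: Duration, connect_delay: Duration) -> usize {
    let start = Instant::now();
    let mut pending = PendingDatagrams::new(max_age);
    pending.push(start, b"quic initial".to_vec());
    pending.drain_deliverable(start + connect_delay).len()
}
```

In this model, with an assumed connect time of one second, a 400 ms timeout delivers nothing, while a 3 s timeout still delivers the queued datagram.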

## Breaking Changes

<!-- Optional, if there are any breaking changes document them,
including how to migrate older code. -->

## Notes & open questions

"Depending on scheduling luck" is a bit hand-wavy.  I would have
expected QUIC to recover from this and re-send the packets.  I think
it depends on exactly how long it takes to establish the connection:
retries could still end up being dropped in this queue if badly
timed.

It is hard to say whether 3*PTO is sufficient.  There is an argument
for an even longer timeout, but it is a trade-off: too long a timeout
blocks the entire relay queue, while too short a timeout does not give
enough time to establish a normal connection.
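
For reference, a rough sketch of where a 3 second figure can come from, assuming the RFC 9002 defaults (`kInitialRtt` = 333 ms, `kGranularity` = 1 ms) and no RTT samples yet. The function names are hypothetical, and the QUIC stack actually in use may be configured differently.

```rust
use std::time::Duration;

/// kInitialRtt from RFC 9002 (assumed default; implementations may override it).
const INITIAL_RTT: Duration = Duration::from_millis(333);
/// kGranularity from RFC 9002.
const GRANULARITY: Duration = Duration::from_millis(1);

/// PTO before any RTT sample exists:
/// smoothed_rtt + max(4 * rttvar, kGranularity) + max_ack_delay,
/// with smoothed_rtt = kInitialRtt and rttvar = kInitialRtt / 2.
fn initial_pto(max_ack_delay: Duration) -> Duration {
    let smoothed_rtt = INITIAL_RTT;
    let rttvar = INITIAL_RTT / 2;
    smoothed_rtt + (4 * rttvar).max(GRANULARITY) + max_ack_delay
}

/// Three times the initial PTO, with max_ack_delay taken as zero
/// (as it is for the Initial and Handshake packet number spaces).
fn undeliverable_timeout_estimate() -> Duration {
    3 * initial_pto(Duration::ZERO)
}
```

Under these assumptions the initial PTO is 999 ms, so three times that is 2997 ms, which rounds to the 3 seconds used in this change.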

## Change checklist

- [x] Self-review.
- [x] Documentation updates following the [style
guide](https://rust-lang.github.io/rfcs/1574-more-api-documentation-conventions.html#appendix-a-full-conventions-text),
if relevant.
- [x] Tests if relevant.
- [x] All breaking changes documented.
flub authored Jan 14, 2025
1 parent b7a3568 commit e756710
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion iroh/src/magicsock/relay_actor.rs
@@ -96,7 +96,9 @@ const CONNECT_TIMEOUT: Duration = Duration::from_secs(10);
 /// When the [`ActiveRelayActor`] is not connected it can not deliver datagrams. However it
 /// will still receive datagrams to send from the [`RelayActor`]. If connecting takes
 /// longer than this timeout datagrams will be dropped.
-const UNDELIVERABLE_DATAGRAM_TIMEOUT: Duration = Duration::from_millis(400);
+///
+/// This value is set to 3 times the QUIC initial Probe Timeout (PTO).
+const UNDELIVERABLE_DATAGRAM_TIMEOUT: Duration = Duration::from_secs(3);

 /// An actor which handles the connection to a single relay server.
 ///
