Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: registrar key/list update race condition #22

Merged
merged 2 commits into from
Nov 21, 2024

Conversation

0xIryna
Copy link
Collaborator

@0xIryna 0xIryna commented Nov 20, 2024

Proposed changes

  • add message sequence to Send Registrar Key and Update Registrar List Status messages sent by HubPortal.
  • store the latest receive message sequences in SpokePortal and update Registrar Key and Registrar List Status only if the incoming sequence is greater than the current one to prevent race condition as Wormhole Relayer doesn't enforce message ordering. Kirill L-03 and AR M-01

@0xIryna 0xIryna requested a review from PierrickGT November 20, 2024 19:14
Copy link

github-actions bot commented Nov 20, 2024

LCOV of commit 93200ee during Forge Coverage #116

Summary coverage rate:
  lines......: 96.9% (220 of 227 lines)
  functions..: 90.5% (67 of 74 functions)
  branches...: 80.4% (41 of 51 branches)

Files changed coverage rate:
                                 |Lines       |Functions  |Branches    
  Filename                       |Rate     Num|Rate    Num|Rate     Num
  =====================================================================
  src/HubPortal.sol              | 100%     45| 100%    10|85.7%      7
  src/SpokePortal.sol            | 100%     39| 100%     9| 100%     11
  src/libs/PayloadEncoder.sol    | 100%     34| 100%     8| 100%      1

Copy link

github-actions bot commented Nov 20, 2024

Changes to gas cost

Generated at commit: e55d511ee2eea3bd3f613a29d64dd47370c2414e, compared to commit: b8b570fc01e64445006eccc1b67867f7d1864917

🧾 Summary (20% most significant diffs)

Contract Method Avg (+/-) %
SpokePortal owner +22 ❌ +0.92%

Full diff report 👇
Contract Deployment Cost (+/-) Method Min (+/-) % Avg (+/-) % Median (+/-) % Max (+/-) % # Calls (+/-)
SpokePortal 5,392,349 (+64,704) attestationReceived
getMode
initialize
mToken
owner
quoteDeliveryPrice
setPeer
transfer
59,806 (+22)
404 (+22)
127,628 (-45)
301 (-22)
2,413 (+22)
47,799 (+22)
54,239 (+22)
308,517 (-65)
+0.04%
+5.76%
-0.04%
-6.81%
+0.92%
+0.05%
+0.04%
-0.02%
111,408 (+829)
404 (+22)
127,628 (-45)
301 (-22)
2,413 (+22)
47,799 (+22)
54,239 (+22)
308,517 (-65)
+0.75%
+5.76%
-0.04%
-6.81%
+0.92%
+0.05%
+0.04%
-0.02%
139,884 (+22)
404 (+22)
127,628 (-45)
301 (-22)
2,413 (+22)
47,799 (+22)
54,239 (+22)
308,517 (-65)
+0.02%
+5.76%
-0.04%
-6.81%
+0.92%
+0.05%
+0.04%
-0.02%
139,884 (+22)
404 (+22)
127,628 (-45)
301 (-22)
2,413 (+22)
47,799 (+22)
54,239 (+22)
308,517 (-65)
+0.02%
+5.76%
-0.04%
-6.81%
+0.92%
+0.05%
+0.04%
-0.02%
7 (0)
27 (0)
27 (0)
27 (0)
27 (0)
2 (0)
36 (0)
2 (0)

@deluca-mike
Copy link
Collaborator

deluca-mike commented Nov 20, 2024

Why are idempotent actions being sequenced?

Without sequencing, the possibility of "unordered" messages is that some messages would need to be resent in order for Spoke value to converge on the Hub values.

Now, with sequencing, the possibility of "unordered" messages is that some messages would need to be resent in order for Spoke value to converge on the Hub values.

And it's worse because, without sequencing, all messages arrive. Now with sequencing, if the last message arrives first (mistake or abuse), all other messages silently fail and need to be resent (costing more gas).

I believe this is a suboptimal solution to a non-problem, that actually creates more problems.

@0xIryna
Copy link
Collaborator Author

0xIryna commented Nov 20, 2024

Why are idempotent actions being sequenced?

Without sequencing, the possibility of "unordered" messages is that some messages would need to be resent in order for Spoke value to converge on the Hub values.

Now, with sequencing, the possibility of "unordered" messages is that some messages would need to be resent in order for Spoke value to converge on the Hub values.

And it's worse because, without sequencing, all messages arrive. Now with sequencing, if the last message arrives first (mistake or abuse), all other messages silently fail and need to be resent (costing more gas).

@deluca-mike it's not idempotent if values in the Registrar on the Hub were updated.
The issue was pointed during the audit. The Wormhole Relayer doesn't guarantee ordering, so a message with obsolete registrar value can arrive after the message with the current value. Both Kirill and AR suggested to enforce ordering with nonces.

@0xIryna 0xIryna requested a review from toninorair November 20, 2024 20:29
@deluca-mike
Copy link
Collaborator

deluca-mike commented Nov 20, 2024

Idempotent means that a message or action on its own will always result in the same outcome.

If the current value on the hub goes from X to Y, the message to set the spoke value to X will set the spoke value to X, and the message to set the spoke value to Y will set the spoke value to Y, regardless how many times they are processed, respectively or together.

The wormhole messages themselves are unique and cannot be reprocessed, but the valid actions/intents are idempotent.

An old lingering message to set the spoke value to X when the hub's current value is Y:

  • is rare
  • if it happened, just needs to resend the message to set the value to Y again
  • demonstrates that issues due to race conditions are isolated to individual key-values

With sequencing, batches of messages being unordered:

  • is much less rare
  • if it happened, now needs to resend one or more messages that had lower sequence numbers
  • expands the issues due to the race conditions to all messages (i.e. one unordered message could result in the silent failure and invalidation of a bunch of unrelated messages)

So the problem is "Bob's wormhole message to set himself as an earner, that was created before the registrar was updated to disable him as an earner, but was not yet processed, got processed after the message to disable Bob as an earner is both created and processed".

It could be solved on a case-by-case basis by simply having anybody resend the message to remove Bob from the earners list.

Sequencing solves this rare problem, but now creates problems that are more likely to occur, and will result in more need to replay messages.

It would be frustrating, costly, and more likely when sending five messages to enable five earners at the spoke resulted in four of them failing because the last one got processed first.

Perhaps the nonces should be tied to individual index actions or registrar key values (i.e. a mapping of nonces where the key is the action or registrar key) so that a handful of completely irrelevant messages don't get invalidated because of one.

Edit: I just read Kirill's audit, and this is exactly what he suggests "Consider caching the last message sequence for each particular registrar key".


IRegistrarLike(registrar).setKey(key_, value_);
// Update the key only if the incoming message has the higher sequence or is the fist message
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Update the key only if the incoming message has the higher sequence or is the fist message
// Update the key only if the incoming message has the higher sequence or is the first message

} else {
IRegistrarLike(registrar).removeFromList(listName_, account_);
uint64 lastUpdateListSequence_ = lastUpdateListSequence;
// Update the status only if the incoming message has the higher sequence or is the fist message
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Update the status only if the incoming message has the higher sequence or is the fist message
// Update the status only if the incoming message has the higher sequence or is the first message

Copy link
Collaborator

@toninorair toninorair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some architecture decisions needs to be approved

  1. one sequence variable can be used for all messages with custom payload. It will allow to avoid basic copy-paste
  2. emission of events when registrar keys are actually updated. Right now there is no way to see from event logs if value was only received or received and set

@toninorair toninorair self-requested a review November 20, 2024 22:08
@0xIryna
Copy link
Collaborator Author

0xIryna commented Nov 21, 2024

Some architecture decisions needs to be approved

  1. one sequence variable can be used for all messages with custom payload. It will allow to avoid basic copy-paste
  2. emission of events when registrar keys are actually updated. Right now there is no way to see from event logs if value was only received or received and set

I've cleaned up the implementation by using only one lastProcessedSequence variable for both types of messages and reverting if the sequence is obsolete. It's consistent with other checks that we already perform on destination like checking if the chain was forked.

@toninorair toninorair merged commit fa18db7 into main Nov 21, 2024
4 checks passed
@toninorair toninorair deleted the fix/registrar-key-race-condition branch November 21, 2024 02:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants