
Improved light client requests networking protocol #9

Merged
merged 5 commits from the rfc-9 branch into polkadot-fellows:main on Dec 10, 2024

Conversation

@tomaka (Contributor) commented Jul 19, 2023

Rendered

This is a continuation of w3f/PPPs#10

Compared to w3f/PPPs#10, I've changed my mind about truncated proofs. I think that it's better if the client can fully verify the proofs that are returned even when the response size limit would be exceeded. This also means that the proof no longer needs to be deterministic, and I've removed the paragraph about this.

cc @cheme

@tomaka (Contributor, Author) commented Aug 22, 2023

@cheme Do you have any opinion? I think that you and I are pretty much the only people who have opinions/interest on this topic.

Implementers of the replier side should be careful to detect early on when a reply would exceed the maximum reply size, rather than unconditionally generating a reply, as this could consume a very large amount of CPU, disk I/O, and memory. Existing implementations might currently be accidentally protected from such an attack thanks to the fact that requests have a maximum size, and thus that the list of keys in the query is bounded. After this proposal, this accidental protection would no longer exist.
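As a rough illustration of this precaution (not the actual Substrate implementation; `ProofBuilder` and its methods are hypothetical stand-ins), a replier can track the encoded size incrementally and stop before building an oversized reply:

```rust
// Hypothetical sketch: a real replier would gather nodes from its trie
// database. The point is only that the size check happens before more work
// is committed, not after the full reply has already been generated.
const MAX_RESPONSE_SIZE: usize = 16 * 1024 * 1024; // 16 MiB

struct ProofBuilder {
    encoded_size: usize,
    nodes: Vec<Vec<u8>>,
}

impl ProofBuilder {
    fn new() -> Self {
        ProofBuilder { encoded_size: 0, nodes: Vec::new() }
    }

    /// Try to add the trie nodes proving one requested item. Returns `false`
    /// if doing so would push the response over the limit (unless this is the
    /// first item: a response must contain at least one requested item).
    fn try_add_item(&mut self, item_nodes: Vec<Vec<u8>>) -> bool {
        let added: usize = item_nodes.iter().map(|n| n.len()).sum();
        if !self.nodes.is_empty() && self.encoded_size + added > MAX_RESPONSE_SIZE {
            return false;
        }
        self.encoded_size += added;
        self.nodes.extend(item_nodes);
        true
    }
}

fn main() {
    let mut builder = ProofBuilder::new();
    // Stand-in for the decoded request and for trie lookups.
    let requested_items: Vec<Vec<Vec<u8>>> = vec![vec![vec![0u8; 1024]; 8]; 4];
    for (i, nodes) in requested_items.into_iter().enumerate() {
        if !builder.try_add_item(nodes) {
            println!("truncating reply at item {i}; the rest is left for a follow-up request");
            break;
        }
    }
}
```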

Malicious server nodes might truncate Merkle proofs even when they don't strictly need to, and it is not possible for the client to (easily) detect this situation. However, malicious server nodes can already do undesirable things such as throttle down their upload bandwidth or simply not respond. There is no need to handle unnecessarily truncated Merkle proofs any differently than a server simply not answering the request.


Agree with the malicious remark.
One thing that the client can see as malicious is restarting a truncated proof at the wrong node level (e.g. if the proof is nodes a-b-c and the first reply is a-b, a second reply of b-c is malicious).


This protocol keeps the same maximum response size limit as currently exists (16 MiB). It is not possible for the querier to know in advance whether its query will lead to a reply that exceeds the maximum size. If the reply is too large, the replier should send back only a limited number (but at least one) of requested items in the proof. The querier should then send additional requests for the rest of the items. A response containing none of the requested items is invalid.

The server is allowed to silently discard some keys of the request if it judges that the number of requested keys is too high. This is in line with the fact that the server might truncate the response.

In my implementation I do not allow restarting a proof with a wrong trie node. E.g. we query k1, k2 and k3; the proof for all of them is nodes a-b-c-d-e, while the proof for only k1 and k3 is a-b-d-e. If the first reply is a-b we get k1, but if the second reply is d-e we don't have k2 and verification fails.
So the only valid second reply is one starting with node c, followed by any useful nodes.
So I'm not really sure whether the intention was to allow discarding keys in the middle of a query, but I think it would be simpler to word it as "allowed to discard all following keys of the request and terminate the reply".


Yeah, I think this makes sense. Basically the server is allowed to drop keys, but we don't allow dropping random keys, so that keysAfter will not later include keys that were already sent.
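To illustrate the querier side of this truncation rule, here is a hedged sketch in Rust; the request/response types and `send_request` are hypothetical stand-ins, and only the retry logic follows the text quoted above (for descendant-range queries, `onlyKeysAfter` would presumably play the role of resuming after the last covered key):

```rust
// Hedged sketch of the querier's continuation loop.
struct StorageRequest {
    keys: Vec<Vec<u8>>,
}

struct ProvenResponse {
    /// Keys that the returned (verified) proof actually covers.
    proven_keys: Vec<Vec<u8>>,
}

// Placeholder for the real networking call plus proof verification.
fn send_request(req: &StorageRequest) -> ProvenResponse {
    ProvenResponse { proven_keys: req.keys.clone() }
}

fn fetch_all(keys: Vec<Vec<u8>>) -> Result<Vec<Vec<u8>>, &'static str> {
    let mut remaining = keys;
    let mut proven: Vec<Vec<u8>> = Vec::new();
    while !remaining.is_empty() {
        let resp = send_request(&StorageRequest { keys: remaining.clone() });
        // A response that covers none of the requested items is invalid.
        if resp.proven_keys.is_empty() {
            return Err("invalid response: no requested item covered");
        }
        // Ask again only for the keys that were not covered yet; per the
        // discussion above, the server may only drop a suffix of the request,
        // so already-proven keys are never re-requested.
        remaining.retain(|k| !resp.proven_keys.contains(k));
        proven.extend(resp.proven_keys);
    }
    Ok(proven)
}

fn main() {
    let keys = vec![b":code".to_vec(), b":heappages".to_vec()];
    assert_eq!(fetch_all(keys).unwrap().len(), 2);
}
```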

For the purpose of this networking protocol, it should be considered as if the main trie contained an entry for each default child trie whose key is `concat(":child_storage:default:", child_trie_hash)` and whose value is equal to the trie root hash of that default child trie. This behavior is consistent with what the host functions observe when querying the storage. This behavior is present in the existing networking protocol, in other words this proposal doesn't change anything to the situation, but it is worth mentioning.
Also note that child tries aren't considered as descendants of the main trie when it comes to the `includeDescendants` flag. In other words, if the request concerns the main trie, no content coming from child tries is ever sent back.

This protocol keeps the same maximum response size limit as currently exists (16 MiB). It is not possible for the querier to know in advance whether its query will lead to a reply that exceeds the maximum size. If the reply is too large, the replier should send back only a limited number (but at least one) of requested items in the proof. The querier should then send additional requests for the rest of the items. A response containing none of the requested items is invalid.

Here, in my implementation the replier is allowed to return a single trie node even if it does not add a reply to any key (e.g. a big branch node), as long as it makes the query reply progress.
(I can implement the constraint at the Substrate level though; at the trie level it makes sense to me to allow a chunk with any non-empty trie content that participates in query progress.)

@shawntabrizi (Member)

This RFC was brought up at the current Parity retreat as an important unblocker for teams like PAPI and Chopsticks, who seem to be pushing the boundaries of light client support today.

@cheme are any of your comments blockers in your head, or just comments?

@tomaka are you okay with the current status, and should we put it to a vote?

Seems like if there is further iteration needed, we could benefit from merging this one now, and doing improvements in a follow up RFC, assuming nothing is too different than what is already here.

cc @bkchr @ggwpez

@tomaka (Contributor, Author) commented Dec 5, 2024

The RFC is good as is and would be an improvement over the current protocol.

The remaining open questions are about how to enforce a maximum response size; however, the existing protocol also suffers from this maximum response size problem.
In other words, this RFC doesn't open any new problem, but it also doesn't solve all the problems of the existing protocol. It would have been nice to solve all the problems, but that remains unresolved.

@shawntabrizi (Member) commented Dec 5, 2024

@tomaka would you then do the honors of /rfc propose? :)

Let's take it to a vote!

@tomaka (Contributor, Author) commented Dec 5, 2024

/rfc propose

@paritytech-rfc-bot (Contributor)

Hey @tomaka, here is a link you can use to create the referendum aiming to approve this RFC number 0009.

Instructions
  1. Open the link.
  2. Switch to the Submission tab.
  3. Adjust the transaction if needed (for example, the proposal Origin).
  4. Submit the transaction.


It is based on commit hash 228d53b9fbbb829852802fdb06af76524bc269dc.

The proposed remark text is: RFC_APPROVE(0009,96353a0a10c3634c6757315630711354e5fdaf1159823a888290c0116952a480).

github-actions bot commented Dec 5, 2024

Voting for this referendum is ongoing.

Vote for it here

@anaelleltd added the Reviewed (Is ready for on-chain voting.) label and removed the Stale (Is no longer pursued.) label on Dec 5, 2024
@cheme commented Dec 9, 2024

> This RFC was brought up at the current Parity retreat as an important unblocker for teams like PAPI and Chopsticks, who seem to be pushing the boundaries of light client support today.
>
> @cheme are any of your comments blockers in your head, or just comments?
>
> @tomaka are you okay with the current status, and should we put it to a vote?
>
> Seems like if there is further iteration needed, we could benefit from merging this one now, and doing improvements in a follow up RFC, assuming nothing is too different than what is already here.
>
> cc @bkchr @ggwpez

Just comments/ideas, no blockers.

@bkchr (Contributor) left a comment

Generally looking good. Left some questions. It would be nice to get some answers before I vote on chain; they may require a new proposal if accepted.


If `onlyKeysAfterIgnoreLastNibble` is missing, it is equivalent to `false`. If `onlyKeysAfterIgnoreLastNibble` is `true` and `onlyKeysAfter` is missing or empty, then the request is invalid.
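A minimal sketch of this validity rule (the struct is a hypothetical decoded form of the request, not the actual protobuf schema):

```rust
// Hypothetical decoded request; field names mirror `onlyKeysAfter` and
// `onlyKeysAfterIgnoreLastNibble` from the RFC text above.
struct KeysRequest {
    only_keys_after: Option<Vec<u8>>,
    only_keys_after_ignore_last_nibble: Option<bool>,
}

fn validate(req: &KeysRequest) -> Result<(), &'static str> {
    // A missing `onlyKeysAfterIgnoreLastNibble` is equivalent to `false`.
    let ignore_last_nibble = req.only_keys_after_ignore_last_nibble.unwrap_or(false);
    // `true` combined with a missing or empty `onlyKeysAfter` makes the request invalid.
    let only_keys_after_empty = req.only_keys_after.as_deref().map_or(true, |k| k.is_empty());
    if ignore_last_nibble && only_keys_after_empty {
        return Err("onlyKeysAfterIgnoreLastNibble is true but onlyKeysAfter is missing or empty");
    }
    Ok(())
}

fn main() {
    let bad = KeysRequest { only_keys_after: None, only_keys_after_ignore_last_nibble: Some(true) };
    assert!(validate(&bad).is_err());
    let ok = KeysRequest { only_keys_after: Some(vec![0x12]), only_keys_after_ignore_last_nibble: Some(true) };
    assert!(validate(&ok).is_ok());
}
```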

For the purpose of this networking protocol, it should be considered as if the main trie contained an entry for each default child trie whose key is `concat(":child_storage:default:", child_trie_hash)` and whose value is equal to the trie root hash of that default child trie. This behavior is consistent with what the host functions observe when querying the storage. This behavior is present in the existing networking protocol, in other words this proposal doesn't change anything to the situation, but it is worth mentioning.

concat(":child_storage:default:", child_trie_hash) this is wrong? The storage key in the main trie should be

concat(":child_storage:default:", child_storage_key)

And the value associated to this key is the hash of the child trie. Or I just don't get what you want to say with this sentence.

@tomaka (Contributor, Author)

I think that's just a typo. The intent is that this mimics exactly what the runtime sees.

If `onlyKeysAfterIgnoreLastNibble` is missing, it is equivalent to `false`. If `onlyKeysAfterIgnoreLastNibble` is `true` and `onlyKeysAfter` is missing or empty, then the request is invalid.

For the purpose of this networking protocol, it should be considered as if the main trie contained an entry for each default child trie whose key is `concat(":child_storage:default:", child_trie_hash)` and whose value is equal to the trie root hash of that default child trie. This behavior is consistent with what the host functions observe when querying the storage. This behavior is present in the existing networking protocol, in other words this proposal doesn't change anything to the situation, but it is worth mentioning.
Also note that child tries aren't considered as descendants of the main trie when it comes to the `includeDescendants` flag. In other words, if the request concerns the main trie, no content coming from child tries is ever sent back.

But the roots of the child tries are included in the response, as they are part of the main trie?

@tomaka (Contributor, Author) commented Dec 9, 2024

Yes, concat(":child_storage:default:", child_storage_key) is part of the main trie.

What this small text clarifies is that the content of the child trie isn't considered as a descendant of concat(":child_storage:default:", child_storage_key).
Some people might otherwise interpret that includeDescendants includes the content of child tries.
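A small sketch of the convention being clarified here, using the corrected form of the key (the child trie's storage key rather than its hash); the `:child_storage:default:` prefix is the real well-known prefix, everything else is illustrative:

```rust
// Illustrative only: how a default child trie shows up as an entry of the
// main trie for the purpose of this protocol.
const DEFAULT_CHILD_STORAGE_PREFIX: &[u8] = b":child_storage:default:";

/// Main-trie key under which a default child trie's root hash is exposed.
fn child_trie_entry_key(child_storage_key: &[u8]) -> Vec<u8> {
    let mut key = DEFAULT_CHILD_STORAGE_PREFIX.to_vec();
    key.extend_from_slice(child_storage_key);
    key
}

fn main() {
    let key = child_trie_entry_key(b"my_child_trie");
    // An `includeDescendants` query over the main trie may cover this entry
    // (whose value is the child trie's root hash), but never the content of
    // the child trie itself.
    println!("{}", String::from_utf8_lossy(&key));
}
```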




This proposal doesn't handle one specific situation: what if a proof containing a single specific item would exceed the response size limit? For example, if the response size limit was 1 MiB, querying the runtime code (which is typically 1.0 to 1.5 MiB) would be impossible as it's impossible to generate a proof less than 1 MiB. The response size limit is currently 16 MiB, meaning that no single storage item must exceed 16 MiB.

Unfortunately, because it's impossible to verify a Merkle proof before having received it entirely, parsing the proof in a streaming way is also not possible.

If you order the proof properly, you can also stream it to the receiver, who can check the proof on the fly. This could also work for bigger items, but yeah, there would be no way of knowing how big an item is before streaming it.
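As a rough sketch of that "check on the fly" idea (not of any real trie codec; the node format and the hash are made up here), a streaming verifier can accept a node only if its hash is already expected, starting from the trusted root:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

// Made-up node format: a list of referenced child hashes plus a payload.
struct Node {
    child_hashes: Vec<u64>,
    payload: Vec<u8>,
}

// Non-cryptographic stand-in for the real node hash.
fn node_hash(node: &Node) -> u64 {
    let mut h = DefaultHasher::new();
    node.child_hashes.hash(&mut h);
    node.payload.hash(&mut h);
    h.finish()
}

/// Streaming verifier: nodes must arrive in an order where every node's hash
/// is already expected (root first, then top-down). Anything else is rejected
/// immediately, without waiting for the rest of the proof.
struct StreamingVerifier {
    expected: HashSet<u64>,
}

impl StreamingVerifier {
    fn new(trusted_root: u64) -> Self {
        StreamingVerifier { expected: HashSet::from([trusted_root]) }
    }

    fn accept(&mut self, node: &Node) -> Result<(), &'static str> {
        let h = node_hash(node);
        if !self.expected.remove(&h) {
            return Err("node not reachable from the trusted root");
        }
        // The node is authenticated; its children become expected in turn.
        self.expected.extend(node.child_hashes.iter().copied());
        Ok(())
    }
}

fn main() {
    let leaf = Node { child_hashes: vec![], payload: b"value".to_vec() };
    let root = Node { child_hashes: vec![node_hash(&leaf)], payload: vec![] };
    let mut verifier = StreamingVerifier::new(node_hash(&root));
    assert!(verifier.accept(&root).is_ok());
    assert!(verifier.accept(&leaf).is_ok());
}
```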

@tomaka (Contributor, Author)

I don't remember what I had in mind when I wrote this, but yes I think the problem is about making sure that the sender isn't sending random garbage data. If the sender says that a storage item is 2 GiB, the receiver will have to receive the full 2 GiB before being able to detect whether it was legitimate or not.

@tomaka (Contributor, Author) commented Dec 9, 2024

> the receiver will have to receive the full 2 GiB before being able to detect whether it was legitimate or not

That's not 100% true. In theory, the light client could simply pass the storage item through the JSON-RPC API to the code that decodes that storage item, and that code could detect early on whether the data makes sense (and maybe also store the decoded data in a more space-optimal way).
And if it turns out that the hash of the storage item doesn't match what is expected, the JSON-RPC server could tell the JSON-RPC client "wait, never mind, everything I sent is wrong".

But in practice that solution would be extremely complicated.

PR can be merged.

Write the following command to trigger the bot

/rfc process 0xcdad9ffe7f78f870f6188871d3d2bb338d1c1d081e6f0c1fc3f8075f4aa356a2

@bkchr (Contributor) commented Dec 10, 2024

/rfc process 0xcdad9ffe7f78f870f6188871d3d2bb338d1c1d081e6f0c1fc3f8075f4aa356a2

@paritytech-rfc-bot merged commit 73c9853 into polkadot-fellows:main on Dec 10, 2024
@paritytech-rfc-bot (Contributor)

The on-chain referendum has approved the RFC.

@tomaka deleted the rfc-9 branch on December 11, 2024, 09:22