Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

api: Push for data/shares with decoupled namespace #55

Open
Wondertan opened this issue May 4, 2022 · 2 comments
Open

api: Push for data/shares with decoupled namespace #55

Wondertan opened this issue May 4, 2022 · 2 comments
Assignees
Labels
enhancement New feature or request

Comments

@Wondertan
Copy link
Member

Wondertan commented May 4, 2022

Currently, tree.Push requires passing namespace.PrefixedData, which is a byte slice prepending namespace.ID. However, there is a use case where we don't need to push data that is not prepended with a namespace.ID, and it can be passed as a separate field. This is the case where we push parity data generated via rsmt2d. Currently, we prepend parity namespace to the parity shares, which is not required. This is one of the issues causing us to store an additional 8 bytes of parity namespace per parity share via NMTWrapper, as outlined in celestiaorg/celestia-node#183. I propose adding an additional method like PushWith to the nmt.Tree to allow passing decoupled namespace from the share data to allow mentioned disk usage optimization.

@Wondertan Wondertan added the enhancement New feature or request label May 4, 2022
@adlerjohn adlerjohn moved this to TODO in Celestia Node May 4, 2022
@Wondertan Wondertan changed the title api: Push data with decoupled namespace api: Push for data/shares with decoupled namespace May 4, 2022
@liamsi
Copy link
Member

liamsi commented May 5, 2022

This is interesting. In the past the Push method used to have these already separated:

nmt/nmt.go

Line 254 in 6e8a6a5

func (n *NamespacedMerkleTree) Push(id namespace.ID, data []byte) error {

We can revert to that! Back then there was a lot of discussions around serialzation (of course there always is 🙈 ) and I think (not sure anymore) the type aliases and the API was original meant to be able to take in a type that defines its own serialzation. IMO, the tree should just take bytes anyways and serialzation is sth that should happen before and not inside the tree or anything like that.
Long story short: if we can save some storage and bandwidth with this simple change to the NMT, then let's do it! I propose to only have one push method instead an additional pushWith. Unless there is a good reason to keep both around.

@liamsi liamsi self-assigned this May 5, 2022
@liamsi
Copy link
Member

liamsi commented May 5, 2022

After discussing with @Wondertan, we realized that celestiaorg/celestia-node#183 can be tackled without modifying the NMT library at all (and instead by modifying this visitor). So we could put this on the back burner for now.

I still think the suggested change makes sense. It would also allow us to delete the PrefixedData type entirely. But it might make sense to do this as part of a somewhat larger refactoring (can/should be split into several PRs) where we also tackle #15. In it's current form and because of the usage of this interface we still need to do a copy which merges the nid and the data. Hlib convinced me that it is better to hide this copying inside of the NMT though. Currently, it happens in the wrapper which is bad usability. Especially as Hlib put it:

The reason is; right now, we can avoid storing unnecessary data, but it's hacky. Meaning that on Push we need to prepend, and on Visit remove that we prepend so we don't store it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
No open projects
Status: TODO
Development

Successfully merging a pull request may close this issue.

2 participants