-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove requirement for lexigraphic ordering of JSON file. #9
Comments
Well it IS very important that the fields be sorted lexicographically when calculating the contract hash. If the asset registry now requires submitting contracts in correct ordering (I don't know if it does), it would be to avoid users calculating the wrong contract hash. If they would hash the JSON they inputted but it's not ordered correctly, they would have a different contract hash (and thus asset ID) as the registry. The Elements RPC currently takes a contract hash as input. An alternative would be for the registry to have some |
The contract hash needs to be calculated on the full contract's hash, byte for byte. The only requirement is that the registry is able to store the full text that is hashed, and that it follows JSON standards. Adding arbitrary undocumented requirements reduces usability. |
It does not, the lexicographical ordering requirement is only necessary for the purpose of calculating the hash. You can submit the JSON contract itself in whatever order you want. (maybe there are some instructions somewhere that say otherwise? I didn't write these so I'm not sure.) The reason for this requirement is that JSON doesn't normally have any defined field ordering (from the JSON specs, RFC 7159: "An object is an unordered collection of zero or more name/value pairs"). In practice the ordering varies between different JSON parser/serializer implementation. Some of them order them lexicographically, some retain insertion order, and for some its just undefined and arbitrary. It's possible that de-serializing and re-serializing the same JSON object with the same implementation will spit back different ordering, which makes this very prone to errors when (de)serializing and moving JSON objects around. Requiring an explicit well-defined order avoids these issues by having one canonical serialization format that everyone should use, regardless of their specific environment and parser. Without a canonical format, people trying to validate the contract commitment are very likely to get tripped by this. Another consideration related to the registry implementation itself is that we use
This repo actually has a CLI utility that can do that, documented here. Another option is to use a standalone library like json-stable-stringify, like so: |
When the hash does not match what the user created, then we have an issue.
There is no need to serialize, only deserialize. As long as the contract is saved in its original format, and the useful information is extracted. This adds compelxity and burden to the user for the benefit of potentially simpler code (until a newer version of the library breaks all of this anyway by making some minor but valid JSON change). Having a command line tool doesn't help use cases where people are generating themselves and are not calling from the command line tool. This behavior means everyone must match EXACTLY what serde is doing for serialization (and all versions of serde need to match it in the future, something I highly doubt they guarantee since JSON is not intended to be canoicalized. |
When only considering specifically what the server-side registry has to do during asset registration, yes. But I'm also considering third-parties writing code to register assets or to validate them, in which case they'll be dealing with JSON objects that move around, and are very likely to get tripped by a JSON implementation that uses different sorting. Retaining the issuer's original JSON ordering would basically mean that everyone will always have to store and move the JSONs around as strings rather than as actual objects. The moment you try to deserialize the string into a JSON object, it'll lose the original ordering information with no way to recover it. So for example if you're storing the contract JSON in a database, you would have to use string fields and not the specialized json field type, meaning that you can't utilize any special json capabilities. Or you'll have to store it twice and make sure the two versions are in-sync. And as mentioned in my previous comment, the same goes for the API layer, or any other layer that deals with these JSONs.
The benefit is not simpler code, it is having canonical encoding to reduce the foot-gun potential of undefined field ordering leading to errors.
No, it means that everyone has to use the typical lexicographical sorting (which is quite common with JSON for purposes like these and has implementations in many languages, like json-stable-stringify for javascript), and that if serde ever changes to do something different then we'll have to change the registry implementation to do lexicographical sorting using some other way. |
This behavior has already resulted in footguns, with users getting unexpected behavior, or issuing assets that are perfectly valid JSON contracts that cannot be registered.
Since the JSON can have arbitrary extra fields created by the user, there's no need to store this as a database - we only need the 5 or so fields that we actually care about. Right now this tool is unusable by most people who deviate from our simple example. |
@shesek, serde actually doesn't do lexicographical ordering at all. For But f.e. if you have a Rust struct, it might just use the struct's field order, so you'd have to serialize, deserialize as For the record, I'm quite indifferent on this. I think there's merit to either side. Regardless of what we decide, I think a |
Again, I was thinking about this more broadly, not just our own usage but also third parties dealing with this contract json in ways that are unknown to us.
Right now there are clear and explicit instructions that people can follow to make sure they use the same serialization as everyone else will be using. There are plenty of ways to lexicographically order
As far as I can tell, serde (when compiled without the |
To summarize: the current implementation lets you store/process/transport the contract as an actual json object, with a convention that allows everyone to serialize it in the exact same way using information available from the object representation alone. The alternative you're proposing would require to always store/process/transport the contract as a plain string and be aware that you're not treating it as an object in places where you shouldn't, causing it to unintentionally lose the ordering information. This would either result in having two duplicated fields (in storage, in structs definitions, in the API layer, etc) or in additional implementation complexity/inefficiency/inconvenience to go from strings to objects on-the-fly instead of having the object representation directly available for use. |
I think here's where the problem lies. We don't have a clear documentation of what the registry does and checks, we only have an illustrated bash example. I think we need some registry technical documentation, if we want people to be able to issue and register assets for their own arbitrary applications |
I gave a shot at documenting the current expectations here, including some snippets for canonicalizing the json and getting the contract hash. |
Ah, I just realized I didn't document details about the domain ownership proof and asset deletion. I can complete that too if needed. |
@shesek there's no mention to the requirement of ticker uniqueness under each domain, or did I miss it?
I think it's worth having those too, if you can add them. |
Another thing not documented there is how to independently verify that the asset issuance commits to the metadata and that the domain ownership proof exists. This can be done with the CLI utility, but its not open-sourced yet... |
I'd prefer from our sample registered asset doc to reference directly to this repo and specifically to that document that Shesek prepared, rather than replicating info, so that issuers can also understand what's the registry at all. @allenpiscitello @greenaddress is there any blocker to open-sourcing this repo, other than reviewing before release and perfecting a deployment process? |
There's no reason to have a strict requirement of how someone constructs the JSON file as long as it has the valid fields. This leads to errors where people create assets that cannot be registered due to arbitrarily strict rules.
The text was updated successfully, but these errors were encountered: