propose car-metadata multicodec #334
When transporting car data, it can be useful to insert a block of data with key/value metadata signalling inline with the car data blocks. This PR allocates a codec `car-metadata` to signal such data as blocks within the stream of car blocks.
This is a bad idea.
Thank you @aschmahmann for a quick response and the pointer to the previous discussion about what qualifies as a codec.
In our use case, we want to append an extra metadata block at the end of the CARv1 file to provide a kind of checksum: the byte length of the CAR file, a hash of the CAR bytes, and a signature over the previous two fields. Do you consider such metadata to be control plane too?
Yeah, I'm not really a fan either, for the same reasons @aschmahmann outlined (and as per all the previous discussions on this stuff). Would it not be enough to just put a dag-cbor block at the end with a strongly typed schema that you can check? (as dag-json)
{
  "car-metadata/v1": {
    "properties": {
      "byte-length": 18390,
      "multihash": {
        "bytes": {
          "/": "..."
        }
      }
    },
    "signature": {
      "bytes": {
        "/": "..."
      }
    }
  }
}
(schema)
This was one of the original design goals of IPLD schemas: they should be fast and efficient enough to do these kinds of checks.
There's also the "messaging" hack I prototyped for CARv2 (ipld/go-car#322 and ipld/js-car#89) after a similar request came up from the web3.storage team, who wanted to put some metadata inside their CARs. It was never used, and it fits in before the data payload, which may not be ideal. But since a CARv2 wraps a CARv1, given a 2-step process it might make even more sense than this proposal. Here you're providing metadata about a thing within the thing itself, which is not ideal because the metadata needs to be excluded from the thing somehow. We'd also have to assume that some layer strips it cleanly, so that it doesn't end up making a mess of other layers of the stack where such a codec code would cause problems.
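The trailing-block check discussed above (byte length, multihash of the CAR bytes, and a strictly shaped record) can be sketched roughly as follows. This is only an illustration under stated assumptions: plain Python dicts stand in for the decoded dag-cbor block, the field names mirror the dag-json example in this thread, sha2-256 is assumed as the hash function, the function names are hypothetical, and signature verification is left out because the thread does not specify a scheme.

```python
import hashlib

# Multihash header for sha2-256: function code 0x12, digest length 0x20 (32 bytes).
SHA2_256_PREFIX = bytes([0x12, 0x20])


def build_metadata(car_bytes: bytes, signature: bytes = b"") -> dict:
    """Build the hypothetical trailing metadata record for a CAR payload."""
    multihash = SHA2_256_PREFIX + hashlib.sha256(car_bytes).digest()
    return {
        "car-metadata/v1": {
            "properties": {
                "byte-length": len(car_bytes),
                "multihash": multihash,
            },
            # Placeholder: the signature scheme is not defined in this thread.
            "signature": signature,
        }
    }


def validate_shape(metadata: object) -> dict:
    """Strict, schema-like shape check; returns the 'properties' map or raises ValueError."""
    if not isinstance(metadata, dict) or set(metadata) != {"car-metadata/v1"}:
        raise ValueError("expected a single 'car-metadata/v1' key")
    body = metadata["car-metadata/v1"]
    if not isinstance(body, dict) or set(body) != {"properties", "signature"}:
        raise ValueError("expected exactly 'properties' and 'signature'")
    props = body["properties"]
    if not isinstance(props, dict) or set(props) != {"byte-length", "multihash"}:
        raise ValueError("expected exactly 'byte-length' and 'multihash'")
    length = props["byte-length"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(length, bool) or not isinstance(length, int) or length < 0:
        raise ValueError("'byte-length' must be a non-negative integer")
    if not isinstance(props["multihash"], bytes):
        raise ValueError("'multihash' must be bytes")
    if not isinstance(body["signature"], bytes):
        raise ValueError("'signature' must be bytes")
    return props


def verify_car(car_bytes: bytes, metadata: object) -> bool:
    """Check that the payload matches the length and sha2-256 multihash in the record."""
    props = validate_shape(metadata)
    expected = SHA2_256_PREFIX + hashlib.sha256(car_bytes).digest()
    return props["byte-length"] == len(car_bytes) and props["multihash"] == expected
```

A reader of the CAR would first strip the trailing block, then call `verify_car` on the remaining bytes. Because the record is recognised by its shape rather than by a dedicated codec code, the block's CID can keep the ordinary dag-cbor codec.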
I'm sure @lidel can pull up some record of the discussions about putting a tombstone at the end of trustless gateway CAR responses, those are probably relevant here and maybe back on the table? If this becomes part of the protocol, then having a new codec code in CIDs is going to cause a bit of havoc for users—doing a simple But having said all that, if we do go with a tombstone, then a couple of extra thoughts:
I think the next step is to propose a spec+code change in I would appreciate pointers on other spec impact or parts of this project that should also be considered as we proceed.
Back to the basic request here of adding a new serialization/codec to the table: we can't do that where you're asking for a duplicate of an existing serialization method. We've got ample history of rejecting such additions where someone just wants a code to represent a codec+schema, which is what this is. So as long as this is, at base, cbor, json, or some other existing unschema'd codec, it can't go in. Schema signalling via codec code is a door that we've tried hard to keep closed, pushing people to keep that level of signalling out of the CID. The demand for such signalling tells us that there is a hole in the stack that people want utilities for, but that's where discussions about CIDv2 (and other methods) come in; jamming it into CIDv1 is going to make things much harder to manage.
In ipfs/specs#431, we pivoted to a different solution that does not require any new multicodec. I think this PR can be closed now.