-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
validate row schema #128
base: master
Are you sure you want to change the base?
validate row schema #128
Conversation
schema := jsonschema.Reflect(&table.valueEmpty) | ||
buf, _ := schema.MarshalJSON() | ||
if err := opt.DB.Set(bondTableKey(opt.TableID, rowSchema), buf, Sync); err != nil { | ||
panic(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we just return error here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think we can use jsonschema
here. We are using other serializers here and the tags will not be read.
We may need to use something else. At the same time, we would need to make sure that all of the serializers use the same tags. Or get rid of the serializers altogether and use only one of them. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it works event if JSON tags are not set. On the same note, we could have problem if the tag is set to json:"-"
.
I agree with setting one serializer. maybe right time, we introduce capnproto or flatbuffers. We could certainly use it's advantage for filtering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point is that the schema will change if the names in the tags will change. So if this is not checked then we have one big hole in our schema checker.
If it comes to serializers, It will be hard to convince Peter to change the serializer from cbor
. So must have good arguments and data that justify this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, you may investigate how much it would take and how much faster we would get if we switched to capnproto
. It probably would simplify schema control as it has schema files. How this would work in this context too?
We have benchmarks that can use any serializer, so it would be the best to compare those numbers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other solution is needed. The jsonschema
will not work with other serializers than json
.
btw ... I'm just catching up .. the idea of this PR is to query a table metadata of the schema for a row, and then compare it with the app..? this is a cool idea. And, the issue is we use pluggable serialization formats, as a result, we won't know the exact schema..? I wonder... with bond tables, don't we pass the serializer a table is using..? if so, then wouldn't we just include the serializer in the metadata, and compare the schema against that...? another idea, we could add struct tags like I don't think we need to rely on capnproto for this stuff..? as our schemas can vary..? but lmk what you guys are thinking |
@pkieltyka I & @poonai discussed |
hey guys -- yea I've heard of capnproto from way back.. it's just a bit weird for us to have so many serialization formats were using.. I know we have a pluggable serialization system which is nice, and then I guess we're potentially deciding to use one of them for this row schema validation? or are you mentioning something else, where we have a value from a row to compute its schema without deserializing the data..? if so, when is it useful for us to get the schema before we deserialize the entire value..? |
we were discussing for different use case. Not specific to schema. The ideas is to filter rows without deserializing. eg: query := Table.Query().Filter(cond.Func(func(row *Row) bool {
if row.Expired {
return false
}
return true
}) Most of our use cases involves filters, and we have to incur additional cost of serialization while filtering the irrelevant records. I and @marino39 were discussing for long time about the idea of filtering the rows without deserializing the rows. |
I think we will need to settle for one and remove pluggable serializers. Otherwise, it might be really hard to figure out this schema for all of them. Since we use |
No description provided.