Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preserve unrecognized logical types and properties #469

Merged
merged 6 commits into from
Oct 11, 2024

Conversation

jhump
Copy link
Contributor

@jhump jhump commented Oct 11, 2024

This updates the parsing of schema JSON documents and the way properties are possibly discarded (to avoid problematic properties that conflict with proper fields of a type and could result in invalid JSON if included when marshaling to JSON).

The tests now demonstrate that all properties and logical types are preserved as extra properties, even for unrecognized and incorrectly configured logical types, even for null types.

Just a note, in the parser, it should not be needed to add LogicalType where it does not naturally exist, and it will be added to Props as part of the ,remain.

I added these everywhere so that this property would be properly type-checked: if it is present, it should be a string. So now this will surface as an error if it's not a string, even for an "array" or "record" type.

Resolves #435.

schema_parse.go Outdated Show resolved Hide resolved
schema_parse.go Outdated Show resolved Hide resolved
…move LogicalType field from intermediate structs that don't really use it
@jhump jhump force-pushed the jh/preserve-unrecognized branch from 84d6ba2 to 7576339 Compare October 11, 2024 16:57
schema_parse.go Outdated
Comment on lines 656 to 665
func checkLogicalType(props map[string]any) error {
val, ok := props["logicalType"]
if !ok {
return nil
}
if _, isString := val.(string); !isString {
return fmt.Errorf(`"logicalType" attribute must be a string, got %T`, val)
}
return nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of interest, what necessitates logicalType in this context be a string? At this point it is metadata, and by spec could be any type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for consistency. The way logicalType is parsed for other fixed and primitive types, it will be an error if it is not a string. Why would it be an error there but not for the other kinds of types? The spec doesn't say anything about the attribute being specific to fixed or primitive types:

A logical type is an Avro primitive or complex type with extra attributes to represent a derived type. The attribute logicalType must always be present for a logical type, and is a string with the name of one of the logical types listed later in this section. Other attributes may be defined for particular logical types.

So if it's not an error here, and we allow treating a non-string value as an extra property, then I should probably also change the way fixed and primitive types are parsed, to similarly allow non-string values which would result in it just being another property.

Copy link
Contributor Author

@jhump jhump Oct 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I reworked it so that logical type does not have to be a string; it's just another property. For fixed and primitive types, if it's not a string, it's also tossed into the set of other properties and otherwise ignored.

Does that look better?

Copy link
Member

@nrwiersma nrwiersma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nrwiersma nrwiersma merged commit 631f6dd into hamba:main Oct 11, 2024
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

logicalType - keep property
2 participants