field [fieldName] is expected to be one of these: RECORD, NULL, for nested record with non defined nullable values #9

Open
gadaldo opened this issue Oct 27, 2016 · 3 comments

@gadaldo

gadaldo commented Oct 27, 2016

I get the following error when I have nested records in which some of the nullable fields are not specified:

Could not evaluate union, field [fieldName] is expected to be one of these: RECORD, NULL. If this is a complex type, check if offending field: trafficSource.adwordsClickInfo adheres to schema.

Schema sample:

{
    "type": "record",
    "name": "Root",
    "fields": [
        {
            "name": "field1",
            "type": [
                "long",
                "null"
            ]
        },
        {
            "name": "nestedRecord",
            "type": [
                {
                    "type": "record",
                    "namespace": "root",
                    "name": "NestedRecord",
                    "fields": [
                        {
                            "name": "nested1",
                            "type": [
                                "long",
                                "null"
                            ]
                        },
                        {
                            "name": "nested2",
                            "type": [
                                "long",
                                "null"
                            ]
                        }
                    ]
                },
                "null"
            ]
        }
    ]
}

and a JSON string such as:

{
    "field1" : 10999859003, 
    "nestedRecord": 
    { 
        "nested1" : 123321321 
    }
}

I think the converter fails to skip missing values once it recurses into a nested record, even though it does skip missing values at level 0 (the top-level record).

Thank you

@jghoman
Contributor

jghoman commented Dec 16, 2016

Hey @gadaldo,
In this case, the error json-avro-convert throws is correct: nested2 is not defined in your sample JSON, and the schema provides no default for it. Avro should accept this datum only if one of those two conditions is met (the schema supplies a default, or the JSON supplies a value), and I've verified that it does in both cases with this code:

**(using a default value)**
```groovy
    def 'should convert nested nullable records'() {
        given:
        def schema = '''
            {
                "type": "record",
                "name": "Root",
                "fields": [
                    {
                        "name": "field1",
                        "type": [
                            "long",
                            "null"
                        ]
                    },
                    {
                        "name": "nestedRecord",
                        "type": [
                            {
                                "type": "record",
                                "namespace": "root",
                                "name": "NestedRecord",
                                "fields": [
                                    {
                                        "name": "nested1",
                                        "type": [
                                            "long",
                                            "null"
                                        ]
                                    },
                                    {
                                        "name": "nested2",
                                        "type": [
                                            "long",
                                            "null"
                                        ],
                                        "default": 42
                                    }
                                ]
                            },
                            "null"
                        ]
                    }
                ]
            }
        '''

        def json = '''
            {
                "field1" : 10999859003,
                "nestedRecord":
                {
                    "nested1" : 123321321,
                    "nested2" : 42
                }
            }
        '''

        when:
        def result = converter.convertToJson(converter.convertToAvro(json.bytes, schema), schema)

        then:
        toMap(result) == toMap(json)
    }
```

**(using a provided value)**
```groovy
    def 'should convert nested nullable records2'() {
        given:
        def schema = '''
            {
                "type": "record",
                "name": "Root",
                "fields": [
                    {
                        "name": "field1",
                        "type": [
                            "long",
                            "null"
                        ]
                    },
                    {
                        "name": "nestedRecord",
                        "type": [
                            {
                                "type": "record",
                                "namespace": "root",
                                "name": "NestedRecord",
                                "fields": [
                                    {
                                        "name": "nested1",
                                        "type": [
                                            "long",
                                            "null"
                                        ]
                                    },
                                    {
                                        "name": "nested2",
                                        "type": [
                                            "long",
                                            "null"
                                        ]
                                    }
                                ]
                            },
                            "null"
                        ]
                    }
                ]
            }
        '''

        def json = '''
            {
                "field1" : 10999859003,
                "nestedRecord":
                {
                    "nested1" : 123321321,
                    "nested2" : 43
                }
            }
        '''

        when:
        def result = converter.convertToJson(converter.convertToAvro(json.bytes, schema), schema)

        then:
        toMap(result) == toMap(json)
    }
```
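
For readers following the thread, the rule described in this comment (a field may be absent from the JSON only if the schema gives it a default) can be sketched as a small recursive check. This is an illustration in plain Python, not the converter's code: `find_missing_fields` and `union_branches` are hypothetical helper names, and the schema and datum are the ones from the original report. The variant at the end assumes the traditional Avro requirement that a union default match the first branch of the union, which is why `"null"` is moved to the front there.

```python
import copy

# Schema and datum copied from the original report.
NESTED_RECORD = {
    "type": "record",
    "namespace": "root",
    "name": "NestedRecord",
    "fields": [
        {"name": "nested1", "type": ["long", "null"]},
        {"name": "nested2", "type": ["long", "null"]},
    ],
}
SCHEMA = {
    "type": "record",
    "name": "Root",
    "fields": [
        {"name": "field1", "type": ["long", "null"]},
        {"name": "nestedRecord", "type": [NESTED_RECORD, "null"]},
    ],
}
DATUM = {"field1": 10999859003, "nestedRecord": {"nested1": 123321321}}


def union_branches(field_type):
    """Return the branches of a type as a list (a non-union type has one branch)."""
    return field_type if isinstance(field_type, list) else [field_type]


def find_missing_fields(schema, datum, path=""):
    """List fields that are absent from `datum` and have no "default" in `schema`."""
    missing = []
    for field in schema["fields"]:
        name = field["name"]
        full_name = f"{path}.{name}" if path else name
        if name not in datum:
            if "default" not in field:
                missing.append(full_name)
            continue
        value = datum[name]
        # Recurse only when the value is an object and the type has a record branch.
        if isinstance(value, dict):
            for branch in union_branches(field["type"]):
                if isinstance(branch, dict) and branch.get("type") == "record":
                    missing.extend(find_missing_fields(branch, value, full_name))
    return missing


# Variant (assumption, not from the thread): give nested2 a null default.
# "null" goes first so that the default matches the first union branch.
SCHEMA_WITH_DEFAULT = copy.deepcopy(SCHEMA)
nested2 = SCHEMA_WITH_DEFAULT["fields"][1]["type"][0]["fields"][1]
nested2["type"] = ["null", "long"]
nested2["default"] = None

print(find_missing_fields(SCHEMA, DATUM))               # ['nestedRecord.nested2']
print(find_missing_fields(SCHEMA_WITH_DEFAULT, DATUM))  # []
```

Under this rule the reported datum is rejected only because of `nestedRecord.nested2`; declaring a default for that field, or sending a value for it, makes the check pass.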

@gadaldo
Author

gadaldo commented Dec 16, 2016

At level 0 of the tree it does skip them; the problem only appears when the algorithm recurses into a nested record.
Anyway, I created my own version because I needed it. That JSON comes from a TableRow object when reading from BigQuery (with BigQueryIO), and I have to transform it to Avro. It's something Google does behind the scenes, but they don't want to expose the API since they do a further intermediate transformation to proto, as documented here.
So I based mine on this algorithm, but I don't know whether you can close the issue.
Thank you anyway.
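
The workaround mentioned in this comment was not posted, so the following is only a guess at its general shape: a pre-processing pass that walks the schema and sets every absent nullable field to null at every nesting level before handing the JSON to a converter. It is plain Python over dictionaries; `fill_missing_nullables`, `is_nullable`, and `record_branch` are hypothetical names, and the schema is the one from the original report.

```python
# Schema from the original report, written as a Python dict.
SCHEMA = {
    "type": "record",
    "name": "Root",
    "fields": [
        {"name": "field1", "type": ["long", "null"]},
        {
            "name": "nestedRecord",
            "type": [
                {
                    "type": "record",
                    "namespace": "root",
                    "name": "NestedRecord",
                    "fields": [
                        {"name": "nested1", "type": ["long", "null"]},
                        {"name": "nested2", "type": ["long", "null"]},
                    ],
                },
                "null",
            ],
        },
    ],
}


def is_nullable(field_type):
    """True when the type is a union that contains "null"."""
    return isinstance(field_type, list) and "null" in field_type


def record_branch(field_type):
    """Return the first record schema among the type's branches, or None."""
    branches = field_type if isinstance(field_type, list) else [field_type]
    for branch in branches:
        if isinstance(branch, dict) and branch.get("type") == "record":
            return branch
    return None


def fill_missing_nullables(schema, datum):
    """Return a copy of `datum` with absent nullable fields set to None, recursively."""
    filled = dict(datum)
    for field in schema["fields"]:
        name, field_type = field["name"], field["type"]
        if name not in filled:
            if is_nullable(field_type):
                filled[name] = None
            continue
        nested_schema = record_branch(field_type)
        if nested_schema is not None and isinstance(filled[name], dict):
            # Apply the same rule one level down.
            filled[name] = fill_missing_nullables(nested_schema, filled[name])
    return filled


datum = {"field1": 10999859003, "nestedRecord": {"nested1": 123321321}}
print(fill_missing_nullables(SCHEMA, datum))
# {'field1': 10999859003, 'nestedRecord': {'nested1': 123321321, 'nested2': None}}
```

Fields that are absent and not nullable are left untouched, so a converter would still report them as errors.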

@jainanuj07

Has this issue been fixed? Please let me know how to resolve it.
