[SPARK-47739][SQL] Register logical avro type
### What changes were proposed in this pull request?

This pull request registers the logical Avro types when `AvroUtils` and `AvroFileFormat` are initialized. Without this registration, the first schema discovery after Spark starts can return the wrong result.

<img width="1727" alt="image" src="https://github.com/apache/spark/assets/150366084/3eaba6e3-34ec-4ca9-ae89-d0259ce942ba">

Example:

```scala
val new_schema = """
  | {
  |   "type": "record",
  |   "name": "Entry",
  |   "fields": [
  |     {
  |       "name": "rate",
  |       "type": [
  |         "null",
  |         {
  |           "type": "long",
  |           "logicalType": "custom-decimal",
  |           "precision": 38,
  |           "scale": 9
  |         }
  |       ],
  |       "default": null
  |     }
  |   ]
  | }""".stripMargin

// First call after Spark starts: maps to long - WRONG
spark.read.format("avro").option("avroSchema", new_schema).load().printSchema
// Second, identical call: maps to Decimal - CORRECT
spark.read.format("avro").option("avroSchema", new_schema).load().printSchema
```

### Why are the changes needed?

To fix an issue with resolving the Avro schema on Spark startup.

### Does this PR introduce _any_ user-facing change?

No, it's a bug fix.

### How was this patch tested?

Unit tests.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#45895 from milastdbx/dev/milast/fixAvroLogicalTypeRegistration.

Lead-authored-by: milastdbx <[email protected]>
Co-authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>