[DNM] test from_json #30801

Triggered via pull request December 24, 2024 06:52
@zhouyuanzhouyuan
synchronize #8318
Status Success
Total duration 18s
Artifacts

dev_cron.yml

on: pull_request_target
Process
8s

Annotations

50 errors and 1 warning
GlutenJsonFunctionsSuite.from_json invalid json - check modes: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$7655/601334808@72ac61f2))] +- LocalRelation [value#718151] == Analyzed Logical Plan == from_json(value): struct<a:int,b:int,_unparsed:string> Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#718151, Some(America/Los_Angeles)) AS from_json(value)#718154] +- LocalRelation [value#718151] == Optimized Logical Plan == Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#718151, Some(America/Los_Angeles)) AS from_json(value)#718154] +- LocalRelation [value#718151] == Physical Plan == VeloxColumnarToRow +- ^(26274) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#718151, Some(America/Los_Angeles)) AS from_json(value)#718154] +- ^(26274) InputIteratorTransformer[value#718151] +- RowToVeloxColumnar +- LocalTableScan [value#718151] == Results == == Results == !== Correct Answer - 2 == == Gluten Answer - 2 == !struct<> struct<from_json(value):struct<a:int,b:int,_unparsed:string>> [[2,12,null]] [[2,12,null]] ![[null,null,{"a" 1, "b": 11}]] [[null,null,null]]
GlutenJsonFunctionsSuite.corrupt record column in the middle: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$7655/601334808@72ac61f2))] +- LocalRelation [value#718160] == Analyzed Logical Plan == from_json(value): struct<a:int,_unparsed:string,b:int> Project [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#718160, Some(America/Los_Angeles)) AS from_json(value)#718163] +- LocalRelation [value#718160] == Optimized Logical Plan == Project [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#718160, Some(America/Los_Angeles)) AS from_json(value)#718163] +- LocalRelation [value#718160] == Physical Plan == VeloxColumnarToRow +- ^(26276) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#718160, Some(America/Los_Angeles)) AS from_json(value)#718163] +- ^(26276) InputIteratorTransformer[value#718160] +- RowToVeloxColumnar +- LocalTableScan [value#718160] == Results == == Results == !== Correct Answer - 2 == == Gluten Answer - 2 == !struct<> struct<from_json(value):struct<a:int,_unparsed:string,b:int>> [[2,null,12]] 
[[2,null,12]] ![[null,{"a" 1, "b": 11},null]] [[null,null,null]]
GlutenJsonFunctionsSuite.SPARK-33134: return partial results only for root JSON objects: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(data,StructType(StructField(c1,LongType,true), StructField(c2,ArrayType(StructType(StructField(c3,LongType,true), StructField(c4,StringType,true)),true),true)),true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$7655/601334808@72ac61f2))] +- Project [value#718415 AS c0#718418] +- LocalRelation [value#718415] == Analyzed Logical Plan == from_json(c0): struct<data:struct<c1:bigint,c2:array<struct<c3:bigint,c4:string>>>> Project [from_json(StructField(data,StructType(StructField(c1,LongType,true), StructField(c2,ArrayType(StructType(StructField(c3,LongType,true), StructField(c4,StringType,true)),true),true)),true), c0#718418, Some(America/Los_Angeles)) AS from_json(c0)#718420] +- Project [value#718415 AS c0#718418] +- LocalRelation [value#718415] == Optimized Logical Plan == Project [from_json(StructField(data,StructType(StructField(c1,LongType,true), StructField(c2,ArrayType(StructType(StructField(c3,LongType,true), StructField(c4,StringType,true)),true),true)),true), value#718415, Some(America/Los_Angeles)) AS from_json(c0)#718420] +- LocalRelation [value#718415] == Physical Plan == VeloxColumnarToRow +- ^(26303) ProjectExecTransformer [from_json(StructField(data,StructType(StructField(c1,LongType,true), StructField(c2,ArrayType(StructType(StructField(c3,LongType,true), StructField(c4,StringType,true)),true),true)),true), value#718415, Some(America/Los_Angeles)) AS from_json(c0)#718420] +- ^(26303) 
InputIteratorTransformer[value#718415] +- RowToVeloxColumnar +- LocalTableScan [value#718415] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(c0):struct<data:struct<c1:bigint,c2:array<struct<c3:bigint,c4:string>>>>> ![[null]] [[[123456,null]]]
GlutenJsonExpressionsSuite.from_json - input=array, schema=struct, output=single row: org/apache/spark/sql/catalyst/expressions/GlutenJsonExpressionsSuite#L21
Incorrect evaluation: from_json(StructField(a,IntegerType,true), StructField(corrupted,StringType,true), (columnNameOfCorruptRecord,corrupted), [{"a": 1}, {"a": 2}], Some(UTC)), actual: [null,], expected: [null,[{"a": 1}, {"a": 2}]]
GlutenJsonFunctionsSuite.from_json invalid json - check modes: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8327/294283693@47400946))] +- LocalRelation [value#788444] == Analyzed Logical Plan == from_json(value): struct<a:int,b:int,_unparsed:string> Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#788444, Some(America/Los_Angeles)) AS from_json(value)#788447] +- LocalRelation [value#788444] == Optimized Logical Plan == Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#788444, Some(America/Los_Angeles)) AS from_json(value)#788447] +- LocalRelation [value#788444] == Physical Plan == VeloxColumnarToRow +- ^(30133) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#788444, Some(America/Los_Angeles)) AS from_json(value)#788447] +- ^(30133) InputIteratorTransformer[value#788444] +- RowToVeloxColumnar +- LocalTableScan [value#788444] == Results == == Results == !== Correct Answer - 2 == == Gluten Answer - 2 == !struct<> struct<from_json(value):struct<a:int,b:int,_unparsed:string>> [[2,12,null]] [[2,12,null]] ![[null,null,{"a" 1, "b": 11}]] [[null,null,null]]
GlutenJsonFunctionsSuite.SPARK-36069: from_json invalid json schema - check field name and field value: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8327/294283693@47400946))] +- LocalRelation [value#788453] == Analyzed Logical Plan == from_json(value): struct<a:int,b:int,_unparsed:string> Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#788453, Some(America/Los_Angeles)) AS from_json(value)#788456] +- LocalRelation [value#788453] == Optimized Logical Plan == Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#788453, Some(America/Los_Angeles)) AS from_json(value)#788456] +- LocalRelation [value#788453] == Physical Plan == VeloxColumnarToRow +- ^(30135) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#788453, Some(America/Los_Angeles)) AS from_json(value)#788456] +- ^(30135) InputIteratorTransformer[value#788453] +- RowToVeloxColumnar +- LocalTableScan [value#788453] == Results == == Results == !== Correct Answer - 2 == == Gluten Answer - 2 == !struct<> struct<from_json(value):struct<a:int,b:int,_unparsed:string>> [[2,12,null]] [[2,12,null]] ![[null,11,{"a": "1", "b": 11}]] [[null,11,null]]
GlutenJsonFunctionsSuite.corrupt record column in the middle: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8327/294283693@47400946))] +- LocalRelation [value#788462] == Analyzed Logical Plan == from_json(value): struct<a:int,_unparsed:string,b:int> Project [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#788462, Some(America/Los_Angeles)) AS from_json(value)#788465] +- LocalRelation [value#788462] == Optimized Logical Plan == Project [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#788462, Some(America/Los_Angeles)) AS from_json(value)#788465] +- LocalRelation [value#788462] == Physical Plan == VeloxColumnarToRow +- ^(30137) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#788462, Some(America/Los_Angeles)) AS from_json(value)#788465] +- ^(30137) InputIteratorTransformer[value#788462] +- RowToVeloxColumnar +- LocalTableScan [value#788462] == Results == == Results == !== Correct Answer - 2 == == Gluten Answer - 2 == !struct<> struct<from_json(value):struct<a:int,_unparsed:string,b:int>> [[2,null,12]] 
[[2,null,12]] ![[null,{"a" 1, "b": 11},null]] [[null,null,null]]
GlutenJsonFunctionsSuite.SPARK-33134: return partial results only for root JSON objects: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$8327/294283693@47400946))] +- Project [value#788717 AS c0#788720] +- LocalRelation [value#788717] == Analyzed Logical Plan == from_json(c0): struct<data:struct<c1:bigint,c2:array<struct<c3:bigint,c4:string>>>> Project [from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), c0#788720, Some(America/Los_Angeles)) AS from_json(c0)#788722] +- Project [value#788717 AS c0#788720] +- LocalRelation [value#788717] == Optimized Logical Plan == Project [from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), value#788717, Some(America/Los_Angeles)) AS from_json(c0)#788722] +- LocalRelation [value#788717] == Physical Plan == VeloxColumnarToRow +- ^(30164) ProjectExecTransformer [from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), value#788717, Some(America/Los_Angeles)) AS from_json(c0)#788722] +- ^(30164) InputIteratorTransformer[value#788717] 
+- RowToVeloxColumnar +- LocalTableScan [value#788717] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(c0):struct<data:struct<c1:bigint,c2:array<struct<c3:bigint,c4:string>>>>> ![[null]] [[[123456,null]]]
GlutenJsonExpressionsSuite.from_json - input=array, schema=struct, output=single row: org/apache/spark/sql/catalyst/expressions/GlutenJsonExpressionsSuite#L21
Incorrect evaluation: from_json(StructField(a,IntegerType,true), StructField(corrupted,StringType,true), (columnNameOfCorruptRecord,corrupted), [{"a": 1}, {"a": 2}], Some(UTC)), actual: [null,], expected: [null,[{"a": 1}, {"a": 2}]]
GlutenJsonFunctionsSuite.from_json with option (allowComments): org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowComments,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))] +- LocalRelation [value#616062] == Analyzed Logical Plan == from_json(value): struct<str:string> Project [from_json(StructField(str,StringType,true), (allowComments,true), value#616062, Some(America/Los_Angeles)) AS from_json(value)#616065] +- LocalRelation [value#616062] == Optimized Logical Plan == Project [from_json(StructField(str,StringType,true), (allowComments,true), value#616062, Some(America/Los_Angeles)) AS from_json(value)#616065] +- LocalRelation [value#616062] == Physical Plan == VeloxColumnarToRow +- ^(32271) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowComments,true), value#616062, Some(America/Los_Angeles)) AS from_json(value)#616065] +- ^(32271) InputIteratorTransformer[value#616062] +- RowToVeloxColumnar +- LocalTableScan [value#616062] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(value):struct<str:string>> ![[World]] [[null]]
GlutenJsonFunctionsSuite.from_json with option (allowUnquotedFieldNames): org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowUnquotedFieldNames,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))] +- LocalRelation [value#616071] == Analyzed Logical Plan == from_json(value): struct<str:string> Project [from_json(StructField(str,StringType,true), (allowUnquotedFieldNames,true), value#616071, Some(America/Los_Angeles)) AS from_json(value)#616074] +- LocalRelation [value#616071] == Optimized Logical Plan == Project [from_json(StructField(str,StringType,true), (allowUnquotedFieldNames,true), value#616071, Some(America/Los_Angeles)) AS from_json(value)#616074] +- LocalRelation [value#616071] == Physical Plan == VeloxColumnarToRow +- ^(32273) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowUnquotedFieldNames,true), value#616071, Some(America/Los_Angeles)) AS from_json(value)#616074] +- ^(32273) InputIteratorTransformer[value#616071] +- RowToVeloxColumnar +- LocalTableScan [value#616071] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(value):struct<str:string>> ![[World]] [[null]]
GlutenJsonFunctionsSuite.from_json with option (allowSingleQuotes): org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowSingleQuotes,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))] +- LocalRelation [value#616080] == Analyzed Logical Plan == from_json(value): struct<str:string> Project [from_json(StructField(str,StringType,true), (allowSingleQuotes,true), value#616080, Some(America/Los_Angeles)) AS from_json(value)#616083] +- LocalRelation [value#616080] == Optimized Logical Plan == Project [from_json(StructField(str,StringType,true), (allowSingleQuotes,true), value#616080, Some(America/Los_Angeles)) AS from_json(value)#616083] +- LocalRelation [value#616080] == Physical Plan == VeloxColumnarToRow +- ^(32275) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowSingleQuotes,true), value#616080, Some(America/Los_Angeles)) AS from_json(value)#616083] +- ^(32275) InputIteratorTransformer[value#616080] +- RowToVeloxColumnar +- LocalTableScan [value#616080] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(value):struct<str:string>> ![[World]] [[null]]
GlutenJsonFunctionsSuite.from_json with option (allowNumericLeadingZeros): org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(int,IntegerType,true), (allowNumericLeadingZeros,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))] +- LocalRelation [value#616089] == Analyzed Logical Plan == from_json(value): struct<int:int> Project [from_json(StructField(int,IntegerType,true), (allowNumericLeadingZeros,true), value#616089, Some(America/Los_Angeles)) AS from_json(value)#616092] +- LocalRelation [value#616089] == Optimized Logical Plan == Project [from_json(StructField(int,IntegerType,true), (allowNumericLeadingZeros,true), value#616089, Some(America/Los_Angeles)) AS from_json(value)#616092] +- LocalRelation [value#616089] == Physical Plan == VeloxColumnarToRow +- ^(32277) ProjectExecTransformer [from_json(StructField(int,IntegerType,true), (allowNumericLeadingZeros,true), value#616089, Some(America/Los_Angeles)) AS from_json(value)#616092] +- ^(32277) InputIteratorTransformer[value#616089] +- RowToVeloxColumnar +- LocalTableScan [value#616089] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(value):struct<int:int>> ![[18]] [[null]]
GlutenJsonFunctionsSuite.from_json with option (allowBackslashEscapingAnyCharacter): org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowBackslashEscapingAnyCharacter,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))] +- LocalRelation [value#616098] == Analyzed Logical Plan == from_json(value): struct<str:string> Project [from_json(StructField(str,StringType,true), (allowBackslashEscapingAnyCharacter,true), value#616098, Some(America/Los_Angeles)) AS from_json(value)#616101] +- LocalRelation [value#616098] == Optimized Logical Plan == Project [from_json(StructField(str,StringType,true), (allowBackslashEscapingAnyCharacter,true), value#616098, Some(America/Los_Angeles)) AS from_json(value)#616101] +- LocalRelation [value#616098] == Physical Plan == VeloxColumnarToRow +- ^(32279) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowBackslashEscapingAnyCharacter,true), value#616098, Some(America/Los_Angeles)) AS from_json(value)#616101] +- ^(32279) InputIteratorTransformer[value#616098] +- RowToVeloxColumnar +- LocalTableScan [value#616098] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(value):struct<str:string>> ![[$10]] [[null]]
GlutenJsonFunctionsSuite.from_json with option (allowUnquotedControlChars): org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowUnquotedControlChars,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))] +- LocalRelation [value#616116] == Analyzed Logical Plan == from_json(value): struct<str:string> Project [from_json(StructField(str,StringType,true), (allowUnquotedControlChars,true), value#616116, Some(America/Los_Angeles)) AS from_json(value)#616119] +- LocalRelation [value#616116] == Optimized Logical Plan == Project [from_json(StructField(str,StringType,true), (allowUnquotedControlChars,true), value#616116, Some(America/Los_Angeles)) AS from_json(value)#616119] +- LocalRelation [value#616116] == Physical Plan == VeloxColumnarToRow +- ^(32281) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowUnquotedControlChars,true), value#616116, Some(America/Los_Angeles)) AS from_json(value)#616119] +- ^(32281) InputIteratorTransformer[value#616116] +- RowToVeloxColumnar +- LocalTableScan [value#616116] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(value):struct<str:string>> ![[ab]] [null]
GlutenJsonFunctionsSuite.from_json invalid json - check modes: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))] +- LocalRelation [value#616687] == Analyzed Logical Plan == from_json(value): struct<a:int,b:int,_unparsed:string> Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#616687, Some(America/Los_Angeles)) AS from_json(value)#616690] +- LocalRelation [value#616687] == Optimized Logical Plan == Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#616687, Some(America/Los_Angeles)) AS from_json(value)#616690] +- LocalRelation [value#616687] == Physical Plan == VeloxColumnarToRow +- ^(32335) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#616687, Some(America/Los_Angeles)) AS from_json(value)#616690] +- ^(32335) InputIteratorTransformer[value#616687] +- RowToVeloxColumnar +- LocalTableScan [value#616687] == Results == == Results == !== Correct Answer - 2 == == Gluten Answer - 2 == !struct<> struct<from_json(value):struct<a:int,b:int,_unparsed:string>> [[2,12,null]] [[2,12,null]] ![[null,null,{"a" 1, "b": 11}]] [[null,null,null]]
GlutenJsonFunctionsSuite.SPARK-36069: from_json invalid json schema - check field name and field value: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))] +- LocalRelation [value#616696] == Analyzed Logical Plan == from_json(value): struct<a:int,b:int,_unparsed:string> Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#616696, Some(America/Los_Angeles)) AS from_json(value)#616699] +- LocalRelation [value#616696] == Optimized Logical Plan == Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#616696, Some(America/Los_Angeles)) AS from_json(value)#616699] +- LocalRelation [value#616696] == Physical Plan == VeloxColumnarToRow +- ^(32337) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#616696, Some(America/Los_Angeles)) AS from_json(value)#616699] +- ^(32337) InputIteratorTransformer[value#616696] +- RowToVeloxColumnar +- LocalTableScan [value#616696] == Results == == Results == !== Correct Answer - 2 == == Gluten Answer - 2 == !struct<> struct<from_json(value):struct<a:int,b:int,_unparsed:string>> [[2,12,null]] [[2,12,null]] ![[null,11,{"a": "1", "b": 11}]] [[null,11,null]]
GlutenJsonFunctionsSuite.corrupt record column in the middle: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))] +- LocalRelation [value#616705] == Analyzed Logical Plan == from_json(value): struct<a:int,_unparsed:string,b:int> Project [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#616705, Some(America/Los_Angeles)) AS from_json(value)#616708] +- LocalRelation [value#616705] == Optimized Logical Plan == Project [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#616705, Some(America/Los_Angeles)) AS from_json(value)#616708] +- LocalRelation [value#616705] == Physical Plan == VeloxColumnarToRow +- ^(32339) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#616705, Some(America/Los_Angeles)) AS from_json(value)#616708] +- ^(32339) InputIteratorTransformer[value#616705] +- RowToVeloxColumnar +- LocalTableScan [value#616705] == Results == == Results == !== Correct Answer - 2 == == Gluten Answer - 2 == !struct<> struct<from_json(value):struct<a:int,_unparsed:string,b:int>> [[2,null,12]] [[2,null,12]] ![[null,{"a" 1, "b": 11},null]] [[null,null,null]]
GlutenJsonFunctionsSuite.SPARK-33134: return partial results only for root JSON objects: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))] +- Project [value#616960 AS c0#616963] +- LocalRelation [value#616960] == Analyzed Logical Plan == from_json(c0): struct<data:struct<c1:bigint,c2:array<struct<c3:bigint,c4:string>>>> Project [from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), c0#616963, Some(America/Los_Angeles)) AS from_json(c0)#616970] +- Project [value#616960 AS c0#616963] +- LocalRelation [value#616960] == Optimized Logical Plan == Project [from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), value#616960, Some(America/Los_Angeles)) AS from_json(c0)#616970] +- LocalRelation [value#616960] == Physical Plan == VeloxColumnarToRow +- ^(32368) ProjectExecTransformer [from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), value#616960, Some(America/Los_Angeles)) AS from_json(c0)#616970] +- ^(32368) InputIteratorTransformer[value#616960] +- RowToVeloxColumnar +- LocalTableScan [value#616960] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(c0):struct<data:struct<c1:bigint,c2:array<struct<c3:bigint,c4:string>>>>> ![[null]] [[[123456,null]]]
GlutenJsonFunctionsSuite.SPARK-40646: return partial results for JSON arrays with objects: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(ArrayType(StructType(StructField(c1,StringType,true),StructField(c2,ArrayType(StructType(StructField(a,LongType,true)),true),true)),true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))] +- Project [value#616976 AS c0#616979] +- LocalRelation [value#616976] == Analyzed Logical Plan == from_json(c0): array<struct<c1:string,c2:array<struct<a:bigint>>>> Project [from_json(ArrayType(StructType(StructField(c1,StringType,true),StructField(c2,ArrayType(StructType(StructField(a,LongType,true)),true),true)),true), c0#616979, Some(America/Los_Angeles)) AS from_json(c0)#616986] +- Project [value#616976 AS c0#616979] +- LocalRelation [value#616976] == Optimized Logical Plan == Project [from_json(ArrayType(StructType(StructField(c1,StringType,true),StructField(c2,ArrayType(StructType(StructField(a,LongType,true)),true),true)),true), value#616976, Some(America/Los_Angeles)) AS from_json(c0)#616986] +- LocalRelation [value#616976] == Physical Plan == VeloxColumnarToRow +- ^(32372) ProjectExecTransformer [from_json(ArrayType(StructType(StructField(c1,StringType,true),StructField(c2,ArrayType(StructType(StructField(a,LongType,true)),true),true)),true), value#616976, Some(America/Los_Angeles)) AS from_json(c0)#616986] +- ^(32372) InputIteratorTransformer[value#616976] +- RowToVeloxColumnar +- LocalTableScan [value#616976] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(c0):array<struct<c1:string,c2:array<struct<a:bigint>>>>> ![null] [ArrayBuffer([abc,null])]
GlutenJsonFunctionsSuite.SPARK-40646: return partial results for JSON maps: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(c1,MapType(StringType,IntegerType,true),true), StructField(c2,StringType,true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))] +- Project [value#616992 AS c0#616995] +- LocalRelation [value#616992] == Analyzed Logical Plan == from_json(c0): struct<c1:map<string,int>,c2:string> Project [from_json(StructField(c1,MapType(StringType,IntegerType,true),true), StructField(c2,StringType,true), c0#616995, Some(America/Los_Angeles)) AS from_json(c0)#617002] +- Project [value#616992 AS c0#616995] +- LocalRelation [value#616992] == Optimized Logical Plan == Project [from_json(StructField(c1,MapType(StringType,IntegerType,true),true), StructField(c2,StringType,true), value#616992, Some(America/Los_Angeles)) AS from_json(c0)#617002] +- LocalRelation [value#616992] == Physical Plan == VeloxColumnarToRow +- ^(32376) ProjectExecTransformer [from_json(StructField(c1,MapType(StringType,IntegerType,true),true), StructField(c2,StringType,true), value#616992, Some(America/Los_Angeles)) AS from_json(c0)#617002] +- ^(32376) InputIteratorTransformer[value#616992] +- RowToVeloxColumnar +- LocalTableScan [value#616992] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(c0):struct<c1:map<string,int>,c2:string>> ![[null,null]] [[null,abc]]
GlutenJsonFunctionsSuite.SPARK-40646: return partial results for objects with values as JSON arrays: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(ArrayType(StructType(StructField(c1,ArrayType(StructType(StructField(c2,ArrayType(IntegerType,true),true)),true),true)),true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))] +- Project [value#617030 AS c0#617033] +- LocalRelation [value#617030] == Analyzed Logical Plan == from_json(c0): array<struct<c1:array<struct<c2:array<int>>>>> Project [from_json(ArrayType(StructType(StructField(c1,ArrayType(StructType(StructField(c2,ArrayType(IntegerType,true),true)),true),true)),true), c0#617033, Some(America/Los_Angeles)) AS from_json(c0)#617040] +- Project [value#617030 AS c0#617033] +- LocalRelation [value#617030] == Optimized Logical Plan == Project [from_json(ArrayType(StructType(StructField(c1,ArrayType(StructType(StructField(c2,ArrayType(IntegerType,true),true)),true),true)),true), value#617030, Some(America/Los_Angeles)) AS from_json(c0)#617040] +- LocalRelation [value#617030] == Physical Plan == VeloxColumnarToRow +- ^(32384) ProjectExecTransformer [from_json(ArrayType(StructType(StructField(c1,ArrayType(StructType(StructField(c2,ArrayType(IntegerType,true),true)),true),true)),true), value#617030, Some(America/Los_Angeles)) AS from_json(c0)#617040] +- ^(32384) InputIteratorTransformer[value#617030] +- RowToVeloxColumnar +- LocalTableScan [value#617030] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(c0):array<struct<c1:array<struct<c2:array<int>>>>>> ![null] [ArrayBuffer([WrappedArray([null])])]
GlutenJsonExpressionsSuite.from_json - input=array, schema=struct, output=single row: org/apache/spark/sql/catalyst/expressions/GlutenJsonExpressionsSuite#L21
Incorrect evaluation: from_json(StructField(a,IntegerType,true), StructField(corrupted,StringType,true), (columnNameOfCorruptRecord,corrupted), [{"a": 1}, {"a": 2}], Some(UTC)), actual: [null,], expected: [null,[{"a": 1}, {"a": 2}]]
GlutenJsonFunctionsSuite.from_json with option (allowComments): org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowComments,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))] +- LocalRelation [value#691605] == Analyzed Logical Plan == from_json(value): struct<str:string> Project [from_json(StructField(str,StringType,true), (allowComments,true), value#691605, Some(America/Los_Angeles)) AS from_json(value)#691608] +- LocalRelation [value#691605] == Optimized Logical Plan == Project [from_json(StructField(str,StringType,true), (allowComments,true), value#691605, Some(America/Los_Angeles)) AS from_json(value)#691608] +- LocalRelation [value#691605] == Physical Plan == VeloxColumnarToRow +- ^(36031) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowComments,true), value#691605, Some(America/Los_Angeles)) AS from_json(value)#691608] +- ^(36031) InputIteratorTransformer[value#691605] +- RowToVeloxColumnar +- LocalTableScan [value#691605] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(value):struct<str:string>> ![[World]] [[null]]
GlutenJsonFunctionsSuite.from_json with option (allowUnquotedFieldNames): org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowUnquotedFieldNames,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))] +- LocalRelation [value#691614] == Analyzed Logical Plan == from_json(value): struct<str:string> Project [from_json(StructField(str,StringType,true), (allowUnquotedFieldNames,true), value#691614, Some(America/Los_Angeles)) AS from_json(value)#691617] +- LocalRelation [value#691614] == Optimized Logical Plan == Project [from_json(StructField(str,StringType,true), (allowUnquotedFieldNames,true), value#691614, Some(America/Los_Angeles)) AS from_json(value)#691617] +- LocalRelation [value#691614] == Physical Plan == VeloxColumnarToRow +- ^(36033) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowUnquotedFieldNames,true), value#691614, Some(America/Los_Angeles)) AS from_json(value)#691617] +- ^(36033) InputIteratorTransformer[value#691614] +- RowToVeloxColumnar +- LocalTableScan [value#691614] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(value):struct<str:string>> ![[World]] [[null]]
GlutenJsonFunctionsSuite.from_json with option (allowSingleQuotes): org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowSingleQuotes,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))] +- LocalRelation [value#691623] == Analyzed Logical Plan == from_json(value): struct<str:string> Project [from_json(StructField(str,StringType,true), (allowSingleQuotes,true), value#691623, Some(America/Los_Angeles)) AS from_json(value)#691626] +- LocalRelation [value#691623] == Optimized Logical Plan == Project [from_json(StructField(str,StringType,true), (allowSingleQuotes,true), value#691623, Some(America/Los_Angeles)) AS from_json(value)#691626] +- LocalRelation [value#691623] == Physical Plan == VeloxColumnarToRow +- ^(36035) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowSingleQuotes,true), value#691623, Some(America/Los_Angeles)) AS from_json(value)#691626] +- ^(36035) InputIteratorTransformer[value#691623] +- RowToVeloxColumnar +- LocalTableScan [value#691623] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(value):struct<str:string>> ![[World]] [[null]]
GlutenJsonFunctionsSuite.from_json with option (allowNumericLeadingZeros): org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(int,IntegerType,true), (allowNumericLeadingZeros,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))] +- LocalRelation [value#691632] == Analyzed Logical Plan == from_json(value): struct<int:int> Project [from_json(StructField(int,IntegerType,true), (allowNumericLeadingZeros,true), value#691632, Some(America/Los_Angeles)) AS from_json(value)#691635] +- LocalRelation [value#691632] == Optimized Logical Plan == Project [from_json(StructField(int,IntegerType,true), (allowNumericLeadingZeros,true), value#691632, Some(America/Los_Angeles)) AS from_json(value)#691635] +- LocalRelation [value#691632] == Physical Plan == VeloxColumnarToRow +- ^(36037) ProjectExecTransformer [from_json(StructField(int,IntegerType,true), (allowNumericLeadingZeros,true), value#691632, Some(America/Los_Angeles)) AS from_json(value)#691635] +- ^(36037) InputIteratorTransformer[value#691632] +- RowToVeloxColumnar +- LocalTableScan [value#691632] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(value):struct<int:int>> ![[18]] [[null]]
GlutenJsonFunctionsSuite.from_json with option (allowBackslashEscapingAnyCharacter): org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowBackslashEscapingAnyCharacter,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))] +- LocalRelation [value#691641] == Analyzed Logical Plan == from_json(value): struct<str:string> Project [from_json(StructField(str,StringType,true), (allowBackslashEscapingAnyCharacter,true), value#691641, Some(America/Los_Angeles)) AS from_json(value)#691644] +- LocalRelation [value#691641] == Optimized Logical Plan == Project [from_json(StructField(str,StringType,true), (allowBackslashEscapingAnyCharacter,true), value#691641, Some(America/Los_Angeles)) AS from_json(value)#691644] +- LocalRelation [value#691641] == Physical Plan == VeloxColumnarToRow +- ^(36039) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowBackslashEscapingAnyCharacter,true), value#691641, Some(America/Los_Angeles)) AS from_json(value)#691644] +- ^(36039) InputIteratorTransformer[value#691641] +- RowToVeloxColumnar +- LocalTableScan [value#691641] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(value):struct<str:string>> ![[$10]] [[null]]
GlutenJsonFunctionsSuite.from_json with option (allowUnquotedControlChars): org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowUnquotedControlChars,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))] +- LocalRelation [value#691659] == Analyzed Logical Plan == from_json(value): struct<str:string> Project [from_json(StructField(str,StringType,true), (allowUnquotedControlChars,true), value#691659, Some(America/Los_Angeles)) AS from_json(value)#691662] +- LocalRelation [value#691659] == Optimized Logical Plan == Project [from_json(StructField(str,StringType,true), (allowUnquotedControlChars,true), value#691659, Some(America/Los_Angeles)) AS from_json(value)#691662] +- LocalRelation [value#691659] == Physical Plan == VeloxColumnarToRow +- ^(36041) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowUnquotedControlChars,true), value#691659, Some(America/Los_Angeles)) AS from_json(value)#691662] +- ^(36041) InputIteratorTransformer[value#691659] +- RowToVeloxColumnar +- LocalTableScan [value#691659] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(value):struct<str:string>> ![[ab]] [null]
GlutenJsonFunctionsSuite.from_json invalid json - check modes: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))] +- LocalRelation [value#692230] == Analyzed Logical Plan == from_json(value): struct<a:int,b:int,_unparsed:string> Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#692230, Some(America/Los_Angeles)) AS from_json(value)#692233] +- LocalRelation [value#692230] == Optimized Logical Plan == Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#692230, Some(America/Los_Angeles)) AS from_json(value)#692233] +- LocalRelation [value#692230] == Physical Plan == VeloxColumnarToRow +- ^(36095) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#692230, Some(America/Los_Angeles)) AS from_json(value)#692233] +- ^(36095) InputIteratorTransformer[value#692230] +- RowToVeloxColumnar +- LocalTableScan [value#692230] == Results == == Results == !== Correct Answer - 2 == == Gluten Answer - 2 == !struct<> struct<from_json(value):struct<a:int,b:int,_unparsed:string>> [[2,12,null]] [[2,12,null]] ![[null,null,{"a" 1, "b": 11}]] [[null,null,null]]
GlutenJsonFunctionsSuite.SPARK-36069: from_json invalid json schema - check field name and field value: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))] +- LocalRelation [value#692239] == Analyzed Logical Plan == from_json(value): struct<a:int,b:int,_unparsed:string> Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#692239, Some(America/Los_Angeles)) AS from_json(value)#692242] +- LocalRelation [value#692239] == Optimized Logical Plan == Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#692239, Some(America/Los_Angeles)) AS from_json(value)#692242] +- LocalRelation [value#692239] == Physical Plan == VeloxColumnarToRow +- ^(36097) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#692239, Some(America/Los_Angeles)) AS from_json(value)#692242] +- ^(36097) InputIteratorTransformer[value#692239] +- RowToVeloxColumnar +- LocalTableScan [value#692239] == Results == == Results == !== Correct Answer - 2 == == Gluten Answer - 2 == !struct<> struct<from_json(value):struct<a:int,b:int,_unparsed:string>> [[2,12,null]] [[2,12,null]] ![[null,11,{"a": "1", "b": 11}]] [[null,11,null]]
GlutenJsonFunctionsSuite.corrupt record column in the middle: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))] +- LocalRelation [value#692248] == Analyzed Logical Plan == from_json(value): struct<a:int,_unparsed:string,b:int> Project [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#692248, Some(America/Los_Angeles)) AS from_json(value)#692251] +- LocalRelation [value#692248] == Optimized Logical Plan == Project [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#692248, Some(America/Los_Angeles)) AS from_json(value)#692251] +- LocalRelation [value#692248] == Physical Plan == VeloxColumnarToRow +- ^(36099) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#692248, Some(America/Los_Angeles)) AS from_json(value)#692251] +- ^(36099) InputIteratorTransformer[value#692248] +- RowToVeloxColumnar +- LocalTableScan [value#692248] == Results == == Results == !== Correct Answer - 2 == == Gluten Answer - 2 == !struct<> struct<from_json(value):struct<a:int,_unparsed:string,b:int>> [[2,null,12]] [[2,null,12]] ![[null,{"a" 1, "b": 11},null]] [[null,null,null]]
GlutenJsonFunctionsSuite.SPARK-33134: return partial results only for root JSON objects: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))] +- Project [value#692503 AS c0#692506] +- LocalRelation [value#692503] == Analyzed Logical Plan == from_json(c0): struct<data:struct<c1:bigint,c2:array<struct<c3:bigint,c4:string>>>> Project [from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), c0#692506, Some(America/Los_Angeles)) AS from_json(c0)#692513] +- Project [value#692503 AS c0#692506] +- LocalRelation [value#692503] == Optimized Logical Plan == Project [from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), value#692503, Some(America/Los_Angeles)) AS from_json(c0)#692513] +- LocalRelation [value#692503] == Physical Plan == VeloxColumnarToRow +- ^(36128) ProjectExecTransformer [from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), value#692503, Some(America/Los_Angeles)) AS from_json(c0)#692513] +- ^(36128) InputIteratorTransformer[value#692503] +- RowToVeloxColumnar +- LocalTableScan [value#692503] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(c0):struct<data:struct<c1:bigint,c2:array<struct<c3:bigint,c4:string>>>>> ![[null]] [[[123456,null]]]
GlutenJsonFunctionsSuite.SPARK-40646: return partial results for JSON arrays with objects: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(ArrayType(StructType(StructField(c1,StringType,true),StructField(c2,ArrayType(StructType(StructField(a,LongType,true)),true),true)),true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))] +- Project [value#692519 AS c0#692522] +- LocalRelation [value#692519] == Analyzed Logical Plan == from_json(c0): array<struct<c1:string,c2:array<struct<a:bigint>>>> Project [from_json(ArrayType(StructType(StructField(c1,StringType,true),StructField(c2,ArrayType(StructType(StructField(a,LongType,true)),true),true)),true), c0#692522, Some(America/Los_Angeles)) AS from_json(c0)#692529] +- Project [value#692519 AS c0#692522] +- LocalRelation [value#692519] == Optimized Logical Plan == Project [from_json(ArrayType(StructType(StructField(c1,StringType,true),StructField(c2,ArrayType(StructType(StructField(a,LongType,true)),true),true)),true), value#692519, Some(America/Los_Angeles)) AS from_json(c0)#692529] +- LocalRelation [value#692519] == Physical Plan == VeloxColumnarToRow +- ^(36132) ProjectExecTransformer [from_json(ArrayType(StructType(StructField(c1,StringType,true),StructField(c2,ArrayType(StructType(StructField(a,LongType,true)),true),true)),true), value#692519, Some(America/Los_Angeles)) AS from_json(c0)#692529] +- ^(36132) InputIteratorTransformer[value#692519] +- RowToVeloxColumnar +- LocalTableScan [value#692519] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(c0):array<struct<c1:string,c2:array<struct<a:bigint>>>>> ![null] [ArrayBuffer([abc,null])]
GlutenJsonFunctionsSuite.SPARK-40646: return partial results for JSON maps: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(c1,MapType(StringType,IntegerType,true),true), StructField(c2,StringType,true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))] +- Project [value#692535 AS c0#692538] +- LocalRelation [value#692535] == Analyzed Logical Plan == from_json(c0): struct<c1:map<string,int>,c2:string> Project [from_json(StructField(c1,MapType(StringType,IntegerType,true),true), StructField(c2,StringType,true), c0#692538, Some(America/Los_Angeles)) AS from_json(c0)#692545] +- Project [value#692535 AS c0#692538] +- LocalRelation [value#692535] == Optimized Logical Plan == Project [from_json(StructField(c1,MapType(StringType,IntegerType,true),true), StructField(c2,StringType,true), value#692535, Some(America/Los_Angeles)) AS from_json(c0)#692545] +- LocalRelation [value#692535] == Physical Plan == VeloxColumnarToRow +- ^(36136) ProjectExecTransformer [from_json(StructField(c1,MapType(StringType,IntegerType,true),true), StructField(c2,StringType,true), value#692535, Some(America/Los_Angeles)) AS from_json(c0)#692545] +- ^(36136) InputIteratorTransformer[value#692535] +- RowToVeloxColumnar +- LocalTableScan [value#692535] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(c0):struct<c1:map<string,int>,c2:string>> ![[null,null]] [[null,abc]]
GlutenJsonFunctionsSuite.SPARK-40646: return partial results for objects with values as JSON arrays: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(ArrayType(StructType(StructField(c1,ArrayType(StructType(StructField(c2,ArrayType(IntegerType,true),true)),true),true)),true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))] +- Project [value#692573 AS c0#692576] +- LocalRelation [value#692573] == Analyzed Logical Plan == from_json(c0): array<struct<c1:array<struct<c2:array<int>>>>> Project [from_json(ArrayType(StructType(StructField(c1,ArrayType(StructType(StructField(c2,ArrayType(IntegerType,true),true)),true),true)),true), c0#692576, Some(America/Los_Angeles)) AS from_json(c0)#692583] +- Project [value#692573 AS c0#692576] +- LocalRelation [value#692573] == Optimized Logical Plan == Project [from_json(ArrayType(StructType(StructField(c1,ArrayType(StructType(StructField(c2,ArrayType(IntegerType,true),true)),true),true)),true), value#692573, Some(America/Los_Angeles)) AS from_json(c0)#692583] +- LocalRelation [value#692573] == Physical Plan == VeloxColumnarToRow +- ^(36144) ProjectExecTransformer [from_json(ArrayType(StructType(StructField(c1,ArrayType(StructType(StructField(c2,ArrayType(IntegerType,true),true)),true),true)),true), value#692573, Some(America/Los_Angeles)) AS from_json(c0)#692583] +- ^(36144) InputIteratorTransformer[value#692573] +- RowToVeloxColumnar +- LocalTableScan [value#692573] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(c0):array<struct<c1:array<struct<c2:array<int>>>>>> ![null] [ArrayBuffer([WrappedArray([null])])]
GlutenJsonFunctionsSuite.SPARK-48863: parse object as an array with partial results enabled: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(ArrayType(StructType(StructField(a,StringType,true),StructField(c,IntegerType,true)),true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))] +- Project [value#692611 AS c0#692614] +- LocalRelation [value#692611] == Analyzed Logical Plan == from_json(c0): array<struct<a:string,c:int>> Project [from_json(ArrayType(StructType(StructField(a,StringType,true),StructField(c,IntegerType,true)),true), c0#692614, Some(America/Los_Angeles)) AS from_json(c0)#692621] +- Project [value#692611 AS c0#692614] +- LocalRelation [value#692611] == Optimized Logical Plan == Project [from_json(ArrayType(StructType(StructField(a,StringType,true),StructField(c,IntegerType,true)),true), value#692611, Some(America/Los_Angeles)) AS from_json(c0)#692621] +- LocalRelation [value#692611] == Physical Plan == VeloxColumnarToRow +- ^(36152) ProjectExecTransformer [from_json(ArrayType(StructType(StructField(a,StringType,true),StructField(c,IntegerType,true)),true), value#692611, Some(America/Los_Angeles)) AS from_json(c0)#692621] +- ^(36152) InputIteratorTransformer[value#692611] +- RowToVeloxColumnar +- LocalTableScan [value#692611] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(c0):array<struct<a:string,c:int>>> ![null] [ArrayBuffer([b,null])]
GlutenJsonExpressionsSuite.from_json - input=array, schema=struct, output=single row: org/apache/spark/sql/catalyst/expressions/GlutenJsonExpressionsSuite#L21
Incorrect evaluation: from_json(StructField(a,IntegerType,true), StructField(corrupted,StringType,true), (columnNameOfCorruptRecord,corrupted), [{"a": 1}, {"a": 2}], Some(UTC)), actual: [null,], expected: [null,[{"a": 1}, {"a": 2}]]
GlutenJsonFunctionsSuite.from_json with option (allowComments): org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowComments,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8360/495944540@12b373f0))] +- LocalRelation [value#692128] == Analyzed Logical Plan == from_json(value): struct<str:string> Project [from_json(StructField(str,StringType,true), (allowComments,true), value#692128, Some(America/Los_Angeles)) AS from_json(value)#692131] +- LocalRelation [value#692128] == Optimized Logical Plan == Project [from_json(StructField(str,StringType,true), (allowComments,true), value#692128, Some(America/Los_Angeles)) AS from_json(value)#692131] +- LocalRelation [value#692128] == Physical Plan == VeloxColumnarToRow +- ^(36034) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowComments,true), value#692128, Some(America/Los_Angeles)) AS from_json(value)#692131] +- ^(36034) InputIteratorTransformer[value#692128] +- RowToVeloxColumnar +- LocalTableScan [value#692128] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(value):struct<str:string>> ![[World]] [[null]]
GlutenJsonFunctionsSuite.from_json with option (allowUnquotedFieldNames): org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowUnquotedFieldNames,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8360/495944540@12b373f0))] +- LocalRelation [value#692137] == Analyzed Logical Plan == from_json(value): struct<str:string> Project [from_json(StructField(str,StringType,true), (allowUnquotedFieldNames,true), value#692137, Some(America/Los_Angeles)) AS from_json(value)#692140] +- LocalRelation [value#692137] == Optimized Logical Plan == Project [from_json(StructField(str,StringType,true), (allowUnquotedFieldNames,true), value#692137, Some(America/Los_Angeles)) AS from_json(value)#692140] +- LocalRelation [value#692137] == Physical Plan == VeloxColumnarToRow +- ^(36036) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowUnquotedFieldNames,true), value#692137, Some(America/Los_Angeles)) AS from_json(value)#692140] +- ^(36036) InputIteratorTransformer[value#692137] +- RowToVeloxColumnar +- LocalTableScan [value#692137] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(value):struct<str:string>> ![[World]] [[null]]
GlutenJsonFunctionsSuite.from_json with option (allowSingleQuotes): org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowSingleQuotes,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8360/495944540@12b373f0))] +- LocalRelation [value#692146] == Analyzed Logical Plan == from_json(value): struct<str:string> Project [from_json(StructField(str,StringType,true), (allowSingleQuotes,true), value#692146, Some(America/Los_Angeles)) AS from_json(value)#692149] +- LocalRelation [value#692146] == Optimized Logical Plan == Project [from_json(StructField(str,StringType,true), (allowSingleQuotes,true), value#692146, Some(America/Los_Angeles)) AS from_json(value)#692149] +- LocalRelation [value#692146] == Physical Plan == VeloxColumnarToRow +- ^(36038) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowSingleQuotes,true), value#692146, Some(America/Los_Angeles)) AS from_json(value)#692149] +- ^(36038) InputIteratorTransformer[value#692146] +- RowToVeloxColumnar +- LocalTableScan [value#692146] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(value):struct<str:string>> ![[World]] [[null]]
GlutenJsonFunctionsSuite.from_json with option (allowNumericLeadingZeros): org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(int,IntegerType,true), (allowNumericLeadingZeros,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8360/495944540@12b373f0))] +- LocalRelation [value#692155] == Analyzed Logical Plan == from_json(value): struct<int:int> Project [from_json(StructField(int,IntegerType,true), (allowNumericLeadingZeros,true), value#692155, Some(America/Los_Angeles)) AS from_json(value)#692158] +- LocalRelation [value#692155] == Optimized Logical Plan == Project [from_json(StructField(int,IntegerType,true), (allowNumericLeadingZeros,true), value#692155, Some(America/Los_Angeles)) AS from_json(value)#692158] +- LocalRelation [value#692155] == Physical Plan == VeloxColumnarToRow +- ^(36040) ProjectExecTransformer [from_json(StructField(int,IntegerType,true), (allowNumericLeadingZeros,true), value#692155, Some(America/Los_Angeles)) AS from_json(value)#692158] +- ^(36040) InputIteratorTransformer[value#692155] +- RowToVeloxColumnar +- LocalTableScan [value#692155] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(value):struct<int:int>> ![[18]] [[null]]
GlutenJsonFunctionsSuite.from_json with option (allowBackslashEscapingAnyCharacter): org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowBackslashEscapingAnyCharacter,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8360/495944540@12b373f0))] +- LocalRelation [value#692164] == Analyzed Logical Plan == from_json(value): struct<str:string> Project [from_json(StructField(str,StringType,true), (allowBackslashEscapingAnyCharacter,true), value#692164, Some(America/Los_Angeles)) AS from_json(value)#692167] +- LocalRelation [value#692164] == Optimized Logical Plan == Project [from_json(StructField(str,StringType,true), (allowBackslashEscapingAnyCharacter,true), value#692164, Some(America/Los_Angeles)) AS from_json(value)#692167] +- LocalRelation [value#692164] == Physical Plan == VeloxColumnarToRow +- ^(36042) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowBackslashEscapingAnyCharacter,true), value#692164, Some(America/Los_Angeles)) AS from_json(value)#692167] +- ^(36042) InputIteratorTransformer[value#692164] +- RowToVeloxColumnar +- LocalTableScan [value#692164] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(value):struct<str:string>> ![[$10]] [[null]]
GlutenJsonFunctionsSuite.from_json with option (allowUnquotedControlChars): org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowUnquotedControlChars,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8360/495944540@12b373f0))] +- LocalRelation [value#692182] == Analyzed Logical Plan == from_json(value): struct<str:string> Project [from_json(StructField(str,StringType,true), (allowUnquotedControlChars,true), value#692182, Some(America/Los_Angeles)) AS from_json(value)#692185] +- LocalRelation [value#692182] == Optimized Logical Plan == Project [from_json(StructField(str,StringType,true), (allowUnquotedControlChars,true), value#692182, Some(America/Los_Angeles)) AS from_json(value)#692185] +- LocalRelation [value#692182] == Physical Plan == VeloxColumnarToRow +- ^(36044) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowUnquotedControlChars,true), value#692182, Some(America/Los_Angeles)) AS from_json(value)#692185] +- ^(36044) InputIteratorTransformer[value#692182] +- RowToVeloxColumnar +- LocalTableScan [value#692182] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(value):struct<str:string>> ![[ab]] [null]
GlutenJsonFunctionsSuite.from_json invalid json - check modes: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8360/495944540@12b373f0))] +- LocalRelation [value#692753] == Analyzed Logical Plan == from_json(value): struct<a:int,b:int,_unparsed:string> Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#692753, Some(America/Los_Angeles)) AS from_json(value)#692756] +- LocalRelation [value#692753] == Optimized Logical Plan == Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#692753, Some(America/Los_Angeles)) AS from_json(value)#692756] +- LocalRelation [value#692753] == Physical Plan == VeloxColumnarToRow +- ^(36098) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#692753, Some(America/Los_Angeles)) AS from_json(value)#692756] +- ^(36098) InputIteratorTransformer[value#692753] +- RowToVeloxColumnar +- LocalTableScan [value#692753] == Results == == Results == !== Correct Answer - 2 == == Gluten Answer - 2 == !struct<> struct<from_json(value):struct<a:int,b:int,_unparsed:string>> [[2,12,null]] [[2,12,null]] ![[null,null,{"a" 1, "b": 11}]] [[null,null,null]]
GlutenJsonFunctionsSuite.SPARK-36069: from_json invalid json schema - check field name and field value: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8360/495944540@12b373f0))] +- LocalRelation [value#692762] == Analyzed Logical Plan == from_json(value): struct<a:int,b:int,_unparsed:string> Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#692762, Some(America/Los_Angeles)) AS from_json(value)#692765] +- LocalRelation [value#692762] == Optimized Logical Plan == Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#692762, Some(America/Los_Angeles)) AS from_json(value)#692765] +- LocalRelation [value#692762] == Physical Plan == VeloxColumnarToRow +- ^(36100) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#692762, Some(America/Los_Angeles)) AS from_json(value)#692765] +- ^(36100) InputIteratorTransformer[value#692762] +- RowToVeloxColumnar +- LocalTableScan [value#692762] == Results == == Results == !== Correct Answer - 2 == == Gluten Answer - 2 == !struct<> struct<from_json(value):struct<a:int,b:int,_unparsed:string>> [[2,12,null]] [[2,12,null]] ![[null,11,{"a": "1", "b": 11}]] [[null,11,null]]
GlutenJsonFunctionsSuite.corrupt record column in the middle: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8360/495944540@12b373f0))] +- LocalRelation [value#692771] == Analyzed Logical Plan == from_json(value): struct<a:int,_unparsed:string,b:int> Project [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#692771, Some(America/Los_Angeles)) AS from_json(value)#692774] +- LocalRelation [value#692771] == Optimized Logical Plan == Project [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#692771, Some(America/Los_Angeles)) AS from_json(value)#692774] +- LocalRelation [value#692771] == Physical Plan == VeloxColumnarToRow +- ^(36102) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#692771, Some(America/Los_Angeles)) AS from_json(value)#692774] +- ^(36102) InputIteratorTransformer[value#692771] +- RowToVeloxColumnar +- LocalTableScan [value#692771] == Results == == Results == !== Correct Answer - 2 == == Gluten Answer - 2 == !struct<> struct<from_json(value):struct<a:int,_unparsed:string,b:int>> [[2,null,12]] [[2,null,12]] ![[null,{"a" 1, "b": 11},null]] [[null,null,null]]
GlutenJsonFunctionsSuite.SPARK-33134: return partial results only for root JSON objects: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$8360/495944540@12b373f0))] +- Project [value#693026 AS c0#693029] +- LocalRelation [value#693026] == Analyzed Logical Plan == from_json(c0): struct<data:struct<c1:bigint,c2:array<struct<c3:bigint,c4:string>>>> Project [from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), c0#693029, Some(America/Los_Angeles)) AS from_json(c0)#693036] +- Project [value#693026 AS c0#693029] +- LocalRelation [value#693026] == Optimized Logical Plan == Project [from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), value#693026, Some(America/Los_Angeles)) AS from_json(c0)#693036] +- LocalRelation [value#693026] == Physical Plan == VeloxColumnarToRow +- ^(36131) ProjectExecTransformer [from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), value#693026, Some(America/Los_Angeles)) AS from_json(c0)#693036] +- ^(36131) InputIteratorTransformer[value#693026] +- RowToVeloxColumnar +- LocalTableScan [value#693026] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(c0):struct<data:struct<c1:bigint,c2:array<struct<c3:bigint,c4:string>>>>> ![[null]] [[[123456,null]]]
GlutenJsonFunctionsSuite.SPARK-40646: return partial results for JSON arrays with objects: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(ArrayType(StructType(StructField(c1,StringType,true),StructField(c2,ArrayType(StructType(StructField(a,LongType,true)),true),true)),true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$8360/495944540@12b373f0))] +- Project [value#693042 AS c0#693045] +- LocalRelation [value#693042] == Analyzed Logical Plan == from_json(c0): array<struct<c1:string,c2:array<struct<a:bigint>>>> Project [from_json(ArrayType(StructType(StructField(c1,StringType,true),StructField(c2,ArrayType(StructType(StructField(a,LongType,true)),true),true)),true), c0#693045, Some(America/Los_Angeles)) AS from_json(c0)#693052] +- Project [value#693042 AS c0#693045] +- LocalRelation [value#693042] == Optimized Logical Plan == Project [from_json(ArrayType(StructType(StructField(c1,StringType,true),StructField(c2,ArrayType(StructType(StructField(a,LongType,true)),true),true)),true), value#693042, Some(America/Los_Angeles)) AS from_json(c0)#693052] +- LocalRelation [value#693042] == Physical Plan == VeloxColumnarToRow +- ^(36135) ProjectExecTransformer [from_json(ArrayType(StructType(StructField(c1,StringType,true),StructField(c2,ArrayType(StructType(StructField(a,LongType,true)),true),true)),true), value#693042, Some(America/Los_Angeles)) AS from_json(c0)#693052] +- ^(36135) InputIteratorTransformer[value#693042] +- RowToVeloxColumnar +- LocalTableScan [value#693042] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(c0):array<struct<c1:string,c2:array<struct<a:bigint>>>>> ![null] [ArraySeq([abc,null])]
GlutenJsonFunctionsSuite.SPARK-40646: return partial results for JSON maps: org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [unresolvedalias(from_json(StructField(c1,MapType(StringType,IntegerType,true),true), StructField(c2,StringType,true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$8360/495944540@12b373f0))] +- Project [value#693058 AS c0#693061] +- LocalRelation [value#693058] == Analyzed Logical Plan == from_json(c0): struct<c1:map<string,int>,c2:string> Project [from_json(StructField(c1,MapType(StringType,IntegerType,true),true), StructField(c2,StringType,true), c0#693061, Some(America/Los_Angeles)) AS from_json(c0)#693068] +- Project [value#693058 AS c0#693061] +- LocalRelation [value#693058] == Optimized Logical Plan == Project [from_json(StructField(c1,MapType(StringType,IntegerType,true),true), StructField(c2,StringType,true), value#693058, Some(America/Los_Angeles)) AS from_json(c0)#693068] +- LocalRelation [value#693058] == Physical Plan == VeloxColumnarToRow +- ^(36139) ProjectExecTransformer [from_json(StructField(c1,MapType(StringType,IntegerType,true),true), StructField(c2,StringType,true), value#693058, Some(America/Los_Angeles)) AS from_json(c0)#693068] +- ^(36139) InputIteratorTransformer[value#693058] +- RowToVeloxColumnar +- LocalTableScan [value#693058] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == !struct<> struct<from_json(c0):struct<c1:map<string,int>,c2:string>> ![[null,null]] [[null,abc]]
Process
ubuntu-latest pipelines will use ubuntu-24.04 soon. For more details, see https://github.com/actions/runner-images/issues/10636