[DNM] test from_json #30801
Annotations
50 errors and 1 warning
GlutenJsonFunctionsSuite.from_json invalid json - check modes:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$7655/601334808@72ac61f2))]
+- LocalRelation [value#718151]
== Analyzed Logical Plan ==
from_json(value): struct<a:int,b:int,_unparsed:string>
Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#718151, Some(America/Los_Angeles)) AS from_json(value)#718154]
+- LocalRelation [value#718151]
== Optimized Logical Plan ==
Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#718151, Some(America/Los_Angeles)) AS from_json(value)#718154]
+- LocalRelation [value#718151]
== Physical Plan ==
VeloxColumnarToRow
+- ^(26274) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#718151, Some(America/Los_Angeles)) AS from_json(value)#718154]
   +- ^(26274) InputIteratorTransformer[value#718151]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#718151]
== Results ==
== Results ==
!== Correct Answer - 2 ==          == Gluten Answer - 2 ==
!struct<>                          struct<from_json(value):struct<a:int,b:int,_unparsed:string>>
 [[2,12,null]]                     [[2,12,null]]
![[null,null,{"a" 1, "b": 11}]]   [[null,null,null]]
|
GlutenJsonFunctionsSuite.corrupt record column in the middle:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$7655/601334808@72ac61f2))]
+- LocalRelation [value#718160]
== Analyzed Logical Plan ==
from_json(value): struct<a:int,_unparsed:string,b:int>
Project [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#718160, Some(America/Los_Angeles)) AS from_json(value)#718163]
+- LocalRelation [value#718160]
== Optimized Logical Plan ==
Project [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#718160, Some(America/Los_Angeles)) AS from_json(value)#718163]
+- LocalRelation [value#718160]
== Physical Plan ==
VeloxColumnarToRow
+- ^(26276) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#718160, Some(America/Los_Angeles)) AS from_json(value)#718163]
   +- ^(26276) InputIteratorTransformer[value#718160]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#718160]
== Results ==
== Results ==
!== Correct Answer - 2 ==          == Gluten Answer - 2 ==
!struct<>                          struct<from_json(value):struct<a:int,_unparsed:string,b:int>>
 [[2,null,12]]                     [[2,null,12]]
![[null,{"a" 1, "b": 11},null]]   [[null,null,null]]
|
GlutenJsonFunctionsSuite.SPARK-33134: return partial results only for root JSON objects:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(data,StructType(StructField(c1,LongType,true), StructField(c2,ArrayType(StructType(StructField(c3,LongType,true), StructField(c4,StringType,true)),true),true)),true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$7655/601334808@72ac61f2))]
+- Project [value#718415 AS c0#718418]
   +- LocalRelation [value#718415]
== Analyzed Logical Plan ==
from_json(c0): struct<data:struct<c1:bigint,c2:array<struct<c3:bigint,c4:string>>>>
Project [from_json(StructField(data,StructType(StructField(c1,LongType,true), StructField(c2,ArrayType(StructType(StructField(c3,LongType,true), StructField(c4,StringType,true)),true),true)),true), c0#718418, Some(America/Los_Angeles)) AS from_json(c0)#718420]
+- Project [value#718415 AS c0#718418]
   +- LocalRelation [value#718415]
== Optimized Logical Plan ==
Project [from_json(StructField(data,StructType(StructField(c1,LongType,true), StructField(c2,ArrayType(StructType(StructField(c3,LongType,true), StructField(c4,StringType,true)),true),true)),true), value#718415, Some(America/Los_Angeles)) AS from_json(c0)#718420]
+- LocalRelation [value#718415]
== Physical Plan ==
VeloxColumnarToRow
+- ^(26303) ProjectExecTransformer [from_json(StructField(data,StructType(StructField(c1,LongType,true), StructField(c2,ArrayType(StructType(StructField(c3,LongType,true), StructField(c4,StringType,true)),true),true)),true), value#718415, Some(America/Los_Angeles)) AS from_json(c0)#718420]
   +- ^(26303) InputIteratorTransformer[value#718415]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#718415]
== Results ==
== Results ==
!== Correct Answer - 1 ==   == Gluten Answer - 1 ==
!struct<>                   struct<from_json(c0):struct<data:struct<c1:bigint,c2:array<struct<c3:bigint,c4:string>>>>>
![[null]]                   [[[123456,null]]]
|
GlutenJsonExpressionsSuite.from_json - input=array, schema=struct, output=single row:
org/apache/spark/sql/catalyst/expressions/GlutenJsonExpressionsSuite#L21
Incorrect evaluation: from_json(StructField(a,IntegerType,true), StructField(corrupted,StringType,true), (columnNameOfCorruptRecord,corrupted), [{"a": 1}, {"a": 2}], Some(UTC)), actual: [null,], expected: [null,[{"a": 1}, {"a": 2}]]
|
GlutenJsonFunctionsSuite.from_json invalid json - check modes:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8327/294283693@47400946))]
+- LocalRelation [value#788444]
== Analyzed Logical Plan ==
from_json(value): struct<a:int,b:int,_unparsed:string>
Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#788444, Some(America/Los_Angeles)) AS from_json(value)#788447]
+- LocalRelation [value#788444]
== Optimized Logical Plan ==
Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#788444, Some(America/Los_Angeles)) AS from_json(value)#788447]
+- LocalRelation [value#788444]
== Physical Plan ==
VeloxColumnarToRow
+- ^(30133) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#788444, Some(America/Los_Angeles)) AS from_json(value)#788447]
   +- ^(30133) InputIteratorTransformer[value#788444]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#788444]
== Results ==
== Results ==
!== Correct Answer - 2 ==          == Gluten Answer - 2 ==
!struct<>                          struct<from_json(value):struct<a:int,b:int,_unparsed:string>>
 [[2,12,null]]                     [[2,12,null]]
![[null,null,{"a" 1, "b": 11}]]   [[null,null,null]]
|
GlutenJsonFunctionsSuite.SPARK-36069: from_json invalid json schema - check field name and field value:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8327/294283693@47400946))]
+- LocalRelation [value#788453]
== Analyzed Logical Plan ==
from_json(value): struct<a:int,b:int,_unparsed:string>
Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#788453, Some(America/Los_Angeles)) AS from_json(value)#788456]
+- LocalRelation [value#788453]
== Optimized Logical Plan ==
Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#788453, Some(America/Los_Angeles)) AS from_json(value)#788456]
+- LocalRelation [value#788453]
== Physical Plan ==
VeloxColumnarToRow
+- ^(30135) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#788453, Some(America/Los_Angeles)) AS from_json(value)#788456]
   +- ^(30135) InputIteratorTransformer[value#788453]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#788453]
== Results ==
== Results ==
!== Correct Answer - 2 ==          == Gluten Answer - 2 ==
!struct<>                          struct<from_json(value):struct<a:int,b:int,_unparsed:string>>
 [[2,12,null]]                     [[2,12,null]]
![[null,11,{"a": "1", "b": 11}]]   [[null,11,null]]
|
GlutenJsonFunctionsSuite.corrupt record column in the middle:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8327/294283693@47400946))]
+- LocalRelation [value#788462]
== Analyzed Logical Plan ==
from_json(value): struct<a:int,_unparsed:string,b:int>
Project [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#788462, Some(America/Los_Angeles)) AS from_json(value)#788465]
+- LocalRelation [value#788462]
== Optimized Logical Plan ==
Project [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#788462, Some(America/Los_Angeles)) AS from_json(value)#788465]
+- LocalRelation [value#788462]
== Physical Plan ==
VeloxColumnarToRow
+- ^(30137) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#788462, Some(America/Los_Angeles)) AS from_json(value)#788465]
   +- ^(30137) InputIteratorTransformer[value#788462]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#788462]
== Results ==
== Results ==
!== Correct Answer - 2 ==          == Gluten Answer - 2 ==
!struct<>                          struct<from_json(value):struct<a:int,_unparsed:string,b:int>>
 [[2,null,12]]                     [[2,null,12]]
![[null,{"a" 1, "b": 11},null]]   [[null,null,null]]
|
GlutenJsonFunctionsSuite.SPARK-33134: return partial results only for root JSON objects:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$8327/294283693@47400946))]
+- Project [value#788717 AS c0#788720]
   +- LocalRelation [value#788717]
== Analyzed Logical Plan ==
from_json(c0): struct<data:struct<c1:bigint,c2:array<struct<c3:bigint,c4:string>>>>
Project [from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), c0#788720, Some(America/Los_Angeles)) AS from_json(c0)#788722]
+- Project [value#788717 AS c0#788720]
   +- LocalRelation [value#788717]
== Optimized Logical Plan ==
Project [from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), value#788717, Some(America/Los_Angeles)) AS from_json(c0)#788722]
+- LocalRelation [value#788717]
== Physical Plan ==
VeloxColumnarToRow
+- ^(30164) ProjectExecTransformer [from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), value#788717, Some(America/Los_Angeles)) AS from_json(c0)#788722]
   +- ^(30164) InputIteratorTransformer[value#788717]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#788717]
== Results ==
== Results ==
!== Correct Answer - 1 ==   == Gluten Answer - 1 ==
!struct<>                   struct<from_json(c0):struct<data:struct<c1:bigint,c2:array<struct<c3:bigint,c4:string>>>>>
![[null]]                   [[[123456,null]]]
|
GlutenJsonExpressionsSuite.from_json - input=array, schema=struct, output=single row:
org/apache/spark/sql/catalyst/expressions/GlutenJsonExpressionsSuite#L21
Incorrect evaluation: from_json(StructField(a,IntegerType,true), StructField(corrupted,StringType,true), (columnNameOfCorruptRecord,corrupted), [{"a": 1}, {"a": 2}], Some(UTC)), actual: [null,], expected: [null,[{"a": 1}, {"a": 2}]]
|
GlutenJsonFunctionsSuite.from_json with option (allowComments):
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowComments,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))]
+- LocalRelation [value#616062]
== Analyzed Logical Plan ==
from_json(value): struct<str:string>
Project [from_json(StructField(str,StringType,true), (allowComments,true), value#616062, Some(America/Los_Angeles)) AS from_json(value)#616065]
+- LocalRelation [value#616062]
== Optimized Logical Plan ==
Project [from_json(StructField(str,StringType,true), (allowComments,true), value#616062, Some(America/Los_Angeles)) AS from_json(value)#616065]
+- LocalRelation [value#616062]
== Physical Plan ==
VeloxColumnarToRow
+- ^(32271) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowComments,true), value#616062, Some(America/Los_Angeles)) AS from_json(value)#616065]
   +- ^(32271) InputIteratorTransformer[value#616062]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#616062]
== Results ==
== Results ==
!== Correct Answer - 1 ==   == Gluten Answer - 1 ==
!struct<>                   struct<from_json(value):struct<str:string>>
![[World]]                  [[null]]
|
GlutenJsonFunctionsSuite.from_json with option (allowUnquotedFieldNames):
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowUnquotedFieldNames,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))]
+- LocalRelation [value#616071]
== Analyzed Logical Plan ==
from_json(value): struct<str:string>
Project [from_json(StructField(str,StringType,true), (allowUnquotedFieldNames,true), value#616071, Some(America/Los_Angeles)) AS from_json(value)#616074]
+- LocalRelation [value#616071]
== Optimized Logical Plan ==
Project [from_json(StructField(str,StringType,true), (allowUnquotedFieldNames,true), value#616071, Some(America/Los_Angeles)) AS from_json(value)#616074]
+- LocalRelation [value#616071]
== Physical Plan ==
VeloxColumnarToRow
+- ^(32273) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowUnquotedFieldNames,true), value#616071, Some(America/Los_Angeles)) AS from_json(value)#616074]
   +- ^(32273) InputIteratorTransformer[value#616071]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#616071]
== Results ==
== Results ==
!== Correct Answer - 1 ==   == Gluten Answer - 1 ==
!struct<>                   struct<from_json(value):struct<str:string>>
![[World]]                  [[null]]
|
GlutenJsonFunctionsSuite.from_json with option (allowSingleQuotes):
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowSingleQuotes,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))]
+- LocalRelation [value#616080]
== Analyzed Logical Plan ==
from_json(value): struct<str:string>
Project [from_json(StructField(str,StringType,true), (allowSingleQuotes,true), value#616080, Some(America/Los_Angeles)) AS from_json(value)#616083]
+- LocalRelation [value#616080]
== Optimized Logical Plan ==
Project [from_json(StructField(str,StringType,true), (allowSingleQuotes,true), value#616080, Some(America/Los_Angeles)) AS from_json(value)#616083]
+- LocalRelation [value#616080]
== Physical Plan ==
VeloxColumnarToRow
+- ^(32275) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowSingleQuotes,true), value#616080, Some(America/Los_Angeles)) AS from_json(value)#616083]
   +- ^(32275) InputIteratorTransformer[value#616080]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#616080]
== Results ==
== Results ==
!== Correct Answer - 1 ==   == Gluten Answer - 1 ==
!struct<>                   struct<from_json(value):struct<str:string>>
![[World]]                  [[null]]
|
GlutenJsonFunctionsSuite.from_json with option (allowNumericLeadingZeros):
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(int,IntegerType,true), (allowNumericLeadingZeros,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))]
+- LocalRelation [value#616089]
== Analyzed Logical Plan ==
from_json(value): struct<int:int>
Project [from_json(StructField(int,IntegerType,true), (allowNumericLeadingZeros,true), value#616089, Some(America/Los_Angeles)) AS from_json(value)#616092]
+- LocalRelation [value#616089]
== Optimized Logical Plan ==
Project [from_json(StructField(int,IntegerType,true), (allowNumericLeadingZeros,true), value#616089, Some(America/Los_Angeles)) AS from_json(value)#616092]
+- LocalRelation [value#616089]
== Physical Plan ==
VeloxColumnarToRow
+- ^(32277) ProjectExecTransformer [from_json(StructField(int,IntegerType,true), (allowNumericLeadingZeros,true), value#616089, Some(America/Los_Angeles)) AS from_json(value)#616092]
   +- ^(32277) InputIteratorTransformer[value#616089]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#616089]
== Results ==
== Results ==
!== Correct Answer - 1 ==   == Gluten Answer - 1 ==
!struct<>                   struct<from_json(value):struct<int:int>>
![[18]]                     [[null]]
|
GlutenJsonFunctionsSuite.from_json with option (allowBackslashEscapingAnyCharacter):
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowBackslashEscapingAnyCharacter,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))]
+- LocalRelation [value#616098]
== Analyzed Logical Plan ==
from_json(value): struct<str:string>
Project [from_json(StructField(str,StringType,true), (allowBackslashEscapingAnyCharacter,true), value#616098, Some(America/Los_Angeles)) AS from_json(value)#616101]
+- LocalRelation [value#616098]
== Optimized Logical Plan ==
Project [from_json(StructField(str,StringType,true), (allowBackslashEscapingAnyCharacter,true), value#616098, Some(America/Los_Angeles)) AS from_json(value)#616101]
+- LocalRelation [value#616098]
== Physical Plan ==
VeloxColumnarToRow
+- ^(32279) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowBackslashEscapingAnyCharacter,true), value#616098, Some(America/Los_Angeles)) AS from_json(value)#616101]
   +- ^(32279) InputIteratorTransformer[value#616098]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#616098]
== Results ==
== Results ==
!== Correct Answer - 1 ==   == Gluten Answer - 1 ==
!struct<>                   struct<from_json(value):struct<str:string>>
![[$10]]                    [[null]]
|
GlutenJsonFunctionsSuite.from_json with option (allowUnquotedControlChars):
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowUnquotedControlChars,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))]
+- LocalRelation [value#616116]
== Analyzed Logical Plan ==
from_json(value): struct<str:string>
Project [from_json(StructField(str,StringType,true), (allowUnquotedControlChars,true), value#616116, Some(America/Los_Angeles)) AS from_json(value)#616119]
+- LocalRelation [value#616116]
== Optimized Logical Plan ==
Project [from_json(StructField(str,StringType,true), (allowUnquotedControlChars,true), value#616116, Some(America/Los_Angeles)) AS from_json(value)#616119]
+- LocalRelation [value#616116]
== Physical Plan ==
VeloxColumnarToRow
+- ^(32281) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowUnquotedControlChars,true), value#616116, Some(America/Los_Angeles)) AS from_json(value)#616119]
   +- ^(32281) InputIteratorTransformer[value#616116]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#616116]
== Results ==
== Results ==
!== Correct Answer - 1 ==   == Gluten Answer - 1 ==
!struct<>                   struct<from_json(value):struct<str:string>>
![[ab]]                     [null]
|
GlutenJsonFunctionsSuite.from_json invalid json - check modes:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))]
+- LocalRelation [value#616687]
== Analyzed Logical Plan ==
from_json(value): struct<a:int,b:int,_unparsed:string>
Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#616687, Some(America/Los_Angeles)) AS from_json(value)#616690]
+- LocalRelation [value#616687]
== Optimized Logical Plan ==
Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#616687, Some(America/Los_Angeles)) AS from_json(value)#616690]
+- LocalRelation [value#616687]
== Physical Plan ==
VeloxColumnarToRow
+- ^(32335) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#616687, Some(America/Los_Angeles)) AS from_json(value)#616690]
   +- ^(32335) InputIteratorTransformer[value#616687]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#616687]
== Results ==
== Results ==
!== Correct Answer - 2 ==          == Gluten Answer - 2 ==
!struct<>                          struct<from_json(value):struct<a:int,b:int,_unparsed:string>>
 [[2,12,null]]                     [[2,12,null]]
![[null,null,{"a" 1, "b": 11}]]   [[null,null,null]]
|
GlutenJsonFunctionsSuite.SPARK-36069: from_json invalid json schema - check field name and field value:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))]
+- LocalRelation [value#616696]
== Analyzed Logical Plan ==
from_json(value): struct<a:int,b:int,_unparsed:string>
Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#616696, Some(America/Los_Angeles)) AS from_json(value)#616699]
+- LocalRelation [value#616696]
== Optimized Logical Plan ==
Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#616696, Some(America/Los_Angeles)) AS from_json(value)#616699]
+- LocalRelation [value#616696]
== Physical Plan ==
VeloxColumnarToRow
+- ^(32337) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#616696, Some(America/Los_Angeles)) AS from_json(value)#616699]
   +- ^(32337) InputIteratorTransformer[value#616696]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#616696]
== Results ==
== Results ==
!== Correct Answer - 2 ==          == Gluten Answer - 2 ==
!struct<>                          struct<from_json(value):struct<a:int,b:int,_unparsed:string>>
 [[2,12,null]]                     [[2,12,null]]
![[null,11,{"a": "1", "b": 11}]]   [[null,11,null]]
|
GlutenJsonFunctionsSuite.corrupt record column in the middle:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))]
+- LocalRelation [value#616705]
== Analyzed Logical Plan ==
from_json(value): struct<a:int,_unparsed:string,b:int>
Project [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#616705, Some(America/Los_Angeles)) AS from_json(value)#616708]
+- LocalRelation [value#616705]
== Optimized Logical Plan ==
Project [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#616705, Some(America/Los_Angeles)) AS from_json(value)#616708]
+- LocalRelation [value#616705]
== Physical Plan ==
VeloxColumnarToRow
+- ^(32339) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#616705, Some(America/Los_Angeles)) AS from_json(value)#616708]
+- ^(32339) InputIteratorTransformer[value#616705]
+- RowToVeloxColumnar
+- LocalTableScan [value#616705]
== Results ==
== Results ==
!== Correct Answer - 2 == == Gluten Answer - 2 ==
!struct<> struct<from_json(value):struct<a:int,_unparsed:string,b:int>>
[[2,null,12]] [[2,null,12]]
![[null,{"a" 1, "b": 11},null]] [[null,null,null]]
|
GlutenJsonFunctionsSuite.SPARK-33134: return partial results only for root JSON objects:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))]
+- Project [value#616960 AS c0#616963]
+- LocalRelation [value#616960]
== Analyzed Logical Plan ==
from_json(c0): struct<data:struct<c1:bigint,c2:array<struct<c3:bigint,c4:string>>>>
Project [from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), c0#616963, Some(America/Los_Angeles)) AS from_json(c0)#616970]
+- Project [value#616960 AS c0#616963]
+- LocalRelation [value#616960]
== Optimized Logical Plan ==
Project [from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), value#616960, Some(America/Los_Angeles)) AS from_json(c0)#616970]
+- LocalRelation [value#616960]
== Physical Plan ==
VeloxColumnarToRow
+- ^(32368) ProjectExecTransformer [from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), value#616960, Some(America/Los_Angeles)) AS from_json(c0)#616970]
+- ^(32368) InputIteratorTransformer[value#616960]
+- RowToVeloxColumnar
+- LocalTableScan [value#616960]
== Results ==
== Results ==
!== Correct Answer - 1 == == Gluten Answer - 1 ==
!struct<> struct<from_json(c0):struct<data:struct<c1:bigint,c2:array<struct<c3:bigint,c4:string>>>>>
![[null]] [[[123456,null]]]
|
GlutenJsonFunctionsSuite.SPARK-40646: return partial results for JSON arrays with objects:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(ArrayType(StructType(StructField(c1,StringType,true),StructField(c2,ArrayType(StructType(StructField(a,LongType,true)),true),true)),true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))]
+- Project [value#616976 AS c0#616979]
+- LocalRelation [value#616976]
== Analyzed Logical Plan ==
from_json(c0): array<struct<c1:string,c2:array<struct<a:bigint>>>>
Project [from_json(ArrayType(StructType(StructField(c1,StringType,true),StructField(c2,ArrayType(StructType(StructField(a,LongType,true)),true),true)),true), c0#616979, Some(America/Los_Angeles)) AS from_json(c0)#616986]
+- Project [value#616976 AS c0#616979]
+- LocalRelation [value#616976]
== Optimized Logical Plan ==
Project [from_json(ArrayType(StructType(StructField(c1,StringType,true),StructField(c2,ArrayType(StructType(StructField(a,LongType,true)),true),true)),true), value#616976, Some(America/Los_Angeles)) AS from_json(c0)#616986]
+- LocalRelation [value#616976]
== Physical Plan ==
VeloxColumnarToRow
+- ^(32372) ProjectExecTransformer [from_json(ArrayType(StructType(StructField(c1,StringType,true),StructField(c2,ArrayType(StructType(StructField(a,LongType,true)),true),true)),true), value#616976, Some(America/Los_Angeles)) AS from_json(c0)#616986]
+- ^(32372) InputIteratorTransformer[value#616976]
+- RowToVeloxColumnar
+- LocalTableScan [value#616976]
== Results ==
== Results ==
!== Correct Answer - 1 == == Gluten Answer - 1 ==
!struct<> struct<from_json(c0):array<struct<c1:string,c2:array<struct<a:bigint>>>>>
![null] [ArrayBuffer([abc,null])]
|
GlutenJsonFunctionsSuite.SPARK-40646: return partial results for JSON maps:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(c1,MapType(StringType,IntegerType,true),true), StructField(c2,StringType,true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))]
+- Project [value#616992 AS c0#616995]
+- LocalRelation [value#616992]
== Analyzed Logical Plan ==
from_json(c0): struct<c1:map<string,int>,c2:string>
Project [from_json(StructField(c1,MapType(StringType,IntegerType,true),true), StructField(c2,StringType,true), c0#616995, Some(America/Los_Angeles)) AS from_json(c0)#617002]
+- Project [value#616992 AS c0#616995]
+- LocalRelation [value#616992]
== Optimized Logical Plan ==
Project [from_json(StructField(c1,MapType(StringType,IntegerType,true),true), StructField(c2,StringType,true), value#616992, Some(America/Los_Angeles)) AS from_json(c0)#617002]
+- LocalRelation [value#616992]
== Physical Plan ==
VeloxColumnarToRow
+- ^(32376) ProjectExecTransformer [from_json(StructField(c1,MapType(StringType,IntegerType,true),true), StructField(c2,StringType,true), value#616992, Some(America/Los_Angeles)) AS from_json(c0)#617002]
+- ^(32376) InputIteratorTransformer[value#616992]
+- RowToVeloxColumnar
+- LocalTableScan [value#616992]
== Results ==
== Results ==
!== Correct Answer - 1 == == Gluten Answer - 1 ==
!struct<> struct<from_json(c0):struct<c1:map<string,int>,c2:string>>
![[null,null]] [[null,abc]]
|
GlutenJsonFunctionsSuite.SPARK-40646: return partial results for objects with values as JSON arrays:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(ArrayType(StructType(StructField(c1,ArrayType(StructType(StructField(c2,ArrayType(IntegerType,true),true)),true),true)),true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$7883/1730354573@3f63dd33))]
+- Project [value#617030 AS c0#617033]
+- LocalRelation [value#617030]
== Analyzed Logical Plan ==
from_json(c0): array<struct<c1:array<struct<c2:array<int>>>>>
Project [from_json(ArrayType(StructType(StructField(c1,ArrayType(StructType(StructField(c2,ArrayType(IntegerType,true),true)),true),true)),true), c0#617033, Some(America/Los_Angeles)) AS from_json(c0)#617040]
+- Project [value#617030 AS c0#617033]
+- LocalRelation [value#617030]
== Optimized Logical Plan ==
Project [from_json(ArrayType(StructType(StructField(c1,ArrayType(StructType(StructField(c2,ArrayType(IntegerType,true),true)),true),true)),true), value#617030, Some(America/Los_Angeles)) AS from_json(c0)#617040]
+- LocalRelation [value#617030]
== Physical Plan ==
VeloxColumnarToRow
+- ^(32384) ProjectExecTransformer [from_json(ArrayType(StructType(StructField(c1,ArrayType(StructType(StructField(c2,ArrayType(IntegerType,true),true)),true),true)),true), value#617030, Some(America/Los_Angeles)) AS from_json(c0)#617040]
+- ^(32384) InputIteratorTransformer[value#617030]
+- RowToVeloxColumnar
+- LocalTableScan [value#617030]
== Results ==
== Results ==
!== Correct Answer - 1 == == Gluten Answer - 1 ==
!struct<> struct<from_json(c0):array<struct<c1:array<struct<c2:array<int>>>>>>
![null] [ArrayBuffer([WrappedArray([null])])]
|
GlutenJsonExpressionsSuite.from_json - input=array, schema=struct, output=single row:
org/apache/spark/sql/catalyst/expressions/GlutenJsonExpressionsSuite#L21
Incorrect evaluation: from_json(StructField(a,IntegerType,true), StructField(corrupted,StringType,true), (columnNameOfCorruptRecord,corrupted), [{"a": 1}, {"a": 2}], Some(UTC)), actual: [null,], expected: [null,[{"a": 1}, {"a": 2}]]
|
GlutenJsonFunctionsSuite.from_json with option (allowComments):
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowComments,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))]
+- LocalRelation [value#691605]
== Analyzed Logical Plan ==
from_json(value): struct<str:string>
Project [from_json(StructField(str,StringType,true), (allowComments,true), value#691605, Some(America/Los_Angeles)) AS from_json(value)#691608]
+- LocalRelation [value#691605]
== Optimized Logical Plan ==
Project [from_json(StructField(str,StringType,true), (allowComments,true), value#691605, Some(America/Los_Angeles)) AS from_json(value)#691608]
+- LocalRelation [value#691605]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36031) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowComments,true), value#691605, Some(America/Los_Angeles)) AS from_json(value)#691608]
+- ^(36031) InputIteratorTransformer[value#691605]
+- RowToVeloxColumnar
+- LocalTableScan [value#691605]
== Results ==
== Results ==
!== Correct Answer - 1 == == Gluten Answer - 1 ==
!struct<> struct<from_json(value):struct<str:string>>
![[World]] [[null]]
|
GlutenJsonFunctionsSuite.from_json with option (allowUnquotedFieldNames):
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowUnquotedFieldNames,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))]
+- LocalRelation [value#691614]
== Analyzed Logical Plan ==
from_json(value): struct<str:string>
Project [from_json(StructField(str,StringType,true), (allowUnquotedFieldNames,true), value#691614, Some(America/Los_Angeles)) AS from_json(value)#691617]
+- LocalRelation [value#691614]
== Optimized Logical Plan ==
Project [from_json(StructField(str,StringType,true), (allowUnquotedFieldNames,true), value#691614, Some(America/Los_Angeles)) AS from_json(value)#691617]
+- LocalRelation [value#691614]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36033) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowUnquotedFieldNames,true), value#691614, Some(America/Los_Angeles)) AS from_json(value)#691617]
+- ^(36033) InputIteratorTransformer[value#691614]
+- RowToVeloxColumnar
+- LocalTableScan [value#691614]
== Results ==
== Results ==
!== Correct Answer - 1 == == Gluten Answer - 1 ==
!struct<> struct<from_json(value):struct<str:string>>
![[World]] [[null]]
|
GlutenJsonFunctionsSuite.from_json with option (allowSingleQuotes):
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowSingleQuotes,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))]
+- LocalRelation [value#691623]
== Analyzed Logical Plan ==
from_json(value): struct<str:string>
Project [from_json(StructField(str,StringType,true), (allowSingleQuotes,true), value#691623, Some(America/Los_Angeles)) AS from_json(value)#691626]
+- LocalRelation [value#691623]
== Optimized Logical Plan ==
Project [from_json(StructField(str,StringType,true), (allowSingleQuotes,true), value#691623, Some(America/Los_Angeles)) AS from_json(value)#691626]
+- LocalRelation [value#691623]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36035) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowSingleQuotes,true), value#691623, Some(America/Los_Angeles)) AS from_json(value)#691626]
+- ^(36035) InputIteratorTransformer[value#691623]
+- RowToVeloxColumnar
+- LocalTableScan [value#691623]
== Results ==
== Results ==
!== Correct Answer - 1 == == Gluten Answer - 1 ==
!struct<> struct<from_json(value):struct<str:string>>
![[World]] [[null]]
|
GlutenJsonFunctionsSuite.from_json with option (allowNumericLeadingZeros):
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(int,IntegerType,true), (allowNumericLeadingZeros,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))]
+- LocalRelation [value#691632]
== Analyzed Logical Plan ==
from_json(value): struct<int:int>
Project [from_json(StructField(int,IntegerType,true), (allowNumericLeadingZeros,true), value#691632, Some(America/Los_Angeles)) AS from_json(value)#691635]
+- LocalRelation [value#691632]
== Optimized Logical Plan ==
Project [from_json(StructField(int,IntegerType,true), (allowNumericLeadingZeros,true), value#691632, Some(America/Los_Angeles)) AS from_json(value)#691635]
+- LocalRelation [value#691632]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36037) ProjectExecTransformer [from_json(StructField(int,IntegerType,true), (allowNumericLeadingZeros,true), value#691632, Some(America/Los_Angeles)) AS from_json(value)#691635]
+- ^(36037) InputIteratorTransformer[value#691632]
+- RowToVeloxColumnar
+- LocalTableScan [value#691632]
== Results ==
== Results ==
!== Correct Answer - 1 == == Gluten Answer - 1 ==
!struct<> struct<from_json(value):struct<int:int>>
![[18]] [[null]]
|
GlutenJsonFunctionsSuite.from_json with option (allowBackslashEscapingAnyCharacter):
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowBackslashEscapingAnyCharacter,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))]
+- LocalRelation [value#691641]
== Analyzed Logical Plan ==
from_json(value): struct<str:string>
Project [from_json(StructField(str,StringType,true), (allowBackslashEscapingAnyCharacter,true), value#691641, Some(America/Los_Angeles)) AS from_json(value)#691644]
+- LocalRelation [value#691641]
== Optimized Logical Plan ==
Project [from_json(StructField(str,StringType,true), (allowBackslashEscapingAnyCharacter,true), value#691641, Some(America/Los_Angeles)) AS from_json(value)#691644]
+- LocalRelation [value#691641]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36039) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowBackslashEscapingAnyCharacter,true), value#691641, Some(America/Los_Angeles)) AS from_json(value)#691644]
+- ^(36039) InputIteratorTransformer[value#691641]
+- RowToVeloxColumnar
+- LocalTableScan [value#691641]
== Results ==
== Results ==
!== Correct Answer - 1 == == Gluten Answer - 1 ==
!struct<> struct<from_json(value):struct<str:string>>
![[$10]] [[null]]
|
GlutenJsonFunctionsSuite.from_json with option (allowUnquotedControlChars):
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowUnquotedControlChars,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))]
+- LocalRelation [value#691659]
== Analyzed Logical Plan ==
from_json(value): struct<str:string>
Project [from_json(StructField(str,StringType,true), (allowUnquotedControlChars,true), value#691659, Some(America/Los_Angeles)) AS from_json(value)#691662]
+- LocalRelation [value#691659]
== Optimized Logical Plan ==
Project [from_json(StructField(str,StringType,true), (allowUnquotedControlChars,true), value#691659, Some(America/Los_Angeles)) AS from_json(value)#691662]
+- LocalRelation [value#691659]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36041) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowUnquotedControlChars,true), value#691659, Some(America/Los_Angeles)) AS from_json(value)#691662]
+- ^(36041) InputIteratorTransformer[value#691659]
+- RowToVeloxColumnar
+- LocalTableScan [value#691659]
== Results ==
== Results ==
!== Correct Answer - 1 == == Gluten Answer - 1 ==
!struct<> struct<from_json(value):struct<str:string>>
![[ab]] [null]
|
GlutenJsonFunctionsSuite.from_json invalid json - check modes:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))]
+- LocalRelation [value#692230]
== Analyzed Logical Plan ==
from_json(value): struct<a:int,b:int,_unparsed:string>
Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#692230, Some(America/Los_Angeles)) AS from_json(value)#692233]
+- LocalRelation [value#692230]
== Optimized Logical Plan ==
Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#692230, Some(America/Los_Angeles)) AS from_json(value)#692233]
+- LocalRelation [value#692230]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36095) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#692230, Some(America/Los_Angeles)) AS from_json(value)#692233]
+- ^(36095) InputIteratorTransformer[value#692230]
+- RowToVeloxColumnar
+- LocalTableScan [value#692230]
== Results ==
== Results ==
!== Correct Answer - 2 == == Gluten Answer - 2 ==
!struct<> struct<from_json(value):struct<a:int,b:int,_unparsed:string>>
[[2,12,null]] [[2,12,null]]
![[null,null,{"a" 1, "b": 11}]] [[null,null,null]]
|
GlutenJsonFunctionsSuite.SPARK-36069: from_json invalid json schema - check field name and field value:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))]
+- LocalRelation [value#692239]
== Analyzed Logical Plan ==
from_json(value): struct<a:int,b:int,_unparsed:string>
Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#692239, Some(America/Los_Angeles)) AS from_json(value)#692242]
+- LocalRelation [value#692239]
== Optimized Logical Plan ==
Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#692239, Some(America/Los_Angeles)) AS from_json(value)#692242]
+- LocalRelation [value#692239]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36097) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#692239, Some(America/Los_Angeles)) AS from_json(value)#692242]
+- ^(36097) InputIteratorTransformer[value#692239]
+- RowToVeloxColumnar
+- LocalTableScan [value#692239]
== Results ==
== Results ==
!== Correct Answer - 2 == == Gluten Answer - 2 ==
!struct<> struct<from_json(value):struct<a:int,b:int,_unparsed:string>>
[[2,12,null]] [[2,12,null]]
![[null,11,{"a": "1", "b": 11}]] [[null,11,null]]
|
GlutenJsonFunctionsSuite.corrupt record column in the middle:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))]
+- LocalRelation [value#692248]
== Analyzed Logical Plan ==
from_json(value): struct<a:int,_unparsed:string,b:int>
Project [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#692248, Some(America/Los_Angeles)) AS from_json(value)#692251]
+- LocalRelation [value#692248]
== Optimized Logical Plan ==
Project [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#692248, Some(America/Los_Angeles)) AS from_json(value)#692251]
+- LocalRelation [value#692248]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36099) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#692248, Some(America/Los_Angeles)) AS from_json(value)#692251]
+- ^(36099) InputIteratorTransformer[value#692248]
+- RowToVeloxColumnar
+- LocalTableScan [value#692248]
== Results ==
== Results ==
!== Correct Answer - 2 == == Gluten Answer - 2 ==
!struct<> struct<from_json(value):struct<a:int,_unparsed:string,b:int>>
[[2,null,12]] [[2,null,12]]
![[null,{"a" 1, "b": 11},null]] [[null,null,null]]
|
GlutenJsonFunctionsSuite.SPARK-33134: return partial results only for root JSON objects:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))]
+- Project [value#692503 AS c0#692506]
+- LocalRelation [value#692503]
== Analyzed Logical Plan ==
from_json(c0): struct<data:struct<c1:bigint,c2:array<struct<c3:bigint,c4:string>>>>
Project [from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), c0#692506, Some(America/Los_Angeles)) AS from_json(c0)#692513]
+- Project [value#692503 AS c0#692506]
+- LocalRelation [value#692503]
== Optimized Logical Plan ==
Project [from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), value#692503, Some(America/Los_Angeles)) AS from_json(c0)#692513]
+- LocalRelation [value#692503]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36128) ProjectExecTransformer [from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), value#692503, Some(America/Los_Angeles)) AS from_json(c0)#692513]
+- ^(36128) InputIteratorTransformer[value#692503]
+- RowToVeloxColumnar
+- LocalTableScan [value#692503]
== Results ==
== Results ==
!== Correct Answer - 1 == == Gluten Answer - 1 ==
!struct<> struct<from_json(c0):struct<data:struct<c1:bigint,c2:array<struct<c3:bigint,c4:string>>>>>
![[null]] [[[123456,null]]]
|
GlutenJsonFunctionsSuite.SPARK-40646: return partial results for JSON arrays with objects:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(ArrayType(StructType(StructField(c1,StringType,true),StructField(c2,ArrayType(StructType(StructField(a,LongType,true)),true),true)),true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))]
+- Project [value#692519 AS c0#692522]
+- LocalRelation [value#692519]
== Analyzed Logical Plan ==
from_json(c0): array<struct<c1:string,c2:array<struct<a:bigint>>>>
Project [from_json(ArrayType(StructType(StructField(c1,StringType,true),StructField(c2,ArrayType(StructType(StructField(a,LongType,true)),true),true)),true), c0#692522, Some(America/Los_Angeles)) AS from_json(c0)#692529]
+- Project [value#692519 AS c0#692522]
+- LocalRelation [value#692519]
== Optimized Logical Plan ==
Project [from_json(ArrayType(StructType(StructField(c1,StringType,true),StructField(c2,ArrayType(StructType(StructField(a,LongType,true)),true),true)),true), value#692519, Some(America/Los_Angeles)) AS from_json(c0)#692529]
+- LocalRelation [value#692519]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36132) ProjectExecTransformer [from_json(ArrayType(StructType(StructField(c1,StringType,true),StructField(c2,ArrayType(StructType(StructField(a,LongType,true)),true),true)),true), value#692519, Some(America/Los_Angeles)) AS from_json(c0)#692529]
+- ^(36132) InputIteratorTransformer[value#692519]
+- RowToVeloxColumnar
+- LocalTableScan [value#692519]
== Results ==
== Results ==
!== Correct Answer - 1 == == Gluten Answer - 1 ==
!struct<> struct<from_json(c0):array<struct<c1:string,c2:array<struct<a:bigint>>>>>
![null] [ArrayBuffer([abc,null])]
|
GlutenJsonFunctionsSuite.SPARK-40646: return partial results for JSON maps:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(c1,MapType(StringType,IntegerType,true),true), StructField(c2,StringType,true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))]
+- Project [value#692535 AS c0#692538]
   +- LocalRelation [value#692535]
== Analyzed Logical Plan ==
from_json(c0): struct<c1:map<string,int>,c2:string>
Project [from_json(StructField(c1,MapType(StringType,IntegerType,true),true), StructField(c2,StringType,true), c0#692538, Some(America/Los_Angeles)) AS from_json(c0)#692545]
+- Project [value#692535 AS c0#692538]
   +- LocalRelation [value#692535]
== Optimized Logical Plan ==
Project [from_json(StructField(c1,MapType(StringType,IntegerType,true),true), StructField(c2,StringType,true), value#692535, Some(America/Los_Angeles)) AS from_json(c0)#692545]
+- LocalRelation [value#692535]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36136) ProjectExecTransformer [from_json(StructField(c1,MapType(StringType,IntegerType,true),true), StructField(c2,StringType,true), value#692535, Some(America/Los_Angeles)) AS from_json(c0)#692545]
   +- ^(36136) InputIteratorTransformer[value#692535]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#692535]
== Results ==
== Results ==
!== Correct Answer - 1 == == Gluten Answer - 1 ==
!struct<> struct<from_json(c0):struct<c1:map<string,int>,c2:string>>
![[null,null]] [[null,abc]]
|
GlutenJsonFunctionsSuite.SPARK-40646: return partial results for objects with values as JSON arrays:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(ArrayType(StructType(StructField(c1,ArrayType(StructType(StructField(c2,ArrayType(IntegerType,true),true)),true),true)),true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))]
+- Project [value#692573 AS c0#692576]
   +- LocalRelation [value#692573]
== Analyzed Logical Plan ==
from_json(c0): array<struct<c1:array<struct<c2:array<int>>>>>
Project [from_json(ArrayType(StructType(StructField(c1,ArrayType(StructType(StructField(c2,ArrayType(IntegerType,true),true)),true),true)),true), c0#692576, Some(America/Los_Angeles)) AS from_json(c0)#692583]
+- Project [value#692573 AS c0#692576]
   +- LocalRelation [value#692573]
== Optimized Logical Plan ==
Project [from_json(ArrayType(StructType(StructField(c1,ArrayType(StructType(StructField(c2,ArrayType(IntegerType,true),true)),true),true)),true), value#692573, Some(America/Los_Angeles)) AS from_json(c0)#692583]
+- LocalRelation [value#692573]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36144) ProjectExecTransformer [from_json(ArrayType(StructType(StructField(c1,ArrayType(StructType(StructField(c2,ArrayType(IntegerType,true),true)),true),true)),true), value#692573, Some(America/Los_Angeles)) AS from_json(c0)#692583]
   +- ^(36144) InputIteratorTransformer[value#692573]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#692573]
== Results ==
== Results ==
!== Correct Answer - 1 == == Gluten Answer - 1 ==
!struct<> struct<from_json(c0):array<struct<c1:array<struct<c2:array<int>>>>>>
![null] [ArrayBuffer([WrappedArray([null])])]
|
GlutenJsonFunctionsSuite.SPARK-48863: parse object as an array with partial results enabled:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(ArrayType(StructType(StructField(a,StringType,true),StructField(c,IntegerType,true)),true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$8420/496914716@3bb05608))]
+- Project [value#692611 AS c0#692614]
   +- LocalRelation [value#692611]
== Analyzed Logical Plan ==
from_json(c0): array<struct<a:string,c:int>>
Project [from_json(ArrayType(StructType(StructField(a,StringType,true),StructField(c,IntegerType,true)),true), c0#692614, Some(America/Los_Angeles)) AS from_json(c0)#692621]
+- Project [value#692611 AS c0#692614]
   +- LocalRelation [value#692611]
== Optimized Logical Plan ==
Project [from_json(ArrayType(StructType(StructField(a,StringType,true),StructField(c,IntegerType,true)),true), value#692611, Some(America/Los_Angeles)) AS from_json(c0)#692621]
+- LocalRelation [value#692611]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36152) ProjectExecTransformer [from_json(ArrayType(StructType(StructField(a,StringType,true),StructField(c,IntegerType,true)),true), value#692611, Some(America/Los_Angeles)) AS from_json(c0)#692621]
   +- ^(36152) InputIteratorTransformer[value#692611]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#692611]
== Results ==
== Results ==
!== Correct Answer - 1 == == Gluten Answer - 1 ==
!struct<> struct<from_json(c0):array<struct<a:string,c:int>>>
![null] [ArrayBuffer([b,null])]
|
GlutenJsonExpressionsSuite.from_json - input=array, schema=struct, output=single row:
org/apache/spark/sql/catalyst/expressions/GlutenJsonExpressionsSuite#L21
Incorrect evaluation: from_json(StructField(a,IntegerType,true), StructField(corrupted,StringType,true), (columnNameOfCorruptRecord,corrupted), [{"a": 1}, {"a": 2}], Some(UTC)), actual: [null,], expected: [null,[{"a": 1}, {"a": 2}]]
|
GlutenJsonFunctionsSuite.from_json with option (allowComments):
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowComments,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8360/495944540@12b373f0))]
+- LocalRelation [value#692128]
== Analyzed Logical Plan ==
from_json(value): struct<str:string>
Project [from_json(StructField(str,StringType,true), (allowComments,true), value#692128, Some(America/Los_Angeles)) AS from_json(value)#692131]
+- LocalRelation [value#692128]
== Optimized Logical Plan ==
Project [from_json(StructField(str,StringType,true), (allowComments,true), value#692128, Some(America/Los_Angeles)) AS from_json(value)#692131]
+- LocalRelation [value#692128]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36034) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowComments,true), value#692128, Some(America/Los_Angeles)) AS from_json(value)#692131]
   +- ^(36034) InputIteratorTransformer[value#692128]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#692128]
== Results ==
== Results ==
!== Correct Answer - 1 == == Gluten Answer - 1 ==
!struct<> struct<from_json(value):struct<str:string>>
![[World]] [[null]]
|
GlutenJsonFunctionsSuite.from_json with option (allowUnquotedFieldNames):
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowUnquotedFieldNames,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8360/495944540@12b373f0))]
+- LocalRelation [value#692137]
== Analyzed Logical Plan ==
from_json(value): struct<str:string>
Project [from_json(StructField(str,StringType,true), (allowUnquotedFieldNames,true), value#692137, Some(America/Los_Angeles)) AS from_json(value)#692140]
+- LocalRelation [value#692137]
== Optimized Logical Plan ==
Project [from_json(StructField(str,StringType,true), (allowUnquotedFieldNames,true), value#692137, Some(America/Los_Angeles)) AS from_json(value)#692140]
+- LocalRelation [value#692137]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36036) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowUnquotedFieldNames,true), value#692137, Some(America/Los_Angeles)) AS from_json(value)#692140]
   +- ^(36036) InputIteratorTransformer[value#692137]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#692137]
== Results ==
== Results ==
!== Correct Answer - 1 == == Gluten Answer - 1 ==
!struct<> struct<from_json(value):struct<str:string>>
![[World]] [[null]]
|
GlutenJsonFunctionsSuite.from_json with option (allowSingleQuotes):
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowSingleQuotes,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8360/495944540@12b373f0))]
+- LocalRelation [value#692146]
== Analyzed Logical Plan ==
from_json(value): struct<str:string>
Project [from_json(StructField(str,StringType,true), (allowSingleQuotes,true), value#692146, Some(America/Los_Angeles)) AS from_json(value)#692149]
+- LocalRelation [value#692146]
== Optimized Logical Plan ==
Project [from_json(StructField(str,StringType,true), (allowSingleQuotes,true), value#692146, Some(America/Los_Angeles)) AS from_json(value)#692149]
+- LocalRelation [value#692146]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36038) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowSingleQuotes,true), value#692146, Some(America/Los_Angeles)) AS from_json(value)#692149]
   +- ^(36038) InputIteratorTransformer[value#692146]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#692146]
== Results ==
== Results ==
!== Correct Answer - 1 == == Gluten Answer - 1 ==
!struct<> struct<from_json(value):struct<str:string>>
![[World]] [[null]]
|
GlutenJsonFunctionsSuite.from_json with option (allowNumericLeadingZeros):
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(int,IntegerType,true), (allowNumericLeadingZeros,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8360/495944540@12b373f0))]
+- LocalRelation [value#692155]
== Analyzed Logical Plan ==
from_json(value): struct<int:int>
Project [from_json(StructField(int,IntegerType,true), (allowNumericLeadingZeros,true), value#692155, Some(America/Los_Angeles)) AS from_json(value)#692158]
+- LocalRelation [value#692155]
== Optimized Logical Plan ==
Project [from_json(StructField(int,IntegerType,true), (allowNumericLeadingZeros,true), value#692155, Some(America/Los_Angeles)) AS from_json(value)#692158]
+- LocalRelation [value#692155]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36040) ProjectExecTransformer [from_json(StructField(int,IntegerType,true), (allowNumericLeadingZeros,true), value#692155, Some(America/Los_Angeles)) AS from_json(value)#692158]
   +- ^(36040) InputIteratorTransformer[value#692155]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#692155]
== Results ==
== Results ==
!== Correct Answer - 1 == == Gluten Answer - 1 ==
!struct<> struct<from_json(value):struct<int:int>>
![[18]] [[null]]
|
GlutenJsonFunctionsSuite.from_json with option (allowBackslashEscapingAnyCharacter):
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowBackslashEscapingAnyCharacter,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8360/495944540@12b373f0))]
+- LocalRelation [value#692164]
== Analyzed Logical Plan ==
from_json(value): struct<str:string>
Project [from_json(StructField(str,StringType,true), (allowBackslashEscapingAnyCharacter,true), value#692164, Some(America/Los_Angeles)) AS from_json(value)#692167]
+- LocalRelation [value#692164]
== Optimized Logical Plan ==
Project [from_json(StructField(str,StringType,true), (allowBackslashEscapingAnyCharacter,true), value#692164, Some(America/Los_Angeles)) AS from_json(value)#692167]
+- LocalRelation [value#692164]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36042) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowBackslashEscapingAnyCharacter,true), value#692164, Some(America/Los_Angeles)) AS from_json(value)#692167]
   +- ^(36042) InputIteratorTransformer[value#692164]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#692164]
== Results ==
== Results ==
!== Correct Answer - 1 == == Gluten Answer - 1 ==
!struct<> struct<from_json(value):struct<str:string>>
![[$10]] [[null]]
|
GlutenJsonFunctionsSuite.from_json with option (allowUnquotedControlChars):
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(str,StringType,true), (allowUnquotedControlChars,true), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8360/495944540@12b373f0))]
+- LocalRelation [value#692182]
== Analyzed Logical Plan ==
from_json(value): struct<str:string>
Project [from_json(StructField(str,StringType,true), (allowUnquotedControlChars,true), value#692182, Some(America/Los_Angeles)) AS from_json(value)#692185]
+- LocalRelation [value#692182]
== Optimized Logical Plan ==
Project [from_json(StructField(str,StringType,true), (allowUnquotedControlChars,true), value#692182, Some(America/Los_Angeles)) AS from_json(value)#692185]
+- LocalRelation [value#692182]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36044) ProjectExecTransformer [from_json(StructField(str,StringType,true), (allowUnquotedControlChars,true), value#692182, Some(America/Los_Angeles)) AS from_json(value)#692185]
   +- ^(36044) InputIteratorTransformer[value#692182]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#692182]
== Results ==
== Results ==
!== Correct Answer - 1 == == Gluten Answer - 1 ==
!struct<> struct<from_json(value):struct<str:string>>
![[ab]] [null]
|
GlutenJsonFunctionsSuite.from_json invalid json - check modes:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8360/495944540@12b373f0))]
+- LocalRelation [value#692753]
== Analyzed Logical Plan ==
from_json(value): struct<a:int,b:int,_unparsed:string>
Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#692753, Some(America/Los_Angeles)) AS from_json(value)#692756]
+- LocalRelation [value#692753]
== Optimized Logical Plan ==
Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#692753, Some(America/Los_Angeles)) AS from_json(value)#692756]
+- LocalRelation [value#692753]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36098) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#692753, Some(America/Los_Angeles)) AS from_json(value)#692756]
   +- ^(36098) InputIteratorTransformer[value#692753]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#692753]
== Results ==
== Results ==
!== Correct Answer - 2 == == Gluten Answer - 2 ==
!struct<> struct<from_json(value):struct<a:int,b:int,_unparsed:string>>
[[2,12,null]] [[2,12,null]]
![[null,null,{"a" 1, "b": 11}]] [[null,null,null]]
|
GlutenJsonFunctionsSuite.SPARK-36069: from_json invalid json schema - check field name and field value:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8360/495944540@12b373f0))]
+- LocalRelation [value#692762]
== Analyzed Logical Plan ==
from_json(value): struct<a:int,b:int,_unparsed:string>
Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#692762, Some(America/Los_Angeles)) AS from_json(value)#692765]
+- LocalRelation [value#692762]
== Optimized Logical Plan ==
Project [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#692762, Some(America/Los_Angeles)) AS from_json(value)#692765]
+- LocalRelation [value#692762]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36100) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(b,IntegerType,true), StructField(_unparsed,StringType,true), (mode,PERMISSIVE), value#692762, Some(America/Los_Angeles)) AS from_json(value)#692765]
   +- ^(36100) InputIteratorTransformer[value#692762]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#692762]
== Results ==
== Results ==
!== Correct Answer - 2 == == Gluten Answer - 2 ==
!struct<> struct<from_json(value):struct<a:int,b:int,_unparsed:string>>
[[2,12,null]] [[2,12,null]]
![[null,11,{"a": "1", "b": 11}]] [[null,11,null]]
|
GlutenJsonFunctionsSuite.corrupt record column in the middle:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), 'value, None), Some(org.apache.spark.sql.Column$$Lambda$8360/495944540@12b373f0))]
+- LocalRelation [value#692771]
== Analyzed Logical Plan ==
from_json(value): struct<a:int,_unparsed:string,b:int>
Project [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#692771, Some(America/Los_Angeles)) AS from_json(value)#692774]
+- LocalRelation [value#692771]
== Optimized Logical Plan ==
Project [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#692771, Some(America/Los_Angeles)) AS from_json(value)#692774]
+- LocalRelation [value#692771]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36102) ProjectExecTransformer [from_json(StructField(a,IntegerType,true), StructField(_unparsed,StringType,true), StructField(b,IntegerType,true), (columnNameOfCorruptRecord,_unparsed), value#692771, Some(America/Los_Angeles)) AS from_json(value)#692774]
   +- ^(36102) InputIteratorTransformer[value#692771]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#692771]
== Results ==
== Results ==
!== Correct Answer - 2 == == Gluten Answer - 2 ==
!struct<> struct<from_json(value):struct<a:int,_unparsed:string,b:int>>
[[2,null,12]] [[2,null,12]]
![[null,{"a" 1, "b": 11},null]] [[null,null,null]]
|
GlutenJsonFunctionsSuite.SPARK-33134: return partial results only for root JSON objects:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$8360/495944540@12b373f0))]
+- Project [value#693026 AS c0#693029]
   +- LocalRelation [value#693026]
== Analyzed Logical Plan ==
from_json(c0): struct<data:struct<c1:bigint,c2:array<struct<c3:bigint,c4:string>>>>
Project [from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), c0#693029, Some(America/Los_Angeles)) AS from_json(c0)#693036]
+- Project [value#693026 AS c0#693029]
   +- LocalRelation [value#693026]
== Optimized Logical Plan ==
Project [from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), value#693026, Some(America/Los_Angeles)) AS from_json(c0)#693036]
+- LocalRelation [value#693026]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36131) ProjectExecTransformer [from_json(StructField(data,StructType(StructField(c1,LongType,true),StructField(c2,ArrayType(StructType(StructField(c3,LongType,true),StructField(c4,StringType,true)),true),true)),true), value#693026, Some(America/Los_Angeles)) AS from_json(c0)#693036]
   +- ^(36131) InputIteratorTransformer[value#693026]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#693026]
== Results ==
== Results ==
!== Correct Answer - 1 == == Gluten Answer - 1 ==
!struct<> struct<from_json(c0):struct<data:struct<c1:bigint,c2:array<struct<c3:bigint,c4:string>>>>>
![[null]] [[[123456,null]]]
|
GlutenJsonFunctionsSuite.SPARK-40646: return partial results for JSON arrays with objects:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(ArrayType(StructType(StructField(c1,StringType,true),StructField(c2,ArrayType(StructType(StructField(a,LongType,true)),true),true)),true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$8360/495944540@12b373f0))]
+- Project [value#693042 AS c0#693045]
   +- LocalRelation [value#693042]
== Analyzed Logical Plan ==
from_json(c0): array<struct<c1:string,c2:array<struct<a:bigint>>>>
Project [from_json(ArrayType(StructType(StructField(c1,StringType,true),StructField(c2,ArrayType(StructType(StructField(a,LongType,true)),true),true)),true), c0#693045, Some(America/Los_Angeles)) AS from_json(c0)#693052]
+- Project [value#693042 AS c0#693045]
   +- LocalRelation [value#693042]
== Optimized Logical Plan ==
Project [from_json(ArrayType(StructType(StructField(c1,StringType,true),StructField(c2,ArrayType(StructType(StructField(a,LongType,true)),true),true)),true), value#693042, Some(America/Los_Angeles)) AS from_json(c0)#693052]
+- LocalRelation [value#693042]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36135) ProjectExecTransformer [from_json(ArrayType(StructType(StructField(c1,StringType,true),StructField(c2,ArrayType(StructType(StructField(a,LongType,true)),true),true)),true), value#693042, Some(America/Los_Angeles)) AS from_json(c0)#693052]
   +- ^(36135) InputIteratorTransformer[value#693042]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#693042]
== Results ==
== Results ==
!== Correct Answer - 1 == == Gluten Answer - 1 ==
!struct<> struct<from_json(c0):array<struct<c1:string,c2:array<struct<a:bigint>>>>>
![null] [ArraySeq([abc,null])]
|
GlutenJsonFunctionsSuite.SPARK-40646: return partial results for JSON maps:
org/apache/spark/sql/GlutenJsonFunctionsSuite#L19
Results do not match for query:
Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
Timezone Env:
== Parsed Logical Plan ==
'Project [unresolvedalias(from_json(StructField(c1,MapType(StringType,IntegerType,true),true), StructField(c2,StringType,true), 'c0, None), Some(org.apache.spark.sql.Column$$Lambda$8360/495944540@12b373f0))]
+- Project [value#693058 AS c0#693061]
   +- LocalRelation [value#693058]
== Analyzed Logical Plan ==
from_json(c0): struct<c1:map<string,int>,c2:string>
Project [from_json(StructField(c1,MapType(StringType,IntegerType,true),true), StructField(c2,StringType,true), c0#693061, Some(America/Los_Angeles)) AS from_json(c0)#693068]
+- Project [value#693058 AS c0#693061]
   +- LocalRelation [value#693058]
== Optimized Logical Plan ==
Project [from_json(StructField(c1,MapType(StringType,IntegerType,true),true), StructField(c2,StringType,true), value#693058, Some(America/Los_Angeles)) AS from_json(c0)#693068]
+- LocalRelation [value#693058]
== Physical Plan ==
VeloxColumnarToRow
+- ^(36139) ProjectExecTransformer [from_json(StructField(c1,MapType(StringType,IntegerType,true),true), StructField(c2,StringType,true), value#693058, Some(America/Los_Angeles)) AS from_json(c0)#693068]
   +- ^(36139) InputIteratorTransformer[value#693058]
      +- RowToVeloxColumnar
         +- LocalTableScan [value#693058]
== Results ==
== Results ==
!== Correct Answer - 1 == == Gluten Answer - 1 ==
!struct<> struct<from_json(c0):struct<c1:map<string,int>,c2:string>>
![[null,null]] [[null,abc]]
|
Process
ubuntu-latest pipelines will use ubuntu-24.04 soon. For more details, see https://github.com/actions/runner-images/issues/10636
|