[VL] Align the timezone of timestamp partition value with Velox #29703

Triggered via pull request November 21, 2024 08:49
@rui-mo
synchronize #8015
Status Success
Total duration 16s

dev_cron.yml

on: pull_request_target

Annotations

11 errors
VeloxIcebergSuite.iceberg partition type - timestamp: org/apache/gluten/execution/VeloxIcebergSuite#L1
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [*] +- 'UnresolvedRelation [part_by_timestamp], [], false == Analyzed Logical Plan == p: timestamp Project [p#25736] +- SubqueryAlias spark_catalog.default.part_by_timestamp +- RelationV2[p#25736] spark_catalog.default.part_by_timestamp == Optimized Logical Plan == RelationV2[p#25736] spark_catalog.default.part_by_timestamp == Physical Plan == VeloxColumnarToRow +- ^(1204) IcebergIcebergScanTransformer[p#25736] spark_catalog.default.part_by_timestamp [filters=] RuntimeFilters: [] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == struct<> struct<> ![2022-01-01 00:01:20.0] [2022-01-01 08:01:20.0]
VeloxIcebergSuite.iceberg partition type - timestamp: org/apache/gluten/execution/VeloxIcebergSuite#L1
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [*] +- 'UnresolvedRelation [part_by_timestamp], [], false == Analyzed Logical Plan == p: timestamp Project [p#26224] +- SubqueryAlias spark_catalog.default.part_by_timestamp +- RelationV2[p#26224] spark_catalog.default.part_by_timestamp == Optimized Logical Plan == RelationV2[p#26224] spark_catalog.default.part_by_timestamp == Physical Plan == VeloxColumnarToRow +- ^(1241) IcebergIcebergScanTransformer[p#26224] spark_catalog.default.part_by_timestamp (branch=null) [filters=, groupedBy=] RuntimeFilters: [] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == struct<> struct<> ![2022-01-01 00:01:20.0] [2022-01-01 08:01:20.0]
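Both Iceberg failures differ from the expected value by exactly eight hours, which matches the raw offset (-28800000 ms) that the log reports for America/Los_Angeles. The following is a minimal Python sketch of how such a shift can arise, assuming the partition value is converted to UTC on one side and its wall-clock fields are then displayed without converting back to the session zone. The variable names are illustrative and not taken from Gluten or Velox, and a fixed offset stands in for the full zone rules, which is adequate for a January date.

```python
from datetime import datetime, timedelta, timezone

# Raw offset the log reports for America/Los_Angeles, in milliseconds.
RAW_OFFSET_MS = -28_800_000
session_tz = timezone(timedelta(milliseconds=RAW_OFFSET_MS))

# Expected partition value from the failing test, as wall-clock time
# in the session time zone.
expected_local = datetime(2022, 1, 1, 0, 1, 20, tzinfo=session_tz)

# The same instant expressed in UTC.
as_utc = expected_local.astimezone(timezone.utc)

# Assumed failure mode: the UTC wall-clock fields are relabelled with the
# session zone instead of being converted back to it.
misread = as_utc.replace(tzinfo=session_tz)

print(expected_local.strftime("%Y-%m-%d %H:%M:%S"))
print(misread.strftime("%Y-%m-%d %H:%M:%S"))
print(misread - expected_local)
```

Under this assumption the second printed value carries the same 08:01:20 wall-clock time as the "Gluten Answer" column, which is consistent with, but does not prove, a time zone mismatch between the partition value and the reader.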
GlutenFileMetadataStructSuite.metadata struct (parquet): read partial/all metadata struct fields: org/apache/spark/sql/execution/datasources/GlutenFileMetadataStructSuite#L1
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project ['name, 'age, 'info, '_metadata.file_name, '_metadata.file_path, '_metadata.file_size, '_metadata.file_block_start, '_metadata.file_block_length, '_metadata.file_modification_time] +- Relation [name#602751,age#602752,info#602753] parquet == Analyzed Logical Plan == name: string, age: int, info: struct<id:bigint,university:string>, file_name: string, file_path: string, file_size: bigint, file_block_start: bigint, file_block_length: bigint, file_modification_time: timestamp Project [name#602751, age#602752, info#602753, _metadata#602757.file_name AS file_name#602758, _metadata#602757.file_path AS file_path#602759, _metadata#602757.file_size AS file_size#602760L, _metadata#602757.file_block_start AS file_block_start#602761L, _metadata#602757.file_block_length AS file_block_length#602762L, _metadata#602757.file_modification_time AS file_modification_time#602763] +- Relation [name#602751,age#602752,info#602753,_metadata#602757] parquet == Optimized Logical Plan == Project [name#602751, age#602752, info#602753, _metadata#602757.file_name AS file_name#602758, _metadata#602757.file_path AS file_path#602759, _metadata#602757.file_size AS file_size#602760L, _metadata#602757.file_block_start AS file_block_start#602761L, _metadata#602757.file_block_length AS file_block_length#602762L, _metadata#602757.file_modification_time AS file_modification_time#602763] +- Relation [name#602751,age#602752,info#602753,_metadata#602757] parquet == Physical Plan == VeloxColumnarToRow +- ^(36662) 
ProjectExecTransformer [name#602751, age#602752, info#602753, _metadata#602757.file_name AS file_name#602758, _metadata#602757.file_path AS file_path#602759, _metadata#602757.file_size AS file_size#602760L, _metadata#602757.file_block_start AS file_block_start#602761L, _metadata#602757.file_block_length AS file_block_length#602762L, _metadata#602757.file_modification_time AS file_modification_time#602763] +- ^(36662) ProjectExecTransformer [name#602751, age#602752, info#602753, knownnotnull(named_struct(file_path, file_path#602778, file_name, file_name#602779, file_size, file_size#602780L, file_block_start, file_block_start#602781L, file_block_length, file_block_length#602782L, file_modification_time, file_modification_time#602783)) AS _metadata#602757] +- ^(36662) FileScanTransformer parquet [name#602751,age#602752,info#602753,file_path#602778,file_name#602779,file_size#602780L,file_block_start#602781L,file_block_length#602782L,file_modification_time#602783] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex(2 paths)[file:/tmp/spark-d1e0a394-47f9-4794-8d0d-57ce30697eae/data/f1, file:/tm..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<name:string,age:int,info:struct<id:bigint,university:string>> == Results == == Results == !== Correct Answer - 2 == == Spark Answer - 2 == !struct<> struct<name:string,age:int,info:struct<id:bigint,university:string>,file_name:string,file_path:string,file_size:bigint,file_block_start:bigint,file_block_length:bigint,file_modification_time:timestamp> ![jack,24,[12345,uom],part-00000-7664776a-cc9d-4b06-b747-e4e19da6dcc4-c000.snappy.parquet,file:/tmp/spark-d1e0a394-47f9-4794-8d0d-57ce30697eae/data/f0/part-00000-7664776a-cc9d-4b06-b747-e4e19da6dcc4-c000.snappy.parquet,1282,0,1282,2024-11-21 02:06:05.561] 
[jack,24,[12345,uom],part-00000-7664776a-cc9d-4b06-b747-e4e19da6dcc4-c000.snappy.parquet,file:/tmp/spark-d1e0a394-47f9-4794-8d0d-57ce30697eae/data/f0/part-00000-7664776a-cc9d-4b06-b747-e4e19da6dcc4-c000.snappy.parquet,1282,0,1282,2024-11-21 10:06:05.561] ![lily,31,[54321,ucb],part-00000-128f0b81-368a-41af-9ac2-76cf8754a36c-c000.snappy.parquet,file:/tmp/spark-d1e0a394-47f9-4794-8d0d-57ce30697eae/data/f1/part-00000-128f0b81-368a-41af-9ac2-76cf8754a36c-c000.snappy.parquet,1282,0,1282,2024-11-21 02:06:05.629] [lily,31,[54321,ucb],part-00000-128f0b81-368a-41af-9ac2-76cf8754a36c-c000.snappy.parquet,file:/tmp/spark-d1e0a394-47f9-4794-8d0d-57ce30697eae/data/f1/part-00000-128f0b81-368a-41af-9ac2-76cf8754a36c-c000.snappy.parquet,1282,0,1282,2024-11-21 10:06:05.629]
GlutenFileMetadataStructSuite.metadata struct (parquet): select only metadata: org/apache/spark/sql/execution/datasources/GlutenFileMetadataStructSuite#L1
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project ['_metadata.file_name, '_metadata.file_path, '_metadata.file_size, '_metadata.file_block_start, '_metadata.file_block_length, '_metadata.file_modification_time] +- Relation [name#603272,age#603273,info#603274] parquet == Analyzed Logical Plan == file_name: string, file_path: string, file_size: bigint, file_block_start: bigint, file_block_length: bigint, file_modification_time: timestamp Project [_metadata#603278.file_name AS file_name#603279, _metadata#603278.file_path AS file_path#603280, _metadata#603278.file_size AS file_size#603281L, _metadata#603278.file_block_start AS file_block_start#603282L, _metadata#603278.file_block_length AS file_block_length#603283L, _metadata#603278.file_modification_time AS file_modification_time#603284] +- Relation [name#603272,age#603273,info#603274,_metadata#603278] parquet == Optimized Logical Plan == Project [_metadata#603278.file_name AS file_name#603279, _metadata#603278.file_path AS file_path#603280, _metadata#603278.file_size AS file_size#603281L, _metadata#603278.file_block_start AS file_block_start#603282L, _metadata#603278.file_block_length AS file_block_length#603283L, _metadata#603278.file_modification_time AS file_modification_time#603284] +- Relation [name#603272,age#603273,info#603274,_metadata#603278] parquet == Physical Plan == VeloxColumnarToRow +- ^(36682) ProjectExecTransformer [_metadata#603278.file_name AS file_name#603279, _metadata#603278.file_path AS file_path#603280, _metadata#603278.file_size AS file_size#603281L, 
_metadata#603278.file_block_start AS file_block_start#603282L, _metadata#603278.file_block_length AS file_block_length#603283L, _metadata#603278.file_modification_time AS file_modification_time#603284] +- ^(36682) ProjectExecTransformer [knownnotnull(named_struct(file_path, file_path#603295, file_name, file_name#603296, file_size, file_size#603297L, file_block_start, file_block_start#603298L, file_block_length, file_block_length#603299L, file_modification_time, file_modification_time#603300)) AS _metadata#603278] +- ^(36682) FileScanTransformer parquet [file_path#603295,file_name#603296,file_size#603297L,file_block_start#603298L,file_block_length#603299L,file_modification_time#603300] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex(2 paths)[file:/tmp/spark-1b4ec7da-b117-45b1-bc33-195615ab89eb/data/f1, file:/tm..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<> == Results == == Results == !== Correct Answer - 2 == == Spark Answer - 2 == !struct<> struct<file_name:string,file_path:string,file_size:bigint,file_block_start:bigint,file_block_length:bigint,file_modification_time:timestamp> ![part-00000-48fe8e34-a8bc-4844-9201-536ba67db381-c000.snappy.parquet,file:/tmp/spark-1b4ec7da-b117-45b1-bc33-195615ab89eb/data/f0/part-00000-48fe8e34-a8bc-4844-9201-536ba67db381-c000.snappy.parquet,1282,0,1282,2024-11-21 02:06:08.017] [part-00000-48fe8e34-a8bc-4844-9201-536ba67db381-c000.snappy.parquet,file:/tmp/spark-1b4ec7da-b117-45b1-bc33-195615ab89eb/data/f0/part-00000-48fe8e34-a8bc-4844-9201-536ba67db381-c000.snappy.parquet,1282,0,1282,2024-11-21 10:06:08.017] ![part-00000-763819fe-fd15-4b39-833e-6c600098f01a-c000.snappy.parquet,file:/tmp/spark-1b4ec7da-b117-45b1-bc33-195615ab89eb/data/f1/part-00000-763819fe-fd15-4b39-833e-6c600098f01a-c000.snappy.parquet,1282,0,1282,2024-11-21 02:06:08.065] 
[part-00000-763819fe-fd15-4b39-833e-6c600098f01a-c000.snappy.parquet,file:/tmp/spark-1b4ec7da-b117-45b1-bc33-195615ab89eb/data/f1/part-00000-763819fe-fd15-4b39-833e-6c600098f01a-c000.snappy.parquet,1282,0,1282,2024-11-21 10:06:08.065]
GlutenFileMetadataStructSuite.metadata struct (parquet): filter on metadata and user data: org/apache/spark/sql/execution/datasources/GlutenFileMetadataStructSuite#L1
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Filter ('age = 31) +- Project [name#603806, age#603807, info#603808, file_name#603813, file_path#603814, file_size#603815L, file_block_start#603816L, file_block_length#603817L, file_modification_time#603818] +- Filter (_metadata#603812.file_path = file:/tmp/spark-25849b1f-70e9-4404-8300-30a153ab6b66/data/f1/part-00000-1aafa27d-ddd8-47c7-a4bc-c89ed7786044-c000.snappy.parquet) +- Project [name#603806, age#603807, info#603808, file_name#603813, file_path#603814, file_size#603815L, file_block_start#603816L, file_block_length#603817L, file_modification_time#603818, _metadata#603812] +- Filter ((_metadata#603812.file_name = part-00000-1aafa27d-ddd8-47c7-a4bc-c89ed7786044-c000.snappy.parquet) AND (name#603806 = lily)) +- Project [name#603806, age#603807, info#603808, _metadata#603812.file_name AS file_name#603813, _metadata#603812.file_path AS file_path#603814, _metadata#603812.file_size AS file_size#603815L, _metadata#603812.file_block_start AS file_block_start#603816L, _metadata#603812.file_block_length AS file_block_length#603817L, _metadata#603812.file_modification_time AS file_modification_time#603818, _metadata#603812] +- Relation [name#603806,age#603807,info#603808,_metadata#603812] parquet == Analyzed Logical Plan == name: string, age: int, info: struct<id:bigint,university:string>, file_name: string, file_path: string, file_size: bigint, file_block_start: bigint, file_block_length: bigint, file_modification_time: timestamp Filter (age#603807 = 31) +- Project [name#603806, 
age#603807, info#603808, file_name#603813, file_path#603814, file_size#603815L, file_block_start#603816L, file_block_length#603817L, file_modification_time#603818] +- Filter (_metadata#603812.file_path = file:/tmp/spark-25849b1f-70e9-4404-8300-30a153ab6b66/data/f1/part-00000-1aafa27d-ddd8-47c7-a4bc-c89ed7786044-c000.snappy.parquet) +- Project [name#603806, age#603807, info#603808, file_name#603813, file_path#603814, file_size#603815L, file_block_start#603816L, file_block_length#603817L, file_modification_time#603818, _metadata#603812] +- Filter ((_metadata#603812.file_name = part-00000-1aafa27d-ddd8-47c7-a4bc-c89ed7786044-c000.snappy.parquet) AND (name#603806 = lily)) +- Project [name#603806, age#603807, info#603808, _metadata#603812.file_name AS file_name#603813, _metadata#603812.file_path AS file_path#603814, _metadata#603812.file_size AS file_size#603815L, _metadata#603812.file_block_start AS file_block_start#603816L, _metadata#603812.file_block_length AS file_block_length#603817L, _metadata#603812.file_modification_time AS file_modification_time#603818, _metadata#603812] +- Relation [name#603806,age#603807,info#603808,_metadata#603812] parquet == Optimized Logical Plan == Project [name#603806, age#603807, info#603808, _metadata#603812.file_name AS file_name#603813, _metadata#603812.file_path AS file_path#603814, _metadata#603812.file_size AS file_size#603815L, _metadata#603812.file_block_start AS file_block_start#603816L, _metadata#603812.file_block_length AS file_block_length#603817L, _metadata#603812.file_modification_time AS file_modification_time#603818] +- Filter (((((isnotnull(name#603806) AND isnotnull(age#603807)) AND (_metadata#603812.file_name = part-00000-1aafa27d-ddd8-47c7-a4bc-c89ed7786044-c000.snappy.parquet)) AND (name#603806 = lily)) AND (_metadata#603812.file_path = file:/tmp/spark-25849b1f-70e9-4404-8300-30a153ab6b66/data/f1/part-00000-1aafa27d-ddd8-47c7-a4bc-c89ed7786044-c000.snappy.parquet)) AND (age#603807 = 31)) +- Relation 
[name#603806,age#603807,info#603808,_metadata#603812] parquet == Physical Plan == VeloxColumnarToRow +- ^(36706) ProjectExecTransformer [name#603806, age#603807, info#603808, _metadata#603812.file_name AS file_name#603813, _metadata#603812.file_path AS file_path#603814, _metadata#603812.file_size AS file_size#603815L, _metadata#603812.file_block_start AS file_block_start#603816L, _metadata#603812.file_block_length AS file_block_length#603817L, _metadata#603812.file_modification_time AS file_modification_time#603818] +- ^(36706) FilterExecTransformer (((((isnotnull(name#603806) AND isnotnull(age#603807)) AND (_metadata#603812.file_name = part-00000-1aafa27d-ddd8-47c7-a4bc-c89ed7786044-c000.snappy.parquet)) AND (name#603806 = lily)) AND (_metadata#603812.file_path = file:/tmp/spark-25849b1f-70e9-4404-8300-30a153ab6b66/data/f1/part-00000-1aafa27d-ddd8-47c7-a4bc-c89ed7786044-c000.snappy.parquet)) AND (age#603807 = 31)) +- ^(36706) ProjectExecTransformer [name#603806, age#603807, info#603808, knownnotnull(named_struct(file_path, file_path#603835, file_name, file_name#603836, file_size, file_size#603837L, file_block_start, file_block_start#603838L, file_block_length, file_block_length#603839L, file_modification_time, file_modification_time#603840)) AS _metadata#603812] +- ^(36706) FileScanTransformer parquet [name#603806,age#603807,info#603808,file_path#603835,file_name#603836,file_size#603837L,file_block_start#603838L,file_block_length#603839L,file_modification_time#603840] Batched: true, DataFilters: [isnotnull(name#603806), isnotnull(age#603807), (file_name#603836 = part-00000-1aafa27d-ddd8-47c7..., Format: Parquet, Location: InMemoryFileIndex(2 paths)[file:/tmp/spark-25849b1f-70e9-4404-8300-30a153ab6b66/data/f1, file:/tm..., PartitionFilters: [], PushedFilters: [IsNotNull(name), IsNotNull(age), EqualTo(name,lily), EqualTo(age,31)], ReadSchema: struct<name:string,age:int,info:struct<id:bigint,university:string>> == Results == == Results == !== Correct Answer - 1 == == 
Spark Answer - 1 == !struct<> struct<name:string,age:int,info:struct<id:bigint,university:string>,file_name:string,file_path:string,file_size:bigint,file_block_start:bigint,file_block_length:bigint,file_modification_time:timestamp> ![lily,31,[54321,ucb],part-00000-1aafa27d-ddd8-47c7-a4bc-c89ed7786044-c000.snappy.parquet,file:/tmp/spark-25849b1f-70e9-4404-8300-30a153ab6b66/data/f1/part-00000-1aafa27d-ddd8-47c7-a4bc-c89ed7786044-c000.snappy.parquet,1282,0,1282,2024-11-21 02:06:10.173] [lily,31,[54321,ucb],part-00000-1aafa27d-ddd8-47c7-a4bc-c89ed7786044-c000.snappy.parquet,file:/tmp/spark-25849b1f-70e9-4404-8300-30a153ab6b66/data/f1/part-00000-1aafa27d-ddd8-47c7-a4bc-c89ed7786044-c000.snappy.parquet,1282,0,1282,2024-11-21 10:06:10.173]
GlutenFileMetadataStructSuite.metadata struct (parquet): upper/lower case when case sensitive is true: org/apache/spark/sql/execution/datasources/GlutenFileMetadataStructSuite#L1
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project ['name, 'age, '_METADATA, '_metadata] +- Relation [name#603934,age#603935,_METADATA#603936] parquet == Analyzed Logical Plan == name: string, age: int, _METADATA: struct<id:bigint,university:string>, _metadata: struct<file_path:string,file_name:string,file_size:bigint,file_block_start:bigint,file_block_length:bigint,file_modification_time:timestamp,row_index:bigint> Project [name#603934, age#603935, _METADATA#603936, _metadata#603940] +- Relation [name#603934,age#603935,_METADATA#603936,_metadata#603940] parquet == Optimized Logical Plan == Relation [name#603934,age#603935,_METADATA#603936,_metadata#603940] parquet == Physical Plan == VeloxColumnarToRow +- ^(36710) ProjectExecTransformer [name#603934, age#603935, _METADATA#603936, knownnotnull(named_struct(file_path, file_path#603946, file_name, file_name#603947, file_size, file_size#603948L, file_block_start, file_block_start#603949L, file_block_length, file_block_length#603950L, file_modification_time, file_modification_time#603951, row_index, row_index#603952L)) AS _metadata#603940] +- ^(36710) FileScanTransformer parquet [name#603934,age#603935,_METADATA#603936,_tmp_metadata_row_index#603952L,file_path#603946,file_name#603947,file_size#603948L,file_block_start#603949L,file_block_length#603950L,file_modification_time#603951] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex(2 paths)[file:/tmp/spark-bdc25467-35cf-498a-9c9c-618ea620f619/data/f1, file:/tm..., PartitionFilters: [], PushedFilters: [], 
ReadSchema: struct<name:string,age:int,_METADATA:struct<id:bigint,university:string>,_tmp_metadata_row_index:... == Results == == Results == !== Correct Answer - 2 == == Spark Answer - 2 == !struct<> struct<name:string,age:int,_METADATA:struct<id:bigint,university:string>,_metadata:struct<file_path:string,file_name:string,file_size:bigint,file_block_start:bigint,file_block_length:bigint,file_modification_time:timestamp,row_index:bigint>> ![jack,24,[12345,uom],[file:/tmp/spark-bdc25467-35cf-498a-9c9c-618ea620f619/data/f0/part-00000-a9863a4f-953f-48cd-b1d1-9ff77c39d0cd-c000.snappy.parquet,part-00000-a9863a4f-953f-48cd-b1d1-9ff77c39d0cd-c000.snappy.parquet,1302,0,1302,2024-11-21 02:06:10.649,0]] [jack,24,[12345,uom],[file:/tmp/spark-bdc25467-35cf-498a-9c9c-618ea620f619/data/f0/part-00000-a9863a4f-953f-48cd-b1d1-9ff77c39d0cd-c000.snappy.parquet,part-00000-a9863a4f-953f-48cd-b1d1-9ff77c39d0cd-c000.snappy.parquet,1302,0,1302,2024-11-21 10:06:10.649,0]] ![lily,31,[54321,ucb],[file:/tmp/spark-bdc25467-35cf-498a-9c9c-618ea620f619/data/f1/part-00000-809e2c86-f6ab-40be-8e06-0c9a0400f03c-c000.snappy.parquet,part-00000-809e2c86-f6ab-40be-8e06-0c9a0400f03c-c000.snappy.parquet,1302,0,1302,2024-11-21 02:06:10.701,0]] [lily,31,[54321,ucb],[file:/tmp/spark-bdc25467-35cf-498a-9c9c-618ea620f619/data/f1/part-00000-809e2c86-f6ab-40be-8e06-0c9a0400f03c-c000.snappy.parquet,part-00000-809e2c86-f6ab-40be-8e06-0c9a0400f03c-c000.snappy.parquet,1302,0,1302,2024-11-21 10:06:10.701,0]]
GlutenFileMetadataStructSuite.metadata struct (parquet): read metadata with offheap set to true: org/apache/spark/sql/execution/datasources/GlutenFileMetadataStructSuite#L1
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project ['name, 'age, 'info, '_metadata.file_name, '_metadata.file_path, '_metadata.file_size, '_metadata.file_block_start, '_metadata.file_block_length, '_metadata.file_modification_time] +- Relation [name#604172,age#604173,info#604174] parquet == Analyzed Logical Plan == name: string, age: int, info: struct<id:bigint,university:string>, file_name: string, file_path: string, file_size: bigint, file_block_start: bigint, file_block_length: bigint, file_modification_time: timestamp Project [name#604172, age#604173, info#604174, _metadata#604178.file_name AS file_name#604179, _metadata#604178.file_path AS file_path#604180, _metadata#604178.file_size AS file_size#604181L, _metadata#604178.file_block_start AS file_block_start#604182L, _metadata#604178.file_block_length AS file_block_length#604183L, _metadata#604178.file_modification_time AS file_modification_time#604184] +- Relation [name#604172,age#604173,info#604174,_metadata#604178] parquet == Optimized Logical Plan == Project [name#604172, age#604173, info#604174, _metadata#604178.file_name AS file_name#604179, _metadata#604178.file_path AS file_path#604180, _metadata#604178.file_size AS file_size#604181L, _metadata#604178.file_block_start AS file_block_start#604182L, _metadata#604178.file_block_length AS file_block_length#604183L, _metadata#604178.file_modification_time AS file_modification_time#604184] +- Relation [name#604172,age#604173,info#604174,_metadata#604178] parquet == Physical Plan == VeloxColumnarToRow +- ^(36718) 
ProjectExecTransformer [name#604172, age#604173, info#604174, _metadata#604178.file_name AS file_name#604179, _metadata#604178.file_path AS file_path#604180, _metadata#604178.file_size AS file_size#604181L, _metadata#604178.file_block_start AS file_block_start#604182L, _metadata#604178.file_block_length AS file_block_length#604183L, _metadata#604178.file_modification_time AS file_modification_time#604184] +- ^(36718) ProjectExecTransformer [name#604172, age#604173, info#604174, knownnotnull(named_struct(file_path, file_path#604199, file_name, file_name#604200, file_size, file_size#604201L, file_block_start, file_block_start#604202L, file_block_length, file_block_length#604203L, file_modification_time, file_modification_time#604204)) AS _metadata#604178] +- ^(36718) FileScanTransformer parquet [name#604172,age#604173,info#604174,file_path#604199,file_name#604200,file_size#604201L,file_block_start#604202L,file_block_length#604203L,file_modification_time#604204] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex(2 paths)[file:/tmp/spark-8a3970b3-c38e-4e40-924f-b39a28bf72de/data/f1, file:/tm..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<name:string,age:int,info:struct<id:bigint,university:string>> == Results == == Results == !== Correct Answer - 2 == == Spark Answer - 2 == !struct<> struct<name:string,age:int,info:struct<id:bigint,university:string>,file_name:string,file_path:string,file_size:bigint,file_block_start:bigint,file_block_length:bigint,file_modification_time:timestamp> ![jack,24,[12345,uom],part-00000-12fc1559-3f13-45dd-9e83-b829e144a5ff-c000.snappy.parquet,file:/tmp/spark-8a3970b3-c38e-4e40-924f-b39a28bf72de/data/f0/part-00000-12fc1559-3f13-45dd-9e83-b829e144a5ff-c000.snappy.parquet,1282,0,1282,2024-11-21 02:06:11.669] 
[jack,24,[12345,uom],part-00000-12fc1559-3f13-45dd-9e83-b829e144a5ff-c000.snappy.parquet,file:/tmp/spark-8a3970b3-c38e-4e40-924f-b39a28bf72de/data/f0/part-00000-12fc1559-3f13-45dd-9e83-b829e144a5ff-c000.snappy.parquet,1282,0,1282,2024-11-21 10:06:11.669] ![lily,31,[54321,ucb],part-00000-62303a82-574a-4832-955c-407fee4fa2ed-c000.snappy.parquet,file:/tmp/spark-8a3970b3-c38e-4e40-924f-b39a28bf72de/data/f1/part-00000-62303a82-574a-4832-955c-407fee4fa2ed-c000.snappy.parquet,1282,0,1282,2024-11-21 02:06:11.733] [lily,31,[54321,ucb],part-00000-62303a82-574a-4832-955c-407fee4fa2ed-c000.snappy.parquet,file:/tmp/spark-8a3970b3-c38e-4e40-924f-b39a28bf72de/data/f1/part-00000-62303a82-574a-4832-955c-407fee4fa2ed-c000.snappy.parquet,1282,0,1282,2024-11-21 10:06:11.733]
GlutenFileMetadataStructSuite.metadata struct (parquet): read metadata with offheap set to false: org/apache/spark/sql/execution/datasources/GlutenFileMetadataStructSuite#L1
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project ['name, 'age, 'info, '_metadata.file_name, '_metadata.file_path, '_metadata.file_size, '_metadata.file_block_start, '_metadata.file_block_length, '_metadata.file_modification_time] +- Relation [name#604344,age#604345,info#604346] parquet == Analyzed Logical Plan == name: string, age: int, info: struct<id:bigint,university:string>, file_name: string, file_path: string, file_size: bigint, file_block_start: bigint, file_block_length: bigint, file_modification_time: timestamp Project [name#604344, age#604345, info#604346, _metadata#604350.file_name AS file_name#604351, _metadata#604350.file_path AS file_path#604352, _metadata#604350.file_size AS file_size#604353L, _metadata#604350.file_block_start AS file_block_start#604354L, _metadata#604350.file_block_length AS file_block_length#604355L, _metadata#604350.file_modification_time AS file_modification_time#604356] +- Relation [name#604344,age#604345,info#604346,_metadata#604350] parquet == Optimized Logical Plan == Project [name#604344, age#604345, info#604346, _metadata#604350.file_name AS file_name#604351, _metadata#604350.file_path AS file_path#604352, _metadata#604350.file_size AS file_size#604353L, _metadata#604350.file_block_start AS file_block_start#604354L, _metadata#604350.file_block_length AS file_block_length#604355L, _metadata#604350.file_modification_time AS file_modification_time#604356] +- Relation [name#604344,age#604345,info#604346,_metadata#604350] parquet == Physical Plan == VeloxColumnarToRow +- ^(36724) 
ProjectExecTransformer [name#604344, age#604345, info#604346, _metadata#604350.file_name AS file_name#604351, _metadata#604350.file_path AS file_path#604352, _metadata#604350.file_size AS file_size#604353L, _metadata#604350.file_block_start AS file_block_start#604354L, _metadata#604350.file_block_length AS file_block_length#604355L, _metadata#604350.file_modification_time AS file_modification_time#604356] +- ^(36724) ProjectExecTransformer [name#604344, age#604345, info#604346, knownnotnull(named_struct(file_path, file_path#604371, file_name, file_name#604372, file_size, file_size#604373L, file_block_start, file_block_start#604374L, file_block_length, file_block_length#604375L, file_modification_time, file_modification_time#604376)) AS _metadata#604350] +- ^(36724) FileScanTransformer parquet [name#604344,age#604345,info#604346,file_path#604371,file_name#604372,file_size#604373L,file_block_start#604374L,file_block_length#604375L,file_modification_time#604376] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex(2 paths)[file:/tmp/spark-6aceeab4-e7fb-40bc-8f5e-91bfd07119e7/data/f1, file:/tm..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<name:string,age:int,info:struct<id:bigint,university:string>> == Results == == Results == !== Correct Answer - 2 == == Spark Answer - 2 == !struct<> struct<name:string,age:int,info:struct<id:bigint,university:string>,file_name:string,file_path:string,file_size:bigint,file_block_start:bigint,file_block_length:bigint,file_modification_time:timestamp> ![jack,24,[12345,uom],part-00000-6990195d-208b-4ca5-a7bc-295fa719f364-c000.snappy.parquet,file:/tmp/spark-6aceeab4-e7fb-40bc-8f5e-91bfd07119e7/data/f0/part-00000-6990195d-208b-4ca5-a7bc-295fa719f364-c000.snappy.parquet,1282,0,1282,2024-11-21 02:06:12.261] 
[jack,24,[12345,uom],part-00000-6990195d-208b-4ca5-a7bc-295fa719f364-c000.snappy.parquet,file:/tmp/spark-6aceeab4-e7fb-40bc-8f5e-91bfd07119e7/data/f0/part-00000-6990195d-208b-4ca5-a7bc-295fa719f364-c000.snappy.parquet,1282,0,1282,2024-11-21 10:06:12.261] ![lily,31,[54321,ucb],part-00000-ef8a6a58-bc6f-440d-aab2-7e991df6b858-c000.snappy.parquet,file:/tmp/spark-6aceeab4-e7fb-40bc-8f5e-91bfd07119e7/data/f1/part-00000-ef8a6a58-bc6f-440d-aab2-7e991df6b858-c000.snappy.parquet,1282,0,1282,2024-11-21 02:06:12.321] [lily,31,[54321,ucb],part-00000-ef8a6a58-bc6f-440d-aab2-7e991df6b858-c000.snappy.parquet,file:/tmp/spark-6aceeab4-e7fb-40bc-8f5e-91bfd07119e7/data/f1/part-00000-ef8a6a58-bc6f-440d-aab2-7e991df6b858-c000.snappy.parquet,1282,0,1282,2024-11-21 10:06:12.321]
GlutenFileMetadataStructSuite.metadata struct (parquet): write _metadata in parquet and read back: org/apache/spark/sql/execution/datasources/GlutenFileMetadataStructSuite#L1
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [*] +- Relation [name#605025,age#605026,info#605027,_metadata#605028] parquet == Analyzed Logical Plan == name: string, age: int, info: struct<id:bigint,university:string>, _metadata: struct<file_path:string,file_name:string,file_size:bigint,file_block_start:bigint,file_block_length:bigint,file_modification_time:timestamp,row_index:bigint> Project [name#605025, age#605026, info#605027, _metadata#605028] +- Relation [name#605025,age#605026,info#605027,_metadata#605028] parquet == Optimized Logical Plan == Relation [name#605025,age#605026,info#605027,_metadata#605028] parquet == Physical Plan == VeloxColumnarToRow +- ^(36752) FileScanTransformer parquet [name#605025,age#605026,info#605027,_metadata#605028] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex(1 paths)[file:/tmp/spark-10ef5b42-4086-4acd-be75-f5e6dc1251eb/new-data], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<name:string,age:int,info:struct<id:bigint,university:string>,_metadata:struct<file_path:st... 
== Results == == Results == !== Correct Answer - 2 == == Spark Answer - 2 == !struct<> struct<name:string,age:int,info:struct<id:bigint,university:string>,_metadata:struct<file_path:string,file_name:string,file_size:bigint,file_block_start:bigint,file_block_length:bigint,file_modification_time:timestamp,row_index:bigint>> ![jack,24,[12345,uom],[file:/tmp/spark-b4946f50-0c12-4780-98fb-08177c43738c/data/f0/part-00000-d1889ae8-8cb0-4ec1-9fc6-a9e5e49b6fab-c000.snappy.parquet,part-00000-d1889ae8-8cb0-4ec1-9fc6-a9e5e49b6fab-c000.snappy.parquet,1282,0,1282,2024-11-21 02:06:15.077,0]] [jack,24,[12345,uom],[file:/tmp/spark-b4946f50-0c12-4780-98fb-08177c43738c/data/f0/part-00000-d1889ae8-8cb0-4ec1-9fc6-a9e5e49b6fab-c000.snappy.parquet,part-00000-d1889ae8-8cb0-4ec1-9fc6-a9e5e49b6fab-c000.snappy.parquet,1282,0,1282,2024-11-21 10:06:15.077,0]] ![lily,31,[54321,ucb],[file:/tmp/spark-b4946f50-0c12-4780-98fb-08177c43738c/data/f1/part-00000-b2197a68-5155-4e30-b440-569394439ddc-c000.snappy.parquet,part-00000-b2197a68-5155-4e30-b440-569394439ddc-c000.snappy.parquet,1282,0,1282,2024-11-21 02:06:15.121,0]] [lily,31,[54321,ucb],[file:/tmp/spark-b4946f50-0c12-4780-98fb-08177c43738c/data/f1/part-00000-b2197a68-5155-4e30-b440-569394439ddc-c000.snappy.parquet,part-00000-b2197a68-5155-4e30-b440-569394439ddc-c000.snappy.parquet,1282,0,1282,2024-11-21 10:06:15.121,0]]
VeloxIcebergSuite.iceberg partition type - timestamp: org/apache/gluten/execution/VeloxIcebergSuite#L1
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [*] +- 'UnresolvedRelation [part_by_timestamp], [], false == Analyzed Logical Plan == p: timestamp Project [p#36183] +- SubqueryAlias spark_catalog.default.part_by_timestamp +- RelationV2[p#36183] spark_catalog.default.part_by_timestamp spark_catalog.default.part_by_timestamp == Optimized Logical Plan == RelationV2[p#36183] spark_catalog.default.part_by_timestamp == Physical Plan == VeloxColumnarToRow +- ^(1594) IcebergBatchScanTransformer spark_catalog.default.part_by_timestamp[p#36183] spark_catalog.default.part_by_timestamp (branch=null) [filters=, groupedBy=] RuntimeFilters: [] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == struct<> struct<> ![2022-01-01 00:01:20.0] [2022-01-01 08:01:20.0]
VeloxIcebergSuite.iceberg partition type - timestamp: org/apache/gluten/execution/VeloxIcebergSuite#L1
Results do not match for query: Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]] Timezone Env: == Parsed Logical Plan == 'Project [*] +- 'UnresolvedRelation [part_by_timestamp], [], false == Analyzed Logical Plan == p: timestamp Project [p#37746] +- SubqueryAlias spark_catalog.default.part_by_timestamp +- RelationV2[p#37746] spark_catalog.default.part_by_timestamp spark_catalog.default.part_by_timestamp == Optimized Logical Plan == RelationV2[p#37746] spark_catalog.default.part_by_timestamp == Physical Plan == VeloxColumnarToRow +- ^(1775) IcebergBatchScanTransformer spark_catalog.default.part_by_timestamp[p#37746] spark_catalog.default.part_by_timestamp (branch=null) [filters=, groupedBy=] RuntimeFilters: [] == Results == == Results == !== Correct Answer - 1 == == Gluten Answer - 1 == struct<> struct<> ![2022-01-01 00:01:20.0] [2022-01-01 08:01:20.0]
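Note on the mismatches above: every failing row differs by exactly eight hours (00:01:20 vs 08:01:20, 02:06:12.261 vs 10:06:12.261), and the test timezone is America/Los_Angeles, which is at UTC-8 on both dates. One reading, consistent with the PR title but not confirmed by the log itself, is that one side renders the timestamp in the session timezone and the other in UTC. The sketch below only reproduces that arithmetic with `java.time`; the class and method names are hypothetical and are not part of Gluten or Velox.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class TimestampOffsetDemo {
    // Hypothetical helper: interpret a wall-clock timestamp in the given zone
    // and return the wall-clock reading of the same instant in UTC.
    static LocalDateTime toUtcWallClock(LocalDateTime local, ZoneId zone) {
        return local.atZone(zone)
                    .withZoneSameInstant(ZoneOffset.UTC)
                    .toLocalDateTime();
    }

    public static void main(String[] args) {
        // Session timezone reported in the failure messages.
        ZoneId losAngeles = ZoneId.of("America/Los_Angeles");

        // Iceberg partition value from the "Correct Answer" column.
        LocalDateTime partitionValue = LocalDateTime.parse("2022-01-01T00:01:20");
        System.out.println(toUtcWallClock(partitionValue, losAngeles));

        // file_modification_time from the left-hand column of the metadata test.
        LocalDateTime modificationTime = LocalDateTime.parse("2024-11-21T02:06:12.261");
        System.out.println(toUtcWallClock(modificationTime, losAngeles));
    }
}
```

Under that assumption, the two printed values correspond to the right-hand columns of the failing rows (08:01:20 and 10:06:12.261).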