Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GLUTEN-3378][CORE] Move getLocalFilesNode logic to scan transformer #3650

Merged
merged 7 commits into from
Nov 17, 2023

Conversation

liujiayi771
Copy link
Contributor

@liujiayi771 liujiayi771 commented Nov 8, 2023

What changes were proposed in this pull request?

The Lake format has its own custom partition type, with a processing logic that is different from the existing FilePartition and GlutenMergeTreePartition. In the unified design, the Lake format will have its own scan transformer. In order to ensure that the injection of the Lake format does not affect gluten-core, the process of generating LocalFilesNode from handling partitions needs to be moved to the scan transformer, allowing the Lake format's scan transformer to customize the processing logic of the partitions.

Introduce ReadSplit interface for ExtensionTableNode and LocalFileNode, so that we can get rid of using java.io.Serializable.

How was this patch tested?

Exists CI.

Copy link

github-actions bot commented Nov 8, 2023

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/oap-project/gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

Copy link

github-actions bot commented Nov 8, 2023

Run Gluten Clickhouse CI

Copy link

github-actions bot commented Nov 8, 2023

Run Gluten Clickhouse CI

Copy link

github-actions bot commented Nov 8, 2023

Run Gluten Clickhouse CI

Copy link

github-actions bot commented Nov 8, 2023

Run Gluten Clickhouse CI

Copy link

github-actions bot commented Nov 8, 2023

Run Gluten Clickhouse CI

Copy link

github-actions bot commented Nov 9, 2023

Run Gluten Clickhouse CI

if (allScanPartitions.exists(_.size != partitionLength)) {
val allScanLocalFilesNodes = basicScanExecTransformers.map(_.getLocalFilesNodes)
val partitionLength = allScanLocalFilesNodes.head.size
if (allScanLocalFilesNodes.exists(_.size != partitionLength)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe extract a private method to encapsulate this logic that contains getLocalFilesNodes, the assertion of whether the number of partitions is consistent, and the transposition. And doc for this method.

@@ -54,6 +54,11 @@ trait BasicScanExecTransformer extends LeafTransformSupport with SupportFormat {
// TODO: Remove this expensive call when CH support scan custom partition location.
def getInputFilePaths: Seq[String]

def getLocalFilesNodes: Seq[(java.io.Serializable, Array[String])] =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we define a case class for (java.io.Serializable, Array[String])? To make Serializable as return type is so wired.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because CK is using ExtensionTable instead of LocalFilesNode, and their common superclass is Serializable. Now, a new interface called ReadSplit is introduced to represent the abstraction of both classes. Additionally, preferredLocations have been included in the class to avoid using Tuple.


def getFilePartitionLocations(
filePaths: Array[String],
preferredLocations: Array[String]): Array[String] = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you explain this definition a little?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, the input of this method was FilePartition, but the InputPartitions for lake formats are custom-defined. This method primarily processes FilePartition.files.filePath and preferredLocations. Modifying the input type of this method would allow the lake formats to directly reuse this interface.

Copy link

github-actions bot commented Nov 9, 2023

Run Gluten Clickhouse CI

1 similar comment
Copy link

Run Gluten Clickhouse CI

@liujiayi771
Copy link
Contributor Author

@yma11 @zhztheplayer Could you help review these changes?

Copy link

Run Gluten Clickhouse CI

@liujiayi771 liujiayi771 marked this pull request as ready for review November 13, 2023 02:31
@liujiayi771
Copy link
Contributor Author

@zzcclp Could you help review this PR?

@@ -30,12 +30,13 @@
import java.util.List;
import java.util.Map;

public class LocalFilesNode implements Serializable {
public class LocalFilesNode implements ReadSplit, Serializable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have a better name instead of ReadSplit which can show the inheritance relationship here? maybe FilesNode if there is kind of IceBergFilesNode / DeltaFilesNode exist?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have also spent time thinking about the name of this interface. It mainly serves as an abstract interface for ExtensionTableNode and LocalFilesNode. In the future, both Iceberg and Delta will use LocalFilesNode, and there will be no IcebergFilesNode. We think that LocalFiles are split meant for handling by the underlying engine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Then maybe SplitInfo is more suitable in this case and it has corresponding definition like "SplitInfo" or HiveConnectorSplit at velox side.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yma11 I think it makes sense. I will change it to SplitInfo.


private def getReadSplitFromScanTransformer(
basicScanExecTransformers: Seq[BasicScanExecTransformer]): Seq[Seq[ReadSplit]] = {
// If these are two scan transformers, they must have same partitions,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When will there are multi scan transformers? Will they be against same table?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can check this test VeloxTPCHV1BhjSuite, it will trigger multi scan transformer. This situation can occur with a broadcast join.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not thinkg bhj have two scan transformers in one stage, the build side should have a broadcast exchange. Bucket table join is a valid case which contain two scan transformers in one stage.

@yma11
Copy link
Contributor

yma11 commented Nov 14, 2023

@rui-mo can you also take a review on this PR as it has some substrait module related changes?

@rui-mo rui-mo requested a review from zzcclp November 14, 2023 03:18
import java.util.List;

public interface ReadSplit {
List<String> preferredLocations();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add some comments for this class and the member function preferredLocations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add some comments. Do you think the name ReadSplit needs to be modified?

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

@liujiayi771 liujiayi771 changed the title [WIP][CORE] Move getLocalFilesNode logic to scan transformer [CORE] Move getLocalFilesNode logic to scan transformer Nov 16, 2023
@@ -336,6 +330,32 @@ case class WholeStageTransformer(child: SparkPlan, materializeInput: Boolean = f

override protected def withNewChildInternal(newChild: SparkPlan): WholeStageTransformer =
copy(child = newChild, materializeInput = materializeInput)(transformStageId)

private def getReadSplitFromScanTransformer(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@YannByron Add this private function.

@@ -30,12 +30,13 @@
import java.util.List;
import java.util.Map;

public class LocalFilesNode implements Serializable {
public class LocalFilesNode implements ReadSplit, Serializable {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have also spent time thinking about the name of this interface. It mainly serves as an abstract interface for ExtensionTableNode and LocalFilesNode. In the future, both Iceberg and Delta will use LocalFilesNode, and there will be no IcebergFilesNode. We think that LocalFiles are split meant for handling by the underlying engine.


private def getReadSplitFromScanTransformer(
basicScanExecTransformers: Seq[BasicScanExecTransformer]): Seq[Seq[ReadSplit]] = {
// If these are two scan transformers, they must have same partitions,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can check this test VeloxTPCHV1BhjSuite, it will trigger multi scan transformer. This situation can occur with a broadcast join.

import java.util.List;

public interface ReadSplit {
List<String> preferredLocations();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add some comments. Do you think the name ReadSplit needs to be modified?

Copy link

Run Gluten Clickhouse CI

@liujiayi771 liujiayi771 changed the title [CORE] Move getLocalFilesNode logic to scan transformer [GLUTEN-3378][CORE] Move getLocalFilesNode logic to scan transformer Nov 16, 2023
Copy link

#3378

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

GlutenPartition(
index,
substraitPlan.toByteArray,
splitInfos.head.preferredLocations().asScala.toArray)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

splitInfos(i).preferredLocations ?

Copy link
Contributor Author

@liujiayi771 liujiayi771 Nov 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ulysses-you Currently, the GlutenPartition only has one preferredLocations. However, within GlutenPartition, there are multiple SplitInfo instances, and each SplitInfo may have a different preferredLocations. However, the existing code only takes the first preferredLocations.

https://github.com/oap-project/gluten/blob/f21a96f727ab8368d1de0de25101b41c61d58038/backends-velox/src/main/scala/io/glutenproject/backendsapi/velox/IteratorApiImpl.scala#L126

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, it's for one partition but for several scans. I think the currently implementation loses some of scan preferredLocations. Shall we use splitInfos.flatMap(_.preferredLocations).distinct ?

Copy link
Contributor Author

@liujiayi771 liujiayi771 Nov 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Merge all preferredLocations is better. I will fix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional note: distinct is not need as Spark will count the locations to determine which locations are preferred.

partitionColumns,
fileFormat,
preferredLocations.toList.asJava)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: throw exception for default case

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

@yma11 yma11 merged commit 49b8e06 into apache:main Nov 17, 2023
16 checks passed
@liujiayi771 liujiayi771 deleted the scan-part branch November 17, 2023 08:11
@GlutenPerfBot
Copy link
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query log/native_3650_time.csv log/native_master_11_16_2023_510b22841_time.csv difference percentage
q1 34.28 34.35 0.068 100.20%
q2 24.85 24.60 -0.248 99.00%
q3 37.51 37.34 -0.176 99.53%
q4 36.56 37.04 0.475 101.30%
q5 70.17 68.82 -1.345 98.08%
q6 7.02 7.03 0.011 100.16%
q7 84.37 85.40 1.030 101.22%
q8 85.40 85.86 0.462 100.54%
q9 123.05 126.12 3.067 102.49%
q10 44.48 46.82 2.338 105.26%
q11 20.89 19.68 -1.210 94.21%
q12 25.00 24.50 -0.498 98.01%
q13 45.97 46.88 0.919 102.00%
q14 17.37 18.78 1.411 108.12%
q15 28.78 28.03 -0.754 97.38%
q16 15.08 15.56 0.478 103.17%
q17 101.64 99.58 -2.062 97.97%
q18 146.70 148.00 1.300 100.89%
q19 12.96 12.89 -0.071 99.45%
q20 28.64 28.05 -0.589 97.94%
q21 219.86 220.92 1.060 100.48%
q22 12.95 12.73 -0.214 98.35%
total 1223.52 1228.97 5.452 100.45%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants