[CH] Fully Support writing parquet and mergetree in spark 3.5.x with delta protocol #7028

Closed
1 of 5 tasks
Tracked by #6067
baibaichen opened this issue Aug 27, 2024 · 0 comments · Fixed by #7029, #7170, #7234, #7279 or #7395
Labels
enhancement New feature or request

Comments

@baibaichen
Contributor

baibaichen commented Aug 27, 2024

Description

This is an umbrella issue.

Previously, #6705 was just a POC to prove that we can implement Delta Write based on ColumnarWriteFilesExec.

  1. [GLUTEN-7028][CH][Part-1] Using PushingPipelineExecutor to write merge tree #7029
  2. [GLUTEN-7028][CH][Part-2] Refactor: Move MergeTree related UT to mergetree module #7279
  3. [GLUTEN-7028][CH][Part-3] Refactor: Move mergetree related codes to backends-clickhouse #7234
  4. [GLUTEN-7028][CH][Part-4] Refactor DeltaMergeTreeFileFormat to read table configuration from deltalog's metadata #7170
  5. [GLUTEN-7028][CH][Part-5] Refactor: add NativeOutputWriter to unify CHDatasourceJniWrapper #7395
  6. [GLUTEN-7028][CH][Part-6] Introduce MergeTreeDelayedCommitProtocol #7506
  7. [GLUTEN-7028][CH][Part-7] Support one pipeline write for mergetree #7788
  8. [GLUTEN-7028][CH][Part-8] Support one pipeline write for partition mergetree #7924
  9. [GLUTEN-7028][CH][Part-9] Collecting Delta stats for parquet #7993
  10. [GLUTEN-7028][CH][Part-10] Collecting Delta stats for MergeTree #8029
  11. [GLUTEN-7028][CH][Part-11] Support write parquet files with bucket #8052
  12. [GLUTEN-7028][CH][Part-12] Add Local SortExec for Partition Write in one pipeline mode #8237
  13. [GLUTEN-7028][CH][Part-13] Support partition with escape value #8158

backlog

@baibaichen baibaichen added the enhancement New feature or request label Aug 27, 2024
@baibaichen baibaichen reopened this Sep 6, 2024
baibaichen added a commit that referenced this issue Sep 26, 2024
…ackends-clickhouse (#7234)

This is the second refactoring PR for moving MergeTree-related code to backends-clickhouse:
1. Move JniUtils from gluten-arrow to gluten-core and use config.proto, so we can use ConfigMap to pass configuration between Java and C++.
2. Move ExtensionTableXXX from gluten-substrait to backends-clickhouse.
3. Move `genWriteParameters` to `TransformerApi`, so we can pass MergeTree-related parameters.

(Fixes: \#7028)
@baibaichen baibaichen reopened this Sep 26, 2024
@baibaichen baibaichen reopened this Sep 30, 2024
@baibaichen baibaichen reopened this Oct 9, 2024
@baibaichen baibaichen reopened this Oct 23, 2024
@baibaichen baibaichen reopened this Nov 10, 2024
sharkdtu pushed a commit to sharkdtu/gluten that referenced this issue Nov 11, 2024
…ackends-clickhouse (apache#7234)

This is the second refactoring PR for moving MergeTree-related code to backends-clickhouse:
1. Move JniUtils from gluten-arrow to gluten-core and use config.proto, so we can use ConfigMap to pass configuration between Java and C++.
2. Move ExtensionTableXXX from gluten-substrait to backends-clickhouse.
3. Move `genWriteParameters` to `TransformerApi`, so we can pass MergeTree-related parameters.

(Fixes: \apache#7028)
@baibaichen baibaichen reopened this Nov 13, 2024
@baibaichen baibaichen reopened this Nov 21, 2024