[GLUTEN-5414] [VL] Support arrow csv option and schema #5850
Run Gluten Clickhouse CI
Can you help review this PR and rerun the failed test? Thanks! @zhztheplayer
Just some nits. Thanks!
.github/workflows/velox_docker.yml
Outdated
key: cache-velox-build-${{ hashFiles('./cache-key') }}
- name: Build Gluten Velox third party
if: ${{ steps.cache.outputs.cache-hit != 'true' }}
run: |
source dev/ci-velox-buildstatic.sh
- uses: actions/upload-artifact@v2
- name: 'Upload Artifact Native'
nit: Why quote the name? We don't quote the other names.
path: |
./cpp/build/releases/
~/.m2/repository/org/apache/arrow/
Should we modify velox_docker_cache.yml as well? Did you check that file already?
Name;Language
Juno;Java
Peter;Python
Celin;C++
nit: The containing folder could be renamed from resources/datasource/csv to resource/arrow-datasource/csv or something similar.
The CSV file is not specific to the Arrow datasource; it is only used to test Arrow.
It is just a normal CSV file, so I placed it here.
Can you help merge this one? Thanks! @zhztheplayer
This reverts commit be7710e.
Thanks!
===== Performance report for TPCH SF2000 with Velox backend, for reference only ====
mvn clean install -am \
-DskipTests -Drat.skip -Dmaven.gitcommitid.skip -Dcheckstyle.skip
Do we really need to compile the whole project? It's a time-consuming command...
I wonder which dependent module cannot be compiled with the -am option?
The arrow-bom module includes many modules, so we need to compile the whole project.
Support basic options for now; more options will be supported after the Arrow patch is merged:
apache/arrow#41646
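As a rough illustration of what a basic CSV option such as the delimiter changes, here is a minimal Python sketch built on the standard csv module. It is not Gluten or Arrow code; the function name read_csv_rows and its parameters are hypothetical, and the sample text mirrors the semicolon-separated test file added in this PR.

```python
import csv
import io

# Sample data mirroring the semicolon-separated test file in this PR.
SAMPLE = "Name;Language\nJuno;Java\nPeter;Python\nCelin;C++\n"


def read_csv_rows(text, delimiter=",", header=True):
    """Parse CSV text into (column_names, rows).

    Hypothetical helper: `delimiter` and `header` stand in for the kind of
    basic options a CSV reader has to honor.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    rows = [row for row in reader if row]  # skip blank lines
    if not rows:
        return [], []
    if header:
        return rows[0], rows[1:]
    # Without a header, synthesize positional names (_c0, _c1, ...).
    names = [f"_c{i}" for i in range(len(rows[0]))]
    return names, rows


columns, rows = read_csv_rows(SAMPLE, delimiter=";")
# With the default "," delimiter the same text parses as a single column.
wrong_columns, _ = read_csv_rows(SAMPLE)
```

With the delimiter option ignored, each line of the file is read as one column, which is why the option has to be passed through to the underlying reader.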
Before this patch, the CSV read fell back if the required schema differed from the file schema.
Columns are now resolved by their index in the file rather than by checking the file column name, to account for case sensitivity.
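The idea of resolving required columns to positions in the file can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the PR's actual implementation: resolve_column_indices, project, and the case_sensitive flag are hypothetical names, and the data again mirrors the test CSV.

```python
def resolve_column_indices(file_columns, required_columns, case_sensitive=False):
    """Return, for each required column, its index in the file's column list.

    Hypothetical helper: when case_sensitive is False, names are compared
    after lower-casing. Raises KeyError if a required column is not found.
    """
    def norm(name):
        return name if case_sensitive else name.lower()

    index_by_name = {}
    for i, name in enumerate(file_columns):
        # Keep the first occurrence if two file columns normalize to the same key.
        index_by_name.setdefault(norm(name), i)

    indices = []
    for name in required_columns:
        key = norm(name)
        if key not in index_by_name:
            raise KeyError(name)
        indices.append(index_by_name[key])
    return indices


def project(rows, indices):
    """Select and reorder the fields of each row by file column index."""
    return [[row[i] for i in indices] for row in rows]


file_columns = ["Name", "Language"]
rows = [["Juno", "Java"], ["Peter", "Python"], ["Celin", "C++"]]

# The required schema asks for one column, spelled with different casing.
indices = resolve_column_indices(file_columns, ["language"])
projected = project(rows, indices)
```

Once the indices are known, the read no longer depends on how the column names are spelled, so a required schema that is a subset or reordering of the file schema can be served without a fallback.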
Add a new common test function for cases where the rule applies to the logical plan.
Compile Arrow with version 15.0.0-gluten, and upgrade the arrow-dataset and arrow-c-data versions from 15.0.0 to 15.0.0-gluten.