[VL] Upgrade simdjson to 3.9.3 in vcpkg build #5938

Merged on Jun 3, 2024 (3 commits)

PHILO-HE

What changes were proposed in this pull request?

See velox commit:
facebookincubator/velox@f9ae45a

How was this patch tested?

CI build.

Thanks for opening a pull request!

Could you open an issue for this pull request on GitHub Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename the commit message and pull request title to the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

PHILO-HE force-pushed the upgrade-simdjson branch from 17ef4f4 to deffb7b on May 31, 2024 06:05
zhouyuan previously approved these changes May 31, 2024

@zhouyuan zhouyuan left a comment

👍

PHILO-HE commented May 31, 2024

I just created a Velox PR, facebookincubator/velox#9997, to fix the issue below: the vcpkg-installed simdjson cannot be used by Velox, so Velox still builds it from source:

-- Setting simdjson source to AUTO
2024-05-31T08:52:10.0849360Z CMake Warning at /__w/incubator-gluten/incubator-gluten/dev/vcpkg/.vcpkg/scripts/buildsystems/vcpkg.cmake:859 (_find_package):
2024-05-31T08:52:10.0850312Z   Could not find a configuration file for package "simdjson" that is
2024-05-31T08:52:10.0851050Z   compatible with requested version "3.8.0".
2024-05-31T08:52:10.0851548Z 
2024-05-31T08:52:10.0851888Z   The following configuration files were considered but not accepted:
2024-05-31T08:52:10.0852654Z 
2024-05-31T08:52:10.0853386Z     /__w/incubator-gluten/incubator-gluten/dev/vcpkg/vcpkg_installed/x64-linux-avx/share/simdjson/simdjson-config.cmake, version: 3.9.3
2024-05-31T08:52:10.0854100Z 
2024-05-31T08:52:10.0854222Z Call Stack (most recent call first):
2024-05-31T08:52:10.0854613Z   CMake/ResolveDependency.cmake:70 (find_package)
2024-05-31T08:52:10.0855048Z   CMakeLists.txt:451 (resolve_dependency)
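
To make the warning above easier to read, here is a minimal Python sketch of the four `COMPATIBILITY` modes that CMake's `write_basic_package_version_file` documents. It is a simplified model, not CMake's implementation, and the function names are hypothetical. Which mode simdjson actually ships is not shown in this thread; the log only tells us that an installed 3.9.3 was rejected for a requested 3.8.0, which is what `SameMinorVersion` or `ExactVersion` would produce.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Parse 'major.minor.patch'; missing components default to 0."""
    parts = [int(p) for p in text.split(".")]
    parts += [0] * (3 - len(parts))
    return (parts[0], parts[1], parts[2])


def is_compatible(installed: str, requested: str, mode: str) -> bool:
    """Simplified model of CMake's package version compatibility modes.

    The mode names match CMake's documented COMPATIBILITY values; the
    logic is an approximation for illustration only.
    """
    inst = parse_version(installed)
    req = parse_version(requested)
    if mode == "AnyNewerVersion":
        return inst >= req
    if mode == "SameMajorVersion":
        return inst[0] == req[0] and inst >= req
    if mode == "SameMinorVersion":
        return inst[:2] == req[:2] and inst >= req
    if mode == "ExactVersion":
        return inst == req
    raise ValueError(f"unknown compatibility mode: {mode}")


# Versions taken from the log: 3.9.3 is installed by vcpkg, 3.8.0 is requested.
for mode in ("AnyNewerVersion", "SameMajorVersion", "SameMinorVersion", "ExactVersion"):
    print(mode, is_compatible("3.9.3", "3.8.0", mode))
```

Under this model, only the two looser modes would accept 3.9.3 for a 3.8.0 request, so bumping the requested version on the consumer side is one way to let the installed package be picked up.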

@FelixYBW

@PHILO-HE Can you take this opportunity to do a perf test of JSON parsing? Masha said there is a huge perf gain. Let's confirm it from Gluten.

PHILO-HE commented Jun 3, 2024

> @PHILO-HE Can you take this opportunity to do a perf test of JSON parsing? Masha said there is a huge perf gain. Let's confirm it from Gluten.

@FelixYBW, I just ran a small benchmark with Spark. Velox with simdjson can bring a 3x perf gain compared with vanilla Spark. Upgrading simdjson to version 3.9.3 brings a 10% perf gain compared with the old simdjson version.

PHILO-HE commented Jun 3, 2024

This PR is ready to merge. Fixing the warning above only requires a code change in Velox.

PHILO-HE merged commit 4dcda6a into apache:main on Jun 3, 2024
38 checks passed
@GlutenPerfBot

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

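| query | log/native_5938_time.csv | log/native_master_06_02_2024_26ff58d3b_time.csv | difference | percentage |
|-------|--------------------------|--------------------------------------------------|------------|------------|
| q1 | 33.91 | 33.06 | -0.849 | 97.50% |
| q2 | 26.49 | 26.68 | 0.192 | 100.73% |
| q3 | 35.18 | 36.70 | 1.528 | 104.34% |
| q4 | 35.15 | 35.50 | 0.357 | 101.02% |
| q5 | 69.89 | 69.83 | -0.066 | 99.91% |
| q6 | 7.19 | 5.99 | -1.200 | 83.30% |
| q7 | 80.10 | 81.70 | 1.606 | 102.00% |
| q8 | 84.56 | 84.47 | -0.088 | 99.90% |
| q9 | 121.20 | 119.12 | -2.077 | 98.29% |
| q10 | 45.14 | 45.39 | 0.246 | 100.55% |
| q11 | 21.88 | 20.18 | -1.697 | 92.25% |
| q12 | 23.64 | 25.30 | 1.668 | 107.06% |
| q13 | 36.92 | 37.41 | 0.484 | 101.31% |
| q14 | 17.32 | 20.07 | 2.752 | 115.89% |
| q15 | 29.54 | 30.21 | 0.668 | 102.26% |
| q16 | 12.98 | 14.14 | 1.166 | 108.98% |
| q17 | 103.73 | 101.99 | -1.740 | 98.32% |
| q18 | 145.00 | 147.16 | 2.160 | 101.49% |
| q19 | 15.78 | 17.14 | 1.360 | 108.62% |
| q20 | 27.51 | 26.85 | -0.665 | 97.58% |
| q21 | 267.87 | 259.27 | -8.600 | 96.79% |
| q22 | 12.03 | 12.39 | 0.354 | 102.94% |
| total | 1253.01 | 1250.57 | -2.440 | 99.81% |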
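
The report does not say how the last two columns are derived. The numbers are consistent with difference = master time minus PR time and percentage = master time divided by PR time, so a value above 100% would mean the PR run was faster. The Python sketch below recomputes a few rows under that assumption. The `compare` helper is hypothetical, the time unit is not stated in the report, and because the published times are rounded, the recomputed values can differ from the report in the last digit.

```python
from typing import Dict, Tuple

# Times copied from the report: (PR #5938 run, master baseline run).
TIMES: Dict[str, Tuple[float, float]] = {
    "q1": (33.91, 33.06),
    "q6": (7.19, 5.99),
    "q14": (17.32, 20.07),
    "total": (1253.01, 1250.57),
}


def compare(pr_time: float, master_time: float) -> Tuple[float, float]:
    """Return (difference, percentage) under the assumed column definitions."""
    difference = master_time - pr_time          # assumed: master minus PR
    percentage = 100.0 * master_time / pr_time  # assumed: master over PR
    return difference, percentage


for query, (pr_time, master_time) in TIMES.items():
    diff, pct = compare(pr_time, master_time)
    print(f"{query:>5}: difference {diff:+.3f}, percentage {pct:.2f}%")
```

Read this way, the total row shows the two runs within about 0.2% of each other, which fits the bot's "for reference only" caveat.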
