[VL] Link lib gluten to arrow's static libraries #6231

Merged: PHILO-HE merged 2 commits into apache:main from use-static-arrow-libs on Jul 2, 2024

Contributor

@PHILO-HE PHILO-HE commented Jun 26, 2024

  1. Stops linking libvelox.so against the Arrow libraries, since that linkage is unnecessary.
  2. With the static Arrow libraries linked into libgluten, libarrow and libparquet no longer need to be shipped in the Gluten JAR.
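
One way to sanity-check the effect described in this list is to confirm that the built library no longer lists Arrow or Parquet as dynamic dependencies. The following is a minimal sketch, not part of this PR: `list_dynamic_arrow_deps` is a hypothetical helper that filters `ldd`-style output, and the library path in the usage note is a placeholder.

```shell
# Hypothetical helper (not from the PR): reads `ldd`-style output on stdin and
# prints any libarrow*/libparquet* shared objects that are still listed as
# dynamic dependencies.
list_dynamic_arrow_deps() {
  # ldd prints lines such as "libarrow.so.1500 => /usr/lib/... (0x...)";
  # the library name is the first whitespace-separated field.
  awk '$1 ~ /^lib(arrow|parquet)[^ ]*\.so/ { print $1 }'
}
```

Usage, assuming a built library at a path of your choosing: `ldd path/to/libgluten.so | list_dynamic_arrow_deps`. If Arrow is linked statically as intended, the command should print nothing.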

Thanks for opening a pull request!

Could you open an issue for this pull request on GitHub Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename the commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

@PHILO-HE PHILO-HE changed the title [VL] Link static libraries of arrow [VL] Link lib gluten and lib velox to arrow's static libraries Jun 26, 2024
@PHILO-HE PHILO-HE force-pushed the use-static-arrow-libs branch from 2c8732f to 5b4dbdf Compare June 27, 2024 05:41
@PHILO-HE PHILO-HE changed the title [VL] Link lib gluten and lib velox to arrow's static libraries [VL] Link lib gluten to arrow's static libraries Jun 28, 2024
@PHILO-HE PHILO-HE force-pushed the use-static-arrow-libs branch 2 times, most recently from 085a295 to 9d6654b Compare June 28, 2024 05:50
@FelixYBW
Contributor

Perf regression observed

@PHILO-HE
Contributor Author

/Benchmark Velox

@PHILO-HE PHILO-HE force-pushed the use-static-arrow-libs branch 2 times, most recently from c0f1b69 to 3dc5daf Compare July 1, 2024 09:33
@PHILO-HE PHILO-HE force-pushed the use-static-arrow-libs branch from 3dc5daf to 72021cb Compare July 2, 2024 02:04
@PHILO-HE
Contributor Author

PHILO-HE commented Jul 2, 2024

Perf regression observed

@FelixYBW, it seems the regression is caused by other code or by other test factors. After rebasing, the AWS TPCH run now takes 1067.17s.

@GlutenPerfBot
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query log/native_6231_time.csv log/native_master_07_01_2024_5d6d214f0_time.csv difference percentage
q1 37.90 34.57 -3.329 91.22%
q2 23.78 23.98 0.204 100.86%
q3 41.34 39.13 -2.209 94.66%
q4 32.72 32.81 0.087 100.26%
q5 71.63 70.64 -0.997 98.61%
q6 9.98 8.16 -1.823 81.74%
q7 81.96 83.92 1.957 102.39%
q8 85.51 85.80 0.283 100.33%
q9 118.33 120.92 2.593 102.19%
q10 44.73 46.13 1.406 103.14%
q11 19.86 21.88 2.019 110.17%
q12 26.51 25.17 -1.348 94.91%
q13 38.53 41.24 2.702 107.01%
q14 18.19 20.05 1.869 110.28%
q15 32.32 33.15 0.821 102.54%
q16 14.35 13.35 -1.001 93.02%
q17 104.86 104.45 -0.415 99.60%
q18 148.35 148.04 -0.310 99.79%
q19 14.90 13.81 -1.093 92.66%
q20 29.51 27.79 -1.716 94.18%
q21 266.60 266.41 -0.194 99.93%
q22 12.63 13.75 1.119 108.86%
total 1274.52 1275.14 0.624 100.05%
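
The last two columns of the report appear to be derived from the two timing columns: "difference" as master time minus PR time, and "percentage" as master time divided by PR time. A minimal sketch under that assumption follows; `compare_times` is a hypothetical helper, not the bot's actual script. Because the timings shown in the table are rounded to two decimals, recomputed values can differ from the report in the last digits.

```shell
# Hypothetical helper (not the bot's script): reads rows of
# "query pr_time master_time" on stdin and appends
#   difference = master_time - pr_time
#   percentage = master_time / pr_time * 100
# Rows with a missing or non-positive PR time are skipped.
compare_times() {
  awk 'NF >= 3 && $2 > 0 {
    printf "%s %.2f %.2f %.3f %.2f%%\n", $1, $2, $3, $3 - $2, $3 / $2 * 100
  }'
}
```

Under this reading, a percentage below 100% means the master run was faster than the PR run for that query, and a percentage above 100% means the PR run was faster.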

@PHILO-HE PHILO-HE merged commit 832b91c into apache:main Jul 2, 2024
40 checks passed
pushd $ARROW_PREFIX/cpp

cmake_install \
-DARROW_PARQUET=ON \
-DARROW_FILESYSTEM=ON \
-DARROW_PROTOBUF_USE_SHARED=OFF \
-DARROW_DEPENDENCY_USE_SHARED=OFF \
-DARROW_DEPENDENCY_SOURCE=BUNDLED \
Contributor

-DARROW_DEPENDENCY_SOURCE=BUNDLED

Is this change really necessary? Some deps may already be installed.

Contributor Author

@Yohahaha, thanks for your comment. If the dependencies are not bundled, we need extra code to resolve the installed dependencies and link them. Since the Arrow libs are generally built once and then installed, for now we can simply use BUNDLED to get a single lib containing all of Arrow's bundled dependencies.
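
As an illustration of what a static, bundled Arrow install is expected to provide, the sketch below checks a library directory for the static archives. This is an assumption-laden example, not code from the PR: `missing_arrow_static_libs` is a hypothetical helper, and the archive names (in particular `libarrow_bundled_dependencies.a`) are what a static Arrow build with bundled dependencies typically produces and may differ by Arrow version.

```shell
# Hypothetical check (not from the PR): print the static archives that are
# missing from an Arrow install's library directory. The archive names are
# assumptions about a static Arrow build with bundled dependencies.
missing_arrow_static_libs() {
  arrow_libdir="$1"
  for arrow_lib in libarrow.a libparquet.a libarrow_bundled_dependencies.a; do
    # Report only the archives that are absent.
    [ -f "$arrow_libdir/$arrow_lib" ] || echo "$arrow_lib"
  done
}
```

Usage: `missing_arrow_static_libs /path/to/arrow/install/lib` (the path is a placeholder). Empty output means all three archives were found.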

Contributor

Considering arrow libs are generally built once then installed

This is important. I hope we can reduce the changes to Arrow...

Contributor Author

@Yohahaha, I will take a look.

@zhztheplayer
Member

zhztheplayer commented Jul 3, 2024

I observed some issues in the dynamic build after this change:

[ 69%] Linking CXX executable BenchmarkCompression
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `LZ4_compress_default'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `ZSTD_initCStream'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `ZSTD_freeCStream'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `LZ4F_compressBound'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `LZ4_compress_HC'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `LZ4F_compressFrame'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `LZ4F_compressFrameBound'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `LZ4F_flush'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `inflateInit2_'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `ZSTD_minCLevel'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `LZ4F_createDecompressionContext'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `LZ4F_freeCompressionContext'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `LZ4F_decompress'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `LZ4F_getErrorName'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `ZSTD_maxCLevel'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `ZSTD_createCStream'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `LZ4F_compressBegin'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `ZSTD_isError'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `LZ4_decompress_safe'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `ZSTD_getErrorName'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `inflate'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `LZ4F_isError'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `LZ4F_resetDecompressionContext'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `LZ4F_createCompressionContext'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `ZSTD_endStream'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `LZ4_compressBound'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `LZ4F_compressionLevel_max'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `ZSTD_compressBound'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `deflateReset'                                                                                                               
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `deflateInit2_'                                                                                                              
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `ZSTD_flushStream'                                                                                                           
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `snappy::RawCompress(char const*, unsigned long, char*, unsigned long*)'                                                     
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `ZSTD_decompress'                                                                                                            
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `ZSTD_createDStream'                                                                                                         
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `LZ4F_compressEnd'                                                                                                           
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `ZSTD_initDStream'                                                                                                           
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `ZSTD_freeDStream'                                                                                                           
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `inflateReset'                                                                                                               
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `ZSTD_compress'                                                                                                              
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `snappy::RawUncompress(char const*, unsigned long, char*)'                                                                   
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `snappy::GetUncompressedLength(char const*, unsigned long, unsigned long*)'                                                  
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `deflate'                                                                                                                    
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `arrow::fs::LocalFileSystem::LocalFileSystem(arrow::io::IOContext const&)'                                                   
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `deflateEnd'                                                                                                                 
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `ZSTD_compressStream'                                                                                                        
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `LZ4F_compressUpdate'                                                                                                        
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `deflateBound'                                                                                                               
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `ZSTD_decompressStream'                                                                                                      
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `inflateEnd'                                                                                                                 
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `arrow::fs::internal::ConcatAbstractPath[abi:cxx11](std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >)'
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `snappy::MaxCompressedLength(unsigned long)'                                                                                 
/usr/bin/ld: ../../releases/libgluten.so: undefined reference to `LZ4F_freeDecompressionContext'                                                                                              
collect2: error: ld returned 1 exit status

The build command used is:

dev/builddeps-veloxbe.sh --build_type=RelWithDebInfo --build_tests=ON --build_benchmarks=ON
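
A long list of undefined references like the one above is easier to read when grouped by the library each symbol most likely belongs to. The following is a minimal sketch, not part of the PR: `summarize_undefined_refs` is a hypothetical helper, and the prefix-to-library mapping is an assumption based on the symbol names in this log. Each reference must sit on a single line, so a wrapped line has to be joined before it is counted.

```shell
# Hypothetical helper (not from the PR): reads GNU ld output on stdin and
# counts undefined references per likely library, based on symbol prefixes.
summarize_undefined_refs() {
  # Extract the symbol between the backtick and the closing quote.
  sed -n "s/.*undefined reference to \`\(.*\)'.*/\1/p" |
    awk '
      /^LZ4/               { n["lz4"]++;    next }
      /^ZSTD_/             { n["zstd"]++;   next }
      /^snappy::/          { n["snappy"]++; next }
      /^arrow::/           { n["arrow"]++;  next }
      /^(inflate|deflate)/ { n["zlib"]++;   next }
                           { n["other"]++ }
      END { for (k in n) print k, n[k] }
    ' | sort
}
```

Usage: save the build output to a file and run `summarize_undefined_refs < build.log` (the file name is a placeholder). In this log the missing symbols fall almost entirely into compression libraries plus a few `arrow::fs` symbols.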

@zhztheplayer
Member

@PHILO-HE Update: if I change to --build_tests=OFF --build_benchmarks=OFF, compilation succeeds.

@PHILO-HE
Contributor Author

PHILO-HE commented Jul 3, 2024

@PHILO-HE Update: if I change to --build_tests=OFF --build_benchmarks=OFF, compilation succeeds.

@zhztheplayer, thanks for your feedback! I'll fix it.

@GlutenPerfBot
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query log/native_6231_time.csv log/native_master_07_02_2024_6b6444e57_time.csv difference percentage
q1 37.80 34.26 -3.541 90.63%
q2 23.76 22.32 -1.442 93.93%
q3 39.84 40.70 0.862 102.16%
q4 32.45 33.11 0.659 102.03%
q5 72.70 69.66 -3.041 95.82%
q6 10.76 7.95 -2.811 73.88%
q7 81.29 83.02 1.727 102.12%
q8 84.73 84.00 -0.729 99.14%
q9 121.25 122.20 0.948 100.78%
q10 44.07 47.32 3.253 107.38%
q11 20.83 20.46 -0.365 98.25%
q12 27.42 27.16 -0.256 99.07%
q13 38.92 39.74 0.818 102.10%
q14 21.31 19.78 -1.532 92.81%
q15 33.50 30.60 -2.902 91.34%
q16 14.21 14.01 -0.193 98.65%
q17 103.90 102.42 -1.475 98.58%
q18 147.90 151.18 3.288 102.22%
q19 13.71 14.80 1.090 107.95%
q20 30.70 31.05 0.353 101.15%
q21 263.40 264.02 0.624 100.24%
q22 12.17 12.38 0.205 101.69%
total 1276.61 1272.16 -4.458 99.65%

@GlutenPerfBot
Contributor

===== Performance report for TPCDS SF2000 with Velox backend, for reference only ====

query log/native_6231_time.csv log/native_master_07_02_2024_6b6444e57_time.csv difference percentage
q1 14.47 14.94 0.467 103.23%
q2 14.74 14.59 -0.154 98.96%
q3 4.20 3.99 -0.211 94.97%
q4 62.46 64.79 2.330 103.73%
q5 6.76 8.99 2.229 132.97%
q6 3.74 2.43 -1.313 64.89%
q7 5.62 4.25 -1.364 75.72%
q8 5.62 4.94 -0.678 87.94%
q9 24.12 18.15 -5.969 75.26%
q10 11.61 10.50 -1.111 90.43%
q11 35.74 37.08 1.334 103.73%
q12 2.62 2.38 -0.237 90.96%
q13 8.48 5.72 -2.751 67.54%
q14a 42.63 43.96 1.327 103.11%
q14b 40.03 39.09 -0.936 97.66%
q15 3.82 2.68 -1.137 70.22%
q16 38.88 41.98 3.096 107.96%
q17 5.81 5.93 0.121 102.09%
q18 6.43 6.27 -0.160 97.51%
q19 2.13 2.30 0.168 107.87%
q20 2.60 1.33 -1.263 51.35%
q21 1.07 1.10 0.030 102.78%
q22 8.24 8.35 0.106 101.29%
q23a 83.42 84.32 0.905 101.08%
q23b 105.04 103.47 -1.573 98.50%
q24a 79.48 78.77 -0.711 99.11%
q24b 73.92 72.76 -1.158 98.43%
q25 6.45 4.39 -2.066 67.98%
q26 3.09 2.96 -0.126 95.94%
q27 3.32 3.41 0.093 102.79%
q28 25.44 21.17 -4.267 83.22%
q29 7.18 7.08 -0.104 98.55%
q30 4.35 4.09 -0.255 94.14%
q31 6.23 6.30 0.070 101.12%
q32 2.24 1.23 -1.012 54.93%
q33 4.74 4.72 -0.021 99.57%
q34 4.22 6.86 2.644 162.69%
q35 7.16 7.65 0.489 106.83%
q36 3.56 3.67 0.116 103.25%
q37 9.35 4.66 -4.690 49.83%
q38 14.99 14.27 -0.728 95.14%
q39a 3.32 3.53 0.210 106.32%
q39b 2.84 2.90 0.057 102.00%
q40 3.58 3.69 0.106 102.96%
q41 0.60 0.70 0.102 117.10%
q42 0.87 1.08 0.210 124.27%
q43 3.71 4.02 0.305 108.22%
q44 8.21 8.66 0.455 105.55%
q45 3.65 8.26 4.617 226.63%
q46 3.44 3.48 0.037 101.06%
q47 14.20 14.36 0.161 101.14%
q48 4.81 4.60 -0.212 95.59%
q49 9.43 9.33 -0.094 99.00%
q50 20.99 22.19 1.199 105.71%
q51 8.65 11.70 3.048 135.22%
q52 1.01 1.09 0.080 107.92%
q53 2.00 2.02 0.021 101.06%
q54 3.30 3.32 0.014 100.42%
q55 1.90 1.16 -0.750 60.64%
q56 4.40 4.58 0.182 104.15%
q57 12.33 8.80 -3.526 71.39%
q58 2.77 2.67 -0.097 96.52%
q59 13.90 13.99 0.088 100.63%
q60 4.92 4.89 -0.037 99.25%
q61 5.49 5.49 -0.005 99.91%
q62 4.21 5.15 0.943 122.43%
q63 2.04 2.21 0.177 108.69%
q64 50.23 51.58 1.345 102.68%
q65 13.82 14.11 0.291 102.10%
q66 4.60 4.75 0.151 103.28%
q67 346.91 349.82 2.913 100.84%
q68 5.44 3.67 -1.774 67.40%
q69 7.45 6.44 -1.014 86.39%
q70 10.32 8.98 -1.334 87.07%
q71 2.51 3.30 0.783 131.18%
q72 185.97 187.54 1.571 100.84%
q73 2.42 2.34 -0.077 96.83%
q74 21.60 21.84 0.241 101.12%
q75 23.23 23.36 0.125 100.54%
q76 12.50 9.49 -3.010 75.93%
q77 2.07 2.15 0.077 103.74%
q78 39.04 38.92 -0.118 99.70%
q79 3.75 3.62 -0.125 96.67%
q80 14.30 11.04 -3.260 77.20%
q81 5.86 5.17 -0.691 88.21%
q82 7.19 6.63 -0.561 92.20%
q83 1.44 1.58 0.144 110.01%
q84 3.13 2.81 -0.316 89.90%
q85 6.90 7.01 0.104 101.51%
q86 3.16 3.38 0.219 106.95%
q87 15.17 12.61 -2.558 83.13%
q88 26.72 25.25 -1.465 94.52%
q89 3.12 3.21 0.093 103.00%
q90 4.33 3.86 -0.471 89.11%
q91 2.75 2.54 -0.207 92.47%
q92 1.33 1.32 -0.013 99.05%
q93 29.16 29.02 -0.140 99.52%
q94 21.28 22.00 0.718 103.37%
q9 82.48 81.02 -1.460 98.23%
q5 3.46 3.83 0.369 110.67%
q96 12.37 12.31 -0.059 99.52%
q97 2.12 2.06 -0.061 97.11%
q98 9.04 11.46 2.412 126.67%
q99 9.04 11.46 2.412 126.67%
total 1929.73 1911.40 -18.334 99.05%

@FelixYBW
Contributor

FelixYBW commented Jul 3, 2024

@PHILO-HE, can you confirm whether BUILD_TYPE=relWithDebInfo is passed to Arrow when you build Arrow? It looks like the main branch doesn't pass it.

@PHILO-HE
Contributor Author

@PHILO-HE, can you confirm whether BUILD_TYPE=relWithDebInfo is passed to Arrow when you build Arrow? It looks like the main branch doesn't pass it.

@FelixYBW, currently we always build Arrow with the Release build type, on the assumption that there is generally no need to align it with the main project's build type. Hongze has helped fix some issues related to Thrift, and now both arrow/velox can use system-installed Arrow libs. If you run into any issues, please let me know.

6 participants