[CH-247] support async reader when reading parquet from hdfs #248

Open: wants to merge 670 commits into base: clickhouse_backend

Commits (670):
8d53407
Merge pull request #32284 from ClickHouse/backport/21.9/32117
kitaisreal Dec 14, 2021
6a734ad
Merge pull request #32712 from ClickHouse/backport/21.9/32359
CurtizJ Dec 14, 2021
3bb99f5
Backport #32755 to 21.9: fix crash fuzzbits with multiply same fixeds…
Dec 15, 2021
324833a
Update Dockerfile
kssenii Dec 15, 2021
7d607a7
add benchmark code
Dec 15, 2021
2466173
Merge remote-tracking branch 'ch/21.9' into backport/21.9/32201
tavplubix Dec 15, 2021
ab49415
Merge pull request #31954 from ClickHouse/backport/21.9/31859
alesapin Dec 15, 2021
750cdda
Merge pull request #32718 from ClickHouse/backport/21.9/27822
Dec 16, 2021
a203597
Merge pull request #32541 from ClickHouse/backport/21.9/32201
tavplubix Dec 16, 2021
a6b6067
Merge pull request #32790 from ClickHouse/backport/21.9/32755
alesapin Dec 16, 2021
99c9b95
fix memory error
Dec 17, 2021
0d74c6e
init code of Q6
Dec 22, 2021
0972cb4
support filter and aggregate
Dec 27, 2021
6ddfb92
fix benchmark error
Dec 27, 2021
643dd25
add agg benchmark
Dec 27, 2021
1cbe87e
support Q6 benchmark
Dec 28, 2021
bdca247
fix ut errors
Dec 28, 2021
387e53b
support type date
Dec 28, 2021
45ebf02
upgrade substrait
Dec 29, 2021
8a9e4a1
print test result
Dec 29, 2021
2aff015
upgrade substrait
Dec 30, 2021
110eb99
fix read rel
Dec 30, 2021
6c04e84
add rpm config
Dec 30, 2021
e97fa5c
add rpm config
Dec 30, 2021
72348ed
add rpm config
Jan 4, 2022
76b6623
add rpm config
Jan 4, 2022
f2ad0e3
remove runtime dependency
Jan 5, 2022
d8365fb
support equal to
Jan 10, 2022
c1c365f
fix ut fail
Jan 10, 2022
7e18c9a
Fix start index
zzcclp Jan 13, 2022
9c989c7
Merge pull request #1 from liuneng1994/fix_start_index
liuneng1994 Jan 14, 2022
ddc9ee3
support duckdb parquet reader
Jan 14, 2022
67b93aa
support date
Jan 14, 2022
553f246
some optimize
Jan 24, 2022
433aa9a
Unify JNI interfaces
zzcclp Jan 26, 2022
d5314e1
Remove plan.txt
zzcclp Jan 26, 2022
33dd0b8
add merge tree test
Jan 27, 2022
bd643e7
add filter benchmark
Jan 28, 2022
1e24593
fix cmake error when one shared
Feb 21, 2022
7d9bf86
add shared config
Feb 21, 2022
af5a333
add merge tree support
Feb 21, 2022
1cb3d10
Merge pull request #2 from liuneng1994/unify_jni_interfaces
liuneng1994 Feb 21, 2022
8aac850
Merge branch 'local-engine-with-substrait' of https://github.com/liun…
Feb 21, 2022
c6ca1b8
Merge branch 'local-engine-with-substrait' into local-engine-with-mer…
Feb 21, 2022
a199571
support mergetree on jni
Feb 22, 2022
e43b06f
add part selection
Feb 22, 2022
f49e9bc
remove jni
Feb 23, 2022
67a0759
change so dependency in rpm
Feb 24, 2022
48fc1a4
remove java
Feb 24, 2022
7680515
fix error on object release
liuneng1994 Feb 25, 2022
2089d4b
chang init order
liuneng1994 Feb 25, 2022
8bf7422
add default constructor
liuneng1994 Feb 25, 2022
a9cae44
remove local server instead of map configuration
liuneng1994 Feb 25, 2022
0024cfb
optimize merge tree load parts
Feb 28, 2022
eb3f68a
add benchmark for executor create
Feb 28, 2022
7d6a5f0
optimize mergetree
liuneng1994 Mar 1, 2022
d7e2686
add logger
liuneng1994 Mar 2, 2022
5ef6435
optimize cmake
Mar 4, 2022
0dc2318
refactor code
Mar 8, 2022
fce9414
change config
liuneng1994 Mar 10, 2022
86f8fd4
add splitter
liuneng1994 Mar 16, 2022
0f01267
Update version to 22.3.2.1
Felixoid Mar 17, 2022
89a6216
Fix bug in push_to_artifactory.py, create a necessary dir
Felixoid Mar 17, 2022
2ac3789
Update version to 22.3.3.2
Felixoid Mar 17, 2022
2c43d76
Backport #35378 to 22.3: Fix possible deadlock in cache
Mar 18, 2022
8f62ade
Backport #35388 to 22.3: Slightly better performance of inserts to `O…
Mar 19, 2022
e80d55c
Merge pull request #35424 from ClickHouse/backport/22.3/35388
CurtizJ Mar 19, 2022
48160b8
Merge pull request #35415 from ClickHouse/backport/22.3/35378
kssenii Mar 19, 2022
09ee073
Backport #35409 to 22.3: Fix race in data type `Object`
Mar 20, 2022
1ca455f
Merge pull request #35441 from ClickHouse/backport/22.3/35409
CurtizJ Mar 20, 2022
475cd9a
add reader
liuneng1994 Mar 23, 2022
26a2636
Backport #35512 to 22.3: Fix crash with enabled `optimize_functions_t…
Mar 23, 2022
90c8cd0
Merge pull request #35542 from ClickHouse/backport/22.3/35512
CurtizJ Mar 23, 2022
f3e2ad1
Backport #35534 to 22.3: Fix cast into IPv4, IPv6 address in IN section
Mar 24, 2022
9a38aef
Merge pull request #35563 from ClickHouse/backport/22.3/35534
kitaisreal Mar 24, 2022
390307d
add reader
liuneng1994 Mar 25, 2022
951cc23
support columnar tpch q6
liuneng1994 Mar 30, 2022
047f335
[GJ-75] Unify function names
zzcclp Mar 21, 2022
9fa9e46
Backport #35755 to 22.3: Fix bug in conversion from custom types to s…
Mar 31, 2022
53091b4
shuffle init
liuneng1994 Apr 1, 2022
871c660
fix project error
liuneng1994 Apr 1, 2022
d478fe7
fix shuffle error
liuneng1994 Apr 1, 2022
ea9bca1
Backport #35770 to 22.3: Fix enable LLVM for JIT compilation in CMake
Apr 1, 2022
0c5d938
Backport #35815 to 22.3: Fix cgroups cores detection
Apr 1, 2022
40b698c
fix release problem
liuneng1994 Apr 2, 2022
a202398
Merge pull request #35783 from ClickHouse/backport/22.3/35755
Avogar Apr 2, 2022
9049d5b
Merge pull request #35856 from ClickHouse/backport/22.3/35815
alexey-milovidov Apr 3, 2022
2ae2121
Backport #35799 to 22.3: Fix extract function parser
Apr 3, 2022
93204d6
Fix path for assets upload
Felixoid Mar 17, 2022
a5c469b
Push LTS packages to both lts and stable repos
Felixoid Mar 17, 2022
c44cc2b
Fix keeper client timeout bug
alesapin Mar 22, 2022
5e6a314
Merge pull request #35855 from ClickHouse/backport/22.3/35770
kitaisreal Apr 4, 2022
925c433
Manual backport of black formatter
Felixoid Apr 4, 2022
80979f2
Add black formatting check
Felixoid Mar 21, 2022
372895f
Adjust check-workflows to exit-on-errors
Felixoid Mar 21, 2022
1b298db
Improve black check: show diff in the output
Felixoid Mar 28, 2022
046e742
Fix version string update, fix #35518
Felixoid Mar 24, 2022
3d43bce
Make GITHUB_RUN_URL variable and use it
Felixoid Mar 24, 2022
7b01fe2
Add build-url label to built docker images
Felixoid Mar 24, 2022
e93a06c
Merge pull request #33664 from ClickHouse/release-steps
alesapin Mar 22, 2022
dec0ec8
Merge pull request #35533 from ClickHouse/simplify_strip
alesapin Mar 25, 2022
5b06fba
Push only to the new CI DB
alesapin Mar 25, 2022
5a5a174
Remove outdated links from CI
alesapin Mar 28, 2022
8cd46fa
Merge pull request #35766 from ClickHouse/resurrect_official_flag
alesapin Mar 30, 2022
f2e41bc
Merge pull request #35774 from ClickHouse/ressurect_build_hash_v2
alesapin Mar 31, 2022
9234a3c
Merge pull request #35308 from ClickHouse/clickhouse-keeper
alesapin Mar 28, 2022
49bafdc
Merge pull request #35211 from ClickHouse/release-docker
Felixoid Apr 1, 2022
899d7f2
Merge pull request #35854 from ClickHouse/docker-master-head
Felixoid Apr 1, 2022
b03e8ca
Backport tests/ci/upload_result_helper.py
Felixoid Apr 4, 2022
b4cd668
Backport #35733 to 22.3: Added settings for insert of invalid IPv6, I…
Apr 4, 2022
c9a1d9c
Backport #35820 to 22.3: Avoid processing per-column TTL multiple times
Apr 4, 2022
697dd21
Merge pull request #35909 from ClickHouse/backport/22.3-release
Felixoid Apr 4, 2022
ad0a62d
Merge pull request #35881 from ClickHouse/backport/22.3/35799
alexey-milovidov Apr 4, 2022
82735cb
Merge pull request #35928 from ClickHouse/backport/22.3/35733
kitaisreal Apr 5, 2022
abb756d
Merge pull request #35938 from ClickHouse/backport/22.3/35820
CurtizJ Apr 5, 2022
025a573
remove input rel and support java iter from local files
liuneng1994 Apr 6, 2022
55da56d
Update version to 22.3.4.44
Felixoid Apr 6, 2022
84020f5
Tiny improvements to git and version helpers
Felixoid Apr 6, 2022
0a43cfe
Improve and fix edge cases for docker_server.py
Felixoid Apr 6, 2022
27ee3a4
add coalesce operator
liuneng1994 Apr 7, 2022
574025a
fix memory double free
liuneng1994 Apr 7, 2022
b228517
Merge remote-tracking branch 'origin/21.9' into local_engine_with_col…
liuneng1994 Apr 7, 2022
507e81a
fix compile error in benchmark
liuneng1994 Apr 7, 2022
131f6a6
fix metrics error
liuneng1994 Apr 7, 2022
ee1b577
Fix action for docker images build
Felixoid Apr 7, 2022
7d6fd3d
A temporary fix for artifactory push before multiple architectures
Felixoid Apr 7, 2022
e8168a1
Add python unit tests to backport workflow
Felixoid Apr 7, 2022
a365ef5
Move version_arg to version_helper, add tests
Felixoid Apr 7, 2022
df57f8e
Merge pull request #36028 from ClickHouse/backport/fix-release-workflow
Felixoid Apr 7, 2022
2cf5ddf
add mergetree data generate tool
liuneng1994 Apr 8, 2022
04f61b8
change jni package name
liuneng1994 Apr 8, 2022
995a6fb
add new benchmark
liuneng1994 Apr 11, 2022
8323fd6
Merge remote-tracking branch 'origin/21.9' into local_engine_with_col…
liuneng1994 Apr 11, 2022
2307ada
fix cmake error for googlebenchmark
liuneng1994 Apr 11, 2022
59ebc57
fix rebase error
liuneng1994 Apr 12, 2022
97c4bc2
change benchmark error column
liuneng1994 Apr 12, 2022
401d6ec
add time print in transform
liuneng1994 Apr 12, 2022
502c26c
fix filter actionDag has useless column
liuneng1994 Apr 12, 2022
a1c4e68
fix parse alias function failed
liuneng1994 Apr 12, 2022
5ed3db3
fix compile error in benchmark
liuneng1994 Apr 7, 2022
f212656
fix metrics error
liuneng1994 Apr 7, 2022
69660e5
add mergetree data generate tool
liuneng1994 Apr 8, 2022
bafd8ff
change jni package name
liuneng1994 Apr 8, 2022
f642f82
add new benchmark
liuneng1994 Apr 11, 2022
95f5b44
change benchmark error column
liuneng1994 Apr 12, 2022
69a445f
add time print in transform
liuneng1994 Apr 12, 2022
b9287b6
fix filter actionDag has useless column
liuneng1994 Apr 12, 2022
0d8fce3
fix parse alias function failed
liuneng1994 Apr 12, 2022
43acb9b
Revert "add time print in transform"
liuneng1994 Apr 13, 2022
8c6bb93
Merge branch '22.3' into local_engine_with_columnar_shuffle_no_rebase
liuneng1994 Apr 13, 2022
c342630
fix rebase error
liuneng1994 Apr 13, 2022
3c99723
fix rebase error
liuneng1994 Apr 13, 2022
2c22878
fix cmake rebase error
liuneng1994 Apr 13, 2022
705001a
fix rebase compile error
liuneng1994 Apr 16, 2022
e4d2372
add decompress benchmakr
liuneng1994 Apr 18, 2022
8153afc
resolve conflicts
liuneng1994 Apr 19, 2022
b0cd49c
Merge pull request #2 from liuneng1994/local_engine_with_columnar_shu…
liuneng1994 Apr 19, 2022
74c6a7e
Revert "add time print in transform"
liuneng1994 Apr 19, 2022
04b4e51
Merge pull request #4 from liuneng1994/revert_time_print
liuneng1994 Apr 19, 2022
ee7d3c8
Support TPCH Q1 (#8)
liuneng1994 Apr 27, 2022
1d9ce86
fix hash partition error (#14)
liuneng1994 Apr 28, 2022
c4cbcd6
[CH-18] Supported substrait cast node (#19)
zzcclp May 9, 2022
e4f0339
add blob and s3 read support (#20)
liuneng1994 May 10, 2022
15db61f
Support join (#25)
liuneng1994 May 25, 2022
bfe7a84
support tpch q14 (#26)
liuneng1994 May 25, 2022
0ce6a3a
fix join use nulls and support post join filter (#27)
liuneng1994 Jun 1, 2022
5c2784b
Fix shuffle column error (#29)
liuneng1994 Jun 6, 2022
a4cf20e
add stacktrace log (#30)
liuneng1994 Jun 6, 2022
f99279e
Support extract and substring function (#31)
zzcclp Jun 6, 2022
4765f0b
add function result cast (#32)
liuneng1994 Jun 7, 2022
fd77999
add native c2r (#33)
liuneng1994 Jun 9, 2022
5def534
Support broadCast join (#34)
liuneng1994 Jun 15, 2022
391b322
a lot of optimization (#35)
liuneng1994 Jun 22, 2022
4f14406
Support new shuffle (#36)
liuneng1994 Jun 30, 2022
fa2eaa0
fix mem leak (#37)
liuneng1994 Jul 1, 2022
d59bc9f
Using NewWeakGlobalRef instead of NewGlobalRef (#38)
zzcclp Jul 12, 2022
c695853
fix join duplicate table error (#39)
liuneng1994 Jul 13, 2022
7abd83d
add context clean when unload lib (#40)
liuneng1994 Jul 14, 2022
f4dfe2d
Optimize clickhouse arrow parquet reader (#41)
liuneng1994 Jul 22, 2022
6aac94d
Support nullable datatype (#51)
liuneng1994 Aug 18, 2022
b898c4b
support const column in c2r (#54)
liuneng1994 Aug 19, 2022
f3d754d
fix columnar shuffle split failed (#57)
liuneng1994 Aug 22, 2022
56c4bfc
upgrade substrait (#59)
liuneng1994 Aug 22, 2022
1385675
add BlockIterator to manage memory between java and cpp (#62)
liuneng1994 Aug 24, 2022
47b50f0
Support expr on broadcast (#64)
liuneng1994 Aug 25, 2022
169c1bb
issue #48 Optimize Arrow Parquet Reader (#61)
Aug 25, 2022
121d05a
fix compile error (#66)
liuneng1994 Aug 25, 2022
c0f737e
issue #48 fix null value case (#90)
Aug 29, 2022
1df1d49
Support expr eval and Support NULL literal (#92)
liuneng1994 Aug 29, 2022
b14d26a
[CH-87] fix min max on date32 (#96)
Aug 30, 2022
9feb3d2
fix aggregate nullable boolean column failed (#99)
liuneng1994 Aug 31, 2022
086a081
support singular_or_list (#101)
liuneng1994 Sep 1, 2022
874ad45
skip empty block when read shuffle data (#102)
liuneng1994 Sep 1, 2022
c635380
remove unused column before filter (#103)
liuneng1994 Sep 2, 2022
4b36351
support select constant (#104)
liuneng1994 Sep 2, 2022
6a40950
[CH-107] Support Native RowToColumnar (#114)
Sep 8, 2022
1643fd1
Fix warnings during compilation (#106)
taiyang-li Sep 14, 2022
db56bf1
add check style (#122)
liuneng1994 Sep 15, 2022
dee2ac6
Multiple processes transfer parquet to mergetree (#110)
zhanglistar Sep 15, 2022
2c0c4cc
catch c++ exceptions and rethrow java exceptions (#126)
lgbo-ustc Sep 16, 2022
2706a4d
fixed : cover more jni interfaces for catching c++ exceptions (#127)
lgbo-ustc Sep 20, 2022
0889631
[CH-129][Followup] Support substrait SingularOrList (#131)
zzcclp Sep 21, 2022
c5ba2bf
Solve conflict symbols of DB::ParquetBlockInputFormat and add some be…
taiyang-li Sep 21, 2022
31a4432
Support functions for clickhouse backend: lower/upper/ltrim/rtrim (#117)
taiyang-li Sep 22, 2022
f5909f2
Support conversion between spark timestamp and ch datetime64 (#119)
taiyang-li Sep 22, 2022
3d1db97
Reduce log output (#134)
liuneng1994 Sep 27, 2022
198ba4e
refactor the file sources (#130)
lgbo-ustc Sep 30, 2022
d6a6cc9
improve the performance of converting row batch to column batch (#136)
lgbo-ustc Sep 30, 2022
1eed312
improve: catch java exception in c++ (#143)
lgbo-ustc Oct 10, 2022
e3b48c1
Support loading setting from config file and improve logging. (#118)
taiyang-li Oct 12, 2022
48322be
revert log level to error (#148)
liuneng1994 Oct 13, 2022
729900c
[CH-123] Support short/byte/binary/decimal/array/map/struct (#128)
taiyang-li Oct 14, 2022
8ba0bed
Revert "[CH-123] Support short/byte/binary/decimal/array/map/struct (…
liuneng1994 Oct 17, 2022
04939a8
support right semi join on substrait (#164)
liuneng1994 Oct 20, 2022
50f3be5
[CH-120] Support memory manager (#168)
liuneng1994 Oct 21, 2022
057f1e9
fixed a bug: coredump caused by transform a row batch with empty requ…
lgbo-ustc Oct 24, 2022
45d115f
[#156]support sort op (#160)
lgbo-ustc Oct 24, 2022
1b53b64
[CH-169] add reserve memory no exception (#173)
liuneng1994 Oct 26, 2022
17bdf88
add micro for jni env (#177)
liuneng1994 Oct 26, 2022
844e7f5
[CH-45]support count(*/count(1) (#175)
lgbo-ustc Oct 31, 2022
ce3bc0b
[CH-180] Support non-HA mode for ClickHouse reading from HDFS (#181)
zzcclp Oct 31, 2022
fc4843e
fix concurrent problem in allocator (#183)
liuneng1994 Nov 2, 2022
34245a0
[CH-170] Implement strings functions between spark and clickhouse: co…
taiyang-li Nov 3, 2022
d2b8933
[CH-123] Support short/byte/binary/decimal/array/map/struct (#163)
taiyang-li Nov 4, 2022
bc28ef0
[CH-187]Support spark math functions (#188)
taiyang-li Nov 11, 2022
1cbbf75
[CH-184] support prewhere (#185)
liuneng1994 Nov 11, 2022
4fda39e
[CH-190] enable tests in GlutenDataFrameAggregateSuite (#192)
Nov 13, 2022
0fc2fae
[CH-197] Fix bug when c2r with const columns (#198)
taiyang-li Nov 18, 2022
af81cb0
close https://github.com/Kyligence/ClickHouse/issues/199 (#200)
taiyang-li Nov 18, 2022
f48541f
change any to string value (#193)
liuneng1994 Nov 18, 2022
60e9258
fixed (#207)
lgbo-ustc Nov 21, 2022
e225f47
[CH-204] Fixed a bug in initializing settings
lgbo-ustc Nov 21, 2022
e5b7449
[CH-186] support `RangePartitioning` (#189)
lgbo-ustc Nov 22, 2022
b67c21b
finish shift left and right (#217)
taiyang-li Nov 30, 2022
294cd16
fix failed ch ut in https://github.com/oap-project/gluten/pull/620 (#…
taiyang-li Dec 1, 2022
97c3f4f
support trim both (#211)
taiyang-li Dec 5, 2022
ad6dc45
finish debug (#209)
taiyang-li Dec 6, 2022
79b4286
[CH-225]Fix decimal bug cased by big-endian encoding in spark row. (#…
taiyang-li Dec 7, 2022
0a7ccba
Improve: reducing the open operation for the same file when reading p…
lgbo-ustc Dec 12, 2022
e571255
[CH-191] Support generate exec (#194)
taiyang-li Dec 13, 2022
dc9b918
Add spark check_overflow function and cast toDecimal32/64/128 (#231)
loneylee Dec 13, 2022
d5af5eb
support map[key] and array[index] operator for gluten (#216)
taiyang-li Dec 15, 2022
0f5b2ac
support orc format files (#214)
lgbo-ustc Dec 19, 2022
a9e77e2
[CH-236] support split/pmod/factorial/rand/ascii/concat_ws function …
taiyang-li Dec 19, 2022
f091941
Follow substrait's naming for bitwise functions (#238)
loneylee Dec 19, 2022
11ecd0e
[CH-219] Support rlike/regexp_replace/regexp_extract/coalesce/DATE_AD…
taiyang-li Dec 20, 2022
5ab0b59
fixed (#243)
lgbo-ustc Dec 26, 2022
de33c20
support window (#235)
lgbo-ustc Dec 26, 2022
3d5b690
[CH-247] support async reader when reading parquet from hdfs
binmahone Dec 27, 2022
c73ccc2
not null
liuneng1994 Nov 23, 2022
add blob and s3 read support (#20)
liuneng1994 authored May 10, 2022
commit e4f03396649cc8dc7a4333c119046721fedfc60c
1 change: 1 addition & 0 deletions src/Processors/Formats/Impl/ParquetBlockInputFormat.h
@@ -29,6 +29,7 @@ class ParquetBlockInputFormat : public IInputFormat
private:
Chunk generate() override;

+protected:
void prepareReader();

void onCancel() override
6 changes: 6 additions & 0 deletions utils/local-engine/CMakeLists.txt
@@ -22,6 +22,7 @@ include_directories(
${JNI_INCLUDE_DIRS}
${CMAKE_CURRENT_BINARY_DIR}/proto
${ARROW_INCLUDE_DIR}
+${ClickHouse_SOURCE_DIR}/contrib/arrow-cmake/cpp/src
${ClickHouse_SOURCE_DIR}/utils/local-engine
${ClickHouse_SOURCE_DIR}/src
${ClickHouse_SOURCE_DIR}/base
@@ -186,6 +187,11 @@ target_compile_options(_mariadbclient PRIVATE -fPIC)
target_compile_options(_hdfs3 PRIVATE -fPIC)
target_compile_options(_libxml2 PRIVATE -fPIC)
target_compile_options(_gsasl PRIVATE -fPIC)
+target_compile_options(_parquet PRIVATE -fPIC)
+target_compile_options(_arrow PRIVATE -fPIC)
+target_compile_options(_thrift PRIVATE -fPIC)
+target_compile_options(_aws_s3_checksums PRIVATE -fPIC)
+
target_compile_options(absl_str_format_internal PRIVATE -fPIC)
target_compile_options(absl_strings PRIVATE -fPIC)
target_compile_options(absl_raw_logging_internal PRIVATE -fPIC)
34 changes: 34 additions & 0 deletions utils/local-engine/Common/ChunkBuffer.cpp
@@ -0,0 +1,34 @@
#include "ChunkBuffer.h"

namespace local_engine
{
void ChunkBuffer::add(DB::Chunk & columns, int start, int end)
{
if (accumulated_columns.empty())
{
auto num_cols = columns.getNumColumns();
accumulated_columns.reserve(num_cols);
for (size_t i = 0; i < num_cols; i++)
{
accumulated_columns.emplace_back(columns.getColumns()[i]->cloneEmpty());
}
}

for (size_t i = 0; i < columns.getNumColumns(); ++i)
accumulated_columns[i]->insertRangeFrom(*columns.getColumns()[i], start, end - start);
}
size_t ChunkBuffer::size() const
{
if (accumulated_columns.empty())
return 0;
return accumulated_columns.at(0)->size();
}
DB::Chunk ChunkBuffer::releaseColumns()
{
auto rows = size();
DB::Columns res(std::make_move_iterator(accumulated_columns.begin()), std::make_move_iterator(accumulated_columns.end()));
accumulated_columns.clear();
return DB::Chunk(res, rows);
}

}
17 changes: 17 additions & 0 deletions utils/local-engine/Common/ChunkBuffer.h
@@ -0,0 +1,17 @@
#pragma once
#include <Processors/Chunk.h>

namespace local_engine
{
class ChunkBuffer
{
public:
void add(DB::Chunk & columns, int start, int end);
size_t size() const;
DB::Chunk releaseColumns();

private:
DB::MutableColumns accumulated_columns;
};

}
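
For readers following the diff, a standalone sketch of the accumulate-then-release pattern that ChunkBuffer appears to implement may help. This is not code from the PR: `SimpleChunkBuffer`, `Column`, and `Columns` are hypothetical stand-ins that use `std::vector<int>` in place of `DB::MutableColumns` and `DB::Chunk`, so the sketch compiles without ClickHouse headers. The bounds checks are also an addition for illustration; the code in the diff does not show any.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-ins for ClickHouse column types: one column is a vector of ints.
using Column = std::vector<int>;
using Columns = std::vector<Column>;

class SimpleChunkBuffer
{
public:
    // Append rows [start, end) of every column in `chunk` to the buffer.
    void add(const Columns & chunk, std::size_t start, std::size_t end)
    {
        if (start > end)
            throw std::out_of_range("start must not exceed end");

        // On first use, create one empty accumulator per input column
        // (the diff does this with cloneEmpty()).
        if (accumulated.empty())
            accumulated.resize(chunk.size());

        if (chunk.size() != accumulated.size())
            throw std::invalid_argument("column count mismatch");

        // Validate all columns before mutating anything.
        for (const auto & col : chunk)
            if (end > col.size())
                throw std::out_of_range("row range exceeds column size");

        for (std::size_t i = 0; i < chunk.size(); ++i)
        {
            auto first = chunk[i].begin() + static_cast<std::ptrdiff_t>(start);
            auto last = chunk[i].begin() + static_cast<std::ptrdiff_t>(end);
            accumulated[i].insert(accumulated[i].end(), first, last);
        }
    }

    // Number of buffered rows, taken from the first column.
    std::size_t size() const
    {
        return accumulated.empty() ? 0 : accumulated.front().size();
    }

    // Hand the buffered columns to the caller and leave the buffer empty.
    Columns release()
    {
        Columns res = std::move(accumulated);
        accumulated.clear();
        return res;
    }

private:
    Columns accumulated;
};
```

As in the diff, the third argument of `add` is an exclusive end index rather than a length, which is why the original passes `end - start` to `insertRangeFrom`.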
114 changes: 114 additions & 0 deletions utils/local-engine/Common/DebugUtils.cpp
@@ -0,0 +1,114 @@
#include "DebugUtils.h"
#include <DataTypes/DataTypeDate.h>
#include <Formats/FormatSettings.h>
#include <Functions/FunctionHelpers.h>
#include <IO/WriteBufferFromString.h>

namespace debug
{

void headBlock(const DB::Block & block, size_t count)
{
std::cerr << "============Block============" << std::endl;
// print header
for (auto name : block.getNames())
{
std::cerr << name << "\t";
}
std::cerr << std::endl;
// print rows
for (size_t row = 0; row < std::min(count, block.rows()); ++row)
{
for (size_t column = 0; column < block.columns(); ++column)
{
const auto type = block.getByPosition(column).type;
auto col = block.getByPosition(column).column;
DB::WhichDataType which(type);
if (which.isUInt())
{
auto value = col->getUInt(row);
std::cerr << std::to_string(value) << "\t";
}
else if (which.isString())
{
auto value = DB::checkAndGetColumn<DB::ColumnString>(*col)->getDataAt(row).toString();
std::cerr << value << "\t";
}
else if (which.isInt())
{
auto value = col->getInt(row);
std::cerr << std::to_string(value) << "\t";
}
else if (which.isFloat32())
{
auto value = col->getFloat32(row);
std::cerr << std::to_string(value) << "\t";
}
else if (which.isFloat64())
{
auto value = col->getFloat64(row);
std::cerr << std::to_string(value) << "\t";
}
else if (which.isDate())
{
auto * date_type = DB::checkAndGetDataType<DB::DataTypeDate>(type.get());
String date_string;
DB::WriteBufferFromString wb(date_string);
date_type->getSerialization(DB::ISerialization::Kind::DEFAULT)->serializeText(*col, row, wb, {});
std::cerr << date_string.substr(0, 10) << "\t";
}
else
{
std::cerr << "N/A"
<< "\t";
}
}
std::cerr << std::endl;
}
}

void headColumn(const DB::ColumnPtr column, size_t count)
{
std::cerr << "============Column============" << std::endl;
// print header

std::cerr << column->getName() << "\t";
std::cerr << std::endl;
// print rows
for (size_t row = 0; row < std::min(count, column->size()); ++row)
{
auto type = column->getDataType();
auto col = column;
DB::WhichDataType which(type);
if (which.isUInt())
{
auto value = col->getUInt(row);
std::cerr << std::to_string(value) << std::endl;
}
else if (which.isString())
{
auto value = DB::checkAndGetColumn<DB::ColumnString>(*col)->getDataAt(row).toString();
std::cerr << value << std::endl;
}
else if (which.isInt())
{
auto value = col->getInt(row);
std::cerr << std::to_string(value) << std::endl;
}
else if (which.isFloat32())
{
auto value = col->getFloat32(row);
std::cerr << std::to_string(value) << std::endl;
}
else if (which.isFloat64())
{
auto value = col->getFloat64(row);
std::cerr << std::to_string(value) << std::endl;
}
else
{
std::cerr << "N/A" << std::endl;
}
}
}
}
98 changes: 2 additions & 96 deletions utils/local-engine/Common/DebugUtils.h
@@ -6,101 +6,7 @@
namespace debug
{

-void headBlock(const DB::Block & block, size_t count=10)
-{
-std::cerr << "============Block============" << std::endl;
-// print header
-for (auto name : block.getNames())
-{
-std::cerr << name << "\t";
-}
-std::cerr << std::endl;
-// print rows
-for (size_t row = 0; row < std::min(count, block.rows()); ++row)
-{
-for (size_t column = 0; column < block.columns(); ++column)
-{
-auto type = block.getByPosition(column).type;
-auto col = block.getByPosition(column).column;
-DB::WhichDataType which(type);
-if (which.isUInt())
-{
-auto value = DB::checkAndGetColumn<DB::ColumnUInt64>(*col)->getUInt(row);
-std::cerr << std::to_string(value) << "\t";
-}
-else if (which.isString())
-{
-auto value = DB::checkAndGetColumn<DB::ColumnString>(*col)->getDataAt(row).toString();
-std::cerr << value << "\t";
-}
-else if (which.isInt())
-{
-auto value = col->getInt(row);
-std::cerr << std::to_string(value) << "\t";
-}
-else if (which.isFloat32())
-{
-auto value = col->getFloat32(row);
-std::cerr << std::to_string(value) << "\t";
-}
-else if (which.isFloat64())
-{
-auto value = col->getFloat64(row);
-std::cerr << std::to_string(value) << "\t";
-}
-else
-{
-std::cerr << "N/A"
-<< "\t";
-}
-}
-std::cerr << std::endl;
-}
-}
+void headBlock(const DB::Block & block, size_t count=10);

-void headColumn(const DB::ColumnPtr column, size_t count=10)
-{
-std::cerr << "============Column============" << std::endl;
-// print header
-
-std::cerr << column->getName() << "\t";
-std::cerr << std::endl;
-// print rows
-for (size_t row = 0; row < std::min(count, column->size()); ++row)
-{
-auto type = column->getDataType();
-auto col = column;
-DB::WhichDataType which(type);
-if (which.isUInt())
-{
-auto value = DB::checkAndGetColumn<DB::ColumnUInt64>(*col)->getUInt(row);
-std::cerr << std::to_string(value) << std::endl;
-}
-else if (which.isString())
-{
-auto value = DB::checkAndGetColumn<DB::ColumnString>(*col)->getDataAt(row).toString();
-std::cerr << value << std::endl;
-}
-else if (which.isInt())
-{
-auto value = col->getInt(row);
-std::cerr << std::to_string(value) << std::endl;
-}
-else if (which.isFloat32())
-{
-auto value = col->getFloat32(row);
-std::cerr << std::to_string(value) << std::endl;
-}
-else if (which.isFloat64())
-{
-auto value = col->getFloat64(row);
-std::cerr << std::to_string(value) << std::endl;
-}
-else
-{
-std::cerr << "N/A"
-<< std::endl;
-}
-}
-}
+void headColumn(const DB::ColumnPtr column, size_t count=10);
}
2 changes: 1 addition & 1 deletion utils/local-engine/Common/MergeTreeTool.cpp
@@ -42,7 +42,7 @@ std::unique_ptr<SelectQueryInfo> buildQueryInfo(NamesAndTypesList& names_and_typ
}


-MergeTreeTable parseMergeTreeTable(std::string & info)
+MergeTreeTable parseMergeTreeTableString(std::string & info)
{
ReadBufferFromString in(info);
assertString("MergeTree;", in);
2 changes: 1 addition & 1 deletion utils/local-engine/Common/MergeTreeTool.h
@@ -34,5 +34,5 @@ namespace local_engine
std::string toString() const;
};

-MergeTreeTable parseMergeTreeTable(std::string & info);
+MergeTreeTable parseMergeTreeTableString(std::string & info);
}