Is your feature request related to a problem or challenge? Please describe what you are trying to do.
The test is as follows (4 cores, 16 GB, macOS):
select count(1) from (select user_id from event group by user_id)a
The table has 350 million rows in total and about 5 million distinct user_id values. The entire query takes 15s. Profiling with pprof shows that about 60% of the time is spent destructing the GroupState.
Describe the solution you'd like
Use bumpalo to allocate the GroupState values in large chunks of memory where possible, and then release each chunk all at once. Would destruction be much faster this way?
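To make the idea concrete, here is a minimal std-only sketch of chunked allocation. It does not use bumpalo itself, and `CountState` and `StateArena` are hypothetical names for illustration, not DataFusion's actual GroupState. The point is only that states stored in a few large chunks are freed with a few deallocations rather than one per group.

```rust
/// Hypothetical stand-in for a per-group aggregation state
/// (an assumption for illustration, not DataFusion's GroupState).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct CountState {
    pub user_id: u64,
    pub count: u64,
}

/// A tiny chunked arena: states live in a few large chunks instead of
/// one heap allocation per group.
pub struct StateArena {
    chunks: Vec<Vec<CountState>>,
    chunk_capacity: usize,
}

impl StateArena {
    pub fn new(chunk_capacity: usize) -> Self {
        assert!(chunk_capacity > 0, "chunk capacity must be non-zero");
        Self { chunks: Vec::new(), chunk_capacity }
    }

    /// Stores a state and returns a stable handle: (chunk index, slot index).
    pub fn alloc(&mut self, state: CountState) -> (usize, usize) {
        // Start a new chunk when there is none yet or the last one is full.
        let need_new = match self.chunks.last() {
            Some(chunk) => chunk.len() == self.chunk_capacity,
            None => true,
        };
        if need_new {
            self.chunks.push(Vec::with_capacity(self.chunk_capacity));
        }
        let chunk_idx = self.chunks.len() - 1;
        let chunk = &mut self.chunks[chunk_idx];
        chunk.push(state);
        (chunk_idx, chunk.len() - 1)
    }

    /// Looks up a previously allocated state by its handle.
    pub fn get_mut(&mut self, handle: (usize, usize)) -> &mut CountState {
        &mut self.chunks[handle.0][handle.1]
    }

    /// Number of heap chunks that will be freed when the arena is dropped.
    pub fn chunk_count(&self) -> usize {
        self.chunks.len()
    }

    /// Total number of states stored across all chunks.
    pub fn len(&self) -> usize {
        self.chunks.iter().map(Vec::len).sum()
    }
}
```

With a chunk capacity of 4, ten groups end up in three chunks, so dropping the arena releases three state allocations (plus the outer vector) rather than ten. Whether this recovers the time seen in the profile would need to be measured.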
Thanks for the analysis @ic4y! I am quite surprised that the fragmentation caused by the row-oriented structure of Accumulators costs us so much more at deallocation time than when computing the actual aggregates.
I guess the arena strategy that you are suggesting should work, though I don't know bumpalo specifically.
It is worth referencing your discussion about making accumulators column based in #956, which I believe should also solve this issue.
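As a rough illustration of what column-based accumulator state could look like for a count, here is a std-only sketch. `ColumnarCounts` and its methods are hypothetical names, not the design discussed in #956: each group gets an index, and all counts sit in one `Vec<u64>`, so teardown frees a handful of buffers instead of one object per group.

```rust
use std::collections::HashMap;

/// Hypothetical column-oriented count state: one slot per group in
/// contiguous vectors, rather than one heap object per group.
pub struct ColumnarCounts {
    group_index: HashMap<u64, usize>, // user_id -> position in the columns
    keys: Vec<u64>,                   // group keys in first-seen order
    counts: Vec<u64>,                 // counts[i] belongs to keys[i]
}

impl ColumnarCounts {
    pub fn new() -> Self {
        Self {
            group_index: HashMap::new(),
            keys: Vec::new(),
            counts: Vec::new(),
        }
    }

    /// Folds a batch of user_id values into the per-group counts.
    pub fn update_batch(&mut self, user_ids: &[u64]) {
        for &id in user_ids {
            let idx = match self.group_index.get(&id) {
                Some(&idx) => idx,
                None => {
                    // First time this key is seen: append a new slot.
                    let idx = self.keys.len();
                    self.group_index.insert(id, idx);
                    self.keys.push(id);
                    self.counts.push(0);
                    idx
                }
            };
            self.counts[idx] += 1;
        }
    }

    /// Number of distinct groups, i.e. what the outer count(1) returns.
    pub fn num_groups(&self) -> usize {
        self.keys.len()
    }

    pub fn keys(&self) -> &[u64] {
        &self.keys
    }

    pub fn counts(&self) -> &[u64] {
        &self.counts
    }
}
```

For the query in this issue, `num_groups()` would correspond to the final result, and dropping the struct releases the hash map plus two vectors regardless of how many groups there are.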