Adding decimal support for min() and max() functions #9005

minhancao · 2024-03-07T23:02:33Z

Delivers #9004

This PR is for adding decimal support for min() and max() functions.

presto-cli output:

presto:tpch> SELECT MIN(Col,2) FROM (VALUES cast(0.82 as decimal(5,4)), cast(2.333 as decimal(5,4)), cast(3.132 as decimal(5,4)), cast(4.344 as decimal(5,4))) AS X(Col);
      _col0       
------------------
 [0.8200, 2.3330] 
(1 row)
presto:tpch> SELECT MIN(Col,3) FROM (VALUES cast(0.82 as decimal(5,4)), cast(2.333 as decimal(5,4)), cast(3.132 as decimal(5,4)), cast(4.344 as decimal(5,4))) AS X(Col);
          _col0           
--------------------------
 [0.8200, 2.3330, 3.1320] 
(1 row)
presto:tpch> SELECT MAX(Col,2) FROM (VALUES cast(0.82 as decimal(5,4)), cast(2.333 as decimal(5,4)), cast(3.132 as decimal(5,4)), cast(4.344 as decimal(5,4))) AS X(Col);
      _col0       
------------------
 [4.3440, 3.1320] 
(1 row)
presto:tpch> SELECT MAX(Col,3) FROM (VALUES cast(0.82 as decimal(5,4)), cast(2.333 as decimal(5,4)), cast(3.132 as decimal(5,4)), cast(4.344 as decimal(5,4))) AS X(Col);
          _col0           
--------------------------
 [4.3440, 3.1320, 2.3330] 
(1 row)

netlify · 2024-03-07T23:02:49Z

✅ Deploy Preview for meta-velox canceled.

Name	Link
🔨 Latest commit	`8abad48`
🔍 Latest deploy log	https://app.netlify.com/sites/meta-velox/deploys/66280a4fae26f100087498f1

karteekmurthys · 2024-03-08T00:31:44Z

@minhancao The linux-adapters job failed. This is likely an alignment issue.

 int32_t accumulatorAlignmentSize() const override {
    return 1;
  }

Please override this method:

 int32_t accumulatorAlignmentSize() const override {
    return 1; --> must return size of int128_t in case of long decimals.
  }

minhancao · 2024-03-08T01:12:37Z

@karteekmurthys I see this piece of code on line 99 of MinMaxAggregates.cpp, would this already have taken care of it?

/// Override 'accumulatorAlignmentSize' for UnscaledLongDecimal values as it
/// uses int128_t type. Some CPUs don't support misaligned access to int128_t
/// type.
template <>
inline int32_t MinMaxAggregate<int128_t>::accumulatorAlignmentSize() const {
  return static_cast<int32_t>(sizeof(int128_t));
}

karteekmurthys · 2024-03-11T17:17:31Z

Also, the accumulator internally uses a heap for sorting. This heap allocator must be an aligned allocator:

template <typename T, typename Compare>
struct MinMaxNAccumulator {
  int64_t n{0};
  std::vector<T, StlAllocator<T>> heapValues;

  explicit MinMaxNAccumulator(HashStringAllocator* allocator)
      : heapValues{StlAllocator<T>(allocator)} {}

  int64_t getN() const {
    return n;
  }

  size_t size() const {
    return heapValues.size();
  }

Refer this:https://github.com/facebookincubator/velox/pull/8723/files#diff-5091b1dd8232c357067be47fb0e751772f91a0aa12dea5db835a2d4ae296a63cR93

minhancao · 2024-03-27T16:03:20Z

Hi @karteekmurthys, I have made the change that you have requested, please review when you can. thanks!

karteekmurthys

LGTM!

aditi-pandit · 2024-04-04T22:50:56Z

@minhancao : Looks good. Thanks for contributing !

karteekmurthys · 2024-04-11T20:59:54Z

@Yuhta would you please review this?

minhancao · 2024-04-22T20:35:38Z

Hi @Yuhta , can you review this PR when you can please? Thanks!

Yuhta

Can you run aggregation fuzzer on the PR for 20 min?

Yuhta · 2024-04-22T20:38:56Z

velox/functions/prestosql/aggregates/MinMaxAggregates.cpp

+              if (inputType->isLongDecimal()) {
+                return std::make_unique<TNumericN<int128_t>>(resultType);
+              }
+              VELOX_NYI();


VELOX_UNREACHABLE()

Yuhta · 2024-04-22T20:39:35Z

velox/functions/prestosql/aggregates/MinMaxAggregates.cpp

+      AlignedStlAllocator<T, sizeof(int128_t)>,
+      StlAllocator<T>>;
+  using Heap = std::vector<T, Allocator>;
+  Heap heapValues;


nit: std::vector<T, Allocator> heapValues;

minhancao · 2024-04-22T23:36:05Z

@Yuhta I have ran locally the velox aggregation fuzzer test:
min() function:
command ran:

_build/debug/velox/functions/prestosql/fuzzer/velox_aggregation_fuzzer_test --seed 123456 --duration_sec 1200 --logtostderr=1 --minloglevel=0 --only min --v=1 2>&1 | tee agg_fuzzer_test_min_20mins.out

output:

I20240422 15:00:08.498713 3874823 AggregationFuzzer.cpp:456] ==============================> Done with iteration 442
I20240422 15:00:08.498742 3874823 AggregationFuzzer.cpp:993] Total masked aggregations: 67 (15.12%)
I20240422 15:00:08.498764 3874823 AggregationFuzzer.cpp:995] Total global aggregations: 33 (7.45%)
I20240422 15:00:08.498781 3874823 AggregationFuzzer.cpp:997] Total group-by aggregations: 325 (73.36%)
I20240422 15:00:08.498798 3874823 AggregationFuzzer.cpp:999] Total distinct aggregations: 46 (10.38%)
I20240422 15:00:08.498817 3874823 AggregationFuzzer.cpp:1001] Total aggregations over distinct inputs: 69 (15.58%)
I20240422 15:00:08.498834 3874823 AggregationFuzzer.cpp:1003] Total window expressions: 39 (8.80%)
I20240422 15:00:08.498852 3874823 AggregationFuzzerBase.cpp:647] Total functions tested: 1
I20240422 15:00:08.498868 3874823 AggregationFuzzerBase.cpp:648] Total iterations requiring sorted inputs: 71 (16.03%)
I20240422 15:00:08.498883 3874823 AggregationFuzzerBase.cpp:650] Total iterations verified against reference DB: 41 (9.26%)
I20240422 15:00:08.498899 3874823 AggregationFuzzerBase.cpp:652] Total functions not verified (verification skipped / not supported by reference DB / reference DB failed): 39 (8.80%) / 209 (47.18%) / 125 (28.22%)
I20240422 15:00:08.498917 3874823 AggregationFuzzerBase.cpp:657] Total failed functions: 32 (7.22%)
[==========] Running 0 tests from 0 test suites.
[==========] 0 tests from 0 test suites ran. (0 ms total)
[  PASSED  ] 0 tests.

max() function:
command ran:

_build/debug/velox/functions/prestosql/fuzzer/velox_aggregation_fuzzer_test --seed 123456 --duration_sec 1200 --logtostderr=1 --minloglevel=0 --only max --v=1 2>&1 | tee agg_fuzzer_test_max_20mins.out

output:

I20240422 15:59:21.033170 3979822 AggregationFuzzer.cpp:456] ==============================> Done with iteration 451
I20240422 15:59:21.033193 3979822 AggregationFuzzer.cpp:993] Total masked aggregations: 68 (15.04%)
I20240422 15:59:21.033223 3979822 AggregationFuzzer.cpp:995] Total global aggregations: 34 (7.52%)
I20240422 15:59:21.033241 3979822 AggregationFuzzer.cpp:997] Total group-by aggregations: 333 (73.67%)
I20240422 15:59:21.033258 3979822 AggregationFuzzer.cpp:999] Total distinct aggregations: 46 (10.18%)
I20240422 15:59:21.033274 3979822 AggregationFuzzer.cpp:1001] Total aggregations over distinct inputs: 70 (15.49%)
I20240422 15:59:21.033290 3979822 AggregationFuzzer.cpp:1003] Total window expressions: 39 (8.63%)
I20240422 15:59:21.033306 3979822 AggregationFuzzerBase.cpp:647] Total functions tested: 1
I20240422 15:59:21.033322 3979822 AggregationFuzzerBase.cpp:648] Total iterations requiring sorted inputs: 74 (16.37%)
I20240422 15:59:21.033339 3979822 AggregationFuzzerBase.cpp:650] Total iterations verified against reference DB: 41 (9.07%)
I20240422 15:59:21.033355 3979822 AggregationFuzzerBase.cpp:652] Total functions not verified (verification skipped / not supported by reference DB / reference DB failed): 39 (8.63%) / 217 (48.01%) / 126 (27.88%)
I20240422 15:59:21.033372 3979822 AggregationFuzzerBase.cpp:657] Total failed functions: 32 (7.08%)
[==========] Running 0 tests from 0 test suites.
[==========] 0 tests from 0 test suites ran. (0 ms total)
[  PASSED  ] 0 tests.

Yuhta · 2024-04-23T14:42:28Z

The branch seems messed up

minhancao · 2024-04-23T19:22:46Z

@Yuhta Fixed the branch

facebook-github-bot · 2024-04-23T20:24:47Z

@Yuhta has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2024-04-25T15:08:57Z

@Yuhta merged this pull request in d97ddb4.

conbench-facebook · 2024-04-25T15:33:05Z

Conbench analyzed the 1 benchmark run on commit d97ddb41.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

…or#9005) Summary: Delivers facebookincubator#9004 This PR is for adding decimal support for min() and max() functions. presto-cli output: ``` presto:tpch> SELECT MIN(Col,2) FROM (VALUES cast(0.82 as decimal(5,4)), cast(2.333 as decimal(5,4)), cast(3.132 as decimal(5,4)), cast(4.344 as decimal(5,4))) AS X(Col); _col0 ------------------ [0.8200, 2.3330] (1 row) presto:tpch> SELECT MIN(Col,3) FROM (VALUES cast(0.82 as decimal(5,4)), cast(2.333 as decimal(5,4)), cast(3.132 as decimal(5,4)), cast(4.344 as decimal(5,4))) AS X(Col); _col0 -------------------------- [0.8200, 2.3330, 3.1320] (1 row) presto:tpch> SELECT MAX(Col,2) FROM (VALUES cast(0.82 as decimal(5,4)), cast(2.333 as decimal(5,4)), cast(3.132 as decimal(5,4)), cast(4.344 as decimal(5,4))) AS X(Col); _col0 ------------------ [4.3440, 3.1320] (1 row) presto:tpch> SELECT MAX(Col,3) FROM (VALUES cast(0.82 as decimal(5,4)), cast(2.333 as decimal(5,4)), cast(3.132 as decimal(5,4)), cast(4.344 as decimal(5,4))) AS X(Col); _col0 -------------------------- [4.3440, 3.1320, 2.3330] (1 row) ``` Pull Request resolved: facebookincubator#9005 Reviewed By: pedroerp Differential Revision: D56487562 Pulled By: Yuhta fbshipit-source-id: 283eb189a91840835784e565cd33fa713167d17d

facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Mar 7, 2024

karteekmurthys self-requested a review April 2, 2024 20:57

karteekmurthys approved these changes Apr 2, 2024

View reviewed changes

karteekmurthys requested review from karteekmurthys, Yuhta and aditi-pandit April 4, 2024 17:08

aditi-pandit approved these changes Apr 4, 2024

View reviewed changes

karteekmurthys approved these changes Apr 11, 2024

View reviewed changes

Yuhta reviewed Apr 22, 2024

View reviewed changes

Minhan Cao and others added 2 commits April 22, 2024 14:19

Adding decimal support for min() and max() functions

4ca50f4

Made heap allocator be an aligned allocator for MinMaxNAccumulator

2a45bdc

Addressed PR comment

8abad48

minhancao force-pushed the add_decimal_support_for_min_max_oss branch from 3918677 to 8abad48 Compare April 23, 2024 19:21

facebook-github-bot closed this in d97ddb4 Apr 25, 2024

facebook-github-bot added the Merged label Apr 25, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adding decimal support for min() and max() functions #9005

Adding decimal support for min() and max() functions #9005

minhancao commented Mar 7, 2024 •

edited

Loading

netlify bot commented Mar 7, 2024 •

edited

Loading

karteekmurthys commented Mar 8, 2024

minhancao commented Mar 8, 2024 •

edited

Loading

karteekmurthys commented Mar 11, 2024

minhancao commented Mar 27, 2024

karteekmurthys left a comment

aditi-pandit commented Apr 4, 2024

karteekmurthys commented Apr 11, 2024

minhancao commented Apr 22, 2024

Yuhta left a comment

Yuhta Apr 22, 2024

Yuhta Apr 22, 2024

minhancao commented Apr 22, 2024

Yuhta commented Apr 23, 2024

minhancao commented Apr 23, 2024

facebook-github-bot commented Apr 23, 2024

facebook-github-bot commented Apr 25, 2024

conbench-facebook bot commented Apr 25, 2024

Adding decimal support for min() and max() functions #9005

Adding decimal support for min() and max() functions #9005

Conversation

minhancao commented Mar 7, 2024 • edited Loading

netlify bot commented Mar 7, 2024 • edited Loading

✅ Deploy Preview for meta-velox canceled.

karteekmurthys commented Mar 8, 2024

minhancao commented Mar 8, 2024 • edited Loading

karteekmurthys commented Mar 11, 2024

minhancao commented Mar 27, 2024

karteekmurthys left a comment

Choose a reason for hiding this comment

aditi-pandit commented Apr 4, 2024

karteekmurthys commented Apr 11, 2024

minhancao commented Apr 22, 2024

Yuhta left a comment

Choose a reason for hiding this comment

Yuhta Apr 22, 2024

Choose a reason for hiding this comment

Yuhta Apr 22, 2024

Choose a reason for hiding this comment

minhancao commented Apr 22, 2024

Yuhta commented Apr 23, 2024

minhancao commented Apr 23, 2024

facebook-github-bot commented Apr 23, 2024

facebook-github-bot commented Apr 25, 2024

conbench-facebook bot commented Apr 25, 2024

minhancao commented Mar 7, 2024 •

edited

Loading

netlify bot commented Mar 7, 2024 •

edited

Loading

minhancao commented Mar 8, 2024 •

edited

Loading