Use DisposableElementsAttr for ZHigh constant propagation #3013

Open
wants to merge 13 commits into main from zhigh-constprop-with-dispose
Conversation

@tungld tungld commented Nov 18, 2024

  • This patch reverts PR #2917 ([NNPA] Memory reduction of stickified constant by stickifying at file writing).
  • This patch extends the constant-propagation mechanism (currently applied to ONNXConstantOp) to ZHighStickifiedConstantOp, so that ZHigh constant propagation benefits from the memory management provided by DisposableElementsAttr. In particular, it:
    • makes all ZHigh optimizations/rewritings work with DisposableElementsAttr;
    • changes ZHighStickifiedConstantOp to accept DisposableElementsAttr; its parser and printer are changed to read/write DenseElementsAttr for lit tests;
    • changes ZHighConstantPropagationPass so that it reads data directly from DisposableElementsAttr instead of DenseElementsAttr;
    • adds two passes, ZHighDisposableGarbageCollector and ZHighScrubDisposablePass, to manage the buffers used by ZHighStickifiedConstantOp (a rough sketch of the buffer-management idea follows below).
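To illustrate the last bullet, here is a minimal, self-contained C++ sketch of the buffer-management idea. It is not the onnx-mlir implementation; BufferPool, BufferId, and garbageCollect are illustrative names only. A pool owns the raw buffers behind disposable constants, the garbage-collection step frees every buffer no longer referenced by a live constant (e.g. the inputs of folded Stick ops), and a final scrub step would then materialize the surviving constants into dense form before serialization.

// Hypothetical sketch, not the onnx-mlir API.
#include <cstdint>
#include <iterator>
#include <memory>
#include <unordered_map>
#include <unordered_set>
#include <vector>

using BufferId = std::uint64_t;
using Buffer = std::shared_ptr<std::vector<char>>;

class BufferPool {
public:
  // Register a buffer backing a stickified constant and return its id.
  BufferId insert(Buffer buf) {
    BufferId id = nextId++;
    buffers[id] = std::move(buf);
    return id;
  }

  // Keep only buffers whose ids are still referenced by constants in the
  // module; release the rest so their memory is freed immediately.
  void garbageCollect(const std::unordered_set<BufferId> &liveIds) {
    for (auto it = buffers.begin(); it != buffers.end();)
      it = liveIds.count(it->first) ? std::next(it) : buffers.erase(it);
  }

private:
  BufferId nextId = 0;
  std::unordered_map<BufferId, Buffer> buffers;
};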

Quick experiment: the peak compile memory consumption of #2917 and of this PR when compiling the gpt2-large model for NNPA (744M parameters; the constant file is 3.2 GB) is quite similar, about 9 GB in both cases.

This patch contains the reverting code, so it is not easy to follow. To ease the review, I merged all new changes (excluding the reverting code) into a single commit: 265ff90. Please look at that commit for the review.

… at file writing (onnx#2917)"

This reverts commit 33b466e.

Signed-off-by: Tung D. Le <[email protected]>
@tungld tungld force-pushed the zhigh-constprop-with-dispose branch from 68979b3 to 48cf039 on November 18, 2024 06:05
@AlexandreEichenberger

@tungld Just to understand the high level, without the class names: you are using Soren's approach of applying "logical" operations to the constants, so that, for example, if we have <large-constant-tensor> * 2 + 1, we just keep the original <large-constant-tensor> and tag the mult and add operators along with the constant; if we need to materialize the multiplied/added large constant tensor, we first apply these operations before generating the constant. And so, you added a stickify (presumably we never need an unstickify) operator?

@tungld tungld commented Nov 18, 2024

You are using Soren's approach of applying "logical" operations to the constants

Yes, I extended it to ZHigh operations, so the same approach is used for both ONNX and ZHigh until lowering to krnl. We could extend it to cover krnl operations as well, but that needs more work, so I did not do it in this PR.
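To make that concrete, below is a tiny, MLIR-free C++ sketch of the "keep the original constant and tag transforms along" idea, with a final stickify step added at the end of the chain. All names here are illustrative; the real mechanism is DisposableElementsAttr plus the ZHigh constant-propagation patterns.

#include <functional>
#include <vector>

using Transform = std::function<std::vector<float>(std::vector<float>)>;

struct LazyConstant {
  std::vector<float> original;       // large buffer, kept untouched
  std::vector<Transform> transforms; // e.g. *2, +1, stickify

  // The recorded transforms run only when the constant must be emitted.
  std::vector<float> materialize() const {
    std::vector<float> data = original;
    for (const Transform &t : transforms)
      data = t(data);
    return data;
  }
};

int main() {
  // <large-constant-tensor> * 2 + 1, recorded but not computed yet.
  LazyConstant c{{1.f, 2.f, 3.f}, {}};
  c.transforms.push_back([](std::vector<float> v) {
    for (float &x : v)
      x *= 2.f;
    return v;
  });
  c.transforms.push_back([](std::vector<float> v) {
    for (float &x : v)
      x += 1.f;
    return v;
  });
  // A final "stickify" step would reorder/convert elements into the NNPA
  // layout; here it is just a placeholder identity.
  c.transforms.push_back([](std::vector<float> v) { return v; });
  std::vector<float> out = c.materialize(); // {3, 5, 7}
  (void)out;
}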


@imaihal imaihal left a comment


LGTM!

Comment on lines +317 to +336
struct ConstantStickPattern : public OpRewritePattern<ZHighStickOp> {
  ConstantStickPattern(MLIRContext *context) : OpRewritePattern(context) {}
  LogicalResult matchAndRewrite(
      ZHighStickOp stickOp, PatternRewriter &rewriter) const override {
    Value input = stickOp.getIn();
    Value output = stickOp.getOut();
    StringAttr layout = stickOp.getLayoutAttr();

    // Match
    if (!isDenseONNXConstant(input)) {
      return failure();
    }

    // Rewrite
    Value stickifiedVal =
        createConstantForStick(rewriter, output, input, layout);
    replaceOpAndGC(rewriter, stickOp, stickifiedVal);
    return success();
  }
};
Collaborator


Just to confirm: you replaced TableGen with C++. Is this because replaceOpAndGC() is difficult to use in the TableGen format?

Collaborator Author


Yes, I don't know how to do that with TableGen. Let me know if you know how to do it. Thanks!
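For reference, a C++ pattern like ConstantStickPattern is typically added to the rewrite-pattern set next to (or instead of) the TableGen-generated patterns, roughly as below. The helper name is illustrative; the exact wiring in onnx-mlir may differ.

#include "mlir/IR/PatternMatch.h"

// Illustrative registration helper, not the actual onnx-mlir entry point.
void addZHighConstPropPatterns(
    mlir::RewritePatternSet &patterns, mlir::MLIRContext *ctx) {
  patterns.insert<ConstantStickPattern>(ctx);
}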

@AlexandreEichenberger

Can you post the improvements you got here, just for future reference? It does not need to be super detailed. Thanks.

@imaihal imaihal commented Nov 25, 2024

Can you post the improvements you got here, just for future reference? It does not need to be super detailed. Thanks.

I put the measurement results for gpt2-large and Mistral-7b below.
For gpt2-large, peak memory usage dropped from 8.9 GB to 7.4 GB, and compilation time improved from 5 min 22 sec to 4 min 30 sec. The left graph is current main, and the right graph is PR #3013.

[graph: peak compile memory and time for gpt2-large, main (left) vs. PR #3013 (right)]

For Mistral-7b, peak memory usage dropped from 33.2 GB to 27.9 GB, and compilation time improved from 17 min 4 sec to 13 min 58 sec. The left graph is current main, and the right graph is PR #3013.

[graph: peak compile memory and time for Mistral-7b, main (left) vs. PR #3013 (right)]

@tungld tungld commented Nov 27, 2024

@jenkins-droid test this please
