
ConvInteger: fix parsing for x_zero_point and w_zero_point #3763

Merged — 24 commits merged into develop from parse_qconv_bias_fix on Jan 21, 2025

Conversation

kahmed10
Collaborator

@kahmed10 kahmed10 commented Jan 16, 2025

  • The previous test did not properly expose the bug when parsing the ConvInteger found in SD3. Updated the parser and the unit test accordingly.
  • I also renamed some functions and types for clarity.
  • I also needed to add an exception for layout in find_inner_broadcast. The unit test otherwise would not run: the zero points were already broadcast, and applying a layout with a 4d permutation to a 1d tensor would not work.
  • Feel free to suggest better variable names.
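To illustrate the find_inner_broadcast layout exception above, here is a minimal Python sketch (the `permute_shape` helper is hypothetical, not MIGraphX code): a layout pass reorders dimensions with a permutation, so the permutation rank must match the tensor rank, which a broadcast 1d zero point cannot satisfy against a 4d permutation.

```python
# Hypothetical sketch (not MIGraphX code): a layout pass applies a
# permutation with one entry per dimension, so the permutation rank
# must equal the tensor rank. A 1d zero-point tensor therefore cannot
# take a 4d permutation.
def permute_shape(shape, perm):
    if len(shape) != len(perm):
        raise ValueError("permutation rank does not match tensor rank")
    return [shape[p] for p in perm]

print(permute_shape([1, 3, 5, 5], [0, 2, 3, 1]))  # [1, 5, 5, 3]
# permute_shape([3], [0, 2, 3, 1]) raises ValueError: a 1d tensor has no
# dimensions 1..3 to reorder.
```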

I noticed allocation segment sizes approaching the uint64 max value, which would clearly throw an out-of-memory error when trying to allocate on the GPU. This happened when MIGRAPHX_NSTREAMS was set to 2 or greater and the model was large enough to trigger it.

Changing from `auto` to `size_t` seems to fix the issue.
@kahmed10 kahmed10 requested a review from causten as a code owner January 16, 2025 03:38
@kahmed10 kahmed10 requested review from TedThemistokleous and lakhinderwalia and removed request for causten January 16, 2025 03:38
@kahmed10 kahmed10 self-assigned this Jan 16, 2025
@kahmed10 kahmed10 added bugfix Fixes a bug found in the code. simple small or simple changes labels Jan 16, 2025
@kahmed10 kahmed10 force-pushed the parse_qconv_bias_fix branch from ab9d13f to 276d9d4 Compare January 16, 2025 13:10
@kahmed10
Collaborator Author

Some further details on the bug:
The previous logic after checking whether x_zp and w_zp were symmetric was faulty. Because add_common_op already performs an implicit multibroadcast, no errors were thrown, and the shapes of the old test inputs just happened to satisfy the broadcasting rules.

Previous input shapes:

x -> [1,3,5,5]
w -> [1,3,2,2]
conv(x,w) -> [1,1,3,3]

The previous logic would then look at x_zp and find it's not symmetric.
It would then use add_common_op to compute conv(x_zp,w), which would multibroadcast x_zp to shape [1,3,2,2].
The resulting shape of conv(x_zp,w) would be [1,1,1,1].
ret=conv(x,w)-conv(x_zp,w) -> [1,1,3,3] - [1,1,1,1] -> [1,1,3,3] - [1,1,3,3] technically works.
Similarly, using add_common_op for conv(x,w_zp) would multibroadcast w_zp to shape [1,3,5,5].
The resulting shape of conv(x,w_zp) would also be [1,1,1,1].
ret=ret-conv(x,w_zp) -> [1,1,3,3] - [1,1,1,1] -> [1,1,3,3] - [1,1,3,3] also technically works.
Then the final add_common_op for adding conv(x_zp,w_zp) would also technically work:
ret=ret+conv(x_zp,w_zp) -> [1,1,3,3] + [1,1,1,1] -> [1,1,3,3] + [1,1,3,3]

But now if you have the new input shapes:

x_new -> [2,3,10,10]
w_new -> [4,3,3,3]
conv(x_new,w_new) -> [2,4,8,8]

Now if you tried to use add_common_op for conv(x_zp,w_new), you'd get the following:

ret=conv(x_new,w_new)-conv(x_zp,w_new) -> [2,4,8,8] - [4,4,1,1] -> incompatible shapes

TL;DR: The previous logic of using add_common_op was wrong. The old test case just happened to produce broadcastable shapes, so the bad test case never uncovered the bug in the logic.
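The shape walkthrough above can be reproduced with a short sketch of the multidirectional (NumPy-style) broadcasting rule that ONNX follows; `broadcast_shapes` here is an illustrative helper, not MIGraphX's actual add_common_op implementation:

```python
# Illustrative sketch of multidirectional (NumPy-style) broadcasting,
# which ONNX follows; not MIGraphX's add_common_op implementation.
def broadcast_shapes(a, b):
    """Return the broadcast shape of a and b, or None if incompatible."""
    out = []
    # Walk both shapes from the trailing dimension, padding the shorter with 1s.
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1
        db = b[-i] if i <= len(b) else 1
        if da != db and da != 1 and db != 1:
            return None  # dimensions neither equal nor broadcastable
        out.append(max(da, db))
    return out[::-1]

# Old test shapes: the implicit multibroadcast happened to succeed.
print(broadcast_shapes([1, 1, 3, 3], [1, 1, 1, 1]))  # [1, 1, 3, 3]
# New SD3-like shapes: fails on the leading dimension (2 vs 4).
print(broadcast_shapes([2, 4, 8, 8], [4, 4, 1, 1]))  # None
```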


// multibroadcast (or broadcast) zero points according to spec
// x_zp should be a scalar or literal with one element
// w_zp can be either a single element or a 1d tensor with size out_channels
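The rules stated in the comment above could be checked along these lines; this is a hedged Python sketch with an illustrative `check_zero_point_shapes` helper (the actual parser is C++ and may structure this differently):

```python
# Hedged sketch of the zero-point shape rules from the comment above
# (per the ONNX ConvInteger spec): x_zero_point must hold one element;
# w_zero_point may hold one element or be 1d with size out_channels.
# Names are illustrative, not from the MIGraphX parser.
def check_zero_point_shapes(x_zp_shape, w_zp_shape, out_channels):
    # x_zp: scalar (rank 0) or any shape with a single element
    x_ok = len(x_zp_shape) == 0 or all(d == 1 for d in x_zp_shape)
    # w_zp: single element, or 1d with one entry per output channel
    w_ok = (len(w_zp_shape) == 0
            or all(d == 1 for d in w_zp_shape)
            or (len(w_zp_shape) == 1 and w_zp_shape[0] == out_channels))
    return x_ok and w_ok

print(check_zero_point_shapes([], [4], out_channels=4))   # True
print(check_zero_point_shapes([1], [3], out_channels=4))  # False
```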
Comment: Good to add the comment here!

@lakhinderwalia lakhinderwalia left a comment


Thanks for adding the general description; it helps a lot. Thanks also for modifying the test cases. Approved.

@migraphx-bot

| Test | Batch | Rate new (1ce1f9) | Rate old (150105) | Diff |
|---|---|---|---|---|
| torchvision-resnet50 | 64 | 3,253.57 | 3,254.33 | -0.02% |
| torchvision-resnet50_fp16 | 64 | 6,927.88 | 6,927.70 | 0.00% |
| torchvision-densenet121 | 32 | 2,454.74 | 2,454.03 | 0.03% |
| torchvision-densenet121_fp16 | 32 | 4,182.86 | 4,166.00 | 0.40% |
| torchvision-inceptionv3 | 32 | 1,630.27 | 1,630.12 | 0.01% |
| torchvision-inceptionv3_fp16 | 32 | 2,714.81 | 2,716.54 | -0.06% |
| cadene-inceptionv4 | 16 | 763.30 | 763.13 | 0.02% |
| cadene-resnext64x4 | 16 | 813.38 | 813.13 | 0.03% |
| slim-mobilenet | 64 | 7,463.06 | 7,460.93 | 0.03% |
| slim-nasnetalarge | 64 | 208.67 | 208.67 | -0.00% |
| slim-resnet50v2 | 64 | 3,444.86 | 3,446.18 | -0.04% |
| bert-mrpc-onnx | 8 | 1,148.90 | 1,146.26 | 0.23% |
| bert-mrpc-tf | 1 | 489.20 | 481.46 | 1.61% |
| pytorch-examples-wlang-gru | 1 | 484.53 | 470.60 | 2.96% |
| pytorch-examples-wlang-lstm | 1 | 454.03 | 443.44 | 2.39% |
| torchvision-resnet50_1 | 1 | 807.66 | 804.59 | 0.38% |
| cadene-dpn92_1 | 1 | 429.19 | 430.86 | -0.39% |
| cadene-resnext101_1 | 1 | 385.86 | 386.12 | -0.07% |
| onnx-taau-downsample | 1 | 373.45 | 372.92 | 0.14% |
| dlrm-criteoterabyte | 1 | 33.31 | 33.33 | -0.07% |
| dlrm-criteoterabyte_fp16 | 1 | 52.36 | 52.63 | -0.51% |
| agentmodel | 1 | 8,700.12 | 8,589.15 | 1.29% |
| unet_fp16 | 2 | 58.56 | 58.37 | 0.33% |
| resnet50v1_fp16 | 1 | 1,028.12 | 1,036.76 | -0.83% |
| resnet50v1_int8 | 1 | 1,022.10 | 1,039.07 | -1.63% |
| bert_base_cased_fp16 | 64 | 1,180.85 | 1,182.01 | -0.10% |
| bert_large_uncased_fp16 | 32 | 365.23 | 365.27 | -0.01% |
| bert_large_fp16 | 1 | 201.65 | 202.67 | -0.50% |
| distilgpt2_fp16 | 16 | 2,227.31 | 2,227.86 | -0.02% |
| yolov5s | 1 | 537.79 | 523.22 | 2.79% |
| tinyllama | 1 | 43.60 | 43.57 | 0.06% |
| vicuna-fastchat | 1 | 157.73 | 173.92 | -9.31% 🔴 |
| whisper-tiny-encoder | 1 | 417.40 | 418.33 | -0.22% |
| whisper-tiny-decoder | 1 | 429.83 | 430.88 | -0.24% |

This build is not recommended to merge 🔴

@migraphx-bot


✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance
✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance
✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance
✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance
✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance
✅ cadene-dpn92_1: PASSED: MIGraphX meets tolerance
✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance
✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance
✅ agentmodel: PASSED: MIGraphX meets tolerance
✅ unet: PASSED: MIGraphX meets tolerance
✅ resnet50v1: PASSED: MIGraphX meets tolerance
✅ bert_base_cased_fp16: PASSED: MIGraphX meets tolerance
🔴 bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
✅ bert_large: PASSED: MIGraphX meets tolerance
✅ yolov5s: PASSED: MIGraphX meets tolerance
✅ tinyllama: PASSED: MIGraphX meets tolerance
✅ vicuna-fastchat: PASSED: MIGraphX meets tolerance
✅ whisper-tiny-encoder: PASSED: MIGraphX meets tolerance
✅ whisper-tiny-decoder: PASSED: MIGraphX meets tolerance
✅ distilgpt2_fp16: PASSED: MIGraphX meets tolerance


codecov bot commented Jan 20, 2025

Codecov Report

Attention: Patch coverage is 92.30769% with 1 line in your changes missing coverage. Please review.

Project coverage is 92.28%. Comparing base (976ae75) to head (314b6ba).
Report is 1 commit behind head on develop.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/onnx/parse_convolution.cpp | 90.90% | 1 Missing ⚠️ |
Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #3763   +/-   ##
========================================
  Coverage    92.28%   92.28%           
========================================
  Files          519      519           
  Lines        22222    22227    +5     
========================================
+ Hits         20507    20512    +5     
  Misses        1715     1715           


@kahmed10 kahmed10 merged commit f36eba4 into develop Jan 21, 2025
17 of 21 checks passed
@kahmed10 kahmed10 deleted the parse_qconv_bias_fix branch January 21, 2025 04:21