[Wait for #2579][unittest] Add mixed precision unit test #2589

DonghakPark · 2024-05-17T09:37:55Z

Add two mixed precision unit test cases:

case1 : FC-FC-FC
case2 : Flatten - FC

I will add more mixed precision cases (conv, lstm, etc.).

Self evaluation:

Build test: [X]Passed [ ]Failed [ ]Skipped
Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghak PARK [email protected]

We add a Var32 tensor if the variable weight is not full precision (FP32). This enables the weight update with full precision, and only the Apply Gradient process uses this tensor. Therefore, the lifespan of this tensor should be "ApplyGradient".

- Modify TensorPool to generate Weight considering mixed precision.

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <[email protected]>
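To illustrate why the commit above keeps a full-precision copy of the weight, here is a minimal sketch, not NNTrainer code. `roundToHalf`, `MixedWeight`, `applyGradient` and `applyGradientHalfOnly` are hypothetical names, and FP16 is only emulated by rounding a `float` to 11 significant bits (range limits and subnormals are ignored).

```cpp
#include <cmath>

// Emulates rounding to FP16's 11 significant bits (range and subnormals are
// ignored). Hypothetical helper for illustration, not an NNTrainer function.
float roundToHalf(float x) {
  if (x == 0.0f || !std::isfinite(x))
    return x;
  int exponent = 0;
  float mantissa = std::frexp(x, &exponent); // |mantissa| in [0.5, 1)
  return std::ldexp(std::round(std::ldexp(mantissa, 11)), exponent - 11);
}

// Hypothetical weight holding a low-precision variable and an FP32 copy.
struct MixedWeight {
  float var;   // value as the layers see it (rounded to half precision)
  float var32; // full-precision copy, touched only while applying gradients
};

// Updates the FP32 copy, then refreshes the low-precision variable from it.
void applyGradient(MixedWeight &w, float grad, float lr) {
  w.var32 -= lr * grad;
  w.var = roundToHalf(w.var32);
}

// Naive alternative: update the low-precision value directly.
float applyGradientHalfOnly(float var, float grad, float lr) {
  return roundToHalf(var - lr * grad);
}
```

Near 1.0 the spacing between representable half-precision values is about 0.001, so an update of 0.0001 applied directly to the low-precision value is rounded away every step, while the FP32 copy accumulates it.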

This PR creates the FP32 variable tensor (var32) when we create the Weight and Optimizer Weight.

- Update the manager to create Weight with the var32 tensor requested from the weight pool.
- Update the weight requests with the Weight Spec and the var, grad and var32 tensors that were already created.
- Add Tensor clone with a specific type in tensor.h

Resolves:

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <[email protected]>

This PR enables FP16 support for the layers below:

- input layer
- mse loss layer

Resolves:

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <[email protected]>

This PR includes the mixed precision test case.

- Input - FC - MSE : "batch_size=2", "model_tensor_type=FP16-FP16", "loss_scale=128"

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <[email protected]>

This commit modifies apply gradient in the optimizer. We do not need to save optimizer variables in the weight type. Only the optimizer needs the optimizer variables, and we should update the weight with full precision to maintain accuracy. Therefore, remove the var32 tensors for optimizer variables.

Resolves:

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <[email protected]>

This PR adds an is_NaN function to check whether the tensor has a NaN value. It is used to check for NaN during mixed precision training.

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <[email protected]>

This PR adds a loss scale parameter to the run context and uses it to update the MSE loss.

- Add Loss Scale Parameter in RunLayerContext Constructor
- Add applyLossScale func to update the returned derivative in Loss Layer
- Change MSE Loss Layer to apply the loss scale to the returned derivative

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <[email protected]>
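As a rough sketch of what applying the loss scale to the returned derivative means, the functions below compute an MSE derivative and scale it. Only the name `applyLossScale` comes from the commit message. The signatures, the `mseDerivative` helper and the `2 * (y - label) / N` normalization are assumptions for illustration and may differ from NNTrainer's loss layer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Derivative of mean squared error w.r.t. the prediction: 2 * (y - label) / N.
// Hypothetical helper; the normalization is an assumption.
std::vector<float> mseDerivative(const std::vector<float> &prediction,
                                 const std::vector<float> &label) {
  assert(prediction.size() == label.size() && !prediction.empty());
  std::vector<float> deriv(prediction.size());
  const float inv_n = 1.0f / static_cast<float>(prediction.size());
  for (std::size_t i = 0; i < prediction.size(); ++i)
    deriv[i] = 2.0f * (prediction[i] - label[i]) * inv_n;
  return deriv;
}

// Multiplies the returned derivative by the loss scale, in place.
void applyLossScale(std::vector<float> &deriv, float loss_scale) {
  for (float &d : deriv)
    d *= loss_scale;
}
```

Scaling the derivative at the loss layer scales every gradient computed behind it by the same factor, which keeps small gradients from underflowing in FP16. The gradients then have to be divided by the same factor before the weights are updated.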

This PR enables Mixed Precision Training. For now only FP16-FP32 is considered. Additional test cases will be added.

- Add getSortedLayerIdx to set the graph order for forwarding.
- Change clip_weights to lazy_apply_weights so that it can be used in both cases.
- Add forwarding_op to run forwarding from the layer that has a gradient with NaN.
- Add a while loop to re-run backwarding after resetting the loss scale.
- Add setLossScale in RunLayerContext
- Check the gradient if mixed precision is enabled.

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <[email protected]>
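The retry loop described above can be sketched roughly as follows. This is a toy model under stated assumptions, not the NNTrainer implementation: `backwardScaled`, `StepResult` and `backwardWithRetry` are hypothetical names, the backward pass is reduced to a single scalar gradient, and halving the loss scale on overflow is an assumed reset policy that the commit message does not specify.

```cpp
#include <cmath>
#include <limits>

constexpr float kHalfMax = 65504.0f; // largest finite FP16 value

// Stand-in for a backward pass in FP16: the scaled gradient overflows to
// infinity once it exceeds the FP16 range.
float backwardScaled(float true_grad, float loss_scale) {
  const float scaled = true_grad * loss_scale;
  return std::fabs(scaled) > kHalfMax ? std::numeric_limits<float>::infinity()
                                      : scaled;
}

struct StepResult {
  float grad;       // unscaled gradient
  float loss_scale; // loss scale used for the last backward pass
  int retries;      // number of backward passes that had to be repeated
};

// Repeats the backward pass with a smaller loss scale until the gradient is
// finite. If the scale reaches 1 and the gradient is still not finite, the
// non-finite value is returned and the caller has to skip the update.
StepResult backwardWithRetry(float true_grad, float loss_scale) {
  int retries = 0;
  float scaled = backwardScaled(true_grad, loss_scale);
  while (!std::isfinite(scaled) && loss_scale > 1.0f) {
    loss_scale *= 0.5f; // assumed reset policy: halve the scale
    scaled = backwardScaled(true_grad, loss_scale);
    ++retries;
  }
  return {scaled / loss_scale, loss_scale, retries};
}
```

With a true gradient of 1000 and an initial loss scale of 128, the first pass overflows (128000 is above 65504), the scale drops to 64, and the second pass yields a finite gradient that unscales back to 1000.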

This PR adds an infinity value check for Tensor data.

- Rename hasNaN to isValid
- Add an infinity check to the isValid function, which now checks for both NaN and Inf
- Modify blas_avx and blas_neon to perform the check
- Modify graph and model to check is_valid rather than has_nan
- Add a unit test for the isValid function

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <[email protected]>
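A scalar version of such a validity check might look like the sketch below. The name `isValid` follows the commit message, but the raw-pointer signature is an assumption, and the vectorized blas_avx and blas_neon paths mentioned above are not shown.

```cpp
#include <cmath>
#include <cstddef>

// Returns true only if every element is finite (neither NaN nor +/-Inf).
// An empty range is treated as valid.
bool isValid(const float *data, std::size_t len) {
  for (std::size_t i = 0; i < len; ++i)
    if (!std::isfinite(data[i]))
      return false;
  return true;
}
```

`std::isfinite` rejects both NaN and infinities, so one pass covers the two conditions that the renamed function checks.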

This PR changes the loss computation to use full precision rather than half precision to maintain accuracy.

**Changes proposed in this PR:**
- Added TOC generator for README.md

Resolves:

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: jijoong.moon <[email protected]>

It adds a conv2d test for FP16.

Signed-off-by: Jiho Chu <[email protected]>

Activations and weights are assumed to be fully compatible, so no conversion between them is necessary. The input layer and loss layers are different, because input data and label data are currently assumed to always be float32.

Signed-off-by: Jiho Chu <[email protected]>

This PR updates the mixed precision layer.

- integrate nnstreamer#2568 & nnstreamer#2455
- will update more tests

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghak PARK <[email protected]>

Add two mixed precision unit test cases:

- case1 : FC-FC-FC
- case2 : Flatten - FC

I will add more mixed precision cases (conv, lstm, etc.).

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: Donghak PARK <[email protected]>

taos-ci · 2024-05-17T09:37:58Z

📝 TAOS-CI Version: 1.5.20200925. Thank you for submitting PR #2589. Please follow the 1commit/1PR (one commit per PR) policy to get comments quickly from reviewers. Your PR must pass all verification processes of cibot before reviewers start the review process. If you are a new member of this project, please read the manuals in the documentation folder and the wiki page. To monitor the progress of your PR in more detail, visit http://ci.nnstreamer.ai/.

taos-ci

@DonghakPark, 💯 All CI checkers are successfully verified. Thanks.

DonghakPark · 2024-11-25T01:48:00Z

Closed by #2663

jijoongmoon and others added 14 commits May 17, 2024 16:03

[Test] Add conv2d test for fp16

0d52e0d

It adds a conv2d test for FP16. Signed-off-by: Jiho Chu <[email protected]>

DonghakPark requested review from myungjoo, jijoongmoon, again4you, jaeyun-jung, leemgs and wooksong as code owners May 17, 2024 09:37

DonghakPark requested review from helloahn, kparichay, gichan-jang, anyj0527, zhoonit, lhs8928, songgot, jihochu and SeoHyungjun as code owners May 17, 2024 09:37

DonghakPark requested review from baek2sm, skykongkong8, djeong20, EunjuYang and a team as code owners May 17, 2024 09:37

github-actions bot added the Need Review label May 17, 2024

taos-ci approved these changes May 17, 2024

DonghakPark mentioned this pull request Jul 8, 2024

[Wait for #2615] Enable Mixed Precision Training in NNTrainer @open sesame 11/09 15:18 #2663

Merged

DonghakPark closed this Nov 25, 2024

DonghakPark deleted the mixed_pre_2 branch November 26, 2024 04:38
