Replies: 2 comments 5 replies
-
Hi @jsrdcht, I need more context to understand your request.
-
I have read the torchdistill source code and found a problem in how the parameters to be optimized are passed to the optimizer:
When the configuration file is set as above, only the parameters specified there are treated as parameter groups that need optimization. This corresponds to this part of the source code:
The problem is that when module_wise_params_configs is not empty, that branch only constructs parameter groups for the parameters defined in module_wise_params_configs; every other parameter is left out of the optimizer. A second issue is that the current code only supports nn.Module and not nn.Parameter, because of the line "module_wise_params_dict['params'] = module.parameters()" (a bare nn.Parameter has no .parameters() method).
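To make the second point concrete, here is a minimal sketch of how parameter-group construction could accept either an nn.Module or an nn.Parameter and still fall back to the remaining trainable parameters. This is not torchdistill's actual API; the helper name, the config keys ('module', 'params_config'), and the attribute-lookup convention are assumptions for illustration only.

```python
from torch import nn

def build_param_groups(model, module_wise_params_configs):
    """Hypothetical sketch: build optimizer parameter groups from config entries
    that may name either an nn.Module or a bare nn.Parameter on the model."""
    param_groups = []
    covered = set()
    for config in module_wise_params_configs:
        target = getattr(model, config['module'])          # attribute name from the config
        group = dict(config.get('params_config', {}))      # per-group lr, weight_decay, etc.
        if isinstance(target, nn.Parameter):
            params = [target]                               # single parameter case
        else:
            params = list(target.parameters())              # nn.Module case
        group['params'] = params
        covered.update(id(p) for p in params)
        param_groups.append(group)
    # Fall-back group so parameters not mentioned in the config are still optimized
    remaining = [p for p in model.parameters()
                 if p.requires_grad and id(p) not in covered]
    if remaining:
        param_groups.append({'params': remaining})
    return param_groups
```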
-
Recently, I ran into a strange bug while using the torchdistill framework. After debugging, I found the cause: torchdistill only supports nn.Module objects and expects all modules whose parameters should be optimized to be passed in together. If only one module is listed, the remaining modules are treated as if they do not need optimization, which does not match real usage.
My requirement is to adjust the optimizer settings of a specific nn.Parameter individually, and I hope torchdistill can support this. Concretely, the logic should be: register all parameters that require gradients with the optimizer, and then override the optimizer settings (e.g. learning rate) for specific parameters. A sketch of that logic is below.
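For reference, a minimal plain-PyTorch sketch of the behavior I am asking for (this uses torch.optim directly, not torchdistill's configuration; the model and attribute names are made up):

```python
import torch
from torch import nn

class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(16, 16)
        # A standalone nn.Parameter whose learning rate should be tuned separately
        self.temperature = nn.Parameter(torch.ones(1))

    def forward(self, x):
        return self.backbone(x) / self.temperature

model = ToyModel()

# Put the special parameter in its own group, and every other parameter
# that requires gradients in a default group.
special = [model.temperature]
special_ids = {id(p) for p in special}
default = [p for p in model.parameters()
           if p.requires_grad and id(p) not in special_ids]

optimizer = torch.optim.SGD(
    [
        {'params': default},              # uses the default lr below
        {'params': special, 'lr': 1e-1},  # per-parameter override
    ],
    lr=1e-3, momentum=0.9,
)
```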