Replies: 2 comments 5 replies
-
Hi @jsrdcht, I need more context to understand your request.
-
I have read the torchdistill source code and found a problem in how the parameters to be optimized are passed to the optimizer:
When the configuration file is set as above, only the parameters specified there are treated as parameter groups that need optimization. This corresponds to this part of the source code:
The problem is that when module_wise_params_configs is not empty, that branch only constructs parameter groups for the parameters defined in module_wise_params_configs; every other parameter is left out of the optimizer. A second issue is that the current code only supports nn.Module and not nn.Parameter, because of the line "module_wise_params_dict['params'] = module.parameters()" (a bare nn.Parameter has no .parameters() method).
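To make the second point concrete, here is a minimal sketch of how parameter-group construction could accept either an nn.Module or an nn.Parameter and still fall back to the remaining trainable parameters. This is not torchdistill's actual API; the helper name, the config keys ('module', 'params_config'), and the attribute-lookup convention are assumptions for illustration only.

```python
from torch import nn

def build_param_groups(model, module_wise_params_configs):
    """Hypothetical sketch: build optimizer parameter groups from config entries
    that may name either an nn.Module or a bare nn.Parameter on the model."""
    param_groups = []
    covered = set()
    for config in module_wise_params_configs:
        target = getattr(model, config['module'])          # attribute name from the config
        group = dict(config.get('params_config', {}))      # per-group lr, weight_decay, etc.
        if isinstance(target, nn.Parameter):
            params = [target]                               # single parameter case
        else:
            params = list(target.parameters())              # nn.Module case
        group['params'] = params
        covered.update(id(p) for p in params)
        param_groups.append(group)
    # Fall-back group so parameters not mentioned in the config are still optimized
    remaining = [p for p in model.parameters()
                 if p.requires_grad and id(p) not in covered]
    if remaining:
        param_groups.append({'params': remaining})
    return param_groups
```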
-
Recently, I ran into a strange bug while using the torchdistill framework. After debugging, I found the cause: torchdistill only supports nn.Module objects and expects all modules whose parameters should be optimized to be passed in together. If only one module is listed, the remaining modules are treated as if they do not need optimization, which does not match real usage.
My requirement is to adjust the optimizer settings of a specific nn.Parameter individually, and I hope torchdistill can support this. Concretely, the logic should be: register all parameters that require gradients with the optimizer, and then override the optimizer settings (e.g. learning rate) for specific parameters. A sketch of that logic is below.
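For reference, a minimal plain-PyTorch sketch of the behavior I am asking for (this uses torch.optim directly, not torchdistill's configuration; the model and attribute names are made up):

```python
import torch
from torch import nn

class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(16, 16)
        # A standalone nn.Parameter whose learning rate should be tuned separately
        self.temperature = nn.Parameter(torch.ones(1))

    def forward(self, x):
        return self.backbone(x) / self.temperature

model = ToyModel()

# Put the special parameter in its own group, and every other parameter
# that requires gradients in a default group.
special = [model.temperature]
special_ids = {id(p) for p in special}
default = [p for p in model.parameters()
           if p.requires_grad and id(p) not in special_ids]

optimizer = torch.optim.SGD(
    [
        {'params': default},              # uses the default lr below
        {'params': special, 'lr': 1e-1},  # per-parameter override
    ],
    lr=1e-3, momentum=0.9,
)
```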