DRL algorithm with api #2528

Open
eightreal opened this issue Apr 1, 2024 · 8 comments

@eightreal

Hello, dear contributors,
I noticed that the DQN application doesn't use the API .h file.
Also, only predefined loss functions exist, so since I want to develop a DQN method, I would like to ask you to confirm the following.

  1. Is there an interface or method to customize the loss function?
  2. Can I copy the header file you used in Applications / DRL, and if so, which release package should I use? nntrainer-devel?

Or do you have better advice?

@taos-ci

taos-ci commented Apr 1, 2024

:octocat: cibot: Thank you for posting issue #2528. The person in charge will reply soon.

@myungjoo
Member

myungjoo commented Apr 3, 2024

  1. Example: https://github.com/nnstreamer/nntrainer/blob/main/Applications/Custom/mae_loss.cpp
  2. Yes, you can. A devel package is always recommended, too, if you want to set up a CI/CD system.
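
To illustrate what a custom loss has to provide, here is a minimal standalone sketch of mean absolute error (the loss named in the linked example): a forward value and its gradient with respect to the prediction. It does not use the nntrainer layer interface; `mae_forward` and `mae_backward` are hypothetical helper names, and the linked mae_loss.cpp remains the reference for how to hook such a loss into nntrainer.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an nntrainer API): mean of |pred - label|.
float mae_forward(const std::vector<float> &pred,
                  const std::vector<float> &label) {
  assert(pred.size() == label.size() && !pred.empty());
  float sum = 0.0f;
  for (std::size_t i = 0; i < pred.size(); ++i) {
    sum += std::fabs(pred[i] - label[i]);
  }
  return sum / static_cast<float>(pred.size());
}

// Hypothetical helper (not an nntrainer API): gradient of the loss above
// with respect to each prediction, i.e. sign(pred - label) / N.
// The subgradient at pred == label is taken to be 0.
std::vector<float> mae_backward(const std::vector<float> &pred,
                                const std::vector<float> &label) {
  assert(pred.size() == label.size() && !pred.empty());
  const float inv_n = 1.0f / static_cast<float>(pred.size());
  std::vector<float> grad(pred.size(), 0.0f);
  for (std::size_t i = 0; i < pred.size(); ++i) {
    const float diff = pred[i] - label[i];
    if (diff > 0.0f) {
      grad[i] = inv_n;
    } else if (diff < 0.0f) {
      grad[i] = -inv_n;
    }
  }
  return grad;
}
```

For DQN, the same two pieces (loss value and gradient) would be needed for whichever loss is chosen, with the label being the bootstrapped target value.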

@eightreal
Author

Another question.
When I call the run interface and save the model, is the current training state (such as gradient information) saved as well? Is it possible to continue training after the model is loaded later?

@EunjuYang
Contributor

Hello! Thank you for your questions. Here are my answers:
First, you can save the model after training. However, saving gradient information is not supported.
Second, yes, it is possible to continue training after the model is loaded.

@myungjoo
Member

You can checkpoint and continue the training process, but that is not based on saving gradients.
You can do epoch-based checkpointing (that's what most of nntrainer's mobile applications do), but I'm not sure about finer-grained checkpointing.

@eightreal
Author

OK, thanks for your reply.
Another question: is there any method for copying a model and for a Polyak update?

@myungjoo
Member

For model copy, if there is no copy constructor for the model class and the default behavior does not do what you want, you may try "original.save()" and "cloned.load()".

For Polyak update, it appears that the DQN application (or simple "reinforcement learning" app) has its own "custom" op. But I'm not too sure about this. I guess @jijoongmoon may answer this when he returns from his trip.
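
For reference, a Polyak (soft) update blends the online network's weights into the target network as target = tau * online + (1 - tau) * target. Below is a minimal standalone sketch, assuming the weights of both networks are available as flat float vectors of equal length. `polyak_update` and `tau` are hypothetical names for illustration, not part of nntrainer, and how to read and write the weights of an nntrainer model is not covered here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an nntrainer API).
// Soft update of target weights toward online weights:
//   target[i] <- tau * online[i] + (1 - tau) * target[i]
// tau = 1 copies the online weights; tau = 0 leaves the target unchanged.
void polyak_update(const std::vector<float> &online,
                   std::vector<float> &target, float tau) {
  assert(online.size() == target.size());
  assert(tau >= 0.0f && tau <= 1.0f);
  for (std::size_t i = 0; i < target.size(); ++i) {
    target[i] = tau * online[i] + (1.0f - tau) * target[i];
  }
}
```

With tau = 1 this reduces to a hard copy, which has the same effect as the save-and-load approach mentioned above; DQN variants that use soft updates typically pick a small tau and call the update after every training step.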

@eightreal
Author

Hello, I checked the reinforcement learning app.
It updates the network by saving to a file and loading it back, not by a Polyak update.
Could you help check it?
And if there is an implementation of the Polyak update, could you point me to its file path and line number?
