I am a pharmacology chemistry student deeply interested in the applications of generative AI, specifically REINVENT4. Although I am relatively new to deep learning, I have been actively engaging with REINVENT4, deploying and running both the provided demos and experiments tailored to my own projects. During my experimentation, particularly in the reinforcement learning phase, I have observed that the loss scores reported during model evaluation consistently increase. This seems counterintuitive, as I understand that lower loss scores typically indicate better model performance. Given this, I am concerned that I may be misinterpreting the nature or the expected behaviour of the loss score in this specific context. Could you please provide some guidance on how to interpret these results? Any detailed explanations or resources you could point me to would be immensely helpful as I aim to deepen my understanding of these processes. Thank you very much for your time and assistance.
Best regards,
Jack Hu
Hi,
many thanks for your interest in REINVENT and welcome to the community!
The loss function is documented in our paper; see eqs. 5 and 6 and the discussion following eq. 7. Please note carefully that the equations there are written in terms of the log-likelihood, while the TensorBoard output logs the negative log-likelihood (NLL), which, confusingly, is a positive number.
Practically speaking, what you want to achieve in reinforcement learning (RL) is that the new model (the agent) generates high-scoring SMILES with low NLL. This will, somewhat naturally, mean that the very same SMILES will probably not score well with the prior, which is used as a reference or "regularizer", and will have high NLL under it. Eq. 6 computes the distance between the two models (NB the prior network is fixed) and adds the scaled score, which you would want to move towards 1 on a 0->1 scale, as the reward; this distance will therefore have to increase in order for the agent to learn to generate better-scoring SMILES.
Also, keep in mind that the sample is drawn anew from the agent in every step, i.e. every new batch will have new SMILES (although there will also be duplicates). That is what you are eventually after.
Many thanks,
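To make the sign conventions above concrete, here is a minimal sketch of this kind of update written in terms of the NLLs that TensorBoard reports, assuming an agent/prior pair and a scoring function scaled to 0->1. The names `agent`, `prior`, `scoring_function` and `sigma` are hypothetical placeholders rather than the actual REINVENT4 API; the loss shown is the usual squared distance between the agent likelihood and the score-augmented prior likelihood, not a verbatim copy of eq. 6.

```python
# Minimal sketch of a REINVENT-style RL step, written in terms of NLLs.
# `agent`, `prior` and `scoring_function` are illustrative placeholders,
# not the actual REINVENT4 API.
import torch

def rl_step(agent, prior, scoring_function, batch_size=128, sigma=128.0):
    # A fresh batch of SMILES is sampled from the agent at every step,
    # so each step's NLLs refer to different molecules.
    smiles, agent_nll = agent.sample(batch_size)   # NLLs are positive numbers

    # The prior is frozen and only serves as a reference ("regularizer").
    with torch.no_grad():
        prior_nll = prior.likelihood(smiles)

    # The scoring function returns values scaled to the range 0..1.
    score = torch.as_tensor(scoring_function(smiles))

    # Augmented NLL: the prior NLL lowered in proportion to the score,
    # i.e. log P_aug = log P_prior + sigma * score, written with NLLs.
    augmented_nll = prior_nll - sigma * score

    # Squared distance between the agent and the augmented likelihood;
    # minimising this (optimizer step omitted) pulls the agent towards
    # molecules the prior still finds reasonable *and* that score highly.
    loss = torch.pow(augmented_nll - agent_nll, 2).mean()
    return loss, agent_nll, prior_nll, score
```

In this formulation, as the agent improves, its own NLL for high-scoring SMILES falls while the prior's NLL for those same SMILES tends to rise, so the NLL curves shown in TensorBoard can grow even though the run is behaving as intended.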