The result in conversational task is better than paper's #4

Open
C-hase opened this issue Mar 13, 2023 · 2 comments

C-hase commented Mar 13, 2023

I ran your code successfully. The results for the recommendation part are almost the same as in the paper, but the results for the conversational part are much better than the paper's, even though I did not change the source code. Could you explain why?
These are the results on my device:
'test/dist@2': 0.5503033723719486, 'test/dist@3': 0.9362212501763792, 'test/dist@4': 1.211090729504727
These are the results in your paper:
[image: results reported in the paper]

@dayuyang1999

I am not the author, but here is my guess: Dist is a metric that rewards diversity. The sampling method and the hyperparameters of the generation module can strongly influence diversity, so trying different hyperparameters or random seeds may help.
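
For reference, here is a minimal sketch of how a Dist-n (distinct n-gram) score is commonly computed. It is not taken from this repository: the function names and the `per_response` flag are hypothetical, and whitespace tokenization is an assumption. The usual definition divides the number of unique n-grams by the total number of n-grams, which stays at or below 1. Some implementations instead divide by the number of generated responses, which can give values above 1 like the ones reported above; this thread does not confirm which variant the repository uses.

```python
from typing import Iterator, List, Set, Tuple


def ngrams(tokens: List[str], n: int) -> Iterator[Tuple[str, ...]]:
    """Yield every contiguous n-gram in a token list."""
    return (tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def distinct_n(responses: List[List[str]], n: int, per_response: bool = False) -> float:
    """Illustrative Dist-n over a set of tokenized responses.

    per_response=False: unique n-grams / total n-grams (always <= 1).
    per_response=True:  unique n-grams / number of responses (can exceed 1).
    Both normalizations are assumptions for illustration only.
    """
    unique: Set[Tuple[str, ...]] = set()
    total = 0
    for tokens in responses:
        for gram in ngrams(tokens, n):
            unique.add(gram)
            total += 1
    denom = len(responses) if per_response else total
    return len(unique) / denom if denom else 0.0


# Toy responses, split on whitespace (hypothetical data).
responses = [
    "i like this movie".split(),
    "i like that movie too".split(),
    "have you seen it".split(),
]

print(distinct_n(responses, 2))                     # normalized by total bigrams
print(distinct_n(responses, 2, per_response=True))  # normalized by response count
```

Because the score only counts unique n-grams in the generated text, anything that changes what the decoder emits (sampling strategy, temperature, seed, checkpoint) moves it directly, with no reference text involved.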

wxl1999 (Owner) commented Nov 27, 2023

@dayuyang1999 You are right. In addition, Dist is sensitive to the number of training steps, so I suggest paying more attention to human evaluation.
