The paper uses the following problem to demonstrate Step-DPO's effectiveness: "The square root of t is greater than 2 and less than 3.5. How many integer values of t satisfy this condition?" However, I changed the prompt to "The square root of t is greater than 2.3 and less than 3.5. How many integer values of t satisfy this condition?" The answer is like
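For reference, the expected answers to both versions of the prompt can be checked by brute force. This is only a minimal sketch for verifying the arithmetic; the helper name `count_integer_t` is hypothetical and does not come from the paper or the repository.

```python
import math


def count_integer_t(lower: float, upper: float) -> int:
    """Count integers t with lower < sqrt(t) < upper (strict inequalities)."""
    # No integer above floor(upper**2) can satisfy sqrt(t) < upper.
    max_t = math.floor(upper ** 2)
    return sum(1 for t in range(0, max_t + 1) if lower < math.sqrt(t) < upper)


print(count_integer_t(2, 3.5))    # original prompt: t in 5..12 -> 8
print(count_integer_t(2.3, 3.5))  # modified prompt: t in 6..12 -> 7
```

Squaring the bounds gives 4 < t < 12.25 for the original prompt and 5.29 < t < 12.25 for the modified one, so the correct answers are 8 and 7 respectively.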
hxdtest changed the title from "Does step-dpo works?" to "Does step-dpo work?" on Sep 22, 2024