Code models score low on humaneval dataset #1067
Could you provide the configs of models and datasets for testing? Thanks!
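I used OpenCompass to test code LLMs such as StarCoder2, WizardCoder, and CodeLlama, and their HumanEval scores are very low (below 10). Why do they all share the same preprocessing? Different code LLMs seem to need quite different pre- and post-processing, don't they?
Has anyone else run into this?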
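To illustrate why post-processing can matter here, below is a minimal sketch, not OpenCompass's actual implementation. It assumes that instruction-tuned models tend to wrap their answer in a Markdown code fence with surrounding prose, while base models tend to continue the prompt and keep generating extra functions. The helper names `extract_code` and `truncate_at_stop`, and the stop sequences, are hypothetical and chosen only for this example.

```python
import re

# Matches the first Markdown code fence, with any (or no) language tag.
FENCE_RE = re.compile(r"```[^\n]*\n(.*?)```", re.DOTALL)


def extract_code(raw_output: str) -> str:
    """Return the body of the first fenced block, or the raw text if none.

    Hypothetical helper: suits chat-style outputs that mix prose and code.
    """
    match = FENCE_RE.search(raw_output)
    if match:
        return match.group(1).rstrip() + "\n"
    return raw_output


def truncate_at_stop(
    completion: str,
    stop_sequences=("\ndef ", "\nclass ", "\nif __name__", "\nprint("),
) -> str:
    """Cut a raw continuation at the earliest stop sequence.

    Hypothetical helper: suits base-model outputs that keep generating
    new top-level code after the target function body is finished.
    """
    cut = len(completion)
    for stop in stop_sequences:
        idx = completion.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return completion[:cut]


# Chat-style output: prose around a fenced block.
chat_output = (
    "Here is the solution:\n"
    "```python\n"
    "def add(a, b):\n"
    "    return a + b\n"
    "```\n"
    "Hope it helps."
)
chat_code = extract_code(chat_output)

# Base-model output: function body followed by an unrelated function.
base_output = "    return a + b\n\ndef sub(a, b):\n    return a - b\n"
base_code = truncate_at_stop(base_output)
```

If a fence-stripping step is applied to a model that never emits fences, or a truncation step to a model that answers in chat format, the text handed to the test harness may not be valid Python, which would push pass@1 toward zero regardless of model quality.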