v0.0.4

@alinaryan released this 01 Jul 15:35
· 255 commits to main since this release
7642cab

What's Changed

  • e2e: Fix permissions error by @russellb in #34
  • Include qna_file in mt_bench_branch results by @danmcp in #33
  • Include task scores with mmlu results + adjust default api retries by @danmcp in #37
  • Bump lm-eval minimum version to 0.4.3 by @nathan-weinberg in #44
  • Allow first_n option for gen answers, fix return values with max_workers=1 and only print api errors on final failure by @danmcp in #41
  • Read question_id as a string to preserve precision by @danmcp in #42
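
The last item (#42) is about numeric precision. As a minimal sketch of the general problem, not the project's actual code: a large integer ID that passes through a 64-bit float can silently change value, while reading it as a string keeps every digit. The sample value and both helper functions below are hypothetical; only the field name question_id comes from the note above.

```python
import json

# Made-up record for illustration: 2**53 + 1 is the smallest positive
# integer that a 64-bit float cannot represent exactly.
RAW = '{"question_id": 9007199254740993}'


def read_id_as_float(raw: str) -> float:
    """Parse the record, forcing JSON integers into floats (lossy)."""
    return json.loads(raw, parse_int=float)["question_id"]


def read_id_as_string(raw: str) -> str:
    """Parse the record, keeping JSON integers as their original digits."""
    return json.loads(raw, parse_int=str)["question_id"]


lossy = read_id_as_float(RAW)
exact = read_id_as_string(RAW)

print(int(lossy))  # 9007199254740992 (last digit changed)
print(exact)       # 9007199254740993
```

Keeping the ID as a string also means two records are matched by exact text comparison, which avoids collisions between IDs that round to the same float.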

Full Changelog: v0.0.3...v0.0.4