v0.0.4
What's Changed
- e2e: Fix permissions error by @russellb in #34
- Include qna_file in mt_bench_branch results by @danmcp in #33
- Include task scores with mmlu results + adjust default api retries by @danmcp in #37
- Bump lm-eval minimum version to 0.4.3 by @nathan-weinberg in #44
- Allow first_n option for gen answers, fix return values with max_workers=1, and only print API errors on final failure by @danmcp in #41
- Read question_id as a string to preserve precision by @danmcp in #42
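
The `question_id` change in #42 can be illustrated with a small sketch. It assumes the IDs are large integers that end up as 64-bit floats somewhere in the loading path; the release notes do not say where the precision was lost, so the sample record and the `parse_int=str` approach below are illustrative assumptions, not the library's actual code.

```python
import json

# Hypothetical record; the ID is 2**53 + 1, just beyond what a float64 can hold exactly.
raw = '{"question_id": 9007199254740993, "text": "example"}'

# Converting the ID to a float rounds it to the nearest representable value.
original_id = json.loads(raw)["question_id"]
as_float = float(original_id)
precision_lost = int(as_float) != original_id

# Reading every JSON integer as a string keeps all digits intact.
as_str = json.loads(raw, parse_int=str)["question_id"]

print(precision_lost)
print(as_str)
```

Treating the ID as an opaque string also avoids any accidental arithmetic or float formatting on a value that is only ever used as a key.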
Full Changelog: v0.0.3...v0.0.4