OpenCompass v0.3.0
The OpenCompass team is thrilled to announce the release of OpenCompass v0.3.0! This release brings a variety of new features, enhancements, and bug fixes to improve your experience.
🌟 Highlights
- Support for OpenAI ChatCompletion
- Updated Model Support List
- Support Dataset Automatic Download
- Support `pip install opencompass`
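After running `pip install opencompass`, one quick way to confirm which version ended up in the environment is to query the package metadata. The snippet below is a minimal sketch that uses only the Python standard library. The helper name `installed_version` is hypothetical and not part of OpenCompass, and the distribution name `opencompass` is taken from the install command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "opencompass") -> Optional[str]:
    """Return the installed version string of a distribution, or None if absent.

    `installed_version` is a hypothetical helper for illustration only.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("opencompass is not installed; try: pip install opencompass")
    else:
        print(f"opencompass {found} is installed")
```

If the printed version is older than 0.3.0, `pip install --upgrade opencompass` should pull in this release.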
🚀 New Features
- Support for CompassBench Checklist Evaluation
- PR #1339 by @bittersweet1999
- Adding support for Doubao API
- PR #1218 by @LeavittLang
- Support for ModelScope Datasets
- PR #1289 by @wangxingjun778
📖 Documentation
🐛 Bug Fixes
- Fix Typing and Typo
- Fix Lint Issues
- PR #1334 by @DseidLi
- Fix Summary Error in subjective.py
⚙ Enhancements and Refactors
- Upgrade Default Math `pred_postprocessor`
- Fix Path and Folder Updates
- Update Get Data Path for LCBench and HumanEval
🔗 Full Change Logs
- [Fix] Change abbr for arenahard dataset by @bittersweet1999 in #1302
- [Fix] Force register by @Leymore in #1311
- [Fix] add bc for alignbench summarizer by @bittersweet1999 in #1306
- [Fix] update Faq by @bittersweet1999 in #1313
- [Fix] Fix rouge evaluator of rolebench_zh by @xu-song in #1322
- [Doc] Update NeedleBench Docs by @DseidLi in #1330
- [Fix] Fix typing and typo by @xu-song in #1331
- [Fix] Fix lint by @DseidLi in #1334
- [Feature] support compassbench Checklist evaluation by @bittersweet1999 in #1339
- Add compassbench wiki&math part by @liushz in #1342
- Compassbench v1_3 subjective evaluation by @MaiziXiao in #1341
- [Fix] Update path and folder by @tonysy in #1344
- Upgrade default math `pred_postprocessor` by @xu-song in #1340
- commit inference ppl datasets by @Quehry in #1315
- CompassBench subjective summarizer added by @MaiziXiao in #1349
- Fix MathBench Generation Config by @liushz in #1351
- [Update] Update model support list by @bittersweet1999 in #1353
- [Update] update Subeval demo config by @bittersweet1999 in #1358
- [Fix] Fix the summary error in subjective.py by @WenjinW in #1363
- [Fix] Support HF models deployed with an OpenAI-compatible API. by @heya5 in #1352
- update docs by @Leymore in #1318
- [Feature] Make NeedleBench available on HF by @DseidLi in #1364
- 【bug fix】: Remove extra ampersands. by @baymax591 in #1365
- [Fix] minor update wildbench by @kleinzcy in #1335
- Adding support for Doubao API by @LeavittLang in #1218
- [Fix] origin_prompt should be None in llm-compression task by @mqy004 in #1225
- Calm dataset by @pengbo807 in #1287
- Add `en` and `zh` groups to longbench summarizer; Fix longbench overall score by @xu-song in #1216
- [Revert] "Calm dataset (#1287)" by @bittersweet1999 in #1366
- Charm by @jxd0712 in #1230
- Support ModelScope datasets by @wangxingjun778 in #1289
- [Feature] Update pip install by @tonysy in #1324
- add support for hf_pulse_7b by @QXY716 in #1255
- [Fix] Update get_data_path for LCBench and HumanEval by @tonysy in #1375
- [Bug] Fix bug in turbomind by @tonysy in #1377
- [Fix] Fix version mismatch of CIBench by @kleinzcy in #1380
- [Fix] Fix InternLM2.5-7B-Chat-1M config by @DseidLi in #1383
- [Feature] Support import configs/models/summarizers from whl by @tonysy in #1376
- Calm dataset by @pengbo807 in #1385
- [Feature] Support OpenAI ChatCompletion by @tonysy in #1389
- [Fix] Fix slurm env by @tonysy in #1392
- [Fix] Fix CaLM import by @tonysy in #1395
- [Bump] Bump version for v0.3.0 by @tonysy in #1398
🎉 Welcome New Contributors
- @MaiziXiao made their first contribution in #1341
- @Quehry made their first contribution in #1315
- @WenjinW made their first contribution in #1363
- @heya5 made their first contribution in #1352
- @LeavittLang made their first contribution in #1218
- @pengbo807 made their first contribution in #1287
- @wangxingjun778 made their first contribution in #1289
- @QXY716 made their first contribution in #1255
Full Changelog: 0.2.6...0.3.0