OpenCompass v0.3.1
The OpenCompass team is thrilled to announce the release of OpenCompass v0.3.1!
🌟 Highlights
- 🚀 Added support for pip installation and updated the README and evaluation demo.
- 🐛 Fixed dataset loading issues, including modelscope and Longbench.
- ⚙️ Enhanced auto-download features for datasets.
🚀 New Features
- 🆕 Introduced support for Ruler datasets.
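- 🆕 Enhanced model compatibility, including model support for 'huggingface_above_v4_33' when using '-a'.
- 🆕 Improved dataset handling, with auto-download support for more datasets such as FOFO and MT-Bench-101.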
📖 Documentation
- 📚 Updated README to reflect the latest changes.
- 📚 Improved documentation for dataset loading procedures.
🐛 Bug Fixes
- 🐞 Resolved modelscope dataset load issues.
- 🐞 Corrected evaluation scores for the Lawbench dataset.
- 🐞 Fixed dataset bugs for CommonsenseQA and Longbench.
⚙ Enhancements and Refactors
- 🔧 Retained first and last halves of prompts to avoid max_seq_len issues.
- 🔧 Updated Compassbench to v1.3.
- 🔧 Switched to Python runner for single GPU operations.
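
The prompt-retention change listed above can be illustrated with a minimal sketch. This is not the OpenCompass implementation: the function name `keep_head_and_tail` and its handling of odd or very small limits are assumptions made for illustration. It only shows the general idea of keeping the first and last halves of an over-long token sequence and dropping the middle so the result fits within `max_seq_len`.

```python
from typing import List, Sequence


def keep_head_and_tail(token_ids: Sequence[int], max_seq_len: int) -> List[int]:
    """Hypothetical helper: keep the start and end of an over-long prompt.

    Sequences that already fit are returned unchanged. Otherwise the first
    and last max_seq_len // 2 tokens are kept and the middle is dropped.
    """
    if max_seq_len < 0:
        raise ValueError("max_seq_len must be non-negative")

    ids = list(token_ids)
    if len(ids) <= max_seq_len:
        return ids

    half = max_seq_len // 2
    if half == 0:
        # A limit of 0 or 1 leaves no room for a tail; note that ids[-0:]
        # would return the whole list, so this case is handled separately.
        return ids[:max_seq_len]

    # Head and tail together hold at most max_seq_len tokens.
    return ids[:half] + ids[-half:]


if __name__ == "__main__":
    print(keep_head_and_tail(list(range(10)), 4))  # prints [0, 1, 8, 9]
```

Keeping both ends, rather than only the beginning, is useful when the instruction or question sits at the end of a long prompt, as is common in long-context benchmarks.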
🎉 Welcome New Contributors
- 🙌 @Yunnglin for fixing the modelscope dataset load problem.
- 🙌 @changyeyu for addressing max_seq_len issues with prompt handling.
- 🙌 @seetimee for updates to openai_api.py.
- 🙌 @HariSeldon0 for adding the scicode dataset.
What's Changed
- [Fix] Fix modelscope dataset load problem by @Yunnglin in #1406
- [Fix] Fix the issue where scores are negative in the Lawbench dataset evaluation (#1402) by @yaoyingyy in #1403
- [Doc] Update README by @tonysy in #1404
- Retain first and last halves of prompts to avoid max_seq_len issues by @changyeyu in #1373
- [UPDATE] Compassbench v1.3 by @MaiziXiao in #1396
- [Fix] longbench dataset load fix by @MaiziXiao in #1422
- [Fix] Sub summarizer order fix by @bittersweet1999 in #1426
- [Update] Support auto-download of FOFO/MT-Bench-101 by @tonysy in #1423
- [Bug] Commonsenseqa dataset fix by @MaiziXiao in #1425
- [Feature] Add abbr for rolebench dataset by @xu-song in #1431
- [Feature] Add Ruler datasets by @MaiziXiao in #1310
- [Fix] Fix openai api tiktoken bug for api server by @liushz in #1433
- Update openai_api.py by @seetimee in #1438
- [Feature] Add model support for 'huggingface_above_v4_33' when using '-a' by @liushz in #1430
- Add scicode by @HariSeldon0 in #1417
- [Doc] Update Readme by @MaiziXiao in #1439
- [Fix] Update option postprocess & mathbench language summarizer by @liushz in #1413
- [ci] add command testcase into daily testcase by @zhulinJulia24 in #1447
- [Feature] Switch to python runner for single GPU by @xu-song in #1308
- [Fix] Update SciCode and Gemma model by @tonysy in #1449
- [Bump] Bump version to 0.3.1 by @tonysy in #1450
Full Changelog: 0.3.0...0.3.1
Thank you for your continued support and contributions to OpenCompass!