0.3.6

@MaiziXiao MaiziXiao released this 19 Nov 03:54
· 4 commits to main since this release
ff831b1

The OpenCompass team is thrilled to announce the release of OpenCompass v0.3.6!

🌟 Highlights
✨ This release brings several updates and new features that enhance the functionality and performance of OpenCompass. Notable additions include long context evaluation for base models, the BABILong dataset, and MuSR dataset evaluation. We are also excited to welcome new contributors to our community.

🚀 New Features
🔥 Added long context evaluation for base models, expanding the scope of model assessments.
🔥 Introduced the BABILong dataset, enriching the resources available for research and development.
🔥 Added MuSR dataset evaluation, which evaluates language models on multistep soft reasoning tasks.

📖 Documentation
📚 Updated documentation to reflect the latest changes and features, ensuring that users can easily integrate these updates into their workflows.

🐛 Bug Fixes
🛠 Fixed issues with first_option_postprocess to improve reliability.
🛠 Addressed bugs in the PR testing process to ensure smoother contributions from the community.
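
For readers unfamiliar with this kind of helper, a first-option postprocessor reduces a free-form model answer to a single multiple-choice letter. The sketch below only illustrates that general idea. The function name `first_option_sketch`, its two regular expressions, and the default option set are assumptions made for illustration, not the OpenCompass implementation of `first_option_postprocess`.

```python
import re


def first_option_sketch(text: str, options: str = "ABCD") -> str:
    """Hypothetical helper: return the first option letter found in `text`.

    Illustration only; this is not OpenCompass's first_option_postprocess.
    Returns an empty string when no option letter is found.
    """
    # 1) Prefer an explicit statement such as "answer is B" or "Answer: (D)".
    explicit = re.search(rf"[Aa]nswer\s*(?:is|:)\s*\(?([{options}])\b", text)
    if explicit:
        return explicit.group(1)

    # 2) Fall back to the first standalone option letter anywhere in the text.
    #    This can misfire on ordinary words like the article "A".
    standalone = re.search(rf"\b([{options}])\b", text)
    return standalone.group(1) if standalone else ""


if __name__ == "__main__":
    print(first_option_sketch("The answer is B, because the premise holds."))
    print(first_option_sketch("I think (C) is right."))
```

The fallback step shows why this kind of extraction is fragile: any capitalized standalone letter in the option set is accepted, so real postprocessors typically try a longer list of more specific patterns first.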

⚙ Enhancements and Refactors
🔧 Implemented auto-download for FollowBench, streamlining the setup process for new users.
🔧 Refined the CI/CD pipeline, including daily tests and baseline updates, to maintain high standards of quality and performance.

🎉 Welcome New Contributors
👏 We are delighted to welcome three new contributors who have made valuable contributions to this release:

  1. @MCplayerFromPRC for pushing InternTrain evaluation differences.
  2. @DespairL for adding single LoRA adapter support for vLLM inference.
  3. @abrohamLee for contributing MuSR Dataset Evaluation.

We hope you enjoy this new release and find it useful for your projects. Your feedback is always welcome and helps us improve OpenCompass continuously. Thank you for being part of our community! 🌟

Full Changelog: 0.3.5...0.3.6