Releases: open-compass/opencompass
0.3.6
The OpenCompass team is thrilled to announce the release of OpenCompass v0.3.6!
🌟 Highlights
✨ This release brings several updates and new features that enhance the functionality and performance of OpenCompass. Notable additions include support for long-context evaluation of base models, the introduction of the BABILong dataset, and the addition of MuSR dataset evaluation. We have also welcomed new contributors to our community, whom we are excited to introduce.
🚀 New Features
🔥 Added long context evaluation for base models, expanding the scope of model assessments.
🔥 Introduced the BABILong dataset, enriching the resources available for research and development.
🔥 Added MuSR dataset evaluation, which assesses language models on multistep soft reasoning tasks.
📖 Documentation
📚 Updated documentation to reflect the latest changes and features, ensuring that users can easily integrate these updates into their workflows.
🐛 Bug Fixes
🛠 Fixed issues with `first_option_postprocess` to improve reliability.
🛠 Addressed bugs in the PR testing process to ensure smoother contributions from the community.
⚙ Enhancements and Refactors
🔧 Implemented auto-download for FollowBench, streamlining the setup process for new users.
🔧 Refined the CI/CD pipeline, including daily tests and baseline updates, to maintain high standards of quality and performance.
🎉 Welcome New Contributors
👏 We are delighted to welcome three new contributors who have made valuable contributions to this release:
- @MCplayerFromPRC for addressing InternTrain evaluation differences.
- @DespairL for adding single LoRA adapter support for vLLM inference.
- @abrohamLee for contributing MuSR Dataset Evaluation.
We hope you enjoy this new release and find it useful for your projects. Your feedback is always welcome and helps us improve OpenCompass continuously. Thank you for being part of our community! 🌟
Full Changelog: 0.3.5...0.3.6
0.3.5
The OpenCompass team is thrilled to announce the release of OpenCompass v0.3.5!
🌟 Highlights
- 🚀 Introduction of two new datasets: CMO&AIME, expanding our evaluation capabilities.
- 📖 Several updates to our documentation, ensuring clearer guidance for all users.
- ⚙ Several enhancements and refactoring efforts to make our codebase more robust and maintainable.
🚀 New Features
- 🆕 Added support for the CMO&AIME datasets, broadening the range of tasks we can evaluate models on. (#1610)
- 🆕 Introduced the `CompassArenaSubjectiveBench`, a new benchmark for subjective evaluations. (#1645)
- 🆕 Added configurations for the lmdeploy DeepSeek model, enhancing compatibility with cutting-edge technologies. (#1656)
📖 Documentation
- 📚 Updated the documentation to reflect the latest changes and improvements, making it easier than ever to navigate and understand. (#1655)
🐛 Bug Fixes
- 🔧 Fixed issues with the `ruler_16k_gen` component, ensuring more accurate and reliable results. (#1643)
- 🔧 Resolved an error in the `get_loglikelihood` function when using lmdeploy as the accelerator. (#1659)
- 🔧 Addressed problems with automatic downloads for certain datasets, streamlining the user experience. (#1652)
⚙ Enhancements and Refactors
- 💪 Enhanced the summarizer configurations for models, improving the efficiency and effectiveness of summarization tasks. (#1600)
- 💪 Added new model configurations, keeping up with the latest advancements in machine learning. (#1653)
- 💪 Updated the WildBench maximum sequence length, allowing for better handling of longer input sequences. (#1648)
- 💪 Updated the Needlebench OSS path, ensuring smoother data access and processing. (#1651)
- 💪 Improved the `mmmlu_lite` dataloader, optimizing data loading processes. (#1658)
🎉 Welcome New Contributors
- 👏 A warm welcome to @jnanliu, who has made their first contribution by adding the CMO&AIME datasets! (#1610)
For a complete overview of all changes, please refer to the full changelog: 0.3.4...0.3.5
0.3.4
The OpenCompass team is thrilled to announce the release of OpenCompass v0.3.4!
🎉 OpenCompass v0.3.4 brings major enhancements including new benchmarks, improved documentation, and numerous bug fixes.
🌈 Notable features include support for new datasets and the integration of lmdeploy pipeline API.
🔧 Support for New Datasets:
- Addition of GaoKaoMath Dataset for Evaluation.
- Support for MMMLU & MMMLU-lite Benchmark.
- Integration of Judgerbench and reorganization of subeval.
- Support for LiveCodeBench.
📝 Output Format Enhancements:
- Support for printing and saving results as markdown format tables.
🔧 Pipeline and Integration Improvements:
- Integration of lmdeploy pipeline API.
- Update of TurboMindModel through integration of lmdeploy pipeline API.
- Removal of the prefix `bos_token` from messages when using lmdeploy as the accelerator.
🛠️ Miscellaneous Enhancements:
- Updates to the common summarizer regex extraction.
- Internal humaneval postprocess addition and updates.
📖 Documentation Updates
🐛 Bug Fixes
🎉 Welcome New Contributors
👋 New Contributors Joined the Team:
@BobTsang1995 - Contributed support for MMMLU & MMMLU-lite Benchmark.
@noemotiovon - Provided NPU support fixes.
@changlan - Fixed RULER datasets.
@BIGWangYuDong - Added support for printing and saving results as markdown format tables.
Thank you to all contributors who have made this release possible. For a complete list of changes, please see the full changelog linked below.
Full Changelog: 0.3.3...0.3.4
0.3.3
🌟 OpenCompass v0.3.3 Release Log
The OpenCompass team is thrilled to announce the release of OpenCompass v0.3.3!
🚀 New Features
- 🔧 Added support for the SciCode summarizer configuration.
- 🛠 Introduced support for internal Followbench.
- 🔧 Updated models and configurations for MathBench & WikiBench under FullBench.
- 🛠 Enhanced support for OpenAI O1 models and Qwen2.5 Instruct.
- 🔧 Included a postprocess function for custom models.
- 🛠 Added InternTrain feature for broader model training scenarios.
📖 Documentation
- 📚 Updated the README with the latest information on how to use OpenCompass effectively.
🐛 Bug Fixes
- 🔧 Fixed issues with the link-check workflow and wildbench.
- 🛠 Resolved errors in partitioning and corrected typos throughout the codebase.
- 🔧 Addressed compatibility issues with lmdeploy interface type changes.
- 🛠 Fixed the followbench dataset configuration and token settings.
⚙ Enhancements and Refactors
- 🛠 Enhanced support for verbose output in OpenAI API interactions.
- 🔧 Updated maximum output length configurations for multiple models.
- 🛠 Improved handling of the "begin section" in meta_template for better parsing.
- 🔧 Added a common summarizer for qabench and expanded test coverage for various models.
🎉 Welcome New Contributors
👋 We'd like to extend a warm welcome to our new contributors who have made their first contributions to OpenCompass:
- @x54-729 introduced InternTrain.
- @chuanyangjin helped correct typos.
- @cuauty added support for reasoning from BaiLing LLM.
Thank you to all our contributors for making this release possible!
Full Changelog: 0.3.2.post1...0.3.3
0.3.2.post1
What's Changed
- [Fix]Init import fix by @MaiziXiao in #1500
- [Bump] Bump version to 0.3.2.post1 by @MaiziXiao in #1502
Full Changelog: 0.3.2...0.3.2.post1
0.3.2
The OpenCompass team is thrilled to announce the release of OpenCompass v0.3.2!
🚀 New Features
- 🛠 Added `extra_body` support for OpenAISDK and introduced proxy URL support when connecting to OpenAI's API (see the sketch after this list).
- 🗂 Included auto-download functionality for MMLU-Pro, NeedleBench, LongBench, and other datasets.
- 🤝 Integrated support for the Rendu API.
- 🧪 Added a model postprocess function.
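For orientation, here is a minimal sketch of what an API model entry using these options might look like. The wrapper name comes from this release note; the field names (`key`, `openai_api_base`, `extra_body`, and so on) are assumptions for illustration only, so check the configs shipped with your OpenCompass version before copying.

```python
# Hypothetical OpenAISDK model config using extra_body and a proxy base URL.
# Field names other than `type` and `path` are assumptions; verify against your installed version.
from opencompass.models import OpenAISDK

models = [
    dict(
        abbr='gpt-4o-via-proxy',
        type=OpenAISDK,
        path='gpt-4o',                                      # model name sent to the API
        key='ENV',                                          # assumed convention: read the key from the environment
        openai_api_base='https://my-proxy.example.com/v1',  # assumed field for the proxy URL
        extra_body={'enable_thinking': False},              # assumed to be forwarded verbatim in the request body
        max_out_len=2048,
        batch_size=8,
    ),
]
```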
📖 Documentation
- 📜 Updated the README file for better clarity and guidance.
🐛 Bug Fixes
- 🛠 Fixed CLI evaluation for multiple models.
- 🛠 Updated requirements to resolve dependency issues.
- 🛠 Corrected configurations for the Llama model series.
- 🛠 Addressed bad cases and added environment information to improve testing.
⚙ Enhancements and Refactors
- 🛠 Made OPENAI_API_BASE compatible with OpenAI's default environment settings.
- 🛠 Optimized SciCode for improved performance.
- 🛠 Added an `api_key` attribute to TurboMindAPIModel.
- 🛠 Implemented fixes and improvements to the CI test environment, including baselines for vllm.
🎉 Welcome New Contributors
- 👋 @cpa2001 contributed the addition of `icl_sliding_k_retriever.py` and updates to `__init__.py`.
- 👋 @gyin94 made the OPENAI_API_BASE compatible with OpenAI's default environment.
- 👋 @chengyingshe added an `api_key` attribute to TurboMindAPIModel.
- 👋 @yanzeyu supported the integration of the Rendu API.
Full Changelog: 0.3.1...0.3.2
OpenCompass v0.3.1
The OpenCompass team is thrilled to announce the release of OpenCompass v0.3.1!
🌟 Highlights
- 🚀 Added support for pip installation and updated the README and evaluation demo.
- 🐛 Fixed various dataset loading issues.
- ⚙️ Enhanced auto-download features for datasets.
🚀 New Features
- 🆕 Introduced support for Ruler datasets.
- 🆕 Enhanced model compatibility.
- 🆕 Improved dataset handling, with auto-download support for various datasets.
📖 Documentation
- 📚 Updated README to reflect the latest changes.
- 📚 Improved documentation for dataset loading procedures.
🐛 Bug Fixes
- 🐞 Resolved modelscope dataset load issues.
- 🐞 Corrected evaluation scores for the Lawbench dataset.
- 🐞 Fixed dataset bugs for CommonsenseQA and Longbench.
⚙ Enhancements and Refactors
- 🔧 Retained first and last halves of prompts to avoid max_seq_len issues.
- 🔧 Updated Compassbench to v1.3.
- 🔧 Switched to Python runner for single GPU operations.
🎉 Welcome New Contributors
- 🙌 @Yunnglin for fixing modelscope dataset load problem.
- 🙌 @changyeyu for addressing max_seq_len issues with prompt handling.
- 🙌 @seetimee for updates to openai_api.py.
- 🙌 @HariSeldon0 for adding the scicode dataset.
What's Changed
- [Fix] Fix modelscope dataset load problem by @Yunnglin in #1406
- [Fix] the issue where scores are negative in the Lawbench dataset evaluation(#1402) by @yaoyingyy in #1403
- [Doc] Update README by @tonysy in #1404
- Retain first and last halves of prompts to avoid max_seq_len issues by @changyeyu in #1373
- [UPDATE] Compassbench v1.3 by @MaiziXiao in #1396
- [Fix] longbench dataset load fix by @MaiziXiao in #1422
- [Fix] Sub summarizer order fix by @bittersweet1999 in #1426
- [Update] Support auto-download of FOFO/MT-Bench-101 by @tonysy in #1423
- [Bug] Commonsenseqa dataset fix by @MaiziXiao in #1425
- [Feature] Add abbr for rolebench dataset by @xu-song in #1431
- [Feature] Add Ruler datasets by @MaiziXiao in #1310
- [Fix] Fix openai api tiktoken bug for api server by @liushz in #1433
- Update openai_api.py by @seetimee in #1438
- [Feature] Add model support for 'huggingface_above_v4_33' when using '-a' by @liushz in #1430
- Add scicode by @HariSeldon0 in #1417
- [Doc] Update Readme by @MaiziXiao in #1439
- [Fix] Update option postprocess & mathbench language summarizer by @liushz in #1413
- [ci] add commond testcase into daily testcase by @zhulinJulia24 in #1447
- [Feature] Switch to python runner for single GPU by @xu-song in #1308
- [Fix] Update SciCode and Gemma model by @tonysy in #1449
- [Bump] Bump version to 0.3.1 by @tonysy in #1450
Full Changelog: 0.3.0...0.3.1
Thank you for your continued support and contributions to OpenCompass!
OpenCompass v0.3.0
The OpenCompass team is thrilled to announce the release of OpenCompass v0.3.0! This release brings a variety of new features, enhancements, and bug fixes to improve your experience.
🌟 Highlights
- Support for OpenAI ChatCompletion
- Updated Model Support List
- Support Dataset Automatic Download
- Support `pip install opencompass` (see the sketch after this list)
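To give a feel for the pip-installed workflow, here is a minimal sketch of a model config written against the installed package. The `HuggingFacewithChatTemplate` wrapper is mentioned elsewhere in these notes; the exact field names below are assumptions, so treat this as an illustration rather than a verified recipe and refer to the bundled example configs.

```python
# Minimal sketch: after `pip install opencompass`, model configs are plain Python dicts
# built on wrappers importable from the installed package. Field names are illustrative.
from opencompass.models import HuggingFacewithChatTemplate

models = [
    dict(
        abbr='internlm2-chat-7b-hf',
        type=HuggingFacewithChatTemplate,
        path='internlm/internlm2-chat-7b',  # Hugging Face repo id
        max_out_len=1024,
        batch_size=8,
        run_cfg=dict(num_gpus=1),           # assumed field for per-model resource settings
    ),
]
```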
🚀 New Features
- Support for CompassBench Checklist Evaluation (PR #1339 by @bittersweet1999)
- Adding support for Doubao API (PR #1218 by @LeavittLang)
- Support for ModelScope Datasets (PR #1289 by @wangxingjun778)
📖 Documentation
🐛 Bug Fixes
- Fix Typing and Typo
- Fix Lint Issues (PR #1334 by @DseidLi)
- Fix Summary Error in subjective.py
⚙ Enhancements and Refactors
- Upgrade Default Math `pred_postprocessor`
- Fix Path and Folder Updates
- Update Get Data Path for LCBench and HumanEval
🔗 Full Change Logs
- [Fix] Change abbr for arenahard dataset by @bittersweet1999 in #1302
- [Fix] Force register by @Leymore in #1311
- [Fix] add bc for alignbench summarizer by @bittersweet1999 in #1306
- [Fix] update Faq by @bittersweet1999 in #1313
- [Fix] Fix rouge evaluator of rolebench_zh by @xu-song in #1322
- [Doc] Update NeedleBench Docs by @DseidLi in #1330
- [Fix] Fix typing and typo by @xu-song in #1331
- [Fix] Fix lint by @DseidLi in #1334
- [Feature] support compassbench Checklist evaluation by @bittersweet1999 in #1339
- Add compassbench wiki&math part by @liushz in #1342
- Compassbench v1_3 subjective evaluation by @MaiziXiao in #1341
- [Fix] Update path and folder by @tonysy in #1344
- Upgrade default math `pred_postprocessor` by @xu-song in #1340
- commit inference ppl datasets by @Quehry in #1315
- CompassBench subjective summarizer added by @MaiziXiao in #1349
- Fix MathBench Generation Config by @liushz in #1351
- [Update] Update model support list by @bittersweet1999 in #1353
- [Update] update Subeval demo config by @bittersweet1999 in #1358
- [Fix] Fix the summary error in subjective.py by @WenjinW in #1363
- [Fix] Support HF models deployed with an OpenAI-compatible API. by @heya5 in #1352
- update docs by @Leymore in #1318
- [Feature] Make NeedleBench available on HF by @DseidLi in #1364
- 【bug fix】: Remove extra ampersands. by @baymax591 in #1365
- [Fix] minor update wildbench by @kleinzcy in #1335
- Adding support for Doubao API by @LeavittLang in #1218
- [Fix] origin_prompt should be None in llm-compression task by @mqy004 in #1225
- Calm dataset by @pengbo807 in #1287
- Add `en` and `zh` groups to longbench summarizer; Fix longbench overall score by @xu-song in #1216
- [Revert] "Calm dataset (#1287)" by @bittersweet1999 in #1366
- Charm by @jxd0712 in #1230
- Support ModelScope datasets by @wangxingjun778 in #1289
- [Feature] Update pip install by @tonysy in #1324
- add support for hf_pulse_7b by @QXY716 in #1255
- [Fix] Update get_data_path for LCBench and HumanEval by @tonysy in #1375
- [Bug] Fix bug in turbomind by @tonysy in #1377
- [Fix] Fix version mismatch of CIBench by @kleinzcy in #1380
- [Fix] Fix InternLM2.5-7B-Chat-1M config by @DseidLi in #1383
- [Feature] Support import configs/models/summarizers from whl by @tonysy in #1376
- Calm dataset by @pengbo807 in #1385
- [Feature] Support OpenAI ChatCompletion by @tonysy in #1389
- [Fix] Fix slurm env by @tonysy in #1392
- [Fix] Fix CaLM import by @tonysy in #1395
- [Bump] Bump version for v0.3.0 by @tonysy in #1398
🎉 Welcome New Contributors
- @MaiziXiao made their first contribution in #1341
- @Quehry made their first contribution in #1315
- @WenjinW made their first contribution in #1363
- @heya5 made their first contribution in #1352
- @LeavittLang made their first contribution in #1218
- @pengbo807 made their first contribution in #1287
- @wangxingjun778 made their first contribution in #1289
- @QXY716 made their first contribution in #1255
Full Changelog: 0.2.6...0.3.0
OpenCompass v0.2.6
The OpenCompass team is thrilled to announce the release of OpenCompass v0.2.6!
🌟 Highlights
- No noteworthy highlights.
🚀 New Features
📖 Documentation
🐛 Bug Fixes
- #1221 Resolve release version installation and import issues
- #1228 Fix pip version issues
- #1282 Update MathBench summarizer & fix cot setting
⚙ Enhancements and Refactors
- #1284 Reorganize subjective eval
🎉 Welcome New Contributors
- @mqy004, @sefira, @Zor-X-L and @baymax591 made their first contributions. Welcome to the OpenCompass community!
🔗 Full Change Logs
- [Fix] fix summarizer by @bittersweet1999 in #1217
- Fix the issue where opencompass.cli.main cannot be imported after installing the release version by @mqy004 in #1221
- MT-Bench-101 by @sefira in #1215
- [Feature] add dataset Fofo by @bittersweet1999 in #1224
- [Fix] fix pip version by @bittersweet1999 in #1228
- add ",<2.0.0" to "numpy>=1.23.4" in requirements/runtime.txt, as pand… by @Zor-X-L in #1267
- Support wildbench by @kleinzcy in #1266
- Add doc for accelerator function by @liushz in #1252
- flash attn installation in daily testcase by @zhulinJulia24 in #1272
- Update mtbench101.py by @sefira in #1276
- [Sync] Sync with internal codes 2024.06.28 by @Leymore in #1279
- Update MathBench summarizer & fix cot setting by @liushz in #1282
- NPU adaptation by @baymax591 in #1250
- [ci] update daily testcase by @zhulinJulia24 in #1285
- [Feature] Add InternLM2.5 by @tonysy in #1286
- [Feat] Update owners for issues by @tonysy in #1293
- [Refactor] Reorganize subjective eval by @bittersweet1999 in #1284
- [Doc] quick start swap tabs by @Leymore in #1263
Full Changelog: 0.2.5...0.2.6
OpenCompass v0.2.5
The OpenCompass team is thrilled to announce the release of OpenCompass v0.2.5!
🌟 Highlights
- Simplify the huggingface / vllm / lmdeploy model wrappers: `meta_template` no longer needs to be hand-crafted in model configs (see the sketch after this list).
- Introduce evaluation results README in ~20 dataset config folders.
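For context, this is the kind of hand-written `meta_template` that the simplified wrappers make unnecessary. The structure below is an illustrative sketch of the old style, not a config copied from the repo; the new wrappers derive the chat markup for you (e.g. via the tokenizer's `apply_chat_template`, added in #1098 below).

```python
# Illustrative old-style meta_template describing chat markup by hand.
# With the simplified huggingface / vllm / lmdeploy wrappers, model configs can omit this
# and let the wrapper infer the markup (e.g. from the tokenizer's chat template).
meta_template = dict(
    round=[
        dict(role='HUMAN', begin='<|im_start|>user\n', end='<|im_end|>\n'),
        dict(role='BOT', begin='<|im_start|>assistant\n', end='<|im_end|>\n', generate=True),
    ],
)
```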
🚀 New Features
- #1065 Add LLaMA-3 Series Configs
- #1048 Add TheoremQA with 5-shot
- #1094 Support Math evaluation via judgemodel
- #1080 Add gpqa prompt from simple_evals, openai
- #1074 Add mmlu prompt from simple_evals, openai
- #1123 Add Qwen1.5 MoE 7b and Mixtral 8x22b model configs
📖 Documentation
- #1053 Update readme
- #1102 Update NeedleInAHaystack Docs
- #1110 Update README.md
- #1205 Remove --no-batch-padding and Use --hf-num-gpus
🐛 Bug Fixes
- #1036 Update setup.py install_requires
- #1051 Fixed the issue caused by the repeated loading of the vLLM model
- #1043 fix multiround
- #1070 Fix sequential runner
- #1079 Fix Llama-3 meta template
⚙ Enhancements and Refactors
- #1163 enable HuggingFacewithChatTemplate with --accelerator via cli
- #1104 fix prompt template
- #1109 Update performance of common benchmarks
🎉 Welcome New Contributors
- @liuwei130, @IcyFeather233, @VVVenus1212, @binary-husky, @dmitrysarov, @eltociear, @acylam, @lfy79001, @JuhaoLiang1997, @yaoyingyy, and @jxd0712 made their first contributions. Welcome to the OpenCompass community!
🔗 Full Change Logs
- [Fix] Update setup.py install_requires by @Leymore in #1036
- add ChemBench by @liuwei130 in #1032
- [Fix] logger.error -> logger.debug in OpenAI by @Leymore in #1050
- [Sync] Bump version to 0.2.4 by @Leymore in #1052
- [Doc] Update readme by @tonysy in #1053
- [fix]Fixed the issue caused by the repeated loading of VLLM model dur… by @IcyFeather233 in #1051
- [Sync] Sync with internal code 2024.04.19 by @Leymore in #1064
- [Fix] fix multiround by @bittersweet1999 in #1043
- [Feature] Add LLaMA-3 Series Configs by @Leymore in #1065
- [Feature] Add TheoremQA with 5-shot by @Leymore in #1048
- [Fix] Fix sequential runner by @Leymore in #1070
- Add lmdeploy tis python backend model by @ispobock in #1014
- Fix Llama-3 meta template by @liushz in #1079
- Add humaneval prompt from simple_evals, openai by @jingmingzhuo in #1076
- [Feature] Support Math evaluation via judgemodel by @bittersweet1999 in #1094
- [Feature] support arenahard evaluation by @bittersweet1999 in #1096
- Update CIBench by @kleinzcy in #1089
- [Feature] Add gpqa prompt from simple_evals, openai by @Francis-llgg in #1080
- [Deperecate] Remove multi-modal related stuff by @kennymckormick in #1072
- add vllm get_ppl by @VVVenus1212 in #1003
- fix: python path bug by @binary-husky in #1063
- fix output typing, change mutable list to immutable tuple by @dmitrysarov in #989
- [Doc] Update NeedleInAHaystack Docs by @DseidLi in #1102
- [Feature] add support for Flames datasets by @Yggdrasill7D6 in #1093
- adapt to lmdeploy v0.4.0 by @lvhan028 in #1073
- [Fix] fix prompt template by @bittersweet1999 in #1104
- [Fix] Fix Math Evaluation with Judge Model Evaluator & Add README by @liushz in #1103
- [Update] Update performance of common benchmarks by @tonysy in #1109
- [Fix] fix cmb dataset by @bittersweet1999 in #1106
- [Docs] Update README.md by @eltociear in #1110
- [Feature] Adding support for LLM Compression Evaluation by @acylam in #1108
- [Fix] remove redundant pre-commit check by @Leymore in #891
- fix LightllmApi workers bug by @helloyongyang in #1113
- [Feature] Add mmlu prompt from simple_evals, openai by @Leymore in #1074
- [Feature] update drop dataset from openai simple eval by @kleinzcy in #1092
- add mgsm datasets by @Yggdrasill7D6 in #1081
- [Fix] Fix AGIEval chinese sets by @xu-song in #972
- S3Eval Dataset by @lfy79001 in #916
- [Feature] Add AceGPT-MMLUArabic benchmark by @JuhaoLiang1997 in #1099
- [Fix] fix links by @bittersweet1999 in #1120
- [Fix] Fix NeedleBench Summarizer Typo by @DseidLi in #1125
- [Feature] Add Qwen1.5 MoE 7b and Mixtral 8x22b model configs by @acylam in #1123
- [Sync] Update accelerator by @Leymore in #1122
- [Fix] fix alpacaeval while add caching path by @bittersweet1999 in #1139
- [Fix] fix multiround by @bittersweet1999 in #1146
- [Fix] Fix Needlebench Summarizer by @DseidLi in #1143
- [Feature] Add huggingface apply_chat_template by @Leymore in #1098
- [Feat] Support dataset_suffix check for mixed configs by @xu-song in #973
- [Format] Add some config lints by @Leymore in #892
- [Sync] Sync with internal codes 2024.05.14 by @Leymore in #1156
- [Fix] fix arenahard summarizer by @bittersweet1999 in #1154
- [Fix] use ProcessPoolExecutor during mbpp eval by @Leymore in #1159
- [Fix] Update stop_words in huggingface_above_v4_33 by @Leymore in #1160
- Update accelerator by @liushz in #1152
- [Feat] enable HuggingFacewithChatTemplate with --accelerator via cli by @Leymore in #1163
- update test workflow by @zhulinJulia24 in #1167
- [Sync] Sync with internal codes 2024.05.17 by @Leymore in #1171
- add dependency in daily test workflow by @zhulinJulia24 in #1173
- [Sync] Sync with internal codes 2024.05.21.1 by @Leymore in #1175
- Update MathBench by @liushz in #1176
- [Fix] fix template by @bittersweet1999 in #1178
- Fix a bug in drop_gen.py by @kleinzcy in #1191
- [Fix] temporary files using tempfile by @yaoyingyy in #1186
- [Fix] add support for lmdeploy api judge by @bittersweet1999 in #1193
- [Fix] fix length by @bittersweet1999 in #1180
- support CHARM (https://github.com/opendatalab/CHARM) reasoning tasks by @jxd0712 in #1190
- [Feat] Update charm summary by @Leymore in #1194
- Update accelerator by @liushz in #1195
- [Sync] S...