Skip to content

Latest commit

 

History

History
16 lines (9 loc) · 855 Bytes

Procedures for Restarting an Incomplete Evaluation.md

File metadata and controls

16 lines (9 loc) · 855 Bytes

Resume Evaluation

In practice, evaluation processes may occasionally fail due to various technical issues, such as model request network interruptions, system failures, or unexpected errors. To ensure the continuity and integrity of the evaluation, follow these steps to effectively restart and complete the process.

Example Scenario:

If the evaluation process for the GPT-4o-Audio model with the dataset my_dataset fails due to a model request network interruption, the last checkpoint is saved in the res/gpt4o_audio/last_res.jsonl file.

To restart the evaluation process, follow these steps:

python audio_evals/main.py --dataset my_dataset --model gpt4o_audio --resume res/gpt4o_audio/last_res.jsonl

This command will resume the evaluation from the last saved checkpoint, ensuring that the process continues seamlessly.