Working toward a functional state

Signed-off-by: Dan McPherson <[email protected]>
instructlab · Jun 20, 2024 · f822e67
1 parent ad020ef
commit f822e67

Showing 16 changed files with 364 additions and 1,180 deletions.
diff --git a/.spellcheck-en-custom.txt b/.spellcheck-en-custom.txt
@@ -2,5 +2,13 @@
 # make spellcheck-sort
 # Please keep this file sorted:
 # SPDX-License-Identifier: Apache-2.0
-eval
 Tatsu
+TODO
+eval
+gpt
+instructlab
+jsonl
+justfile
+openai
+vllm
+
diff --git a/README.md b/README.md
@@ -5,4 +5,38 @@
 ![Release](https://img.shields.io/github/v/release/instructlab/eval)
 ![License](https://img.shields.io/github/license/instructlab/eval)
 
-Python library for Evaluation
+Python Library for Evaluation
+
+## MT-Bench Testing Steps
+
+TODO: Figure out the right vLLM version. The latest release fails with `openai.types` not found.
+
+```shell
+pip install vllm==0.3.3
+```
+
+When starting the server, pass `--tensor-parallel-size <NUM GPUS>` and, if needed, raise `--max-model-len` to increase the context length.
+
+```shell
+python -m vllm.entrypoints.openai.api_server --model instructlab/granite-7b-lab
+```
+
+```shell
+OPENAI_API_KEY="NO_API_KEY" python3 test_gen_answers.py
+```
+
+Results are written to `eval_output/mt_bench/model_answer/instructlab/granite-7b-lab.jsonl`.
+
+To run the judge model with vLLM, make sure you pass `--served-model-name gpt-4`.
+
+When starting the server, pass `--tensor-parallel-size <NUM GPUS>` and, if needed, raise `--max-model-len` to increase the context length.
+
+```shell
+python -m vllm.entrypoints.openai.api_server --model instructlab/granite-7b-lab --served-model-name gpt-4
+```
+
+```shell
+OPENAI_API_KEY="NO_API_KEY" python3 test_judge_answers.py
+```
+
+Results are written to `eval_output/mt_bench/model_judgment/gpt-4_single.jsonl`.
diff --git a/data/mt_bench/model_answer/instructlab/granite-7b-lab.jsonl b/data/mt_bench/model_answer/instructlab/granite-7b-lab.jsonl
diff --git a/data/mt_bench/model_judgment/gpt-4_single.jsonl b/data/mt_bench/model_judgment/gpt-4_single.jsonl
diff --git a/requirements.txt b/requirements.txt
@@ -2,9 +2,9 @@
 FastChat
 shortuuid
 openai<1.0.0
-anthropic
 psutil
 torch
 transformers
 accelerate
 pandas
+pandas-stubs
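
Note on the README steps added in this commit: both test scripts write their results as JSON Lines files, so a quick sanity check after each run is to load the file and see how many records it holds. The following is a minimal sketch, assuming only that each non-empty line is one JSON value. The helper names `load_jsonl` and `summarize` are hypothetical and not part of this repository, and no field names are assumed because the diff does not show the record schema.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Return one parsed object per non-empty line of a JSON Lines file."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_number}: invalid JSON ({exc})") from exc
    return records


def summarize(records):
    """Count records and collect top-level keys without assuming a schema."""
    keys = set()
    for record in records:
        if isinstance(record, dict):
            keys.update(record)
    return {"count": len(records), "keys": sorted(keys)}


if __name__ == "__main__":
    # Path taken from the README text in this commit.
    answers = Path("eval_output/mt_bench/model_answer/instructlab/granite-7b-lab.jsonl")
    if answers.exists():
        print(summarize(load_jsonl(answers)))
    else:
        print(f"{answers} not found; run test_gen_answers.py first")
```

Pointing the same helpers at `eval_output/mt_bench/model_judgment/gpt-4_single.jsonl` should work for the judgment output as well, since nothing here depends on specific fields.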