[🌐 Project Page] [📖 Paper] [🤗 VGQA] [🤗 VGen]
- [2024.09.19] 🔥 VGBench is accepted to EMNLP 2024 main conference!
- [2024.07.15] 🔥 We released the VGQA dataset.
- Clone this repository and navigate to the VGBench folder
git clone https://github.com/vgbench/VGBench.git
cd VGBench
- Create the file `keys.py` to load your API keys into the program. The file `keys.py` should be formatted as below.
For each type of model, you can list as many keys as you want to speed up the evaluation process. Be careful that there is a difference in syntax between Azure OpenAI keys and official OpenAI keys.
keys = {
    "gpt-4v": [
        dict(
            GPT_KEY='SAMPLE-AZURE-API-KEY',
            GPT_ENDPOINT='https://sample-azure-endpoint.openai.azure.com/'
        ),
        dict(
            GPT_KEY='SAMPLE-OPENAI-KEY',
            BASE_URL='https://api.openai.com/v1/'
        )
    ],
    "gpt-4": [
        dict(
            GPT_KEY='SAMPLE-AZURE-API-KEY',
            GPT_ENDPOINT='https://sample-azure-endpoint.openai.azure.com/'
        ),
        dict(
            GPT_KEY='SAMPLE-OPENAI-KEY',
            BASE_URL='https://api.openai.com/v1/'
        )
    ],
    "gpt-35-turbo": [
        dict(
            GPT_KEY='SAMPLE-AZURE-API-KEY',
            GPT_ENDPOINT='https://sample-azure-endpoint.openai.azure.com/'
        ),
        dict(
            GPT_KEY='SAMPLE-OPENAI-KEY',
            BASE_URL='https://api.openai.com/v1/'
        )
    ]
}
You can use vLLM to host various open-source large language models behind an OpenAI-compatible API and then add their endpoints to the list above.
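Listing several keys lets the evaluator spread requests across endpoints. A minimal round-robin sketch of that idea (the `next_credentials` helper is illustrative, not part of VGBench; the dict mirrors the sample above):

```python
import itertools

# Same shape as the keys.py sample above (placeholder values).
keys = {
    "gpt-4": [
        dict(GPT_KEY='SAMPLE-AZURE-API-KEY',
             GPT_ENDPOINT='https://sample-azure-endpoint.openai.azure.com/'),
        dict(GPT_KEY='SAMPLE-OPENAI-KEY',
             BASE_URL='https://api.openai.com/v1/'),
    ],
}

# Hypothetical rotation over the configured credentials: each request
# takes the next entry, cycling back to the first when exhausted.
key_cycle = itertools.cycle(keys["gpt-4"])

def next_credentials():
    return next(key_cycle)

first = next_credentials()   # the Azure entry
second = next_credentials()  # the official OpenAI entry
third = next_credentials()   # wraps around to the Azure entry again
```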
- Download the dataset. You can download the whole dataset from 🤗 VGQA. The files should be converted to JSON format and placed at `data/{VECTOR_GRAPHICS_FORMAT}/final_dataset_{QUESTION_TYPE}.json`, where `VECTOR_GRAPHICS_FORMAT` should be replaced with one of `svg`, `tikz`, `graphviz`, and `QUESTION_TYPE` should be replaced with the name of a specific question type, such as `color`.
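As a concrete sanity check of the layout, the path can be built from the two placeholders like this (the `dataset_path` helper is illustrative, not a VGBench function):

```python
import os

def dataset_path(fmt: str, q_type: str) -> str:
    # fmt must be one of 'svg', 'tikz', 'graphviz';
    # q_type is a question type such as 'color'.
    assert fmt in ('svg', 'tikz', 'graphviz')
    return os.path.join('data', fmt, f'final_dataset_{q_type}.json')

path = dataset_path('svg', 'color')
# → 'data/svg/final_dataset_color.json'
```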
- Run `evaluate.py`.
$ python3 evaluate.py -h
usage: evaluate.py [-h] --q-type Q_TYPE --prompt-type {zero-shot,few-shot,zero-shot-cot} --format {svg,tikz,graphviz} --model
{gpt-4,gpt-35-turbo,Mixtral-8x7B-Instruct-v0.1,Llama-3-8B-Instruct-262k,Llama-3-70B-Instruct-Gradient-262k} [--min MIN] [--max MAX] [--single]
Evaluate VGQA Dataset
options:
-h, --help show this help message and exit
--q-type Q_TYPE the type of questions
--prompt-type {zero-shot,few-shot,zero-shot-cot}
--format {svg,tikz,graphviz}
the format of the vector graphics
--model {gpt-4,gpt-35-turbo,Mixtral-8x7B-Instruct-v0.1,Llama-3-8B-Instruct-262k,Llama-3-70B-Instruct-Gradient-262k}
the model used to evaluate
--min MIN filter the lower bound of the length of the vector graphics
--max MAX filter the upper bound of the length of the vector graphics
--single
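The help output above corresponds to an argparse interface roughly like the following (a reconstruction for illustration, not the actual `evaluate.py` source):

```python
import argparse

# Hypothetical reconstruction of evaluate.py's CLI from its --help output.
parser = argparse.ArgumentParser(description='Evaluate VGQA Dataset')
parser.add_argument('--q-type', required=True, help='the type of questions')
parser.add_argument('--prompt-type', required=True,
                    choices=['zero-shot', 'few-shot', 'zero-shot-cot'])
parser.add_argument('--format', required=True,
                    choices=['svg', 'tikz', 'graphviz'],
                    help='the format of the vector graphics')
parser.add_argument('--model', required=True,
                    choices=['gpt-4', 'gpt-35-turbo',
                             'Mixtral-8x7B-Instruct-v0.1',
                             'Llama-3-8B-Instruct-262k',
                             'Llama-3-70B-Instruct-Gradient-262k'],
                    help='the model used to evaluate')
parser.add_argument('--min', type=int,
                    help='lower bound on the length of the vector graphics')
parser.add_argument('--max', type=int,
                    help='upper bound on the length of the vector graphics')
parser.add_argument('--single', action='store_true')

# Example invocation: evaluate color questions on the SVG subset with GPT-4.
args = parser.parse_args(['--q-type', 'color', '--prompt-type', 'zero-shot',
                          '--format', 'svg', '--model', 'gpt-4'])
```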
- Download the 🤗 VGen dataset and put it into the `data` folder.
- Run `generate.py` to generate vector graphics using the large language model of your interest.
$ python3 generate.py -h
usage: generate.py [-h] --format {svg,tikz,graphviz} --model {gpt-4,gpt-35-turbo,Mixtral-8x7B-Instruct-v0.1}
Generate Vector Graphics
options:
-h, --help show this help message and exit
--format {svg,tikz,graphviz}
the format of the vector graphics
--model {gpt-4,gpt-35-turbo,Mixtral-8x7B-Instruct-v0.1}
the model used to generate
- Run `render_all_generated.py` with the same parameters to render all generated vector graphics.
- Use `evaluate_clip_score.py` and `evaluate_fid_score.py` to evaluate the generated results.
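At their core, CLIP-score-style metrics compare an embedding of the rendered image against a reference embedding via cosine similarity. A minimal, library-free sketch of that comparison (the toy vectors stand in for CLIP features; this is not the actual script's implementation):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors:
    # dot(a, b) / (|a| * |b|), in [-1, 1]; 1.0 means identical direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy embeddings standing in for CLIP image/text features.
image_emb = [0.6, 0.8, 0.0]
text_emb = [0.6, 0.8, 0.0]
print(cosine_similarity(image_emb, text_emb))  # identical vectors → 1.0
```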
If you find VGBench useful for your research and applications, please cite using this BibTeX:
@article{zou2024vgbench,
title={VGBench: Evaluating Large Language Models on Vector Graphics Understanding and Generation},
author={Zou, Bocheng and Cai, Mu and Zhang, Jianrui and Lee, Yong Jae},
journal={arXiv},
year={2024}
}