
VGBench: Evaluating Large Language Models on Vector Graphics Understanding and Generation

Bocheng Zou*, Mu Cai*, Jianrui Zhang, Yong Jae Lee

💥 News

  • [2024.09.19] 🔥 VGBench is accepted to EMNLP 2024 main conference!
  • [2024.07.15] 🔥 We released the VGQA dataset.

🛠️ Install

  1. Clone this repository and navigate to the VGBench folder:
git clone https://github.com/vgbench/VGBench.git
cd VGBench
  2. Create the file keys.py to load your API keys into the program. keys.py should be formatted as below.

For each type of model, you can list as many keys as you want to speed up the evaluation process. Note that the syntax differs between Azure OpenAI keys and official OpenAI keys.

keys = {
    "gpt-4v": [
        dict(
            GPT_KEY='SAMPLE-AZURE-API-KEY',
            GPT_ENDPOINT='https://sample-azure-endpoint.openai.azure.com/'
        ),
        dict(
            GPT_KEY='SAMPLE-OPENAI-KEY',
            BASE_URL='https://api.openai.com/v1/'
        )
    ],
    "gpt-4": [
        dict(
            GPT_KEY='SAMPLE-AZURE-API-KEY',
            GPT_ENDPOINT='https://sample-azure-endpoint.openai.azure.com/'
        ),
        dict(
            GPT_KEY='SAMPLE-OPENAI-KEY',
            BASE_URL='https://api.openai.com/v1/'
        )
    ],
    "gpt-35-turbo": [
        dict(
            GPT_KEY='SAMPLE-AZURE-API-KEY',
            GPT_ENDPOINT='https://sample-azure-endpoint.openai.azure.com/'
        ),
        dict(
            GPT_KEY='SAMPLE-OPENAI-KEY',
            BASE_URL='https://api.openai.com/v1/'
        )
    ]}

You can use vLLM to host various open-source large language models behind an OpenAI-compatible API and then add their endpoints to the list above.
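For example, a minimal sketch: the model name, port, and placeholder key below are illustrative, and it assumes the OpenAI-style BASE_URL syntax from keys.py above applies unchanged to locally hosted endpoints.

# Launch vLLM's OpenAI-compatible server first (run in a shell):
#   python -m vllm.entrypoints.openai.api_server \
#       --model mistralai/Mixtral-8x7B-Instruct-v0.1 --port 8000
# Then point a keys.py entry at the local endpoint:
keys = {
    "Mixtral-8x7B-Instruct-v0.1": [
        dict(
            GPT_KEY='EMPTY',  # vLLM accepts any key string unless --api-key is set
            BASE_URL='http://localhost:8000/v1/'
        )
    ]
}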

  3. Download the dataset. You can download the whole dataset from 🤗 VGQA. Convert the files to JSON format and place them at data/{VECTOR_GRAPHICS_FORMAT}/final_dataset_{QUESTION_TYPE}.json, where VECTOR_GRAPHICS_FORMAT is one of svg, tikz, or graphviz, and QUESTION_TYPE is the name of a specific question type, such as color. A sanity check is sketched below.
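A minimal sanity check, assuming each converted file is a plain JSON list; the path below uses the svg format and the color question type as an example:

import json

# Hypothetical check that one converted file is in place and parses.
path = "data/svg/final_dataset_color.json"
with open(path) as f:
    questions = json.load(f)
print(f"Loaded {len(questions)} entries from {path}")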

VGQA

Run evaluate.py to evaluate a model on the VGQA dataset:

$ python3 evaluate.py -h
usage: evaluate.py [-h] --q-type Q_TYPE --prompt-type {zero-shot,few-shot,zero-shot-cot} --format {svg,tikz,graphviz} --model
                   {gpt-4,gpt-35-turbo,Mixtral-8x7B-Instruct-v0.1,Llama-3-8B-Instruct-262k,Llama-3-70B-Instruct-Gradient-262k} [--min MIN] [--max MAX] [--single]

Evaluate VGQA Dataset

options:
  -h, --help            show this help message and exit
  --q-type Q_TYPE       the type of questions
  --prompt-type {zero-shot,few-shot,zero-shot-cot}
  --format {svg,tikz,graphviz}
                        the format of the vector graphics
  --model {gpt-4,gpt-35-turbo,Mixtral-8x7B-Instruct-v0.1,Llama-3-8B-Instruct-262k,Llama-3-70B-Instruct-Gradient-262k}
                        the model used to evaluate
  --min MIN             filter the lower bound of the length of the vector graphics
  --max MAX             filter the upper bound of the length of the vector graphics
  --single
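For example, to run the color questions on SVG with GPT-4 under zero-shot prompting (a hypothetical invocation composed from the documented flags):

$ python3 evaluate.py --q-type color --format svg --prompt-type zero-shot --model gpt-4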

VGen

  1. Download the 🤗 VGen dataset and put it into the data folder.

  2. Run generate.py to generate vector graphics using the large language model of your choice.

$ python3 generate.py -h
usage: generate.py [-h] --format {svg,tikz,graphviz} --model {gpt-4,gpt-35-turbo,Mixtral-8x7B-Instruct-v0.1}

Generate Vector Graphics

options:
  -h, --help            show this help message and exit
  --format {svg,tikz,graphviz}
                        the format of the vector graphics
  --model {gpt-4,gpt-35-turbo,Mixtral-8x7B-Instruct-v0.1}
                        the model used to generate
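For example (a hypothetical invocation composed from the documented flags):

$ python3 generate.py --format svg --model gpt-4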
  3. Run render_all_generated.py with the same parameters to render all of the generated vector graphics.

  4. Use evaluate_clip_score.py and evaluate_fid_score.py to evaluate the generated results. A standalone sketch of the CLIP-score metric follows this list.
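For reference, CLIP score measures the scaled cosine similarity between CLIP embeddings of a rendered image and its text prompt. The following is an independent sketch of that metric, not the repository's evaluate_clip_score.py; the checkpoint, file path, and prompt are illustrative, and it uses the Hugging Face transformers CLIP implementation.

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Load a standard CLIP checkpoint (this particular checkpoint is an assumption).
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Hypothetical rendered image and its source prompt.
image = Image.open("rendered/example.png")
prompt = "a red circle above a blue square"

inputs = processor(text=[prompt], images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    out = model(**inputs)

# CLIP score: scaled cosine similarity of the normalized embeddings, clamped at 0.
img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
score = 100 * (img * txt).sum(dim=-1).clamp(min=0)
print(f"CLIP score: {score.item():.2f}")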

Citation

If you find VGBench useful for your research and applications, please cite using this BibTeX:

@article{zou2024vgbench,
    title={VGBench: Evaluating Large Language Models on Vector Graphics Understanding and Generation},
    author={Zou, Bocheng and Cai, Mu and Zhang, Jianrui and Lee, Yong Jae},
    journal={arXiv},
    year={2024}
}
