feat: add inference and evaluation script with dataset transformations #733
base: main
Conversation
tokenizer_vocab_file_path="/mnt/input_data_dir/pretrained_models/OPT/dependencies/gpt2-vocab.json",
tokenizer_merges_file_path="/mnt/input_data_dir/pretrained_models/OPT/dependencies/gpt2-merges.txt",
If Metaseq has a standardized path for the vocab and merges files then we'll need to replace them here :) If not, we might need to remove the default value.
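For illustration, a minimal sketch of how the hard-coded default could be made overridable rather than baked in; the helper name `resolve_tokenizer_files` and the environment variables are assumptions, not part of this PR:

```python
import os
from typing import Optional, Tuple

# Hard-coded mount paths from the diff above, kept only as a last-resort fallback.
DEFAULT_VOCAB = "/mnt/input_data_dir/pretrained_models/OPT/dependencies/gpt2-vocab.json"
DEFAULT_MERGES = "/mnt/input_data_dir/pretrained_models/OPT/dependencies/gpt2-merges.txt"

def resolve_tokenizer_files(vocab: Optional[str] = None,
                            merges: Optional[str] = None) -> Tuple[str, str]:
    """Prefer explicit arguments, then environment variables, then the defaults."""
    vocab = vocab or os.environ.get("GPT2_VOCAB_PATH", DEFAULT_VOCAB)
    merges = merges or os.environ.get("GPT2_MERGES_PATH", DEFAULT_MERGES)
    return vocab, merges
```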
left some comments :)
Dockerfile (Outdated)
RUN pip install \
    aim==3.16.2 \
    py-rouge==1.1 \
    rouge_score==0.1.2 \
    parlai==1.7.1 \
    evaluate==0.4.0

ENV NLTK_DATA="/usr/share/nltk_data"
RUN python -c "import nltk; nltk.download('punkt', download_dir='${NLTK_DATA}')"
This likely isn't the correct place to make this change.
This is only a snippet from our whole Dockerfile; it adds the evaluation libraries.
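For context, a quick sketch of how these packages and the punkt data end up being used at evaluation time; the metric choice and example strings below are illustrative, not taken from the PR:

```python
import nltk
import evaluate  # Hugging Face `evaluate`; its "rouge" metric is backed by rouge_score

# punkt should already be present because the Dockerfile downloads it into NLTK_DATA.
nltk.data.find("tokenizers/punkt")

predictions = ["the cat sat on the mat"]
references = ["a cat was sitting on the mat"]

rouge = evaluate.load("rouge")
print(rouge.compute(predictions=predictions, references=references))
```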
from metaseq.data.datasets.types import CommonDatasetConfiguration, DatasetConfiguration, DatasetConfigurationTeacherGenerated, DatasetModelConfig, DatasetModelHooks, DatasetTeacherGeneratedDataHooks, IdentityDict

# Visual diagram of where hooks/functions are called during inference or data generation
# https://excalidraw.com/#json=zoAk_TdynBHQnP9vZufGm,ekcVg_HqiF79cAp58_HKRQ
This visualization may be important for understanding where the hooks are called.
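For readers without access to the diagram, a rough sketch of the hook pattern it describes; this is not the actual `DatasetModelHooks` API from `metaseq.data.datasets.types`, just an assumed shape:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical hook container; the real types live in metaseq.data.datasets.types
# and may differ in names and structure.
@dataclass
class DatasetHooks:
    pre_inference: List[Callable[[dict], dict]] = field(default_factory=list)
    post_inference: List[Callable[[dict], dict]] = field(default_factory=list)

def run_inference(example: dict, hooks: DatasetHooks,
                  model: Callable[[dict], dict]) -> dict:
    """Apply pre-hooks, call the model, then apply post-hooks to its output."""
    for hook in hooks.pre_inference:
        example = hook(example)
    output = model(example)
    for hook in hooks.post_inference:
        output = hook(output)
    return output
```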
Issue
Solutions
Add script for model inference and evaluation
Add mappings between datasets and the pipeline configuration of eval libraries, metrics, and transformation functions (see the mapping sketch after this list)
Add necessary evaluation libraries and re-implement some metrics
Add PromptGenerator to create few-shot prompts based on configuration using Jinja templates (see the prompt sketch after this list)
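A minimal sketch of what the dataset-to-pipeline mapping could look like; the dataset names, metric names, and transforms are placeholders rather than the PR's actual configuration:

```python
from typing import Callable, Dict, List, NamedTuple

class EvalPipeline(NamedTuple):
    metrics: List[str]               # metric names understood by the eval libraries
    transform: Callable[[str], str]  # applied to model output before scoring

# Placeholder mapping from dataset name to its evaluation pipeline.
DATASET_EVAL_CONFIG: Dict[str, EvalPipeline] = {
    "summarization_dataset": EvalPipeline(metrics=["rouge"], transform=str.strip),
    "qa_dataset": EvalPipeline(metrics=["exact_match", "f1"], transform=str.lower),
}
```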
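And a minimal sketch of few-shot prompt rendering with a Jinja template; the template text and field names are illustrative, and the real PromptGenerator configuration may differ:

```python
from jinja2 import Template

# Illustrative few-shot template: k solved examples followed by the query.
FEW_SHOT_TEMPLATE = Template(
    "{% for shot in shots %}"
    "Question: {{ shot.question }}\nAnswer: {{ shot.answer }}\n\n"
    "{% endfor %}"
    "Question: {{ query }}\nAnswer:"
)

shots = [
    {"question": "2 + 2?", "answer": "4"},
    {"question": "Capital of France?", "answer": "Paris"},
]
print(FEW_SHOT_TEMPLATE.render(shots=shots, query="3 + 5?"))
```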
This PR is quite large, so it may be hard to make sense of.
Originally it was only going to be inference.py and a few other modifications, but I kept bringing in missing dependencies to avoid gaps and it grew a lot 🤔
Testing
Did not test 😔
Related to: #726
Much of this work was done by @sahajgg, @tupini07, and @anselmwang 🙏