Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial skeleton for Evaluator classes and exceptions #6

Merged
merged 11 commits into from
Jun 17, 2024

Conversation

nathan-weinberg
Copy link
Member

This PR introduces the Evaluator parent class and the MMLUEvaluator and MTBenchEvaluator child classes

It also introduces the EvalError parent exception class for all eval custom exceptions and the ModelNotFoundError child exception class

Copy link
Member

@alinaryan alinaryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM - are PR-MMLU and PR-Bench going to live in the MMLU and MT-Bench child classes respectively or is the idea to add them in later as their own child classes?

@nathan-weinberg
Copy link
Member Author

This LGTM - are PR-MMLU and PR-Bench going to live in the MMLU and MT-Bench child classes respectively or is the idea to add them in later as their own child classes?

Good question! I think we could go with either approach - either extending the current child classes or adding new child classes but in the same files.

Could you do an explicit approval? I've set the repo so two are required for merge. @alimaredia same for you.

@danmcp and @xukai92 feel free to weigh-in as well - this foundation will define subsequent work.

Copy link
Member

@alinaryan alinaryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I lean towards new child classes since the inputs/dependencies differ across PR benchmarks. You could add them in this PR or it can be handled in a follow-up PR by anyone.
Thanks for working on this!!

@danmcp
Copy link
Member

danmcp commented Jun 14, 2024

This LGTM - are PR-MMLU and PR-Bench going to live in the MMLU and MT-Bench child classes respectively or is the idea to add them in later as their own child classes?
@danmcp and @xukai92 feel free to weigh-in as well - this foundation will define subsequent work.

It would be good to understand the signatures before deciding. But I'm pretty sure they should be new classes. The other thing to consider whether the modeling makes more sense as a is-a or has-a relationship. Has-a might give some useful flexibility.

@nathan-weinberg
Copy link
Member Author

This LGTM - are PR-MMLU and PR-Bench going to live in the MMLU and MT-Bench child classes respectively or is the idea to add them in later as their own child classes?
@danmcp and @xukai92 feel free to weigh-in as well - this foundation will define subsequent work.

It would be good to understand the signatures before deciding. But I'm pretty sure they should be new classes. The other thing to consider whether the modeling makes more sense as a is-a or has-a relationship. Has-a might give some useful flexibility.

I pushed up two additional child classes based on the specs in the doc - lmk your thoughts!

src/instructlab/eval/mmlu.py Outdated Show resolved Hide resolved
src/instructlab/eval/mtbench.py Outdated Show resolved Hide resolved
src/instructlab/eval/evaluator.py Outdated Show resolved Hide resolved
Signed-off-by: Nathan Weinberg <[email protected]>
@nathan-weinberg nathan-weinberg merged commit 6632002 into instructlab:main Jun 17, 2024
7 checks passed
@nathan-weinberg nathan-weinberg deleted the skeleton branch June 17, 2024 17:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants