Synth eval #24

Merged: 2 commits into main from synth-eval on Dec 20, 2024

Conversation

matthewcoole
Collaborator

Closes #23 by adding an LLM-as-a-judge evaluation step to the synthetic test set generation, which checks that synthetic questions:

  1. Are clear
  2. Reference the dataset they are based on, if they are not general questions
  3. Have an appropriate ground truth to evaluate against.

Synthetic questions that do not meet these criteria are removed from the evaluation set.
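
As a rough illustration only (not the exact code added in this PR), a judge step along these lines could look like the sketch below; the prompt wording, judge model, OpenAI client, and record fields (`question`, `dataset`, `ground_truth`) are assumptions for the example.

```python
# Hypothetical sketch of an LLM-as-a-judge filter for synthetic questions.
# Prompt wording, model choice, and record fields are assumptions, not this
# repository's implementation.
import json

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

JUDGE_INSTRUCTIONS = (
    "You are reviewing a synthetic test question for a RAG evaluation set. "
    "Reply with a JSON object containing a boolean field 'keep' and a string "
    "field 'reason'. Keep the question only if: (1) it is clear, (2) it "
    "references the dataset it is based on unless it is a general question, "
    "and (3) its ground truth is an appropriate answer to evaluate against."
)


def judge_question(record: dict) -> bool:
    """Return True if the LLM judge says the synthetic question should be kept."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # judge model is an assumption
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": JUDGE_INSTRUCTIONS},
            {
                "role": "user",
                "content": (
                    f"Question: {record['question']}\n"
                    f"Dataset: {record.get('dataset', 'N/A')}\n"
                    f"Ground truth: {record['ground_truth']}"
                ),
            },
        ],
    )
    verdict = json.loads(response.choices[0].message.content)
    return bool(verdict.get("keep", False))


def filter_synthetic_set(records: list[dict]) -> list[dict]:
    """Drop synthetic questions that fail any of the judge's criteria."""
    return [r for r in records if judge_question(r)]
```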

@matthewcoole matthewcoole merged commit 8016933 into main Dec 20, 2024
1 check passed
@matthewcoole matthewcoole deleted the synth-eval branch December 20, 2024 08:47

answer_correctness: 0.5072450137294304
answer_relevancy: 0.5050751636311221
context_recall: 0.5142385736312046
context_precision: 0.4634856011771453
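
For reference, metric names like these match the ragas library, so scores of this kind would typically come from a `ragas.evaluate` run; a rough sketch follows, assuming a Hugging Face `Dataset` with the usual ragas columns (exact column names and API details may differ by ragas version, and the placeholder rows are illustrative only).

```python
# Hypothetical example of producing the four scores above with the ragas
# library; dataset columns and API details are assumptions, and an LLM API
# key is assumed to be configured for ragas' judge/embedding calls.
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import (
    answer_correctness,
    answer_relevancy,
    context_precision,
    context_recall,
)

# Each row pairs a (filtered) synthetic question with the pipeline's answer,
# the retrieved contexts, and the ground truth it was generated with.
eval_data = Dataset.from_dict(
    {
        "question": ["<synthetic question>"],
        "answer": ["<pipeline answer>"],
        "contexts": [["<retrieved context>"]],
        "ground_truth": ["<ground truth answer>"],
    }
)

result = evaluate(
    eval_data,
    metrics=[answer_correctness, answer_relevancy, context_recall, context_precision],
)
print(result)
```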

Development

Successfully merging this pull request may close these issues.

Poor questions in synthetic test set