How to upload EvaluationResults via API/SDK? #3634

Open
DevonPeroutky opened this issue Feb 7, 2025 · 1 comment
Labels
enhancement New feature or request

Comments

DevonPeroutky commented Feb 7, 2025

I have an existing set of evaluations that I would like to publish to Weave, but they don't fit the Evaluation workflow outlined in the documentation, for example:

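import asyncio
import weave

# examples, match_score1, and model are defined elsewhere
evaluation = weave.Evaluation(dataset=examples, scorers=[match_score1])
asyncio.run(evaluation.evaluate(model))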

For various reasons, I need to execute my predictions and scoring outside of weave.Evaluation. I can wrap my prediction in a weave.op, but the scoring for this evaluation is formal verification, which involves spinning up a Docker container and then writing results to a local file, so I can't simply pass a dataset, model, and scorers to Evaluation.

Is there a way to create an evaluation and submit EvaluationResults via the SDK, outside of Evaluation(...).evaluate?

@gtarpenning
Member

Hi! Thanks for the feature request. This will be brought up for internal triage (I know it is already in the backlog). In the meantime there may be ways around this, such as writing the results of your evaluation to disk and then running a small simulation script (with weave ops) that reads those values in its scorer functions and "predicts" just by looking at the source of truth.
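A rough sketch of that replay idea, under some loud assumptions: the externally produced results live in a JSON Lines file with one object per example, and the field names `example_id`, `prediction`, and `verified` are made up for illustration, as are the helper names `load_results`, `build_replay`, and `publish`. Only `weave.init`, `weave.op`, and `weave.Evaluation` are existing Weave calls. The scorer parameter is named `output` here, which may need to be `model_output` on older Weave versions.

```python
import json
from pathlib import Path


def load_results(path):
    """Read one JSON object per line (hypothetical layout, see above)."""
    rows = []
    for line in Path(path).read_text().splitlines():
        if line.strip():
            rows.append(json.loads(line))
    return rows


def build_replay(rows):
    """Return (dataset, predict, verified) that replay the stored results."""
    by_id = {row["example_id"]: row for row in rows}
    # The dataset only carries the key; everything else is looked up.
    dataset = [{"example_id": row["example_id"]} for row in rows]

    def predict(example_id):
        # "Predicts" by returning the output that was computed earlier.
        return by_id[example_id]["prediction"]

    def verified(example_id, output):
        # Reads the stored verification verdict instead of re-scoring.
        return {"verified": bool(by_id[example_id]["verified"])}

    return dataset, predict, verified


def publish(project, path):
    """Replay stored results through weave.Evaluation.

    Assumes weave is installed and authenticated; nothing here runs the
    original model or the Docker-based verification again.
    """
    import asyncio

    import weave

    weave.init(project)
    dataset, predict, verified = build_replay(load_results(path))
    evaluation = weave.Evaluation(dataset=dataset, scorers=[weave.op(verified)])
    return asyncio.run(evaluation.evaluate(weave.op(predict)))
```

Something like `publish("my-project", "results.jsonl")` would then log the evaluation, with both the project name and the file name being placeholders. Keeping the lookup logic in plain functions means it can be checked locally before anything is sent to Weave.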

Stay tuned for a first-class way of handling this, though. It's a great idea.

gtarpenning added the enhancement (New feature or request) label Feb 8, 2025
2 participants