1.2.0 (2025-01-27)
Features Added
CSV files are now supported as data file inputs with the evaluate() API. The CSV file must have a header row whose column names match the data and target fields used in the evaluate() call, and the file name is passed as the data parameter. The column name 'Conversation' is not yet fully supported for CSV files. A minimal sketch follows.
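A minimal sketch of the CSV input path, assuming the package in question is azure-ai-evaluation and a local file eval_data.csv whose header row provides the "response" and "ground_truth" columns (the file name and columns are illustrative, not taken from these notes):

```python
# Illustrative only: pass a CSV file as the data parameter of evaluate().
# Assumes "eval_data.csv" has columns matching the evaluator's inputs.
from azure.ai.evaluation import evaluate, F1ScoreEvaluator

result = evaluate(
    data="eval_data.csv",                      # CSV now accepted, like JSONL
    evaluators={"f1_score": F1ScoreEvaluator()},
)
print(result["metrics"])
```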
Breaking Changes
ViolenceMultimodalEvaluator, SexualMultimodalEvaluator, SelfHarmMultimodalEvaluator, HateUnfairnessMultimodalEvaluator and ProtectedMaterialMultimodalEvaluator will be removed in the next release.
Bugs Fixed
Removed [remote] extra. This is no longer needed when tracking results in Azure AI Studio.
Fixed `AttributeError: 'NoneType' object has no attribute 'get'` raised when running the simulator with 1000+ results
Fixed the non-adversarial simulator so it runs in task-free mode
Content safety evaluators (violence, self-harm, sexual, hate/unfairness) return the maximum result as the main score when aggregating per-turn evaluations from a conversation into an overall evaluation score. Other conversation-capable evaluators still default to the mean for aggregation (see the illustrative sketch after this list).
Fixed a bug in the non-adversarial simulator sample where tasks were undefined
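As an illustration of the two aggregation rules described above (plain Python, not the SDK's internal code; the per-turn scores are made up):

```python
# Hypothetical per-turn scores from a multi-turn conversation.
per_turn_scores = [1, 3, 6]

# Content safety evaluators: the conversation-level score is the worst turn.
content_safety_score = max(per_turn_scores)                           # -> 6

# Other conversation-capable evaluators: default to the mean of the turns.
other_evaluator_score = sum(per_turn_scores) / len(per_turn_scores)   # -> 3.33...
```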
Other Changes
Changed the minimum required Python version for this package from 3.8 to 3.9
Removed the dependency on the local promptflow service; no promptflow service is started automatically when running an evaluation.
Evaluators internally allow for custom aggregation. However, this causes serialization failures if evaluation runs while the environment variable AI_EVALS_BATCH_USE_ASYNC is set to false (see the note below).
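For reference, AI_EVALS_BATCH_USE_ASYNC is an ordinary process environment variable; the sketch below shows the setting under which the serialization caveat applies (per the note above, the failure mode is only reported when the variable is set to false):

```python
import os

# Setting AI_EVALS_BATCH_USE_ASYNC to "false" is the configuration under which
# evaluators that use custom aggregation can hit serialization failures.
os.environ["AI_EVALS_BATCH_USE_ASYNC"] = "false"
```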