Unitxt 1.14.0 - Faster Unitxt
What's Changed
- Simplify qa example by @yoavkatz in #1234
- allow multiple references for f1 strings metric by @ShirApp in #1225
- Add bluebench recipes by @shachardon in #1237
- Allow templates dicts to be python dicts and fix a bug in the TemplatesDict definition by @elronbandel in #1240
- Deep copy artifacts that are fetched twice by @elronbandel in #1239
- Add ANLS metric to doc_vqa and info_vqa datasets by @alfassy in #1241
- Update README.md by @elronbandel in #1242
- Update version to 1.13.1 by @elronbandel in #1244
- Enhancements to inference engines by @lilacheden in #1243
- add post processor to convert log probs dictionary to probabilities of a specific class by @lilacheden in #1247
- CI for metrics other than main + Bugfix in RetrievalAtK by @lilacheden in #1246
- Add huggingface cache disabling option to unitxt settings by @elronbandel in #1250
- Make F1Strings faster by @elronbandel in #1248
- Fix duplicate column deletion bug in pandas serializer by @elronbandel in #1249
- revived no_deep just to compare performance by @dafnapension in #1254
- fixed scigen post-processor by @csrajmohan in #1253
- Add prediction length metric by @perlitz in #1252
- Fix faithfulness confidence intervals by @matanor in #1257
- Allow role names to be capitalized in SerializeOpenAiFormatDialog by @yoavkatz in #1259
- Accelerate image example 1000X by @elronbandel in #1258
- Fix the empty few-shot target issue when using produce() by @marukaz in #1266
- fix postprocessors in turl_col_type taskcard by @csrajmohan in #1261
- Fix answer correctness confidence intervals by @matanor in #1256
- add BlueBench as a benchmark to the catalog by @shachardon in #1262
- Fix MultipleSourceLoader documentation by @marukaz in #1270
- Ignore unitxt-venv by @marukaz in #1269
- Add mmmu by @elronbandel in #1271
- A fix for a bug in metric pipeline by @elronbandel in #1268
- Added Tablebench taskcard by @csrajmohan in #1273
- Fix missing deep copy in MapInstanceValues by @yoavkatz in #1267
- Add stream name to generation of dataset by @elronbandel in #1276
- Fix demos pool inference by @elronbandel in #1278
- Fix quality github action by @elronbandel in #1281
- add operators for robustness check on tables by @csrajmohan in #1279
- Instruction in SystemFormat demo support by @piotrhelm in #1274
- change the max_test_instances of bluebench.recipe.attaq_500 to 100 by @shachardon in #1285
- Add documentation for types and serializers by @elronbandel in #1286
- Add example for image processing with different templates by @elronbandel in #1280
- Integrate metrics team LLMaJ with current unitxt implementation by @lilacheden in #1205
- performance profiler with visualization by @dafnapension in #1255
- Remove split arg to support old hf datasets versions by @elronbandel in #1288
- add post-processors for tablebench taskcard by @csrajmohan in #1289
- recursive copy seems safer here by @dafnapension in #1295
- Fix performance tracking action by @elronbandel in #1296
- try num of instances in nested global scores by @dafnapension in #1282
- Update version to 1.14.0 by @elronbandel in #1298
New Contributors
- @alfassy made their first contribution in #1241
- @piotrhelm made their first contribution in #1274
Full Changelog: 1.13.0...1.14.0