diff --git a/README.md b/README.md
index df5bf686..b726da97 100644
--- a/README.md
+++ b/README.md
@@ -26,10 +26,10 @@ AlpacaEval provides the following:
 - [**Human evaluation data**](#data-release): 20K human preferences between a given and reference model
   on the [AlpacaFarm](https://github.com/tatsu-lab/alpaca_farm/tree/main)
   evaluation set. 2.5K of these are cross-annotations (4 humans annotating the same 650 examples).
-- [**AlpacaEval dataset**](#data-release): a simplification
+- [**AlpacaEval dataset**](https://huggingface.co/datasets/tatsu-lab/alpaca_eval/blob/main/alpaca_eval.json): a simplification
   of [AlpacaFarm's](https://github.com/tatsu-lab/alpaca_farm/tree/main) evaluation set, where "instructions" and "
   inputs" are merged
-  into one field, and reference outputs are longer.
+  into one field, and reference outputs are longer. [Details here]((#data-release)).
 
 **When to use AlpacaEval?** Our automatic evaluator is a quick and cheap proxy for human evaluation of simple
 instruction-following tasks.