Install the required packages, then install BLEURT from source:
pip3 install -r setup.txt
git clone https://github.com/google-research/bleurt.git
cd bleurt
pip install .
cd ..
The JSON configuration consists of three main sections: model, data, and result.
- Model Configuration
In the model section, provide your API key and specify the model name.
{
"model": {
"watsonxai_token": "Bearer Your-API-Key",
"model_name": "google/flan-t5-xxl"
},
...
}
Replace Your-API-Key with your actual API key.
- Data Configuration
The data section contains parameters related to the input data and evaluation.
{
...
"data": {
"data_path": "path/to/dataset",
"question": "instruction",
"context": "input",
"idea_answer": "output",
"q_num": 5
},
...
}
- data_path: Provide the path to the dataset file (e.g., CoQA.json).
- question: Specify the column or field name in the dataset that contains the questions.
- context: Specify the column or field name in the dataset that contains the context or input information.
- idea_answer: Specify the column or field name in the dataset that contains the ideal answers for evaluation.
- q_num: Specify the number of questions to be evaluated from the dataset.
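As a rough illustration of how the data parameters fit together, the sketch below maps the configured field names onto dataset records and keeps the first q_num of them. It assumes the dataset is a JSON array of objects and that questions are taken from the start of the file; eval_script.py may load or sample the data differently. The helper names load_records and select_examples are hypothetical and are not part of the script.

```python
import json


def load_records(data_path: str) -> list:
    """Read the dataset file (assumed here to be a JSON array of objects)."""
    with open(data_path, encoding="utf-8") as f:
        return json.load(f)


def select_examples(records: list, data_cfg: dict) -> list:
    """Keep the first q_num records and rename the configured fields."""
    selected = records[: data_cfg["q_num"]]
    return [
        {
            "question": rec[data_cfg["question"]],
            "context": rec[data_cfg["context"]],
            "ideal_answer": rec[data_cfg["idea_answer"]],
        }
        for rec in selected
    ]


# In-memory stand-in for a dataset file, so the sketch runs without any I/O.
data_cfg = {
    "data_path": "path/to/dataset",
    "question": "instruction",
    "context": "input",
    "idea_answer": "output",
    "q_num": 2,
}
records = [
    {"instruction": "Q1", "input": "C1", "output": "A1"},
    {"instruction": "Q2", "input": "C2", "output": "A2"},
    {"instruction": "Q3", "input": "C3", "output": "A3"},
]
examples = select_examples(records, data_cfg)
print(len(examples))
```

With a real dataset, records would come from load_records(data_cfg["data_path"]) instead of the inline list.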
- Result Configuration
The result section is used to define the file where the evaluation results will be saved.
{
...
"result": {
"result_file": "path/to/result-file.csv"
}
}
- result_file: Provide the path to the CSV file where the evaluation results will be stored.
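Before running the evaluation, it can help to check that the configuration contains all three sections and their keys. The following is a minimal sketch of such a check; the function name find_missing_keys is hypothetical, and the list of required keys is taken only from the fields described above, so the script itself may accept more or fewer.

```python
import json

# Keys described in this README, grouped by section.
REQUIRED_KEYS = {
    "model": ["watsonxai_token", "model_name"],
    "data": ["data_path", "question", "context", "idea_answer", "q_num"],
    "result": ["result_file"],
}


def find_missing_keys(config: dict) -> list:
    """Return missing sections or keys, e.g. ["result", "data.q_num"]."""
    missing = []
    for section, keys in REQUIRED_KEYS.items():
        if section not in config:
            missing.append(section)
            continue
        for key in keys:
            if key not in config[section]:
                missing.append(f"{section}.{key}")
    return missing


# A complete configuration assembled from the three fragments above.
config = json.loads("""
{
  "model": {
    "watsonxai_token": "Bearer Your-API-Key",
    "model_name": "google/flan-t5-xxl"
  },
  "data": {
    "data_path": "path/to/dataset",
    "question": "instruction",
    "context": "input",
    "idea_answer": "output",
    "q_num": 5
  },
  "result": {
    "result_file": "path/to/result-file.csv"
  }
}
""")
print(find_missing_keys(config))
```

An empty list means every documented key is present.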
Result files are named following the pattern modelname_sourcedata_retriever_reranker_evaluatedOn.csv
Examples: flan-t5-xxl_excludeRedbooks_ES_colBERT_IBMTest.csv, flan-t5-xxl_passageAvailable_NA_NA_QuAC.csv
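The naming pattern can be assembled programmatically. The sketch below is only an illustration: build_result_filename is a hypothetical helper, and two behaviors are inferred from the examples rather than documented, namely that an unused retriever or reranker is written as NA and that a provider prefix such as google/ is dropped from the model name.

```python
def build_result_filename(model_name, source_data, retriever, reranker, evaluated_on):
    """Join the five components with underscores and append .csv."""
    # Assumption: keep only the part after the last "/" in the model name.
    short_model = model_name.split("/")[-1]
    # Assumption: a missing retriever or reranker is recorded as "NA".
    parts = [short_model, source_data, retriever or "NA", reranker or "NA", evaluated_on]
    return "_".join(parts) + ".csv"


print(build_result_filename("google/flan-t5-xxl", "excludeRedbooks", "ES", "colBERT", "IBMTest"))
print(build_result_filename("google/flan-t5-xxl", "passageAvailable", None, None, "QuAC"))
```

The two calls reproduce the example file names listed above.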
Run the evaluation script
python eval_script.py
Evaluation
Once the run finishes, the evaluation results are written to the path specified in result_file.