paper: https://dl.acm.org/doi/pdf/10.1145/3627673.3679975
For experiments with robust04, follow the ir-datasets instructions to set up TREC disks 4 and 5: https://ir-datasets.com/disks45.html#disks45/nocr/trec-robust-2004
Install dependencies.
conda env create -f=envs/denserr.yml
conda activate denserr
Run Sentence Deletion Analysis experiments
python main.py denserr.DamagedAnalyze --local-scheduler
The task result is written to resources/denserr/analyzer/damaged_analyzer/{cache_file_name}.
Then, collect and visualize the ranking shift results:
python scripts/compare_ranking_shifts.py \
resources/denserr/analyzer/damaged_analyzer/{cache_file_name}
You can compare multiple results with this script by passing several result files:
python scripts/compare_ranking_shifts.py \
resources/denserr/analyzer/damaged_analyzer/{BM25_result_filename} \
resources/denserr/analyzer/damaged_analyzer/{ANCE_result_filename} \
resources/denserr/analyzer/damaged_analyzer/{ColBERT_result_filename} \
resources/denserr/analyzer/damaged_analyzer/{DeepCT_result_filename} \
resources/denserr/analyzer/damaged_analyzer/{SPLADE_result_filename}
To change the datasets, models, or other settings used in the experiments, edit conf/param.ini.
For example, to run experiments on MS MARCO Document, set the parameters like this:
[DenseErrConfig]
dataset_name=msmarco-doc
Available datasets are listed in denserr/dataset/load_dataset.py.
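Before launching a long run, it can help to confirm which dataset the config currently selects. The following is a minimal sketch, not part of the repository: the helper name read_dataset_name is hypothetical, and it assumes conf/param.ini is a standard INI file readable by Python's configparser, with the [DenseErrConfig] section and dataset_name key shown above.

```python
import configparser


def read_dataset_name(path="conf/param.ini"):
    """Return dataset_name from [DenseErrConfig], or None if it cannot be found.

    Hypothetical helper for illustration; the project itself may load
    its configuration differently.
    """
    # Disable interpolation so literal '%' characters in values do not raise.
    parser = configparser.ConfigParser(interpolation=None)
    # ConfigParser.read skips missing files and returns the list of files parsed.
    if not parser.read(path):
        return None
    # fallback=None covers both a missing section and a missing key.
    return parser.get("DenseErrConfig", "dataset_name", fallback=None)


if __name__ == "__main__":
    print(read_dataset_name())
```

Run from the repository root, this prints the configured dataset name, or None if the file, section, or key is absent.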
To run Sentence Addition Analysis experiments, execute SentenceInstactAnalyze
python main.py denserr.SentenceInstactAnalyze --local-scheduler
To evaluate retrieval effectiveness, run the Evaluate task:
python main.py denserr.Evaluate --local-scheduler
To resolve dependency issues, we have prepared a separate conda environment yml file for each of ColBERT and SPLADE.
If you want to use these models, create and activate the corresponding conda env.
If you are using pyenv, don't forget to set the appropriate Python version using pyenv local [version].
colbert: envs/colbert.yml
splade: envs/ptsplade.yml
e.g.
conda env create -f=envs/ptsplade.yml
conda activate ptsplade
# for pyenv
pyenv local {your conda version}/envs/ptsplade