
WSDM 2023 Cup: Unbiased Learning to Rank for a Large-Scale Search Dataset from the Baidu Search Engine

1 Background

This repository contains our solution to the WSDM Cup 2023 Unbiased Learning for Web Search task. Building on the Dual Learning Algorithm (DLA), our solution conducts extensive research on unbiased learning to rank and proposes a strategy of using multiple behavioral features for unbiased learning, which greatly improves the performance of the ranking model.

2 Model Overview

The overall framework of the model is shown in Fig.1.

[Fig.1: overall framework of the model]

Taking the data of one search session as an example, as shown in Fig.1, the text features of the document at position n are fed into the relevance model to output the relevance score r, while the other features of the document, which can be used to estimate the examination propensity, are fed into the propensity model to get the propensity score p. Subsequently, p and r are multiplied to obtain the score s of the document at position n being clicked.
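A minimal sketch of this factorization (module shapes and names are our assumptions, not the repository's code):

```python
import torch
import torch.nn as nn

# Stand-ins for the two models in Fig.1 (the real relevance model is BERT-based).
relevance_model = nn.Linear(768, 1)                              # text features -> r
propensity_model = nn.Sequential(nn.Linear(8, 1), nn.Sigmoid())  # behavioral features -> p

text_features = torch.randn(6, 768)    # one group of 6 documents
behavior_features = torch.randn(6, 8)  # position, media type, etc.

r = relevance_model(text_features).squeeze(-1)       # relevance score r
p = propensity_model(behavior_features).squeeze(-1)  # propensity score p
s = p * r                                            # predicted click score s
```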

Note that instead of inputting the entire document list of the session, we pick a group of documents of size 6 from the document list, consisting of 1 clicked document (positive sample) and 5 unclicked documents (negative samples). In addition, only the propensity score of the positive sample is produced by the model, while the propensity score of each negative sample is forced to a fixed value of 0.1, which means that p1, p3, and pn in Fig.1 are 0.1.
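A sketch of this group sampling and the fixed negative propensity (helper names and the positive-first ordering are hypothetical):

```python
import random
import torch

FIXED_NEG_PROPENSITY = 0.1  # constant propensity assigned to unclicked documents

def sample_group(clicked_idx, doc_indices, group_size=6):
    """Return 1 clicked (positive) doc followed by group_size - 1 unclicked ones."""
    negatives = [i for i in doc_indices if i != clicked_idx]
    return [clicked_idx] + random.sample(negatives, group_size - 1)

group = sample_group(clicked_idx=2, doc_indices=list(range(10)))

# Only the positive keeps its model-estimated propensity; negatives are fixed to 0.1.
p_model = torch.rand(len(group))                   # propensities from the model
p = torch.full_like(p_model, FIXED_NEG_PROPENSITY)
p[0] = p_model[0]                                  # index 0 is the positive sample
```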

3 Environment

The environment for the unbiased learning to rank task is the same as for the Pre-training for Web Search task.

4 Quick Start

4.1 Prepare the corpus

Suppose you have downloaded the Web Search Session Data (training data) and annotation_data_0522.txt (test data) from Google Drive. Alternative download links are available for those who cannot access Google Drive.

Note: unzipping the training data may take a long time.

4.2 The Pre-trained Language Model

A pre-trained language model is important for the model in Fig.1. You can download the pre-trained language models we trained from the table below:

| PTM Version | URL |
| --- | --- |
| Bert_Layer12_Head12 | Bert_Layer12_Head12 |
| Bert_Layer12_Head12 wwm | Bert_Layer12_Head12 wwm |
| Bert_Layer24_Head12 | Bert_Layer24_Head12 |
In the table, wwm means that whole word masking was used.

4.3 Directory Structure

After the corpus and the pre-trained language model are ready, you should organize them with the following directory structure:

Your Data Root Path
|——baidu_ultr
|       |——data
|       |        |——part-00000
|       |        |——part-00001
|       |        |——...
|       |——annotate_data
|       |        |——annotation_data_0522.txt
|       |        |——wsdm_test_1.txt
|       |        |——wsdm_test_2_all.txt
|       |——ckpt
|       |        |——submit
|       |        |        |——model_name
|       |        |                |——config.json
|       |        |                |——pytorch.bin
|       |        |——pretrain
|       |        |        |——model_name
|       |        |                |——config.json
|       |        |                |——pytorch.bin
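To sanity-check a downloaded checkpoint against this layout, the weights can be loaded directly with PyTorch (a sketch; the contents of the state dict are our assumption):

```python
import torch

# Load the checkpoint on CPU; the path follows the directory layout above.
state_dict = torch.load(
    "Your Data Root Path/baidu_ultr/ckpt/pretrain/model_name/pytorch.bin",
    map_location="cpu",
)
print(list(state_dict)[:5])  # peek at the first few parameter names
```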

4.4 Training Model

  • Modify data_root in ./pretrain/start.sh to Your Data Root Path
  • Then,
cd pretrain
sh start.sh
  • You can run TensorBoard on output_dir (e.g. `tensorboard --logdir output_dir`) to monitor the training metrics

4.5 Test Model

4.5.1 Single Model

To quickly test the model's performance, you can directly download a model trained by us, whose DCG@10 is 10.25 on annotation_data_0522.txt (the validation dataset).
Then, in ./submit/start.sh, modify data_root to Your Data Root Path, model_name_or_path to the path of the model you want to test, and model_w to 1.
Finally:

cd submit
sh start.sh
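For reference, the DCG@10 figures reported in this README can be computed along the following lines (a sketch using the common linear-gain form; whether the leaderboard uses exactly this convention is an assumption on our part):

```python
import numpy as np

def dcg_at_k(relevances, k=10):
    """DCG@k = sum of rel_i / log2(i + 1) over the top-k documents (ranks 1-indexed)."""
    rel = np.asarray(relevances, dtype=float)[:k]
    return float(np.sum(rel / np.log2(np.arange(2, rel.size + 2))))

# Relevance labels of the top-10 documents as ranked by the model (e.g. 0-4 grades).
print(dcg_at_k([4, 3, 3, 2, 0, 1, 2, 0, 0, 1]))
```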

4.5.2 Model Ensemble

To further improve performance, we use as the final relevance score the weighted sum of the output scores of multiple models trained under different settings during our experiments (a sketch of the weighted sum appears at the end of this section).
You can download these models with their different settings from the table below:

| Model Name | URL | DCG@10 on val dataset |
| --- | --- | --- |
| group6_pos_slipoff_mtype_serph_emb8_mlp5l_maxmeancls_bs48 | Download | 10.03 |
| group6_pos_slipoff_mtype_serph_emb8_mlp5l_maxmeancls | Download | 10.14 |
| group6_pos_slipoff_mtype_serph_emb8_mlp5l_wwm | Download | 10.16 |
| group6_pos_slipoff_serph_emb8_mlp5l_24l | Download | 10.10 |
| group6_pos_slipoff_serph_emb8_mlp5l | Download | 10.25 |
| group6_pos_slipoff_mtype_serph_emb8_bnnoelu_mlp5l_relu | Download | 10.20 |
| group6_pos_slipoff_mtype_serph_emb8_bnnoelu_dropout_mlp5l_relu | Download | 10.14 |
| group6_pos_slipoff_mtype_serph_emb8_bnnoelu_mlp5l_relu_24l | Download | 10.23 |
| group6_pos_slipoff_mtype_serh_emb8_bnnoelu | Download | 10.15 |
| group6_pos_slipoff_mtype_emb8_bnnoelu | Download | 10.15 |
| group6_pos_slipoff_serh_emb8 | Download | 10.05 |
| group6_pos_slipoff_pad_with_pretrain_emb8 | Download | 10.05 |

Then, in ./submit/start.sh, modify data_root to Your Data Root Path, model_name_or_path to the paths of the models you want to ensemble, and model_w to 0.10,0.35,0.50,0.25,0.40,0.10,0.10,0.55,0.35,0.05,0.1,0.50, where model_w is set manually (one weight per model).

Finally:

cd submit
sh start.sh

The DCG@10 of the model ensemble on the validation dataset is 10.54 (10.14 on the final test dataset).
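A minimal sketch of the weighted-sum ensembling (array and variable names are ours; the weights are the model_w values above):

```python
import numpy as np

# Per-model relevance scores for the same candidate documents: (n_models, n_docs).
scores = np.random.rand(12, 1000)

# Manually tuned weights, one per model (the model_w values from start.sh above).
w = np.array([0.10, 0.35, 0.50, 0.25, 0.40, 0.10,
              0.10, 0.55, 0.35, 0.05, 0.10, 0.50])

final_score = w @ scores            # weighted sum -> final relevance per document
ranking = np.argsort(-final_score)  # rank documents by descending ensemble score
```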

Contacts