This is the implementation of the TOIS submission "LLMCDSR: Leveraging Large Language Models for Cross-Domain Sequential Recommendation"
- torch == 2.0.1
- transformers == 4.31.0
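These dependencies can be installed with pip, e.g., pip install torch==2.0.1 transformers==4.31.0.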
The processed data used in our work (i.e., Movie-Book and Food-kitchen) are in ./data.
For convenience, we also release the pre-trained parameters and the textual embeddings of the interactions generated by LLMs. These files can be downloaded at: https://drive.google.com/file/d/1Wn0l60PBN_ZZ84S6KxhZFVom3Tqi-hG6
One can skip the following generation steps by using the ready-made textual embeddings downloaded above.
- Run the following to obtain the generations from the LLM (a rough sketch of this step follows the commands):
cd ./generation
python candidate_generate_icl.py --base_model={LLM_model_path} --output_name={generation_dir_name}
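For orientation, the core of this generation step looks roughly like the sketch below, built on the Hugging Face transformers API. The prompt template, in-context demonstration, and example history are illustrative assumptions, not the exact logic of candidate_generate_icl.py.

```python
# Minimal sketch of in-context-learning candidate generation (illustrative only).
# The prompt template, demonstration, and history are hypothetical placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/llm"  # corresponds to --base_model
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map="auto"
)

# One in-context demonstration followed by the query user's history in domain A.
demonstration = (
    "A user has watched: The Matrix, Inception.\n"
    "Books this user may read: Neuromancer, Snow Crash, Ready Player One.\n\n"
)
history = ["The Lord of the Rings", "The Hobbit"]
prompt = (
    demonstration
    + "A user has watched: " + ", ".join(history) + ".\n"
    + "Books this user may read:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
generation = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(generation)
```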
- As the generations may not be in a standard list format, use the LLM again to parse the names of the generated interaction items:
python parse_items.py --base_model={LLM_model_path} --data_name={generation_dir_name}
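Conceptually, the parsing step prompts the LLM to turn a free-form generation into a clean list of item names. A minimal, hypothetical sketch (the example generation, prompt, and post-processing are assumptions, not the exact logic of parse_items.py):

```python
# Minimal sketch of LLM-based parsing of free-form generations (illustrative only).
# The raw generation, prompt, and post-processing are hypothetical placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/llm"  # corresponds to --base_model
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map="auto"
)

raw_generation = "Sure! This user may enjoy: 1. Dune, 2. Foundation, 3. Hyperion."
prompt = (
    "Extract the item names from the text below and output one name per line, "
    "with nothing else.\n"
    f"Text: {raw_generation}\n"
    "Items:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
parsed = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
item_names = [line.strip() for line in parsed.splitlines() if line.strip()]
print(item_names)
```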
- Get the textual embeddings of the parsed generations:
python get_candidate_embeddings.py --model_path={text_embedding_model_path} --task={dataset} --generation_path={parsed_generation_path}
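As a rough illustration, extracting textual embeddings amounts to encoding each parsed item name with a text-embedding model and pooling the token representations; the same pattern applies to the item-embedding step in the pre-training section below. The model path, pooling choice, item names, and output file name here are assumptions:

```python
# Minimal sketch of textual-embedding extraction with mean pooling (illustrative only).
# The model path, item names, and output file name are hypothetical placeholders.
import torch
from transformers import AutoModel, AutoTokenizer

model_path = "path/to/text-embedding-model"  # corresponds to --model_path
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModel.from_pretrained(model_path).eval()

texts = ["Dune", "Foundation", "Hyperion"]  # parsed item names (hypothetical)
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**batch).last_hidden_state             # (batch, seq_len, dim)
mask = batch["attention_mask"].unsqueeze(-1).float()       # (batch, seq_len, 1)
embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # mean over non-padding tokens
torch.save(embeddings, "candidate_embeddings.pt")          # hypothetical output file
```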
One can skip the following pre-training steps by using the ready-made pre-trained parameters downloaded above.
First, get the textual embeddings of the items in the dataset.
cd ./generation
python get_item_embedding.py --task={dataset} --domain={A|B} --model_path={text_embedding_model_path}
Then, run the following commands to pre-train.
cd ./pre-train
python main.py --dataset={dataset} --domain={A|B}
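The saved file name {dataset}_projection_{A|B}.pt suggests that pre-training produces a projection module mapping textual embeddings into the recommender's item-embedding space. Purely as a hypothetical illustration (the actual architecture and training objective are defined by the paper and the code in ./pre-train), such a module might look like:

```python
# Hypothetical illustration of a projection from textual-embedding space to
# item-embedding space; NOT the exact architecture or objective used in ./pre-train.
import os
import torch
import torch.nn as nn

class TextProjection(nn.Module):
    def __init__(self, text_dim: int = 768, item_dim: int = 64):  # dims are placeholders
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(text_dim, item_dim),
            nn.ReLU(),
            nn.Linear(item_dim, item_dim),
        )

    def forward(self, text_emb: torch.Tensor) -> torch.Tensor:
        return self.proj(text_emb)

# After pre-training, the learned parameters are saved for the main training stage.
dataset, domain = "Movie-Book", "A"  # placeholders
projection = TextProjection()
os.makedirs("./pretrained_parameters", exist_ok=True)
torch.save(projection.state_dict(), f"./pretrained_parameters/{dataset}_projection_{domain}.pt")
```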
After that, copy the trained parameters into the ./pretrained_parameters folder, naming the file {dataset}_projection_{A|B}.pt.
Once the needed ingredients are prepared, one can simply run the following command to train the model and evaluate its performance:
python main.py --dataset={dataset}