PyTorch version (Default) | CuPy version
This is the official implementation of our paper at The 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2023):

Tianjun Wei, Jianghong Ma, Tommy W.S. Chow. Collaborative Residual Metric Learning. [arXiv] [Paper]
The model implementation is compatible with the recommendation toolbox RecBole (GitHub: RecBole).

Requirements of the running environment:
- Python: 3.8+
- PyTorch: 1.9.0+
- RecBole: 1.2.0
Due to storage limits, only zipped dataset files are included in the repository. To use the datasets, run

```bash
unzip -o "Data/*.zip"
```

to unzip the dataset files.
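If the `unzip` utility is not available, the same extraction can be done with Python's standard library. This is a minimal sketch; the target directory is an assumption and may need adjusting to match the archive layout:

```python
import glob
import os
import zipfile

# Extract every dataset archive under Data/ next to the archive itself;
# existing files are overwritten, mirroring the behavior of `unzip -o`.
for path in glob.glob('Data/*.zip'):
    with zipfile.ZipFile(path) as zf:
        zf.extractall(os.path.dirname(path))
```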
If you would like to test CoRML on a custom dataset, please place the dataset files in the following structure:
```
.
|-Data
| |-[CUSTOM_DATASET_NAME]
| | |-[CUSTOM_DATASET_NAME].user
| | |-[CUSTOM_DATASET_NAME].item
| | |-[CUSTOM_DATASET_NAME].inter
```
and create `[CUSTOM_DATASET_NAME].yaml` in `./Params` with the following content:

```yaml
dataset: [CUSTOM_DATASET_NAME]
```
For the format of each dataset file, please refer to the RecBole API.
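For illustration, a RecBole atomic `.inter` file is a tab-separated table whose header names each column together with its type. A minimal sketch (the column names below follow the common RecBole convention and are not prescribed by this repository; adjust them to your data):

```
user_id:token	item_id:token	rating:float	timestamp:float
196	242	3.0	881250949
186	302	3.0	891717742
```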
For each dataset, the optimal hyperparameters are stored in `Params/[DATASET].yaml`. To tune the hyperparameters, modify the corresponding values in the file for each dataset.
The main hyperparameters of CoRML are listed as follows (an illustrative config sketch follows the list):

- `lambda` ($\lambda$): weight between $\mathbf{H}$ and $\mathbf{G}$ in preference scores
- `dual_step_length` ($\rho$): dual step length of ADMM
- `l2_regularization` ($\theta$): L2 regularization for learning weight matrix $\mathbf{H}$
- `item_degree_norm` ($t$): item degree norm for learning weight matrix $\mathbf{H}$
- `global_scaling` ($\epsilon$): global scaling in approximated ranking weights (in logarithmic scale)
- `user_scaling` ($t_u$): user degree scaling in approximated ranking weights
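For example, a `Params/[DATASET].yaml` carrying these hyperparameters might look like the following sketch (the values below are illustrative placeholders, not tuned settings):

```yaml
dataset: [CUSTOM_DATASET_NAME]
lambda: 0.5              # weight between H and G in preference scores
dual_step_length: 1.0    # ADMM dual step length
l2_regularization: 0.1   # theta, L2 penalty when learning H
item_degree_norm: 0.5    # item degree norm t
global_scaling: -2.0     # epsilon, given in logarithmic scale
user_scaling: 0.5        # user degree scaling t_u
```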
For datasets containing a large number of items, calculating and storing the complete item-item matrix may lead to out-of-memory errors in environments with limited GPU memory. We have therefore added code for graph spectral partitioning, which learns the item-item weight matrix on each small partitioned item set. The code is adapted from our previous work FPSR.
The hyperparameter `partition_ratio` controls the maximum size of a partitioned item set relative to the complete item set, ranging from 0 to 1. When `partition_ratio` is set to 1, no partitioning is performed.
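As a rough illustration of the idea (a minimal sketch, not the code used in this repository or in FPSR), recursive spectral bipartitioning can be written as follows, assuming a dense, symmetric, non-negative item-item adjacency matrix:

```python
import numpy as np
from scipy.sparse.linalg import eigsh

def spectral_partition(adj, idx, max_size):
    """Recursively bipartition the item indices `idx` of adjacency `adj`
    until each part contains at most `max_size` items."""
    if len(idx) <= max(max_size, 2):
        return [idx]
    sub = adj[np.ix_(idx, idx)]
    deg = sub.sum(axis=1).astype(float)
    deg[deg == 0] = 1.0                          # guard isolated items
    d = 1.0 / np.sqrt(deg)
    sym = d[:, None] * sub * d[None, :]          # D^{-1/2} A D^{-1/2}
    # The eigenvector paired with the second-largest eigenvalue of the
    # normalized adjacency plays the role of the Fiedler vector: splitting
    # items by its sign approximates a minimum normalized cut.
    _, vecs = eigsh(sym, k=2, which='LA')
    fiedler = vecs[:, 0]                         # eigenvalues come back ascending
    left, right = idx[fiedler >= 0], idx[fiedler < 0]
    if len(left) == 0 or len(right) == 0:        # degenerate split: just halve
        left, right = idx[: len(idx) // 2], idx[len(idx) // 2:]
    return (spectral_partition(adj, left, max_size)
            + spectral_partition(adj, right, max_size))

# usage sketch:
# parts = spectral_partition(A, np.arange(n_items), int(partition_ratio * n_items))
```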
To maintain consistency, we perform a sparse approximation of the derived matrix. The hyperparameter `sparse_approx` controls this approximation; when `sparse_approx` is set to `False`, no sparse approximation is performed.
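A minimal sketch of one common way to sparsify such a matrix, keeping only the top-k magnitude entries per column (the exact criterion used in the repository may differ):

```python
import numpy as np
import scipy.sparse as sp

def sparsify_topk(W, k):
    """Keep the k largest-magnitude entries in each column of the dense
    weight matrix W; all other entries are dropped (set to zero)."""
    n_rows, n_cols = W.shape
    k = min(k, n_rows)
    rows, cols, vals = [], [], []
    for j in range(n_cols):
        col = W[:, j]
        # indices of the k entries with the largest absolute value
        top = np.argpartition(np.abs(col), n_rows - k)[n_rows - k:]
        rows.append(top)
        cols.append(np.full(k, j))
        vals.append(col[top])
    return sp.csc_matrix(
        (np.concatenate(vals), (np.concatenate(rows), np.concatenate(cols))),
        shape=W.shape,
    )
```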
The script `run.py` is used to reproduce the results presented in the paper. To train and evaluate CoRML on a specific dataset, run

```bash
python run.py --dataset DATASET_NAME
```
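For reference, a RecBole-compatible model such as CoRML is typically trained along the following lines. This is a sketch of the usual RecBole 1.2 workflow, not necessarily what `run.py` does, and the model class is passed in rather than imported because the repository's module layout is not assumed here:

```python
from recbole.config import Config
from recbole.data import create_dataset, data_preparation
from recbole.trainer import Trainer
from recbole.utils import init_seed

def train(model_class, dataset_name):
    # Merge the dataset's tuned hyperparameters with RecBole defaults.
    config = Config(model=model_class, dataset=dataset_name,
                    config_file_list=[f'Params/{dataset_name}.yaml'])
    init_seed(config['seed'], config['reproducibility'])
    dataset = create_dataset(config)
    train_data, valid_data, test_data = data_preparation(config, dataset)
    # Note: `train_data.dataset` is named `train_data._dataset` in some
    # RecBole versions.
    model = model_class(config, train_data.dataset).to(config['device'])
    trainer = Trainer(config, model)
    trainer.fit(train_data, valid_data)
    print(trainer.evaluate(test_data))
```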
We also provide a Colab notebook version of CoRML: click here to open it in Google Colab, select GPU as the runtime type, and run the model.
If you find this work useful, please cite the following papers:
```bibtex
@inproceedings{CoRML,
  author    = {{Wei}, Tianjun and {Ma}, Jianghong and {Chow}, Tommy W.~S.},
  title     = {Collaborative Residual Metric Learning},
  booktitle = {Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval},
  series    = {SIGIR '23},
  year      = {2023},
  isbn      = {9781450394086},
  publisher = {Association for Computing Machinery},
  address   = {New York, NY, USA},
  location  = {Taipei, Taiwan},
  pages     = {1107--1116},
  numpages  = {10},
  url       = {https://doi.org/10.1145/3539618.3591649},
  doi       = {10.1145/3539618.3591649}
}

@inproceedings{FPSR,
  author    = {{Wei}, Tianjun and {Ma}, Jianghong and {Chow}, Tommy W.~S.},
  title     = {Fine-tuning Partition-aware Item Similarities for Efficient and Scalable Recommendation},
  booktitle = {Proceedings of the ACM Web Conference 2023},
  series    = {WWW '23},
  year      = {2023},
  publisher = {Association for Computing Machinery},
  address   = {New York, NY, USA},
  location  = {Austin, TX, USA},
  numpages  = {11},
  url       = {https://doi.org/10.1145/3543507.3583240},
  doi       = {10.1145/3543507.3583240}
}
```
This project is licensed under the terms of the MIT license.