News Recommendation System based on NRMS models

Maguire1999/NewsRecommendationSystem

News Recommendation

Presentation

The repository currently includes the following models.

Models in published papers

| Model | Full name | Paper |
| --- | --- | --- |
| NRMS | Neural News Recommendation with Multi-Head Self-Attention | https://www.aclweb.org/anthology/D19-1671/ |

Getting started with training

Basic setup.

git clone https://github.com/yusanshi/NewsRecommendation
cd NewsRecommendation
pip3 install -r requirements.txt

Download and preprocess the data.

mkdir data && cd data
# Download GloVe pre-trained word embedding
wget https://nlp.stanford.edu/data/glove.840B.300d.zip
sudo apt install unzip
unzip glove.840B.300d.zip -d glove
rm glove.840B.300d.zip

# Download MIND dataset
# By downloading the dataset, you agree to the [Microsoft Research License Terms](https://go.microsoft.com/fwlink/?LinkID=206977). For more detail about the dataset, see https://msnews.github.io/.

# Uncomment the following lines to use the MIND Large dataset (Note MIND Large test set doesn't have labels, see #11)
# wget https://mind201910small.blob.core.windows.net/release/MINDlarge_train.zip https://mind201910small.blob.core.windows.net/release/MINDlarge_dev.zip https://mind201910small.blob.core.windows.net/release/MINDlarge_test.zip
# unzip MINDlarge_train.zip -d train
# unzip MINDlarge_dev.zip -d val
# unzip MINDlarge_test.zip -d test
# rm MINDlarge_*.zip

# The following lines download the MIND Small dataset (MIND Small has no test set, so the validation set is copied as the test set)
wget https://mind201910small.blob.core.windows.net/release/MINDsmall_train.zip https://mind201910small.blob.core.windows.net/release/MINDsmall_dev.zip
unzip MINDsmall_train.zip -d train
unzip MINDsmall_dev.zip -d val
cp -r val test # MIND Small has no test set :)
rm MINDsmall_*.zip

# Preprocess data into appropriate format
cd ..
python3 src/data_preprocess.py
# Remember to update the `num_*` values in `src/config.py` according to the output of `src/data_preprocess.py`
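
For orientation, here is a minimal sketch of how one labeled row of a MIND `behaviors.tsv` file can be parsed. The five-column layout (impression ID, user ID, time, click history, impressions) follows the public MIND dataset description. The names `Impression` and `parse_behavior_line` are hypothetical, and this is not necessarily how `src/data_preprocess.py` does it. Unlabeled rows, such as those in the MIND Large test set, are not handled.

```python
from typing import NamedTuple


class Impression(NamedTuple):
    news_id: str
    clicked: bool


def parse_behavior_line(line: str):
    """Split one tab-separated, labeled behaviors.tsv row into its five fields."""
    impression_id, user_id, time, history, impressions = line.rstrip("\n").split("\t")
    clicked_history = history.split()  # empty list for users without history
    candidates = []
    for item in impressions.split():
        # Each candidate looks like "N55689-1": news ID, dash, 0/1 click label
        news_id, label = item.rsplit("-", 1)
        candidates.append(Impression(news_id, label == "1"))
    return impression_id, user_id, time, clicked_history, candidates


row = "1\tU13740\t11/11/2019 9:05:58 AM\tN55189 N42782\tN55689-1 N35729-0"
_, user, _, history, candidates = parse_behavior_line(row)
print(user, history, [c.news_id for c in candidates if c.clicked])
```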

Modify src/config.py to select the target model. The configuration file is organized into a general part (which applies to all models) and a model-specific part (which some models do not have).

vim src/config.py
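
The general/model-specific split can be pictured roughly as below. This is only a sketch of one possible layout: the class names, the `get_config` helper and every value are hypothetical placeholders and may not match the actual `src/config.py`.

```python
class BaseConfig:
    """General part: settings shared by every model (placeholder values)."""
    num_epochs = 2
    batch_size = 128
    learning_rate = 1e-4
    # Placeholders: replace with the counts printed by src/data_preprocess.py
    num_words = 1 + 70000
    num_users = 1 + 50000


class NRMSConfig(BaseConfig):
    """Model-specific part: options that only this model needs."""
    num_attention_heads = 15


MODEL_CONFIGS = {"NRMS": NRMSConfig}


def get_config(model_name: str):
    """Return the config class for the selected model."""
    return MODEL_CONFIGS[model_name]


config = get_config("NRMS")
print(config.batch_size, config.num_attention_heads)
```

With a layout like this, the model-specific class inherits the general settings, so a model without extra options can simply reuse the base class.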

Run.

# Train and save checkpoint into `checkpoint/{model_name}/` directory
python3 src/train.py
# Load latest checkpoint and evaluate on the test set
python3 src/evaluate.py

You can visualize metrics with TensorBoard.

tensorboard --logdir=runs
# or, for a specific model:
tensorboard --logdir=runs/{model_name}

Tip: by setting the REMARK environment variable, you can give the runs more meaningful names in TensorBoard. For example, REMARK=num-filters-300-window-size-5 python3 src/train.py.
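
As a rough illustration of the idea, the snippet below shows how an optional `REMARK` value could be appended to a run directory name. The function `make_run_name` and the exact naming pattern are assumptions made for this example. The real `src/train.py` may build its run names differently.

```python
import os
from datetime import datetime


def make_run_name(model_name, env=None, now=None):
    """Build a log directory name, appending REMARK when it is set."""
    env = os.environ if env is None else env
    now = datetime.now() if now is None else now
    stamp = now.strftime("%Y-%m-%dT%H-%M-%S")
    remark = env.get("REMARK")
    suffix = f"-{remark}" if remark else ""
    return f"runs/{model_name}/{stamp}{suffix}"


print(make_run_name(
    "NRMS",
    env={"REMARK": "num-filters-300-window-size-5"},
    now=datetime(2022, 8, 4, 12, 0, 0),
))
# prints: runs/NRMS/2022-08-04T12-00-00-num-filters-300-window-size-5
```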

Optimizer study on MIND-mini

| Model | AUC | MRR | nDCG@5 | nDCG@10 | Remark |
| --- | --- | --- | --- | --- | --- |
| baseline | 0.6253 | 0.2823 | 0.3051 | 0.3731 | |
| +SGD | 0.5188 | 0.2148 | 0.2250 | 0.2905 | |
| +AdamW | 0.6298 | 0.2841 | 0.3091 | 0.3765 | |
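
For readers unfamiliar with the four metrics in these tables, here is a small pure-Python sketch of their usual definitions for a single impression (one user, one list of candidate news with 0/1 click labels). It assumes each impression contains at least one clicked and one non-clicked item. The function names are illustrative, and `src/evaluate.py` may compute and average the metrics differently.

```python
import math


def ranked_labels(labels, scores):
    """Return the 0/1 labels sorted by descending predicted score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [labels[i] for i in order]


def mrr(labels, scores):
    """Mean of the reciprocal ranks of the clicked items."""
    ranked = ranked_labels(labels, scores)
    return sum(l / (rank + 1) for rank, l in enumerate(ranked)) / sum(ranked)


def dcg_at_k(ranked, k):
    """Discounted cumulative gain over the top-k positions."""
    return sum((2 ** l - 1) / math.log2(rank + 2) for rank, l in enumerate(ranked[:k]))


def ndcg_at_k(labels, scores, k):
    """DCG of the predicted ranking divided by the DCG of the ideal ranking."""
    ideal = dcg_at_k(sorted(labels, reverse=True), k)
    return dcg_at_k(ranked_labels(labels, scores), k) / ideal


def auc(labels, scores):
    """Fraction of (clicked, non-clicked) pairs ranked in the right order."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


labels = [0, 1, 0, 0]          # only the second candidate was clicked
scores = [0.9, 0.6, 0.3, 0.1]  # the model ranks it second
print(auc(labels, scores), mrr(labels, scores), ndcg_at_k(labels, scores, 5))
```

In this toy impression the clicked item is ranked second, so the reciprocal rank is 0.5 and the model orders two of the three clicked/non-clicked pairs correctly.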

Normalization study on MIND-mini

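| Model | AUC | MRR | nDCG@5 | nDCG@10 | Remark |
| --- | --- | --- | --- | --- | --- |
| baseline | 0.6253 | 0.2823 | 0.3051 | 0.3731 | |
| +BN | 0.5252 | 0.2476 | 0.2565 | 0.3181 | |
| +GN | 0.6323 | 0.2884 | 0.3122 | 0.3795 | |
| +IN | 0.6321 | 0.2847 | 0.3101 | 0.3785 | |
| +LN | 0.6404 | 0.2905 | 0.3172 | 0.3835 | |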
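
To make the difference between the BN and LN variants concrete, the sketch below normalizes a tiny batch both ways in plain Python. Layer normalization works on each sample across its features, while batch normalization (in training mode) works on each feature across the batch. Learnable scale and shift parameters and running statistics are omitted, and the function names are illustrative only, not taken from this repository.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize one feature vector to zero mean and (near) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def batch_norm(batch, eps=1e-5):
    """Normalize each feature column across the samples of a batch."""
    columns = [layer_norm(list(col), eps) for col in zip(*batch)]
    return [list(row) for row in zip(*columns)]


batch = [[1.0, 10.0], [3.0, 30.0]]
print([layer_norm(sample) for sample in batch])  # statistics per sample
print(batch_norm(batch))                         # statistics per feature
```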

Results on MIND-mini

| Model | AUC | MRR | nDCG@5 | nDCG@10 | Remark |
| --- | --- | --- | --- | --- | --- |
| baseline | 0.6253 | 0.2823 | 0.3051 | 0.3731 | |
| +LN +AdamW +Cosine decay | 0.6421 | 0.2960 | 0.3239 | 0.3890 | |
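
"Cosine decay" refers to a learning-rate schedule that follows half a cosine wave from the initial rate down to a minimum. The sketch below shows the standard formula without warmup. The function name and parameters are hypothetical, and the schedule settings actually used for the row above are not stated here.

```python
import math


def cosine_decay_lr(step, total_steps, base_lr, min_lr=0.0):
    """Learning rate at `step` under a plain cosine decay (no warmup)."""
    progress = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))


for step in (0, 250, 500, 750, 1000):
    print(step, cosine_decay_lr(step, total_steps=1000, base_lr=1e-4))
```

The rate starts at `base_lr`, reaches the midpoint between `base_lr` and `min_lr` halfway through training, and ends at `min_lr`.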

Getting started with the website

cd ..
python3 src/web.py

Acknowledgement

@misc{yusanshi2020news-recommendation,
  title={news-recommendation},
  author={yusanshi},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished={\url{https://github.com/yusanshi/news-recommendation}},
  year={2020}
}

Citation

@misc{Maguire2022news-recommendation,
  title={news-recommendation-system},
  author={Maguire},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished={\url{https://github.com/Maguire1999/NewsRecommendationSystem}},
  year={2022}
}
