Official repository for the paper "Formalizing Multimedia Recommendation through Multimodal Deep Learning", accepted at ACM Transactions on Recommender Systems.
Authors
- Daniele Malitesta*
- Giandomenico Cornacchia**
- Claudio Pomo
- Felice Antonio Merra***
- Tommaso Di Noia
- Eugenio Di Sciascio
* Work done while at Politecnico di Bari as a PhD student.
** Work done while at Politecnico di Bari before joining IBM Research Europe.
*** Work done while at Politecnico di Bari before joining Amazon.
If you wish to cite our paper, here is a reference:
@article{DBLP:journals/corr/abs-2309-05273,
author = {Daniele Malitesta and
Giandomenico Cornacchia and
Claudio Pomo and
Felice Antonio Merra and
Tommaso {Di Noia} and
Eugenio {Di Sciascio}},
title = {Formalizing Multimedia Recommendation through Multimodal Deep Learning},
journal = {CoRR},
volume = {abs/2309.05273},
year = {2023}
}
Paper | Year | Title |
---|---|---|
Ferracani et al. | 2015 | A System for Video Recommendation using Visual Saliency, Crowdsourced and Automatic Annotations |
Jia et al. | 2015 | Multi-modal learning for video recommendation based on mobile application usage |
Li et al. | 2015 | Video recommendation based on multi-modal information and multiple kernel |
Nie et al. | 2016 | Quality models for venue recommendation in location-based social network |
Chen et al. | 2016 | Context-aware Image Tweet Modelling and Recommendation |
Han et al. | 2017 | Learning Fashion Compatibility with Bidirectional LSTMs |
Oramas et al. | 2017 | A Deep Multimodal Approach for Cold-start Music Recommendation |
Zhang et al. | 2017 | Hashtag Recommendation for Multimodal Microblog Using Co-Attention Network |
Ying et al. | 2018 | Graph Convolutional Neural Networks for Web-Scale Recommender Systems |
Wang et al. | 2018 | LRMM: Learning to Recommend with Missing Modalities |
Liu et al. | 2019 | User Diverse Preference Modeling by Multimodal Attentive Metric Learning |
Chen et al. | 2019 | Personalized Fashion Recommendation with Visual Explanations based on Multimodal Attention Network: Towards Visually Explainable Recommendation |
Wei et al. | 2019 | MMGCN: Multi-modal Graph Convolution Network for Personalized Recommendation of Micro-video |
Cheng et al. | 2019 | MMALFM: Explainable Recommendation by Leveraging Reviews and Images |
Dong et al. | 2019 | Personalized Capsule Wardrobe Creation with Garment and User Modeling |
Chen et al. | 2019 | POG: Personalized Outfit Generation for Fashion Recommendation at Alibaba iFashion |
Yu et al. | 2020 | Vision-Language Recommendation via Attribute Augmented Multimodal Reinforcement Learning |
Cui et al. | 2020 | MV-RNN: A Multi-View Recurrent Neural Network for Sequential Recommendation |
Wei et al. | 2020 | Graph-Refined Convolutional Network for Multimedia Recommendation with Implicit Feedback |
Sun et al. | 2020 | Multi-modal Knowledge Graphs for Recommender Systems |
Chen et al. | 2020 | Neural Tensor Model for Learning Multi-Aspect Factors in Recommender Systems |
Min et al. | 2020 | Food Recommendation: Framework, Existing Solutions, and Challenges |
Shen et al. | 2020 | Enhancing Music Recommendation with Social Media Content: an Attentive Multimodal Autoencoder Approach |
Yang et al. | 2020 | Learning to Match on Graph for Fashion Compatibility Modeling |
Tao et al. | 2020 | MGAT: Multimodal Graph Attention Network for Recommendation |
Yang et al. | 2020 | AMNN: Attention-Based Multimodal Neural Network Model for Hashtag Recommendation |
Sang et al. | 2021 | Context-Dependent Propagating-Based Video Recommendation in Multimodal Heterogeneous Information Networks |
Liu et al. | 2021 | Pre-training Graph Transformer with Multimodal Side Information for Recommendation |
Zhang et al. | 2021 | Mining Latent Structures for Multimedia Recommendation |
Vaswani et al. | 2021 | Multimodal Fusion Based Attentive Networks for Sequential Music Recommendation |
Lei et al. | 2021 | Is the suggested food your desired?: Multi-modal recipe recommendation with demand-based knowledge graph |
Wang et al. | 2021 | Market2Dish: Health-aware Food Recommendation |
Zhan et al. | 2022 | A3-FKG: Attentive Attribute-Aware Fashion Knowledge Graph for Outfit Preference Prediction |
Wu et al. | 2022 | MM-Rec: Visiolinguistic Model Empowered Multimodal News Recommendation |
Yi et al. | 2022 | Multi-Modal Variational Graph Auto-Encoder for Recommendation Systems |
Yi et al. | 2022 | Multi-modal Graph Contrastive Learning for Micro-video Recommendation |
Liu et al. | 2022 | Multi-Modal Contrastive Pre-training for Recommendation |
Mu et al. | 2022 | Learning Hybrid Behavior Patterns for Multimedia Recommendation |
Chen et al. | 2022 | Breaking Isolation: Multimodal Graph Fusion for Multimedia Recommendation by Edge-wise Modulation |
Yi et al. | 2022 | A Tale of Two Graphs: Freezing and Denoising Graph Structures for Multimodal Recommendation |
Wang et al. | 2023 | DualGNN: Dual Graph Neural Network for Multimedia Recommendation |
Wei et al. | 2023 | Multi-Modal Self-Supervised Learning for Recommendation |
Zhou et al. | 2023 | Bootstrap Latent Representations for Multi-modal Recommendation |
First, install the required dependencies:
pip install -r requirements.txt
pip install -r requirements_torch_geometric.txt
If you want to retrain all models, run the following:
python -u start_experiments.py --config <dataset_name>
where dataset_name is one of the datasets in our benchmarks.
If you only want to generate the results, run the following:
python -u start_experiments.py --config <dataset_name>_results
where dataset_name is one of the datasets in our benchmarks.
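The two invocations above can be scripted when several datasets have to be processed in a row. The following is a minimal sketch, not part of the repository: the lowercase dataset names in `DATASETS` are an assumption (check the actual configuration file names before using them), and `build_command` and `run_all` are hypothetical helpers. By default the script only prints the commands instead of launching them.

```python
import subprocess
import sys

# ASSUMPTION: hypothetical configuration names; replace them with the
# configuration names actually shipped with the repository.
DATASETS = ["office", "toys", "beauty", "sports", "clothing"]


def build_command(dataset_name, results_only=False):
    """Build the command line for one dataset.

    With results_only=True the '<dataset_name>_results' configuration is used,
    mirroring the second command shown in this README.
    """
    config = f"{dataset_name}_results" if results_only else dataset_name
    return [sys.executable, "-u", "start_experiments.py", "--config", config]


def run_all(datasets, results_only=False, dry_run=True):
    """Print (dry_run=True) or execute (dry_run=False) one command per dataset."""
    commands = [build_command(name, results_only) for name in datasets]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops the loop as soon as one run fails.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    run_all(DATASETS, results_only=False, dry_run=True)
```

Calling `run_all(DATASETS, results_only=True, dry_run=False)` from the repository root would launch the results generation for each listed dataset in sequence.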
Note that the results may differ slightly from the ones provided here and in the paper, depending on the machine on which you run the experiments.
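Since small deviations are expected, a reproduced metric can be compared with the reported one through a relative tolerance rather than exact equality. The sketch below is illustrative only: the 5% tolerance is an arbitrary assumption, not a threshold stated by the authors, and the reproduced value in the usage note is made up.

```python
def relative_difference(reported, reproduced):
    """Absolute difference between the two values, relative to the reported one."""
    return abs(reproduced - reported) / abs(reported)


def matches_reported(reported, reproduced, rel_tol=0.05):
    """True when the reproduced value is within rel_tol of the reported one.

    ASSUMPTION: rel_tol=0.05 is an arbitrary illustrative threshold.
    """
    return relative_difference(reported, reproduced) <= rel_tol
```

For instance, with the Recall@10 of 0.0652 reported for VBPR on Office and a hypothetical reproduced value of 0.0649, `matches_reported(0.0652, 0.0649)` holds, since the relative difference is below 1%.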
Office (best results)
Models | Recall@10 | nDCG@10 | EFD@10 | Gini@10 | APLT@10 | iCov@10 | Recall@20 | nDCG@20 | EFD@20 | Gini@20 | APLT@20 | iCov@20 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
VBPR | 0.0652 | 0.0419 | 0.1753 | 0.3634 | 0.2321 | 93.83% | 0.1025 | 0.0533 | 0.1479 | 0.3960 | 0.2375 | 97.51% |
MMGCN | 0.0455 | 0.0300 | 0.1140 | 0.0128 | 0.0016 | 3.07% | 0.0798 | 0.0405 | 0.1027 | 0.0231 | 0.0078 | 4.64% |
GRCN | 0.0393 | 0.0253 | 0.1215 | 0.4587 | 0.3438 | 99.01% | 0.0667 | 0.0339 | 0.1051 | 0.4892 | 0.3469 | 99.79% |
LATTICE | 0.0664 | 0.0449 | 0.1827 | 0.2128 | 0.1752 | 87.86% | 0.1029 | 0.0566 | 0.1513 | 0.2652 | 0.2039 | 95.90% |
BM3 | 0.0701 | 0.0460 | 0.1837 | 0.1407 | 0.1427 | 77.13% | 0.1081 | 0.0583 | 0.1550 | 0.1900 | 0.1715 | 91.55% |
FREEDOM | 0.0560 | 0.0365 | 0.1493 | 0.1922 | 0.1875 | 79.12% | 0.0884 | 0.0469 | 0.1282 | 0.2439 | 0.2080 | 90.64% |
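To compare models programmatically, a results table in the format used here can be parsed into a dictionary. The sketch below copies three columns of the Office table above; `parse_table`, `to_number`, and `top_model` are hypothetical helpers, not code from the repository. It only reports which model has the highest value in a column and makes no claim about how each metric should be interpreted.

```python
OFFICE_TABLE = """
Models | Recall@10 | nDCG@10 | iCov@10 |
---|---|---|---|
VBPR | 0.0652 | 0.0419 | 93.83% |
MMGCN | 0.0455 | 0.0300 | 3.07% |
GRCN | 0.0393 | 0.0253 | 99.01% |
LATTICE | 0.0664 | 0.0449 | 87.86% |
BM3 | 0.0701 | 0.0460 | 77.13% |
FREEDOM | 0.0560 | 0.0365 | 79.12% |
"""


def to_number(cell):
    """Convert a cell such as '0.0652' or '93.83%' to a float."""
    return float(cell.rstrip("%"))


def split_row(line):
    """Split a pipe-separated row into stripped cells, ignoring the trailing pipe."""
    return [cell.strip() for cell in line.strip().rstrip("|").split("|")]


def parse_table(text):
    """Return {model: {metric: value}} for a pipe-separated results table."""
    lines = [line.strip() for line in text.strip().splitlines()]
    # Drop empty lines and the '---|---|...' separator row.
    lines = [line for line in lines if line and not set(line) <= set("-|")]
    header = split_row(lines[0])
    rows = {}
    for line in lines[1:]:
        cells = split_row(line)
        rows[cells[0]] = {
            metric: to_number(value) for metric, value in zip(header[1:], cells[1:])
        }
    return rows


def top_model(rows, metric):
    """Name of the model with the highest value for the given metric."""
    return max(rows, key=lambda model: rows[model][metric])


if __name__ == "__main__":
    office = parse_table(OFFICE_TABLE)
    for metric in ("Recall@10", "nDCG@10", "iCov@10"):
        print(metric, top_model(office, metric))
```

On the Office rows above, BM3 has the highest Recall@10 and nDCG@10, while GRCN has the highest iCov@10.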
Toys (best results)
Models | Recall@10 | nDCG@10 | EFD@10 | Gini@10 | APLT@10 | iCov@10 | Recall@20 | nDCG@20 | EFD@20 | Gini@20 | APLT@20 | iCov@20 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
VBPR | 0.0710 | 0.0458 | 0.1948 | 0.2645 | 0.1064 | 84.90% | 0.1006 | 0.0545 | 0.1527 | 0.3011 | 0.1180 | 92.82% |
MMGCN | 0.0256 | 0.0150 | 0.0648 | 0.0989 | 0.0961 | 37.87% | 0.0426 | 0.0200 | 0.0570 | 0.1450 | 0.1058 | 52.51% |
GRCN | 0.0554 | 0.0354 | 0.1604 | 0.3954 | 0.2368 | 92.66% | 0.0831 | 0.0436 | 0.1298 | 0.4329 | 0.2482 | 97.73% |
LATTICE | 0.0805 | 0.0512 | 0.2090 | 0.1656 | 0.0546 | 73.80% | 0.1165 | 0.0617 | 0.1665 | 0.2026 | 0.0684 | 86.58% |
BM3 | 0.0613 | 0.0393 | 0.1582 | 0.0776 | 0.0486 | 56.23% | 0.0901 | 0.0478 | 0.1270 | 0.1154 | 0.0658 | 73.50% |
FREEDOM | 0.0870 | 0.0548 | 0.2284 | 0.1474 | 0.0756 | 62.09% | 0.1249 | 0.0660 | 0.1820 | 0.2007 | 0.0951 | 78.42% |
Beauty (best results)
Models | Recall@10 | nDCG@10 | EFD@10 | Gini@10 | APLT@10 | iCov@10 | Recall@20 | nDCG@20 | EFD@20 | Gini@20 | APLT@20 | iCov@20 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
VBPR | 0.0760 | 0.0483 | 0.2119 | 0.2076 | 0.0833 | 83.06% | 0.1102 | 0.0586 | 0.1700 | 0.2376 | 0.0915 | 91.41% |
MMGCN | 0.0496 | 0.0294 | 0.1300 | 0.0252 | 0.0282 | 13.75% | 0.0772 | 0.0379 | 0.1105 | 0.0423 | 0.0345 | 21.37% |
GRCN | 0.0575 | 0.0370 | 0.1817 | 0.3823 | 0.2497 | 94.59% | 0.0892 | 0.0466 | 0.1498 | 0.4178 | 0.2608 | 98.56% |
LATTICE | 0.0867 | 0.0544 | 0.2272 | 0.1153 | 0.0386 | 65.82% | 0.1259 | 0.0661 | 0.1830 | 0.1558 | 0.0511 | 81.60% |
BM3 | 0.0713 | 0.0443 | 0.1831 | 0.0245 | 0.0179 | 32.31% | 0.1051 | 0.0545 | 0.1490 | 0.0414 | 0.0228 | 48.75% |
FREEDOM | 0.0864 | 0.0539 | 0.2279 | 0.0921 | 0.0486 | 55.89% | 0.1286 | 0.0666 | 0.1868 | 0.1359 | 0.0653 | 72.96% |
Sports (best results)
Models | Recall@10 | nDCG@10 | EFD@10 | Gini@10 | APLT@10 | iCov@10 | Recall@20 | nDCG@20 | EFD@20 | Gini@20 | APLT@20 | iCov@20 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
VBPR | 0.0450 | 0.0281 | 0.1167 | 0.1501 | 0.0497 | 75.77% | 0.0677 | 0.0349 | 0.0949 | 0.1722 | 0.0552 | 86.54% |
MMGCN | 0.0342 | 0.0207 | 0.0791 | 0.0095 | 0.0046 | 5.10% | 0.0551 | 0.0269 | 0.0678 | 0.0168 | 0.0065 | 8.39% |
GRCN | 0.0330 | 0.0202 | 0.0885 | 0.3087 | 0.2190 | 91.28% | 0.0523 | 0.0259 | 0.0746 | 0.3386 | 0.2273 | 97.09% |
LATTICE | 0.0610 | 0.0372 | 0.1465 | 0.0573 | 0.0129 | 48.44% | 0.0898 | 0.0456 | 0.1185 | 0.0802 | 0.0185 | 64.90% |
BM3 | 0.0548 | 0.0349 | 0.1372 | 0.0776 | 0.0283 | 59.13% | 0.0825 | 0.0430 | 0.1118 | 0.1120 | 0.0385 | 76.75% |
FREEDOM | 0.0603 | 0.0375 | 0.1494 | 0.0621 | 0.0319 | 48.37% | 0.0911 | 0.0465 | 0.1219 | 0.0926 | 0.0442 | 65.81% |
Clothing (best results)
Models | Recall@10 | nDCG@10 | EFD@10 | Gini@10 | APLT@10 | iCov@10 | Recall@20 | nDCG@20 | EFD@20 | Gini@20 | APLT@20 | iCov@20 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
VBPR | 0.0339 | 0.0181 | 0.0502 | 0.2437 | 0.0809 | 83.40% | 0.0529 | 0.0229 | 0.0413 | 0.2791 | 0.0915 | 92.33% |
MMGCN | 0.0227 | 0.0119 | 0.0292 | 0.0136 | 0.0044 | 7.58% | 0.0348 | 0.0150 | 0.0240 | 0.0236 | 0.0066 | 12.44% |
GRCN | 0.0319 | 0.0164 | 0.0481 | 0.3990 | 0.2358 | 93.37% | 0.0496 | 0.0209 | 0.0397 | 0.4368 | 0.2459 | 97.77% |
LATTICE | 0.0502 | 0.0275 | 0.0738 | 0.1022 | 0.0134 | 58.49% | 0.0744 | 0.0336 | 0.0589 | 0.1384 | 0.0207 | 76.20% |
BM3 | 0.0418 | 0.0226 | 0.0596 | 0.1348 | 0.0319 | 72.88% | 0.0633 | 0.0281 | 0.0486 | 0.1825 | 0.0449 | 88.65% |
FREEDOM | 0.0547 | 0.0294 | 0.0805 | 0.1509 | 0.0600 | 65.54% | 0.0822 | 0.0363 | 0.0652 | 0.2078 | 0.0843 | 81.91% |
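For readers unfamiliar with the accuracy columns, the following sketch shows the textbook per-user definitions of Recall@k and nDCG@k with binary relevance. It is an assumption that these match the variants used to produce the tables above; the evaluation code in the repository may differ in details such as the gain function or how per-user scores are averaged. The function names and the toy data are hypothetical.

```python
import math


def recall_at_k(ranked_items, relevant_items, k):
    """Fraction of the user's relevant items that appear in the top-k list."""
    if not relevant_items:
        return 0.0
    hits = sum(1 for item in ranked_items[:k] if item in relevant_items)
    return hits / len(relevant_items)


def ndcg_at_k(ranked_items, relevant_items, k):
    """Binary-relevance nDCG: DCG of the top-k list divided by the ideal DCG."""
    # A hit at 0-based position `rank` contributes 1 / log2(rank + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked_items[:k])
        if item in relevant_items
    )
    # The ideal list places all relevant items (at most k) at the top.
    ideal_hits = min(len(relevant_items), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


if __name__ == "__main__":
    ranked = ["a", "b", "c", "d"]   # toy recommendation list
    relevant = {"a", "d", "z"}      # toy held-out items
    print(recall_at_k(ranked, relevant, k=4))
    print(ndcg_at_k(ranked, relevant, k=4))
```

In this toy case two of the three relevant items are retrieved, so Recall@4 is 2/3, and nDCG@4 is below 1 because one hit is ranked last and one relevant item is missing.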