ConcatBERT model for multimodal classification with text and images

Replication of the model and some of the results obtained in "Gated multimodal networks" (based on https://arxiv.org/abs/1702.01992).
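
The name suggests fusion by concatenation: a text representation (for example a BERT sentence embedding) and an image feature vector are joined into one vector, which is then passed to a classifier. Below is a minimal sketch of that idea, assuming the features are already computed. The function names, the feature sizes (768 and 2048), the number of classes, and the random weights are all hypothetical stand-ins chosen for illustration; they are not taken from this repository or from the paper.

```python
import numpy as np


def concat_fusion_logits(text_feat, image_feat, weight, bias):
    """Concatenate text and image features, then apply a linear classifier."""
    # fused has shape (batch, d_text + d_img)
    fused = np.concatenate([text_feat, image_feat], axis=-1)
    # logits have shape (batch, n_classes)
    return fused @ weight + bias


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


rng = np.random.default_rng(0)
batch, d_text, d_img, n_classes = 4, 768, 2048, 3  # hypothetical sizes

# Random stand-ins for precomputed features (assumption: a real pipeline
# would obtain these from a text encoder and an image encoder).
text_feat = rng.normal(size=(batch, d_text))
image_feat = rng.normal(size=(batch, d_img))

# Untrained classifier parameters, for shape illustration only.
weight = rng.normal(scale=0.01, size=(d_text + d_img, n_classes))
bias = np.zeros(n_classes)

probs = softmax(concat_fusion_logits(text_feat, image_feat, weight, bias))
print(probs.shape)
```

In a trained model the linear layer would typically be replaced by a small MLP and learned jointly with (or on top of) the encoders; the sketch only shows how the two modalities are combined.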