MLLaMA (Llama-3.2 Vision model)

MLLaMA is a multimodal model and reuses the multimodal modules in examples/multimodal.