MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models

[Website 🌐], [Text-to-Image data 🤗], [Image-to-Text data 🤗]

Overview

This repository contains the source code of MMDT (Multimodal DecodingTrust), a research effort designed to help researchers and practitioners better understand the capabilities, limitations, and potential risks of deploying state-of-the-art multimodal foundation models (MMFMs). See our paper for details.

This project is organized around six primary perspectives of trustworthiness:

Safety
Hallucination
Fairness
Privacy
Adversarial robustness
Out-of-distribution robustness

Project Structure

This project is structured around one subdirectory per trustworthiness perspective. Each subdirectory includes scripts, data, and a dedicated README.

Getting Started

Clone the repository

git clone https://github.com/AI-secure/MMDT.git && cd MMDT

Install requirements

Create a new environment:

conda create --name mmdt python=3.9
conda activate mmdt

Install PyTorch following this link. Then install the requirements:

pip install -r requirements.txt
python -m spacy download en_core_web_sm
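After installation, a quick check can confirm that the main pieces are importable before starting a long evaluation. The snippet below is a minimal sketch, not part of MMDT: it assumes that PyTorch is importable as `torch` and that the spaCy model downloaded above is installed as the package `en_core_web_sm`. The helper name `missing_modules` is hypothetical.

```python
from importlib.util import find_spec

# Top-level import names for what the install steps set up.
# Assumption: these three names cover PyTorch, spaCy, and the spaCy model;
# requirements.txt may pull in more packages that are not checked here.
REQUIRED = ("torch", "spacy", "en_core_web_sm")


def missing_modules(names=REQUIRED):
    """Return the names that cannot be located on the current import path."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Environment looks complete.")
```

Run it inside the activated `mmdt` environment; an empty result only means the modules can be found, not that their versions match requirements.txt.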

Evaluate all perspectives

bash scripts/t2i.sh {model_id}  # Evaluate a text-to-image model
bash scripts/i2t.sh {model_id}  # Evaluate an image-to-text model
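To evaluate several models in one go, the two commands above can be wrapped in a small driver. This is a sketch under assumptions rather than a tool shipped with MMDT: the function names are hypothetical, `my-t2i-model` is a placeholder model ID, and the driver assumes the scripts signal failure through a non-zero exit code. It defaults to a dry run that only prints the commands.

```python
import subprocess


def suite_commands(t2i_models, i2t_models):
    """Build one `bash scripts/<suite>.sh <model_id>` command per model."""
    commands = [["bash", "scripts/t2i.sh", m] for m in t2i_models]
    commands += [["bash", "scripts/i2t.sh", m] for m in i2t_models]
    return commands


def run_all(commands, dry_run=True):
    """Print each command; if not a dry run, execute it.

    Returns the list of commands that exited with a non-zero status.
    """
    failed = []
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run and subprocess.run(cmd).returncode != 0:
            failed.append(cmd)
    return failed


if __name__ == "__main__":
    # "my-t2i-model" is a placeholder; replace it with a real model ID.
    run_all(suite_commands(["my-t2i-model"], ["gpt-4o"]))
```

Call `run_all(..., dry_run=False)` from the repository root to actually launch the evaluations.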

Evaluate each perspective

We also provide off-the-shelf scripts for evaluating each perspective under ./scripts. For example, the following script evaluates all scenarios and tasks of the hallucination perspective for the image-to-text modality.

bash scripts/hallucination_i2t.sh gpt-4o

An example of the summarized output score can be found here.

You can also restrict the evaluation to a specific perspective, scenario, and task by running the following script:

python mmdt/main.py --modality {modality} --model_id {model_id} --perspectives {perspective} --scenario {scenario} --task {task}

For example, to evaluate gpt-4o on hallucination under the natural selection scenario and the action recognition task, run:

python mmdt/main.py --modality image_to_text --model_id gpt-4o --perspectives hallucination --scenario natural --task action
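When sweeping over several tasks, it can help to generate these command lines programmatically. The sketch below only assembles the flags shown above. `main_command` is a hypothetical helper, not part of MMDT, and the task list is the one given for the image-to-text hallucination `natural` scenario in this README.

```python
import shlex


def main_command(modality, model_id, perspective, scenario, task):
    """Assemble the argument list for mmdt/main.py from the documented flags."""
    return [
        "python", "mmdt/main.py",
        "--modality", modality,
        "--model_id", model_id,
        "--perspectives", perspective,
        "--scenario", scenario,
        "--task", task,
    ]


if __name__ == "__main__":
    # Tasks of the image-to-text hallucination "natural" scenario.
    tasks = ["identification", "attribute", "spatial", "count", "action"]
    for task in tasks:
        cmd = main_command("image_to_text", "gpt-4o", "hallucination", "natural", task)
        # shlex.join quotes arguments only where the shell requires it.
        print(shlex.join(cmd))
```

The printed lines can be pasted into a shell, or each list can be passed to `subprocess.run` directly.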

Our framework includes the following perspectives, scenarios, and tasks:

Text-to-image models
├── safety
│   ├── vanilla
│   ├── transformed
│   └── jailbreak
├── hallucination
│   ├── natural
│   │   ├── identification
│   │   ├── attribute
│   │   ├── spatial
│   │   └── count
│   ├── counterfactual
│   │   ├── identification
│   │   ├── attribute
│   │   ├── spatial
│   │   └── count
│   ├── misleading
│   │   ├── identification
│   │   ├── attribute
│   │   ├── spatial
│   │   └── count
│   ├── distraction
│   │   ├── identification
│   │   ├── attribute
│   │   ├── spatial
│   │   └── count
│   ├── ocr
│   │   ├── complex
│   │   ├── contradictory
│   │   ├── distortion
│   │   └── misleading
│   └── cooccurrence
│       ├── identification
│       ├── attribute
│       ├── spatial
│       └── count
├── fairness
│   ├── social_stereotype
│   ├── decision_making
│   ├── overkill
│   └── individual
├── privacy
│   └── training
│       └── laion_1k
├── adv
│   └── adv
│       ├── object
│       ├── attribute
│       └── spatial
└── ood
    ├── Shake_
    │   ├── helpfulness
    │   ├── count
    │   ├── spatial
    │   ├── color
    │   └── size
    └── Paraphrase_
        ├── helpfulness
        ├── count
        ├── spatial
        ├── color
        └── size

Image-to-text models
├── safety
│   ├── typography
│   ├── illustration
│   └── jailbreak
├── hallucination
│   ├── natural
│   │   ├── identification
│   │   ├── attribute
│   │   ├── spatial
│   │   ├── count
│   │   └── action
│   ├── counterfactual
│   │   ├── identification
│   │   ├── attribute
│   │   ├── spatial
│   │   └── count
│   ├── misleading
│   │   ├── identification
│   │   ├── attribute
│   │   ├── spatial
│   │   ├── count
│   │   └── action
│   ├── distraction
│   │   ├── identification
│   │   ├── attribute
│   │   ├── spatial
│   │   ├── count
│   │   └── action
│   ├── ocr
│   │   ├── contradictory
│   │   ├── cooccur
│   │   ├── doc
│   │   └── scene
│   └── cooccurrence
│       ├── identification
│       ├── attribute
│       ├── spatial
│       ├── count
│       └── action
├── fairness
│   ├── occupation
│   ├── education
│   ├── activity
│   └── person_identification
├── privacy
│   ├── location
│   │   ├── Pri-SV-with-text
│   │   ├── Pri-SV-without-text
│   │   ├── Pri-4Loc-SV-with-text
│   │   └── Pri-4Loc-SV-without-text
│   └── pii
├── adv
│   └── adv
│       ├── object
│       ├── attribute
│       └── spatial
└── ood
    ├── Van_Gogh
    │   ├── attribute
    │   ├── count
    │   ├── spatial
    │   └── identification
    ├── oil_painting
    │   ├── attribute
    │   ├── count
    │   ├── spatial
    │   └── identification
    ├── watercolour_painting
    │   ├── attribute
    │   ├── count
    │   ├── spatial
    │   └── identification
    ├── gaussian_noise
    │   ├── attribute
    │   ├── count
    │   ├── spatial
    │   └── identification
    ├── zoom_blur
    │   ├── attribute
    │   ├── count
    │   ├── spatial
    │   └── identification
    └── pixelate
        ├── attribute
        ├── count
        ├── spatial
        └── identification
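The trees above map naturally onto a nested dictionary, which makes it easy to check a scenario and task combination before launching a run. The sketch below transcribes only the image-to-text hallucination subtree. The names `I2T_HALLUCINATION`, `pairs`, and `is_valid` are illustrative and do not come from the MMDT code base.

```python
# Tasks shared by most image-to-text hallucination scenarios.
COMMON = ["identification", "attribute", "spatial", "count"]

# Image-to-text hallucination subtree, copied from the tree in this README.
I2T_HALLUCINATION = {
    "natural": COMMON + ["action"],
    "counterfactual": COMMON,
    "misleading": COMMON + ["action"],
    "distraction": COMMON + ["action"],
    "ocr": ["contradictory", "cooccur", "doc", "scene"],
    "cooccurrence": COMMON + ["action"],
}


def pairs(tree):
    """List every (scenario, task) combination in the subtree."""
    return [(scenario, task) for scenario, tasks in tree.items() for task in tasks]


def is_valid(tree, scenario, task):
    """Return True if the task is listed under the scenario."""
    return task in tree.get(scenario, [])


if __name__ == "__main__":
    print(len(pairs(I2T_HALLUCINATION)), "scenario/task combinations")
    print(is_valid(I2T_HALLUCINATION, "counterfactual", "action"))
```

The same shape extends to the other perspectives by adding one dictionary per subtree.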

Notes

Each of the six perspectives has its own subdirectory containing the respective code and a README.
Refer to the README in each subdirectory for information on how to run the scripts and interpret the results.

License

This project is licensed under CC BY 4.0. See the LICENSE file for details.

Contact

Please reach out to us if you have any questions or suggestions. You can submit an issue or pull request, or send an email to [email protected].

Thank you for your interest in MMDT. We hope our work will contribute to a more trustworthy, fair, and robust AI future.
