[arXiv]
[video presentation at ICCV]
- PyTorch 1.8 or higher
- CLIP (install with `pip install git+https://github.com/openai/CLIP.git`)
- transformers (install with `pip install transformers`)
- cococaption
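
If the dependencies are set up correctly, a quick sanity check like the following should run without errors. This is a minimal sketch: the `ViT-B/16` and `distilgpt2` checkpoint names are assumptions, not necessarily the ones this repo uses.

```python
import torch
import clip
from transformers import GPT2Tokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load a CLIP checkpoint to confirm the install works; "ViT-B/16" is an
# assumption -- check clip_model.py for the variant actually used.
clip_model, preprocess = clip.load("ViT-B/16", device=device)

# Load a GPT-2 tokenizer to confirm transformers works; "distilgpt2" is
# likewise an assumption, not necessarily this repo's checkpoint.
tokenizer = GPT2Tokenizer.from_pretrained("distilgpt2")

print("CLIP and transformers load correctly on", device)
```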
- COCO
- MPI. Rename to `mpi`
- Flickr30K. Rename to `flickr30k`
- VCR
- ImageNet (ILSVRC2012). Rename to `ImageNet`
- Visual Genome v1.2. Rename to `VG_100K`
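
After downloading and renaming, a small check like the following can confirm the folder layout. This is a sketch only: the `images` root and the `COCO`/`VCR` folder names are assumptions, since the list above only specifies names for the renamed folders.

```python
from pathlib import Path

# Hypothetical images root -- adjust to wherever you keep the datasets.
IMAGES_ROOT = Path("images")

# Expected folder names; COCO and VCR names are assumptions,
# the rest follow the renames described above.
EXPECTED = ["COCO", "mpi", "flickr30k", "VCR", "ImageNet", "VG_100K"]

for name in EXPECTED:
    status = "ok" if (IMAGES_ROOT / name).is_dir() else "MISSING"
    print(f"{name:12s} {status}")
```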
The training and test data (combined for all datasets) can be found here.
The annotations in the format that cococaption expects can be found here. Please place them inside the cococaption folder.
- `train_nlx.py`: script for training only
- `test_datasets.py`: script for validation/testing of all epochs on all 7 NLE tasks
- `clip_model.py`: script for the vision backbone we use (the CLIP visual encoder)
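
As a rough illustration of what the CLIP visual encoder provides, the sketch below extracts image features with the `clip` package. The `ViT-B/16` variant and the `example.jpg` path are assumptions; `clip_model.py` defines the backbone actually used.

```python
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"

# "ViT-B/16" is an assumption; see clip_model.py for the exact variant.
model, preprocess = clip.load("ViT-B/16", device=device)

# Preprocess a sample image and extract its visual embedding.
image = preprocess(Image.open("example.jpg")).unsqueeze(0).to(device)
with torch.no_grad():
    features = model.encode_image(image)

print(features.shape)  # e.g. torch.Size([1, 512]) for ViT-B/16
```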