
Animal-Bench: Benchmarking Multimodal Video Models for Animal-centric Video Understanding

This codebase provides the data and code for our NeurIPS 2024 paper:

Yinuo Jing, Ruxu Zhang, Kongming Liang*, Yongxiang Li, Zhongjiang He, Zhanyu Ma, and Jun Guo, "Animal-Bench: Benchmarking Multimodal Video Models for Animal-centric Video Understanding", in Advances in Neural Information Processing Systems (NeurIPS), 2024.

👀 About Animal-Bench


Previous benchmarks (left) relied on a limited set of agents, and the scenarios in editing-based benchmarks are unrealistic. Our proposed Animal-Bench (right) includes diverse animal agents and various realistic scenarios, encompassing 13 different tasks.

Task Demonstration

(See the task demonstration figure in the repository.)

Effectiveness and robustness evaluation results are shown in the corresponding figures in the repository.

Evaluation on Animal-Bench

Data: You can download the MammalNet, Animal Kingdom, LoTE-Animal, MSRVTT-QA, TGIF-QA, and NExT-QA datasets to obtain the data used in the paper, or you can use your own data.

Annotations: You can find our question-answer pair annotation files in /data.
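
As a rough illustration of how these annotations might be consumed (the file name and field names below are assumptions and may differ from the actual schema under /data), a loading loop could look like this:

```python
import json

# Hypothetical example: the file name and the field names below are
# assumptions; check the actual annotation files under /data for the
# real schema.
with open("data/example_task_qa.json", "r", encoding="utf-8") as f:
    annotations = json.load(f)

for item in annotations:
    video_id = item["video"]         # assumed: identifier of the source video
    question = item["question"]      # assumed: question text
    candidates = item["candidates"]  # assumed: multiple-choice options
    answer = item["answer"]          # assumed: ground-truth option
    print(video_id, question, candidates, answer)
```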

Models: The test code is largely based on MVBench. To evaluate your own model, follow the structure of the example model files in the /model folder.
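
As a minimal sketch of what such a wrapper might look like (the class name and method signatures below are assumptions, not the repository's actual interface; compare with an existing wrapper in /model before use):

```python
from typing import List

class MyModelWrapper:
    """Sketch of a model wrapper in the spirit of the files in /model.
    All names and signatures here are assumptions."""

    def __init__(self, checkpoint_path: str):
        # A real wrapper would load the multimodal video model here.
        self.checkpoint_path = checkpoint_path

    def sample_frames(self, video_path: str, num_frames: int = 8) -> List[str]:
        # Placeholder: a real wrapper would decode the video and sample
        # frames uniformly, e.g., with decord or OpenCV.
        return [f"{video_path}#frame{i}" for i in range(num_frames)]

    def generate(self, video_path: str, question: str) -> str:
        # A real wrapper would build a prompt from the question and the
        # sampled frames, run the model, and return its textual answer.
        frames = self.sample_frames(video_path)
        return f"placeholder answer given {len(frames)} frames"
```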

Acknowledgement

Thanks to the following open-source projects: Chat-UniVi, mPLUG-Owl, Valley, VideoChat, VideoChat2, Video-ChatGPT, Video-LLaMA, Video-LLaVA.
