cv: update video & media models references
afondiel committed Nov 4, 2024
1 parent aedc7f5 commit 6ab7b08
Showing 1 changed file with 10 additions and 0 deletions.
10 changes: 10 additions & 0 deletions computer-vision-notes/video-models/video-media-models-notes.md
@@ -19,6 +19,7 @@
- [References](#references)

## Introduction

Video and media models process and analyze video content to recognize objects, events, and patterns for applications like content recommendation, surveillance, and autonomous vehicles.

### Key Concepts
@@ -128,6 +129,15 @@ cv2.destroyAllWindows()
3. **Related Topics**: NLP integration in video for tasks like transcription and translation.

## References

- [Karpathy, A., et al. "Large-scale Video Classification with Convolutional Neural Networks."](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/42455.pdf)
- [Tran, D., et al. "Learning Spatiotemporal Features with 3D Convolutional Networks."](https://arxiv.org/pdf/1412.0767)


SOTA Video Models:
- [Text-to-Video: The Task, Challenges and the Current State - Alara Dirik (Hugging Face)](https://huggingface.co/blog/text-to-video)
- [Top 10 Multimodal Models - Encord](https://encord.com/blog/top-multimodal-models/)

SOTA Media Models:
- [Movie Gen: A Cast of Media Foundation Models - Meta AI](https://ai.meta.com/static-resource/movie-gen-research-paper)
- [Collection of media models - fal.ai](https://fal.ai/models)
