cv: update video & media models references

afondiel · Nov 4, 2024 · 6ab7b08
1 parent aedc7f5
Showing 1 changed file with 10 additions and 0 deletions.
diff --git a/computer-vision-notes/video-models/video-media-models-notes.md b/computer-vision-notes/video-models/video-media-models-notes.md
@@ -19,6 +19,7 @@
 - [References](#references)
 
 ## Introduction
+
 Video and media models process and analyze video content to recognize objects, events, and patterns for applications like content recommendation, surveillance, and autonomous vehicles.
 
 ### Key Concepts
@@ -128,6 +129,15 @@ cv2.destroyAllWindows()
 3. **Related Topics**: NLP integration in video for tasks like transcription and translation.
 
 ## References
+
 - [Karpathy, A., et al. "Large-scale Video Classification with Convolutional Neural Networks."](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/42455.pdf) 
 - [Tran, D., et al. "Learning Spatiotemporal Features with 3D Convolutional Networks."](https://arxiv.org/pdf/1412.0767)
 
+
+SOTA Video Models:
+- [Text-to-Video: The Task, Challenges and the Current State - Alara Dirik (Hugging Face)](https://huggingface.co/blog/text-to-video)
+- [Top 10 Multimodal Models - Encord](https://encord.com/blog/top-multimodal-models/)
+
+SOTA Media Models:
+- [Movie Gen: A Cast of Media Foundation Models - Meta AI](https://ai.meta.com/static-resource/movie-gen-research-paper)
+- [Collection of media models - fal.ai](https://fal.ai/models)