Stars
LLM and Langchain powered chatbot to handle Google Calendar tasks
SkyReels V1: The first and most advanced open-source human-centric video foundation model
Light-A-Video: Training-free Video Relighting via Progressive Light Fusion
Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"
Inpaint anything using Segment Anything and inpainting models.
LangChain, LangGraph Open Tutorial for everyone!
Based on EcomID, PuLID and InstantID. Swap faces between two photos with high ID fidelity, including hair features.
Align Anything: Training All-modality Model with Feedback
Official implementation for "Stable Flow: Vital Layers for Training-Free Image Editing" [CVPR 2025]
Serverless Machine Learning Course for building AI-enabled Prediction Services from models and features
Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL
Janus-Series: Unified Multimodal Understanding and Generation Models
Frontier Multimodal Foundation Models for Image and Video Understanding
Industry-leading face manipulation platform
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching
Motion-Controllable Video Diffusion via Warped Noise
Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Diffusion Transformer Networks
🔥 Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos
A web app to manage your invoices. (WIP)
DiffuEraser is a diffusion model for video inpainting that achieves strong content completeness and temporal consistency while maintaining acceptable efficiency.
[CVPR2025] We present StableAnimator, the first end-to-end ID-preserving video diffusion framework, which synthesizes high-quality videos without any post-processing, conditioned on a reference ima…
"MiniRAG: Making RAG Simpler with Small and Free Language Models"
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
Use OpenAI's realtime API for chatting with your documents
flathunters / flathunter
Forked from mordax7/flathunter
A bot to help people with their rental real-estate search. 🏠🤖
Curated list of datasets and tools for post-training.