Arslan-Mehmood1

🎯

Focusing

Arslan Mehmood Arslan-Mehmood1

Deep Learning Engineer

9 followers · 15 following

Lists (8)

Face_related

genAI

10 repositories

llm

rag

SOTA

text to music, song

video

vision

Stars

jgordley / GoogleCalendarAssistant

LLM and Langchain powered chatbot to handle Google Calendar tasks

Python 171 30 Updated Dec 21, 2023

KwaiVGI / LivePortrait

Bring portraits to life!

Python 14,186 1,526 Updated Feb 28, 2025

SkyworkAI / SkyReels-V1

SkyReels V1: The first and most advanced open-source human-centric video foundation model

Python 1,582 139 Updated Feb 24, 2025

bcmi / Light-A-Video

Light-A-Video: Training-free Video Relighting via Progressive Light Fusion

Python 352 25 Updated Feb 28, 2025

yangchris11 / samurai

Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"

Python 6,555 418 Updated Feb 18, 2025

geekyutao / Inpaint-Anything

Inpaint anything using Segment Anything and inpainting models.

Jupyter Notebook 6,927 590 Updated Feb 29, 2024

LangChain-OpenTutorial / LangChain-OpenTutorial

LangChain, LangGraph Open Tutorial for everyone!

Jupyter Notebook 410 176 Updated Feb 26, 2025

twlelev / FaceSwap

Based on EcomID, PuLID and InstantID. Swaps faces between two photos with high ID fidelity, including hair features.

Python 14 2 Updated Dec 9, 2024

PKU-Alignment / align-anything

Align Anything: Training All-modality Model with Feedback

Python 2,424 336 Updated Feb 19, 2025

snap-research / stable-flow

Official implementation for "Stable Flow: Vital Layers for Training-Free Image Editing" [CVPR 2025]

Python 293 19 Updated Jan 28, 2025

featurestoreorg / serverless-ml-course

Serverless Machine Learning Course for building AI-enabled Prediction Services from models and features

Jupyter Notebook 556 277 Updated Sep 24, 2024

roboflow / maestro

streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL

Python 2,411 186 Updated Feb 25, 2025

patchy631 / ai-engineering-hub

Jupyter Notebook 2,950 637 Updated Feb 27, 2025

deepseek-ai / Janus

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 16,436 2,160 Updated Feb 1, 2025

DAMO-NLP-SG / VideoLLaMA3

Frontier Multimodal Foundation Models for Image and Video Understanding

Jupyter Notebook 562 35 Updated Feb 24, 2025

s0md3v / roop

one-click face swap

Python 29,370 6,630 Updated Aug 19, 2024

facefusion / facefusion

Industry leading face manipulation platform

Python 21,692 3,278 Updated Feb 25, 2025

declare-lab / TangoFlux

TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching

Jupyter Notebook 661 58 Updated Feb 20, 2025

Eyeline-Research / Go-with-the-Flow

Motion-Controllable Video Diffusion via Warped Noise

Python 780 41 Updated Feb 26, 2025

fudan-generative-vision / hallo3

Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Diffusion Transformer Networks

Python 1,090 150 Updated Feb 27, 2025

magic-research / Sa2VA

🔥 Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

Python 925 59 Updated Feb 25, 2025

jeremyarancio / invoice-reader-app

A web app to manage your invoices. (WIP)

TypeScript 8 1 Updated Feb 13, 2025

lixiaowen-xw / DiffuEraser

DiffuEraser is a diffusion model for video inpainting, which performs great content completeness and temporal consistency while maintaining acceptable efficiency.

Python 313 27 Updated Jan 22, 2025

Francis-Rings / StableAnimator

[CVPR2025] We present StableAnimator, the first end-to-end ID-preserving video diffusion framework, which synthesizes high-quality videos without any post-processing, conditioned on a reference ima…

Python 1,163 67 Updated Feb 27, 2025

HKUDS / MiniRAG

"MiniRAG: Making RAG Simpler with Small and Free Language Models"

Python 788 92 Updated Feb 28, 2025

OpenBMB / MiniCPM-o

MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

Python 18,719 1,338 Updated Feb 21, 2025

actualize-ae / voice-chat-pdf

Forked from run-llama/voice-chat-pdf

Use OpenAI's realtime API for chatting with your documents

TypeScript 237 33 Updated Jan 15, 2025

flathunters / flathunter

Forked from mordax7/flathunter

A bot to help people with their rental real-estate search. 🏠🤖

HTML 892 188 Updated Feb 17, 2025

mlabonne / llm-datasets

Curated list of datasets and tools for post-training.

2,753 236 Updated Jan 29, 2025

PasiKoodaa / Chat-with-Screen

Python 18 1 Updated Sep 28, 2024