TAESDV is a Tiny AutoEncoder for Stable Diffusion Videos. It decodes sequences of Stable Diffusion latents into continuous videos, with much smoother results than single-frame TAESD and within the same tiny runtime budget.
Since TAESDV efficiently supports both parallel and sequential frame decoding, it should be useful for:
- Fast batched previewing for video-generation systems like SVD or AnimateLCM.
- Fast realtime decoding for interactive v2v systems like StreamDiffusion.
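
As a rough illustration of why one decoder can serve both use cases, here is a toy sketch in plain Python. It assumes a decoder whose only temporal dependency is on the previous frame; that assumption, and every name below (`decode_frame`, `decode_parallel`, `StreamingDecoder`), is hypothetical and is not the TAESDV API or architecture. Under that assumption, decoding a whole clip at once and decoding frame by frame with carried state give identical outputs.

```python
from typing import List, Optional

Frame = List[float]  # toy "latent frame": a flat list of floats


def decode_frame(cur: Frame, prev: Optional[Frame]) -> Frame:
    """Toy stand-in for a decoder with one frame of memory.

    Mixes the current frame with the previous one (zeros if there is none).
    """
    if prev is None:
        prev = [0.0] * len(cur)
    return [0.75 * c + 0.25 * p for c, p in zip(cur, prev)]


def decode_parallel(frames: List[Frame]) -> List[Frame]:
    """Decode a whole clip at once by pairing each frame with its predecessor."""
    # Shift the clip by one frame: frame t sees frame t-1, frame 0 sees zeros.
    shifted: List[Optional[Frame]] = [None] + frames[:-1]
    return [decode_frame(c, p) for c, p in zip(frames, shifted)]


class StreamingDecoder:
    """Decode frames one at a time, carrying the previous frame as state."""

    def __init__(self) -> None:
        self.prev: Optional[Frame] = None

    def step(self, frame: Frame) -> Frame:
        out = decode_frame(frame, self.prev)
        self.prev = frame  # remember this frame for the next call
        return out


latents = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

batched = decode_parallel(latents)  # batched previewing

stream = StreamingDecoder()  # realtime, one frame per call
streamed = [stream.step(f) for f in latents]

assert batched == streamed
```

The batched path suits previewing a full generated clip, while the streaming path suits interactive pipelines that produce one latent frame at a time.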
| Original Video | TAESD Encode, TAESD Decode | TAESD Encode, TAESDV Decode |
|---|---|---|
Note: there are still lots of TODOs:
- Add StreamDiffusion or other v2v example
- Add performance metrics (performance is roughly the same as TAESD)
- Better / more example videos
- Add to Diffusers somehow?
- Even better checkpoint?
See the AnimateLCM previewing example, which visualizes a TAESDV preview after each generation step.
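
The per-step previewing pattern can be sketched generically as below. This is not the code from the AnimateLCM example: `run_with_previews`, `denoise_step`, and `preview_decode` are hypothetical names, and the toy functions stand in for a real sampler step and a real fast decoder so that the sketch runs without any model weights.

```python
from typing import Callable, List

Latents = List[float]  # toy stand-in for a latent video tensor


def run_with_previews(
    latents: Latents,
    num_steps: int,
    denoise_step: Callable[[Latents, int], Latents],
    preview_decode: Callable[[Latents], Latents],
) -> List[Latents]:
    """Run a sampling loop and decode a cheap preview after every step."""
    previews = []
    for step in range(num_steps):
        latents = denoise_step(latents, step)      # one generation step
        previews.append(preview_decode(latents))   # fast preview of the current state
    return previews


# Toy stand-ins (assumptions, not real model calls).
def toy_denoise(latents: Latents, step: int) -> Latents:
    return [x * 0.5 for x in latents]


def toy_decode(latents: Latents) -> Latents:
    return [x + 1.0 for x in latents]


previews = run_with_previews([8.0, 4.0], 3, toy_denoise, toy_decode)
```

Because the preview decoder runs once per step, this pattern is only practical when decoding is cheap relative to a generation step.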