Yesterday, Stability AI announced the release of Stable Video Diffusion, a major leap forward for open-source video synthesis.
This new system builds on the image synthesis capabilities of Stable Diffusion to generate a high-quality, coherent video from a single still image. The model is trained to take one image as input and produce a 14- or 25-frame video (depending on the checkpoint variant) that continues the scene or action depicted in that seed image. While it was trained on 576x1024-resolution data, it can extrapolate to other dimensions, much as Stable Diffusion can.
Stable Video Diffusion is now available for researchers and developers to deploy in their own applications, an exciting advancement for video creation and editing.
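To make that concrete, here is a minimal sketch of image-to-video generation, assuming you use the Hugging Face diffusers integration (the StableVideoDiffusionPipeline class) with the 25-frame stabilityai/stable-video-diffusion-img2vid-xt checkpoint; the input file name is a placeholder, and a CUDA GPU with enough memory for the fp16 weights is assumed:

```python
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

# Load the 25-frame "XT" variant in half precision; a 14-frame
# checkpoint (stable-video-diffusion-img2vid) is also available.
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16,
    variant="fp16",
)
pipe.to("cuda")

# The seed image; resize to the model's native 1024x576 (width x height).
image = load_image("seed_image.png")  # hypothetical input file
image = image.resize((1024, 576))

# Generate the video frames and write them out as an MP4.
generator = torch.manual_seed(42)
frames = pipe(image, decode_chunk_size=8, generator=generator).frames[0]
export_to_video(frames, "generated.mp4", fps=7)
```

Lowering decode_chunk_size trades generation speed for memory when decoding the latent frames, which can help on smaller GPUs.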
The possibilities are endless, but some potential use cases include:
Stock footage: animate still stock photos to create customizable b-roll and background videos
Personal photos: make an old family photo come alive by having the AI generate a short video clip from it
Presentations: supplement slides and images with AI-generated video clips to hold audience attention
Film previsualization: storyboard scenes of your film or video project with AI-generated video clips before shooting
And so much more.
The key benefit over previous open video AI models is that Stability AI's new model produces coherent, high-quality results that look believable while matching the performance of closed-source alternatives. In qualitative evaluations, Stability AI found Stable Video Diffusion to surpass the leading video generation models currently available.
The model can also be fine-tuned on downstream data for a wide range of tasks, such as multi-view synthesis from a single image.
For those looking to leverage video synthesis capabilities in their products, experiments, or workflows, this announcement opens up an exciting new opportunity. Contact us to deploy Stable Video Diffusion onto a performant, custom instance in a single click.