In stark contrast to text-to-image generative AIs, there’s almost nothing available for video. But that may soon change as startup company Runway has recently revealed its new AI model: Gen-2.
Functioning similarly to Stable Diffusion (which Runway had a hand in creating, by the way), Gen-2 takes in text prompts to create videos from scratch. As seen on the developer’s website, you can create aerial footage of a mountain range or a sunset outside a New York City loft. A text-to-video upgrade may not sound all that impressive at first, but it is when you compare it to Runway’s previous endeavor.
Back in February, the developer launched its Gen-1 model, which was more of a video editor. It required some kind of base footage, like an unfinished 3D animation or a recording of a person, which the model would then overlay with AI-created video. The old AI couldn’t create anything from scratch.
Fans of the old model will be able to continue enjoying Gen-1, as its features will become separate modes in Gen-2.
Mode 01, however, is the main text-to-video feature. The second new mode allows you to add an image to a text prompt to produce better results. And with the third mode, you just upload an image to generate a video; a text prompt isn’t required.
Everything beyond Mode 03 comes from Gen-1. Mode 04: Stylization applies the “styles of any image prompt to every frame of your video,” like adding a fiery effect. Mode 05: Storyboard turns mockup footage into AI-rendered video. Next is Mask, which isolates subjects and modifies them with simple prompts like, “Add spots to a labrador to create a dalmatian.” Seventh is Render, where the AI generates a video over a 3D render. The last one, Customization, does the same thing as Render, but with people.
This technology is still in its early stages. The previews from the demo reel are rather strange looking, to say the least. They’re deep into the uncanny valley as buildings melt into one another and people sport vacant stares. Even so, the possibility of having a publicly available text-to-video generative AI is exciting. It can open up new avenues for creativity (or misinformation). Some tech giants, such as Google with its Imagen Video project, have dabbled in AI video before, but those models are still behind closed doors.
Some reports claim there’s a waitlist for early access to Gen-2 on Runway’s private Discord channel. However, the only beta we found is for Gen-1. It’s possible there will be a Gen-2 beta later in the year, although there’s no official word at the moment. In the meantime, you can join the Discord channel for updates through Runway’s website.