D-ID Integrates Stable Diffusion and GPT-3 Into Creative Video Builder
Animate virtual humans or custom avatars
D-ID announced earlier this month that it had integrated GPT-3 and Stable Diffusion into its Creative Reality Studio. These features mean that GPT-3 can help complete your script, and you can even create a custom avatar that D-ID will animate. This enables creators and marketers to generate customized short videos hosted by virtual humans.
Watch the promo video for the GPT-3 and Stable Diffusion features.
GPT-3 Script Writer
The first feature I tested was the GPT-3 script writing assistant. While the earlier version of the studio web app enabled you to type in a script for the virtual human to perform, there is now a “magic wand” option backed by GPT-3. This works much like earlier third-party GPT-3 integrations. You write a sentence or two, and GPT-3 completes the script. If you don’t like the result, you can delete the generated text, click the wand again, and it will write something different.
I chose a voice in the newscaster style and one of the stock avatars. The virtual human rendering may not be as smooth as some other technologies, but it is more lifelike than many competitors. Also, this is not a virtual human built using CGI. D-ID is animating 2D images instead of building an avatar from the ground up.
Some of the technology on display is an extension of Deep Nostalgia, the product D-ID built with MyHeritage. Through that product, D-ID has animated more than 100 million images. This also creates the opportunity to animate portraits in real time.
Stable Diffusion Avatar Creator
The next test was creating an avatar using Stable Diffusion. I typed in a short phrase asking for a university professor's portrait. The feature works much like other Stable Diffusion implementations and returned three options. I chose one and then returned to the magic wand to get a GPT-3-generated script.
This was impressive. Rendering didn’t seem to take much longer than with the stock character. D-ID’s image animation technology appears to be improving quickly too. The head and mouth movements are not perfect, but the rendering of the performance is high quality.
BTW - I left the script as is, even though it includes some errors related to the Shakespearean stories.
Look for More AI Combinations in 2023
GPT-3 and Stable Diffusion are impressive tools on their own. However, there is often added value when you combine text and image generation technologies. I expect most of the advances in 2023 will be refinements of single-mode use cases: text only, image only, and so on. Still, we are going to see more solutions like D-ID that combine several tools into one recipe, making it easier to build a variety of synthetic media content quickly.
You can try D-ID’s Creative Reality Studio for free at the company website. Users get 20 tokens to start. I used 12 tokens to create the two videos above.