3 Themes from the Synthedia 2 Conference
From ChatGPT to virtual newscasters, the industry is growing fast
Synthedia 2, the second edition of the synthetic media industry’s first conference series, centered on themes and trends that have dominated the second half of 2022. The event included real-world use cases in action, experts using the technology in new ways, and initial results from some of Synthedia’s evaluations of AI-based text and image generators.
Many of the sessions will be published over the next several weeks on the synthetic media playlist of Voicebot’s YouTube channel. While you wait for those videos, I wanted to offer a glimpse of three themes we saw across the event.
Synthetic Media Used in Production
We are by now accustomed to seeing impressive synthetic media demos. However, one question has always lingered: how long will it take for these novelties to become production-ready solutions used daily by businesses and professionals? The wait is over.
Marc Scarpa, CEO of DeFiance Media, showcased the company’s virtual human newscaster, Roxanna. He estimates that Roxanna has appeared in over 1,500 news segments since debuting in 2021. DeFiance created Roxanna as a custom virtual human avatar based on a real-life model and has built an image around her much as you would for a human news anchor. You can see Roxanna read the news five times per day on DeFiance.
Heise Online’s managing editor for new media, Hannah Moderkamp, reviewed the publisher’s success using a voice clone of a popular podcast host to double the output of its podcast news show. The solution led to a rise in downloads of 37% in the first week of use and has continued to deliver added reach.
Raphael Schaad of the AI Training Institut talked about what he has learned from co-authoring a book with GPT-3. We discussed his process and enhanced productivity, along with how ChatGPT and the recent introduction of GPT-3.5 have led him to begin rewriting the book with the more powerful models.
Copilots Make Experts More Productive
A key trend in synthetic media is the rise of the copilot application to augment the performance of experts. GitHub Copilot was one of the first examples in this category to take off. It supports software developers in writing code. Synthedia showcased several other copilot-oriented solutions.
Altered AI CEO Ioannis Agiomyrgiannakis showcased the company’s studio and voice editor products. They enable professional voiceover artists and sound engineers in gaming and entertainment to rapidly prototype alternative voices, accents, and styles, supporting both hyper-automation and hyper-creation benefits. Altered’s synthetic voice product can increase voice actors’ range of variability by 8x and enable sound and character designers to experiment with many more voice options than have been available in the past.
Scott Stevenson, CEO of Rally Legal, showed off Spellbook. The GPT-3-based solution helps lawyers draft contracts in real time. It will even analyze existing contracts to provide summaries, suggest changes, and highlight points for negotiation. He told Synthedia that his company is onboarding 35 law firms per week for the new product, but the backlog suggests they could do three times that rate and still not keep up with demand.
We also heard from artist Michael Pierre Price. He was commissioned to create 1,001 AI-generated artworks on Midjourney for a non-profit’s charity auction. An accomplished artist, Price was able to use the AI tool to generate a wide variety of artworks in novel styles that aligned with the non-profit’s mission. This was a big effort that took a few months to complete but likely would have taken decades had he painted all of the works himself.
Generative Variability and Rapid Improvement
Futurist Bakz T. Future joined to offer a generative AI state of the industry report. We discussed the key events in 2022 that were milestones for the adoption of AI-based text and image generators. He weighed in on ChatGPT and how it was the logical evolution of InstructGPT, which debuted in February. He also offered his predictions for 2023. His expectations include important product launches in AI-generated music and video, along with even more impressive text and image generation models.
Voicebot.ai’s Eric Schwartz and Andrew Herndon weighed in on what they’ve learned from analyzing more than 20 text and image generation AI models. Schwartz showed several examples of AI-generated poems and short essays created by different solutions. The contrasting quality of the output was notable. There was also one model that refused to accept the premise of his prompt and, instead of offering an argument as requested, argued against the idea. AI21 and three flavors of OpenAI’s large language models were demonstrated, including ChatGPT.
Andrew Herndon showed off the wide variability you see from AI image generators. While it is true that these solutions produce amazing images, Herndon emphasized that you may have to kiss a lot of frogs before finding a prince. Most of the AI-generated images are likely to be mediocre to bad, and you have to stick with a deliberate process to produce gems. He also offered specific insights related to DALL-E, Midjourney, and Stable Diffusion.
You can find out more about the presenters and access links to their products here. Videos of the sessions will be coming soon.
This was a great way to wrap up 2022, which might just be the year of synthetic media. However, as much as we saw amazing developments in synthetic media this year, 2023 is shaping up to be even bigger.