One of the first Synthedia newsletter posts waaaaay back in August of this year was headlined: AI Image Generation is Having a Moment. That comment seems awfully quaint about now. Text-to-image generators have had a Cambrian explosion over the past four months.
Right on cue, the Google Play Best Overall App for 2022 is AI-based text-to-image generator Dream by WOMBO. Google wrote in a blog post yesterday:
Every year we recognize the best apps and games on Google Play and the developers who bring them to life. Their bold ideas and creativity help us reach people across all kinds of devices — from phones and watches to Chromebooks and tablets…
And like every year, 2022’s winners reflect what matters most to people right now — ranging from games that help us escape reality and enter a whole new world, to apps that help us stay grounded and present.
Interestingly, Google’s choice of the best overall game matched the users’ choice: Apex Legends Mobile. For the best overall app, Google chose Dream, and the users chose BeReal. Google seems to have a sense of which trend may be longer lasting. That is not to say BeReal isn’t enormously popular. It just gives you the sense that it will burn out. Generative AI is here to stay.
WOMBO is About AI for Fun
WOMBO launched its first app, a deepfake lip-sync app, in 2021. This year it brought out Dream with its Stable Diffusion text-to-image model. The app has over 100 million downloads and nearly 500,000 reviews with a 4.4 aggregate star rating.
Many people look at text-to-image generators and think they are about creating art or are an alternative to stock images. This is true for some users. However, WOMBO understands that the dominant use case today is for entertainment. Users will spend hours creating different images and tweaking their prompts. For some, it becomes a full-blown hobby. They watch videos on how to write better prompts. They post their creations regularly on social media. They talk to others about the different image generators.
There is a sense of discovery every time you write a prompt and press enter. After diffusion, dream is the most popular word used to name text-to-image generators. This might be a tip-off on the entertainment angle. Dreams are basically entertainment while you are asleep.
Generative AI as Entertainment
Granted, not all generative AI is entertainment. AI-based text generators have some entertaining qualities, but they tend to be used more as utilities from what we can discern so far. But image generators are clearly fun.
We already know that the biggest players in text-to-image generation are moving quickly to improve the quality of their model output. If we assume that text-to-image generators can be entertainment, then it may point to some other features on the horizon. Many are using Discord today to build a community. You should expect to see some in-app features for social connections in 2023. Think about how the “game” Draw Something morphed into Draw Something with Friends. It’s fun generating new images but more fun to do it with friends.
I would not expect to see this from OpenAI in support of DALL-E. The OpenAI culture is more about technical prowess and quality than fun. DALL-E will be popular in the professional apps used by artists and with a few purists who look for quality over everything else.
WOMBO and a few other Stable Diffusion-based solutions lean in the fun direction. Midjourney may wind up somewhere in between. I suspect it will also be among the favorites of skilled artists. The greater control it offers over prompt engineering is appealing to people who actually know what they are doing.
This market is moving so quickly that everyone is still trying to find their way. Some will wind up in the utility category, some in entertainment, and some elsewhere. Google, for its part, has its own in-house text-to-image and text-to-video solutions and recognizes how important the market will be. Kudos also go out to Google for highlighting an early mover in this space that is focused on consumer ease of use and fun over elaborate displays of AI engineering prowess.