Image credit: Bret Kinsella via Stability AI / Stable Diffusion
If you want to know what is hot, follow the money. In this case, it is another eye-popping early funding round. Stability AI, the company behind the Stable Diffusion text-to-image generator, announced $101 million in new funding. Venture capital firms Lightspeed Venture Partners and Coatue led the round.
The Open Source Diffusion
Stable Diffusion adoption has been torrid since its introduction in August, and the valuation reflects that growth. Nearly all of the text-to-image generators appear to be popular to varying degrees. However, the Stable Diffusion model has a twist. Stability AI provides a managed Stable Diffusion model service, which was used to create the featured image for this post. There is also an open source model that anyone can use.
That open source option makes Stable Diffusion quite different from its competitors. Most other text-to-image generators hold their models as closely guarded secrets without sharing information about the weights or code. Stable Diffusion was created by a group at the Ludwig Maximilian University of Munich (LMU) and published as an open source model. Stability AI created a partnership to provide an application layer on top of it.
However, this is an open source model, so everyone can use it. The result has been many independent deployments of the software. Stable Diffusion is also now an option when using the Night Cafe image-generation service.
A Permissive Model
DALL-E, Midjourney, and other text-to-image generators have specific restrictions on how they can be used. There are also typically limitations placed on using the generated images for commercial applications. Stability AI’s Stable Diffusion service enforces similar restrictions. The Stable Diffusion open source license also contains restrictions such as those highlighted below.
You agree not to use the Model or Derivatives of the Model:
In any way that violates any applicable national, federal, state, local or international law or regulation;
For the purpose of exploiting, harming or attempting to exploit or harm minors in any way;
To generate or disseminate verifiably false information and/or content with the purpose of harming others;
To generate or disseminate personal identifiable information that can be used to harm an individual;
To defame, disparage or otherwise harass others;
…
The list goes on, and it all seems reasonable. However, there is no way for the creators of Stable Diffusion to monitor or audit uses of the model when it runs in private instances. Enforcement could only arise from having access to every instance of the model or from a user volunteering the information. Even then, it is likely impractical to have the LMU CompVis group enforce these provisions and others. The complaint about Stable Diffusion is just this: What will it be used for, and can you restrict nefarious content generation? The answer is generally no.
Voicebot reports that more than 200,000 developers have downloaded Stable Diffusion since August. That is a lot of licenses to track and audit. It is not likely to happen at any meaningful scale.
This is different for DALL-E, Midjourney, and crew. They run the servers that generate the images and the text boxes where prompts are entered. That means these companies can proactively restrict certain types of image generation and have an audit trail of all activity.
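To make the contrast concrete, here is a minimal Python sketch of how a hosted service could screen prompts and keep an audit trail before any image is generated. It is purely illustrative: the HostedGenerator class, the BLOCKED_TERMS list, and the log fields are hypothetical names invented for this example, not the actual implementation of DALL-E, Midjourney, or any other service, which presumably rely on far more sophisticated filtering.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical blocklist for illustration only; a real service would
# likely use trained classifiers rather than simple substring matching.
BLOCKED_TERMS = {"gore", "celebrity deepfake"}


@dataclass
class HostedGenerator:
    """Toy stand-in for a hosted text-to-image service."""

    audit_log: list = field(default_factory=list)

    def submit(self, user_id: str, prompt: str) -> bool:
        """Record the request and return True if the prompt is allowed."""
        lowered = prompt.lower()
        allowed = not any(term in lowered for term in BLOCKED_TERMS)
        # Every request is logged, whether or not it is allowed,
        # which is what gives the operator an audit trail.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user_id,
            "prompt": prompt,
            "allowed": allowed,
        })
        return allowed
```

Because the operator controls the server, both the check and the log sit in the only path to the model. Someone running a downloaded copy of an open source model on private hardware never passes through a gate like this, which is why license terms alone are hard to enforce.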
Stable Diffusion is also more permissive in terms of ownership rights compared to other solutions. Users have full rights to their created images as long as they don’t infringe upon rights associated with trademarks, copyright, and name, image, and likeness (NIL). A key benefit is that this includes commercial rights.
Shepherding Product Market Fit
Over a million users have signed up for Stability AI’s DreamStudio beta access. The company says those users have created more than 170 million images in just a few months. That growth trajectory, along with the intention to add text, audio, and video generators, drove the oversubscribed funding round.
This is not the only big funding round in the text-to-x space. Last week we reported that Jasper AI landed a $125 million Series A round of funding. Seemingly as a jest, Stability AI called its $101 million funding a seed round. An inside joke, I’m sure.
Interestingly, Jasper raised that mega-round after introducing a text-to-image generator based on OpenAI’s DALL-E to complement its enormously successful text-to-text generator that leverages another OpenAI product, GPT-3. It looks like Jasper and Stability are heading directly into each other’s markets while employing different underlying solutions. You have a battle shaping up with contrasts in the technology platforms, application layers, and bankrolls.