Shutterstock Launches Text-to-Image Generator Based on OpenAI's DALL-E
Says it will compensate artists and offers an unusual pricing plan.
Shutterstock yesterday launched the text-to-image generation service it first announced in October. The solution is powered by OpenAI’s DALL-E API, making Shutterstock one of the few companies that did not choose Stable Diffusion for its AI image generator. Paul Hennessy, Chief Executive Officer at Shutterstock, commented:
"Our easy-to-use generative platform will transform the way people tell their stories — you no longer have to be a design expert or have access to a creative team to create exceptional work. Our tools are built on an ethical approach and on a library of assets that represents the diverse world we live in, and we ensure that the artists whose works contributed to the development of these models are recognized and rewarded."
Artist Recognition and Rewards
The last part about artist recognition is somewhat vague. With the phrase “ensure that the artists whose works contributed to the development of these models are recognized and rewarded,” Shutterstock appears to intend to compensate artists whose work was included in the training data behind the AI image generation model.
Hennessy did not say that creating new art with the generator would be rewarded. That would be easier, because the artist using the AI generator could be easily tied to the output. Works that contributed to model development should include the artwork used to train the models. But how will Shutterstock identify them?
A study published in December by researchers from the University of Maryland and NYU showed how Stable Diffusion could recreate near replicas of images in its training data. However, it is not clear this is true for DALL-E, and it is also often unclear who owns rights to the originals. Shutterstock shed a little more light on this topic in a blog post yesterday.
“The AI image generator, at the heart of this partnership, was built using millions of diverse, responsibly sourced visuals. As part of our commitment to responsible AI, we compensate known contributors to AI for licensing revenue associated with AI-generated images on our platform.”
Here we see that the intent is to “compensate known contributors.” It is unclear what a “known contributor” is and what percentage of the training corpus they represent. It is also unclear how much compensation is likely to flow to artists. My prediction is that the figure will be small.
I bring up this point first because it is the most provocative element of the announcement. Many other websites have already deployed image generators, but no one has figured out how to tie new images back to the original artists’ works and create a compensation system.
If Shutterstock has a solution for this fully baked, that is a significant evolution in the market. Of course, they may just be indicating this is their intent without clarity on how, when, or if this might come to fruition. If it does, the solution almost certainly will be a feature of DALL-E.
A Win for DALL-E
The other notable element of the announcement is the selection of OpenAI’s DALL-E as the technology provider. When it comes to AI image generation models, every announcement we have covered since August, except this month’s regarding Microsoft, has identified Stable Diffusion as the technology choice. Some of those companies have deployed their own version of the open source Stable Diffusion model, while others have connected to Stability AI’s APIs for the model.
It appears a key driver for Shutterstock’s decision was the safety filters that OpenAI used in training DALL-E. The blog post mentioned “responsibly sourced visuals,” and the press release says “responsibly-produced generative AI capabilities.” Of course, if OpenAI is now saying they have a way to identify artists and provide a way to trace new images back to source data, that may be a new selling point.
Stability AI also has some “trust and safety” features. Its documentation mentions this, saying, “The gRPC response will contain a finish_reason specifying the outcome of your request in addition to the delivered asset. If the finish_reason is filter, this means our safety filter has been activated and the resulting image will be blurred. This is by design.” This is one of the reasons some companies choose the Stability AI API service over deploying their own instance of Stable Diffusion and then developing their own filters.
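To make the quoted behavior concrete, here is a minimal Python sketch of how a client might drop filtered results. It does not use the real Stability AI SDK: the Artifact class, the usable_images helper, and the "success" value are hypothetical stand-ins, and only the finish_reason field and the filter outcome come from the documentation quoted above.

```python
from dataclasses import dataclass


@dataclass
class Artifact:
    """Hypothetical stand-in for one generated asset in an API response."""
    finish_reason: str  # "filter" is the outcome named in the quoted docs
    image_bytes: bytes


def usable_images(artifacts):
    """Return image data only for artifacts the safety filter did not flag."""
    kept = []
    for artifact in artifacts:
        if artifact.finish_reason == "filter":
            # Per the quoted docs, a filtered image comes back blurred,
            # so a client would typically skip it or ask for a new prompt.
            continue
        kept.append(artifact.image_bytes)
    return kept
```

A company running its own Stable Diffusion instance would have to build both the filter and this kind of handling itself.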
It is not entirely clear how these filters differ. Regardless, Shutterstock seems to be a meaningful marketplace win for DALL-E.
Paywall From the Start
Shutterstock’s new “AI Generator” feature is not much different from the text box you get with OpenAI’s DALL-E web app. Just type your text and click generate. However, whereas DALL-E and many of the other text-to-image generation solutions enable users to create a few images before signing up and paying, Shutterstock is more aggressive about user conversions.
You can generate images, but if you want to download them, you must create an account and sign up for a monthly recurring fee that will cover a set number of images. The monthly subscription packages are:
$29 per month for up to 10 images giving you a rate of $2.90 per image
$59 per month for up to 25 images giving you a rate of $2.36 per image
$169 per month for up to 350 images giving you a rate of $0.48 per image
Of course, if you don’t create all of those images in a month, your cost per image just rose. And no matter how you display the pricing, this is expensive. Midjourney pricing will get you as low as $0.05 per image for around $10 per month. Midjourney’s $30 per month package will cost far less per image if you are a heavy user because it offers unlimited image generation. DALL-E costs roughly $0.10 to $0.15 per image. If you use the feature in Canva, it is basically free.
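The per-image arithmetic is easy to check, including how quickly the rate rises when a plan goes partly unused. The short Python sketch below uses the three Shutterstock plans listed above. The plan labels (small, medium, large) and function names are my own, and the half-usage scenario is an illustrative assumption.

```python
# Plans as listed above: label -> (monthly price in USD, included images).
# The labels are illustrative, not Shutterstock's plan names.
PLANS = {
    "small": (29, 10),
    "medium": (59, 25),
    "large": (169, 350),
}


def effective_rate(plan, images_used):
    """Cost per image actually downloaded in one month, rounded to cents."""
    price, included = PLANS[plan]
    if images_used <= 0:
        raise ValueError("images_used must be positive")
    # Usage beyond the plan allowance is not modeled here.
    used = min(images_used, included)
    return round(price / used, 2)


if __name__ == "__main__":
    for name, (price, included) in PLANS.items():
        full = effective_rate(name, included)
        half = effective_rate(name, max(1, included // 2))
        print(f"{name}: ${full:.2f} at full use, ${half:.2f} at half use")
```

Using only half of the allowance roughly doubles the effective rate, which widens the gap with the alternatives further.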
Shutterstock must be hoping to fleece its existing corporate customer base. Granted, the company must be worried about cannibalizing its existing stock image sales revenue. This is a classic innovator’s dilemma. It probably looked at costs for OpenAI and the likely cannibalization of existing revenue and then decided the right pricing was something like 3-10x the market clearing price for these services today. That doesn’t strike me as a recipe for success, but it will be interesting to see how the story unfolds.
I was never a big fan of stock image sites, but I used them sometimes as we need images frequently for our work. About six months ago, we stopped using paid stock image sites and switched to using text-to-image generators. I am happier with the output and save time and money.
This may fall short of an existential crisis for stock image marketplaces, but you can see how the erosion of their business has begun. The high pricing for Shutterstock’s text-to-image feature is likely to drive users to other services that are more affordable and change their habits. Disruption can be hard to navigate. Sometimes you take a wrong turn.