OpenAI Makes the GPT-4, DALL-E, Whisper, and ChatGPT Model APIs Generally Available
Some other models are also heading to deprecation
There has been a lot of interest among software developers in API access for GPT-4, DALL-E for image generation, Whisper for speech recognition, and GPT-3.5-turbo for ChatGPT-style solutions. Many have been able to secure access but often after long delays. Each of these APIs is now generally available for any developer to use.
Removing Barriers to Adoption
While it appears that OpenAI’s large language model (LLM) APIs are the most popular among developers today, the gated access to GPT-4 has surely led many to try other, more readily available proprietary and open-source options. The shift to general availability will make it easier for developers to start with OpenAI offerings and compare them directly to alternatives, including earlier GPT-3.5 OpenAI models.
The restricted approach has likely hurt DALL-E and Whisper adoption even more. DALL-E appears to have fallen well behind both open-source Stable Diffusion deployments and Stability AI’s hosted API in usage. The lower cost and greater control of running a raw Stable Diffusion model were always going to be an incentive for some developers. However, many developers willingly pay Stability AI for API access because it is readily available and DALL-E is not.
Opening DALL-E for general availability is long overdue. I suspect it would have come sooner if ChatGPT had not been such a runaway hit and consumed so many internal resource cycles. The move also paves the way for a DALL-E 3 launch (the current version is essentially DALL-E 2), which could arrive with some restrictions while the most mature model version remains openly accessible.
Whisper’s popularity also took OpenAI by surprise after its open-source release in the fall of 2022. Its capabilities led to a lot of experimentation and to other companies offering hosted versions of the speech recognition model. Looking to capture some of the benefits of that interest, OpenAI announced a limited-access API in March of this year, and now it is available to anyone.
The Whisper API could turn out to be OpenAI’s biggest revenue source outside of its LLMs and the ChatGPT service. Proprietary speech recognition and translation models offered by leading cloud providers are widely used. These solutions are also expensive and often inferior to Whisper in performance. Whisper could set a new de facto standard in the industry, particularly because of its broad language support.
An API Future
ChatGPT raised awareness of OpenAI to unexpected heights. However, the company is built to be a generative AI foundation model provider, not an end-user application developer. ChatGPT is a side hustle with great marketing benefits.
The future of OpenAI will be driven by API adoption. Reducing barriers to adoption is critical as competitors are starting to step up in hopes of claiming the spot as the leading OpenAI alternative. The unrestricted access to OpenAI’s leading models for language generation, image generation, speech recognition, and chat will make it a little harder for competitors to gain market traction.
Hey Bret, thank you for this update. We need to host an LLM and a voice recognition service off the internet, in a private location. What companies or open-source projects are available for that deployment model? Thanks!