OpenAI Is Looking to Make Its Own Chips for AI Training and Inference
Cost savings, flexibility, and potential conflict with Microsoft
It is no secret that NVIDIA GPUs are the favored computing chips for generative AI training and inference workloads, thanks to their efficiency and ability to scale. However, it is also no secret that demand for these chips far exceeds available supply. That imbalance is driving up both the price of acquiring chips and the cost of running workloads on cloud providers that offer GPU access.
Sam Altman, OpenAI’s CEO, expressed frustration about the chip shortage in a talk with developers in May of this year. In a blog post that has since been removed but was preserved by the Internet Archive, Humanloop’s Raza Habib commented:
A common theme that came up throughout the discussion was that currently OpenAI is extremely GPU-limited and this is delaying a lot of their short-term plans. The biggest customer complaint was about the reliability and speed of the API. Sam acknowledged their concern and explained that most of the issue was a result of GPU shortages.
He also said that the fine-tuning API and dedicated user capacity were GPU-limited. The scarcity is a key reason why dedicated capacity required a $100,000 up-front commitment at the time. CNBC reported that OpenAI is considering developing its own chips to alleviate this situation. According to CNBC’s reporting:
CEO Sam Altman has made the acquisition of more AI chips a top priority for the company. He has publicly complained about the scarcity of graphics processing units, a market dominated by Nvidia, which controls more than 80% of the global market for the chips best suited to run AI applications…The company has not yet decided to move ahead, according to recent internal discussions described to Reuters.
Vertical Integration
If OpenAI were to develop chips to run its AI models, the step toward vertical integration would not be unusual. Google introduced its first Tensor Processing Unit (TPU) in 2016, and subsequent generations have been optimized to run machine learning workloads. Amazon has created its Trainium and Inferentia ASICs for AI workloads. Meta has its Meta Training and Inference Accelerator (MTIA) chips. And NVIDIA not only sells the industry’s most popular GPUs for training generative AI models; it also offers its own foundation models and a cloud hosting service.
But wait, there’s more. OpenAI’s largest investor, Microsoft, has also developed a chip for AI workloads. The Information reported earlier this year:
The software giant has been developing the chip, internally code-named Athena, since as early as 2019, according to two people with direct knowledge of the project. The chips are already available to a small group of Microsoft and OpenAI employees, who are testing the technology, one of them said. Microsoft is hoping the chip will perform better than what it currently buys from other vendors, saving it time and money on its costly AI efforts.
OpenAI and Microsoft have a close relationship, as the latter purportedly owns 49% of the former. For OpenAI to consider building its own chip may signal that the company is not impressed with Athena’s performance. It could also be driven by the need to reduce reliance on chip makers in general, no matter how close the relationship. Nevertheless, there is the potential for conflict between the organizations over this topic.
Lower Cost and Better Access to Supply
All of the biggest players in generative AI would like to reduce their dependence on NVIDIA. Doing so would save costs and provide more flexibility in meeting demand. It is logical that OpenAI would want the same flexibility, and it is probably the only pure-play generative AI foundation model provider with the market position and resources to consider this option. CNBC elaborated on the economic proposition:
Since 2020, OpenAI has developed its generative artificial intelligence technologies on a massive supercomputer constructed by Microsoft, one of its largest backers, that uses 10,000 of Nvidia’s graphics processing units (GPUs).
Running ChatGPT is very expensive for the company. Each query costs roughly 4 cents, according to an analysis from Bernstein analyst Stacy Rasgon. If ChatGPT queries grow to a tenth the scale of Google search, it would require roughly $48.1 billion worth of GPUs initially and about $16 billion worth of chips a year to keep operational.
Four cents per query is an enormous cost at any significant scale. Even lower estimates of one cent per query imply massive costs. So, anything that brings that cost down flows directly to the company’s operating margin. This is a key reason the cloud hyperscalers are investing so heavily in AI chips.
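To make the arithmetic concrete, here is a minimal back-of-the-envelope sketch in Python. The Google query volume is an assumed, commonly cited public estimate, and the per-query costs are the figures discussed above; none of these numbers come from Bernstein’s underlying model:

```python
# Back-of-the-envelope ChatGPT serving-cost model (illustrative assumptions only).
# Google handles roughly 8.5 billion searches per day, a commonly cited public
# estimate; "a tenth the scale of Google search" is therefore ~850M queries/day.

GOOGLE_QUERIES_PER_DAY = 8.5e9  # assumed figure, not from Bernstein's analysis
SCALE_FRACTION = 0.10           # "a tenth the scale of Google search"
DAYS_PER_YEAR = 365

def annual_serving_cost(cost_per_query: float) -> float:
    """Annual inference cost in dollars at the assumed query volume."""
    daily_queries = GOOGLE_QUERIES_PER_DAY * SCALE_FRACTION
    return daily_queries * cost_per_query * DAYS_PER_YEAR

# The 4-cent and 1-cent per-query estimates discussed above
for cost in (0.04, 0.01):
    print(f"${cost:.2f}/query -> ${annual_serving_cost(cost) / 1e9:.1f}B per year")
```

Under these assumptions, four cents per query works out to roughly $12 billion per year in serving costs alone, the same order of magnitude as the $16 billion annual chip figure CNBC cited. That is why even a fraction of a cent saved per query matters at this scale.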
However, the flexibility to respond to demand by controlling its own chip supply may be just as important. OpenAI has a significant market share lead today, and customers may be forced to consider competing foundation models if it cannot meet demand due to a chip shortage. Habits are being formed right now, and OpenAI wants developers to form those habits around solutions optimized for the GPT model family.
Funding for an Acquisition?
Synthedia recently reported that OpenAI was looking at an $80-$90 billion valuation for a private sale of some employee shares. The practical reason is that OpenAI is unlikely to be listed as a public company anytime soon, and a private sale would enable long-time employees to get some liquidity for their shares. At the same time, investors who would like to hold OpenAI stock get an opportunity to buy in.
Another consideration is that any stock sale today would set a new valuation for the company, which was purportedly worth $29 billion when Microsoft made a $10 billion investment earlier this year. Acquiring a chip company could help OpenAI get to market faster, but it would be costly. That is all the more reason to have a significant valuation when raising funds for an acquisition.
The fact that OpenAI may have this option at all is evidence of its market power and momentum. There are many foundation model providers, but in large language models (LLMs) today, the others are largely viewed as alternatives to OpenAI.