Amazon CEO Lays Out Generative AI Strategy, Says Everyone in the Company Is Using the Tech
Rohit Prasad also moves from Alexa to heading up generative AI / AGI initiatives
Amazon was slower than its rivals to go all-in on generative AI, but it now appears the largest battleship in cloud computing is heading squarely in that direction. Executive realignment and increased investment are clear evidence of this change.
Rohit Prasad, the long-time head scientist for Alexa and the product unit's business leader since last year, was recently named senior vice president and head scientist of AGI, the acronym for artificial general intelligence. Some view generative AI as a step toward achieving humanlike intelligence or the more advanced concept of superintelligence.
Andy Jassy, Amazon’s CEO, spoke about the company’s perspective on generative AI in yesterday’s quarterly earnings call, and much of it is worth quoting at length. Jassy said in his prepared remarks:
Generative AI has captured people's imagination, but most people are talking about the application layer, specifically what OpenAI has done with ChatGPT. It's important to remember that we're in the very early days of the adoption and success of generative AI, and that consumer applications [are] only one layer of the opportunity.
…
Inside Amazon, every one of our teams is working on building generative AI applications that reinvent and enhance their customers' experience. But while we will build a number of these applications ourselves, most will be built by other companies, and we're optimistic that the largest number of these will be built on AWS. Remember, the core of AI is data. People want to bring generative AI models to the data, not the other way around.
The Online Store for Generative AI Models
Jassy is expressing two core ideas here. One is that the generative AI infrastructure layer is as critical as the application layer. The other is that most companies will not build their own foundation model; instead, they will start with a pre-trained foundation model and refine it with their own data to optimize it for a specific use case. In that scenario, AWS hosts a variety of generative AI foundation models to offer maximum user flexibility and to facilitate competition on price and performance.
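To make that pattern concrete, here is a minimal sketch of refining a hosted base model with proprietary data, assuming Bedrock's model-customization API in boto3; the job names, base model identifier, IAM role ARN, S3 paths, and hyperparameters are all placeholders, not recommendations.

```python
import boto3

# Control-plane Bedrock client (model management), distinct from the
# "bedrock-runtime" client used to invoke models.
bedrock = boto3.client("bedrock", region_name="us-east-1")

# Start a fine-tuning job that brings a pre-trained foundation model to
# your data: training examples stay in your S3 bucket, and the output is
# a private custom model rather than a contribution to the general model.
response = bedrock.create_model_customization_job(
    jobName="demo-tuning-job",          # hypothetical names and ARNs
    customModelName="demo-tuned-titan",
    roleArn="arn:aws:iam::123456789012:role/BedrockTuningRole",
    baseModelIdentifier="amazon.titan-text-express-v1",
    trainingDataConfig={"s3Uri": "s3://my-bucket/train.jsonl"},
    outputDataConfig={"s3Uri": "s3://my-bucket/output/"},
    hyperParameters={"epochCount": "2", "batchSize": "1",
                     "learningRate": "0.00001"},
)
print(response["jobArn"])  # poll get_model_customization_job for status
```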
From a strategy standpoint, Microsoft Azure has OpenAI’s GPT suite and is currently the preferred host for Meta’s Llama 2 instances. Google Cloud has PaLM and, soon, the Gemini models to offer, along with some third-party proprietary models. AWS is offering the “everything else” option.
This is also likely to be true for text-to-image and text-to-video foundation models. Azure will favor DALL-E and perhaps some new Meta offerings, while Google wants to steer users to Imagen. It’s not that Azure and Google Cloud won’t offer alternatives. Rather, many of those alternative model makers will likely see AWS as a level playing field and steer their customers there first unless Azure or Google Cloud offers them favorable economic terms. Plus, many companies already have deep investments in AWS, and using the models it offers will be more convenient.
You should also expect AWS to become the preferred cloud provider for most open-source generative AI models, except possibly Llama. Open-source projects will flock to the “everything else” option, where they don’t have to compete directly with OpenAI’s offerings.
The Three Layers of Generative AI
Beyond the generative AI model store concept, Jassy also went into detail about Amazon’s view of the three layers of generative AI technology. He starts with the compute layer: the hardware that runs generative AI model training and inference. A key point here is that Amazon provides NVIDIA GPUs but also hopes to drive more use of its own custom chips as a lower-cost option.
We think of large language models in generative AI as having 3 key layers, all of which are very large in our opinion and all of which AWS is investing heavily in. At the lowest layer is the compute required to train foundational models and do inference or make predictions. Customers are excited by Amazon EC2 P5 instances powered by NVIDIA H100 GPUs to train large models and develop generative AI applications. However, to date, there's only been one viable option in the market for everybody and supply has been scarce.
That, along with the chip expertise we've built over the last several years, prompted us to start working several years ago on our own custom AI chips for training called Trainium and inference called Inferentia that are on their second versions already and are a very appealing price performance option for customers building and running large language models. We're optimistic that a lot of large language model training and inference will be run on AWS' Trainium and Inferentia chips in the future.
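As a rough illustration of what running on those chips looks like, here is a minimal sketch of compiling a PyTorch model for inference on an Inferentia2 (Inf2) instance, assuming the AWS Neuron SDK's torch_neuronx package; the toy model and shapes are purely illustrative.

```python
import torch
import torch_neuronx  # AWS Neuron SDK's PyTorch integration (Inf2/Trn1)

# Stand-in model; in practice this would be a transformer or other large
# model whose inference cost you want to lower.
model = torch.nn.Sequential(
    torch.nn.Linear(128, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
).eval()

example_input = torch.rand(1, 128)

# Ahead-of-time compile the model for the NeuronCores on the instance.
neuron_model = torch_neuronx.trace(model, example_input)

# The compiled artifact can be saved and reloaded with torch.jit.load,
# so serving does not pay the compilation cost again.
torch.jit.save(neuron_model, "model_neuron.pt")

output = neuron_model(example_input)  # runs on Neuron, not a GPU
print(output.shape)
```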
The second layer is the model layer, where AWS provides easy access to generative AI foundation models; its pitch in this category is selection and convenience. You can’t access OpenAI’s GPT-3/GPT-4 models or Google’s PaLM, but you can now use models from AI21 Labs, Anthropic, Cohere, Stability AI, and Amazon’s own Titan LLM.
We think of the middle layer as being large language models as a service. Stepping back for a second, to develop these large language models, it takes billions of dollars and multiple years to develop. Most companies tell us that they don't want to consume that resource building themselves. Rather, they want access to those large language models, want to customize them with their own data without leaking their proprietary data into the general model, have all the security, privacy and platform features in AWS work with this new enhanced model and then have it all wrapped in a managed service.
This is what our service Bedrock does and offers customers all of these aforementioned capabilities with not just one large language model but with access to models from multiple leading large language model companies like Anthropic, Stability AI, AI21 Labs, Cohere and Amazon's own developed large language models called Titan. Customers, including Bridgewater Associates, Coda, Lonely Planet, Omnicom, 3M, Ryanair, Showpad and Travelers are using Amazon Bedrock to create generative AI applications. And we just recently announced new capabilities from Bedrock, including new models from Cohere, Anthropic's Claude 2 and Stability AI's Stable Diffusion XL 1.0 as well as agents for Amazon Bedrock that allow customers to create conversational agents to deliver personalized up-to-date answers based on their proprietary data and to execute actions.
If you think about these first 2 layers I've talked about, what we're doing is democratizing access to generative AI, lowering the cost of training and running models, enabling access to large language model of choice instead of there only being one option, making it simpler for companies of all sizes and technical acumen to customize their own large language model and build generative AI applications in a secure and enterprise-grade fashion, these are all part of making generative AI accessible to everybody and very much what AWS has been doing for technology infrastructure over the last 17 years.
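For a sense of what that middle, models-as-a-service layer looks like to a developer, here is a minimal sketch of invoking a hosted model through the Bedrock runtime API with boto3; the model identifier and request body follow Claude 2's conventions on Bedrock, and other models use different schemas.

```python
import json
import boto3

# Runtime client for invoking foundation models hosted by Bedrock.
runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# Claude's Bedrock schema wraps the request in a Human/Assistant prompt.
body = json.dumps({
    "prompt": "\n\nHuman: Summarize the three layers of generative AI."
              "\n\nAssistant:",
    "max_tokens_to_sample": 300,
})

response = runtime.invoke_model(
    modelId="anthropic.claude-v2",   # identifiers vary by model provider
    contentType="application/json",
    accept="application/json",
    body=body,
)

# The response body is a JSON stream; Claude returns its generated text
# in the "completion" field.
result = json.loads(response["body"].read())
print(result["completion"])
```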
ChatGPT is an example of the application layer. That layer is less of a focus for Jassy today, but we will likely see more of it when generative AI arrives in Alexa and other Amazon applications.
Traction and Strategy
Microsoft announced in May that 4,500 companies were using Azure OpenAI Service, including Volvo, DocuSign, and IKEA. AWS is not claiming that many users but did highlight Bridgewater Associates, 3M, Ryanair, and Travelers, among others. So, you can see Amazon’s strategy coming together, and the company is already executing. Amazon has also claimed over 100,000 AI computing customers on AWS, so it should have a large installed base to shift toward the new models.
Amazon’s strategy may not be as well-crafted as Microsoft’s today, but it is logical, tactical, and has a good chance of succeeding. The “everything else” and “most convenient” positions could emerge as popular buying criteria. Also, if an open-source model other than Meta’s Llama becomes a market leader, AWS will have a good chance of becoming its preferred hosting provider. AWS’s key challenges:
OpenAI has dominant generative AI mindshare and market share today, and AWS does not offer access to those models.
Open-source foundation models are likely to be very popular. However, Meta’s Llama 2 is the model with the most interest today and is launching first on Azure. Meta said Llama 2 will come to AWS, but the timing is unclear.
Six months ago, Amazon was talking about the exciting possibilities presented by generative AI but also emphasized where Alexa might be better than ChatGPT, as if attempting to change the subject.
Four months ago, Amazon introduced the Bedrock generative AI service in AWS. Two months ago, Amazon announced a $100 million investment in a Generative AI Innovation Center. One month ago, Jassy said AWS was avoiding the Microsoft and Google hype cycle and focusing on the substance cycle for generative AI. He also indicated that every part of Amazon was working on or with generative AI. This week, he outlined the company’s broader strategy in more detail.
Amazon’s leadership is smart. The company had a large language model for internal use long before ChatGPT launched, something Synthedia had already heard about. Amazon also recognized the importance of the ChatGPT moment not long after it arrived.
However, earlier this year, the company was still absorbing the changes brought on by the November 2022 layoffs, and the executive team clearly wanted to buy some time to put its strategy in place. You should expect more emphasis on, and more acceleration around, generative AI from Amazon over the next six months.
Google famously issued a “code red” after the launch of ChatGPT to focus the company on bringing its formidable generative AI portfolio to market quickly. Amazon had its own all-hands-on-deck initiative that was less publicized (or simply not leaked) than Google’s. We are starting to see how that is playing out now.
Remember that the generative AI wars are, at one important level, just proxy battles in the overall cloud computing wars. The foundation model makers may be the rockstars, but the cloud providers supply the key venues where they perform. And, of course, to carry this analogy to its logical conclusion, NVIDIA makes all of the instruments. 😀