Microsoft is Hedging its Generative AI Bets with the Databricks and Meta Llama 2 Deals
Databricks has a large role to play in generative AI
While some news organizations would have you believe that Microsoft Azure’s deal with Databricks threatens OpenAI, it is exactly what you would expect from a cloud provider. OpenAI is Microsoft’s biggest generative AI bet, but it is far from the only horse in its stable. Cloud customers expect a choice of technology vendors, so Azure was never going to be exclusive to OpenAI.
Sam Altman even said in January that the Azure relationship would not be exclusive for OpenAI, but he did stress what a good partner Microsoft had been. That included technical support beyond the piles of investment dollars.
Microsoft has long hosted NVIDIA NeMo foundation models, and more recently made a big splash as Meta’s leading partner for hosting the open-source Llama 2 models. More options are on the way. The Information reported last week that:
Microsoft plans to start selling a new version of Databricks’ software that helps customers make AI apps for their businesses, according to three people with direct knowledge of the plans. The Databricks software, which Microsoft would sell through its Azure cloud-server unit, helps companies make AI models from scratch or repurpose open-source models as an alternative to licensing OpenAI’s proprietary ones.
Databricks ML on Azure
In the first week of August, new Azure Databricks documentation showed off the latest generative AI services available to Azure users. Databricks Machine Learning includes several services you might expect from the company, including data exploration and development. There is also a section dedicated to LLMs and generative AI, which mostly focuses on using Databricks data services with existing models but could also facilitate new model adoption:
Use Databricks for LLMs and generative AI
Databricks Runtime for Machine Learning includes libraries like Hugging Face Transformers and LangChain that allow you to integrate existing pre-trained models or other open-source libraries into your workflow. The Databricks MLflow integration makes it easy to use the MLflow tracking service with transformer pipelines, models, and processing components. In addition, you can integrate OpenAI models or solutions from partners like John Snow Labs in your Databricks workflows.
With Azure Databricks, you can customize a LLM on your data for your specific task. With the support of open source tooling, such as Hugging Face and DeepSpeed, you can efficiently take a foundation LLM and start training with your own data to have more accuracy for your domain and workload.
In addition, Azure Databricks provides AI functions that SQL data analysts can use to access LLM models, including from OpenAI, directly within their data pipelines and workflows. See AI Functions on Azure Databricks.
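To make the quoted workflow concrete, here is a minimal, hypothetical Python sketch of the pattern the documentation describes: reuse an existing open-source Hugging Face pipeline and track it with the MLflow transformers integration. The model name, parameters, and example strings are illustrative assumptions, not anything Databricks or Microsoft have published.

```python
# Hypothetical sketch of the workflow described above: reuse an open-source
# Hugging Face pipeline and track it with MLflow's transformers flavor.
# The model name and example text are illustrative, not from the article.
import mlflow
from transformers import pipeline

# Reuse an existing open-source model instead of training one from scratch
summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

# Log the pipeline so it can be registered, versioned, and served later
# from a Databricks workspace (or any MLflow tracking server)
with mlflow.start_run():
    model_info = mlflow.transformers.log_model(
        transformers_model=summarizer,
        artifact_path="summarizer",
        input_example="Databricks and Microsoft are expanding their Azure partnership.",
    )

# Reload the logged model elsewhere in a pipeline or job
loaded = mlflow.transformers.load_model(model_info.model_uri)
print(loaded("Azure now hosts several foundation model options alongside OpenAI's."))
```

On Azure Databricks, a model tracked this way could then be surfaced to SQL analysts or app builders, which is roughly the gap the AI Functions and the planned chatbot aim to close.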
The Information’s sources suggest this will expand substantially in the coming months. That seems logical, given Databricks’ acquisition of MosaicML earlier this year for $1.3 billion. The company will want to drive revenue to justify that investment, and it is also a strong advocate of open-source solutions, which contrasts sharply with OpenAI’s proprietary approach. The Information added:
In a touch of irony, Microsoft is using OpenAI’s technology to create a ChatGPT-like chatbot to help less tech-savvy customers use Databricks’ software, which was originally developed for sophisticated data scientists. The net result could be that some Microsoft customers end up using open-source models rather than OpenAI’s closed-source ones.
The planned Azure-Databricks service underscores the overlapping nature of AI software tools and partnerships between companies that are trying to commercialize the latest advances in the field.
…
In the new Azure-Databricks service, the embedded chatbot aims to help customers automate a lot of the highly technical work that typically goes into parsing large data sets and customizing open-source AI models, according to people familiar with the plans. Customers would be able to explain to the chatbot, in plainer terms, what kind of app they want to build so it could perform those tasks.
Microsoft has already started demonstrating the service to some Azure customers, according to someone with direct knowledge of the discussions, and could announce it publicly in the coming months.
Google Cloud and AWS Also Have Multiple Bets
Google has its own set of generative AI tools but hosts several foundation models on its service. The company has also invested hundreds of millions of dollars in rival LLM developer Anthropic. AWS has its Titan LLM and hosts models from a variety of third-party providers.
So, I am not surprised the larger Databricks deal is in the works. Databricks’ data management software is already important tooling for the generative AI industry, and the company will provide foundation models as well as a way to get more people to use that software. These will be offered alongside OpenAI’s models on Azure, and likely on the other leading cloud platforms as well.
We may see exclusive deals emerge in generative AI over the next couple of years, but for the most part, expect everyone to play nicely together for the time being. This ensures they will have the right relationships in place, no matter who the technology winners turn out to be.