Mistral Introduces Two New High-Performance LLMs and Neither Is Open-Source
Key milestones in the bid to become the "OpenAI of Europe"
There is a lot to unpack in the new AI foundation models Mistral Large and Mistral Small. While Mistral’s earlier models debuted with open-source licenses, the new releases are proprietary, support English, French, German, Italian, and Spanish (in addition to other languages), and are available as hosted endpoints directly from the company and through Microsoft Azure.
Notably, Mistral has not disclosed the number of parameters in either model, nor has it indicated how many tokens they were trained on. It did release benchmark data comparing the Large and Small models to previously released models from Mistral, Meta, Anthropic, and OpenAI. Mistral is not content to compete in the small model or open-source categories. The new models show the company’s intent to compete directly in the proprietary frontier model category, currently led by OpenAI with Anthropic and Google close behind.
The timing is also significant given Google’s recent issues surrounding the Gemini Pro models and the generation of inappropriate and inaccurate images and text responses. Many enterprises would like to have an alternative to OpenAI, and Mistral is now a strong candidate for that position. Since tens of thousands of companies are already using OpenAI through Azure, trying Mistral there will be an easy option for them.
Performance
Mistral released extensive benchmark data for the Large model. It also provided data showing the new Small model outperforms the Mixtral 8x7B model released in December 2023. The Large model is stronger than GPT-3.5 in each of the benchmarks cited, beats all models on Arc C 5-shot and TruthfulQA, falls just short of Anthropic’s Claude 2 on TriviaQA, and trails only GPT-4 on MMLU, HellaSwag, WinoGrande, and Arc C 25-shot.
The company also showed data suggesting that Mistral Large beats Meta’s Llama 2 on the Arc C, HellaSwag, and MMLU benchmarks in French, German, Spanish, and Italian. According to the announcement:
Mistral Large achieves strong results on commonly used benchmarks, making it the world's second-ranked model generally available through an API (next to GPT-4).
The models include a 32k context window, function calling, and optional moderation policies. Mistral Large is priced well above GPT-3.5 Turbo and just under GPT-4 Turbo.
Microsoft Partnership
Mistral also announced a “multi-year partnership” with Microsoft. Mistral’s large language models (LLMs) will be available through Azure AI Studio’s Models as a Service (MaaS) offering. According to Microsoft Azure’s Eric Boyd:
We are excited to announce Mistral AI’s flagship commercial model, Mistral Large, available first on Azure AI and the Mistral AI platform, marking a noteworthy expansion of our offerings.
This is another win for Microsoft, given the strong interest in Mistral and the hopes expressed by many that it becomes the “OpenAI of Europe.” Microsoft had already been working on introducing OpenAI alternatives into Azure, such as Mistral and Microsoft’s Phi, before the near-meltdown of OpenAI in November 2023 caused by infighting on the company’s board. While that situation was amicably resolved and the commotion barely slowed OpenAI, it did cause everyone, including Microsoft, to consider contingency plans should OpenAI have a more significant issue in the future.
Of course, this is not an exclusive relationship. You can access the models directly from Mistral through the company’s newly launched la Plateforme, which the company says is “safely hosted on Mistral’s infrastructure in Europe.” You may also note that Microsoft’s announcement says the models are available “first on Azure AI.” Expect them to appear on AWS and/or Google Cloud in the near future.
The Alternative Has Arrived
Anthropic was the key proprietary model alternative to OpenAI through much of 2023. Llama 2 from Meta was the leading open-source model in terms of developer attention. Gemini 1.0 and 1.5 make a strong case that Google should be considered a key alternative in 2024. Mistral is the other company drawing attention. The performance of the Large model, along with competitive pricing and European provenance, is likely to win over some large corporate users.