Mistral Has a New 176B Parameter LLM and a Number of New Customers
Snowflake Copilot built on Mistral Large
Mistral made a new large language model (LLM) available through a torrent link earlier this week, and it is now available through Hugging Face as well. Mixtral 8x22B uses a mixture of experts (MoE) architecture similar to the Mixtral 8x7B model that debuted in December. However, the new model is larger in both parameter count and context window size.
The new model boasts 176B parameters, though the MoE approach means that only two of the eight 22B "expert" models are active for any given token, so only about 44B parameters are used at inference time. The significance on that front is inference efficiency, which translates into cost and latency savings.
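For readers unfamiliar with how a model can have 176B total parameters but only around 44B active ones, here is a minimal, illustrative PyTorch sketch of top-2 expert routing. It is not Mistral's implementation; the layer sizes, names, and expert design are placeholders chosen to show the routing idea.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoELayer(nn.Module):
    """Minimal sparse MoE layer: a router picks 2 of n_experts feed-forward
    blocks per token, so only those experts' weights are exercised."""

    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                      # x: (tokens, d_model)
        logits = self.router(x)                # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # normalize over the 2 chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in range(len(self.experts)):
                mask = idx[:, k] == e          # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * self.experts[e](x[mask])
        return out

tokens = torch.randn(4, 512)                   # 4 tokens of hidden size 512
layer = Top2MoELayer()
print(layer(tokens).shape)                     # torch.Size([4, 512])
```

The point of the sketch: all eight experts' weights exist in memory (the "176B" side of the claim), but each token's forward pass touches only two of them (the "44B active" side), which is where the cost and latency savings come from.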
The 65,000 token context window is likely to accommodate about 50,000 words. That is more than double the size of the 32,000 token context window for Mixtral 8x7B.
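The token-to-word conversion is only a rule of thumb. Assuming roughly 0.75 English words per token (a common approximation; the real ratio depends on the tokenizer and the text), the arithmetic works out as follows:

```python
# Rough rule of thumb for English text: ~0.75 words per token (an assumption;
# the actual ratio varies by tokenizer and content).
WORDS_PER_TOKEN = 0.75

for model, context_tokens in [("Mixtral 8x22B", 65_000), ("Mixtral 8x7B", 32_000)]:
    approx_words = int(context_tokens * WORDS_PER_TOKEN)
    print(f"{model}: {context_tokens:,} tokens ~= {approx_words:,} words")

# Mixtral 8x22B: 65,000 tokens ~= 48,750 words
# Mixtral 8x7B: 32,000 tokens ~= 24,000 words
```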
While Mistral Large launched on Azure and Mistral's La Plateforme last month as a proprietary model, Mixtral 8x22B is an open-source model released under an Apache 2.0 license. The new model is the largest of Mistral's open-source models and is likely similar in size to Mistral Medium, a proprietary model with restricted availability.
Mistral Lands Snowflake and Other Customers
Unrelated to the new model, Mistral has also started to list its customers, including BNP Paribas, Orange, MongoDB, Brave, and others. Joining those named customers is Snowflake, which today announced its new Snowflake Copilot, a text-to-SQL generative AI solution. Built on Mistral Large, it enables users to generate SQL queries from natural language requests, which is designed to make the process faster and enable more seamless data exploration.
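Snowflake has not published Copilot's internals, but as a rough illustration of what a text-to-SQL call against Mistral Large can look like, here is a sketch using Mistral's public chat completions endpoint. The table schema, prompt, and question are made up for the example and are not Snowflake's.

```python
import os
import requests

# Illustrative text-to-SQL request to Mistral's chat completions API.
# This is NOT Snowflake Copilot's implementation; schema and prompt are placeholders.
SCHEMA = "orders(order_id, customer_id, order_date, total_usd)"
QUESTION = "What was total revenue per month in 2023?"

resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "mistral-large-latest",
        "messages": [
            {"role": "system",
             "content": f"You translate questions into SQL for this schema: {SCHEMA}. "
                        "Return only the SQL query."},
            {"role": "user", "content": QUESTION},
        ],
    },
    timeout=30,
)
print(resp.json()["choices"][0]["message"]["content"])
```

A production copilot would layer schema retrieval, query validation, and execution on top of a call like this; the sketch only shows the generation step.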
Overall, we are seeing a lot of momentum from Mistral, both in terms of model output and customer activity. It has quickly become a clear rival to Meta as a leader in open-source foundation models and has generated a lot of enthusiasm among developers. Mistral's European roots also appear to be an asset when you consider its customer base.
We have yet to see public benchmark data for Mixtral 8x22B, but that is likely to arrive soon, and given the performance of other Mistral models, it is a good bet the numbers will be competitive. With strong models from Meta, X.ai, Databricks, and Mistral, companies that prefer open-source models have solid alternatives to OpenAI, Anthropic, and Google.
Grok-1.5 Closes Gap with OpenAI, Google, and Anthropic, Aces Long Context Window Retrieval
X.ai announced the Grok-1.5 large language model (LLM) Thursday, and it reflects a significant performance and feature upgrade over the now open-source Grok-1 model. Caveats aside about what benchmark results LLM developers choose to present, MMLU, MATH, GSM8K, and HumanEval results rose from Grok-1 to Grok-1.5, from 73% to 81.3%, 23.9% to 50.6%, 62.9% …
Elon Musk's X.ai in Talks to Raise $3B at a Valuation Matching Anthropic
An SEC filing in December 2023 indicated X.ai had raised $135 million of a planned funding round of up to $1 billion. There was no reported valuation for that initial funding round. The Wall Street Journal reported yesterday that early investors in SpaceX and Tesla are considering investment in a new $3 billion funding round that values the company at a…
Mistral Introduces Two New High Performance LLMs and Neither Are Open-Source
There is a lot to unpack in the new AI foundation models Mistral Large and Mistral Small. While Mistral’s earlier models debuted with open-source licenses, the new releases are…