Meta Just Became a Big Player in the LLM World by Making Llama 2 Free and Open Source
A deal with Microsoft Azure to host Llama models was also highlighted
Meta today announced the availability of three open-source Llama 2 large language models (LLMs). All are free for both commercial and research use, though with some licensing restrictions (see below). The models come in 7B, 13B, and 70B parameter sizes, and they now occupy the top spots on Hugging Face’s Open LLM leaderboard. Each was trained on 2 trillion tokens of data.
Versions of Meta’s restricted-access Llama models already held many of the top spots on the leaderboard, including GPlatty, which is based on Alpaca, which is, in turn, based on Llama. In recent weeks, however, some Llama 1 models had been surpassed by other open-source LLMs.
The Falcon-40b LLM from the Technology Innovation Institute originally knocked Llama 1 models out of the top spots in May. In its announcement, Meta also showed off a number of other benchmarks where Llama 2 outshines competitors.
Available Through Azure, AWS, and Hugging Face
Meta’s announcement also played up its new, expanded relationship with Microsoft. Although Llama 2 will also be available on AWS, Hugging Face, and through other providers, Microsoft earned the status of “preferred partner.” According to the announcement and a Facebook post by Mark Zuckerberg:
Microsoft as our preferred partner for Llama 2 and expanding our efforts in generative AI. Starting today, Llama 2 will be available in the Azure AI model catalog, enabling developers using Microsoft Azure to build with it and leverage their cloud-native tools for content filtering and safety features. It is also optimized to run locally on Windows, giving developers a seamless workflow as they bring generative AI experiences to customers across different platforms. Llama 2 will be available through Amazon Web Services (AWS), Hugging Face, and other providers too.
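If you just want to try the models, the Hugging Face route is the most direct. Below is a minimal sketch using the Transformers library; it assumes you have accepted Meta’s license on the meta-llama model page, are logged in locally (e.g., via huggingface-cli login), and have enough GPU memory for the 7B chat variant.

```python
# Minimal sketch: loading the Llama 2 7B chat model from Hugging Face.
# Requires transformers and accelerate (for device_map="auto"), plus
# approved access to the meta-llama repository.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"  # 7B chat variant on Hugging Face

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Explain Llama 2's license restrictions in plain English."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```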
More Data = Better Performance
A key ingredient in Llama 2’s performance gains appears to be training data: Meta used 40% more than for Llama 1. In addition, the chat models benefited from over 1 million human ratings collected as part of the reinforcement learning from human feedback (RLHF) process. The context window also doubled to 4,096 tokens.
Each of these changes likely contributed to the improvement. OpenAI CEO Sam Altman has hinted that data curation and expansion were key drivers of the jump from GPT-3 to GPT-4.
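To see what the 4,096-token window means in practice, here is a rough sketch that checks whether a prompt, plus the planned generation length, fits before you send it to the model. It uses the same Hugging Face tokenizer as above and is an illustration, not part of Meta’s tooling.

```python
# Rough sketch: budgeting a prompt against Llama 2's 4,096-token context
# window. The window covers the prompt plus generated tokens, so leave
# headroom for the response.
from transformers import AutoTokenizer

MAX_CONTEXT = 4096  # doubled from Llama 1's 2,048

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

def fits_in_context(prompt: str, max_new_tokens: int = 512) -> bool:
    """Return True if prompt + planned generation fits in the context window."""
    n_prompt_tokens = len(tokenizer(prompt)["input_ids"])
    return n_prompt_tokens + max_new_tokens <= MAX_CONTEXT

print(fits_in_context("Summarize the Llama 2 license in one paragraph."))
```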
Free But with a Catch
TII originally offered Falcon-40b as open source, free for non-commercial use, with a royalty fee for commercial use. Meta is forgoing any fees, even for commercial use, provided the user complies with several restrictions.
I reviewed the license and acceptable use terms so you don’t have to. Some key elements include:
1. License Rights and Redistribution.
…
iv. Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the Llama Materials (available at https://ai.meta.com/llama/use-policy), which is hereby incorporated by reference into this Agreement.
v. You will not use the Llama Materials or any output or results of the Llama Materials to improve any other large language model (excluding Llama 2 or derivative works thereof).
…
2. Additional Commercial Terms. If, on the Llama 2 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights.
3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN “AS IS” BASIS, WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS.
4. Limitation of Liability. IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING.
5. Intellectual Property.
…
b. Subject to Meta’s ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications.
c. If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 2 outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials.
…
Acceptable Use Policy
…
3. Intentionally deceive or mislead others, including use of Llama 2 related to the following:
a. Generating, promoting, or furthering fraud or the creation or promotion of disinformation
b. Generating, promoting, or furthering defamatory content, including the creation of defamatory statements, images, or other content
Here are my key takeaways from the licensing agreement above:
1. iv - You must comply with the acceptable use policy.
1. v - You cannot use Llama 2 to improve another LLM.
2. - If your products or services exceed 700 million monthly active users, you must request a separate license from Meta, which it may grant at its sole discretion.
3. - There is no warranty. Use at your own risk.
4. - Meta is not liable. Use at your own risk.
5. b - You own the changes you make to Llama. 🎉
5. c - If you sue Meta claiming that Llama infringes your intellectual property, your license terminates.
Here are the takeaways from the acceptable use policy:
3. a - This seems like a reasonable request, but it does not stipulate who determines what is “disinformation.”
3. b - This also seems reasonable, but it does not stipulate who determines what is “defamatory.”
Your lawyer may come up with other comments, but for the most part, the licensing terms are not extraordinary. The key risks are that you are the liable party if the model behaves in an unexpected way, and that Meta could decide, at its own discretion, that you have violated the acceptable use policy.
Who Should Be Concerned about Llama?
Every LLM foundation model provider should be concerned about Meta and Llama 2 to some degree. However, this may impact some LLM makers more than others.
I suspect OpenAI will not be impacted much in the near term, as its customers are not interested in an open-source model today. The same is likely true for Google’s PaLM and Gemini. Their customers want a company standing behind the model and its upgrades. Anthropic may also be in the clear, since Llama 2’s 4k context window is far smaller than Claude’s 100k. Of course, the impact of these factors may change over time. Interconnect notes:
The base model seems very strong (beyond GPT3) and the fine-tuned chat models seem on the same level as ChatGPT. It is a huge leap forward for open-source, and a huge blow to the closed-source providers, as using this model will offer way more customizability and way lower cost for most companies.
Every other model provider lining up to be the open-source LLM leader, or even a proprietary alternative to OpenAI, must immediately contend with the “free,” high-performance offering from Meta. This is particularly true for sub-100-billion-parameter LLMs.
The biggest winner may be Microsoft. It will still promote the proprietary OpenAI models and now has a powerful open-source alternative. Both require substantial cloud computing resources for training and inference.