The UAE's TII Released the Falcon 2 LLM as a Challenge to Meta's Llama 3
TII also released a multimodal version of the model
The Technology Innovation Institute (TII) released 40B and 180B parameter large language models as open models in 2023. The 40B edition quickly vaulted to the top of the Hugging Face Open LLM Leaderboard in May 2023. In September, TII followed up with the far larger 180B model, reporting results that compared favorably with Google’s PaLM model. That was a somewhat selective comparison, but the point was that TII could match Google.
Last week, TII introduced the Falcon 2 model series and said it compared favorably to Meta’s Llama 3. That is a tall order because Llama 3 is well ahead of the old PaLM models and is particularly strong in the small and mid-sized model segments. The Falcon 2 models introduced thus far are the 11B and the 11B VLM, which is a vision-to-language model. TII commented in the announcement:
“We are really excited about Falcon 2 11B VLM – it enables the seamless conversion of visual inputs into textual outputs. While both models are multilingual, notably, Falcon 2 11B VLM stands out as TII's first multimodal model – and the only one currently in the top tier market that has this image-to-text conversion capability, marking a significant advancement in AI innovation.”
The 11B variant was trained on 5.5 trillion tokens, as, presumably, was the VLM edition. However, the VLM may have had additional training and certainly underwent a different post-training and fine-tuning regimen.
Performance
Model developers are very selective about whom they benchmark against. Very often we see a different set of benchmark figures and different model comparisons across model versions. This release is no exception. Still, the comparison is useful, given that Llama 3 8B made such a big impression when it was introduced. Meta CEO Mark Zuckerberg highlighted that it performed comparably to the Llama 2 70B edition.
Credit TII, though, for reporting the key benchmark tests from the Open LLM Leaderboard, which give Falcon 2 11B a combined score of 64.28 compared to 62.55 for Llama 3 8B. It performs a bit better than Llama 3 8B in five of the six benchmarks. Granted, it is about one-third larger, though it was trained on fewer tokens than the more than 15 trillion Meta reported for Llama 3. The takeaway is that Falcon 2 11B is a competitive model.
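As a quick check on those figures, the arithmetic can be sketched in a few lines of Python. The scores are the ones cited in this article. The variable names are purely illustrative, and the sizes are the nominal 11B and 8B figures rather than exact parameter counts.

```python
# Figures cited in the article; sizes are nominal, not exact parameter counts.
falcon2_score = 64.28   # Falcon 2 11B, combined Open LLM Leaderboard score
llama3_score = 62.55    # Llama 3 8B, same set of benchmarks
falcon2_params_b = 11   # billions of parameters (nominal)
llama3_params_b = 8     # billions of parameters (nominal)

# Absolute gap in the combined score, in leaderboard points.
score_gap = falcon2_score - llama3_score

# How much larger Falcon 2 11B is relative to Llama 3 8B.
size_ratio = falcon2_params_b / llama3_params_b

print(f"Score gap: {score_gap:.2f} points")
print(f"Relative size: {size_ratio:.3f}x ({size_ratio - 1:.1%} larger)")
```

On the nominal sizes, Falcon 2 11B works out to 37.5% larger, slightly more than one-third, for a lead of under two points on the combined score.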
A Multi-Modal Model
TII also stressed the multilingual and multi-modal capabilities of the new model. It supports English, French, Spanish, German, Portuguese, and “various other languages.” A notable omission is Arabic. Given that Falcon originates in the UAE, it would be logical for Arabic to be listed as one of its supported languages, but that may be something to look for in the future.
The 11B VLM’s multi-modal capabilities appear limited today to interpreting images. Unfortunately, the announcement offered few details on this topic, but it did include an example, shown above. While many models are now capable of generating images, image interpretation has wider applicability. TII’s announcement added:
“Falcon 2 11B VLM, a vision-to-language model also has the capability to identify and interpret images and visuals from the environment, providing a wide range of applications across industries such as healthcare, finance, e-commerce, education, and legal sectors.
“These applications range from document management, digital archiving, and context indexing to supporting individuals with visual impairments.”
Open Model, Anyone?
Falcon 2 is appropriately compared with Meta’s Llama 3, in part, because both are open models. Neither is precisely open source, as both impose restrictions beyond those of traditional open-source licenses. However, both are open access in that the restrictions apply to certain use cases and use profiles rather than to who can use them.
TII likes to state that Falcon 2 licensing “is, in part, based on the Apache License Version 2.0.” This is largely true for the 11B licensing. There are only four “acceptable use” policy restrictions and there is no mention of commercial restrictions.
Some of the other TII models have more onerous commercial restrictions, but the organization seems to be most permissive with its smaller models. This is likely a competitive necessity, regardless of the organization’s intent. Meta’s Llama models are open, as are Google’s Gemma models and offerings from Databricks and others.
What’s Next
Falcon 2 11B is available now, and the 11B VLM is listed as “coming soon.” Beyond this release, you can likely expect a model in the 40B to 70B range to replace Falcon 40B and a much larger model to succeed Falcon 180B. In addition, I predict that a version of Falcon tuned for Arabic will arrive later in 2024.