Rabbit Launches R1 Device for GenAI-Enabled Experiences and Adds $10M in New Funding
The software needed a vessel and the iPhone is too restrictive, but...
At CES today, Rabbit launched its first product based on the concept of a large action model (LAM). A LAM is a large language model (LLM) optimized to take action on behalf of users by controlling various software solutions. Jesse Lyu, Rabbit’s founder and CEO, said in the company’s launch video:
The popularity of LLM chatbots over the past year has shown that the natural language-based experience is the path forward. However, where these assistants struggle is still getting things done. For example, if you go to ChatGPT and use the Expedia plugin to book a ticket, it can suggest options but ultimately cannot assist you in completing the booking process from start to finish…
Rabbit [OS] powered by large action models: the concept and test results are so powerful that we decided to make a one-of-a-kind mobile device.
In October, the company announced a $20 million funding round led by Khosla Ventures, and it announced another $10 million cash infusion last month. Today’s announcement moves Rabbit OS into the market, and that is the key point. The Rabbit LAM solution seemingly could not be deployed on an iPhone due to app restrictions, so the r1 is the vessel Rabbit needs to give its software a place to show what it can do.
Rabbit does appear to be committed to the device, but if there were a way to make it your phone assistant, that certainly would be more scalable and reduce barriers to adoption.
LAM Do Engine
Rabbit OS is a novel solution with some elements you may be familiar with from Google Duplex on the Web. That product was shut down by Google at the end of 2022. Its focus was to navigate the web and execute actions or fill out forms for users. Rabbit OS is designed to navigate various online services to complete tasks. According to the earlier funding announcement:
"We dedicated a significant amount of the initial research effort on learning app interfaces and how humans interact with them. That is how the Large Action Model (LAM) was born," Lyu explained. "Our operating system,rabbit OS, powered by LAM, understands your intentions, automatically conducts research, operates variouscomputer apps through interfaces, compiles and presents information, and ultimately accomplishes tasks foryou."
Rabbit is working on challenges similar to those Adam Cheyer, the co-founder of Siri and Viv Labs, identified as the key ChatGPT shortcoming. Siri was branded by Cheyer’s co-founder, Dag Kittlaus, as The Do Engine. It was designed to execute tasks. Cheyer told me that the technology was not available at the time to create a powerful “Know Engine” but that LLMs had largely solved that problem. He also suggested that it will be hard for Know Engines (assistants) like ChatGPT to build out Do Engines because the problems are entirely different.
Lyu suggests that voice assistants cannot effectively navigate other apps, which was a key shortcoming of Siri and Alexa. That is where the LAM comes in. According to Lyu, Rabbit’s LAM is more effective at intent recognition and interacting with app interfaces.
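Rabbit has not published the internals of its LAM, so take the following as a rough sketch rather than a description of the actual system. Every name here (the Intent class, parse_intent, execute, and the Expedia-style booking flow) is invented for illustration; the point is the division of labor Lyu describes, where a language model turns a request into a structured intent and a separate action layer drives an app interface to completion.

```python
# Hypothetical sketch of a LAM-style pipeline. None of these names come from
# Rabbit's (unpublished) stack; they only illustrate the two-step idea:
# a language model turns a request into a structured intent, and an action
# layer replays learned UI steps against a service.
from dataclasses import dataclass, field


@dataclass
class Intent:
    """Structured output of the language-understanding step."""
    service: str                      # e.g. "expedia"
    action: str                       # e.g. "book_flight"
    slots: dict = field(default_factory=dict)


def parse_intent(utterance: str) -> Intent:
    # Stand-in for an LLM call that extracts service, action, and slots.
    # A real system would prompt a model and validate its structured output.
    if "flight" in utterance.lower():
        return Intent(service="expedia", action="book_flight",
                      slots={"destination": "Tokyo", "date": "2024-03-01"})
    raise ValueError("unrecognized request")


def execute(intent: Intent) -> str:
    # Stand-in for the action layer: replays a recorded sequence of UI steps
    # (search form, results page, checkout) for the named service.
    steps = ["open search form", "fill destination and date",
             "pick cheapest result", "confirm checkout"]
    for step in steps:
        print(f"[{intent.service}] {step}")
    return "booking confirmed"


if __name__ == "__main__":
    print(execute(parse_intent("Book me a flight to Tokyo on March 1st")))
```

In a real deployment, the parsing step would be a model call with validated output and the execution step would drive an actual app interface rather than print steps, but the split between understanding and doing is the part that distinguishes the LAM pitch from a chatbot.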
The Product or the Software
The r1 looks interesting, but it is nothing special. It is a dedicated device for a new kind of voice assistant. The most compelling element of the r1 is its price of $199. That is far more affordable and appropriate for an optional device than the Humane Ai Pin.
However, it is hard for me to see any significant number of consumers adopting this device. It does not replace your smartphone, and the demonstrations are not exactly novel.
One of the key factors that limited smart speaker adoption was their penchant for merely being a more convenient way to execute tasks that could already be done reasonably efficiently on smartphones or televisions. That convenience factor, combined with the hands-free UI and low price, was sufficient to sell hundreds of millions of devices. It was not enough to make the devices pervasive or vault voice assistants to the same level of consumer importance as smartphones.
The use cases demonstrated in the product launch video are only mildly different from previous voice assistant demos. I recommend anyone interested in seeing some complex ordering instructions check out early SoundHound videos for the Hound assistant. Despite not being based on generative AI, the assistant deftly navigated a series of complex requirements.
You may note that Lyu mentions Rabbit uses a neuro-symbolic architecture. That simply means combining deep learning and rules-based solutions, or, in this case, a generative AI and a rules-based solution. You will also hear this called hybrid AI. The symbolic part enables more control, while the neuro part deals better with ambiguity, variability, and complexity. Some voice assistants experimented with neuro-symbolic approaches in their own way before generative AI.
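If the terminology is unfamiliar, a toy example may help. Nothing below comes from Rabbit’s system; the function names, action list, and spend limit are invented. A stand-in “neural” component proposes an action from a free-form request, while “symbolic” rules decide whether it is allowed to run, which is the control the hybrid approach is meant to provide.

```python
# Minimal sketch of the neuro-symbolic split, with invented names and rules.
# The neural side (faked here) proposes an action; explicit symbolic rules
# gate it before anything executes.
ALLOWED_ACTIONS = {"play_music", "order_ride", "book_flight"}
SPEND_LIMIT_USD = 500


def neural_propose(utterance: str) -> dict:
    # Placeholder for a generative model mapping text to an action proposal.
    return {"action": "book_flight", "estimated_cost_usd": 650}


def symbolic_check(proposal: dict) -> tuple[bool, str]:
    # Explicit, auditable rules that a pure end-to-end model would lack.
    if proposal["action"] not in ALLOWED_ACTIONS:
        return False, "action not permitted"
    if proposal.get("estimated_cost_usd", 0) > SPEND_LIMIT_USD:
        return False, "over spend limit, requires user confirmation"
    return True, "ok"


proposal = neural_propose("Get me to Tokyo next week")
approved, reason = symbolic_check(proposal)
print(proposal["action"], "->", "execute" if approved else f"hold ({reason})")
```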
That is not to say Rabbit is merely reinventing the wheel. The concept of a LAM is likely to be valuable. Rabbit has created a new voice assistant that doesn’t have the technical debt of the previous solutions and is focused on task execution instead of knowledge presentation. Rabbit as an app could be a consumer hit. As a device, I am skeptical it will find an audience.
However, the company likely has a future selling API access to its LAM. If it can accumulate a reasonable number of r1 device users, it can capture first-party data about app navigation, capabilities, and user behaviors. That could become a competitive advantage in LAM performance in relation to consumer apps. It also might become a feature that other generative AI solutions would benefit from tapping into.
With that said, if they insist on creating a device, a watch would be a much better approach. That is a second device people would be more likely to consider. Does anyone really want another device in their pocket or purse?
I pre-ordered one (the Rabbit r1). If nothing else, it is a small Linux device that can connect to my phone’s WiFi, and I’m interested to see what happens with it.