Amazon Is Considering a $20 Monthly Subscription for a GenAI-Enhanced Alexa
Here is how that could play out
It is no secret that Amazon is in the process of transitioning much of Alexa’s technical back-end to generative AI models. That was made clear this past fall in a preview I attended at Amazon’s HQ2.
The first steps involved adding the capability to answer some user queries via a generative AI model as opposed to the legacy natural language understanding (NLU) model, accommodating Alexa skills driven by generative AI, and creating an arbitration capability that could route requests to the “legacy” or “new” Alexa. Other announcements at the time included a new, smoother-sounding, and more expressive voice and the ability to generate images on FireTV.
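Amazon has not published details of how that arbitration layer works. As a purely illustrative sketch (the heuristics and names below are assumptions, not Amazon's implementation), a router could score an incoming utterance and dispatch command-like requests to the legacy NLU pipeline while sending open-ended requests to the LLM path:

```python
# Illustrative sketch of arbitration between a "legacy" NLU pipeline
# and a "new" LLM-backed pipeline. The heuristics are invented for
# illustration; Amazon has not disclosed how its router actually works.

LEGACY_INTENTS = {"turn on", "turn off", "set timer", "play"}

def route(utterance: str) -> str:
    """Return which back-end should handle the utterance."""
    text = utterance.lower()
    # Short, command-like requests map cleanly to legacy NLU intents.
    if any(text.startswith(intent) for intent in LEGACY_INTENTS):
        return "legacy-nlu"
    # Open-ended or conversational requests fall through to the LLM.
    return "llm"

print(route("Turn on the kitchen lights"))              # legacy-nlu
print(route("What should I make for dinner tonight?"))  # llm
```

In practice, such a router would more likely be a learned classifier than a keyword list, but the dispatch pattern is the same.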
Business Insider reported earlier this year that Amazon was considering charging users for an upgraded version of Alexa that went well beyond the capabilities demonstrated in September 2023. CNBC added context to this story by reporting that Amazon is considering charging $20 per user per month, and it will be separate from Prime membership. According to CNBC:
Amazon is upgrading its decade-old Alexa voice assistant with generative artificial intelligence and plans to charge a monthly subscription fee to offset the cost of the technology, according to people with knowledge of Amazon’s plans.
The Seattle-based tech and retail giant will launch a more conversational version of Alexa later this year, potentially positioning it to better compete with new generative AI-powered chatbots from companies including Google and OpenAI, according to two sources familiar with the matter, who asked not to be named because the discussions were private. Amazon’s subscription for Alexa will not be included in the $139-per-year Prime offering, and Amazon has not yet nailed down the price point, one source said.
…
One source estimated the cost of using generative AI in Alexa at 2 cents per query, and said a $20 price point was floated internally. Another suggested it would need to be in a single-digit dollar amount, which would undercut other subscription offerings.
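The reported figures imply a simple break-even calculation: at 2 cents per query, a $20 monthly fee covers roughly 1,000 generative queries per user per month, or about 33 per day. A quick sketch of that arithmetic:

```python
# Break-even arithmetic using the figures reported by CNBC.
cost_per_query = 0.02  # dollars per query, reported internal estimate
monthly_fee = 20.00    # dollars, price point reportedly floated internally

breakeven_queries = monthly_fee / cost_per_query
print(f"Queries covered per month: {breakeven_queries:.0f}")      # 1000
print(f"Queries covered per day:   {breakeven_queries / 30:.1f}")  # 33.3
```

Heavy users would quickly exceed that volume, which helps explain why a single-digit-dollar price point was also floated internally.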
Alexa vs ChatGPT vs Gemini
Synthedia’s coverage of the fall 2023 Amazon event outlined some of the key features coming to the generative AI-enabled Alexa. These include:
Open-ended conversations - Alexa’s “Let’s chat” feature enables conversations about any topic as it is connected to the internet, and the LLM can access several web services to answer questions.
Barge-in - You will be able to interrupt Alexa when speaking with it on an Echo Show device with a camera and the new adaptive context feature. The system considers both audible and visual cues to understand whether you are addressing the assistant, attempting to barge in, or just listening.
No wake word (sometimes) - When you are using “Let’s Chat” and looking at an Echo Show with a camera, it will be able to recognize you are making a request, even without saying “Alexa.”
Context maintenance - It is not clear how long a conversation persists. However, the demo showed Amazon devices chief Dave Limp stepping away from the conversation multiple times and then picking up where he left off after making side comments to the audience.
Personalization - This is not entirely new to the Alexa ecosystem, but it appears that the LLM-enabled solution has either adopted or extended this capability.
You may note that these features are similar to several of the capabilities OpenAI demonstrated for ChatGPT and Google demonstrated for Gemini Advanced last week. Generative AI has proven more deft at long-form conversations than NLU-based systems. In addition, the ChatGPT and Gemini demos included barge-in (i.e., interruptions) and context maintenance, and both were activated without a wake word.
The ability to personalize these experiences was less clear. Memory is a popular feature topic, and ChatGPT recently added the capability. However, it is unclear how well that feature will work and how much memory will be retained.
Knowing and Doing Assistants
Synthedia has generally embraced Siri co-founder Adam Cheyer’s framework that segments digital assistants into “knowing” and “doing” categories. Siri was famously called the “do engine” when it was first launched. It could execute tasks on behalf of the user as an early iPhone app and then as part of the iPhone after Apple acquired the company.
By contrast, ChatGPT is a “knowing” assistant. Cheyer has told me in the past that he would have liked for Siri to be both a “knowing” and “doing” assistant, but the technology at the time would not support the type of knowledge-based interactions that ChatGPT users take for granted today.
Alexa arrived in late 2014 as a device-based “doing” assistant with a broader feature set and entirely hands-free operation. It also created the voice-interactive smart speaker and smart display categories. The subsequent six years saw a rapid rise in applications, brands, and independent developers creating skills that enabled Alexa’s “do engine” to expand well beyond the limited scope that Siri had retreated into.
OpenAI thought it would quickly add “doing” capabilities with the launch of plugins in the spring of 2023. However, despite strong interest and the launch of many plugins, the add-ons generally fell short of being useful. OpenAI deprecated plugins in November 2023 and directed developers toward GPTs, mini “knowing” applications for customized ChatGPT experiences.
Knowing and doing assistants both offer consumer benefits. The question is which assistant will be first to integrate the feature categories into a single experience. It turns out that adding “knowing” capabilities to “doing” assistants will be easier than the reverse. Much of that is driven by a mix of technology and business barriers that make cross-application control complex and highly variable.
Amazon could be the first to marry these capabilities by introducing an upgraded Alexa. It has already overcome some pernicious challenges in adding “doing” capabilities. Plugging in a large language model (LLM) or a multimodal foundation model behind Alexa could be a shortcut to moving ahead of rivals.
GenAI Assistant Landscape
Synthedia sees Alexa today as straddling the “knowing” and “doing” divide. By adding more “knowing” features similar to ChatGPT and Gemini Advanced, Alexa could become the digital assistant that offers a more holistic solution to everyday consumer needs.
A further extension is also possible. Synthedia has also identified a limited rise in the number of new digital assistants that address “connecting” needs. Connecting needs can be fulfilled by AI apps that express interest and empathy in the form of a digital companion. They can also enable connections between users. Given Alexa’s ability to build strong affective trust with users, it could be a candidate to expand into this segment along with knowledge-oriented features.
Alexa, Bedrock, and Generative AI
Amazon has several assets at its disposal to expand into knowledge and connectivity domains. One of the most prominent is AWS and the Amazon Bedrock service. A missed opportunity for Amazon thus far has been a more explicit connection between Amazon Bedrock, the AWS service for accessing generative AI models, and Alexa.
The company is apparently using Amazon’s Titan LLM, available through Bedrock, as part of its generative AI Alexa solution, but there is no specific linkage between the two services. For example, Amazon Bedrock’s materials do not mention Alexa, despite Alexa’s enterprise services offerings that are tailor-made for AWS enterprise users.
It is unclear whether an Alexa-Bedrock fusion is likely at this point. However, both organizations would benefit from the combination.
Beyond this, Amazon has sold over 500 million Alexa-enabled devices. This means that Amazon has instant distribution for whatever assistant solution follows the legacy Alexa assistant.
What Alexa Needs to Compete
Alexa was once the most talked-about and most popular voice assistant in the U.S. and several other countries. For Alexa to reassert that prominence in light of the new technology landscape, it will need to:
Add broad capability “knowing” assistant features: the rise of ChatGPT and Gemini has led to a new set of consumer expectations around what is deemed to be smart.
Maintain the “doing” capabilities: Google has focused on “knowing” features via Gemini but has not ported over the Google Assistant “doing” capabilities. Google may still add these back in, but the lack of carryover thus far appears to be a mistake. Amazon can avoid this path by embracing its “doing” roots even while enhancing capabilities in a new domain.
Promote the text interface in the mobile app: Alexa has been a pre-eminent voice assistant, but many users who are accustomed to ChatGPT prefer text for “knowing” tasks. The Alexa app has a text input interface, which will need to be promoted more heavily. It is also the most likely path for Amazon to drive up Alexa usage outside the home.
Add memory: Memory at a session and user level is essential for providing value in a multi-turn conversation. It is also the key building block required to offer individualized personalization.
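The memory requirement above can be sketched as a two-tier store: a list of turns for the current session, plus durable facts persisted per user. This is a minimal illustration under assumed names and structure, not Alexa's actual design:

```python
# Minimal sketch of session-level plus user-level memory for a
# multi-turn assistant. Structure and names are illustrative only.
from collections import defaultdict

class AssistantMemory:
    def __init__(self):
        self.session = []                    # turns in the current conversation
        self.user_facts = defaultdict(dict)  # durable facts keyed by user id

    def add_turn(self, role: str, text: str):
        self.session.append((role, text))

    def remember(self, user_id: str, key: str, value: str):
        """Persist a fact across sessions (e.g., a dietary preference)."""
        self.user_facts[user_id][key] = value

    def context(self, user_id: str, last_n: int = 5):
        """Assemble prompt context: durable facts plus recent turns."""
        return {
            "facts": dict(self.user_facts[user_id]),
            "recent_turns": self.session[-last_n:],
        }

mem = AssistantMemory()
mem.remember("user-1", "diet", "vegetarian")
mem.add_turn("user", "Suggest a dinner recipe")
ctx = mem.context("user-1")
print(ctx["facts"])  # {'diet': 'vegetarian'}
```

The session tier supports multi-turn conversation; the user tier is what enables the individualized personalization described above.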
Whether Amazon decides to charge for Alexa remains an open question. It seems unlikely that the company will offer the upgraded Alexa for free, given the high inference costs per query. However, with OpenAI opening GPT-4o’s features to free ChatGPT users, Amazon may be forced to change its approach. Price will not be the only driver of adoption, but it will be a factor.