DoNotPay CEO Joshua Browder posted a video on LinkedIn late yesterday showing a GPT-3 powered chatbot negotiating with an Xfinity customer service agent for a reduction in the user’s monthly bill. The bot apparently saved a DoNotPay engineer $10 per month, or $120 per year. Presumably, a user would need to do little or nothing to capture this benefit.
This post has been extremely popular because the chatbot is an assistant that takes on an undesirable task for the user and completes it successfully. In the virtual assistant world, we refer to this capability as agency. The assistant (i.e., the chatbot) has the agency to work on your behalf and make decisions that benefit you.
What the Chatbot Does
You can watch the full video of the interaction through the link to the LinkedIn post. You will see that the bot expertly navigates questions with button responses, explains the user’s issues in detail, threatens to take the user’s business to another provider, cites applicable laws, answers questions about the customer, and remains polite the entire time. When given the opportunity, it also accepts a discount for the same level of service.
LLMs and Agency
Google Duplex famously used disfluencies to sound more humanlike in phone conversations, but more importantly, it had agency. It could book a salon appointment or restaurant reservation for you within parameters you set. Except it didn’t scale very well.
Duplex was still relying on humans to complete many restaurant reservations almost a year after its introduction. The variability and inconsistency in these conversations were too much for a reliable machine-only solution, so Google hired people to help out the AI. That echoed the issues faced by Facebook’s ill-fated M assistant.
Large language models (LLMs) such as GPT-3 have some potential advantages. They are trained on broader data sets, enabling them to navigate low-probability intents and interactions. They can also be “fine-tuned” to provide certain types of services that require a bias toward specific types of knowledge.
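For readers curious what that kind of fine-tuning looks like in practice, here is a minimal sketch using the 2022-era (pre-1.0) `openai` Python package, which accepted JSONL files of prompt/completion pairs. The training examples and file name are hypothetical illustrations, not DoNotPay’s actual data or implementation.

```python
import json
import openai  # legacy pre-1.0 SDK; fine-tuning used prompt/completion JSONL pairs

openai.api_key = "sk-..."  # placeholder

# Hypothetical examples biasing a base model toward bill-negotiation knowledge
examples = [
    {
        "prompt": "Customer goal: lower an internet bill that rose from $39 to $49.\nAgent reply:",
        "completion": " I have been a loyal customer for three years, and competing ISPs in my "
                      "area offer the same speed for $35. Can you match that or apply a retention discount?",
    },
    {
        "prompt": "Customer goal: dispute a $20 'service change' fee.\nAgent reply:",
        "completion": " I never requested a service change, so I would like that fee removed. "
                      "If it cannot be waived, please escalate this to a supervisor.",
    },
]

# Write the training file in the JSONL format the fine-tuning endpoint expected
with open("negotiation_examples.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# Upload the file and start a fine-tune on a GPT-3 base model
upload = openai.File.create(file=open("negotiation_examples.jsonl", "rb"), purpose="fine-tune")
job = openai.FineTune.create(training_file=upload["id"], model="davinci")
print(job["id"])
```

In a real system, the training set would need thousands of examples drawn from actual negotiations and policy documents; the point here is only the shape of the workflow.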
DoNotPay fashions itself as the “first robot lawyer.” It employs knowledge bases about law, regulation, policies, and human emotion to help consumers confront and negotiate with companies. Another use case it is working on is helping consumers defend themselves in traffic court. The LLM has a knowledge base fine-tuned to specific laws and templates to draw from, plus conversational skill that makes it seem like a smart human. Or, at least, a human confident in the position they are promoting.
You can see how these types of features played into the conversation between the DoNotPay bot and the Xfinity customer service representative. We have all seen the wonderful writing that ChatGPT can do for you. But what about task completion? OpenAI’s Sam Altman indicated in a tweet that task execution was likely to be the next major area of GPT-3’s development. ChatGPT’s system name is Assistant. Task execution is an obvious evolution of the service.
DoNotPay is simply tailoring its GPT-3 implementation to specific tasks it already helps consumers execute. Instead of templates, it is now using an LLM.
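As a rough illustration of what tailoring GPT-3 to a task can mean at the prompt level, here is a hedged sketch that frames one bill-negotiation turn as a completion request against the legacy `openai.Completion` endpoint. The prompt wording, model choice, and parameters are assumptions for illustration, not DoNotPay’s implementation.

```python
import openai  # legacy pre-1.0 SDK

openai.api_key = "sk-..."  # placeholder


def next_negotiation_reply(conversation: str) -> str:
    """Generate the bot's next turn in a bill-negotiation chat (illustrative only)."""
    prompt = (
        "You are negotiating a lower internet bill on behalf of a customer.\n"
        "Be polite, cite loyalty and competitor pricing, and accept a discount "
        "only if the service level stays the same.\n\n"
        f"{conversation}\nBot:"
    )
    resp = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        max_tokens=150,
        temperature=0.3,
        stop=["Agent:"],  # stop before the model invents the rep's next reply
    )
    return resp["choices"][0]["text"].strip()


print(next_negotiation_reply("Agent: How can I help you today?"))
```

A production assistant would also need the button-response handling, policy citations, and escalation logic seen in the video, but the core loop is just feeding the running conversation back into the model.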
It’s Not Perfect, But What Is?
You will note that the chatbot makes at least one conversational mistake. Regardless, it accomplished the outcome the user wanted.
LLMs as writing assistants have already been embraced. At the Synthedia 2 conference, which is only hours away, Scott Stevenson will talk about the overwhelming demand for his company’s GPT-3-enabled legal assistant. DoNotPay has plans for a number of advocacy assistants that could make consumers’ lives easier and less expensive. New use cases will generate a lot more interest in LLMs, which should sustain the ChatGPT hype for some time.
Learn more about the good, the bad, and the surprising about LLMs, along with text-to-image AI models, voice clones, and virtual humans, at the Synthedia 2 online conference. It’s free and starts in just a few hours.