What Happens When You Combine Siri with GPT-3
A developer used Siri Shortcuts and a GPT-3 API to bring a new kind of assistant to life
One of the best things about ChatGPT is all of the innovative experiments that have come from it. Mate Marschalko, a UK-based developer, decided to see if he could connect OpenAI's new text-davinci-003 model, part of the same GPT-3.5 family that ChatGPT is built on, to Siri on his iPhone. It worked.
The demo in the video above shows Mate interacting with his “Smart Home” assistant. His original concept was to use the integration to control his smart home devices. You can see him giving direct commands to “Smart Home” and controlling his devices after activating Siri with the “Okay, Smart Home” wake word command. He also demonstrates indirect requests that rely on context, such as a comment that simply describes his office as “dark.”
Finally, he set up the assistant to answer general questions, so those queries get a GPT-3 response instead of Siri’s. He did all of this by describing in natural language what he wanted the assistant to do and entering that description as the text string his Siri Shortcut passes to GPT-3. He also set up the connection from Siri to the GPT-3 service using his API token.
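To make the flow described above concrete, here is a minimal Python sketch of the kind of request a Shortcut would send: fixed instructions plus the dictated text, posted to OpenAI's completions endpoint with an API token. The instruction text, the token placeholder, and the max_tokens and temperature values are assumptions for illustration, not taken from Mate's project. The request is only built here, not sent, and the model named in the article may no longer be available.

```python
import json
import urllib.request

# OpenAI's (legacy) completions endpoint and the model named in the article.
API_URL = "https://api.openai.com/v1/completions"
MODEL = "text-davinci-003"


def build_request(api_token: str, instructions: str, spoken_text: str) -> urllib.request.Request:
    """Combine fixed instructions with the dictated text into one completion request."""
    prompt = f"{instructions}\n\nRequest: {spoken_text}\nResponse:"
    body = {
        "model": MODEL,
        "prompt": prompt,
        "max_tokens": 256,   # assumed limit; tune as needed
        "temperature": 0.2,  # assumed; low values keep replies more predictable
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical instructions; not the text from the original project.
INSTRUCTIONS = "You are a smart home assistant. Answer the request below."

request = build_request("YOUR_API_TOKEN", INSTRUCTIONS, "Turn on the office lights")
payload = json.loads(request.data)
print(request.get_method(), request.full_url)
print(payload["prompt"])
```

Passing the request to urllib.request.urlopen would send it. In a Siri Shortcut, the same fields would go into a "Get Contents of URL" action.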
Not for the Average User
While Mate entered all of this data in natural language and asked for the response to be returned in JSON format, he still needed to understand what types of data his smart home devices required and what types they could return. He provided the full text of his instructions in his Medium post about the project.
This type of design is going to be a lot easier for a developer to construct than for someone coming to it from a non-technical background. After pasting the text into the Siri Shortcut comes the tedious task of creating the many if-then statements that connect the responses to the HomeKit-enabled smart home devices. The process makes it impractical for most people to set up.
However, the simplest feature to implement in this format is responding to general knowledge questions. It requires less knowledge about how to structure the request so that a properly formatted response is returned. It could be a simple way to supercharge your iPhone with voice-enabled ChatGPT-like responses. It also shows how an enterprising independent voice assistant developer could gain easy access to iPhone users and bypass the typical constraints around using Siri.
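The idea of asking for a JSON response can be sketched roughly as follows. The prompt wording and the field names (action, location, target, value, answer) are hypothetical stand-ins, since the actual instructions are in Mate's Medium post. The point is only that the reply must be parsed and checked before anything acts on it, because the model can return plain prose instead of JSON.

```python
import json

# Hypothetical prompt and schema for illustration only.
PROMPT_TEMPLATE = """You control a smart home. Reply only with JSON of the form
{{"action": "command" | "answer", "location": string, "target": string, "value": string, "answer": string}}

Request: {request}
JSON:"""

ALLOWED_ACTIONS = {"command", "answer"}


def parse_reply(raw: str) -> dict:
    """Parse the model's reply and reject anything that does not fit the schema."""
    try:
        reply = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(reply, dict) or reply.get("action") not in ALLOWED_ACTIONS:
        raise ValueError("reply has a missing or unknown 'action'")
    return reply


prompt = PROMPT_TEMPLATE.format(request="It is dark in the office.")

# A reply the model might plausibly return (made up for this example).
sample = '{"action": "command", "location": "office", "target": "light", "value": "on"}'
print(parse_reply(sample))
```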
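For a sense of what that chain of if-then statements amounts to, here is a rough Python sketch of the routing step, written as a lookup table instead of nested conditions. The device functions and the location and target names are hypothetical. In a real Shortcut, each branch would call a Home app action, and Shortcuts has no equivalent of this table, which is why the setup is tedious.

```python
from typing import Callable, Dict, Tuple


# Stand-ins for HomeKit actions (hypothetical; they only return a description).
def set_light(location: str, value: str) -> str:
    return f"{location} light -> {value}"


def set_thermostat(location: str, value: str) -> str:
    return f"{location} thermostat -> {value}"


# One entry per (location, target) pair takes the place of one if-then branch.
HANDLERS: Dict[Tuple[str, str], Callable[[str, str], str]] = {
    ("office", "light"): set_light,
    ("bedroom", "light"): set_light,
    ("bedroom", "thermostat"): set_thermostat,
}


def dispatch(command: dict) -> str:
    """Route a parsed command to the matching device handler, if one exists."""
    location = str(command.get("location", ""))
    target = str(command.get("target", ""))
    handler = HANDLERS.get((location, target))
    if handler is None:
        return f"No device configured for {location} {target}"
    return handler(location, str(command.get("value", "")))


print(dispatch({"location": "office", "target": "light", "value": "on"}))
print(dispatch({"location": "bedroom", "target": "thermostat", "value": "19"}))
print(dispatch({"location": "garage", "target": "door", "value": "open"}))
```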
One More Gotcha
Another important point is GPT-3 latency. Mate said in the video comments that he edited the response latency out of the video to make it easier to watch. He indicated that most responses took 3-5 seconds. That amount of latency will surely be noticed by users. Anyone creating their own voice assistant and hoping to use GPT-3 APIs will need to consider that user experience factor.
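One simple way to keep an eye on this is to time every round trip and flag the slow ones. The sketch below is an illustration under stated assumptions: fake_model stands in for the real network call, and the two-second budget is an arbitrary threshold, not a figure from Mate's project.

```python
import time
from typing import Callable, Tuple

LATENCY_BUDGET_S = 2.0  # assumed threshold for an acceptable spoken reply


def timed_call(ask: Callable[[str], str], text: str) -> Tuple[str, float]:
    """Call the model function and return its reply plus the elapsed seconds."""
    start = time.perf_counter()
    reply = ask(text)
    return reply, time.perf_counter() - start


def fake_model(text: str) -> str:
    """Stand-in for the real API call; sleeps briefly to simulate network delay."""
    time.sleep(0.05)
    return f"echo: {text}"


reply, elapsed = timed_call(fake_model, "What time is it in Tokyo?")
over_budget = elapsed > LATENCY_BUDGET_S
print(f"{reply!r} took {elapsed:.2f}s (over budget: {over_budget})")
```

With responses in the 3-5 second range, a check like this would flag nearly every request. An assistant might use the flag to play a short "working on it" prompt while waiting.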
A Smarter Assistant?
“Ever since I tried ChatGPT and GPT-3, everything else feels painfully dumb and useless: Siri, Alexa, Google Home and all other ‘smart’ assistants. Here’s the shocking thing: you can build your own in less than an hour!”
Those are the words of Mate Marschalko. Whether ChatGPT and other GPT-3.5 products are “smarter” than Alexa, Google Assistant, or Siri may be beside the point. Users think these OpenAI-based products are smarter. Consider his comment about one request he made to his new assistant: “Just noticed that I’m recording this video in the dark in the office. Can you do something about that?” The smart lights turned on.
“Honestly, when I first saw this response, I couldn’t believe my eyes and how exceptionally well it worked! The request was not a simple ‘Switch the lights on the office.’ It was phrased in a very twisted and elaborate way. Something that would immediately throw off Siri, Alexa or Google Home.”
ChatGPT has captivated many developers and users of voice assistants because it seems so much better at interpreting imprecise requests and responding to general knowledge queries. Another query by Mate drives home this point.
“‘I’m going to trust you with this one! Set the bedroom to a temperature you think would help me sleep better.’ And it set the bedroom to a comfortable 19 celsius based on its knowledge!”
However, GPT-3 is not connected to an ecosystem of products, does not have access to real-time data, and frequently responds to questions with incorrect information. That limits the value of any GPT-3-based service for many of today's most popular voice assistant use cases.
On the other hand, the value may not be in replicating or extending what voice assistants can do today. It may be about new use cases that the monolithic architectures of the leading consumer voice assistants are ill-prepared to address or simply refuse to enable.
Never thought about this possibility. Pretty exciting indeed. Especially for someone with a disability who has problems using their hands to type, Siri+ChatGPT offers a fantastic way to leverage AI to improve their quality of life.