OpenAI Cuts Prices and Announces Most Significant Feature Expansion This Year
Function calling is a far more powerful feature than plugins
OpenAI today announced new model updates, lower prices, and the most significant feature expansion of the year. That means the new features are even more significant than the launch of the gpt-3.5-turbo model, which gave third-party developers access to the same model that powers ChatGPT.
Function Calling
Adam Cheyer, the co-founder and key engineering mind behind Siri (acquired by Apple) and Viv Labs (acquired by Samsung and renamed New Bixby), created his first virtual assistant thirty years ago. Over the years, he came to understand that assistants typically have two types of capabilities:
Knowing - the ability to recall information and share knowledge
Doing - the ability to execute tasks
Cheyer told me in an interview earlier this year that ChatGPT is better at the “knowing” capability than all previous assistants, despite its known faults. He gives it a check for mastering that domain. This is an area where voice assistants have typically underperformed and are significantly behind ChatGPT.
Siri, Alexa, and Google Assistant are generally good at “doing” capabilities. ChatGPT, by contrast, has few or no capabilities in the “doing” category. Plugins were supposed to address this shortcoming, but as Cheyer predicted, their capabilities so far are not meeting expectations.
This situation may be changing with the introduction of “Function Calling.” OpenAI summarized the new features in a blog post:
Developers can now describe functions to gpt-4-0613 and gpt-3.5-turbo-0613, and have the model intelligently choose to output a JSON object containing arguments to call those functions. This is a new way to more reliably connect GPT’s capabilities with external tools and APIs.

These models have been fine-tuned to both detect when a function needs to be called (depending on the user’s input) and to respond with JSON that adheres to the function signature. Function calling allows developers to more reliably get structured data back from the model. For example, developers can:

Create chatbots that answer questions by calling external tools (e.g., like ChatGPT Plugins)
Convert queries such as “Email Anya to see if she wants to get coffee next Friday” to a function call like send_email(to: string, body: string), or “What’s the weather like in Boston?” to get_current_weather(location: string, unit: 'celsius' | 'fahrenheit').
Convert natural language into API calls or database queries
Convert “Who are my top ten customers this month?” to an internal API call such as get_customers_by_revenue(start_date: string, end_date: string, limit: int), or “How many orders did Acme, Inc. place last month?” to a SQL query using sql_query(query: string).
This vaguely resembles what Cheyer and the Viv Labs team engineered for Bixby: a system that could write code in real time to fulfill a user request. The difference here is that developers must describe the available functions and APIs to the models so the models know when to invoke them.
Still, this will make assistants and chatbots built on top of the GPT model family capable of calling web services to fulfill user requests. It will not completely fill the ChatGPT “doing” gap, but it will enable OpenAI and third-party model API users to deliberately add “doing” features.
The feature is now available in the Chat Completions API through the new gpt-4-0613 and gpt-3.5-turbo-0613 models.
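As a concrete sketch of that flow, here is roughly what the weather example from OpenAI’s post looks like against the Chat Completions API (a sketch, assuming the openai-python library of the time; the local get_current_weather implementation is a hypothetical stand-in):

```python
import json

import openai  # openai-python v0.x, current when this was announced

# Describe the available function to the model; "parameters" uses JSON Schema.
functions = [{
    "name": "get_current_weather",
    "description": "Get the current weather in a given location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City and state, e.g. Boston, MA"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location"],
    },
}]

def get_current_weather(location, unit="fahrenheit"):
    # Hypothetical stand-in; a real app would call a weather service here.
    return {"location": location, "temperature": "72", "unit": unit}

messages = [{"role": "user", "content": "What's the weather like in Boston?"}]
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0613",
    messages=messages,
    functions=functions,
    function_call="auto",  # let the model decide whether a function call is needed
)
message = response["choices"][0]["message"]

if message.get("function_call"):
    # The model returns the function name plus JSON-encoded arguments;
    # the developer runs the function and passes the result back for a final answer.
    args = json.loads(message["function_call"]["arguments"])
    result = get_current_weather(**args)
    messages.append(message)
    messages.append({"role": "function", "name": "get_current_weather",
                     "content": json.dumps(result)})
    followup = openai.ChatCompletion.create(model="gpt-3.5-turbo-0613", messages=messages)
    print(followup["choices"][0]["message"]["content"])
```

Note that the model never executes anything itself; it only emits the JSON arguments, and the developer decides whether and how to run the function.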
A Larger Context Window for Chat
OpenAI also announced a new 16k context window option for the “ChatGPT” API. The standard gpt-3.5-turbo context window is 4k, with the new gpt-3.5-turbo-16k four times larger. A larger context window means the model can consider more information as it formulates a response.
That could be the full context of a long conversation between the model and a user, a long document used to prime the chat conversation, or lengthy prompt engineering inputs. As OpenAI put it, “16k context means the model can now support ~20 pages of text in a single request.”
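Since the prompt and the completion must fit in the window together, one practical approach is to count tokens before picking a model. A minimal sketch using OpenAI’s tiktoken tokenizer (the 1,024-token reply budget is an arbitrary assumption):

```python
import tiktoken  # OpenAI's open-source tokenizer


def pick_chat_model(prompt: str, reply_budget: int = 1024) -> str:
    """Choose the 4k or 16k gpt-3.5-turbo variant based on prompt length."""
    enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
    n_tokens = len(enc.encode(prompt))
    # Reserve headroom for the model's reply; prompt and completion share the window.
    if n_tokens + reply_budget <= 4096:
        return "gpt-3.5-turbo-0613"
    if n_tokens + reply_budget <= 16384:
        return "gpt-3.5-turbo-16k"
    raise ValueError(f"Prompt of {n_tokens} tokens exceeds even the 16k window")
```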
Lower Prices
OpenAI’s reduced prices will also be welcome in the market. The price per 1K tokens for the Ada embeddings model was cut by 75%. Input tokens for gpt-3.5-turbo (the words in user requests that are passed to the model) are also falling by 25%. OpenAI wrote:
"Developers can now use this model for just $0.0015 per 1K input tokens and $0.002 per 1K output tokens, which equates to roughly 700 pages per dollar.
The gpt-3.5-turbo-16k model is twice the price for four times the context window, so you can expect roughly 350 output pages per dollar using OpenAI’s metrics. With that said, if your users employ far larger inputs or you maintain large volumes of context tokens, your total cost will more than double: the per-token price doubles, and you will also consume more tokens per request. Caveat emptor!
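To make the arithmetic concrete, here is a small cost estimator built from the prices quoted above (the 16k rates of $0.003 and $0.004 per 1K tokens simply double the standard prices, per the “twice the price” figure):

```python
def chat_cost_usd(input_tokens: int, output_tokens: int, use_16k: bool = False) -> float:
    """Estimate a gpt-3.5-turbo request cost from the June 2023 price list."""
    # Rates are dollars per 1K tokens; the 16k variant doubles both rates.
    in_rate, out_rate = (0.003, 0.004) if use_16k else (0.0015, 0.002)
    return input_tokens / 1000 * in_rate + output_tokens / 1000 * out_rate

# The same 3,000-token prompt with a 1,000-token reply costs twice as much on 16k:
print(chat_cost_usd(3000, 1000))                 # 0.0065
print(chat_cost_usd(3000, 1000, use_16k=True))   # 0.013
```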
Each of these announcements will put more pressure on OpenAI’s large language model (LLM) competitors. OpenAI is learning how to reduce training and inference costs at scale and is passing along some of those savings to customers. Other LLM providers have undercut OpenAI’s pricing, but these moves will make it harder to compete on price. Function calling may also make it harder to compete on features.