Do Large Language Models Represent the Future of Search?
InstructGPT and Metaphor Systems Provide New Ideas
Metaphor Systems recently launched a new search engine that attempts to determine what URL would most likely complete a partial query. This approach is similar to large language models (LLMs) such as GPT-3 that, at a basic level, attempt to predict the word most likely to come next in a text generated in response to a query. Metaphor predicts the URL that is most aligned with the prompt’s intent in place of a word.
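To make that analogy concrete, here is a minimal sketch of the shared ranking idea. It is my own illustration, not Metaphor’s actual model or API; the toy word-overlap score stands in for whatever learned score a real model would produce.

```python
# A conceptual sketch, not Metaphor's model: an LLM ranks candidate next tokens by a
# learned score, and a Metaphor-style engine ranks candidate URLs by how well they
# "complete" the prompt. toy_score is a crude stand-in for that learned score.

def toy_score(prompt: str, candidate_url: str) -> float:
    """Toy relevance score: count words the prompt shares with the URL."""
    prompt_words = set(prompt.lower().replace("(", " ").split())
    url_words = set(candidate_url.lower().replace("/", " ").replace("-", " ").split())
    return len(prompt_words & url_words)

def predict_url(prompt: str, candidate_urls: list[str]) -> str:
    # Same shape as next-token prediction: score every candidate, return the argmax.
    return max(candidate_urls, key=lambda url: toy_score(prompt, url))

print(predict_url(
    "This person would be great to talk to about motorcycles (personal site here:",
    [
        "https://example.com/motorcycles-personal-site",
        "https://example.com/quarterly-tax-forms",
    ],
))
# -> https://example.com/motorcycles-personal-site
```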
On the surface, this doesn’t seem that different from Google or Bing, which return a list of links based on a query. The difference arises in the form of the prompt (i.e., query) and what that expresses about the user’s intent. Here is how Metaphor describes the approach.
Suppose you were interested in motorcycle riding and wanted to find the personal pages of interesting people to talk to about the hobby. What might an ideal prompt on Metaphor look like?
You might think something like “Motorcycle riding personal page” or maybe something like “People interested in motorcycles” would be reasonable prompts.
If you try searching for either of these, however, you’ll notice that the results aren’t very dialed in. Why are the results so poor? The keyword-like nature of the prompts is the clue. When describing links they’re about to share, people don’t tend to talk in keyword phrases. In other words, the reason for the poor results is that neither of those prompts looks like how someone on the internet might actually describe a link they found.
A more optimized prompt might look something like:
"This person would be great to talk to about motorcycles (personal site here:"
This prompt has a couple of things going for it. Primarily, it truly resembles the way someone on the internet might actually talk about a really useful link they just found. We can imagine that this prompt would be just at home in the comments section of a particularly helpful subreddit or in the middle of someone’s personal blog page. These properties make it a good prompt.
Another great thing about this prompt is that near the end of it we specify exactly what kind of website we’d like Metaphor to return: “personal site”. If, for example, we were searching for something where a blog post or arXiv link would be more appropriate, we could specify exactly what we want by adding “blog post here:” or “arxiv paper:” to the end of the prompt.
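As a small illustration of that prompt pattern, a helper can compose the natural-language description with a page-type hint. This is my own sketch, not part of Metaphor’s documented interface; the function name and template are hypothetical.

```python
# Hypothetical helper illustrating the prompt pattern described above: write the prompt
# the way a person would introduce a link, then append a hint for the kind of page wanted.

def link_prompt(description: str, page_hint: str = "personal site here:") -> str:
    """Compose a Metaphor-style prompt: natural description plus a page-type hint."""
    return f"{description} ({page_hint}"

print(link_prompt("This person would be great to talk to about motorcycles"))
# -> This person would be great to talk to about motorcycles (personal site here:

print(link_prompt(
    "This post explains basic motorcycle maintenance really well",
    page_hint="blog post here:",
))
# -> This post explains basic motorcycle maintenance really well (blog post here:
```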
LLMs can generate text and code. Metaphor enables them to select URLs. It is particularly interesting to consider what Metaphor’s model must know about web content in order to suggest the right URL. Assuming the example prompt was not in the dataset, Metaphor’s LLM must be able to identify many attributes of content that go well beyond keywords and previous user searches.
PageRank Algorithm Clouded with Noise
This could be very helpful because Google can be particularly inept at fulfilling certain types of queries. Google’s ranking, anchored by the PageRank algorithm, over-indexes on backlink quantity and quality, URL text, and H1 tags, and appears to have a significant recency bias. This leads to poor results for many query types.
It also leads to grammatical gymnastics, where users update queries in hopes of receiving a better result. The situation is particularly frustrating when a user knows a certain website or page exists, doesn’t recall the URL, and cannot coax Google into serving it up.
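Since the previous two paragraphs put a lot of weight on PageRank, here is a simplified sketch of the published PageRank iteration (not Google’s full production ranking). It shows why link quantity and quality dominate the score and why query intent plays no direct role in it.

```python
# Simplified PageRank power iteration: a page's score depends only on the scores of
# the pages linking to it, redistributed each round with a damping factor.

def pagerank(links: dict[str, list[str]], damping: float = 0.85, iters: int = 50) -> dict[str, float]:
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iters):
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for page, outlinks in links.items():
            if not outlinks:  # dangling page: spread its rank across all pages
                for p in pages:
                    new_rank[p] += damping * rank[page] / len(pages)
            else:
                for target in outlinks:
                    new_rank[target] += damping * rank[page] / len(outlinks)
        rank = new_rank
    return rank

# Three pages: "hub" is linked to by both others, so it accumulates the most rank,
# regardless of what any particular searcher actually wants.
print(pagerank({"a": ["hub"], "b": ["hub"], "hub": ["a"]}))
```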
Google may be the best we have, but everyone recognizes it falls short of the goal of fulfilling user intent every time. Metaphor appears to provide a model that can return higher-quality results for certain types of queries. Specialty search engines such as Wolfram Alpha exist, but they are a rare species. It is good to see Metaphor step into the fray.
InstructGPT is a Synthesizer
Google has another type of search result: the answer box. Instead of a list of blue links, you get an answer to your query. This selectively presents data from a web page, which spares the user the need to click on a link and find the information on the page.
[N.B. When researching how many Google search queries resulted in an answer box, Metaphor offered a better result than Google]
However, Google can only answer questions that have been explicitly addressed on a single web page or attempt to match keywords to get you to a URL. OpenAI’s InstructGPT can actually synthesize data from multiple web pages to provide an answer to a novel question that no single web page explicitly or wholly addresses. This topic merits its own post, which you can look for in a future issue of Synthedia.
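Here is a hedged sketch of that synthesis pattern, not OpenAI’s actual pipeline: gather excerpts from several pages, then ask an instruction-following model to answer from all of them at once. The complete() function below is a placeholder for a real LLM call, such as an InstructGPT-class completion endpoint.

```python
# Sketch of multi-page answer synthesis: combine (url, excerpt) pairs into one prompt
# and ask an instruction-following model for a single answer drawn from all sources.

def complete(prompt: str) -> str:
    """Placeholder for a call to an instruction-following LLM."""
    raise NotImplementedError("wire this to your LLM provider of choice")

def synthesize_answer(question: str, passages: list[tuple[str, str]]) -> str:
    """Build one prompt from several page excerpts and request a combined answer."""
    sources = "\n\n".join(f"Source: {url}\n{text}" for url, text in passages)
    prompt = (
        f"{sources}\n\n"
        f"Using all of the sources above, answer the question: {question}"
    )
    return complete(prompt)
```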
I mention InstructGPT here, in part, because the solution described above requires an LLM, and it is something Metaphor could expand into over time. Right now, Metaphor might look like a niche search engine, but its niche is query type, while its domain of application is broad. That is the opposite of what we have seen previously in the vertical search engine category. The LLM is critical for providing a better URL search result, but there is no reason Metaphor could not extend into answer boxes or synthesized responses similar to InstructGPT as well.
LLMs are going to change search and might provide an opening to erode Google’s dominance. We will have to see how LaMDA develops alongside the emerging contenders.