How GPT-4 Was Transformed Into Generative AI Search in Bing Chat
Behind-the-scenes details on generative AI search architecture, UI, and tools
“When we got access to GPT-4 very early, we were all just blown away. We had like two weeks where the team did nothing but play with it, build different applications, and try different things. Every day, our minds were blown with new things that were possible. Initially, the biggest challenge was grounding. We had lots [of challenges], especially shipping this thing in the context of a search engine…I would say the biggest challenge at this point is actually what I would call responsible AI.” - Brad Abrams, Microsoft Bing.
Brad Abrams, product manager for Bing Chat and Windows Copilot, broke down how the solutions work and where they are headed during the Synthedia 4 conference in October 2023. The presentation included new information on Bing Chat’s origins, the user experience strategy, product strategy, solution architecture, and tips for users.
How Bing Chat Works with GPT-4 & Tools
Abrams went into significant depth on the logic flow that drives Bing Chat. He detailed how the orchestration engine works with OpenAI’s GPT-4 large language model (LLM) and with first-party and third-party tools.
We started with one tool which was search and now we've already added several more. So, for example, we now have image creation and image understanding and even ads generation because we want to make these models available to everyone.
…
We don't think about these large language models as being a big repository of human information. They are, but what's interesting is the emergent behavior that comes from having all that information is…these models have very strong reasoning capabilities…We're using that reasoning to decide what tool to invoke. We think of each of these as a different tool and the model decides, “Okay. At this time, I'm going to invoke this tool with this parameter.”
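The pattern Abrams describes, where the model picks a tool and a parameter and the orchestrator runs it, can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the names `ToolCall`, `choose_tool`, `orchestrate`, and the stub tools are hypothetical, and the keyword rule in `choose_tool` only stands in for the reasoning that GPT-4 performs in the real system. It is not Bing Chat's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ToolCall:
    """What the model is assumed to return: a tool name plus one parameter."""
    tool: str
    parameter: str


# Hypothetical stand-ins for two of the tools named in the talk.
def web_search(query: str) -> str:
    return f"[search results for: {query}]"


def create_image(description: str) -> str:
    return f"[image generated from: {description}]"


# Registry the orchestrator uses to look up a tool by name.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search": web_search,
    "image_creation": create_image,
}


def choose_tool(user_message: str) -> ToolCall:
    """Placeholder for the model's reasoning step.

    In the system described in the talk, the LLM makes this decision.
    A keyword rule is used here only so the sketch runs without a model.
    """
    if user_message.lower().startswith(("draw", "create an image")):
        return ToolCall(tool="image_creation", parameter=user_message)
    return ToolCall(tool="search", parameter=user_message)


def orchestrate(user_message: str) -> str:
    """Ask the 'model' which tool to invoke, then run it with its parameter."""
    call = choose_tool(user_message)
    tool_fn = TOOLS[call.tool]
    return tool_fn(call.parameter)


if __name__ == "__main__":
    print(orchestrate("Draw a cat wearing a hat"))
    print(orchestrate("Who won the 2022 World Cup?"))
```

The design point is that adding a tool means adding an entry to the registry; the decision about which tool to invoke stays with the model rather than with hand-written routing code.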
Multiple Search Styles
Another area where Abrams went into some depth was the different search styles and how they served different purposes. The default search style for Bing Chat is “More Balanced.” However, after hearing Abrams's breakdown, you should probably proactively select one of the other styles most of the time.
More Creative - Powered by GPT-4, it has a slightly different meta prompt in the background to make the responses “richer and more creative.”
More Precise - A GPT-4 fine-tuned model that is more grounded. “I encourage you to use ‘More Precise’ if you’re doing some activity where getting exactly the right answer is important.”
More Balanced - You will receive answers somewhere in between “More Creative” and “More Precise.” Answers are likely to arrive faster, according to Abrams, but the model considers a smaller context window. So, long multi-turn conversational searches may lose context more quickly when using the “More Balanced” style.
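To see why a smaller context window loses earlier turns sooner, consider a minimal Python sketch. It assumes a simple "keep the most recent turns that fit" policy and a crude word-count tokenizer; the function names and the window sizes are hypothetical, and Abrams did not describe how Bing Chat actually trims conversation history.

```python
from typing import List


def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def fit_to_window(turns: List[str], max_tokens: int) -> List[str]:
    """Keep the most recent turns whose combined size fits in max_tokens."""
    kept: List[str] = []
    used = 0
    # Walk backward from the newest turn and stop at the first one that overflows.
    for turn in reversed(turns):
        cost = count_tokens(turn)
        if used + cost > max_tokens:
            break
        kept.append(turn)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


conversation = [
    "Plan a three day trip to Lisbon",
    "Make the second day a beach day",
    "Now add restaurant ideas",
]

if __name__ == "__main__":
    # Hypothetical window sizes, chosen only to show the effect.
    print(len(fit_to_window(conversation, max_tokens=20)))
    print(len(fit_to_window(conversation, max_tokens=12)))
```

With the larger window all three turns survive. With the smaller one the opening request about Lisbon is dropped, so a follow-up question would no longer have the destination in context.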
Changing Information Consumption
Two other topics we discussed in some depth both relate to information consumption. The first addressed adoption. While the Bing Chat experience looks more like ChatGPT or a chatbot text interface, the solution is also integrated alongside the traditional “ten blue links” search on the Bing search page. The reason for this is to offer Bing Chat as a supplement to traditional search and introduce it to users in a context they are already familiar with.
As users transition to chat-based search, other user interface affordances, such as suggestion buttons, highlight how the search can be extended in scope or depth. Also, showing what sources Bing Chat is consulting while it is preparing an answer became an important user experience (UX) element as the product was developed.
The very first demos that we did we had it showing up…but then we had to have a long conversation with the UX team and some of our management…The biggest piece is to help users understand the amount of work that's being done for you under the covers because one of the things that we're trying to differentiate ourselves on is we want to have the best quality results, the best grounded, meaning attributed sources of highest quality in terms of output and readability, and coherence of the conversation.
And the reality is that takes some time and so what we didn't want to do is show a spinner…We wanted to indicate to users a thing is happening now. Imagine how long it would take you to go to two, three, and I've seen sometimes 10 searches…How long would it take you to do those searches click through every result on there to find the answer. Bing is doing that for you.
The second information consumption topic was related to the interpretation of source content. While Bing summarizes content to provide answers, it can also summarize a website's content when using the Edge browser. You can also ask Bing Chat to summarize a website by providing a URL.
Bret: I think this could fundamentally change the way we have to think about building websites.
Brad: Yes. You are saying people may not read every line of our web page but may just go in and ask the model. “What was the top score,” or … it provides a summary automatically on the side if I've got the page open. I go there and it's something that your web team actually has to look at because now you’ve got this third party Bing, or it could be one of the other players, who's then just saying, “Hey, this is what this website and this web page are all about.” Exactly yes.
While Abrams hopes companies don’t have to create content specifically for Bing Chat or the other generative AI-powered search experiences, it is a new consumption format that website owners need to consider. You may note that Synthedia was among the first, and maybe the first source, to identify this important shift. 😀
Peering Behind the Curtains
The generative AI market is moving quickly, and application introduction has followed a torrid pace. It is important that we get a chance to peer behind the curtains to understand how some of the leading solutions operate. That will enable all of us to become better at evaluating applications and identifying the implications for our existing tools, processes, and behaviors.
Also, many thanks go to Brad for taking the time to share the Bing Chat story and technical details. It is the most in-depth discussion I have seen to date on generative AI search and includes ideas and information that were not previously published. I recommend you watch the entire video and arm yourself with more grounding in generative AI search architecture, logic, UX, and strategy.