OpenAI Racks Up Data and Content Deals with News Corp, Vox, and The Atlantic
More high-quality data for OpenAI, less incentive to give in to NYT demands
OpenAI closed three new data and content deals last week with news publishers News Corp, Vox, and The Atlantic. This follows earlier deals with The Associated Press, Axel Springer, Le Monde, Prisa, and The Financial Times. Each of the deals enables OpenAI to use the publishers’ content catalogs to train its AI foundation models. They also provide access to current news via OpenAI’s ChatGPT. Publishers receive fees for access to their content and stay relevant to consumers if generative AI chat assistants become an important channel for news consumption.
News Corp
The News Corp agreement is the most substantial in terms of content depth and breadth. According to the announcement:
OpenAI has permission to display content from News Corp mastheads in response to user questions and to enhance its products…OpenAI will receive access to current and archived content from News Corp’s major news and information publications, including The Wall Street Journal, Barron’s, MarketWatch, Investor’s Business Daily, FN, and New York Post; The Times, The Sunday Times and The Sun; The Australian, news.com.au, The Daily Telegraph, The Courier Mail, The Advertiser, and Herald Sun; and others…
In addition to providing content, News Corp will share journalistic expertise to help ensure the highest journalism standards are present across OpenAI’s offering.
There was no explicit mention of News Corp receiving compensation from OpenAI or plans to use OpenAI tools internally. However, those were elements of earlier deals and are likely included in the “multi-year agreement.”
Vox Media
Vox Media positioned its OpenAI agreement as “strategic.” In addition to providing access to content, Vox says it plans to collaborate with OpenAI to launch “innovative products” and integrate the technology with the Forte data platform.
Vox Media’s respected portfolio of properties including Vox, The Verge, Eater, New York Magazine, The Cut, Vulture, and SB Nation, will help inform ChatGPT’s 100 million users, receiving brand attribution and audience referrals. The two companies will also collaborate using OpenAI’s technology to develop innovative products for Vox Media’s consumers and advertising partners…
Additionally, OpenAI will enhance its technology with Vox Media’s archives, which contain a deep well of reliable and accountable information and journalism…Vox Media will also use OpenAI technology to extend the leadership of Forte, its first-party data platform…Advertisers will benefit from the OpenAI partnership through even stronger creative optimization and audience segment targeting, leading to even higher campaign performance.
The Atlantic
The Atlantic also positioned the OpenAI agreement as “strategic.” The publisher intends to launch a new microsite that showcases its AI products, including collaboration with OpenAI.
The Atlantic’s articles will be discoverable within OpenAI’s products, including ChatGPT, and as a partner, The Atlantic will help to shape how news is surfaced and presented in future real-time discovery products. Queries that surface The Atlantic will include attribution and a link to read the full article on theatlantic.com.
As part of this agreement, The Atlantic and OpenAI are also collaborating on product and tech: The Atlantic’s product team will have privileged access to OpenAI tech, give feedback, and share use-cases to shape and improve future news experiences in ChatGPT and other OpenAI products. The Atlantic is currently developing an experimental microsite, called Atlantic Labs, to figure out how AI can help in the development of new products and features to better serve its journalism and readers––and will pilot OpenAI’s and other emerging tech into this work.
What it Means
OpenAI has three objectives driving these deals. First, it needs high-quality data to train its AI foundation models. The race among AI model developers is transitioning from competition around accumulating more data to the quest for higher-quality data. Data quality has been a key driver in recent benchmark performance improvements of several AI foundation models.
Second, OpenAI would like to provide a better user experience for ChatGPT subscribers. The company believes it can grow the current 100 million weekly active users and increase usage frequency by adding more timely news coverage with attribution links. This puts OpenAI on a collision course with Google and with its own key patron, Microsoft.
Finally, each licensing deal ensures that OpenAI will not be sued for its previous use of content from the publishers in training its earlier models. Though the company believes it had the right to use that public data to train its models, these agreements serve the purpose of reducing the risk of future lawsuits.
This is also more bad news for The New York Times. Instead of cutting a deal with OpenAI, the publisher filed a lawsuit. News publishers’ content varies in quality, and some have niche reporting that may be unique. However, publishers largely deliver similar news. In addition, OpenAI is primarily interested in high-quality content to train its models. Given the breadth of deals that OpenAI has already cut, the value of a deal with The New York Times has surely diminished. OpenAI may still attempt to strike an agreement, but its incentives to bend to the publisher’s demands lessen with every new publisher deal it inks.
I wonder if OpenAI (apart from collecting more training data) is preparing, given all these content deals, some kind of rich news-reading experience inside ChatGPT.
I could imagine users 'enabling' their favorite news sources inside the ChatGPT interface and then, when they ask, "Hey ChatGPT, what's new today?", receiving a curated generative news experience.