Google Faces Scrutiny for Using Publisher Content in AI Despite Opt-Outs Amid Antitrust Battle
May 3, 2025
Prosecutors argue that Google's retention of access to publisher content, even after opt-outs, enables the company to maintain its market dominance and develop high-quality AI models that competitors cannot match.
In addition to the current trial, Google faces another antitrust trial in September concerning its advertising technology business, with the DOJ proposing significant changes, including the divestiture of its ad exchange and publisher ad server businesses.
Despite these challenges, Alphabet reported strong financial performance: Q4 revenue reached $96.5 billion, up 12% year over year, and Google Cloud revenue grew 30%.
During a recent court hearing, Eli Collins, Vice President at Google DeepMind, revealed that Google continues to use publisher content to train its AI for search, despite explicit opt-out requests from website owners.
Publishers can opt out of having their content used for search AI only by also excluding themselves from search indexing through robots.txt, the standard that governs web crawling.
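To make the robots.txt mechanics concrete, here is a minimal sketch using Python's standard-library parser. It assumes the AI opt-out is expressed through Google's published `Google-Extended` user-agent token, which the reporting summarized here does not name. The robots.txt contents, the URL, and the helper `can_fetch` are all hypothetical. The sketch only shows how per-crawler rules are evaluated; it says nothing about how Google actually applies them internally.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an illustrative publisher site (an assumption,
# not taken from the article): opt out of the AI-training token while
# leaving the search crawler fully allowed.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: Googlebot
Allow: /
"""


def can_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


URL = "https://example.com/news/story.html"  # placeholder URL

# Rules are matched per user agent, so the two tokens get different answers.
extended_allowed = can_fetch(ROBOTS_TXT, "Google-Extended", URL)
googlebot_allowed = can_fetch(ROBOTS_TXT, "Googlebot", URL)

print("Google-Extended allowed:", extended_allowed)
print("Googlebot allowed:", googlebot_allowed)
```

Under this assumed setup the page remains crawlable by Googlebot, which is the situation the testimony describes: a publisher who wants content kept out of search AI would have to disallow Googlebot as well, and so drop out of search indexing.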
An internal document from August 2024 indicated that Google filtered out 80 billion tokens from its AI training data, corresponding to content from publishers who opted out.
This situation has sparked ongoing debates about copyright and the use of online content by technology companies, raising significant concerns for content creators and the media industry.
The outcome of the antitrust lawsuit against Google could establish important precedents regarding how tech companies manage content permissions in the AI era, potentially leading to new regulations that enhance publishers' control over their intellectual property.
The Department of Justice (DOJ) is using these revelations in its antitrust case against Google, pointing to internal documents that show how much data was removed from DeepMind's training pool.
A decision on antitrust remedies is anticipated later in 2025, which could have profound implications for publisher control over data used in AI training.
Even as publishers increasingly push for greater transparency and control over their content, the balance of power appears to be shifting toward tech platforms as AI-generated content becomes more prominent in search results.
Summary based on 19 sources
Sources

Yahoo Finance • May 3, 2025
Google Can Train Search AI With Web Content After AI Opt-Out
Economic Times • May 4, 2025
Google can train search AI with web content after AI opt-out
Business Standard • May 4, 2025
Google can train search AI on web content even if publishers opt out
The Business Times
Google can train search AI with web content even after opt-out