Experts Warn: 90% of Online Content Could Be AI-Generated by 2025, Risking Data Quality and Diversity
August 26, 2024
Interest in artificial intelligence has surged, with Google search interest in the topic reaching 92% of its peak over the past year.
The internet is increasingly saturated with AI-generated content, with OpenAI's CEO noting that the company produces around 100 billion words daily.
By one estimate, approximately 57% of all web-based text is currently AI-generated, raising sustainability concerns for both AI and the internet.
Experts predict that by 2025, 90% of online content could be AI-generated, complicating the training of future AI models.
AI-generated content spans various formats, including restaurant reviews and news articles, with over a thousand websites identified as producing erroneous AI-generated news.
Training on such content risks model collapse, which reduces the diversity of AI outputs and can lead to homogenized content that lacks real-world representation.
Experts warn that the decline in high-quality training content could worsen, with some predictions suggesting that the supply of quality human-generated data may run out by 2026.
To combat these issues, AI companies should prioritize acquiring high-quality, diverse real data over relying on synthetic data harvested from the internet.
Researchers emphasize that AI-generated data often serves as a poor substitute for real data, leading to inaccuracies and hallucinations.
Model collapse, a growing concern, occurs when AI models are trained on their own generated data, which degrades their performance and accuracy.
As AI technologies advance, traditional CAPTCHAs face challenges, as sophisticated AI can analyze images and interact with websites in a human-like manner.
Dr. Ilia Shumailov highlights the necessity for collaboration among AI developers to address model collapse and ensure data provenance.
Summary based on 6 sources
Sources

The New York Times • Aug 26, 2024
When A.I.’s Output Is a Threat to A.I. Itself
Forbes • Aug 26, 2024
Is AI Quietly Killing Itself—And The Internet?
ABC News • Aug 25, 2024
What is 'model collapse'? An expert explains the rumours about an impending AI doom
DEV Community • Aug 22, 2024
Will Robots Finally Beat CAPTCHA?