Experts Warn: 90% of Online Content Could Be AI-Generated by 2025, Risking Data Quality and Diversity

August 26, 2024
  • Interest in artificial intelligence has surged, with Google searches reaching 92% of their peak over the past year.

  • The internet is increasingly saturated with AI-generated content, with OpenAI's CEO noting that the company produces around 100 billion words daily.

  • Currently, approximately 57% of all web-based text is AI-generated, raising sustainability concerns for both AI and the internet.

  • Experts predict that by 2025, 90% of online content could be AI-generated, complicating the training of future AI models.

  • AI-generated content spans various formats, including restaurant reviews and news articles, with over a thousand websites identified as producing erroneous AI-generated news.

  • Model collapse reduces the diversity of AI outputs, potentially producing homogenized content that fails to represent the real world.

  • Experts warn that the decline in high-quality training content could worsen, with some predictions suggesting that the supply of quality human-generated data may run out by 2026.

  • To combat these issues, AI companies should prioritize acquiring high-quality, diverse real data over relying on synthetic data harvested from the internet.

  • Researchers emphasize that AI-generated data often serves as a poor substitute for real data, leading to inaccuracies and hallucinations.

  • Model collapse is a growing concern, occurring when AI models are trained on their own generated data, leading to degraded performance and accuracy.

  • As AI technologies advance, traditional CAPTCHAs face challenges, as sophisticated AI can analyze images and interact with websites in a human-like manner.

  • Dr. Ilia Shumailov highlights the necessity for collaboration among AI developers to address model collapse and ensure data provenance.
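
The model collapse mechanism described in the summary can be illustrated with a toy simulation. This is a minimal sketch under strong assumptions, not the method used in the cited research: the "model" is just a Gaussian fitted to a small sample, and each generation is trained only on data drawn from the previous generation's fit. The function name `simulate_collapse` and its parameters are hypothetical, chosen for illustration.

```python
import random
import statistics


def simulate_collapse(generations=200, sample_size=10, seed=0):
    """Repeatedly fit a Gaussian to samples drawn from the previous fit.

    Returns the fitted standard deviation at each generation, starting
    with the standard deviation of the original ("real") distribution.
    """
    rng = random.Random(seed)
    mean, std = 0.0, 1.0  # generation 0: the assumed real-data distribution
    history = [std]
    for _ in range(generations):
        # "Generate" synthetic data from the current model.
        samples = [rng.gauss(mean, std) for _ in range(sample_size)]
        # "Train" the next model on that synthetic data alone.
        mean = statistics.fmean(samples)
        std = statistics.pstdev(samples)
        history.append(std)
    return history


if __name__ == "__main__":
    history = simulate_collapse()
    for generation in (0, 50, 100, 200):
        print(f"generation {generation:3d}: std = {history[generation]:.3g}")
```

Because each fit sees only a finite sample of the previous generation's output, estimation error compounds and the fitted spread tends to shrink toward zero, a simple analogue of the loss of diversity noted above. Real language models are far more complex, so this shows only the direction of the effect, not its magnitude.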

Summary based on 6 sources



Sources

When A.I.’s Output Is a Threat to A.I. Itself

The New York Times • Aug 26, 2024

Will Robots Finally Beat CAPTCHA?

DEV Community • Aug 22, 2024
