Apple's AI Research Sparks Debate: Do Large Reasoning Models Truly Think?
June 13, 2025
Apple's recent research paper, titled 'The Illusion of Thinking', has ignited a debate within the generative AI community regarding the capabilities of large reasoning models (LRMs).
The paper contends that current LRMs from companies like OpenAI, DeepSeek, Anthropic, and Google do not truly think or reason; instead, they excel at pattern recognition and mimicry.
According to the research, LRM accuracy degrades and then collapses entirely once task complexity passes a threshold, suggesting these models are not a viable path toward achieving artificial general intelligence (AGI).
The study evaluated models on classic planning puzzles, such as the Tower of Hanoi and River Crossing, chosen because their difficulty can be scaled in a controlled way; performance dropped sharply as the puzzles grew larger.
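To make that scaling concrete: solving the Tower of Hanoi with n disks requires exactly 2^n − 1 moves, so the length of a full solution grows exponentially. A minimal recursive solver (a generic sketch for illustration, not Apple's evaluation harness) shows the blow-up:

```python
def hanoi(n, source, target, spare, moves):
    """Append the optimal move sequence for n disks onto `moves`."""
    if n == 0:
        return
    hanoi(n - 1, source, spare, target, moves)  # clear n-1 disks out of the way
    moves.append((source, target))              # move the largest disk
    hanoi(n - 1, spare, target, source, moves)  # restack the n-1 disks on top

for n in (3, 7, 10, 15):
    moves = []
    hanoi(n, "A", "C", "B", moves)
    print(n, len(moves))  # 7, 127, 1023, 32767 -- always 2**n - 1
```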
Despite the controversy surrounding the paper, many practitioners maintain that the findings do not undermine the practical utility of AI tools in everyday applications.
The paper quickly found supporters, who argued that its findings deflate the hype surrounding AI reasoning models and that these systems primarily memorize patterns rather than engage in genuine reasoning.
The timing of the paper's release coincided with Apple's WWDC event, leading to speculation that it aimed to manage expectations amid the company's ongoing challenges in AI development.
Apple's cautious approach to AI contrasts with the more aggressive strategies of competitors like Google and Samsung.
This controversy underscores the complexity of benchmarking AI models, emphasizing the need for careful evaluation metrics that do not unfairly constrain a model's perceived capabilities.
Researchers affiliated with Anthropic and Open Philanthropy criticized the paper for ignoring output token limits, noting that models can demonstrate high accuracy on the same puzzles when allowed to write code rather than enumerate every move. Their rebuttal paper, titled 'The Illusion of The Illusion of Thinking', argues that Apple's methodology was fundamentally flawed and that the models can reason effectively under different evaluation conditions.
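The rebuttal's output-limit argument can be checked with back-of-the-envelope arithmetic. In the sketch below, the tokens-per-move figure and the output budget are illustrative assumptions, not numbers from either paper:

```python
# Enumerating every move of a large Tower of Hanoi instance can exceed
# a model's output budget, even if the model "knows" the algorithm.
TOKENS_PER_MOVE = 5    # assumed cost to print one move, e.g. "A -> C"
OUTPUT_LIMIT = 64_000  # assumed output-token budget; varies by model

for n in (10, 12, 15, 20):
    moves = 2**n - 1                  # optimal solution length
    tokens = moves * TOKENS_PER_MOVE
    verdict = "over" if tokens > OUTPUT_LIMIT else "within"
    print(f"n={n}: {moves:,} moves ~ {tokens:,} tokens ({verdict} budget)")
```

Under these assumptions a 15-disk instance already overruns the budget, which is why the rebuttal argues that asking a model for a program that generates the moves is a fairer test than asking for the move list itself.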
Cognitive scientist Gary Marcus backed Apple's findings, arguing that today's AI models lack true understanding or reasoning, which in his view makes the path to AGI look increasingly remote.
Summary based on 2 sources
Sources

VentureBeat • Jun 13, 2025
Do reasoning models really “think” or not? Apple research sparks lively debate, response
Economic Times • Jun 13, 2025
Apple Paper questions path to AGI, sparks division in GenAI group