This is an extract from "Large language mistake - Cutting-edge research shows language is not the same as intelligence. The entire AI bubble is built on ignoring it" by Benjamin Riley (published in The Verge, November 2025).
The article contains further details and is well worth reading.
Premise

Recent enthusiasm around large language models (LLMs) often assumes that strong linguistic performance is equivalent to genuine intelligence. The central message of the article is that this assumption is fundamentally flawed: language competence and cognition are not the same thing.
Prediction isn't understanding
LLMs operate by estimating the next most probable token. This predictive mechanism is powerful and produces coherent text at scale, but it does not constitute reasoning or comprehension. The human mind, by contrast, does not rely on linguistic prediction as the core of its cognition.
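To make the mechanism concrete, here is a minimal sketch of next-token prediction using a toy bigram model. The corpus and sampling scheme are invented for illustration and are vastly simpler than a real LLM; the point is only that the generation loop is statistical prediction, not comprehension.

```python
# Toy illustration (not the article's own example): next-token prediction with a
# bigram model. Choose each next token from a probability distribution conditioned
# on the preceding token, which is the same basic idea an LLM applies at scale.
import random
from collections import Counter, defaultdict

corpus = "the cat sat on the mat and the dog sat on the rug".split()

# Count how often each token follows each preceding token.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def next_token(prev: str) -> str:
    """Sample the next token in proportion to how often it followed `prev`."""
    counts = following[prev]
    tokens, weights = zip(*counts.items())
    return random.choices(tokens, weights=weights)[0]

# Generate text by repeatedly predicting a probable continuation.
token, output = "the", ["the"]
for _ in range(8):
    if not following[token]:  # no observed continuation; stop generating
        break
    token = next_token(token)
    output.append(token)
print(" ".join(output))
```

The output can look locally fluent, yet nothing in the loop represents meaning; it only tracks which strings tend to follow which other strings.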
Neuroscientific evidence reinforces this point. People with severe language impairments can retain full reasoning capabilities, and individuals who lack conventional language altogether can still think and solve problems effectively. In other words, human cognition exists independently of linguistic fluency, a sharp contrast with LLMs, for which text is the only substrate.
Scaling alone won't unlock general intelligence
The optimistic belief that increasing model size, training data and compute will naturally converge on artificial general intelligence overlooks one critical fact: the human mind is multimodal, embodied and context-rich. Perception, sensorimotor grounding, causal reasoning, memory, intuition... none of these arise simply from scaling text-based pattern recognition.
LLMs excel at capturing statistical structure in language, but they lack the experiential grounding that underpins human understanding. Expecting them to spontaneously develop general reasoning through scale alone is, at best, wishful thinking.
Useful tools, but not emerging minds
This doesn't undermine the value of LLMs. They remain exceptional at tasks that are fundamentally linguistic: summarisation, translation, drafting, and rapid synthesis of textual information. Their limitations become problematic only when they are deployed as if they were agents with a deep comprehension and world model that they do not possess.
Recognising these limits is essential if we want to develop systems that move beyond language modelling toward real, grounded intelligence.
To me the critique strongly echoes John Searle's classic Chinese Room thought experiment: Searle argued that a system can manipulate symbols according to rules and still lack any understanding of what those symbols mean. LLMs fit this description almost perfectly: they map strings to other strings with extraordinary fluency, but without any semantic grasp of the content they produce. Their competence is syntactic rather than cognitive.
This reinforces the central point: impressive language output is not evidence of genuine comprehension or internal understanding.
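As a closing illustration, here is a tiny, hypothetical sketch of the kind of rule-following Searle describes. The phrases and lookup rules are my own invention, not Searle's or the article's; they simply show how syntactically adequate replies can be produced with no semantics attached.

```python
# Toy illustration: a Searle-style "rulebook" that maps input strings to output
# strings. Fluent-looking replies are produced by pure symbol manipulation,
# with no grasp of what the symbols mean.
RULEBOOK = {
    "how are you?": "I am doing well, thank you for asking.",
    "what is the capital of france?": "The capital of France is Paris.",
}

def reply(message: str) -> str:
    """Look the message up in the rulebook; fall back to a generic response."""
    return RULEBOOK.get(message.lower().strip(), "That is an interesting question.")

print(reply("How are you?"))                      # fluent output, zero understanding
print(reply("What is the capital of France?"))
```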