Can LLMs truly understand language or do they only mimic statistical patterns?

Are LLMs just statistical parrots, or do they actually understand language?

The central debate is whether LLMs genuinely grasp meaning or merely excel at predicting the next word from learned probabilities. Blaise Agüera y Arcas takes a strong stance, arguing that statistics do amount to understanding in any falsifiable sense, and that complex sequence learning can be a sufficient basis for general intelligence [1]. On this view, sufficiently rich pattern learning is itself a form of understanding.
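To make "predicting the next word based on probabilities" concrete, here is a minimal sketch of a bigram model in Python. It is a toy illustration only: real LLMs use neural networks over long contexts, not word-pair counts, and the corpus and function names (`train_bigram`, `next_word_probs`, `predict_next`) are assumptions made up for this example.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, how often each other word follows it."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, prev):
    """Convert the follower counts for `prev` into probabilities."""
    followers = counts.get(prev.lower(), Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # word never seen with a follower
    return {word: c / total for word, c in followers.items()}


def predict_next(counts, prev):
    """Return the most probable next word, or None for an unseen word."""
    probs = next_word_probs(counts, prev)
    return max(probs, key=probs.get) if probs else None


# Hypothetical toy corpus, chosen only for illustration.
corpus = "the cat sat on the mat the cat ate the fish"
model = train_bigram(corpus)

print(next_word_probs(model, "the"))
print(predict_next(model, "the"))
```

The model has no notion of what a cat is; it only records which words tend to follow which. Whether scaling this kind of prediction up far enough yields understanding is exactly what the debate is about.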

However, a 2025 benchmark study reports persistent weaknesses in LLMs' numerical reasoning, including basic arithmetic and magnitude comparison, and attributes them to reliance on surface-level statistical patterns rather than a grasp of numbers as continuous magnitudes [4]. This suggests that LLMs can appear to understand while still failing at tasks that require basic comprehension, which supports the 'statistical mimicry' view.
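The difference between a surface pattern and a magnitude can be sketched in a few lines of Python. This is not the benchmark's method and not a claim about how any LLM computes internally; the heuristic `larger_by_surface` and the number pairs are hypothetical, chosen only to show how a rule that works on the text of a number can disagree with its actual value.

```python
from decimal import Decimal


def larger_by_surface(a, b):
    """Hypothetical surface heuristic: compare dot-separated chunks as
    whole numbers, the way version strings such as 9.11 and 9.9 are compared."""
    def key(s):
        return tuple(int(part) for part in s.split("."))
    return a if key(a) >= key(b) else b


def larger_by_magnitude(a, b):
    """Compare the two strings by their actual numeric value."""
    return a if Decimal(a) >= Decimal(b) else b


# Illustrative pairs, not taken from any benchmark.
pairs = [("9.11", "9.9"), ("0.5", "0.25"), ("3.7", "3.65"), ("2.5", "1.75")]

disagreements = [
    (a, b) for a, b in pairs
    if larger_by_surface(a, b) != larger_by_magnitude(a, b)
]

for a, b in pairs:
    print(a, b, "surface:", larger_by_surface(a, b),
          "magnitude:", larger_by_magnitude(a, b))
```

The surface rule gets the last pair right and the first three wrong, which is the kind of gap a numeracy benchmark is designed to expose: plausible answers on some inputs, systematic errors on others.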

What does the evidence actually show about what LLMs can and cannot do?

LLMs have demonstrated that human-like grammatical language can be acquired without a built-in grammar, suggesting statistical learning is powerful enough to explain much of language acquisition [2]. This supports the idea that statistical patterns can lead to impressive linguistic abilities.

Yet a 2024 paper argues that LLMs' desirable qualities, such as zero-shot rule extrapolation and in-context learning, are not simply a consequence of good statistical generalization and require a separate explanation [3]. In other words, statistical generalization alone does not account for these abilities, although whatever does account for them need not be human-like understanding. Anders Søgaard adds that while LLMs can learn inferential semantics (relationships between words), they struggle with referential semantics (connecting words to real-world objects) unless grounded through additional techniques [5].

So, do they understand or not? Here's the nuanced answer.

The evidence points to a middle ground: LLMs do not understand language the way humans do, but their statistical learning can produce a functional, albeit limited, form of understanding. The 2025 benchmark documents clear gaps in numerical reasoning [4], while Agüera y Arcas argues in his 2022 paper that statistics can constitute understanding [1]. These positions are not contradictory. Taken together, they suggest that LLMs have a different kind of understanding, one that is pattern-based and task-specific.

Ultimately, the question may be unanswerable objectively, as Agüera y Arcas notes: since the interior state of another being can only be understood through interaction, no objective answer is possible to the question of when an 'it' becomes a 'who' [1]. For practical purposes, LLMs can understand language in some contexts but fail in others, especially where deep reasoning or real-world grounding is required.

Sources used in this answer

Do Large Language Models Understand Us?

Argues that statistics do amount to understanding in any falsifiable sense, and that complex sequence learning may be sufficient for general intelligence [1].

2022 · Blaise Agüera y Arcas · Daedalus

Large Language Models Demonstrate the Potential of Statistical Learning in Language

Demonstrates that human-like grammatical language can be acquired without a built-in grammar, supporting the power of statistical learning [2].

2023 · Pablo Contreras Kallens, Ross Deans Kristensen-McLachlan, Morten H. Christiansen · Cognitive Science

Understanding LLMs Requires More Than Statistical Generalization

Shows that LLMs' desirable qualities like zero-shot rule extrapolation are not simply due to good statistical generalization, requiring separate explanation [3].

2024 · Patrik Reizinger, Szilvia Ujváry, Anna Mészáros, Anna Kerekes, Wieland Brendel, Ferenc Huszár · arXiv (Cornell University)

Exposing Numeracy Gaps: A Benchmark to Evaluate Fundamental Numerical Abilities in Large Language Models

Reveals persistent weaknesses in LLMs' numerical reasoning (e.g., basic arithmetic) via a new benchmark, highlighting reliance on surface patterns [4].

2025 · Haoyang Li, Xuejia Chen, Zhanchao Xu, Darian Li, Nicole Hu, Fei Teng, Yiming Li, Luyu Qiu, Chen Jason Zhang, Li Qing, Lei Chen · Findings of the Association for Computational Linguistics: ACL 2025

Understanding models understanding language

Distinguishes between inferential and referential semantics, noting LLMs can learn the former but struggle with the latter without grounding [5].

2022 · Anders Søgaard · Synthese
