This paper introduces "Selection Theorems" for autonomous agents, proving that low average-case regret on action-conditioned prediction tasks forces agents to implement structured, predictive internal states. It demonstrates that robust competence in POMDPs necessitates world models and belief-like memory, establishing these representation-necessity results without assuming optimality or determinism.
TL;DR
Why do the most capable AI agents — from DreamerV3 to biological brains — seem to converge on similar internal world models? This paper by Aran Nayebi provides a rigorous mathematical answer: Selection Theorems. It proves that if an agent achieves low regret across a diverse family of tasks, it is mathematically "forced" to represent the underlying causal structure of its environment. Predictive internal state isn't just a good design choice; it's a structural necessity for survival in uncertain worlds.
The "As-If" Trap: Moving Beyond Sufficiency
For decades, the Control Theory community has operated under a constructive paradigm: we know that if you have a belief state (like a Kalman filter or a POMDP belief vector), you can act optimally. However, this only proves sufficiency. It doesn't prove that a black-box neural network must develop such a state to be competent.
The author addresses the "Good Regulator Theorem" pitfall — where a simple, constant policy might look competent in a trivial environment without actually "modeling" anything. By introducing average-case regret over structured task families, Nayebi shows that as tasks become deeper and more varied, the "shortcuts" disappear, leaving predictive modeling as the only viable path to low regret.
Methodology: The Power of Binary Bets
The core technical innovation is reducing world modeling to a game of "betting."
1. The Betting Reduction
Any prediction task (e.g., "Where will the ball be in 5 seconds?") can be decomposed into binary choices. The agent chooses between two incompatible branches:
- Branch L: Outcome counts are $\leq k$.
- Branch R: Outcome counts are $> k$.
The author proves that an agent's normalized regret ($\delta$) is directly proportional to the probability mass the agent assigns to the "wrong" bet.
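The betting reduction above can be sketched numerically. This is a minimal toy model, not the paper's construction: I assume an outcome count drawn from a binomial, an arbitrary threshold $k$, and a bettor who splits probability mass between the two branches. The point it illustrates is the one just stated: regret scales directly with the mass placed on the wrong branch.

```python
from math import comb

# Toy betting reduction (assumed parameters, not from the paper): an outcome
# count C ~ Binomial(n, p), and the agent bets between branch L (C <= k)
# and branch R (C > k).
n, p, k = 20, 0.6, 12
p_L = sum(comb(n, c) * p**c * (1 - p) ** (n - c) for c in range(k + 1))

def normalized_regret(q_L):
    """Expected regret when the agent puts probability mass q_L on branch L.

    The best response puts all mass on the more likely branch; any mass
    assigned to the 'wrong' branch contributes regret in direct proportion.
    """
    best = max(p_L, 1 - p_L)
    achieved = q_L * p_L + (1 - q_L) * (1 - p_L)
    return best - achieved

# All mass on the majority branch -> zero regret; mass on the minority
# branch -> regret linear in that misallocated mass.
q_opt = 1.0 if p_L >= 0.5 else 0.0
```

Betting everything on the majority branch yields zero regret, while hedging toward the wrong branch incurs regret linear in the misallocated mass.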
2. Formalizing Necessity
If an agent manages to keep its regret low across many such bets, it must be distinguishing between the world-states that make those bets different. In the paper's framework, this leads to the recovery of the Interventional Kernel (Pearl’s Level 2 Causality).
The theorem shows that as the goal depth $n$ increases, the agent is forced to estimate the transition dynamics to within an error on the order of $1/\sqrt{n}$.
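The $1/\sqrt{n}$ rate can be checked with a quick Monte Carlo sketch. All parameters here are assumed for illustration: estimating a transition probability from $n$ samples (a stand-in for the paper's goal depth $n$) yields an average error shrinking roughly as $n^{-1/2}$.

```python
import random

random.seed(0)

# Hedged numerical check of the 1/sqrt(n) rate: the error in estimating a
# transition probability from n samples shrinks roughly as n**-0.5.
# p_true and the sample sizes below are assumed toy values.
p_true = 0.3

def estimation_error(n, trials=500):
    """Mean absolute error of the empirical frequency across many runs."""
    total = 0.0
    for _ in range(trials):
        hits = sum(random.random() < p_true for _ in range(n))
        total += abs(hits / n - p_true)
    return total / trials

e_small, e_large = estimation_error(100), estimation_error(10_000)
# 100x more data buys roughly 10x more precision -- the 1/sqrt(n) rate.
```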
Partial Observability and "No-Aliasing"
One of the most significant contributions is solving an open question in world-model recovery for POMDPs. In partially observed environments, different histories can look the same (aliasing).
The Memory Necessity Theorem proves that any low-regret agent cannot alias histories that require different high-confidence bets: if History A and History B lead to different future observations, a competent agent's internal memory states for the two must differ, even when the current observation is identical. This provides a normative pressure for the emergence of "belief-like" memory.
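The no-aliasing argument can be made concrete with a minimal sketch (an assumed toy setup, not the paper's formalism): two histories `A` and `B` end in the same observation `o` but demand different high-confidence bets. A memoryless policy maps the current observation to a single bet, so it must err on at least one history.

```python
# Toy no-aliasing example: histories A and B share the current observation
# 'o' but predict different next outcomes (the bets 0 and 1 are assumed).
next_obs = {("A", "o"): 0, ("B", "o"): 1}  # history -> required bet

def memoryless_regret(bet_on_o):
    """A memoryless agent bets identically after both histories."""
    return sum(bet_on_o != target for target in next_obs.values())

def memoryful_regret(bets):
    """An agent whose memory separates A from B can match both targets."""
    return sum(bets[h] != target for h, target in next_obs.items())

# Any memoryless bet errs on exactly one of the two histories, while a
# belief-like memory that distinguishes them achieves zero regret.
```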
Key Results & Structured Task Success
The paper extends these theorems to explain why certain "Brain-like" features emerge in AI:
- Modularity: Block-structured tests (independent sub-tasks) select for informational modularity in the agent's architecture.
- Regime Tracking: Shifting mixtures of tasks force the agent to maintain "latent variables" (analogous to affective or homeostatic states in neuroscience) to track the current regime.
- Representational Match: Under a condition called "$\gamma$-minimality," any two low-regret agents—regardless of their internal architecture—must converge to the same internal partitions (up to an invertible recoding).
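The regime-tracking item above can be sketched as a tiny Bayesian filter. This is a generic illustration under assumed parameters (the two regimes, their observation statistics, and the switching rate are all invented), not the paper's construction: under a shifting mixture of regimes, staying calibrated requires carrying a latent belief over which regime is active.

```python
# Hedged sketch of regime tracking under a shifting task mixture.
# Regimes, observation statistics, and the switch rate are assumed.
P_OBS = {"calm": {"hi": 0.2, "lo": 0.8}, "storm": {"hi": 0.9, "lo": 0.1}}
SWITCH = 0.05  # chance the latent regime flips between steps

def update_belief(b_storm, obs):
    """One Bayesian filtering step for the latent regime variable."""
    # Predict: the regime may have switched since the last step.
    prior = b_storm * (1 - SWITCH) + (1 - b_storm) * SWITCH
    # Correct: reweight by how well each regime explains the observation.
    num = prior * P_OBS["storm"][obs]
    den = num + (1 - prior) * P_OBS["calm"][obs]
    return num / den

b = 0.5
for obs in ["hi", "hi", "hi"]:  # repeated 'hi' observations favor 'storm'
    b = update_belief(b, obs)
```

After a few observations characteristic of one regime, the latent belief concentrates on it, which is exactly the "regime-tracking" internal state the selection pressure demands.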
Equation (23/25) relates the policy's threshold choices to the actual binomial median of the environment, proving that the agent's internal "report" bits must track the environment's true transition probabilities.
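The median-tracking relation can be illustrated with a small computation. This is a hedged sketch, not the paper's derivation: I assume outcome counts are binomial and ask which threshold $k$ splits their mass most evenly. That threshold sits at the binomial median, which in turn pins down $np$, so a low-regret threshold choice reveals the transition probability.

```python
from math import comb

# Hedged illustration of the threshold/median relation: the split
# L (C <= k) vs R (C > k) is balanced at the binomial median, which
# tracks n*p. The parameters below are assumed for illustration.
def binomial_cdf(n, p, k):
    return sum(comb(n, c) * p**c * (1 - p) ** (n - c) for c in range(k + 1))

def median_threshold(n, p):
    """Smallest k with P(C <= k) >= 1/2 -- the binomial median."""
    return next(k for k in range(n + 1) if binomial_cdf(n, p, k) >= 0.5)

# For the binomial, the median lies within one count of n*p, so a policy
# whose threshold bets are calibrated implicitly reports p itself.
n = 50
medians = {p: median_threshold(n, p) for p in (0.2, 0.5, 0.8)}
```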
Deep Insight: The Convergence of NeuroAI
The most profound takeaway is the link to the Platonic Representation Hypothesis. If task-general performance "compresses" the space of possible internal representations, then the fact that our LLMs and RL agents are starting to show brain-like representational alignment isn't a fluke. It's a mathematical inevitability.
The Takeaway: We don't need to hard-code "consciousness" or "world models" into agents. Instead, by scaling the diversity and depth of tasks they must solve under uncertainty, we are logically forcing these structures to emerge. Robust agency and structured internal world models are two sides of the same coin.
Perspectives and Future Work
While the paper proves the necessity of Level 2 (Interventional) models, it also shows that Level 3 (Counterfactual) models cannot be guaranteed by low regret alone. To reach the highest level of causal reasoning, agents might need even more specific "selection pressures" or architectural inductive biases that go beyond simple task competence.
