Why Smarter Models Might Fail at Simulating Human Behavior
Exploring the pitfalls of using advanced language models to simulate human behavior, with a focus on how stronger reasoning capabilities can hinder rather than help.
Large language models are often heralded as the silver bullet for accurate simulations in social, economic, and policy scenarios. But what if their enhanced reasoning abilities are more a curse than a blessing in certain contexts?
Simulation vs. Optimization
Here's the crux: smarter isn't always better when simulating human behavior. When the aim is to reflect plausible, boundedly rational actions, models with strong native reasoning can actually derail the simulation. They tend to over-optimize, gravitating toward strategically dominant actions that don't resemble the messiness of real-world human compromise.
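To make that failure mode concrete, here is a toy sketch, not from the paper, using the classic ultimatum game: a pure payoff maximizer accepts any positive offer, while behavioral experiments have long shown that people routinely reject offers they consider unfair, even at a cost to themselves. The ~30% threshold below is a stylized figure for illustration.

```python
# Toy illustration (not from the paper): in the ultimatum game, a pure
# payoff optimizer and a boundedly rational human respond very differently.

def optimizer_responder(offer: float, pot: float) -> bool:
    """A strict payoff maximizer: any positive amount beats rejecting (payoff 0)."""
    return offer > 0

def humanlike_responder(offer: float, pot: float, fairness_threshold: float = 0.3) -> bool:
    """Boundedly rational: reject offers seen as unfair, even at a personal cost.
    The ~30% threshold is a stylized figure from behavioral economics."""
    return offer / pot >= fairness_threshold

pot = 100.0
for offer in (1.0, 20.0, 40.0):
    print(
        f"offer={offer:>5}: optimizer accepts={optimizer_responder(offer, pot)}, "
        f"humanlike accepts={humanlike_responder(offer, pot)}"
    )
# The optimizer accepts all three offers; the humanlike agent rejects the
# lowball ones. That rejection is exactly the kind of 'messy' behavior a
# faithful simulation needs to reproduce.
```

An agent that always plays the dominant strategy is a better negotiator and a worse stand-in for a human population.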
In three distinct multi-agent negotiation environments, the study examined three conditions: no reflection, bounded reflection, and native reasoning. The results were telling. Bounded reflection consistently led to more diverse and compromise-driven outcomes, a stark contrast to the rigid authority decisions that plagued models operating with native reasoning.
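The article doesn't reproduce the study's prompts or harness, but the three conditions can be sketched as variants of a single negotiation step. Everything below is hypothetical scaffolding under my own assumptions: the `call_model` helper, the prompt wording, and the `max_reflections` cap are illustrative, not the authors' code.

```python
# Hypothetical sketch of the three experimental conditions as variants of
# one negotiation step. `call_model` stands in for any chat-model API and
# is not a real library function.

def call_model(prompt: str, enable_native_reasoning: bool = False) -> str:
    """Placeholder for an LLM call; swap in your provider's client here."""
    # A canned move keeps the sketch runnable without an API key.
    return "Offer a 50/50 split."

def negotiate_step(history: str, condition: str, max_reflections: int = 2) -> str:
    base = f"Negotiation so far:\n{history}\nPropose your next move."
    if condition == "no_reflection":
        # Single pass: the model answers directly.
        return call_model(base)
    if condition == "bounded_reflection":
        # A small, capped number of critique-and-revise passes.
        move = call_model(base)
        for _ in range(max_reflections):
            critique = call_model(f"{base}\nDraft move: {move}\nBriefly critique it.")
            move = call_model(f"{base}\nCritique: {critique}\nRevise your move.")
        return move
    if condition == "native_reasoning":
        # Let the model's built-in chain-of-thought run unconstrained.
        return call_model(base, enable_native_reasoning=True)
    raise ValueError(f"unknown condition: {condition}")

print(negotiate_step("A: demand 80%. B: counter 50%.", "bounded_reflection"))
```

The interesting design choice is the cap itself: bounded reflection gives the agent enough self-critique to avoid knee-jerk answers, but not enough optimization pressure to collapse onto the strategically dominant move.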
Case Studies
Consider this: in direct OpenAI runs using GPT-5.2, native reasoning ended in authority decisions in all 45 runs across the three experiments. Yet when bounded reflection was applied, the same setup consistently produced compromise outcomes. This highlights a fundamental mismatch between model capability and simulation fidelity.
Why It Matters
So, why should we care? If language models are employed to simulate human decision-making, shouldn't they actually mimic human behavior? A model that functions well as a solver may not necessarily serve as a credible simulator. The paper's key contribution is a methodological warning: don't conflate problem-solving prowess with the ability to simulate human-like behavior.
This builds on prior work from researchers who have emphasized the importance of aligning model objectives with simulation goals. A question worth pondering: are we too focused on the sophistication of models at the expense of their practical applicability in behavioral simulations?
Key Terms Explained
GPT: Generative Pre-trained Transformer.
OpenAI: The AI company behind ChatGPT, GPT-4, DALL-E, and Whisper.
Optimization: The process of finding the best set of model parameters by minimizing a loss function.
Reasoning: The ability of AI models to draw conclusions, solve problems logically, and work through multi-step challenges.