LLMs and Human-Like Decision-Making: A Surface-Level Illusion

Large Language Models (LLMs) are frequently praised for their apparent ability to mimic human decision-making, especially in complex risk scenarios. However, a recent investigation demonstrates that this alignment is superficial at best.

Examining LLMs with the St. Petersburg Game

Researchers used the St. Petersburg game as a testing ground. It is a classic paradox in which the expected payoff is infinite, yet the amount people are actually willing to pay to play remains low. Testing 28 different LLMs, they observed that most models do generate finite bids. This behavior superficially aligns with human risk aversion, yet the mechanisms driving these decisions differ significantly.
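
For readers unfamiliar with the paradox, the infinite expectation can be written out in one line. The payoff convention here is an assumption, since the article does not state which one the researchers used: a fair coin is tossed until the first heads, and if that happens on toss k the player receives 2^k.

```latex
\mathbb{E}[X] \;=\; \sum_{k=1}^{\infty} \Pr(\text{first heads on toss } k)\cdot 2^{k}
\;=\; \sum_{k=1}^{\infty} \frac{1}{2^{k}}\cdot 2^{k}
\;=\; \sum_{k=1}^{\infty} 1 \;=\; \infty
```

Every possible round contributes exactly 1 to the expectation, so the sum diverges, even though half of all games end on the first toss with a payoff of 2.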

What drives this difference? To find out, the researchers introduced controlled variants of the game, altering features such as truncation and numeric endowment, and observed how the models responded. The models frequently fell back on conditionally rational behavior, deviating from typical human responses to the same variants.
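
To make the two variants concrete, here is a minimal Python sketch of how truncation and endowment change what a textbook "rational" bidder would pay. It is not the paper's method. The function names, the payoff convention (2^k for a first heads on toss k, and nothing if no heads appears within the cap), and the choice of a log-utility bidder are all assumptions made for illustration.

```python
import math


def truncated_expected_value(max_rounds: int) -> float:
    """Expected payoff when the game is capped at max_rounds tosses.

    Assumed convention: payoff is 2**k if the first heads lands on toss k,
    and 0 if no heads appears within max_rounds tosses.
    """
    return sum((0.5 ** k) * (2 ** k) for k in range(1, max_rounds + 1))


def log_utility_max_bid(endowment: float, max_rounds: int) -> float:
    """Largest bid a log-utility bidder with the given endowment would accept.

    Finds, by bisection, the bid at which the expected log-wealth from
    playing equals the log-wealth from not playing.
    """

    def gain(bid: float) -> float:
        # Expected log-wealth after paying the bid and collecting the payoff.
        expected = sum(
            (0.5 ** k) * math.log(endowment - bid + 2 ** k)
            for k in range(1, max_rounds + 1)
        )
        # Residual case: no heads within the cap, so the bid is simply lost.
        expected += (0.5 ** max_rounds) * math.log(endowment - bid)
        return expected - math.log(endowment)

    low, high = 0.0, endowment * (1 - 1e-12)  # keep remaining wealth positive
    for _ in range(200):  # gain() decreases as the bid rises
        mid = (low + high) / 2
        if gain(mid) > 0:
            low = mid
        else:
            high = mid
    return low


if __name__ == "__main__":
    for rounds in (5, 10, 20):
        print(rounds, truncated_expected_value(rounds))
    for wealth in (1_000, 1_000_000):
        print(wealth, round(log_utility_max_bid(wealth, 20), 2))
```

Under these assumptions, a risk-neutral bidder's bid equals the cap on rounds, while a log-utility bidder's bid stays below that value and rises with endowment. A decision-maker whose bids track such calculations only when the prompt supplies the relevant numbers would match human bids on the surface without sharing the underlying mechanism, which is the kind of gap the variants are designed to expose.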

Human-Cue Prompting and Instruction Tuning

Attempting to bring LLMs closer to human-like reasoning, the team employed human-perspective prompts and instruction tuning. These approaches did lower bids and addressed some visible discrepancies. However, the core mechanism-level inconsistencies persisted, highlighting that these adjustments only scratched the surface.

The paper's key contribution is to challenge the notion that outcome similarity equates to true alignment. If LLMs deliver decisions that appear human-like, is that sufficient? Or should we demand more from models that increasingly influence high-stakes decisions?

Implications for High-Stakes AI

The study’s findings underscore the necessity of examining LLMs beyond the outcomes they produce. When evaluating AI systems, it's essential to ensure that the decision-making processes are genuinely aligned with human reasoning. In fields like finance and healthcare, where AI decisions can have significant impacts, superficial alignment could lead to unforeseen consequences.

Instead of settling for outcome-level resemblance, we should strive for mechanism-level consistency. Only then can we trust LLMs in real-world applications that demand not just accuracy, but comprehension and alignment with human values.
