Why LLMs Aren't Asking the Right Questions Yet

By Callum Bryce | May 28, 2026

AI agents often miss critical user preferences because they don't ask the right questions. A new benchmark measures this proactivity gap.

Artificial intelligence is supposed to be smart, right? But with long-lived language model agents like OpenClaw, there's a big oversight: they aren't asking the right questions. These agents are designed to act on user preferences across many sessions. The catch? They often miss preferences the user never states.

Understanding the Proactivity Gap

Picture this: you've got an AI agent that remembers what you tell it, but struggles with anything you don't. This is called the proactivity gap, and as we rely on our digital assistants more and more, it becomes a glaring issue. Users delegate tasks expecting smooth assistance, yet agents can't act on preferences they never asked the user about.

The task of closing this gap has a name now: Ask-to-Remember (ATR). The concept is simple. The agent decides whether to ask you now about a preference that might be useful later, even if the current task doesn't need it. Sounds straightforward? It's anything but.
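To make the ask-now decision concrete, here is a minimal Python sketch. It is not how ATRBench or any particular agent works. It assumes a simple expected-value rule (ask when the estimated future benefit outweighs the cost of interrupting the user), and every name and number in it is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PreferenceCandidate:
    """A preference the agent does not know yet (hypothetical structure)."""
    name: str
    p_needed_later: float    # assumed estimate that a future task depends on it
    benefit_if_known: float  # assumed gain from knowing it when that task arrives
    ask_cost: float          # assumed cost of interrupting the user right now


def should_ask_now(candidate: PreferenceCandidate) -> bool:
    """Ask only if the expected future benefit exceeds the interruption cost."""
    expected_benefit = candidate.p_needed_later * candidate.benefit_if_known
    return expected_benefit > candidate.ask_cost


# Illustrative values only; none of these come from the benchmark.
candidates = [
    PreferenceCandidate("seat_preference", p_needed_later=0.8, benefit_if_known=1.0, ask_cost=0.2),
    PreferenceCandidate("favorite_font", p_needed_later=0.05, benefit_if_known=1.0, ask_cost=0.2),
]

to_ask = [c.name for c in candidates if should_ask_now(c)]
print(to_ask)
```

The hard part in practice is the inputs, not the comparison: the agent has to guess which preferences future tasks will depend on before those tasks exist.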

ATRBench: The Game Changer

This is where ATRBench enters the scene. It's the first benchmark to quantify how well AI agents handle ATR. Because it treats a user's preferences as hidden ground truth, success isn't just about remembering. It's about knowing when to ask.

Across eight new AI models, default performance falls at least 62 points short of an oracle that is given the relevant preference. Even with prompting, that gap barely closes. It's a striking finding that points to acquiring preferences as the key bottleneck.
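The arithmetic behind a claim like "at least 62 points" can be sketched in a few lines of Python. This is only an illustration: it assumes scores on a 0 to 100 scale and a gap defined as oracle score minus default score, and the model names and scores are placeholders, not ATRBench results.

```python
def oracle_gap(default_score: float, oracle_score: float) -> float:
    """Points by which the default agent trails an oracle given the preference."""
    return oracle_score - default_score


# Placeholder (default, oracle) scores; NOT taken from the benchmark.
illustrative_scores = {
    "model_a": (30.0, 95.0),
    "model_b": (20.0, 90.0),
}

gaps = {name: oracle_gap(d, o) for name, (d, o) in illustrative_scores.items()}

# "At least N points" across all models corresponds to the smallest per-model gap.
smallest_gap = min(gaps.values())
print(gaps, smallest_gap)
```

Read this way, the figure is a floor: every model trails the oracle by that much or more.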

Why This Matters

And just like that, the leaderboard shifts. Current AI systems aren't as proactive as we might think. This isn't just a technical hiccup. It's a significant hurdle for the future of AI-driven personal assistants. The labs are scrambling to address it. But here's the million-dollar question: Can they really overcome this challenge?

The answer will shape the next generation of AI assistants. Will they become more intuitive, or will users need to spoon-feed their preferences forever? This challenge is a wake-up call for AI developers. It's time for a smarter approach.
