LLM Agents Face Reality: Experience Isn't Always Enough

For language models, experience has been touted as the secret sauce for self-evolution. But what happens when experience isn’t as valuable as promised? Enter a new challenge: low-repetition tasks with implicit rewards, where past experience might not be your best ally.

The Test: FinEvolveBench

Meet FinEvolveBench, a benchmark designed to test financial sentiment predictions. This benchmark links daily news-driven predictions directly to future excess returns. The premise? Experience-based self-evolution for LLM agents should help them navigate these murky waters.
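To make "linked to future excess returns" concrete, here is a minimal Python sketch of how such an implicit reward could be computed. It is not FinEvolveBench's actual scoring code: the `Prediction` class, the `excess_return` and `implicit_reward` functions, the one-day horizon, and the choice of a market index as the reference are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """Hypothetical record of one news-driven sentiment call."""
    ticker: str
    day: int        # index of the trading day the news was read
    sentiment: int  # +1 bullish, -1 bearish, 0 neutral


def excess_return(asset_prices, market_prices, day, horizon=1):
    """Asset return minus market return from `day` to `day + horizon`."""
    asset_ret = asset_prices[day + horizon] / asset_prices[day] - 1.0
    market_ret = market_prices[day + horizon] / market_prices[day] - 1.0
    return asset_ret - market_ret


def implicit_reward(pred, asset_prices, market_prices, horizon=1):
    """Signed excess return: positive when the call agrees with the outcome.

    Assumed scoring rule for illustration only. The reward arrives
    `horizon` days after the prediction and says nothing about why
    the call was right or wrong.
    """
    return pred.sentiment * excess_return(
        asset_prices, market_prices, pred.day, horizon
    )
```

In this sketch the agent only ever sees a single noisy number, well after the fact. That is the sense in which the reward is implicit and outcome-level.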

But the reality is far less rosy. When feedback is delayed, noisy, and available only at the outcome level, as it often is in finance, experience can become more of a hindrance than a help. A return that shows up days later says little about which past lesson, if any, actually earned it.

Tree-of-Experience: A New Approach

To counteract this experience conundrum, researchers introduced the Tree-of-Experience (ToE), a method for structured experience management. ToE organizes, retrieves, and updates agent experiences in a way that supposedly makes them more useful.
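The article does not spell out how ToE works internally, so treat the following Python sketch as one plausible reading of "organize, retrieve, and update", not the researchers' implementation. The `ExperienceNode` and `ExperienceTree` names, the sector-then-ticker path, and the rank-by-reward retrieval are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ExperienceNode:
    """One tree level, e.g. a sector or a ticker (hypothetical layout)."""
    lessons: list = field(default_factory=list)   # (text, reward) pairs
    children: dict = field(default_factory=dict)  # key -> ExperienceNode


class ExperienceTree:
    def __init__(self):
        self.root = ExperienceNode()

    def update(self, path, lesson, reward):
        """Store a lesson at the node addressed by `path`, creating nodes as needed."""
        node = self.root
        for key in path:
            node = node.children.setdefault(key, ExperienceNode())
        node.lessons.append((lesson, reward))

    def retrieve(self, path, top_k=2):
        """Gather lessons from the root down to the deepest matching node,
        then return the `top_k` lesson texts ranked by observed reward."""
        node = self.root
        collected = list(node.lessons)
        for key in path:
            node = node.children.get(key)
            if node is None:
                break  # no experience stored this deep; keep what was found
            collected.extend(node.lessons)
        collected.sort(key=lambda item: item[1], reverse=True)
        return [text for text, _ in collected[:top_k]]
```

A quick usage example: after `tree.update(("tech",), "Earnings beats fade fast", 0.01)` and `tree.update(("tech", "AAPL"), "Supply news matters", 0.03)`, calling `tree.retrieve(("tech", "AAPL"))` returns the ticker-level lesson first, then the sector-level one. Ranking by reward has a catch: when rewards are noisy, the top-ranked lesson may simply be the luckiest one.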

However, the results were mixed at best. General-purpose experience mechanisms couldn’t consistently outperform baselines that use no experience at all. It raises the question: is all this emphasis on experience management just hopium?

Lessons in Implicit-Reward Environments

What’s clear is that structured experience management becomes important in environments where rewards aren’t handed out on a silver platter. But even the structured approach has its limits. When the feedback is as unpredictable as the stock market, sometimes experience leads you astray.

So, should we abandon experience altogether? Not quite. But it's evident that not every task benefits from the past. Overrelying on experience might just leave you overextended, waiting on a delayed payout that never comes.
