Language Models and the Art of Non-Sequential Thinking

Artificial Intelligence and its offspring, transformer language models, are increasingly venturing into complex terrains. One such territory is entity tracking, a fundamental skill for understanding and reasoning through text. But here's the thing: these models aren't doing it the way you might expect.

Tracking Without the Tracks

Think of it this way: when we read a story, we naturally keep tabs on characters, states, and actions. Our brains track these elements sequentially. Language models, on the other hand, don't always play by that rulebook. Instead of inching forward with each new token, they aggregate relevant information in parallel, waiting until the last possible moment to make sense of it all. It's like building a puzzle without looking at the picture on the box until the very end.

This non-sequential approach becomes apparent when models tackle scenarios involving state changes. For instance, the way they handle operations like 'PUT', 'REMOVE', and 'MOVE' suggests that these LMs aren't updating a world state step by step. They store everything up and then, boom, they spit out an answer once the query is clear.
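
To make the contrast concrete, here is a toy sketch in Python. It is an analogy only, not a description of what happens inside a transformer: the `Op` record, the box names, and both function names are hypothetical, invented for illustration. The first function keeps a running world state after every operation, the way a human reader might. The second keeps no running state at all and only gathers the relevant operations once it knows which box is being asked about.

```python
from typing import NamedTuple


class Op(NamedTuple):
    """One hypothetical state-change operation in a toy 'boxes' world."""
    kind: str       # 'PUT', 'REMOVE', or 'MOVE'
    item: str
    box: str        # target box (source box for MOVE)
    dest: str = ""  # destination box, used by MOVE only


def track_sequentially(ops):
    """Human-style tracking: update the full world state after each op."""
    state = {}
    for op in ops:
        if op.kind == "PUT":
            state.setdefault(op.box, set()).add(op.item)
        elif op.kind == "REMOVE":
            state.setdefault(op.box, set()).discard(op.item)
        elif op.kind == "MOVE":
            state.setdefault(op.box, set()).discard(op.item)
            state.setdefault(op.dest, set()).add(op.item)
    return state


def answer_at_query_time(ops, box):
    """Deferred style: keep no running state, and resolve everything
    only once the queried box is known."""
    contents = set()
    for op in ops:
        if box not in (op.box, op.dest):
            continue  # this op is irrelevant to the query
        if op.kind == "PUT":
            contents.add(op.item)
        elif op.kind == "REMOVE":
            contents.discard(op.item)
        elif op.kind == "MOVE":
            if op.dest == box:
                contents.add(op.item)
            else:
                contents.discard(op.item)
    return contents


ops = [
    Op("PUT", "apple", "A"),
    Op("PUT", "key", "A"),
    Op("MOVE", "apple", "A", "B"),
    Op("REMOVE", "key", "A"),
    Op("PUT", "coin", "B"),
]

# Both strategies agree on the answer; they differ in when the work happens.
print(track_sequentially(ops)["B"] == answer_at_query_time(ops, "B"))
```

Both routes land on the same answer here. The difference is that the second one has nothing useful to say until the question shows up, which is the flavor of behavior described above.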

The Fragility of Removal

Here's where things get interesting. The 'REMOVE' operation, important in many scenarios, is handled with what can be described as a fragile global suppression tag. Basically, models implement a kind of memory wipe that's both sweeping and delicate, leading to various failure modes. If you've ever trained a model, you know that this kind of fragility can be a recipe for unpredictability.
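
Here is one way to picture why a global tag is fragile. This is a hypothetical toy, not the mechanism itself: the tuple format, the function names, and the specific failure shown (an item removed from one box and later placed in another) are all assumptions chosen for illustration. The first function marks an item as suppressed everywhere the moment it sees a 'REMOVE'. The second scopes the removal to the box it actually applies to.

```python
def contents_with_global_tag(ops, box):
    """Toy 'global suppression': a REMOVE tags the item everywhere,
    no matter which box it came from or what happens afterwards."""
    present = set()
    suppressed = set()
    for kind, item, target in ops:
        if kind == "PUT" and target == box:
            present.add(item)
        elif kind == "REMOVE":
            suppressed.add(item)  # tag is not tied to any particular box
    return present - suppressed


def contents_with_scoped_removal(ops, box):
    """Reference behavior: a REMOVE only affects the box it names."""
    present = set()
    for kind, item, target in ops:
        if target != box:
            continue
        if kind == "PUT":
            present.add(item)
        elif kind == "REMOVE":
            present.discard(item)
    return present


# The key is removed from box A, then placed in box B.
ops = [
    ("PUT", "key", "A"),
    ("REMOVE", "key", "A"),
    ("PUT", "key", "B"),
]

print(contents_with_global_tag(ops, "B"))      # the key is wrongly missing
print(contents_with_scoped_removal(ops, "B"))  # the key is found
```

In this toy, the sweeping version loses the key in box B because the earlier removal from box A still hangs over it. Real failure modes may look different, but the shape of the problem is the same: a tag that is too broad wipes out more than it should.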

Why should we care about these quirks? Well, understanding these mechanisms isn't just for AI researchers. It has implications for any real-world application relying on LMs for decision-making and reasoning. If a language model struggles with sequential logic, you might not want it running critical systems without a safety net.

Behavior and Mechanism, Hand in Hand

The analogy I keep coming back to is behavior and mechanism as two sides of the same coin. Behavioral analyses inform mechanistic hypotheses. In turn, insights from mechanistic studies can predict failure modes not yet caught by behavioral evaluations. It's a dance of data and hypothesis where each informs and refines the other.

But let's be honest. The real question is: how do we turn this theoretical understanding into practical improvement? Perhaps nullifying these global suppression tags is a step in the right direction. But can we really expect language models to think like humans? Or should we lean into their unique strengths and develop systems that complement their non-sequential prowess?

Ultimately, the takeaway here isn't just about technical curiosity. It's about recognizing and adapting to the strengths and limitations of the tools at our disposal. As we push these language models into more pivotal roles in society, understanding their idiosyncrasies isn't just a nice-to-have. It's essential.
