Sharper Chats: Cutting Token Waste in Multi-Agent Systems

Multi-agent systems (MAS) have been flexing their muscle with large language models, but there's a catch. The free-form communication they rely on can quickly balloon token usage, gobbling up the shared context window and driving up inference costs. It's a problem that's been crying out for a solution.

Why Free-Form Communication Is a Problem

If you've ever trained a model, you know that token management is important. Think of it this way: each token is like a piece of your compute budget. Waste them, and you're left with a bill that's bigger than it needs to be. Free-wheeling messages in MAS setups can lead to just that, a bloated token count that hampers performance.

In analyzing five common communication strategies across two MAS topologies, researchers found no one-size-fits-all strategy. The only constant? Messages that focus on action-centered information that downstream agents actually need. Now, let me translate from ML-speak, fluff-free messages are key.

Enter the PACT Protocol

This is where PACT (Protocolized Action-state Communication and Transmission) enters the scene. PACT turns inter-agent communication into a public state-update problem. It's like cleaning up a messy room before inviting friends over. By projecting each agent's output into a compact action-state record, PACT keeps interactions lean and meaningful.

What does this mean in practice? Across different MAS topologies, PACT slashes token use without sacrificing performance. In fact, in some cases, it boosts it. Take OpenHands, for instance. With PACT, it achieves a higher resolve rate while cutting down tokens-per-resolved by 10%. For the SWE-agent, PACT is resolve-neutral but manages to halve input tokens. That's efficiency.

Why This Matters for Everyone

Here's the thing: it's not just researchers who should care about this. When systems are more efficient, they cost less to run. That means lower costs for companies, which can lead to lower prices for consumers. Who doesn't want that?

But let's not overlook the bigger picture. As AI systems become more integrated into daily life, efficient communication isn't just a backend concern. It's about making sure these systems can run smoothly and affordably at scale. And honestly, who wouldn't want to trim down on unnecessary chatter?

The analogy I keep coming back to is trimming the fat off a steak. You get the same or even better flavor without the extra calories. With PACT, MAS can achieve the same, or better, results without the token bloat. It's a lean, mean, efficient machine.

Sharper Chats: Cutting Token Waste in Multi-Agent Systems

Why Free-Form Communication Is a Problem

Enter the PACT Protocol

Why This Matters for Everyone

Key Terms Explained