Rethinking Autoregressive Models: The Rise of Projected Autoregression
Autoregressive language models are evolving. Projected Autoregression changes the game with continuous prediction, delaying token commitment for better accuracy and control.
Autoregressive language models have long stuck to a playbook: predict a token, commit to it, rinse and repeat. But what if we could break free from this rigid cycle? Enter Projected Autoregression, a fresh take on text generation that ditches the old token-by-token approach for something a bit more fluid.
Continuous Prediction: A New Frontier
Projected Autoregression flips the script by predicting in continuous space. Instead of locking in a token right away, the model predicts vectors and only makes a commitment when it's ready. It's like drafting an email and deciding to hit send only when you're truly satisfied. This shift means the model isn't shackled to a single, irreversible path. It can refine its choices iteratively, thanks to what's called a 'liquid tail': a mutable suffix of continuous vectors that remains open to revision until each token is finalized.
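To make the idea concrete, here is a minimal toy sketch of the generate-refine-commit loop. Everything in it is illustrative: the embedding table, the `refine` step, and the function names are assumptions for demonstration, not the actual Projected Autoregression implementation (a real model would condition refinement on the committed history with a neural network).

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB = np.array(["the", "cat", "sat", "on", "mat"])
# Hypothetical embedding table: one row per vocabulary token.
EMB = rng.normal(size=(len(VOCAB), 8))

def nearest_token(vec):
    """Project a continuous vector onto the closest token embedding."""
    dists = np.linalg.norm(EMB - vec, axis=1)
    return int(np.argmin(dists))

def refine(vec, steps=3, lr=0.5):
    """Toy refinement: nudge the vector toward its nearest embedding.
    Stands in for the model's iterative revision of the liquid tail."""
    for _ in range(steps):
        target = EMB[nearest_token(vec)]
        vec = vec + lr * (target - vec)
    return vec

def generate(n_tokens, tail_len=3):
    """Generate tokens while keeping a mutable suffix of tail_len vectors."""
    committed = []                                        # finalized token ids
    tail = [rng.normal(size=8) for _ in range(tail_len)]  # the 'liquid tail'
    while len(committed) < n_tokens:
        tail = [refine(v) for v in tail]          # revise the whole mutable suffix
        committed.append(nearest_token(tail.pop(0)))  # commit only the oldest vector
        tail.append(rng.normal(size=8))           # extend with a fresh draft vector
    return [str(VOCAB[i]) for i in committed]

print(generate(4))
```

Setting `tail_len=1` mirrors the immediate-projection case (K=1) discussed below: each vector is committed as soon as it is drafted, with no window for revision.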
A Different Beast
This method isn't just a minor tweak. It's a whole new beast. Even when discrete tokens are produced with immediate projection (K=1), the results aren't what you'd expect from typical token-space autoregressive models. This isn't just about keeping up with the Joneses; it's about redefining the neighborhood entirely.
Why should you care? Because this continuous prediction method allows for what's called a 'continuous control surface.' You can tweak direction rate, manage history noise, delay decisions, and guide state-space, all in real-time as the model generates text. It's like having a dashboard that lets you fine-tune the engine while you're driving.
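The knobs on that dashboard can be sketched as a small configuration object. The knob names below (`commit_delay`, `history_noise`, `steer_rate`) are hypothetical stand-ins for the controls the text describes: decision delay, history noise, and state-space guidance. This is an assumption-laden sketch, not a real API.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class ControlSurface:
    # Illustrative knob names, not from a real library.
    commit_delay: int = 3       # K: how long a vector stays liquid before projection
    history_noise: float = 0.0  # scale of noise injected into the committed context
    steer_rate: float = 0.0     # strength of a steering direction added each step

def step(vec, surface, steer_dir, rng):
    """Apply the control surface to one liquid-tail vector mid-generation."""
    vec = vec + surface.steer_rate * steer_dir            # state-space guidance
    vec = vec + surface.history_noise * rng.normal(size=vec.shape)  # history noise
    return vec

# Usage: steer a draft vector toward a chosen direction with no added noise.
rng = np.random.default_rng(0)
vec = np.zeros(8)
surface = ControlSurface(steer_rate=1.0)
steered = step(vec, surface, np.ones(8), rng)
print(steered)
```

Because these knobs act on continuous vectors rather than discrete token choices, they can be adjusted at any point during generation without restarting the sequence.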
Broader Implications
We're talking about more than just back-end improvements. This opens up a broader algorithmic design space for language generation. Suddenly, token selection is just one item on a longer menu of autoregressive interfaces. It pushes the limits of what's possible in text generation, making the whole process more dynamic and adaptable.
Is this the future of language models? It might just be. If continuous prediction can offer better structure and control, why stick with the old regime? The difference isn't theoretical; you can feel it in how generation behaves.
In a world where precision and adaptability are king, Projected Autoregression could very well be the knight in shining armor. It's time to rethink what we've been doing and embrace a model that allows us to keep our options open until the very last moment. If you haven't looked into this yet, you're already behind.