Ettin Models Shake Up the AI Playground

Ettin's new models are challenging the status quo, balancing encoder and decoder strengths. This could redefine AI benchmarks.
JUST IN: The AI world’s been buzzing about Ettin, a fresh suite of open models. These aren't your typical releases. They're engineered to challenge the dominance of decoder-only architectures in text generation.
Battle of the Models
The Ettin suite offers both encoder-only and decoder-only models, ranging from 17 million to a hefty 1 billion parameters, all trained on a staggering 2 trillion tokens. With matched sizes, data, and training recipes, it's like putting two fighters in the same weight class: finally a fair fight.
Why does this matter? For too long, decoder models have been the go-to for generative tasks, while encoders were sidelined, used mainly for classification and retrieval. Ettin changes that narrative.
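The encoder/decoder split described above ultimately comes down to one structural choice: the attention mask. Here's a minimal NumPy sketch (illustrative only, not Ettin's actual code) showing why encoders suit classification and retrieval while decoders suit generation: encoder tokens see the whole sequence, decoder tokens see only their past.

```python
import numpy as np

def attention_mask(seq_len: int, causal: bool) -> np.ndarray:
    """Return a 0/1 mask: mask[i, j] == 1 means token i may attend to token j."""
    if causal:
        # Decoder-style: each token attends only to itself and earlier tokens,
        # which is what makes left-to-right generation possible.
        return np.tril(np.ones((seq_len, seq_len), dtype=int))
    # Encoder-style: every token attends to every token, giving the
    # bidirectional context that helps classification and retrieval.
    return np.ones((seq_len, seq_len), dtype=int)

encoder_mask = attention_mask(4, causal=False)
decoder_mask = attention_mask(4, causal=True)

print(encoder_mask.sum())  # 16: all 4x4 token pairs visible
print(decoder_mask.sum())  # 10: only the lower triangle visible
```

Same weights, same math elsewhere; the mask alone decides which family a transformer belongs to, which is why a matched-training comparison like Ettin's is informative.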
Performance Check
Here's the twist. While previous attempts to adapt one architecture to the other's tasks often fell flat, Ettin confirms that using an encoder for encoder tasks and a decoder for generative ones still outshines any one-size-fits-all conversion. Take the MNLI task, where Ettin's 400M encoder leaves a 1B decoder in its dust. And just like that, the leaderboard shifts.
So, what's the takeaway here? Are encoders making a comeback? It looks like it. Ettin's models outperform ModernBERT as encoders and crush the likes of Llama 3.2 and SmolLM2 as decoders.
Future Implications
Open-source enthusiasts, rejoice! All artifacts from this study are available, including training data and over 200 checkpoints. This transparency means other researchers can dig deep, analyze, and maybe even push these models further.
Will this usher in a new era where encoder-decoder debates become a thing of the past? Or will it just fuel more fierce competition between camps? One thing's for sure: Ettin's suite is a shot across the bow, and everyone's taking notice.
Key Terms Explained
Classification: A machine learning task where the model assigns input data to predefined categories.
Decoder: The part of a neural network that generates output from an internal representation.
Encoder: The part of a neural network that processes input data into an internal representation.
Encoder-decoder: A neural network architecture with two parts: an encoder that processes the input into a representation, and a decoder that generates the output from that representation.