Why TESSY Could Be a Breakthrough for AI Model Training
Synthetic data generation often fails to enhance reasoning in AI models. But a new framework, TESSY, shows promise by aligning stylistic elements with learning targets, leading to significant performance improvements.
In the quest for superior AI models, synthetic data generation has been a popular strategy. However, for emerging reasoning models like Qwen3-8B, this approach often backfires. Instead of enhancing abilities, it can lead to notable performance declines. So, what's the problem?
The Problem with Synthetic Data
It turns out that the stylistic disconnect between data generated by stronger models (teachers) and the target models (students) significantly hampers the effectiveness of supervised fine-tuning (SFT). The numbers are stark: when Qwen3-8B was fine-tuned on data from GPT-OSS-120B, its performance dropped by 3.25% on LiveCodeBench-Pro and a staggering 10.02% on OJBench.
The reality is, style matters more than we thought. The divergence between teacher-generated data and student model distributions is a major roadblock. And it's not just a minor stylistic quirk; it's a fundamental issue impacting performance.
Introducing TESSY: A Fresh Approach
Enter TESSY, a Teacher-Student Cooperation Data Synthesis framework that offers a novel solution. Instead of merely handing down data from teacher to student, TESSY interleaves the models to alternately generate style and non-style tokens. This dual approach allows synthetic sequences to retain the advanced reasoning capabilities of the teacher while aligning stylistically with the student.
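The interleaving idea can be sketched in a few lines. The exact routing rule TESSY uses is not spelled out here, so this toy version makes assumptions: stub `teacher_next` and `student_next` functions stand in for real models, and a simple lexical check decides whether the upcoming token is a "style" token (routed to the student) or a reasoning token (kept from the teacher).

```python
# Toy sketch of teacher-student interleaved generation (hypothetical API).
# Assumption: style tokens are detected lexically; real systems would use
# a learned router or the student's own token probabilities.

STYLE_TOKENS = {"so", "then", "thus", "ok", "first,"}

# Stub teacher: replays a fixed reasoning script, one token per step.
TEACHER_SCRIPT = ["so", "x=1", "then", "x+=2", "thus", "x==3"]

def teacher_next(context):
    # A strong teacher proposes the next token given the sequence so far.
    return TEACHER_SCRIPT[len(context) % len(TEACHER_SCRIPT)]

def student_next(context):
    # Stub student: rewrites stylistic connectives in its own voice.
    return "therefore"

def is_style_token(proposed_token):
    # Hypothetical router: is the teacher's proposal merely stylistic?
    return proposed_token.lower() in STYLE_TOKENS

def synthesize(prompt_tokens, max_new=6):
    """Alternate between models: teacher proposes each token; style
    tokens are replaced by the student's own phrasing, so the final
    sequence keeps the teacher's reasoning in the student's style."""
    seq = list(prompt_tokens)
    for _ in range(max_new):
        proposal = teacher_next(seq)
        if is_style_token(proposal):
            seq.append(student_next(seq))  # student supplies the style token
        else:
            seq.append(proposal)           # teacher supplies the reasoning token
    return seq
```

Running `synthesize([], 6)` on the stub script yields a sequence where the teacher's reasoning steps (`x=1`, `x+=2`, `x==3`) survive intact while the connectives are rendered in the student's voice, which is the core intuition behind the framework.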
The results are impressive. TESSY managed to improve Qwen3-8B's performance by 11.25% on LiveCodeBench-Pro and 6.68% on OJBench. That's a substantial turnaround, especially in an area where improvement was elusive.
Why TESSY Matters
Why should we care about TESSY? Because it underscores a critical insight: how training data is synthesized matters as much as the raw capability of the teacher that produces it. By focusing on the synthesis process, TESSY provides a framework where AI models can learn more effectively, potentially reshaping how we think about model training.
But here's the big question: will other models follow suit? As AI continues to evolve, adopting frameworks like TESSY might not just be an option, but a necessity. As we strip away the marketing, what remains is a clear vision of how blending style with substance can lead to real progress in AI training.
Key Terms Explained
Fine-tuning: The process of taking a pre-trained model and continuing to train it on a smaller, specific dataset to adapt it for a particular task or domain.
GPT: Generative Pre-trained Transformer.
Parameter: A value the model learns during training — specifically, the weights and biases in neural network layers.
Reasoning: The ability of AI models to draw conclusions, solve problems logically, and work through multi-step challenges.