Chunk-Level Guidance: A Smarter Way to Steer AI Models

Chunk-Level Guided Generation is a new way to improve a small model's accuracy at inference time. When a stronger scorer only chooses among complete small-model samples, every sample may already have committed to an erroneous reasoning path. Chunk-level guidance avoids that pitfall by intervening while the answer is still being generated. Crucially, it does so without needing a reward model trained with step-level labels.

The New Approach

Chunk-Level Guided Generation uses an off-the-shelf large language model as a process scorer rather than a purpose-trained reward model. The small model generates k fixed-length candidate chunks, and the larger model scores these candidates by likelihood without generating any text itself. The best-scoring chunk is then selected, steering the generation process before mistakes can accumulate.
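The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: `sample_chunks`, `score_chunk`, and `is_done` are hypothetical callables standing in for the small model's sampler, the large model's likelihood scorer, and a stopping check, and the toy stand-ins at the bottom exist only so the sketch runs.

```python
from typing import Callable, List


def guided_generate(
    prompt: str,
    sample_chunks: Callable[[str, int], List[str]],
    score_chunk: Callable[[str, str], float],
    k: int = 4,
    max_chunks: int = 8,
    is_done: Callable[[str], bool] = lambda text: False,
) -> str:
    """Grow a response chunk by chunk, keeping the best-scored candidate each round.

    All three callables are hypothetical placeholders:
      sample_chunks(text, k) -> k candidate continuations from the small model
      score_chunk(text, c)   -> the large model's score for candidate c given text
      is_done(text)          -> True once the response is complete
    """
    text = prompt
    for _ in range(max_chunks):
        candidates = sample_chunks(text, k)
        if not candidates:
            break
        # The large model only scores candidates; it never generates text here.
        best = max(candidates, key=lambda c: score_chunk(text, c))
        text += best
        if is_done(text):
            break
    return text


# Toy stand-ins so the sketch is runnable without any model.
def toy_sampler(text: str, k: int) -> List[str]:
    return ["a", "b", "c"][:k]


def toy_scorer(text: str, chunk: str) -> float:
    return 1.0 if chunk == "b" else 0.0


result = guided_generate("x", toy_sampler, toy_scorer, k=3, max_chunks=2)
```

In a real setup the sampler would draw fixed-length token chunks from the small model, and the scorer would run a single forward pass of the large model over each candidate.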

The framework defines two selection rules. Likelihood-Guided Selection (LGS) chooses the chunk with the highest length-normalized large-model log-probability. Contrastive-Guided Selection (CGS) subtracts the small model's log-probability from the large model's, favoring chunks that the large model rates more highly than the small model does. Because every candidate chunk has the same length, the comparison also sidesteps the systematic length bias that confounds scoring of variable-length reasoning steps.
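To make the two rules concrete, here is a minimal sketch that scores candidates from per-token log-probabilities. The function names and the toy numbers are assumptions for illustration. Averaging the contrastive difference per token is also an assumption; with fixed-length chunks it does not change which candidate wins.

```python
from typing import List, Sequence


def lgs_score(large_logprobs: Sequence[float]) -> float:
    """Length-normalized log-probability of a chunk under the large model."""
    return sum(large_logprobs) / len(large_logprobs)


def cgs_score(
    large_logprobs: Sequence[float], small_logprobs: Sequence[float]
) -> float:
    """Large-model log-probability minus small-model log-probability, per token."""
    if len(large_logprobs) != len(small_logprobs):
        raise ValueError("both models must score the same tokens")
    diffs = [lg - sm for lg, sm in zip(large_logprobs, small_logprobs)]
    return sum(diffs) / len(diffs)


def argmax(scores: List[float]) -> int:
    """Index of the highest score (first one on ties)."""
    return max(range(len(scores)), key=scores.__getitem__)


# Made-up per-token log-probabilities for three two-token candidate chunks.
large = [[-0.2, -0.4], [-0.5, -0.5], [-1.0, -0.25]]
small = [[-0.1, -0.1], [-2.0, -1.5], [-1.0, -1.0]]

lgs_pick = argmax([lgs_score(lg) for lg in large])
cgs_pick = argmax([cgs_score(lg, sm) for lg, sm in zip(large, small)])
```

On this toy data the two rules disagree: LGS picks the first candidate, which the large model finds most likely outright, while CGS picks the second, which the large model likes far more than the small model does.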

Performance that Speaks

The benchmark results are strong. Across GSM8K, MATH, Minerva Math, AMC23, and AIME24, CGS outperforms majority voting by as much as 28 percentage points. It also holds its ground against search guided by Qwen2.5-Math-PRM-72B, without requiring any reward-model training. For instance, with Qwen2.5-7B guided by Qwen2.5-72B, CGS achieves 81.8% on MATH and 63.6% on Minerva Math at k=16.

Why It Matters

So why should this matter? Majority voting and PRM-guided search often produce lengthy reasoning traces, which are not only inefficient but also more prone to errors. Chunk-Level Guided Generation, by contrast, produces substantially shorter reasoning traces. Isn't it time to question the reliance on older methods when smarter, more efficient strategies are available?

Western coverage has largely overlooked this advancement, yet its implications are significant. As AI models become an integral part of decision-making processes across industries, ensuring their accuracy and efficiency is key. Chunk-Level Guided Generation offers a promising path forward, simplifying the process without sacrificing performance.
