DecomposeR: Redefining AI Research with Structured Planning
DecomposeR uses directed acyclic graphs to enhance AI research planning, outperforming competitors on benchmarks by 5.1-8.0 points. This new framework marks a shift towards more explicit, rewardable planning.
Deep research tasks challenge large language models (LLMs) not only to retrieve evidence but also to plan investigations and synthesize complex answers across different branches of inquiry. Existing methods have struggled to separate planning from execution, which muddles credit assignment: when a final answer is good or bad, it is unclear whether the plan or its execution deserves the credit. Enter DecomposeR.
The DecomposeR Framework
DecomposeR introduces a planner-centric framework that uses typed directed acyclic graphs (DAGs) to represent research plans. Because the plan is a concrete structure rather than something implicit in a trajectory, it can be rewarded directly. The model is trained in two stages using the Qwen3-8B setup: planner reinforcement learning (RL) and answerer RL.
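To make the idea of a typed DAG plan concrete, here is a minimal Python sketch. It is an illustration only, not the paper's implementation: the node types ("search", "compare", "synthesize"), the PlanNode fields, and the execution_order helper are all hypothetical names chosen for this example, since the article does not describe the actual schema.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter

# Hypothetical type vocabulary; the article does not list the real node types.
NODE_TYPES = {"search", "compare", "synthesize"}


@dataclass
class PlanNode:
    node_id: str
    node_type: str          # one of NODE_TYPES (assumed for this sketch)
    query: str              # the sub-question this node is responsible for
    depends_on: list = field(default_factory=list)  # ids of prerequisite nodes


def execution_order(nodes):
    """Validate a plan and return node ids in dependency order.

    Raises ValueError for an unknown node type or a missing dependency.
    graphlib raises CycleError (a ValueError subclass) if the plan has a cycle.
    """
    ids = {n.node_id for n in nodes}
    for n in nodes:
        if n.node_type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {n.node_type}")
        for dep in n.depends_on:
            if dep not in ids:
                raise ValueError(f"{n.node_id} depends on missing node {dep}")
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {n.node_id: set(n.depends_on) for n in nodes}
    return list(TopologicalSorter(graph).static_order())


# A toy research plan: two independent search branches feed a comparison,
# which feeds the final synthesis.
plan = [
    PlanNode("a", "search", "What is known about approach A?"),
    PlanNode("b", "search", "What is known about approach B?"),
    PlanNode("c", "compare", "How do A and B differ?", depends_on=["a", "b"]),
    PlanNode("d", "synthesize", "Which approach fits the use case?", depends_on=["c"]),
]
order = execution_order(plan)
```

In this toy plan the two search branches have no dependency on each other, so they could run in either order or in parallel. The comparison node must wait for both, and the synthesis node comes last.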
The first stage focuses on learning graph structures and query decomposition to advance research planning capabilities. The second stage, answerer RL, builds on this by executing branch-level tasks and synthesizing final answers based on the established plan. By targeting rewards at specific planner tokens and structured components rather than a flat trajectory, DecomposeR optimizes the planning process with greater precision.
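One way to picture rewarding planner tokens rather than a flat trajectory is a masked policy-gradient loss. The Python sketch below is an assumption-laden illustration: the role labels, the function name masked_pg_loss, and the simple REINFORCE-style objective are hypothetical, and the article does not state which loss or reward formula DecomposeR actually uses.

```python
def masked_pg_loss(logprobs, roles, advantage, target_role="planner"):
    """REINFORCE-style loss averaged over tokens with the given role only.

    logprobs:  per-token log-probabilities of the sampled trajectory
    roles:     per-token role labels, e.g. "planner", "tool", "answerer"
               (hypothetical labels for this sketch)
    advantage: scalar advantage assigned to the targeted tokens
    Passing target_role=None treats every token as targeted (flat trajectory).
    """
    if len(logprobs) != len(roles):
        raise ValueError("logprobs and roles must have the same length")
    mask = [1.0 if target_role is None or r == target_role else 0.0 for r in roles]
    denom = sum(mask)
    if denom == 0:
        return 0.0  # no targeted tokens, so nothing to optimize
    weighted = sum(m * advantage * lp for m, lp in zip(mask, logprobs))
    return -weighted / denom


# Toy trajectory: two planner tokens, one tool token, one answerer token.
logprobs = [-1.0, -2.0, -3.0, -4.0]
roles = ["planner", "planner", "tool", "answerer"]

planner_loss = masked_pg_loss(logprobs, roles, advantage=0.5)
flat_loss = masked_pg_loss(logprobs, roles, advantage=0.5, target_role=None)
```

With the mask in place, only the two planner tokens contribute to the loss, so the gradient signal from a plan-level reward does not leak onto tool or answerer tokens. The flat variant spreads the same advantage over all four tokens, which is the credit-assignment blur the article describes.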
Benchmark Performance
DecomposeR-8B surpasses strong open baselines by 5.1-8.0 points on popular long-form benchmarks. The margin suggests gains in both planning and answering.
But why does this matter? The paper, published in Japanese, reveals a critical shift towards structured planning in AI research. By explicitly outlining research plans, DecomposeR addresses a fundamental weakness in previous paradigms. The reported results favor structured over monolithic approaches.
Implications and Future Directions
Crucially, the success of DecomposeR could signal a broader trend in AI development. As models become more sophisticated, will structured planning frameworks become the norm? Western coverage has largely overlooked this, but the implications for AI research methodology can't be ignored.
DecomposeR’s approach suggests that breaking down complex tasks into structured, rewardable plans isn’t just beneficial but essential for progress. With AI's increasing role in research, the need for such frameworks will likely grow.
Compare these numbers side by side with prior models. The increased performance isn't just a statistical anomaly; it's a testament to the efficacy of planning-centric approaches. Could this be the future of AI research? If DecomposeR's success is any indication, the answer is yes.