Equation Discovery: A New Benchmark for Symbolic Regression

By Signe Eriksen, June 9, 2026

Discovering scientific models in the form of equations is a hard problem. ERBench aims to standardize how algorithms for this task are evaluated.

Equation discovery aims to automate the formation of scientific models by turning raw data into mathematical equations. The task is tackled by symbolic regression algorithms, which are evaluated both on prediction accuracy and on their ability to recover known formulas. Conventional regression focuses on in-domain accuracy; true equation discovery demands more.

Beyond In-Domain Testing

In traditional regression, a dataset is typically split into training and test sets, so accuracy is measured within the same domain the model was fitted on. For equation discovery, this approach falls short. Why? Because it doesn't address the real challenge: out-of-domain generalization, meaning whether the discovered equation still holds for inputs outside the range seen during training. Yet crafting reliable out-of-domain test data isn't straightforward. For that reason, evaluation shifts from mere prediction to checking whether an algorithm recovers a known mathematical expression.
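
To make the gap concrete, here is a minimal sketch in plain Python. It is an illustration only, not part of ERBench: the ground truth (sin x), the surrogate model (a cubic approximation), and the two input ranges are all assumptions chosen for the example. The surrogate looks excellent on the interval it was designed for and fails badly outside it.

```python
import math

def ground_truth(x):
    # Hypothetical "true" law generating the data (an assumption for this example).
    return math.sin(x)

def surrogate(x):
    # Hypothetical fitted model: a cubic that matches sin(x) well near zero only.
    return x - x**3 / 6.0

def linspace(start, stop, count):
    # Evenly spaced points from start to stop, inclusive.
    step = (stop - start) / (count - 1)
    return [start + i * step for i in range(count)]

def rmse(model, target, xs):
    # Root-mean-square error of model against target on the points xs.
    return math.sqrt(sum((model(x) - target(x)) ** 2 for x in xs) / len(xs))

in_domain = linspace(-1.0, 1.0, 50)    # range the model was "trained" on
out_of_domain = linspace(3.0, 4.0, 50)  # range never seen during fitting

in_rmse = rmse(surrogate, ground_truth, in_domain)
out_rmse = rmse(surrogate, ground_truth, out_of_domain)

print(f"in-domain RMSE:     {in_rmse:.4f}")
print(f"out-of-domain RMSE: {out_rmse:.4f}")
```

An ordinary train/test split would only ever report the first number, which is why a low test error says little about whether the right equation was found.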

The Case for Equation Recovery

Benchmarks for symbolic regression often include equation recovery tasks, but these have limitations. Existing benchmarks are criticized for their narrow range of ground truth formulas and for insufficiently evaluating how robust algorithms are to changes in the number of variables, the sampling of data, and the data distribution. Practitioners in the natural sciences need tools that can handle noisy, diverse data, a capability that current benchmarks make difficult to assess.
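
As a rough sketch of what an equation recovery check can look like, the snippet below compares a candidate expression against a ground truth formula at randomly sampled points. This numerical comparison is an assumed stand-in for a proper symbolic equivalence test and is not ERBench's actual procedure; the function name `is_recovered`, the sampling range, the tolerance, and the example formulas are all hypothetical.

```python
import math
import random

def is_recovered(candidate, truth, n_vars, n_points=200,
                 low=0.5, high=5.0, tol=1e-9, seed=0):
    # Numerical proxy for equation recovery: the candidate counts as
    # recovered if it agrees with the ground truth at every sampled point.
    rng = random.Random(seed)  # fixed seed keeps the check reproducible
    for _ in range(n_points):
        point = [rng.uniform(low, high) for _ in range(n_vars)]
        expected = truth(*point)
        got = candidate(*point)
        if not math.isclose(got, expected, rel_tol=tol, abs_tol=tol):
            return False
    return True

# Hypothetical ground truth with three variables: an inverse-square law.
def truth(m1, m2, r):
    return m1 * m2 / r**2

# Algebraically equivalent form: should count as a recovery.
def equivalent(m1, m2, r):
    return (m1 / r) * (m2 / r)

# Accurate-looking near miss with the wrong exponent: should not count.
def near_miss(m1, m2, r):
    return m1 * m2 / r**2.1

print(is_recovered(equivalent, truth, n_vars=3))
print(is_recovered(near_miss, truth, n_vars=3))
```

Changing `n_vars`, the sampling range, or the way points are drawn are the kinds of knobs along which a benchmark could probe robustness, though a point-wise check like this one can be fooled by expressions that differ only outside the sampled range.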

Introducing ERBench

To bridge this gap, the Equation Recovery Benchmark (ERBench) is proposed as a new standard. Designed for rigorously assessing equation discovery algorithms, ERBench promises to evaluate performance across diverse conditions that better mimic real-world data. That would be a significant shift for researchers seeking to model complex natural phenomena accurately.

But will ERBench deliver on its promise and become the go-to framework for symbolic regression evaluation? Only rigorous adoption and feedback from the research community will tell. Yet, the shift it represents towards a more comprehensive evaluation method is a step in the right direction. The paper's key contribution: a renewed focus on robustness and real-world applicability.
