Why Deep Networks Prefer Simplicity: The Unexpected Truth
Deep learning models consistently assign higher density to simpler data, challenging assumptions about typicality. This insight spans architectures and density estimators.
Deep learning models are revealing a surprising bias: they favor simplicity over complexity when estimating density. This preference isn't just a quirk of a single model or dataset. It's a pattern seen in varied architectures, from iGPT to score-based diffusion models.
The OOD Anomaly Unveiled
Typically, models trained on specific datasets should assign higher density to in-distribution data than to out-of-distribution (OOD) data. But that's not always the case. The OOD anomaly shows us that deep models often give simpler OOD data higher density scores than their in-distribution counterparts. This isn't a fluke; it's a consistent finding.
Researchers have expanded the scope of this anomaly. By decoupling network training from density estimation, they've discovered a regularity across models and data: lower-complexity samples often receive higher estimated density than higher-complexity ones. This isn't limited to one test set; it holds across OOD pairs like CIFAR-10 and SVHN.
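The mechanism is easy to reproduce in miniature. The following toy sketch (an illustration of the density-vs-typicality gap, not any model from the study) fits a diagonal Gaussian to noisy "images" and shows that a constant gray image, which the model never saw, scores a higher log-density than a held-out sample from the training distribution itself:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 1024  # pixels in a flattened toy "image"

# "Training set": noisy, high-complexity images with pixel values in [0, 1].
train = rng.random((1000, dim))

# Fit a simple diagonal-Gaussian density model to the training data.
mu = train.mean(axis=0)
sigma = train.std(axis=0)

def log_density(x):
    """Log-density of x under the fitted diagonal Gaussian."""
    z = (x - mu) / sigma
    return -0.5 * np.sum(z**2) - np.sum(np.log(sigma)) - 0.5 * dim * np.log(2 * np.pi)

in_dist = rng.random(dim)   # held-out sample from the training distribution
simple = np.full(dim, 0.5)  # constant gray image: low complexity, clearly OOD

print(log_density(in_dist))  # lower: a typical sample sits far from the mode
print(log_density(simple))   # higher, despite being out-of-distribution
```

The constant image wins because it sits near the density's mode, while typical training samples live in a thin shell far from it; deep density estimators exhibit the same gap at far larger scale.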
Simplicity Trumps Complexity
Why do deep networks consistently rank simpler data higher? Regardless of how complex the training samples are, the outcome is the same: simpler images receive higher estimated density. The numbers contradict intuition, showing that simplicity wins out over complexity across the board.
Using Spearman rank correlation, researchers found striking agreement between models and external complexity metrics. Even models trained exclusively on the lowest-density samples still rank simpler images higher. Whatever the training regimen, deep networks consistently favor less complex data.
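That kind of rank-correlation check can be sketched with stand-in metrics (both are assumptions for illustration, not the researchers' exact setup): compressed byte length as the external complexity metric, and a diagonal-Gaussian log-density as the "model". Simpler test images should earn higher density, giving a strongly negative Spearman rho:

```python
import zlib
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
dim = 1024

# Fit a diagonal-Gaussian "density model" to noisy training images in [0, 1].
train = rng.random((1000, dim))
mu, sigma = train.mean(axis=0), train.std(axis=0)

def log_density(x):
    z = (x - mu) / sigma
    return -0.5 * np.sum(z**2) - np.sum(np.log(sigma))

def complexity(x):
    # Proxy complexity metric: length of the zlib-compressed 8-bit image.
    return len(zlib.compress((x * 255).astype(np.uint8).tobytes()))

# Test images of increasing complexity: noise of growing amplitude around gray.
amplitudes = np.linspace(0.0, 1.0, 11)
images = [0.5 + a * (rng.random(dim) - 0.5) for a in amplitudes]

densities = [log_density(img) for img in images]
complexities = [complexity(img) for img in images]

rho, _ = spearmanr(complexities, densities)
print(f"Spearman rho = {rho:.2f}")  # strongly negative: simpler -> denser
```

Spearman correlation only compares rankings, which is why it suits this question: it asks whether the model *orders* samples by complexity, without assuming any particular relationship between the two scales.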
Implications and Questions
What does this mean for model development and deployment? If deep networks naturally prefer simplicity, are we using the right metrics to evaluate them? This could shift how we think about training models, especially when tasked with complex real-world data.
Frankly, this raises a fundamental question: are we overestimating a model's ability to handle complexity? If the preference for simplicity holds regardless of architecture or parameter count, simplicity could be a hidden ally in achieving better performance.
The significance isn't just academic. As AI continues to integrate deeper into systems that rely on complex data, understanding this bias towards simplicity will be essential. It might redefine best practices across industries relying on AI for decision-making.
Key Terms Explained
Bias: In AI, bias has two meanings: a systematic tendency in a model's behavior (the sense used here), and unwanted skew in data or outputs that treats some groups unfairly.
Deep learning: A subset of machine learning that uses neural networks with many layers (hence 'deep') to learn complex patterns from large amounts of data.
Parameter: A value the model learns during training — specifically, the weights and biases in neural network layers.
Training: The process of teaching an AI model by exposing it to data and adjusting its parameters to minimize errors.