Why AI Judges Shouldn’t Rush Verdicts in Mixed Evidence Cases

When AI judges rule on claims backed by mixed evidence, a glaring issue emerges: they turn verdicts into commitments without proper authorization. This isn't just a technical oversight; it's a failure that could undermine trust in AI judgment systems altogether. The problem has a name, Cherry-pick Override (CCO), and it shows up when AI judges choose a directional verdict such as SUPPORTS or REFUTES despite conflicting evidence.

The Problem with CCO

CCO is defined against a specific task contract, and it exposes a fault line in how AI handles ambiguity. The verdict schema includes a CONFLICTING label, so a judge facing mixed evidence has a sanctioned way to decline a directional call. That's not what happens in practice: on the Conflicting subset of the AVeriTeC dataset (N_C = 150), AI judges favored a directional verdict in roughly 84% of cases.

What's more, majority voting among three judges only made the situation worse. On AVeriTeC, it raised the rate of directional verdicts in conflicting cases from 0.840 to 0.887. The amplification did not replicate on the VitaminC-Mixed dataset, so the effect of panel voting is not consistent across datasets.
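To see why a panel can amplify directional verdicts rather than dampen them, here is a minimal sketch of three-judge majority voting. The panel data is made up for illustration, and the tie-breaking rule (fall back to CONFLICTING when no label has a strict plurality) is an assumption, not something the source specifies.

```python
from collections import Counter

DIRECTIONAL = {"SUPPORTS", "REFUTES"}

def majority_vote(votes):
    """Return the most common label; ties fall back to CONFLICTING (assumed rule)."""
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return "CONFLICTING"
    return counts[0][0]

def directional_rate(verdicts):
    """Fraction of verdicts that commit to SUPPORTS or REFUTES."""
    return sum(v in DIRECTIONAL for v in verdicts) / len(verdicts)

# Toy panels on claims whose gold label is CONFLICTING (illustrative data only).
panels = [
    ["SUPPORTS", "SUPPORTS", "CONFLICTING"],
    ["REFUTES", "CONFLICTING", "REFUTES"],
    ["CONFLICTING", "SUPPORTS", "SUPPORTS"],
    ["CONFLICTING", "CONFLICTING", "REFUTES"],
]

# Rate over all individual votes vs. rate over the aggregated panel verdicts.
single_judge_rate = directional_rate([v for panel in panels for v in panel])
panel_rate = directional_rate([majority_vote(panel) for panel in panels])
print(f"individual votes: {single_judge_rate:.3f}, panel verdicts: {panel_rate:.3f}")
```

In this toy data, a lone CONFLICTING vote is outvoted in three of the four panels, so the panel-level directional rate ends up higher than the rate across individual votes.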

Failed Fixes and the Need for a New Approach

Attempts to mitigate CCO with single-channel fixes, such as typed vocabulary and confidence thresholding, leave significant failures in place. Panel aggregation, for instance, drowns out dissenting CONFLICTING verdicts 48% of the time. Even a well-calibrated panel, with an expected calibration error (ECE) of 0.07 on pure SUPPORTS/REFUTES cases, cannot reliably separate CCO cases from correct decisions.
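The calibration point can be made concrete with a small sketch. It uses the standard binned ECE estimate (the bin-weighted gap between average confidence and accuracy) on made-up data; the threshold `TAU` and the toy cases are hypothetical. The example shows that a panel can look perfectly calibrated while a confidence threshold still lets a high-confidence CCO case through.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: sum over bins of (bin share) * |accuracy - mean confidence|."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # confidence 1.0 goes in the last bin
        bins[idx].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(ok for _, ok in members) / len(members)
        ece += len(members) / n * abs(accuracy - mean_conf)
    return ece

# Toy directional verdicts (illustrative): (confidence, is_correct, is_cco)
cases = [(0.9, True, False)] * 9 + [(0.9, False, True)]
ece = expected_calibration_error(
    [conf for conf, _, _ in cases],
    [ok for _, ok, _ in cases],
)

# A confidence threshold keeps every case here, including the CCO one.
TAU = 0.8  # hypothetical threshold
kept = [case for case in cases if case[0] >= TAU]
cco_kept = sum(is_cco for _, _, is_cco in kept)
```

Confidence alone carries no information about whether the evidence was mixed, which is why a single confidence channel struggles here.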

A two-channel reference probe shows more promise: it outperforms single-channel methods and highlights the structural issues in AI judgment. On AVeriTeC, the method shows structural targeting with an empirical p-value of less than 1/2001, though the effect is less pronounced on VitaminC-Mixed. The point is not the magnitude of the gain but its selectivity: the probe specifically improves how AI systems process conflicting evidence.
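A bound of 1/2001 is what a resampling test with 2,000 random draws produces when no draw matches the observed statistic, although the article does not describe the procedure, so that reading is an assumption. The sketch below shows one way such a test could look: it asks whether the cases a probe flags overlap with CCO cases more often than a random set of the same size would. The function name, the overlap statistic, and the toy data are all hypothetical.

```python
import random

def targeting_p_value(is_cco, flagged, n_resamples=2000, seed=0):
    """Permutation test: is the flagged/CCO overlap larger than chance?

    Uses the conservative estimator (exceedances + 1) / (resamples + 1),
    whose smallest attainable value is 1 / (n_resamples + 1).
    """
    rng = random.Random(seed)
    observed = sum(c and f for c, f in zip(is_cco, flagged))
    shuffled = list(flagged)
    exceed = 0
    for _ in range(n_resamples):
        rng.shuffle(shuffled)  # break any link between flags and CCO labels
        stat = sum(c and f for c, f in zip(is_cco, shuffled))
        if stat >= observed:
            exceed += 1
    return (exceed + 1) / (n_resamples + 1)

# Toy data: 100 cases, the first 20 are CCO, and the probe flags exactly those.
is_cco = [i < 20 for i in range(100)]
flagged = [i < 20 for i in range(100)]
p = targeting_p_value(is_cco, flagged)
```

With this estimator, the p-value can never fall below 1/2001 for 2,000 resamples, so a reported bound at that level says only that random selection never matched the probe in any resample.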

The Case for Commitment Control

We need to rethink how verdicts are handled in AI systems. An external commitment-control layer could separate verdict generation from commitment authorization, using structural evidence and confidence as distinct channels. In simple terms, the system should have a NO-COMMIT state, with a controller that withholds a verdict to prevent premature conclusions.
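A minimal sketch of what such a layer might look like follows. It is an illustration under stated assumptions, not the design from the underlying work: the `StructuralSignal` fields (counts of supporting and refuting evidence items), the `authorize` function, and the `min_confidence` default are all hypothetical. The point it illustrates is that the judge proposes a verdict, while a separate controller decides whether to commit to it.

```python
from dataclasses import dataclass

NO_COMMIT = "NO-COMMIT"
DIRECTIONAL = {"SUPPORTS", "REFUTES"}

@dataclass
class JudgeOutput:
    verdict: str       # SUPPORTS, REFUTES, or CONFLICTING
    confidence: float  # judge-reported confidence in [0, 1]

@dataclass
class StructuralSignal:
    # Hypothetical structural channel: evidence items pointing each way.
    n_supporting: int
    n_refuting: int

def authorize(output, signal, min_confidence=0.8):
    """Commit to the judge's verdict only if both channels allow it."""
    if output.verdict in DIRECTIONAL:
        # Structural channel: evidence on both sides blocks a directional commit,
        # no matter how confident the judge is.
        if signal.n_supporting > 0 and signal.n_refuting > 0:
            return NO_COMMIT
        # Confidence channel: a low-confidence directional call is also withheld.
        if output.confidence < min_confidence:
            return NO_COMMIT
    return output.verdict

# A confident SUPPORTS on mixed evidence is withheld rather than committed.
decision = authorize(JudgeOutput("SUPPORTS", 0.95), StructuralSignal(2, 1))
```

Keeping the two checks separate means high confidence cannot override mixed evidence, which is the failure mode a single confidence threshold leaves open.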

This boils down to a fundamental question: Can we trust AI judges to make impartial decisions without a mechanism to control their commitments? Until we see a reliable system in place, skepticism remains warranted. Slapping a model on a GPU rental isn't a convergence thesis. The intersection of AI and judgment is real. Ninety percent of the projects aren't.
