Propagational Proxy Voting: Rethinking Majority in LLM Consensus
Majority voting may dominate unsupervised LLM inference, but Propagational Proxy Voting could redefine the consensus with smarter aggregation.
In the field of unsupervised large language model (LLM) inference, majority voting has reigned supreme. It's straightforward: sample answers, take the majority, call it a day. But is it really that simple? A new approach, Propagational Proxy Voting (PPV), challenges this norm, showcasing a more nuanced and effective method of consensus-building.
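For reference, the baseline that PPV is measured against fits in a few lines. This is a minimal sketch of plain majority voting over sampled answer letters; the function name and the sample data are illustrative and not taken from the PPV work.

```python
from collections import Counter

def majority_vote(answers):
    """Return the most frequent answer among the sampled generations."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(answers)
    # most_common(1) returns [(answer, count)]; on a tie, the answer seen first wins
    return counts.most_common(1)[0][0]

# Illustrative samples: 16 generations, 10 pick "B" and 6 pick "C"
samples = ["B"] * 10 + ["C"] * 6
winner = majority_vote(samples)
```

Every sample counts exactly once here, no matter how it reached its answer.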
Beyond Majority: A New Approach
Let's talk numbers. On the MMLU-Pro benchmark, PPV outperforms traditional majority voting by 1.5 percentage points overall and by 2.24 percentage points on the more challenging subsets. It's not just about counting votes; it's about understanding the signals within those votes. Majority voting tends to ignore signals such as within-group letter entropy and between-group reasoning geometry. These aren't just technical jargon. They're the keys to smarter aggregation.
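To make one of those signals concrete, here is a rough sketch of a within-group letter entropy. It assumes the signal is the Shannon entropy of the answer letters chosen inside one group of samples; the article does not give the exact definition, so treat both the formula and the name `letter_entropy` as assumptions.

```python
import math
from collections import Counter

def letter_entropy(letters):
    """Shannon entropy (in bits) of the answer letters within one group.

    Assumption: 0 bits means the whole group agrees on one letter;
    higher values mean the group is split across several letters.
    """
    if not letters:
        raise ValueError("group must contain at least one answer letter")
    total = len(letters)
    counts = Counter(letters)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

unanimous = letter_entropy(["A", "A", "A", "A"])  # group fully agrees
split = letter_entropy(["A", "B", "A", "B"])      # group evenly divided
```

A plain vote count treats both groups the same way, while an entropy signal separates the confident group from the divided one.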
PPV uses these signals through two key levers: WHEN, which determines how much weight a voter retains on its own choice, and WHOM, which dictates how the remaining weight is distributed among peers. It's about time someone asked: are all votes really equal? PPV argues they aren't, and it does so by integrating semantic entropy and reasoning geometry into its vote delegation process.
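The two levers can be pictured as building one row of delegation weights per voter. The sketch below is an assumption about the general shape, not PPV's actual formula: `self_weight` stands in for WHEN, `peer_affinity` stands in for WHOM, and both names are hypothetical.

```python
def delegation_row(index, self_weight, peer_affinity):
    """Build one voter's delegation weights; the row sums to 1.

    self_weight   -- WHEN: fraction of the vote the voter keeps (0..1)
    peer_affinity -- WHOM: non-negative scores, one per voter, used to
                     split the remaining weight among the other voters
    """
    if not 0.0 <= self_weight <= 1.0:
        raise ValueError("self_weight must be between 0 and 1")
    n = len(peer_affinity)
    # A voter never delegates to itself through the WHOM lever
    others = [a if j != index else 0.0 for j, a in enumerate(peer_affinity)]
    total = sum(others)
    if total == 0:
        # No peer to delegate to: the voter keeps its whole vote
        row = [0.0] * n
        row[index] = 1.0
        return row
    row = [(1.0 - self_weight) * a / total for a in others]
    row[index] = self_weight
    return row

# Voter 0 keeps 60% and splits the rest 3:1 between voters 1 and 2
row = delegation_row(0, 0.6, [0.0, 3.0, 1.0])
```

Stacking one such row per voter gives a matrix in which every row sums to 1.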
The Method Behind the Madness
PPV's methodology involves partitioning sampled generations into groups, analyzing each group's semantic entropy and reasoning geometry, and feeding these into a stochastic matrix. No need for gold-standard labels or additional training. This isn't just about efficiency; it's about breaking free from entrenched paradigms. In one case study, PPV overturned a 10-6 majority for the wrong answer because the majority group's reasoning was geometrically incoherent, despite that group's numerical advantage.
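Here is a toy version of how a 10-6 split can flip once votes propagate through a row-stochastic matrix. Only the 10-6 split comes from the case study. The matrix entries, the number of rounds, and the answer letters are invented for illustration, and the real method derives its weights from semantic entropy and reasoning geometry rather than hand-picked numbers.

```python
def propagate(mass, matrix, rounds=1):
    """Push vote mass through a row-stochastic matrix a fixed number of times.

    matrix[i][j] is the share of group i's current mass handed to group j.
    """
    n = len(mass)
    for row in matrix:
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row must sum to 1")
    for _ in range(rounds):
        mass = [sum(mass[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return mass

labels = ["B", "C"]    # hypothetical letters: "B" is the wrong majority answer
initial = [10.0, 6.0]  # the 10-6 split from the case study

# Illustrative weights only: the incoherent majority keeps half its mass,
# while the coherent minority keeps 90% of its own
P = [
    [0.5, 0.5],
    [0.1, 0.9],
]

final = propagate(initial, P, rounds=3)
winner = labels[max(range(len(final)), key=final.__getitem__)]
```

Total vote mass is conserved at 16 in this sketch; propagation only changes where it ends up.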
What's the takeaway? A head count isn't a consensus. How coherently a group reasons can matter more than how many samples it holds.
Rethinking the Aggregation Landscape
While PPV shows promise, it also brings to light the limitations of certain delegation strategies that don't close the oracle gap. This insight constrains the design space for unsupervised LLM aggregation and reveals how hard it is to build a truly robust system.
The implications of PPV extend beyond theoretical discussions. As AI systems grow more complex and pervasive, distinguishing signal from noise becomes essential. So, is majority voting outdated? In certain applications, perhaps. Show me the inference costs. Then we'll talk.