Rethinking Data Attribution: Bridging the Gap with AdamW-Influence
A fresh look at where data attribution goes wrong shows that correcting an optimizer mismatch yields large accuracy gains. Here's why accounting for AdamW makes all the difference.
Data attribution methods are the unsung heroes of machine learning: they trace a model's predictions back to the training samples that shaped them. But the field has a glaring gap, and that gap is error analysis. How can we trust these methods without understanding where they go wrong?
Exposing the Optimizer Mismatch
One major error source has been largely ignored. Most attribution methods assume models are trained with Stochastic Gradient Descent (SGD), even when AdamW is the optimizer actually used. Why does this matter? AdamW rescales each gradient coordinate using running moment estimates and applies decoupled weight decay, so a training sample moves the parameters differently than it would under plain SGD. An attribution score computed under the SGD assumption therefore measures the wrong update. Enter AdamW-influence, a proposed method that aligns the attribution calculation with AdamW's dynamics. The reported result is an improvement in Spearman correlation of 10% to over 300% across models including an MLP, a CNN, GPT-2, and Llama 3.2-1B.
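To make the mismatch concrete, here is a minimal sketch, not the AdamW-influence method itself. It compares a first-order influence estimate under the SGD assumption with the same estimate after Adam-style per-coordinate scaling. The function names, the toy gradients, and the second-moment values in `v` are all hypothetical, and momentum, bias correction, and weight decay are deliberately left out.

```python
import math

def sgd_influence(g_test, g_train, lr):
    """First-order change in test loss caused by one SGD step on g_train."""
    return -lr * sum(a * b for a, b in zip(g_test, g_train))

def adam_scaled_influence(g_test, g_train, v, lr, eps=1e-8):
    """Same estimate with Adam-style per-coordinate scaling 1 / (sqrt(v) + eps).

    Simplifying assumption: momentum, bias correction, and weight decay are ignored.
    """
    return -lr * sum(
        a * b / (math.sqrt(s) + eps) for a, b, s in zip(g_test, g_train, v)
    )

# Hypothetical toy values: two parameters, two training samples.
g_test = [1.0, 1.0]
sample_a = [2.0, 0.0]   # large gradient on a coordinate with a large second moment
sample_b = [0.0, 0.5]   # small gradient on a coordinate with a small second moment
v = [100.0, 0.01]       # assumed running second-moment estimates
lr = 0.1

sgd_scores = {"a": sgd_influence(g_test, sample_a, lr),
              "b": sgd_influence(g_test, sample_b, lr)}
adam_scores = {"a": adam_scaled_influence(g_test, sample_a, v, lr),
               "b": adam_scaled_influence(g_test, sample_b, v, lr)}

# More negative means a larger estimated reduction in test loss.
print(min(sgd_scores, key=sgd_scores.get))    # most helpful sample under the SGD assumption
print(min(adam_scores, key=adam_scores.get))  # most helpful sample under Adam-style scaling
```

With these toy numbers the SGD-style estimate ranks sample a as most helpful, while the Adam-scaled estimate ranks sample b first. A ranking flip like this is exactly what a rank-based metric such as Spearman correlation penalizes.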
Algorithm-Level Errors: Beyond First-Order Approximations
Algorithm-level errors, particularly those tied to first-order Taylor approximations, also warrant scrutiny. How far a first-order estimate drifts from the true effect depends on factors such as the learning rate and the length of the training trajectory. The new approach offers a closed-form error proxy, so the error can be evaluated without retraining the model. It's a step towards methodological transparency, but how often do these nuances get lost in the noise of academic publications?
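The role of learning rate and trajectory length can be seen in a toy setting. The sketch below is an illustration only and is not the closed-form proxy from the work discussed here: it runs plain gradient descent on a hypothetical one-dimensional quadratic loss and measures the gap between the actual loss change and the sum of per-step first-order estimates.

```python
def loss(theta, curvature=2.0):
    """Hypothetical quadratic loss 0.5 * curvature * theta^2."""
    return 0.5 * curvature * theta ** 2

def grad(theta, curvature=2.0):
    """Gradient of the quadratic loss above."""
    return curvature * theta

def taylor_gap(theta0, lr, steps):
    """|actual loss change - summed first-order estimates| along a GD trajectory."""
    theta, predicted = theta0, 0.0
    for _ in range(steps):
        g = grad(theta)
        predicted += -lr * g * g   # first-order estimate of this step's loss change
        theta -= lr * g            # plain gradient-descent update
    actual = loss(theta) - loss(theta0)
    return abs(actual - predicted)

# Compare a small and a large learning rate, and a short and a long trajectory.
for lr in (0.01, 0.1):
    for steps in (10, 20):
        print(lr, steps, taylor_gap(1.0, lr, steps))
```

On this toy problem every step adds a positive second-order term that the first-order estimate ignores, so the gap widens as the trajectory gets longer and is larger at the higher learning rate.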
A Unified Framework for Data Selection
These insights feed into practical data selection guidelines. Offline and online strategies are unified under a single K-step look-ahead framework, which gives practitioners one knob to reason about instead of two separate families of methods. Online selection with a short horizon often rivals or surpasses offline methods, making the optimal horizon a tunable parameter alongside the learning rate. This framework isn't just theoretical. It's a practical recipe for those in the trenches of machine learning.
In a field obsessed with innovation, it's the overlooked fundamentals that often hold us back. This analysis doesn't just bridge gaps; it redefines them. Will others follow suit, or will these revelations be another flash in the academic pan?
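One way to picture the horizon as a tunable parameter is the sketch below. It is an assumption about how such a loop could be organized, not the procedure from the work itself: `k_step_selection`, `toy_score`, and the score values are hypothetical, and `score` stands in for whatever attribution estimate is used to rate a candidate over the next `horizon` steps.

```python
def k_step_selection(candidates, score, total_steps, horizon, budget):
    """Re-score candidates every `horizon` steps and keep the top `budget`.

    score(candidate, step) is a hypothetical, user-supplied estimate of how
    much a candidate helps over the next `horizon` steps starting at `step`.
    """
    schedule = []
    for start in range(0, total_steps, horizon):
        ranked = sorted(candidates, key=lambda c: score(c, start), reverse=True)
        schedule.append((start, ranked[:budget]))
    return schedule

# Hypothetical scores: sample "x" helps early in training, "y" helps late.
def toy_score(candidate, step):
    early = {"x": 1.0, "y": 0.2, "z": 0.1}
    late = {"x": 0.1, "y": 1.0, "z": 0.2}
    return (early if step < 4 else late)[candidate]

pool = ["x", "y", "z"]
# Horizon equal to the full run: candidates are scored once (offline-style).
offline = k_step_selection(pool, toy_score, total_steps=8, horizon=8, budget=1)
# Short horizon: candidates are re-scored as training progresses (online-style).
online = k_step_selection(pool, toy_score, total_steps=8, horizon=2, budget=1)
print(offline)
print(online)
```

In this toy run the full-length horizon commits to "x" for all of training, while the short horizon switches to "y" once the scores change. How short the horizon should be is the quantity the article suggests tuning alongside the learning rate.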