TAO: A New Protocol for Trustworthy Machine Learning on the Cloud
TAO offers a new approach to verifying neural network outputs in decentralized environments. It combines empirical thresholds with theoretical error bounds, so outputs can be checked without requiring bitwise-identical results.
Machine learning increasingly runs on hardware outside users' control, such as cloud GPUs and inference marketplaces. That raises a hard problem: verifying that the outputs these services return were actually computed from the intended inputs. Today users have little ability to detect or counter service downgrades such as model swaps or discrepancies in ad embeddings.
The Problem with Verification
Why is verifying outputs so hard? The core obstacle is the inherent nondeterminism of floating-point execution on heterogeneous accelerators: the same model and input can produce slightly different bits on different hardware or kernels, so a verifier cannot simply demand an exact match. Previous approaches either did not scale to real-world neural networks or required users to trust vendors. TAO is a new protocol designed to avoid both limitations.
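To make the nondeterminism concrete, here is a minimal Python sketch. It uses ordinary Python floats (IEEE-754 double precision) rather than GPU kernels, so it only illustrates the underlying effect: changing the order of a reduction changes the rounded result, which is why an exact-match check fails even when both parties are honest. The tolerance value is an arbitrary choice for illustration.

```python
import math

# Floating-point addition is not associative: regrouping the same three
# numbers changes the rounded result.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c    # one possible reduction order
right = a + (b + c)   # another order, as a different kernel might use

bitwise_equal = (left == right)
# Illustrative tolerance only; not a value taken from TAO.
within_tolerance = math.isclose(left, right, rel_tol=1e-12)

print(left, right)        # 0.6000000000000001 0.6
print(bitwise_equal)      # False
print(within_tolerance)   # True
```

Both results are legitimate evaluations of the same sum, yet an equality test rejects one of them, while a tolerance test accepts both.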
Introducing TAO
TAO stands for Tolerance Aware Optimistic verification. Rather than demanding bitwise equality, it accepts outputs that fall within operator-level acceptance regions. The protocol uses two error models: sound per-operator IEEE-754 worst-case bounds, and tight empirical percentile profiles calibrated across different hardware.
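The two error models can be sketched for a single operator. The example below is an illustration under stated assumptions, not TAO's implementation: it takes a dot product as the operator, uses the classical worst-case rounding bound gamma_n * sum|x_i * y_i| with gamma_n = n*u / (1 - n*u) as the theoretical model, and uses a nearest-rank 99th percentile over randomly reordered reductions as a stand-in for calibration across hardware. It works in double precision (u = 2^-53) instead of FP32, and every function name is hypothetical.

```python
import math
import random
from fractions import Fraction

U = 2.0 ** -53  # unit roundoff of IEEE-754 binary64 (Python floats)

def dot(x, y, order):
    """Dot product accumulated in the given index order."""
    total = 0.0
    for i in order:
        total += x[i] * y[i]
    return total

def worst_case_bound(x, y):
    """Classical bound: |fl(x.y) - x.y| <= gamma_n * sum|x_i * y_i|."""
    n = len(x)
    gamma_n = n * U / (1.0 - n * U)
    return gamma_n * sum(abs(a * b) for a, b in zip(x, y))

def empirical_threshold(deviations, pct=0.99):
    """Nearest-rank percentile of observed deviations."""
    ranked = sorted(deviations)
    rank = max(1, math.ceil(pct * len(ranked)))
    return ranked[rank - 1]

def accept(claimed, reference, threshold):
    """Acceptance region: the claimed value lies within the threshold."""
    return abs(claimed - reference) <= threshold

rng = random.Random(0)
n = 512
x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
y = [rng.uniform(-1.0, 1.0) for _ in range(n)]
# Exact rational value of the dot product, used as ground truth.
exact = sum(Fraction(a) * Fraction(b) for a, b in zip(x, y))

# Simulated calibration: random summation orders stand in for hardware.
deviations = []
for _ in range(200):
    order = list(range(n))
    rng.shuffle(order)
    deviations.append(float(abs(Fraction(dot(x, y, order)) - exact)))

theoretical = worst_case_bound(x, y)
empirical = empirical_threshold(deviations)
reference = float(exact)
honest = dot(x, y, range(n))
tampered = honest + 1e-3  # a deliberately wrong output

print(accept(honest, reference, theoretical))    # honest result accepted
print(accept(tampered, reference, theoretical))  # tampered result rejected
print(empirical <= theoretical)                  # percentile is the tighter one
```

The worst-case bound is sound but loose because it assumes every rounding error lines up in the same direction; the percentile threshold reflects what executions actually produce, at the cost of being a statistical rather than a guaranteed limit.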
When a result is disputed, TAO runs a Merkle-anchored, threshold-guided dispute game that recursively partitions the computation graph until a single operator remains. Adjudication then reduces to a lightweight check against the theoretical bounds or a vote against the empirical thresholds. The key contribution is that TAO does not rely on trusted hardware or deterministic kernels, which makes it a scalable option for real-world ML compute.
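The dispute game can be sketched as a bisection over committed intermediate results. The code below is a simplified illustration, not TAO's protocol: it assumes the graph is a linear chain of scalar operators, uses one tolerance for every operator (TAO uses per-operator thresholds), omits Merkle inclusion proofs and the on-chain contract, and all names are hypothetical. It does show the central idea: both parties narrow the disagreement to one operator, and only that operator is re-checked.

```python
import hashlib
import struct

def leaf_hash(value):
    """Hash one intermediate result (a float) as a Merkle leaf."""
    return hashlib.sha256(struct.pack(">d", value)).digest()

def merkle_root(leaves):
    """Root of a binary Merkle tree; the last node is duplicated on odd levels."""
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]

def run_trace(ops, x):
    """Execute the chain and record every intermediate state."""
    states = [x]
    for op in ops:
        states.append(op(states[-1]))
    return states

def find_disputed_op(claimed, honest, tol):
    """Bisect to the operator whose input is agreed but whose output is not.

    Requires agreement at index 0 and disagreement at the final index.
    """
    lo, hi = 0, len(claimed) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if abs(claimed[mid] - honest[mid]) <= tol:
            lo = mid   # still agreed here; the fault lies later
        else:
            hi = mid   # already diverged; the fault lies earlier
    return lo          # operator lo maps state lo to state lo + 1

def adjudicate(op, claimed_in, claimed_out, tol):
    """Re-run one operator on the claimed input and check the claimed output."""
    return abs(op(claimed_in) - claimed_out) <= tol

ops = [lambda v: v * 2.0, lambda v: v + 1.0, lambda v: v * v,
       lambda v: v - 3.0, lambda v: v / 2.0]
TOL = 1e-9  # illustrative single tolerance

honest = run_trace(ops, 1.5)

# A dishonest proposer perturbs operator 2 and continues from there.
cheating_ops = list(ops)
cheating_ops[2] = lambda v: v * v + 0.5
claimed = run_trace(cheating_ops, 1.5)

claimed_root = merkle_root([leaf_hash(s) for s in claimed])
honest_root = merkle_root([leaf_hash(s) for s in honest])

idx = find_disputed_op(claimed, honest, TOL)
proposer_wins = adjudicate(ops[idx], claimed[idx], claimed[idx + 1], TOL)
print(idx, proposer_wins)  # 2 False
```

The bisection needs a number of rounds logarithmic in the number of operators, and the final check re-executes only a single operator, which is what keeps adjudication lightweight.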
TAO in Practice
TAO is implemented as a PyTorch-compatible runtime plus a contract layer deployed on the Ethereum Holesky testnet. The runtime instruments graphs, computes per-operator bounds, and executes vendor kernels in FP32 with minimal overhead, for example 0.3% on Qwen3-8B.
Across CNNs, Transformers, and diffusion models running on A100, H100, RTX6000, and RTX4090 GPUs, the empirical thresholds were 100 to 1,000 times tighter than the theoretical bounds. The ablation study reports that bound-aware adversarial attacks achieved a 0% success rate under TAO.
Why This Matters
TAO's attempt to reconcile scalability with verifiability is a significant step forward. Protocols like it could become a standard way to establish trust in ML-as-a-Service. The open question is whether TAO can fully replace the need for vendor trust or will instead complement existing systems.
For now, TAO offers a promising path forward. By providing a way to verify outputs without relying on trusted environments, it's setting the stage for more transparent and accountable machine learning services in decentralized settings. Code and data are available at TAO's project page, inviting others to explore its potential and build on its foundation.