Synthesized Data Powers New Era in Hardware Verification
Data synthesis breathes new life into translating natural language to SystemVerilog Assertions. CodeV-SVA models are now setting benchmarks, challenging top-tier LLMs.
SystemVerilog Assertions (SVAs) have long been a linchpin in hardware verification, yet translating natural language properties into these assertions has proved difficult. The main culprit? Limited data. Recent attempts to use general-purpose large language models (LLMs) for this translation have hit a wall because they lack something fundamental: a reliable corpus of SVA examples to learn from.
Synthesizing the Solution
Enter data synthesis. By drawing on large-scale open-source register-transfer level (RTL) designs, researchers have guided LLMs to generate realistic SVAs. The strategy is twofold. First, it addresses the scarcity of reliable SVA corpora. Second, it uses bidirectional translation to check that each generated SVA and its natural language description mean the same thing.
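One way to picture the bidirectional check is as a round-trip filter: describe an SVA in natural language, translate that description back into an SVA, and keep the pair only if the two assertions agree. The Python sketch below is a toy under stated assumptions, not the researchers' pipeline. The names `round_trip_filter`, `sva_to_nl`, `nl_to_sva` and `same_text` are hypothetical, the two translators are lookup-table stand-ins for LLM calls, and the whitespace-insensitive text comparison is only a placeholder for whatever equivalence check the real system uses.

```python
from typing import Callable, Iterable, List, Tuple


def round_trip_filter(
    svas: Iterable[str],
    sva_to_nl: Callable[[str], str],
    nl_to_sva: Callable[[str], str],
    equivalent: Callable[[str, str], bool],
) -> List[Tuple[str, str]]:
    """Keep (description, sva) pairs whose round trip reproduces the SVA."""
    kept = []
    for sva in svas:
        description = sva_to_nl(sva)          # SVA -> natural language
        regenerated = nl_to_sva(description)  # natural language -> SVA
        if equivalent(sva, regenerated):
            kept.append((description, sva))
    return kept


# Hypothetical stand-ins for the two LLM translation directions.
FAKE_SVA_TO_NL = {
    "assert property (@(posedge clk) req |-> ##1 ack);":
        "ack follows req one cycle later",
    "assert property (@(posedge clk) !(rd && wr));":
        "rd and wr are never high together",
}
FAKE_NL_TO_SVA = {
    "ack follows req one cycle later":
        "assert property (@(posedge clk)  req |-> ##1 ack);",
    # Deliberately wrong back-translation, so this pair is rejected.
    "rd and wr are never high together":
        "assert property (@(posedge clk) rd |-> wr);",
}


def same_text(a: str, b: str) -> bool:
    """Placeholder check: equal text after collapsing whitespace."""
    return " ".join(a.split()) == " ".join(b.split())


pairs = round_trip_filter(
    FAKE_SVA_TO_NL,              # iterating a dict yields its keys (the SVAs)
    FAKE_SVA_TO_NL.__getitem__,
    FAKE_NL_TO_SVA.__getitem__,
    same_text,
)
for description, sva in pairs:
    print(f"{description!r} <-> {sva!r}")
```

Text comparison would wrongly reject assertions that are logically equivalent but written differently, so a real filter would likely need a stronger notion of equivalence, which is why the check is passed in as a parameter here.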
With this synthesized data, the CodeV-SVA series of models has been trained, marking a significant leap forward. Particularly impressive is the CodeV-SVA-14B model, which achieves 75.8% on NL2SVA-Human and 84.0% on NL2SVA-Machine in Functional @1 accuracy. These numbers aren't just impressive; they rival advanced language models like GPT-5 and DeepSeek-R1.
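For readers unfamiliar with "@1" metrics, here is a rough sketch of how such a score is typically computed. It assumes Functional @1 follows the pass@k convention common in code-generation benchmarks, where each problem gets n sampled answers and c of them pass a functional check. The article does not spell out the exact scoring procedure, so treat that as an assumption. The function names and the toy counts are made up for illustration and are not benchmark results.

```python
from math import comb
from typing import List, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem: n samples, c correct."""
    if n - c < k:
        # Fewer than k incorrect samples, so any k draws include a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_score(results: List[Tuple[int, int]], k: int = 1) -> float:
    """Average pass@k over (n, c) pairs, one pair per benchmark problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Toy data, NOT real results: 4 samples per problem, varying correct counts.
toy_results = [(4, 4), (4, 2), (4, 0), (4, 3)]
score = benchmark_score(toy_results, k=1)
print(f"Toy @1 score: {score:.4f}")
```

With k = 1 the estimate for each problem reduces to c / n, so the headline figure is simply the average fraction of functionally correct first attempts.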
Why This Matters
For many, the mere mention of SVAs might induce a yawn, but consider this: if we can reliably translate natural language to SVAs, the implications are vast for hardware design and verification. It simplifies a notoriously complex process and potentially trims down development cycles. Show me the inference costs and I'll show you an industry ready to pivot.
The success of CodeV-SVA-14B isn't just about beating benchmarks. It's about showing that, with enough synthesized data, even complex technical challenges can be addressed. But there's an essential question here: If the AI can hold a wallet, who writes the risk model?
The Road Ahead
Looking ahead, synthesized data holds real promise for hardware verification. The intersection is real. Ninety percent of the projects aren't. But the ones that are real are set to redefine hardware verification.