Text-to-Image Models: The Safety Illusion

By Ruth Adesanya, May 28, 2026

New research shows text-to-image models are more affected by unsafe data proportions than sheer volume, challenging assumptions about AI training.

As AI continues to transform the creative landscape, the safety of text-to-image models is under scrutiny. Recent research reveals that these models, often trained on vast datasets, aren't immune to the pitfalls of unsafe content. The findings suggest that the proportion of unsafe images in training data, not their absolute count, significantly influences model output safety.

The Proportion Problem

The study sheds light on a critical oversight in AI model training. By training text-to-image models on datasets with varying fractions of unsafe images, ranging from 0% to 9.6%, researchers identified a monotonic rise in output unsafety. Astonishingly, even with 0% contamination, a baseline unsafety level of 16.6% persisted. Once the proportion hit 5%, unsafety climbed to 25.5%. This raises a pressing question: Are we focusing on the wrong metric by counting images instead of considering their proportion?
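To make the count-versus-proportion distinction concrete, here is a minimal Python sketch. The dataset sizes are invented for illustration and do not come from the study; the helper name `unsafe_fraction` is likewise hypothetical. The only figures taken from the article are the two reported points (0% contamination giving 16.6% unsafe outputs, 5% giving 25.5%). The average slope between them is simple arithmetic on those two points, not a claim that the study found a linear relationship; the article says only that the rise was monotonic.

```python
def unsafe_fraction(unsafe_count: int, total_count: int) -> float:
    """Return the fraction of a training set flagged as unsafe."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return unsafe_count / total_count


# Hypothetical datasets: same absolute number of unsafe images,
# very different proportions. Sizes are illustrative assumptions.
small_set = unsafe_fraction(5_000, 100_000)    # 5% unsafe
large_set = unsafe_fraction(5_000, 1_000_000)  # 0.5% unsafe

# The two data points reported in the article:
# (unsafe % in training data, unsafe % in model outputs)
reported = [(0.0, 16.6), (5.0, 25.5)]
(x0, y0), (x1, y1) = reported

# Average rise in output unsafety per percentage point of contamination
# between these two points only (an interpolation, not a fitted model).
slope = (y1 - y0) / (x1 - x0)

if __name__ == "__main__":
    print(f"small set unsafe fraction: {small_set:.3f}")
    print(f"large set unsafe fraction: {large_set:.3f}")
    print(f"average slope: {slope:.2f} points per 1% contamination")
```

Under the study's framing, the two hypothetical datasets would carry different risk despite containing the same number of unsafe images, because the smaller one is ten times more contaminated by proportion.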

Unseen Risks

These findings point to inherent risks beyond just the training data. The study implicates components like the frozen text encoder, which contributes a residual safety risk of its own. Swapping in SafeCLIP reduced the unsafety floor to 9.6%, but the dose-response effect remained consistent across all encoders tested. This highlights a concerning issue: regardless of data curation efforts, inherent model risks remain.

Quality vs. Safety

The study also found that safety filtering didn't compromise model quality. Evaluations using metrics like FID, CLIPScore, and ImageReward showed no degradation. Yet the persistence of unsafety poses a conundrum for future research. How do we balance emerging AI capabilities with safety concerns?

Let's apply the standard the industry set for itself. This research underscores a gap in accountability and transparency, challenging AI developers to rethink data curation practices. As models grow more sophisticated, the industry must ensure that safety doesn't become an afterthought. The burden of proof sits with the team, not the community.
