Why AI Struggles with Clinical Triage Decisions

By Nadia Osei, May 29, 2026

AI models face challenges in clinical triage when shifting from free-text to multiple-choice formats. The real issue? Output format, not understanding.

AI models have recently been put to the test in areas where we wouldn't traditionally expect them, such as clinical triage. The twist? These models perform worse when the decision is framed as a multiple-choice question than when they answer in open-ended free text.

The Clinical Representation Gap

Research involving models like Gemma 3 4B/12B IT and Qwen3-8B reveals that medical features remain consistent across different formats. However, when the models hit the multiple-choice decision point, these features suddenly go silent. Three independent methods have confirmed this: natural-language autoencoder verbalization, decision-token logit attribution, and top-feature characterization. The conclusion? It's the output format that's the culprit, not the models' understanding or representation of clinical data.

Why Format Matters

When AI models miss the mark in clinical triage, the issue isn't a lack of knowledge. Instead, the multiple-choice format itself imposes a penalty. Models often misfire by choosing an adjacent acuity letter instead of the correct one, a sign that the format is skewing results. It's a bit like asking an essayist to answer on a bubble sheet. The fidelity of AI understanding is compromised by how we ask it to respond.
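One way to make the "adjacent acuity letter" pattern concrete is to separate near misses from distant ones when scoring multiple-choice answers. The Python sketch below is illustrative only: the four-letter scale, the function name, and the sample data are assumptions made for this example, not details from the research described above.

```python
from collections import Counter

# Hypothetical acuity scale, ordered from most to least urgent.
# The real studies may use a different set of letters or levels.
ACUITY_LETTERS = ["A", "B", "C", "D"]


def triage_error_breakdown(predicted, gold, letters=ACUITY_LETTERS):
    """Return the share of exact, adjacent (off by one level), and distant answers."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("at least one example is required")

    # Map each letter to its position on the ordered scale.
    rank = {letter: i for i, letter in enumerate(letters)}
    counts = Counter()
    for p, g in zip(predicted, gold):
        distance = abs(rank[p] - rank[g])
        if distance == 0:
            counts["exact"] += 1
        elif distance == 1:
            counts["adjacent"] += 1
        else:
            counts["distant"] += 1

    total = len(gold)
    return {key: counts[key] / total for key in ("exact", "adjacent", "distant")}


# Made-up answers, purely to show the call.
gold = ["A", "B", "C", "D", "B", "C"]
predicted = ["A", "C", "C", "C", "B", "A"]
breakdown = triage_error_breakdown(predicted, gold)
print(breakdown)
```

If most errors land in the "adjacent" bucket rather than the "distant" one, that is consistent with a model that has roughly the right clinical picture but slips when forced to commit to a single letter.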

Implications for AI Development

This phenomenon raises critical questions about AI deployment in sensitive fields like healthcare. If AI systems falter due to formatting, how can we trust their judgments in life-or-death scenarios? Are we focusing enough on the right aspects of AI training? Simply running a model on rented GPUs doesn't settle these questions. We need to rethink how we structure AI outputs to align with the model's strengths rather than its weaknesses.

Ultimately, while AI's role in healthcare holds immense promise, the real challenge lies in how we integrate human-like understanding with machine-specific processing. The intersection is real. Ninety percent of the projects aren't. Without addressing format issues, even the most sophisticated models won't make the cut.
