EHRs Go Local with New AI: A Step Toward Smarter Healthcare

In the quest to speed up healthcare, a new framework promises to change how clinicians retrieve patient data. Enter the locally deployable Clinical Contextual Question Answering (CCQA) system. This innovation allows clinical questions to be answered directly from electronic health records (EHRs) without any external data transfer, safeguarding patient privacy.

Performance Under the Microscope

Testing the system's capabilities involved benchmarking large language models (LLMs) ranging from 4 billion to 70 billion parameters. These tests, conducted offline, used 1,664 expert-annotated question-answer pairs collected from 183 patients' records. Notably, the dataset consisted predominantly of Finnish clinical text.

The results were compelling. The Llama-3.1-70B model achieved 95.3% accuracy and 97.3% consistency across semantically equivalent question variants. Surprisingly, a smaller model, the Qwen3-30B-A3B-2507, delivered comparable performance, challenging the notion that bigger always means better in AI.
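
To make the two headline metrics concrete, here is a minimal Python sketch of how accuracy and consistency could be computed. The article does not give the study's exact definitions, so this assumes consistency means the share of question groups whose paraphrased variants all receive the same answer. The record fields (`group_id`, `predicted`, `reference`) and the toy data are hypothetical.

```python
from collections import defaultdict


def accuracy(records):
    """Fraction of records whose predicted answer matches the reference."""
    return sum(r["predicted"] == r["reference"] for r in records) / len(records)


def consistency(records):
    """Fraction of question groups whose variants all got the same answer.

    Assumed definition: a group is consistent when every semantically
    equivalent variant of the question yields an identical prediction.
    """
    answers_by_group = defaultdict(set)
    for r in records:
        answers_by_group[r["group_id"]].add(r["predicted"])
    consistent = sum(len(answers) == 1 for answers in answers_by_group.values())
    return consistent / len(answers_by_group)


# Hypothetical toy data: two questions, each asked in two phrasings.
records = [
    {"group_id": "q1", "predicted": "yes", "reference": "yes"},
    {"group_id": "q1", "predicted": "yes", "reference": "yes"},
    {"group_id": "q2", "predicted": "no", "reference": "yes"},
    {"group_id": "q2", "predicted": "yes", "reference": "yes"},
]

print(accuracy(records))     # 0.75
print(consistency(records))  # 0.5
```

Note that the two numbers measure different things: a model can be consistent yet wrong if it gives the same incorrect answer to every phrasing, which is why both are reported.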

Practical Deployment: Challenges and Solutions

Deploying these models in a clinical setting isn't a simple task. While accuracy was high, the models showed variability in calibration during multiple-choice tests. Low-precision quantization, specifically 4-bit and 8-bit, preserved predictive performance while reducing GPU memory needs, making deployment more feasible.
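
A back-of-the-envelope calculation shows why quantization matters for local deployment. The Python sketch below estimates the memory needed for model weights alone. The 16-bit baseline is an assumption, and the estimate ignores activations, the key-value cache, and the small overhead of quantization metadata, so real requirements will be higher.

```python
def weight_memory_gb(n_params, bits_per_weight):
    """Approximate memory for model weights only, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# A 70-billion-parameter model at an assumed 16-bit baseline, then 8-bit and 4-bit.
for bits in (16, 8, 4):
    print(f"{bits}-bit: {weight_memory_gb(70e9, bits):.0f} GB")
# 16-bit: 140 GB
# 8-bit: 70 GB
# 4-bit: 35 GB
```

Under these assumptions, 4-bit quantization cuts the weight footprint of a 70B model to roughly a quarter of the 16-bit figure, which is the difference between needing several data-center GPUs and fitting on far more modest hardware.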

Even with these advances, there are pitfalls. Clinically significant errors appeared in 2.9% of the outputs. Moreover, semantically equivalent questions sometimes produced discordant answers, with 0.96% of cases showing errors. This highlights an ongoing need for human oversight.

The Way Forward

Local deployment of open-source LLMs within EHR systems could revolutionize clinical data retrieval. But is AI ready to handle life-and-death decisions? Not yet. While the models are promising, the presence of critical errors underscores the necessity of human verification.

In healthcare, where precision is everything, integrating these systems will require careful validation. As the healthcare industry moves forward, the collaboration between AI and clinicians could enhance decision-making. But until AI can guarantee flawless performance, human judgment will remain indispensable.