Large Language Models Need Ongoing Monitoring, Not One-Time Approval
A Nature piece argues that large language models require capability-based monitoring as they evolve after deployment. In healthcare, that warning is especially relevant because a model's behavior can change as the tools, data access, and workflows around it change.
The traditional approval model for software assumes the system is mostly fixed. Large language models break that assumption because their capabilities can expand after launch through updates, new integrations, and changes in the context in which they are used.
That is why capability-based monitoring matters so much in healthcare. A model that is safe in a narrow setting can become risky when connected to records, messaging, triage, or decision support tools that amplify its reach.
In practical terms, this means healthcare organizations will need to monitor behavior continuously, not just at procurement or deployment. They will also need escalation rules for when a model crosses from administrative assistance into clinically meaningful influence.
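One way to picture such an escalation rule is a check that compares what a deployment was approved for with what it can now reach. The sketch below is illustrative only and does not come from the Nature piece: the tier names, the tool-to-tier mapping, and the Deployment, observed_tier, and needs_escalation names are all hypothetical assumptions, and a real health system would define its own categories and review process.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    """Hypothetical capability tiers, ordered by potential clinical influence."""
    ADMINISTRATIVE = 1
    CLINICAL_ADJACENT = 2
    CLINICAL = 3


# Assumed mapping from connected tools to the tier each one implies.
TOOL_TIERS = {
    "scheduling": Tier.ADMINISTRATIVE,
    "records": Tier.CLINICAL_ADJACENT,
    "messaging": Tier.CLINICAL_ADJACENT,
    "triage": Tier.CLINICAL,
    "decision_support": Tier.CLINICAL,
}


@dataclass(frozen=True)
class Deployment:
    name: str
    approved_tier: Tier          # what the organization signed off on
    connected_tools: frozenset   # what the model can actually reach today


def observed_tier(deployment: Deployment) -> Tier:
    """Highest tier implied by the connected tools.

    Unknown tools are treated as CLINICAL so that new integrations
    trigger review by default rather than slipping through.
    """
    tiers = [TOOL_TIERS.get(tool, Tier.CLINICAL) for tool in deployment.connected_tools]
    return max(tiers, default=Tier.ADMINISTRATIVE)


def needs_escalation(deployment: Deployment) -> bool:
    """True when current reach exceeds the tier that was approved."""
    return observed_tier(deployment) > deployment.approved_tier


# Approved as an administrative assistant, later wired into triage.
scribe = Deployment("scribe", Tier.ADMINISTRATIVE, frozenset({"scheduling", "triage"}))
# Approved for clinical-adjacent work and still within that scope.
inbox = Deployment("inbox", Tier.CLINICAL_ADJACENT, frozenset({"messaging", "records"}))

for d in (scribe, inbox):
    print(d.name, observed_tier(d).name, needs_escalation(d))
```

Run on a schedule or whenever an integration changes, a check like this turns approval from a one-time event into a recurring comparison. Treating unrecognized tools as the highest tier is a deliberately conservative design choice in this sketch.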
The article’s broader message is that oversight must track function, not just form. For health systems and regulators, the question is no longer whether an AI product passed a test once, but whether its real-world capabilities are still the ones they think they approved.