Can AI Match Clinicians in Medical Interviews? New Evidence Says Not Quite
Researchers are testing whether AI can conduct medical interviews as well as clinicians, a question with major implications for triage and intake workflows. Early evidence suggests the models are promising but still fall short of human judgment in nuanced patient interactions.
Medical interviewing is one of the more tempting targets for AI because it appears structured: ask questions, collect symptoms, summarize, and suggest next steps. But studies comparing AI to clinicians keep showing that the hard part is not generating an interview. It is knowing what matters, what is missing, and what should trigger escalation.
That distinction is critical. A model can sound organized and empathetic while still missing context cues, contradictory details, or red-flag symptoms. In a real clinical setting, interview quality is measured not only by completeness, but by the ability to adapt in the moment to uncertainty and risk.
The interest in AI interviews is understandable. Health systems are under pressure to reduce intake burden, improve access, and standardize preliminary screening. If an AI tool can safely handle routine history-taking, clinicians could spend more time on diagnostic reasoning and patient counseling.
But the new work reinforces a familiar pattern in medical AI: tasks that look like language problems often turn out to be judgment problems. That means performance should be judged against clinical outcomes, not just transcript quality or user satisfaction.
For now, the most realistic role for AI interview systems may be as a structured assistant that captures the history, flags gaps, and prepares a clinician-facing summary, rather than acting as the primary evaluator of patient concerns.
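To make that assistant role concrete, here is a minimal Python sketch of the three steps named above: capture the history, flag gaps, and prepare a summary for a clinician. Everything in it is an assumption made for illustration and is not drawn from the studies discussed: the field names in REQUIRED_FIELDS, the phrases in RED_FLAG_PHRASES, and the IntakeSummary and summarize_intake names are all hypothetical. The phrase matching is a naive substring check, not a validated screening method, and the sketch makes no triage decision of its own; it only organizes what a clinician would then review.

```python
from dataclasses import dataclass, field

# Hypothetical intake fields; a real system would define these with clinicians.
REQUIRED_FIELDS = ("chief_complaint", "onset", "medications", "allergies")

# Illustrative placeholder phrases only, not a validated clinical list.
RED_FLAG_PHRASES = ("chest pain", "shortness of breath")


@dataclass
class IntakeSummary:
    """Clinician-facing summary: what was captured, what is missing, what to look at first."""
    history: dict
    gaps: list = field(default_factory=list)
    red_flags: list = field(default_factory=list)


def summarize_intake(answers: dict) -> IntakeSummary:
    # Capture history: keep only non-empty text answers, trimmed of whitespace.
    history = {
        key: value.strip()
        for key, value in answers.items()
        if isinstance(value, str) and value.strip()
    }
    # Flag gaps: required fields the patient did not answer.
    gaps = [name for name in REQUIRED_FIELDS if name not in history]
    # Surface phrases for the clinician's attention (naive substring match).
    text = " ".join(history.values()).lower()
    red_flags = [phrase for phrase in RED_FLAG_PHRASES if phrase in text]
    return IntakeSummary(history=history, gaps=gaps, red_flags=red_flags)


if __name__ == "__main__":
    summary = summarize_intake({
        "chief_complaint": "Chest pain since this morning",
        "onset": "today",
        "medications": " ",
    })
    print("Captured:", summary.history)
    print("Missing:", summary.gaps)
    print("For clinician attention:", summary.red_flags)
```

The design choice worth noting is what the sketch leaves out: it reports gaps and flagged phrases but never decides that a patient is low risk, which keeps the judgment step with the clinician.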