AI in Healthcare

The latest on artificial intelligence transforming medicine

News stories discovered and organized by an automated pipeline, covering clinical deployments, research breakthroughs, regulation, and industry developments.

Filtered by: large language models
regulation

Nature: AI Oversight Must Shift From Model Inputs to Real-World Capabilities

A Nature article argues that traditional AI oversight focused on training data, prompts, or model architecture is no longer enough. As large language models become more capable and more widely deployed, the key question is what they can do in practice and how those capabilities should be monitored over time.

Nature
artificial intelligence, oversight, capability monitoring, large language models
regulation

Large Language Models Need Ongoing Monitoring, Not One-Time Approval

A Nature piece argues that large language models require capability-based monitoring as they evolve after deployment. In healthcare, that warning is especially relevant because model behavior can change as tools, data access, and workflows change around them.

Nature
large language models, monitoring, capability-based oversight
research

Study Says Advanced AI Language Models Can Outreason Physicians on Some Medical Tasks

An EMJ report says a newer AI language model outperformed physicians on selected reasoning tasks. The result adds to a growing body of work showing that models can be strong at structured clinical logic even when real-world deployment remains uncertain. The key question is no longer whether AI can reason, but where that reasoning actually transfers.

EMJ
medical reasoning, large language models, benchmarking
technology

Open-Source Medical AI Is Getting Bigger, Cheaper, and Harder to Ignore

AntAngelMed is being introduced as a 103-billion-parameter open-source medical language model built on a sparse mixture-of-experts (MoE) architecture. The launch underscores how the medical AI race is expanding beyond closed commercial systems toward large, inspectable models that developers can adapt and study.

MarkTechPost
open source, medical language model, MoE
industry

Large Language Models May Help Patients and Providers Appeal Denied Radiology Claims

Radiology business reporting highlights a less visible use case for AI: administrative appeals. Large language models could help draft and organize appeals when claims are denied, reducing clerical burden in a heavily bureaucratic part of imaging care.

Radiology Business
large language models, radiology billing, prior authorization
technology

Grok for Patients? ARVO Talk Puts AI Health Answers Under the Microscope

A discussion at ARVO 2026 asks whether Grok or similar large language models are useful tools for patients. The answer is not simply yes or no: consumer-facing AI may improve access, but without verified content and clinical guardrails it can just as easily amplify confusion.

Ophthalmology Times
consumer AI, patient tools, large language models
technology

General-Purpose AI Is Colliding With Specialty Medicine’s Messy Reality

Modern Healthcare argues that general-purpose AI fails in specialty medicine because clinical nuance matters more than broad language fluency. That critique is increasingly central as healthcare moves from demo-friendly tools to specialty-grade use cases.

Modern Healthcare
specialty medicine, general AI, clinical workflow
research

AI Models Are Beating Doctors at Clinical Reasoning — But the Real Test Is Still Ahead

A cluster of new reports says large language models can outperform physicians on clinical reasoning and diagnostic tasks, especially in controlled case studies and emergency-department scenarios. The result is attention-grabbing, but experts are already shifting the debate from raw accuracy to reliability, workflow fit, and patient safety.

Medical Xpress
artificial intelligence, clinical reasoning, diagnosis
research

Large Language Models Outperform Physicians in Clinical Reasoning Studies, Raising the Bar for Validation

Multiple outlets are reporting that advanced language models can outperform physicians on clinical reasoning tasks and diagnostic questions. The findings are impressive, but they also sharpen the need for more realistic testing and clearer evidence of value in practice.

News-Medical
artificial intelligence, large language models, clinical decision-making
research

Fractal’s Vaidya 2.0 Raises the Bar for Healthcare AI Benchmarks

Fractal says its Vaidya 2.0 model outperforms leading frontier models on healthcare AI benchmarks, adding fresh competition in the race to build specialized clinical language systems. The claim highlights a broader trend: domain-tuned models are increasingly trying to prove they can beat general-purpose giants where it matters most.

MSN
large language models, benchmarks, clinical AI
research

A New Peer-Reviewed Study Suggests Radiologists Prefer Domain-Specific AI Over General Models

A peer-reviewed study, described as the first on AI-generated impressions, reportedly found that radiologists preferred domain-specific models over general-purpose ones. The result reinforces a growing theme in medical AI: specialization still beats broad capability when the stakes are clinical.

StreetInsider
radiology AI, large language models, specialized AI
research

Peer-Reviewed Study Finds Radiologists Prefer Domain-Specific AI Over General Models for Report Impressions

A new peer-reviewed study is offering some of the clearest evidence yet that radiologists are not simply impressed by bigger general-purpose models. Instead, they appear to prefer AI systems tuned specifically for radiology when generating report impressions. That distinction matters because it suggests clinical value will depend less on raw generative capability and more on domain adaptation, workflow fit, and trust.

PR Newswire
radiology, generative AI, large language models
research

New Studies Reinforce a Hard Truth: General-Purpose AI Still Struggles With Safe Clinical Reasoning

A cluster of recent articles points to the same uncomfortable conclusion: large language models remain unreliable when asked to make early diagnostic judgments, differential diagnoses, or other low-data clinical decisions. The findings strengthen the case for viewing general-purpose AI as a support tool, not a substitute for medical reasoning.

sciencebasedmedicine.org
large language models, diagnosis, clinical reasoning
research

New Evidence Shows Medical LLMs Still Struggle to Reason Like Clinicians

A set of reports from clinical imaging and medical AI outlets points to the same conclusion: large language models remain unreliable when asked to reason through real clinical scenarios. The findings strengthen the case for keeping LLMs in supporting roles rather than deploying them as diagnostic authorities.

diagnosticimaging.com
medical AI, clinical reasoning, radiology
research

Seven Major Language Models Tested on Radiology Exam Show Uneven Clinical Readiness

A Cureus study compared seven mainstream large language models on the 2022 American College of Radiology Diagnostic Imaging In-Training Examination. The results offer a useful reality check on how far general-purpose AI still is from dependable radiology support.

Cureus
radiology, large language models, benchmarking
research

AI Chatbots Still Struggle With Real Clinical Judgment in Ophthalmology, Nature Comparison Finds

A Nature comparison of large language model chatbots on ophthalmology case vignettes adds to the growing evidence that medical AI can sound fluent without reliably thinking like a clinician. The study underscores a widening gap between benchmark-style performance and the messy reasoning required in specialty care.

Nature
AI chatbots, ophthalmology, large language models
research

German University Clinics Signal a New Phase of Hospital AI Governance

A Nature study examining expectations and needs around large language models at Bavarian university clinics offers a useful snapshot of where hospital AI adoption is actually heading: not straight to automation, but through governance, workflow fit, and trust. The findings suggest academic medical centers are moving from curiosity to institutional design questions.

Nature
large language models, hospital operations, academic medicine
clinical

Safety Audit Finds Medical Self-Triage LLM Still Misses Red Flags

A Cureus safety audit using Japanese symptom vignettes found persistent under-triage of red-flag cases by a large language model, even when near-deterministic decoding improved reproducibility. The result reinforces a growing concern in healthcare AI: consistency is not the same as safety.

Cureus
self-triage, patient safety, large language models
clinical

Pediatric AI Is Advancing Faster Than the Evidence Base

A new AJMC report highlights the promise of large language models in pediatric care while underscoring a central constraint: safety and efficacy data remain too thin for broad clinical reliance. The pediatric setting raises a higher bar because developmental nuance, family communication, and lower tolerance for error make general-purpose AI weaknesses more consequential.

AJMC
pediatrics, large language models, patient safety
research

Nature Sets the Agenda for Healthcare LLMs Beyond the Hype Cycle

A new Nature piece on large language models in healthcare signals that the conversation is shifting from novelty to governance, workflow fit, and evidence. The article matters because it helps frame LLMs not as a single product category, but as a broad enabling layer touching clinical documentation, decision support, research, and patient communication.

Nature
large language models, healthcare AI, clinical workflows

How this works

Discover

An automated pipeline searches the web for significant AI healthcare news across clinical, research, regulatory, and industry domains.

Structure

The pipeline turns source material into concise, readable stories with categories, tags, and context that make the feed easier to scan.
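As a rough illustration of what a structured story might look like, here is a minimal Python sketch. The site does not document its data model, so the `Story` class, its field names, and `render_story` are all assumptions made for this example rather than the pipeline's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class Story:
    """Hypothetical record for one feed entry; all field names are assumptions."""
    category: str
    title: str
    summary: str
    source: str
    tags: list[str] = field(default_factory=list)


def render_story(story: Story) -> str:
    """Render a story as plain text: category, title, summary, source, tags."""
    return "\n".join([
        story.category,
        "",
        story.title,
        "",
        story.summary,
        "",
        story.source,
        ", ".join(story.tags),  # one comma-separated tag line
    ])


# Example built from one of the stories listed in the feed.
story = Story(
    category="research",
    title="Seven Major Language Models Tested on Radiology Exam Show Uneven Clinical Readiness",
    summary="A Cureus study compared seven mainstream large language models.",
    source="Cureus",
    tags=["radiology", "large language models", "benchmarking"],
)
print(render_story(story))
```

Keeping tags as a list rather than a single string makes filtering by tag straightforward and avoids ambiguity about where one tag ends and the next begins.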

Publish

Stories are deduplicated, stored, and published to this site. The pipeline runs automatically to keep coverage current.
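The deduplication method is not described here. One simple approach, shown below as a sketch under that caveat, is to treat two stories as duplicates when their titles match after normalization. The function names and the dictionary shape with a `"title"` key are hypothetical.

```python
import re


def normalize_title(title: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    lowered = title.lower()
    no_punct = re.sub(r"[^\w\s]", " ", lowered)
    return " ".join(no_punct.split())


def deduplicate(stories: list[dict]) -> list[dict]:
    """Keep the first story seen for each normalized title, preserving order."""
    seen: set[str] = set()
    unique: list[dict] = []
    for story in stories:
        key = normalize_title(story["title"])  # assumes each story has a "title"
        if key not in seen:
            seen.add(key)
            unique.append(story)
    return unique


# Hypothetical input: the first two titles differ only in case and punctuation.
stories = [
    {"title": "Pediatric AI Is Advancing Faster Than the Evidence Base"},
    {"title": "pediatric AI is advancing faster than the evidence base!"},
    {"title": "Safety Audit Finds Medical Self-Triage LLM Still Misses Red Flags"},
]
unique_stories = deduplicate(stories)
```

Title matching of this kind only catches near-identical headlines. Differently worded stories about the same study would pass through, so a pipeline that wants to merge those would need fuzzy or semantic matching instead.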