Hallucinations, Costs, and AI Limits
A radiology-AI journal thread flagged hallucination risks and shared a new checklist for classification, segmentation, and reconstruction models to guide deployment and monitoring. (x.com) At the same time, researchers warned that AI could raise overall healthcare costs unless payment models change, and public leaders are openly framing AI as a labor lever. Together, these developments complicate procurement and ROI calculations. (ldi.upenn.edu) (nurse.org)
A radiology image is supposed to be a measurement, not a guess, and that is why “hallucination” is such a loaded word in medical artificial intelligence. In this setting, a model can add, erase, or reshape a finding that looks believable on the screen but was not really in the patient data. (pubs.rsna.org) (papers.miccai.org)
Radiology systems now use artificial intelligence for several different jobs, and each job fails in a different way. One model may sort an image into “normal” or “abnormal,” another may trace the outline of a tumor pixel by pixel, and another may rebuild a scan from raw signals before a doctor ever sees it. (pubs.rsna.org 1) (pubs.rsna.org 2)
That is why radiology researchers keep pushing checklists instead of sales pitches. The Radiological Society of North America’s CLAIM update in 2024 said imaging artificial intelligence papers should spell out data sources, reference standards, evaluation methods, and other details so hospitals can tell whether a model was tested in conditions that look anything like their own. (pubs.rsna.org)
The gap between a polished paper and a real hospital is where many problems start. A 2025 Radiology review on generative artificial intelligence said failures are often discovered only after public release and named performance drift, bias, privacy, and weak vendor transparency as practical risks during deployment. (pubs.rsna.org)
Even conventional imaging tools need local testing after purchase. A 2025 thoracic imaging review said daily use requires on-site performance evaluation, information technology integration, and post-deployment monitoring, which means the real work starts after the contract is signed, not before. (pubs.rsna.org)
The market is already big enough that these warnings are not theoretical. As of May 2024, the United States Food and Drug Administration listed 882 cleared artificial intelligence-enabled medical devices, and 671 of them, or 76.1%, were for radiology practice. (pubs.rsna.org)
The financial pitch sounds simple: software is cheap to copy, so software should make care cheaper. Amol Navathe at the University of Pennsylvania’s Leonard Davis Institute argued on April 8, 2026, that health care payment rules do not work that way, because they still reward inputs like clinician time and skill while artificial intelligence can scale to huge volumes at very low marginal cost. (ldi.upenn.edu)
That mismatch can push spending up instead of down. Navathe said artificial intelligence can create new billable services, expand use across many more patients at once, and raise productivity without necessarily improving outcomes, which is a bad fit for a payment system built around human labor. (ldi.upenn.edu)
Hospital leaders are now saying the quiet part out loud. Radiology Business reported on March 31, 2026, that Mitchell Katz, chief executive of New York City Health + Hospitals, said, “We could replace a great deal of radiologists with AI at this moment” if regulators allowed it, and he pointed specifically to mammograms and X-rays as places to cut labor costs. (radiologybusiness.com)
That changes how every procurement meeting looks. If one group inside a hospital is buying artificial intelligence to reduce misses, another is buying it to increase throughput, and a third is buying it to reduce headcount, then the same model can look like a safety tool, a revenue tool, or a labor tool depending on who is holding the spreadsheet. (radiologybusiness.com) (ldi.upenn.edu) (pubs.rsna.org)
So the hard question is no longer whether a model can beat a benchmark in a paper. The hard question is whether a hospital can prove, month after month, that the tool is still accurate on its own scanners, still safe for its own patients, and still worth the money once the hidden costs of monitoring, workflow changes, and incentives are counted. (pubs.rsna.org 1) (pubs.rsna.org 2)
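To make the month-after-month check concrete, here is a minimal Python sketch of one way a hospital might track a classification model against its own locally adjudicated reads. It is an illustration under stated assumptions, not a method drawn from the cited sources: the `Case` record, the `monthly_metrics` and `flag_drift` helpers, the baseline values, and the 0.05 tolerance are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Case:
    # Hypothetical record for one study reviewed after deployment.
    month: str       # reporting period, e.g. "2026-01"
    predicted: bool  # model flagged the study as abnormal
    truth: bool      # local reference standard (assumed adjudicated read)


def monthly_metrics(cases):
    """Return {month: (sensitivity, specificity)}; a metric is None if undefined."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "tn": 0, "fp": 0})
    for c in cases:
        if c.truth:
            key = "tp" if c.predicted else "fn"
        else:
            key = "fp" if c.predicted else "tn"
        counts[c.month][key] += 1

    metrics = {}
    for month, k in sorted(counts.items()):
        positives = k["tp"] + k["fn"]
        negatives = k["tn"] + k["fp"]
        sens = k["tp"] / positives if positives else None
        spec = k["tn"] / negatives if negatives else None
        metrics[month] = (sens, spec)
    return metrics


def flag_drift(metrics, baseline_sens, baseline_spec, tolerance=0.05):
    """List months where either metric falls more than `tolerance` below baseline.

    The tolerance is an assumed, illustrative threshold, not a clinical standard.
    """
    flagged = []
    for month, (sens, spec) in metrics.items():
        sens_drop = sens is not None and baseline_sens - sens > tolerance
        spec_drop = spec is not None and baseline_spec - spec > tolerance
        if sens_drop or spec_drop:
            flagged.append(month)
    return flagged
```

Calling `flag_drift(monthly_metrics(cases), baseline_sens, baseline_spec)` returns the months that warrant a closer look. A real monitoring program would likely add confidence intervals, since a single month may contain too few positive cases for a raw drop to be meaningful, and would track segmentation or reconstruction models with different metrics entirely.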