Medical AI Struggles with Real Patients

While AI chatbots excel in lab settings with up to 95% accuracy in identifying medical issues, they still struggle with the complexity and nuance of real-world medical questions from human users. This underscores ongoing challenges in deploying machine learning systems in high-stakes, real-life medical contexts.

- In one study, an AI's diagnostic accuracy plunged from 95% in lab tests to less than 35% when used by real people asking conversational questions. The same study found that users consulting a standard search engine diagnosed the problem correctly over 40% of the time, outperforming those who used the advanced AI chatbots.
- A key reason for this performance gap is not the AI's lack of medical knowledge, but the challenge in human-AI interaction; users often struggle to know what information to provide, how to evaluate the AI's suggestions, and when to trust the output.
- A Google Health AI designed to detect diabetic retinopathy with over 90% accuracy in the lab ended up rejecting more than 20% of eye scans in real-world clinics in Thailand because rushed nurses took photos in poor lighting, conditions not accounted for in the AI's training on high-quality scans.
- AI models can also "hallucinate" or invent false information, such as creating fake medical studies to support their advice. In one documented case, a patient's reliance on a chatbot's incorrect diagnosis for a transient ischemic attack led to a significant delay in seeking proper treatment.
- A 2024 study in *JAMA Pediatrics* found that ChatGPT made incorrect diagnoses in over 80% of real-world pediatric cases presented to it, underscoring the gap between the AI's text generation and the clinical experience required for diagnosis.
- Many AI algorithms operate as "black boxes," meaning even their creators cannot fully explain the reasoning behind a specific conclusion. This lack of transparency is a major hurdle for clinical adoption and regulatory approval.
- The U.S. Food and Drug Administration (FDA) is actively developing regulations for these tools, creating distinctions between "locked" algorithms that don't change and "adaptive" ones that learn from new data, which pose a more complex safety and efficacy challenge.
- Integrating AI into hospitals is often hampered by technical issues, as many healthcare facilities rely on older, fragmented data systems that are not compatible with modern AI platforms, making it difficult to train and deploy the models effectively.
