Medical LLMs vulnerable

- A security note found medical LLMs can be influenced by tiny amounts of poisoned data during training. (x.com)
- The researchers reported adversarial poisoning at roughly 0.001% of training data was sufficient to alter medical outputs. (x.com)
- The paper urges stronger provenance checks on retrieval‑augmented data and stricter dataset hygiene for clinical models. (x.com)

A medical chatbot can be nudged toward bad clinical advice by poisoning an almost invisible slice of its training data. (nature.com)

Large language models learn by absorbing huge text collections, then predicting the next word over and over until patterns stick. In medicine, that means a model can pick up both sound guidance and planted falsehoods from web-scale data. (ncbi.nlm.nih.gov)

In a paper published online January 8, 2025 in *Nature Medicine*, researchers simulated an attack on The Pile, a widely used training dataset for language models. They reported that replacing just 0.001% of training tokens with medical misinformation made models more likely to repeat medical errors. (nature.com)

The team also found the poisoned models still performed about as well as clean models on standard open-source medical benchmarks. That means a model can look fine on familiar tests while carrying hidden bad advice into clinical answers. (ncbi.nlm.nih.gov)

The paper’s warning lands as hospitals, health systems, and software vendors push language models into drafting notes, answering patient questions, and assisting with clinical workflows. A poisoning problem at training time is hard to spot later because the falsehood is baked into the model’s learned patterns. (nature.com)

The risk is not limited to direct model training. The researchers said systems that pull outside documents at answer time, often called retrieval-augmented generation, also need stronger provenance checks, because poisoned source material can be reintroduced through the retrieval pipeline. (nature.com)

As one safeguard, the group tested a screening method based on biomedical knowledge graphs, which are structured maps of accepted relationships among entities such as drugs, diseases, and treatments. In the paper, that filter caught 91.9% of harmful content, with an F1 score of 85.7%. (ncbi.nlm.nih.gov)

The broader security field has been moving in the same direction. The Open Worldwide Application Security Project’s 2025 list of large language model risks includes data and model poisoning as a core threat, covering corrupted pretraining data, fine-tuning sets, and embeddings. (owasp.org)

The paper does not say every medical model in use today has been compromised. It says developers building clinical systems from scraped internet text need tighter dataset hygiene, clearer data lineage, and stronger checks before bad information turns into confident bedside prose. (nature.com)
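
For readers who want a feel for how knowledge-graph screening of the kind described above might work, here is a minimal Python sketch. It is not the researchers' implementation. The tiny hand-written graph, the `(subject, relation, object)` triple format, and the function names are all assumptions made for illustration, and the sketch assumes the claims have already been extracted from the text by some upstream step.

```python
# Toy knowledge-graph screen. Everything here is hypothetical:
# a real system would load a curated biomedical knowledge graph
# and extract claims from text automatically.

# Accepted (subject, relation, object) relationships, lowercase.
KNOWN_TRIPLES = {
    ("metformin", "treats", "type 2 diabetes"),
    ("amoxicillin", "treats", "bacterial infection"),
    ("insulin", "treats", "type 1 diabetes"),
}


def normalize(triple):
    """Lowercase and trim each part so matching ignores case and stray spaces."""
    return tuple(part.strip().lower() for part in triple)


def unsupported_claims(triples, graph=KNOWN_TRIPLES):
    """Return the normalized claims that have no matching entry in the graph."""
    return [normalize(t) for t in triples if normalize(t) not in graph]


def flag_passage(triples, graph=KNOWN_TRIPLES):
    """Flag a passage for review if any of its claims is unsupported."""
    return len(unsupported_claims(triples, graph)) > 0


claims = [
    ("Metformin", "treats", "Type 2 Diabetes"),       # present in the toy graph
    ("metformin", "treats", "bacterial infection"),   # absent, so it is flagged
]
print(unsupported_claims(claims))
print(flag_passage(claims))
```

In this toy version, a claim is flagged simply because it is missing from the graph, so it would also flag true statements the graph does not yet contain. The detection figures quoted above come from the paper's own method, not from a sketch like this one.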
