Angelo Tanna tests ChatGPT on patient questions

- The American Academy of Ophthalmology published an April 23 podcast in which Northwestern ophthalmologists Rukhsana Mirza and Angelo Tanna discussed a study testing ChatGPT-4o on real patient eye-care questions.
- The underlying study reviewed 165 ophthalmology questions from Epic MyChart. Ophthalmologists rated ChatGPT-4o answers, generated with and without a sixth-grade reading-level prompt, as accurate and complete, incomplete, or unacceptable.
- The episode lands as ophthalmology groups keep testing whether chatbots can safely handle patient education without replacing clinician judgment. (aao.org)

Patients are already asking chatbots about eye problems, and ophthalmologists are now measuring how often the answers hold up. On April 23, the American Academy of Ophthalmology published a podcast with Drs. Rukhsana Mirza and Angelo Tanna on that question. (aao.org)

Mirza and Tanna discussed their Ophthalmology Science paper on ChatGPT-4o responses to ophthalmology patient questions. The paper lists seven authors from Northwestern University Feinberg School of Medicine in Chicago, including Angelo P. Tanna and Rukhsana G. Mirza. (aao.org) (pmc.ncbi.nlm.nih.gov)

The study used 165 ophthalmology-related questions that patients had submitted through Epic MyChart at a single institution. The questions came from glaucoma, retina, and cornea clinics, and nonclinical questions were excluded. (pmc.ncbi.nlm.nih.gov)

Researchers entered each question into ChatGPT-4o twice. One run had no reading-level limit, and the second asked the model to answer at a sixth-grade reading level. (pmc.ncbi.nlm.nih.gov)

Two ophthalmologist reviewers graded the chatbot’s answers and follow-up conversations as “accurate and complete,” “incomplete,” or “unacceptable.” A third subspecialist reviewer broke ties when the first two disagreed. (pmc.ncbi.nlm.nih.gov)

That setup gets at the central problem with patient-facing chatbots in medicine. A smooth answer can still miss a key warning sign, leave out next steps, or sound definitive when a doctor would want an exam first. (pmc.ncbi.nlm.nih.gov) (aao.org)

The paper focused on readability too, using Flesch-Kincaid and other reading-level measures. That matters because patient messages that are medically correct but written above a patient’s reading level can still fail as education. (pmc.ncbi.nlm.nih.gov)

This was not a test of whether ChatGPT can diagnose from an eye exam or replace a clinic visit. It was a study of text answers to patient-submitted questions, using real messages from an ophthalmology practice. (pmc.ncbi.nlm.nih.gov)

The Academy has been covering the same question for several years across articles and podcasts, including earlier pieces on ChatGPT and eye-disease information, board-style questions, and clinical use in ophthalmology. (aao.org 1) (aao.org 2) (aao.org 3)

The new podcast does not present chatbots as stand-alone eye doctors. It presents them as tools being tested against real patient questions, with ophthalmologists still doing the scoring. (aao.org)
