Gerard Sans: LLMs fake agency

- Gerard Sans ran a large set of experiments showing that large language models often act as though they hold knowledge or make decisions without properly using the evidence in front of them.
- In a 25,000-experiment sweep, Sans reported that 68% of runs ignored contrary evidence, 71% showed no belief updates, and scaffolding explained only 1.5% of output variance.
- Those failure rates point to systematic reasoning gaps that will complicate deploying agentic systems unless models are forced to check evidence and update beliefs. (x.com)
