On‑device LLM safeguards bypassed
Researchers published a prompt-injection technique that can bypass Apple’s on-device LLM safeguards, showing that local model controls can be subverted even without remote execution. A separate CERT-In advisory also flagged multiple high-severity vulnerabilities in Apple devices that could allow compromise with minimal user interaction, reinforcing that patching remains urgent and that the trust boundaries around local AI are still unsettled. (dataconomy.com) (the420.in)
Apple put a small language model directly on the iPhone, iPad, and Mac so apps could ask it for help without sending every request to a remote server. That local setup was supposed to act more like a locked room than an open website. (rsaconference.com)
That locked room still has a mailbox. Apple’s on-device model is reached through the Foundation Models framework application programming interface (API), which means apps do not touch the model’s internal weights directly and the operating system sits in the middle. (rsaconference.com)
Apple also wrapped that mailbox in filters. RSAC researchers say the system checks both what goes into the local model and what comes out, trying to block malicious instructions on the way in and dangerous text on the way out. (rsaconference.com)
A prompt injection attack is the language-model version of slipping a fake note into a stack of real paperwork. The model reads the attacker’s instruction as if it belongs with the user’s request, then follows the wrong boss. (securityweek.com)
The RSAC team says it paired two tricks. The first was “Neural Exec,” which uses nonsense-looking text as a reusable trigger that can push the model toward an attacker-chosen task. (securityweek.com)
The second trick used a Unicode text control called right-to-left override. The researchers wrote the malicious output backward, then used that control so the text would still render in normal reading order while the filters saw something different. (securityweek.com)
In RSAC’s test, that combination succeeded on 76 of 100 random prompts. The researchers said the attack could force the local model to produce attacker-directed results and interact with data available to Apple Intelligence-enabled apps, including health and fitness information and family videos. (rsaconference.com)
This was not a case of cracking open the phone and stealing the model itself.
The point was narrower and more unsettling: even when the model stays on the device and behind operating-system controls, the text channel into it can still be manipulated. (rsaconference.com) (securityweek.com)
The scale is not tiny. RSAC estimated there were at least 200 million Apple Intelligence-capable devices in consumers’ hands by December 2025, and it said between 100,000 and 1 million Apple customers were already using vulnerable apps before Apple’s operating-system updates. (rsaconference.com)
At almost the same time, Apple users got a more familiar reminder that local software still needs urgent patching. Apple’s security releases page lists iOS 26.4, iPadOS 26.4, and macOS 26.4 from March 24, 2026, plus iOS 18.7.7 and iPadOS 18.7.7 updates on March 24 and April 1, 2026, for older devices. (support.apple.com)
Apple’s iOS 26.4 security notes include issues that could let an app access sensitive user data, let malicious web content crash a process, and let an attacker in a privileged network position intercept traffic. Apple updated the entry for one audio-related flaw again on April 9, 2026. (support.apple.com)
India’s Computer Emergency Response Team, known as CERT-In, separately warned on April 10, 2026, that Apple vulnerabilities could expose users to unauthorized access, remote code execution, denial of service, spoofing, and full device compromise, and it said iOS and iPadOS versions before 26.4 were affected. (rediff.com)
Put those two stories together and the old security boundary looks thinner than it used to. “On device” now means the same phone can hold your private data, run the model that reasons over it, and still be one bad prompt or one missed patch away from doing the wrong thing. (rsaconference.com) (support.apple.com) (rediff.com)
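For readers who want to see why the right-to-left override trick confuses a text filter, here is a minimal Python sketch. It is an illustration under stated assumptions, not a reconstruction of Apple’s filters or of the RSAC exploit: the blocklist, the function names, and the sample phrase are all hypothetical. It shows only the general idea that a string stored backward and wrapped in the override character can display in normal order while a simple substring check sees the reversed characters.

```python
RLO = "\u202e"  # U+202E RIGHT-TO-LEFT OVERRIDE
PDF = "\u202c"  # U+202C POP DIRECTIONAL FORMATTING (ends the override)

# Hypothetical blocklist; a real filter would be far more elaborate.
BLOCKLIST = ("ignore previous instructions",)

# Explicit bidirectional controls: embeddings/overrides (U+202A..U+202E)
# and isolates (U+2066..U+2069).
BIDI_CONTROLS = {chr(c) for c in (*range(0x202A, 0x202F), *range(0x2066, 0x206A))}


def naive_filter_blocks(text: str) -> bool:
    """Return True if the raw stored text contains a blocked phrase."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in BLOCKLIST)


def hide_with_rlo(payload: str) -> str:
    """Store the payload backward, wrapped in an override so that a
    bidi-aware renderer displays it in the original reading order."""
    return RLO + payload[::-1] + PDF


def contains_bidi_controls(text: str) -> bool:
    """Return True if the text carries any explicit bidi control character."""
    return any(ch in BIDI_CONTROLS for ch in text)


def hardened_filter_blocks(text: str) -> bool:
    """One possible mitigation (an assumption, not Apple's fix):
    reject any text with bidi controls before the phrase check."""
    return contains_bidi_controls(text) or naive_filter_blocks(text)


if __name__ == "__main__":
    payload = "ignore previous instructions"
    hidden = hide_with_rlo(payload)
    print(naive_filter_blocks(payload))    # plain payload
    print(naive_filter_blocks(hidden))     # same payload, stored backward
    print(hardened_filter_blocks(hidden))  # bidi-aware check
```

In this sketch the naive check catches the plain phrase but misses the disguised one, because the stored characters are in reverse order. Rejecting bidi controls outright is a blunt choice that would also reject legitimate text in some right-to-left languages, so a production filter would more likely normalize or strip the controls before checking.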