Apple eases on‑device AI access
Apple acknowledged a Core ML kernel bug affecting Kokoro TTS and said the fix will ship in the next macOS release, an unusually direct feedback loop with a developer. (x.com) Separately, Apple approved a third-party driver for AMD and NVIDIA eGPUs on Macs, which could let researchers and engineers run larger AI workloads while staying in the Apple ecosystem. (x.com)
Apple made two quiet moves this week that point in the same direction. It is getting a little easier to do serious AI work on a Mac without routing everything through Apple’s own stack.
The first move was small, but revealing. A developer working on Kokoro, an open-weight text-to-speech model with 82 million parameters, hit a Core ML kernel bug while trying to run the model cleanly on Apple hardware. Kokoro is exactly the kind of model Apple likes to talk about: compact, fast, and local. It has been converted to Core ML to run on Apple Silicon and the Apple Neural Engine, with fixed-shape pipelines designed to avoid the framework’s more brittle paths. (github.com)
What made this episode unusual was not the bug itself. Core ML developers regularly work around opaque runtime failures, shape constraints, and hardware-specific regressions. Apple’s own developer forums are full of reports about Core ML crashes, NaNs, and kernel-level issues that only appear on certain macOS versions or devices. (developer.apple.com)
What stood out was the response. According to the developer’s post, Apple acknowledged the Core ML kernel bug affecting Kokoro TTS and said the fix would ship in the next macOS release. That is a very direct feedback loop by Apple standards. The company usually exposes machine learning infrastructure as a polished product, not as a conversation. Here, a developer hit a low-level problem in public, and Apple answered in a way that sounded less like platform theater and more like normal engineering.
That matters because Kokoro is not a giant frontier model. It is a lightweight speech model meant to run on-device. If even that kind of workload can still trip over framework bugs, then “on-device AI” remains more fragile than Apple’s demos suggest. The interesting part is that Apple seems to know this, and is now smoothing the path in public, one bug at a time.
The second move was larger.
Apple approved a third-party driver extension that lets AMD and NVIDIA external GPUs work with Apple Silicon Macs over Thunderbolt or USB4, using Tiny Corp’s TinyGPU software. The support is not for gaming or desktop graphics. It is for compute. The tinygrad documentation says the setup is for running AMD RDNA3+ or NVIDIA Ampere+ GPUs with tinygrad, and Apple’s approval means users no longer need to disable System Integrity Protection just to get the driver installed. (docs.tinygrad.org)
That last detail is the hinge. Before this, getting external GPUs to work on Apple Silicon meant hacks such as disabling System Integrity Protection. Apple’s modern driver model pushes hardware support into user-space system extensions and DriverKit, with entitlements Apple must approve. In other words, this kind of access does not happen by accident. Apple had to say yes. (developer.apple.com)
The caveat is just as important as the breakthrough. These eGPUs do not turn a Mac into a normal NVIDIA workstation. They are not exposed as general-purpose graphics acceleration for macOS apps. The current path is aimed at AI inference through TinyGPU and tinygrad, with AMD compiler tooling running natively and NVIDIA compilation routed through Docker. (docs.tinygrad.org)
Still, the practical effect is obvious. Apple spent years narrowing what kinds of compute belonged on a Mac. Now it is fixing Core ML bugs for outside developers and approving third-party GPU drivers for AI workloads. Tiny Corp says a Mac mini with an M4 connected to a Radeon RX 7900 XTX reached 18.5 tokens per second on Qwen 3.5 27B over Thunderbolt. That is not elegant. It is not especially Apple-like. It is concrete. (rits.shanghai.nyu.edu)