Apple’s on‑device AI push

A reverse engineer has cracked parts of Apple’s Neural Engine, unlocking roughly 10% faster C API access on M1–M4 chips; the code is available in a public GitHub repo. (x.com) Apple is also reportedly using Google’s Gemini to distill smaller models for on-device AI, a sign that the company is pairing its hardware with large-model distillation to shrink models enough to run locally. (x.com)

The ANE project was published by developer Manjeet Singh (GitHub handle maderix), and the repo has seen thousands of stars and active commits since its release. The code deliberately bypasses CoreML and talks to the private _ANEClient/_ANECompiler interfaces and the MIL format to emit compute graphs that run forward and backward passes directly on the Apple Neural Engine. (maderix Substack (maderix.substack.com); github.com/maderix/ANE (github.com)) Benchmarks in the public writeups and repo claim sustained throughput of about 1.78 TFLOPS and training steps of around 9.3 milliseconds on M4 hardware, reported while the project trained a ~109M-parameter Llama2-style transformer as a proof of concept. (RITS article (rits.shanghai.nyu.edu); github.com/maderix/ANE (github.com)) Project history and merged PRs record targeted operation offloads: one merge noted a ~16% speedup after moving the classifier, softmax, and rmsnorm backward passes to the ANE. The repo also added INT8 quantization paths that the maintainers say produce up to ~1.88× ANE throughput via quantize/dequantize. (GitHub commits page (github.com); ANE bridge folder (github.com))

Separately, reporting from The Information and follow-ups by MacRumors and AppleInsider on March 25, 2026 say Google granted Apple “complete access” to Gemini inside Google-run data centers and that Apple can use that access to distill smaller, task-specific models. (The Information (theinformation.com); MacRumors (macrumors.com); AppleInsider (appleinsider.com)) The Information’s reporting adds that distillation can capture Gemini’s answers and internal reasoning traces to train “student” models for on-device use. It also notes that Apple is pursuing that work in parallel with its Apple Foundation Models team ahead of expected AI feature updates at WWDC in June 2026. (The Information (theinformation.com); MacRumors (macrumors.com))

Taken together, the public ANE codebase documents concrete hardware access paths and measured performance on M-series Neural Engines, while the Apple–Google arrangement reported by The Information gives Apple the model-level material to distill smaller Gemini-derived models for on-device use. (The Information / MacRumors (macrumors.com))
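For readers unfamiliar with the quantize/dequantize step mentioned for the repo’s INT8 paths, the general idea can be sketched in a few lines of plain Python. This is a minimal illustration of symmetric per-tensor INT8 quantization, not the project’s code: the function names quantize_int8 and dequantize_int8, the single shared scale, and the sample weights are all assumptions made for the example, and the repo may well use a different scheme.

```python
def quantize_int8(values):
    """Map floats to integer codes in [-127, 127] using one shared scale.

    Assumption for this sketch: symmetric, per-tensor quantization.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    # Avoid dividing by zero when every value is 0.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate floats from the INT8 codes."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.82, -1.27, 0.003, 0.4]  # made-up sample values
    codes, scale = quantize_int8(weights)
    restored = dequantize_int8(codes, scale)
    for w, c, r in zip(weights, codes, restored):
        print(f"{w:+.4f} -> {c:+4d} -> {r:+.4f}")
```

Each restored value differs from the original by at most about half of the scale, which is the precision traded for smaller, faster integer arithmetic. Any throughput gain, such as the ~1.88× figure the maintainers cite, would come from the hardware running INT8 operations, not from this arithmetic itself.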
