Apple begins offering free on-device LLMs to developers — no API key required
- Apple opened its Foundation Models framework to app developers, letting apps call the on-device Apple Intelligence language model with no API key.
- The framework ships in iOS 26, iPadOS 26, macOS 26, and visionOS 26, and Apple says inference is free, private, and offline-capable.
- That shifts Apple’s AI pitch from cloud add-ons toward local app features developers can ship without paying per-token fees.
Apple is doing something unusually direct with AI. It is letting developers call the same on-device language model behind Apple Intelligence from inside their own apps, with no API key, no per-request billing, and no internet requirement for the core model path.

That matters because most app developers have learned to think about AI as a cloud service first and a product feature second. Apple is trying to flip that.

### What actually launched?

The thing is called the Foundation Models framework. It gives developers a native Swift API to Apple’s on-device large language model, so an app can summarize, extract structured data, generate text, or call app-defined tools without sending user data to a remote server by default. Apple says developers can get started with just a few lines of code, and the framework is part of the Apple Intelligence stack now exposed to third-party apps. (developer.apple.com)

### Which platforms get it?

This is tied to Apple’s newest OS cycle, not older public releases. The framework is listed for iOS 26, iPadOS 26, macOS 26, Mac Catalyst 26, and visionOS 26. Apple also framed it as part of the broader developer push unveiled at WWDC 2025, where it said third-party apps would get direct access to the on-device foundation model powering Apple Intelligence. (developer.apple.com)

### Why is “no API key” a big deal?

Because that phrase is really shorthand for “no metered cloud dependency.” Most popular AI app features today ride on hosted APIs, which means usage bills, latency, outages, rate limits, and privacy tradeoffs. Apple’s pitch is the opposite: inference is free of cost to the developer, runs locally, and keeps […] (developer.apple.com) […] make economic sense, especially small utilities and niche tools that cannot support ongoing token bills. (apple.com)

### What can developers make with it?

Apple is steering developers toward bounded, practical features, not giant open-ended chatbots.
The docs describe text generation, content transformation, and task execution, including tool calling into app code. Apple’s examples are things like generating su[…] (apple.com) […]e app I’m already using,” not “replace the whole internet with a chatbot.” (developer.apple.com)

### Is this the same as MLX?

No, and that distinction matters. Foundation Models is Apple exposing its own built-in on-device model to app developers. MLX is Apple’s machine-learning framework for working with models on Apple silicon more broadly, including running and fine-tuning external open models on Macs. Apple’s WWDC sessions around ML[…] (developer.apple.com) […] Foundation Models framework is the productized, app-facing API for Apple Intelligence itself. (developer.apple.com)

### What’s the catch?

The catch is capability and reach. This is an on-device model, so developers get privacy and speed, but not unlimited scale. Features depend on supported hardware and the new OS versions, and developers still need to design around what a smaller local model can reliably do. Apple is not offering a universal replacement for fr[…] (developer.apple.com) […] layer for everyday app tasks. That is a different product category. (developer.apple.com)

### Why does this matter beyond Apple apps?

Because it quietly changes the default architecture for a lot of software. If AI can run locally, instantly, and for free at inference time, developers can stop treating intelligence as an expensive premium feature that needs a server budget attached. That nudges the ecosystem toward offline-first, privacy-first app design, which is exactly where Apple wants the conversation. (apple.com)

### Bottom line?

Apple is not winning the “biggest model” race here. It is trying to win the “most shippable AI” race. For developers, that may be the more important move.