Apple releases Python SDK for on-device AI
Apple is advancing its on-device AI capabilities with the release of Python bindings for its Foundation Models framework. The new SDK, called "apple-fm-sdk," gives developers access to the on-device models at the core of Apple Intelligence. The move suggests a strategic push toward privacy-preserving, low-latency AI as a standard feature across the company's ecosystem.
- The on-device foundation model accessed by the SDK is a ~3 billion parameter language model, significantly smaller than large cloud-based models, and is optimized to run efficiently on Apple silicon's Neural Engine. Apple has reported a time-to-first-token latency of about 0.6 milliseconds per prompt token and a generation rate of 30 tokens per second on an iPhone 15 Pro.
- The Python SDK provides a Pythonic interface to the same Foundation Models framework that was introduced for Swift with iOS 26 and macOS Tahoe. It allows developers to evaluate Swift app features, perform batch inference, and analyze results from a Python environment.
- A key feature of the SDK is "guided generation," which lets developers constrain the model's output to specific, structured formats such as JSON schemas or Python classes. This yields predictable, type-safe data for applications and avoids manual parsing of unstructured text.
- The framework supports "tool calling," which lets the model interact with external data and APIs by calling Python functions you define. This allows the AI to perform actions and integrate real-world information into its responses.
- Unlike Core ML, which is a general-purpose inference engine for running a variety of trained models, the Foundation Models framework provides direct API access to Apple's specific pre-trained language model. Core ML has been available since iOS 11, whereas the Foundation Models framework requires newer devices capable of running Apple Intelligence.
- Apple's AI models are trained using its open-source AXLearn framework, which is built on JAX and XLA for high efficiency and scalability across various hardware platforms.
- This initiative is part of a broader strategic push by Apple, which includes a significant overhaul of Siri set for 2026 and an expansion of the Foundation Models team with former Google researchers.
- The on-device model can be further customized using LoRA (Low-Rank Adaptation) adapters. Developers can use the Foundation Models Adapter Training Toolkit to fine-tune the model for specific tasks.
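
To put the latency figures quoted above in concrete terms, here is a rough back-of-the-envelope estimate in Python. It assumes a simple linear model (prompt processing cost proportional to prompt length, constant generation rate), which is an illustration and not a benchmark; the function name `estimate_latency` and the example token counts are made up for this sketch.

```python
from typing import Tuple

# Figures quoted in the article for an iPhone 15 Pro.
MS_PER_PROMPT_TOKEN = 0.6   # time-to-first-token cost per prompt token, in ms
TOKENS_PER_SECOND = 30.0    # generation rate

def estimate_latency(prompt_tokens: int, output_tokens: int) -> Tuple[float, float]:
    """Return (time_to_first_token_s, total_s) under a simple linear model."""
    ttft_s = prompt_tokens * MS_PER_PROMPT_TOKEN / 1000.0
    generation_s = output_tokens / TOKENS_PER_SECOND
    return ttft_s, ttft_s + generation_s

if __name__ == "__main__":
    # Illustrative workload: a 1,000-token prompt and a 150-token answer.
    ttft, total = estimate_latency(prompt_tokens=1000, output_tokens=150)
    print(f"time to first token: {ttft:.2f} s, total: {total:.2f} s")
```

Under these assumptions, a 1,000-token prompt would wait roughly 0.6 seconds for the first token, and a 150-token reply would take about 5 more seconds to generate.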
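
The idea behind guided generation can be sketched without the SDK itself. The example below does not call apple-fm-sdk, whose actual API is not described in this article; it only illustrates the end result, where output arrives as a typed Python object instead of free text. The `Recipe` class, the `parse_structured` helper, and the stand-in JSON string are all hypothetical.

```python
import json
from dataclasses import dataclass, fields

@dataclass
class Recipe:
    """Hypothetical target schema the model output should conform to."""
    title: str
    servings: int
    ingredients: list

def parse_structured(raw_json: str, schema):
    """Check a JSON string against a dataclass and return a typed instance."""
    data = json.loads(raw_json)
    kwargs = {}
    for field in fields(schema):
        if field.name not in data:
            raise ValueError(f"missing field: {field.name}")
        value = data[field.name]
        # Only plain types (str, int, list) are checked in this sketch.
        if not isinstance(value, field.type):
            raise TypeError(f"{field.name} should be {field.type.__name__}")
        kwargs[field.name] = value
    return schema(**kwargs)

if __name__ == "__main__":
    # Stand-in for model output; a real run would obtain this from the model.
    stand_in = '{"title": "Pancakes", "servings": 4, "ingredients": ["flour", "milk", "eggs"]}'
    recipe = parse_structured(stand_in, Recipe)
    print(recipe.title, recipe.servings, recipe.ingredients)
```

With guided generation, the constraint is applied while the model generates, so a validation step like this one should not be needed; the sketch only shows the kind of typed result an application would receive.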
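
Tool calling generally follows a register-and-dispatch pattern, which can be sketched in plain Python. This is a generic illustration and not the apple-fm-sdk API: the `tool` decorator, the `dispatch` function, the `get_weather` tool with its canned data, and the JSON shape of the tool request are all assumptions made for the example.

```python
import json
from typing import Callable, Dict

# Registry mapping tool names to the Python functions that implement them.
TOOLS: Dict[str, Callable[..., str]] = {}

def tool(fn: Callable[..., str]) -> Callable[..., str]:
    """Register a function so it can be looked up by name."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def get_weather(city: str) -> str:
    # Hypothetical tool with canned data; a real one would query an API.
    canned = {"Cupertino": "sunny, 22 C"}
    return canned.get(city, "unknown")

def dispatch(tool_call_json: str) -> str:
    """Run the tool named in a (simulated) model request and return its result."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))

if __name__ == "__main__":
    # Stand-in for a tool request emitted by the model.
    request = '{"name": "get_weather", "arguments": {"city": "Cupertino"}}'
    print(dispatch(request))
```

In a real session, the function's return value would be handed back to the model, which then folds it into its final response.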
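
For readers unfamiliar with LoRA, the general technique (as commonly formulated, not a description of Apple's toolkit internals) freezes a pretrained weight matrix and learns a small low-rank correction to it. The symbols below are the conventional ones and are not taken from Apple's documentation:

```latex
W' = W_0 + \frac{\alpha}{r}\, B A,
\qquad B \in \mathbb{R}^{d \times r},\quad
A \in \mathbb{R}^{r \times k},\quad
r \ll \min(d, k)
```

Only $A$ and $B$ are trained, so the adapter holds $r(d + k)$ parameters instead of $dk$. As a purely illustrative case with $d = k = 2048$ and $r = 16$, that is 65,536 trainable values against 4,194,304 in the full matrix, or about 1.6 percent.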