New On-Device AI Tools Emerge for Apple Silicon

The ecosystem for on-device AI on Apple Silicon is expanding with new developer tools. Prince Canuma launched the MLX-Audio-Swift SDK, which enables real-time, on-device voice AI functions such as text-to-speech and speech-to-text. Meanwhile, startup TryMirai announced inference technology for Apple Silicon that it claims outperforms existing solutions like MLX and llama.cpp.

- Apple's MLX is an open-source array framework for machine learning developed by its research division, specifically designed to leverage the unified memory architecture of Apple Silicon. This allows operations to run on the CPU or GPU without data transfer, a key hardware-software optimization.
- The MLX-Audio-Swift SDK is a modular library, enabling developers to import only necessary components for tasks like Text-to-Speech (TTS), Speech-to-Text (STT), or Voice Activity Detection (VAD). It facilitates real-time audio generation by supporting streaming and automatically downloads models from the Hugging Face Hub.
- The developer behind the SDK, Prince Canuma, is a prolific contributor to the MLX ecosystem, having adapted over 1,000 models for the framework, significantly expanding the range of open, multimodal AI tools available to run on Apple devices.
- TryMirai's performance claims stem from building a proprietary inference engine from the ground up, specifically for Apple Silicon, rather than using cross-platform abstractions. This hardware-aware approach allows them to control the full stack from model optimization to memory management.
- In a recent seed funding announcement, TryMirai claimed its engine can be up to 37% faster in token generation and 59% faster in prefill processing compared to existing solutions on certain model-device pairs.
- These tools fit into Apple's broader strategy of making on-device AI a core OS capability, giving developers access to foundation models through Swift and Xcode to ensure privacy, low latency, and offline functionality.
- The emergence of specialized SDKs and high-performance inference engines intensifies the competition with established frameworks like llama.cpp, which has long been a benchmark for efficient on-device performance on Apple hardware. Comparative studies show MLX throughput generally outperforming llama.cpp on Apple Silicon.
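
To put the quoted speedups in context, the sketch below works through what "up to 37% faster in token generation and 59% faster in prefill" could mean for a single request. It rests on two assumptions that are not in the announcement: that "X% faster" means throughput multiplied by 1 + X/100, and a purely hypothetical baseline of 500 tokens/s for prefill and 40 tokens/s for generation. The function name and all numbers other than 37% and 59% are illustrative, not measurements.

```python
def request_time(prompt_tokens, output_tokens, prefill_tps, decode_tps):
    """Seconds to process a prompt (prefill) and then generate the output (decode)."""
    return prompt_tokens / prefill_tps + output_tokens / decode_tps


# Hypothetical baseline throughput in tokens per second (assumed, not measured).
BASELINE_PREFILL_TPS = 500.0
BASELINE_DECODE_TPS = 40.0

# Best-case figures from the announcement, read as throughput multipliers (assumption).
PREFILL_SPEEDUP = 1.59
DECODE_SPEEDUP = 1.37

# Hypothetical request: 1,000 prompt tokens in, 200 tokens out.
PROMPT_TOKENS = 1000
OUTPUT_TOKENS = 200

baseline_s = request_time(
    PROMPT_TOKENS, OUTPUT_TOKENS, BASELINE_PREFILL_TPS, BASELINE_DECODE_TPS
)
faster_s = request_time(
    PROMPT_TOKENS,
    OUTPUT_TOKENS,
    BASELINE_PREFILL_TPS * PREFILL_SPEEDUP,
    BASELINE_DECODE_TPS * DECODE_SPEEDUP,
)

# Fraction of wall-clock time saved on this particular request.
time_saved = 1.0 - faster_s / baseline_s

print(f"baseline: {baseline_s:.2f} s")
print(f"with claimed speedups: {faster_s:.2f} s")
print(f"time saved: {time_saved:.1%}")
```

Under these assumptions the end-to-end saving depends on the mix of prompt and output tokens: long prompts with short answers benefit mostly from the prefill figure, while long generations are dominated by the token-generation figure. Because the claim is "up to" and limited to certain model-device pairs, real results may be lower.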
