App‑LLM offline event triage
Developers demonstrated an 'app‑LLM' pattern that analyzes events locally: for example, Sentry‑style crash and event triage runs on‑device, so the app calls a local model instead of sending telemetry to the cloud. (x.com)
A large language model is a text engine: an app hands it raw text, and the model returns a summary, label, or next step. Developers are now wiring that engine directly into apps so crash logs and other event data can be analyzed on the device instead of being shipped to a cloud service. (apple.com)

Apple said in September 2025 that its Foundation Models framework lets developers call the on-device model behind Apple Intelligence from inside their apps, with offline use and no per-request inference cost. Apple’s developer docs say the model can also call app-defined tools, which is the pattern shown in local triage demos that turn logs into structured diagnoses. (apple.com) (developer.apple.com)

Google is pushing a similar path on Android. Its Gemini Nano docs say ML Kit GenAI APIs run through Android’s AICore service, process data locally, and are aimed at low-latency, privacy-sensitive tasks that can keep working offline. (developer.android.com)

That changes the economics of “triage,” the first pass that sorts incoming problems by severity and likely cause. Instead of sending every crash, support ticket, or security alert to a hosted model, an app can ask a local model to summarize the event, classify it, and decide whether anything needs to leave the device at all. (developer.android.com) (developer.apple.com)

The privacy angle is straightforward: local inference keeps raw telemetry closer to the user and can reduce how much sensitive data is transmitted. Ollama, a popular local-model runtime, says prompts and responses processed locally stay on the device and are not collected or transmitted by the company. (ollama.com 1) (ollama.com 2)

Developers have been building the same pattern in open source for security operations and incident response. Recent GitHub projects describe local alert triage systems that pull logs or alerts into a local model, produce plain-English summaries, and keep the data on the machine rather than forwarding it to an external application programming interface. (github.com 1) (github.com 2)

The tradeoff is hardware and model quality. Apple limits its framework to devices that support Apple Intelligence, and Google’s on-device features depend on AICore and compatible Android hardware, so the “analyze everything locally” approach is not available on every phone or laptop. (developer.apple.com) (developer.android.com)

What the demos show is a shift in where the first layer of software intelligence runs. For event triage, the model is starting to sit inside the app itself, close enough to the data to decide what matters before the cloud ever sees it. (apple.com) (developer.android.com)
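
For readers who want to see the shape of that first pass, here is a minimal Python sketch of the summarize, classify, and decide loop. It is an illustration under stated assumptions, not code from any of the demos or frameworks mentioned: the names `triage_event`, `Triage`, and `stub_model` are hypothetical, the three severity labels are made up for the example, and the "model" is a keyword-based stand-in so the sketch runs without any runtime installed. A real app would pass in a function that calls whichever on-device model it has available.

```python
import json
from dataclasses import dataclass
from typing import Callable

# Hypothetical severity labels for this sketch; not taken from any framework.
SEVERITIES = ("low", "medium", "high")

PROMPT = (
    "You triage application events. Reply with JSON only, using the keys "
    '"summary" (one sentence) and "severity" (low, medium, or high).\n\n'
    "Event:\n{event}"
)


@dataclass
class Triage:
    summary: str
    severity: str
    escalate: bool  # True means the event should leave the device


def triage_event(event_text: str, local_model: Callable[[str], str]) -> Triage:
    """Ask a local model for a summary and severity, then decide on escalation."""
    raw = local_model(PROMPT.format(event=event_text))
    try:
        data = json.loads(raw)
        summary = str(data["summary"])
        severity = str(data["severity"]).lower()
    except (json.JSONDecodeError, KeyError, TypeError):
        # Small models do not always return valid JSON; fail toward escalation.
        summary, severity = "unparsed model output", "high"
    if severity not in SEVERITIES:
        severity = "high"
    return Triage(summary, severity, escalate=(severity == "high"))


def stub_model(prompt: str) -> str:
    """Stand-in for an on-device model: a keyword rule, not real inference."""
    event = prompt.split("Event:\n", 1)[1]
    first_line = event.splitlines()[0] if event.strip() else ""
    severity = "high" if ("Fatal" in event or "SIGSEGV" in event) else "low"
    return json.dumps({"summary": first_line[:80], "severity": severity})


crash = "Fatal signal 11 (SIGSEGV) in libfoo.so\n  at frame 0"
result = triage_event(crash, stub_model)
print(result.severity, result.escalate, result.summary)
```

The design choice worth noting is the fallback: when the model's reply cannot be parsed or uses an unknown label, the sketch escalates rather than silently dropping the event, on the assumption that a missed crash costs more than an extra upload.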