1-bit LoRA on phones

QVAC Fabric is a newly released cross-platform 1-bit LoRA fine-tuning framework based on BitNet. It runs on iPhone, Pixel and Galaxy phones, and its developers claim ~90% less memory use and an 11x speedup, demonstrating LoRA-style fine-tuning without a GPU. That could move small-model personalization onto the device itself for edge or private deployments. (x.com)

QVAC published a Hugging Face community post on March 17, 2026 naming BitNet b1.58 as the implementation target and listing contributors including Subash SN and Akshay Nambiar. (huggingface.co) The reported benchmarks show a 125M-parameter BitNet fine-tuned in roughly 10 minutes on a Samsung S25, and a 1B-parameter fine-tune (~18k tokens) completed in about 1 hour 18 minutes on the S25 and 1 hour 45 minutes on an iPhone 16, with experiments extending fine-tuning to models as large as 13B parameters on the iPhone 16. (huggingface.co) In QVAC's memory comparisons, BitNet-1B uses up to 77.8% less VRAM than Gemma-3-1B (F16) and 65.6% less than Qwen3-0.6B (F16), while BitNet-13B uses 2,789 MB, about 29% less VRAM than Qwen3-4B quantized to 4 bits (Q4). (huggingface.co)

The project's research repo and documentation live in tetherto/qvac-rnd-fabric-llm-bitnet on GitHub and point to a companion qvac-fabric-llm.cpp repo that provides multi-platform binaries plus Vulkan and Metal backends for Adreno, Mali and Apple Bionic GPUs. (github.com)

Tether's press release of March 17, 2026 frames QVAC Fabric as an “edge-first” runtime and LoRA fine-tuning framework for Microsoft's BitNet. (tether.io) Outlets including Android Headlines and Blockonomi highlighted on-device fine-tuning on consumer hardware and repeated QVAC's claims for smartphones and heterogeneous consumer GPUs. (androidheadlines.com)
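For a rough sense of where savings of this size can come from, the back-of-envelope sketch below compares weight storage alone at 16 bits per weight with ternary weights (log2(3), about 1.58 bits, or 2 bits if each weight is packed into two bits). It also works backward from the reported 2,789 MB and "about 29% less" to the baseline those two figures imply. The 1,000,000,000 parameter count and the helper names are illustrative assumptions, not taken from QVAC's code, and the estimate ignores activations, optimizer state and LoRA adapters, so it is not a reconstruction of how QVAC measured its numbers.

```python
import math

def weight_megabytes(n_params: int, bits_per_weight: float) -> float:
    """Storage for the weights alone, in MB (10**6 bytes)."""
    return n_params * bits_per_weight / 8 / 1e6

def percent_saved(smaller: float, larger: float) -> float:
    """How much less `smaller` is than `larger`, as a percentage."""
    return 100.0 * (1.0 - smaller / larger)

def implied_baseline(reported_mb: float, percent_less: float) -> float:
    """Baseline usage implied by 'reported_mb is percent_less % lower'."""
    return reported_mb / (1.0 - percent_less / 100.0)

# Assumption: a nominal 1B-parameter model (hypothetical round number).
N_PARAMS = 1_000_000_000
TERNARY_BITS = math.log2(3)  # information content of a {-1, 0, +1} weight

f16_mb = weight_megabytes(N_PARAMS, 16)
ideal_mb = weight_megabytes(N_PARAMS, TERNARY_BITS)
packed_mb = weight_megabytes(N_PARAMS, 2)  # simple 2-bit packing

print(f"F16 weights:           {f16_mb:.0f} MB")
print(f"Ternary (ideal):       {ideal_mb:.0f} MB, "
      f"{percent_saved(ideal_mb, f16_mb):.1f}% less")
print(f"Ternary (2-bit packed): {packed_mb:.0f} MB, "
      f"{percent_saved(packed_mb, f16_mb):.1f}% less")

# Figures from the article: BitNet-13B at 2,789 MB, about 29% below Qwen3-4B (Q4).
print(f"Implied Qwen3-4B (Q4) usage: {implied_baseline(2789, 29):.0f} MB")
```

Weights-only storage lands near the ~90% headline figure, while the measured VRAM reductions quoted above (77.8% and 65.6%) are smaller, which would be consistent with memory that does not shrink with weight precision; the source does not break this down.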
