1-bit LoRA on phones

QVAC Fabric released a cross-platform 1-bit LoRA fine-tuning framework based on BitNet that runs on iPhone, Pixel, and Galaxy phones. QVAC claims roughly 90% lower memory use and an 11x speedup, demonstrating LoRA-style fine-tuning without a GPU. That could shift small-model personalization toward edge and private deployments. (x.com)

QVAC published a Hugging Face community post on March 17, 2026 naming BitNet b1.58 as the implementation target and listing contributors including Subash SN and Akshay Nambiar. (huggingface.co) Its reported benchmarks show a 125M-parameter BitNet fine-tuned in roughly 10 minutes on a Samsung S25, and a 1B fine-tune (~18k tokens) completed in about 1 hour 18 minutes on the S25 and 1 hour 45 minutes on an iPhone 16, with experiments pushing fine-tuning up to 13B on the iPhone 16. (huggingface.co) In the QVAC memory comparisons, BitNet-1B uses up to 77.8% less VRAM than Gemma-3-1B (F16) and 65.6% less than Qwen3-0.6B (F16), and BitNet-13B uses 2,789 MB, about 29% less VRAM than a 4-bit Qwen3-4B (Q4). (huggingface.co)

The project’s research repo and documentation live in tetherto/qvac-rnd-fabric-llm-bitnet on GitHub and point to a companion qvac-fabric-llm.cpp repo that provides multi-platform binaries plus Vulkan and Metal backends for Adreno, Mali and Apple Bionic GPUs. (github.com) Tether’s press release of March 17, 2026 frames QVAC Fabric as an “edge-first” runtime and LoRA fine-tuning framework for Microsoft’s BitNet. (tether.io) Coverage from outlets including Android Headlines and Blockonomi highlighted on-device fine-tuning on consumer hardware and repeated QVAC’s claims for smartphones and heterogeneous consumer GPUs. (androidheadlines.com)
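For readers unfamiliar with the technique, here is a minimal sketch of what LoRA on a ternary base layer looks like in general. It is not QVAC's code: the function names, dimensions, and the `alpha` value are all hypothetical, and the quantizer is the absmean scheme described for BitNet b1.58, written in plain Python for clarity rather than speed. The frozen base weights are reduced to {-1, 0, +1} plus one scale, while only the two small adapter matrices `a` and `b` would be trained.

```python
import random

def ternarize(weights, eps=1e-8):
    """Absmean quantization: map each weight to -1, 0, or +1 with one shared scale."""
    flat = [abs(w) for row in weights for w in row]
    scale = sum(flat) / len(flat) + eps
    q = [[max(-1, min(1, round(w / scale))) for w in row] for row in weights]
    return q, scale

def matvec(x, m):
    """Multiply row vector x (length n) by matrix m (n rows, k columns)."""
    return [sum(x[i] * m[i][j] for i in range(len(x))) for j in range(len(m[0]))]

def lora_forward(x, q, scale, a, b, alpha):
    """y = scale * (x @ Q) + (alpha / r) * (x @ A) @ B, with Q frozen."""
    r = len(b)  # adapter rank
    base = [scale * v for v in matvec(x, q)]
    delta = matvec(matvec(x, a), b)
    return [bv + (alpha / r) * dv for bv, dv in zip(base, delta)]

# Illustrative sizes only (assumption); real layers are far larger.
random.seed(0)
d_in, d_out, rank = 8, 6, 2
w = [[random.gauss(0, 1) for _ in range(d_out)] for _ in range(d_in)]
q, scale = ternarize(w)  # frozen ternary base weights

# Trainable adapters: A gets small random values, B starts at zero,
# so the adapter contributes nothing before training begins.
a = [[random.gauss(0, 0.01) for _ in range(rank)] for _ in range(d_in)]
b = [[0.0] * d_out for _ in range(rank)]

x = [random.gauss(0, 1) for _ in range(d_in)]
y = lora_forward(x, q, scale, a, b, alpha=16)
```

Because `b` starts at zero, `y` initially equals the output of the quantized base layer alone. Fine-tuning would update only `a` and `b`, which is why the frozen weights can stay in a compact ternary format throughout.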
