Decentralized 107B model claimed

0G Labs says it trained DiLoCoX, a 107B-parameter model, in a decentralized run across nodes connected over the ordinary internet rather than inside a data center, and claims the model is 48% larger than Bittensor's biggest. If accurate, it's a provocative signal that non-traditional training topologies are being pushed toward large-scale models. (x.com/0G_labs/status/2037303589010391451)

0G's technical blog post of March 24, 2026 says the DiLoCoX paper was first posted to arXiv in June 2025. (0g.ai) According to the team, the original July 2025 run trained a modified Qwen1.5-107B on 160 GPUs spread across 20 nodes connected by 1 Gbps links, and the paper reports a 357× improvement in communication efficiency over standard AllReduce. (0g.ai) 0G attributes the low-communication, large-scale run to four core techniques: pipeline parallelism, a dual-optimizer policy, a one-step-delay overlap of communication and computation, and adaptive gradient compression. (0g.ai) 0G's July 2025 Chainwire press release also names China Mobile as a collaborator on the DiLoCoX experiments and frames the work as a peer-reviewed demonstration that decentralized training can scale to 100B+ parameters on consumer-grade bandwidth. (analyticsinsight.net)

Also on March 24, 2026, 0G announced that it has begun a public retrain of DiLoCoX-107B and promised to release weights, checkpoints, convergence metrics and verifiable telemetry on completion. (globenewswire.com) (itnewsonline.com)

For comparison, Covenant-72B, the recent training run on Bittensor's Templar subnet, completed on March 10, 2026 with more than 70 contributors and roughly 1.1 trillion training tokens, and the project reported a zero-shot MMLU score of 67.1. (kucoin.com)
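
To make the "dual-optimizer" idea concrete, here is a minimal toy sketch of the two-level loop used by DiLoCo-style methods: each worker takes many local steps with an inner optimizer, and only the averaged parameter change (a "pseudo-gradient") is exchanged and fed to an outer optimizer. This is not 0G's implementation. The one-dimensional objective, the plain-SGD inner optimizer, the SGD-with-momentum outer optimizer, and every name and hyperparameter below are assumptions chosen for illustration, and the sketch leaves out pipeline parallelism, the one-step-delay overlap and gradient compression.

```python
# Toy sketch of a DiLoCo-style dual-optimizer loop (illustrative only, not 0G's code).
# Assumptions: each worker minimizes f(w) = 0.5 * (w - target)^2, the inner
# optimizer is plain SGD, and the outer optimizer is SGD with momentum.


def local_steps(w, target, inner_lr, num_steps):
    """Run num_steps of SGD on one worker without any communication."""
    for _ in range(num_steps):
        grad = w - target  # gradient of 0.5 * (w - target)^2
        w -= inner_lr * grad
    return w


def train(targets, rounds=50, inner_steps=10, inner_lr=0.1,
          outer_lr=0.7, momentum=0.5):
    """Workers synchronize once per round, not once per inner step."""
    global_w = 0.0
    velocity = 0.0
    for _ in range(rounds):
        # Every worker starts from the shared parameters and trains locally.
        local_ws = [local_steps(global_w, t, inner_lr, inner_steps)
                    for t in targets]
        # Pseudo-gradient: how far the averaged workers moved from the shared copy.
        pseudo_grad = global_w - sum(local_ws) / len(local_ws)
        # The outer optimizer treats the pseudo-gradient like an ordinary gradient.
        velocity = momentum * velocity + pseudo_grad
        global_w -= outer_lr * velocity
    return global_w


if __name__ == "__main__":
    worker_targets = [1.0, 2.0, 3.0, 6.0]  # hypothetical per-worker optima
    print(train(worker_targets))  # approaches the mean of the targets
```

In this toy setup the workers communicate once per round instead of once per step, so raising `inner_steps` reduces how often parameters cross the network. That is the general lever low-communication training relies on. How DiLoCoX balances it against convergence at 107B parameters is a question for the paper itself.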
