Ai2 open‑sources MolmoWeb agent

The Allen Institute for AI (Ai2) released MolmoWeb, an open‑weight visual web agent with a full training stack and 30K human task trajectories, letting teams audit and fine‑tune browser‑controlling agents without relying on per‑call APIs. That makes agentic browser automation reproducible for startups that worry about vendor lock‑in and need auditable workflows. (venturebeat.com)

Ai2 posted MolmoWeb, MolmoWebMix, a public demo, and a paper on March 24, 2026, and linked the release artifacts from the project page and GitHub repository. (thenewstack.io) MolmoWeb is built on Ai2’s Molmo 2 multimodal model family and comes in two sizes: a 4B-parameter variant and an 8B-parameter variant. (allenai.org) Ai2 says the models were trained without distilling from proprietary vision-based agents, relying instead on a mix of synthetic trajectories generated by text-only accessibility-tree agents and human demonstration traces. (allenai.org) The accompanying MolmoWebMix training corpus covers interactions across more than 1,100 websites and includes roughly 590,000 individual subtask demonstrations and about 2.2 million screenshot question–answer pairs, per Ai2’s dataset release notes. (glideslope.ai)

All code, model checkpoints, and the license file are posted on GitHub, and the 8B checkpoints are mirrored on Hugging Face; the release artifacts are licensed under Apache‑2.0. (github.com) Ai2 reports that MolmoWeb-8B sets a new open‑weight state of the art across four major web‑agent benchmarks and that the 8B agent outperformed some agents built on proprietary models (including GPT‑4o baselines) on key web navigation tasks. (allenai.org)
