Microsoft Fara1.5 tops web benchmarks
- Microsoft Research said on May 21 that its new Fara1.5 browser-agent family outperformed rival computer-use systems on live web benchmarks.
- Microsoft reported Fara1.5-27B scored 72% on Online-Mind2Web, versus 58.3% for OpenAI's Operator and 57.3% for Google's Gemini 2.5 Computer Use.
- GitHub shows a Fara1.5 agent harness is "coming soon," with Microsoft already publishing model details and benchmark notes.

Microsoft Research said on May 21 that its Fara1.5 family of browser-based computer-use agents outperformed competing systems on a live web benchmark that has become a closely watched test for AI agents. The company said the lineup includes 4-billion, 9-billion and 27-billion-parameter models designed to navigate websites, fill forms and complete other browser tasks.

Microsoft said the top Fara1.5-27B model scored 72% on Online-Mind2Web, a benchmark spanning 300 tasks across 136 websites. The result puts Microsoft ahead of OpenAI's Operator and Google's Gemini 2.5 Computer Use in the same comparison, according to figures cited by Microsoft and repeated by Decrypt and Crypto Briefing this week. Those reports said Operator scored 58.3% and Gemini 2.5 Computer Use scored 57.3% on the live-web test. Microsoft described Fara1.5 as open-weight, meaning the model weights are available for outside use. (microsoft.com)

### What exactly did Microsoft release?

Microsoft Research said Fara1.5 is a family of "computer use agent" models for the browser: Fara1.5-4B, Fara1.5-9B and Fara1.5-27B. The company said the models are built to handle tasks such as comparing products, filling out forms and booking events on the web. The May 21 Microsoft post said the 9B model scored 63% on Online-Mind2Web, while the 4B model scored 57% and the 27B model reached 72%. (microsoft.com)

Microsoft said the 9B model improved on GUI-Owl-1.5-8B, which scored 49%, and that the 27B model was closing the gap with proprietary systems such as Yutori's n1. (microsoft.com)

### Which benchmark produced the headline numbers?

Microsoft said the headline comparison came from Online-Mind2Web, which it described as 300 tasks across 136 popular sites. Crypto Briefing reported the same setup and the same score line for Fara1.5-27B, Operator and Gemini 2.5 Computer Use.
GitHub records in Microsoft's Fara repository show the company also refreshed WebTailBench tasks and rubrics on May 12 and said on May 21 that a Fara1.5 agent harness was "coming soon." (microsoft.com) That suggests Microsoft is still updating the evaluation and release tooling around the models.

### How do Operator and Gemini fit into this race?

OpenAI introduced Operator on January 23, 2025, as a research preview agent that could use its own browser to type, click and scroll on a user's behalf. (microsoft.com) OpenAI said at the time the product would be released to a limited audience first and would evolve with feedback.

Google said when it introduced the Gemini 2.5 Computer Use model that developers could access it through the Gemini API in Google AI Studio and Vertex AI. (github.com) Google described that model as a specialized system for browser and mobile control tasks using screenshots and generated UI actions. (openai.com)

### Why are open-weight results getting attention?

Microsoft's GitHub repository for Fara says the project is open under an MIT license, and the company has previously released Fara-7B as an open-weight computer-use model. The new Fara1.5 result is drawing notice because it pairs an open release model with a benchmark lead over closed commercial systems in a task category focused on doing things on the web, not just answering questions. (blog.google) That characterization comes from the benchmark results and release details published by Microsoft and reported by Decrypt.

Microsoft's next public step appears to be the release of the Fara1.5 agent harness referenced in the GitHub update dated May 21. The Microsoft Research article and repository are the main places where the company is posting model specifications, benchmark scores and follow-on tooling updates. (microsoft.com) (github.com)