New Open-Source 'Tulu 3' Model Enters the Ring

Published March 9, 2026 by The Daily Scout

What happened

A new open-source model called Tulu 3 is being touted as a strong competitor to proprietary giants. Benchmarks reportedly show it matching or exceeding models like GPT-4 and Claude 3 on several reasoning and coding tasks, and its design emphasizes modularity to make fine-tuning easier.

Why it matters

Developed by the Allen Institute for AI (AI2), the Tulu 3 family of models is built on Llama 3.1 base models. The initiative aims to close the gap between proprietary and open post-training recipes for language models.

Tulu 3's training follows a four-stage recipe: prompt and data curation, supervised fine-tuning (SFT), Direct Preference Optimization (DPO), and a novel technique called Reinforcement Learning with Verifiable Rewards (RLVR). RLVR is designed to improve performance on tasks whose outcomes can be checked automatically, such as math problems.

In a move toward greater transparency in AI development, AI2 has opened the entire Tulu 3 pipeline, releasing the training datasets, data curation tools, training code, and evaluation suites. This level of openness is intended to foster reproducibility and spur further research in the open-source community.

Performance benchmarks show that Tulu 3 models are competitive with, and in some cases surpass, other open-weight models of similar size such as Llama 3.1-Instruct and Qwen2.5-Instruct. The largest model in the family, Tulu 3 405B, is also competitive with closed models like GPT-4o-mini and Claude 3.5-Haiku.

Tulu 3's training code is released under an Apache 2.0 license, though the model weights, being derived from Llama 3.1, remain subject to Meta's Llama 3.1 Community License. This openness lets developers avoid many of the proprietary limitations of closed high-performance models, giving startups greater flexibility and control over their AI implementations. For engineers looking to experiment with Tulu 3 locally, the recommended hardware includes a high-end GPU such as an RTX 4090 or RTX A6000, at least 8 GB of RAM, and 200 GB of disk space.
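To make the RLVR idea concrete: the reward comes from a programmatic check against a known answer rather than from a learned reward model. The following is a minimal Python sketch of such a check for math-style prompts, not AI2's actual implementation. The function names (`extract_final_answer`, `verifiable_reward`), the "last number in the text" extraction rule, and the 0-or-1 reward values are all illustrative assumptions.

```python
import re
from typing import Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the last number that appears in the completion, or None.

    Assumption for this sketch: the model states its final answer last.
    """
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    return numbers[-1] if numbers else None


def verifiable_reward(completion: str, reference: str) -> float:
    """Hypothetical verifiable reward: 1.0 if the extracted answer equals
    the reference numerically, otherwise 0.0."""
    answer = extract_final_answer(completion)
    if answer is None:
        return 0.0
    try:
        return 1.0 if float(answer) == float(reference) else 0.0
    except ValueError:
        # The reference was not a plain number; this sketch cannot verify it.
        return 0.0


if __name__ == "__main__":
    print(verifiable_reward("12 - 5 = 7, so the answer is 7.", "7"))
    print(verifiable_reward("I think it is 8.", "7"))
```

In a full RLVR setup, a score like this would be passed to a reinforcement learning algorithm as the reward for each sampled completion; that training loop is omitted here.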
