New Open-Source 'Tulu 3' Model Enters the Ring
What happened
A new open-source model family called Tulu 3 is being touted as a strong competitor to proprietary systems. Benchmarks reportedly show Tulu 3 matching or exceeding closed models such as GPT-4o-mini and Claude 3.5 Haiku on several reasoning and coding tasks, and its openly published training recipe is meant to make further fine-tuning easier.
Why it matters
Tulu 3 is a family of models developed by the Allen Institute for AI (AI2) and built on Llama 3.1 base models. The project aims to bridge the gap between proprietary and open post-training recipes for language models.
Training follows a four-stage recipe that includes supervised fine-tuning (SFT), Direct Preference Optimization (DPO), and a new technique called Reinforcement Learning with Verifiable Rewards (RLVR). RLVR is designed to improve performance on tasks whose outcomes can be checked automatically, such as math problems.
In a move toward greater transparency in AI development, AI2 has opened the entire Tulu 3 pipeline, releasing the training datasets, data curation tools, training code, and evaluation suites. This openness is intended to support reproducibility and spur further research in the open-source community.
Reported benchmarks show Tulu 3 models competitive with, and in some cases ahead of, other open-weight models of similar size, such as Llama 3.1-Instruct and Qwen2.5-Instruct. The largest model in the family, Tulu 3 405B, is also reported to be competitive with closed models such as GPT-4o-mini and Claude 3.5 Haiku.
Because the weights and the recipe are public, developers can run and fine-tune Tulu 3 themselves instead of depending on a closed API, which gives startups more flexibility and control over their AI implementations. The licensing is not uniform, however: the training code is released under Apache 2.0, while the model weights are derived from Llama 3.1 and remain subject to the Llama 3.1 Community License.
For engineers who want to experiment with Tulu 3 locally, the recommended hardware includes a high-end GPU such as an RTX 4090 or RTX A6000, at least 8 GB of RAM, and 200 GB of disk space.
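To put the local hardware figures in context, the memory needed just to hold a model's weights can be estimated as parameter count times bytes per parameter. The sketch below is a rough back-of-the-envelope calculation, not an official sizing guide: the function names are hypothetical, the 8-billion-parameter entry is an illustrative size, and the estimate ignores activations, the KV cache, and framework overhead.

```python
# Rough estimate of the memory needed to hold model weights only.
# Assumption: 1 GB = 1e9 bytes; activations, KV cache, and framework
# overhead are ignored, so real usage will be higher.

# Bytes per parameter for common numeric formats.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}

# Published VRAM sizes of the GPUs named in the article.
GPU_VRAM_GB = {"RTX 4090": 24.0, "RTX A6000": 48.0}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate weight memory in gigabytes."""
    return num_params * bytes_per_param / 1e9


def fits_in_vram(num_params: float, bytes_per_param: float, vram_gb: float) -> bool:
    """True if the weights alone fit in the given amount of VRAM."""
    return weight_memory_gb(num_params, bytes_per_param) <= vram_gb


if __name__ == "__main__":
    # 405e9 is the largest model named in the article; 8e9 is illustrative.
    for label, params in [("405B", 405e9), ("8B (illustrative)", 8e9)]:
        for fmt, nbytes in BYTES_PER_PARAM.items():
            gb = weight_memory_gb(params, nbytes)
            fits = [gpu for gpu, vram in GPU_VRAM_GB.items()
                    if fits_in_vram(params, nbytes, vram)]
            print(f"{label} @ {fmt}: {gb:.1f} GB, fits on: {fits or 'neither GPU'}")
```

By this estimate, the weights of a 405B-parameter model at 16-bit precision alone come to roughly 810 GB, far beyond a single RTX 4090 or RTX A6000, so single-GPU experimentation is realistic only for much smaller models or aggressive quantization.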
Key numbers
- 405B: the parameter count of the largest model in the family, Tulu 3 405B.
- 3.1: the Llama base-model generation that Tulu 3 is built on.
- 8 GB of RAM and 200 GB of disk space: the minimum cited for local experimentation, alongside a high-end GPU such as an RTX 4090 or RTX A6000.
- 4: the number of stages in the training recipe, which includes supervised fine-tuning (SFT), Direct Preference Optimization (DPO), and Reinforcement Learning with Verifiable Rewards (RLVR).
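The idea behind RLVR is that the reward comes from checking the model's output against a known result instead of from a learned reward model. The sketch below illustrates that idea for math answers. It is not AI2's implementation: the function names, the binary reward, and the "last number in the completion" extraction rule are all assumptions made for illustration.

```python
import re
from typing import Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the last number that appears in the completion, or None.

    Assumption: the model's final answer is the last number it writes.
    Thousands separators such as "1,000" are removed before matching.
    """
    matches = re.findall(r"-?\d+(?:\.\d+)?", completion.replace(",", ""))
    return matches[-1] if matches else None


def verifiable_reward(completion: str, ground_truth: str,
                      reward_value: float = 1.0) -> float:
    """Binary reward: reward_value if the answer verifies, otherwise 0.0."""
    answer = extract_final_answer(completion)
    if answer is None:
        return 0.0
    try:
        correct = float(answer) == float(ground_truth)
    except ValueError:
        # The reference answer is not numeric, so it cannot be verified here.
        return 0.0
    return reward_value if correct else 0.0


if __name__ == "__main__":
    print(verifiable_reward("3 + 4 = 7, so the answer is 7.", "7"))
    print(verifiable_reward("I believe the answer is 8.", "7"))
```

In a reinforcement learning loop, a score like this would stand in for the reward signal on prompts that come with a checkable answer. Other verifiable tasks would need their own checker.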
What happens next
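- AI2's stated aim is to bridge the gap between proprietary and open post-training recipes for language models. With the datasets, training code, and evaluation suites now public, the next step falls to the open-source community: reproducing the results and building further research on the recipe.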