MiniMax Releases Framework for Scalable Reinforcement Learning

AI research firm MiniMax has revealed Forge, a new framework for scalable reinforcement learning (RL) in real-world agent systems. The framework is designed to improve the throughput, stability, and flexibility of training industrial-grade AI agents. Such advances matter for deploying agents that learn and adapt in dynamic environments.

- The Forge framework achieves a reported 40x training speedup by using optimized asynchronous scheduling strategies and a tree-structured merging strategy for training samples.
- Its architecture is designed to be "agent-native," introducing an intermediary layer that completely decouples the training and inference engine from the agent's internal implementation, allowing it to work with arbitrary and even black-box agents.
- For algorithmic stability with Mixture-of-Experts (MoE) models, the framework continues to use the CISPO (Clipped IS-weight Policy Optimization) algorithm, which MiniMax introduced in earlier research.
- To handle credit assignment challenges in long-context tasks, Forge employs a composite reward framework that includes "Process Rewards" for dense feedback on intermediate steps, rather than relying only on the final outcome.
- The framework was battle-tested in the development of MiniMax's M2.5 model, which was trained across hundreds of thousands of distinct real-world environments and agent scaffolds.
- Internally, the M2.5 model trained with Forge now automates 30% of real business tasks at MiniMax, and its generated code accounts for 80% of the company's newly committed code.
- MiniMax was founded in 2021 by Yan Junjie, former VP at AI giant SenseTime, and is backed by major tech and venture capital firms including Alibaba, Tencent, and Hillhouse Capital.
- The company is a major player in China's AI sector, having raised approximately $850 million in private funding before a successful Hong Kong IPO that raised an additional $619 million.
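
To make the tree-structured merging idea concrete, here is a minimal Python sketch. It assumes the merge works like a generic prefix tree (trie) over token sequences, so that samples sharing a common prefix, such as the same system prompt and early turns, store and process that prefix once. The brief does not describe Forge's actual merging scheme, so the function names and toy data below are hypothetical, and the savings shown are illustrative rather than a reproduction of the reported 40x figure.

```python
def count_tokens_unmerged(samples):
    """Total tokens processed if every sample is handled independently."""
    return sum(len(tokens) for tokens in samples)


def build_prefix_tree(samples):
    """Merge samples into a nested-dict trie; shared prefixes share nodes."""
    root = {}
    for tokens in samples:
        node = root
        for tok in tokens:
            node = node.setdefault(tok, {})
    return root


def count_tree_nodes(node):
    """Each trie node is one token that must be processed once."""
    return sum(1 + count_tree_nodes(child) for child in node.values())


# Hypothetical rollouts from one task: same prompt, diverging tool calls.
samples = [
    ["sys", "task", "plan", "tool_a", "ok"],
    ["sys", "task", "plan", "tool_b", "fail"],
    ["sys", "task", "retry", "tool_a", "ok"],
]

unmerged = count_tokens_unmerged(samples)
merged = count_tree_nodes(build_prefix_tree(samples))
print(f"unmerged={unmerged} merged={merged}")
```

In this toy case the three rollouts contain 15 tokens in total but only 10 distinct trie nodes. The longer the shared context relative to the diverging suffixes, the larger the saving, which is the situation multi-turn agent rollouts tend to produce.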
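
The CISPO bullet can also be illustrated in a few lines. As described in MiniMax's earlier M1 work, CISPO clips the importance-sampling (IS) weight and treats it as a constant, instead of clipping the token update as PPO does, so tokens whose probability ratio drifts outside the range still contribute a bounded gradient. The sketch below is not MiniMax's implementation: it only computes the per-token coefficient that would multiply the gradient of the log-probability, and the function names and epsilon values are assumptions chosen for illustration.

```python
import math


def cispo_token_weights(logp_new, logp_old, advantages,
                        eps_low=1.0, eps_high=0.2):
    """Coefficient on grad(log pi) per token: clip(ratio) * advantage.

    The clipped ratio is treated as a constant (stop-gradient), so every
    token keeps a gradient; only its magnitude is capped.
    """
    weights = []
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)
        clipped = min(max(ratio, 1.0 - eps_low), 1.0 + eps_high)
        weights.append(clipped * adv)
    return weights


def ppo_clip_token_weights(logp_new, logp_old, advantages, eps=0.2):
    """Same coefficient under PPO-clip, for comparison.

    When the ratio has already moved past the clip range in the direction
    the advantage favors, the token's gradient is dropped entirely.
    """
    weights = []
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)
        dropped = (adv > 0 and ratio > 1.0 + eps) or \
                  (adv < 0 and ratio < 1.0 - eps)
        weights.append(0.0 if dropped else ratio * adv)
    return weights


# Token 0: probability doubled under the new policy; token 1: unchanged.
logp_old = [0.0, 0.0]
logp_new = [math.log(2.0), 0.0]
advantages = [1.0, 1.0]

cispo_w = cispo_token_weights(logp_new, logp_old, advantages)
ppo_w = ppo_clip_token_weights(logp_new, logp_old, advantages)
print("CISPO:", cispo_w)
print("PPO-clip:", ppo_w)
```

For the first token, PPO-clip zeroes the gradient while the CISPO-style weight is capped at 1.2 times the advantage. Both methods treat the unchanged second token identically.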
