New Framework Simulates AI Agent ROI
HKUDS has released ClawWork, a new open-source framework that evaluates AI agents by their return on investment. The framework runs agents in simulated economies, where top-performing models have earned the equivalent of over $1,500 per hour, offering a new way to benchmark agentic side projects.
The ClawWork framework is built upon the ultra-lightweight OpenClaw and Nanobot architecture, aiming to shift the evaluation of AI agents from purely academic benchmarks to real-world economic viability. This approach reframes the AI from a simple assistant to an "AI Coworker" that must be self-sufficient. The entire open-source project is designed for educational and research purposes.
Agents operating within the ClawWork simulation begin with a mere $10 balance. From this initial capital, they must pay for every token generated and API call made, creating extreme economic pressure. Failure to generate sufficient income to cover these operational costs results in the agent's "bankruptcy."
The tasks assigned to the agents are drawn from the GDPVal benchmark dataset, which includes 220 real-world professional jobs across 44 different economic sectors like finance, manufacturing, and healthcare. This dataset was specifically designed to estimate the potential contribution of AI to the Gross Domestic Product.
A key feature of the framework is the strategic dilemma faced by the agents: they must choose between performing tasks for immediate income or investing time and resources in learning to enhance their future performance. This mimics the real-world trade-offs between billable hours and professional development. Performance is tracked on a live dashboard that visualizes an agent's balance, income, costs, and survival metrics.
Top-performing models have demonstrated significant economic potential within this simulation. For instance, Qwen3.5-Plus achieved an equivalent hourly rate of around $1,390, while Gemini 3.1 Pro earned over $15,700 in total.
The project, developed by HKU's Data Intelligence Lab (HKUDS), has gained significant traction since its launch in February 2026, quickly accumulating thousands of stars on GitHub. The lab has a history of creating popular open-source AI tools, including LightRAG and DeepCode.
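The survival mechanic described above (a $10 starting balance, per-token costs, and bankruptcy when the money runs out) can be illustrated with a small sketch. This is not ClawWork's actual code: the class name, the function names, and the token price are hypothetical and chosen only to show how such an accounting loop might work.

```python
from dataclasses import dataclass


@dataclass
class AgentAccount:
    """Toy ledger for one simulated agent (hypothetical, not ClawWork's API)."""
    balance: float = 10.0               # starting balance cited in the article
    price_per_1k_tokens: float = 0.01   # assumed price, for illustration only
    income: float = 0.0                 # total payments received so far
    costs: float = 0.0                  # total token costs paid so far

    @property
    def bankrupt(self) -> bool:
        # The agent is bankrupt once its balance is no longer positive.
        return self.balance <= 0

    def charge_tokens(self, tokens: int) -> None:
        # Every generated token is billed against the balance.
        cost = tokens / 1000 * self.price_per_1k_tokens
        self.balance -= cost
        self.costs += cost

    def receive_payment(self, amount: float) -> None:
        # A completed task pays out and raises the balance.
        self.balance += amount
        self.income += amount


def run_tasks(account: AgentAccount, tasks: list[tuple[int, float]]) -> int:
    """Work through (tokens_used, payment) pairs until done or bankrupt.

    Returns the number of tasks that were completed and paid.
    """
    completed = 0
    for tokens_used, payment in tasks:
        account.charge_tokens(tokens_used)
        if account.bankrupt:
            break  # costs exceeded the balance before the task paid out
        account.receive_payment(payment)
        completed += 1
    return completed


# Example: the second task is too expensive and bankrupts the agent.
account = AgentAccount()
finished = run_tasks(account, [(200_000, 5.0), (1_500_000, 0.0), (100_000, 50.0)])
print(finished, account.bankrupt, round(account.balance, 2))
```

In this toy version, a "learning" step could be modeled as a task with a token cost and zero payment, which makes the trade-off between earning now and investing in future performance visible in the balance.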
The future roadmap for ClawWork includes introducing multi-agent competition, a marketplace of tasks with varying difficulty, and the integration of additional AI agent frameworks beyond Nanobot. This suggests a move towards more complex and dynamic simulated economies.