OpenClaw runs up $1.3M agent bill

- Peter Steinberger said on May 16 that OpenClaw is running about 100 Codex agents, pushing OpenAI API usage to roughly $1.3 million monthly.
- The 30-day tally reached 603 billion tokens across 7.6 million requests, with GPT-5.5 as the top model, according to reporting citing Steinberger.
- OpenAI’s API pricing page lists GPT-5.5 at $5 per million input tokens and $30 per million output tokens.

Peter Steinberger said on May 16 that OpenClaw is running about 100 Codex coding agents at once, a setup he said is generating roughly $1.3 million a month in API costs. Reporting by The Decoder, citing Steinberger’s public comments, said the three-person team uses the agents to write code, review pull requests, find bugs and monitor regressions. The same report said the system consumed 603 billion tokens across 7.6 million requests in 30 days, with GPT-5.5 as the top model. Steinberger described the spending as a research expense aimed at testing what software development looks like when token costs are not the main constraint.

### Where does the $1.3 million figure come from?

The $1.3 million figure came from a 30-day usage snapshot that Steinberger shared publicly and that multiple outlets summarized on May 16. The Decoder reported the total at $1.3 million over 30 days, alongside 603 billion tokens and 7.6 million requests. OfficeChai separately reported a screenshot showing a 30-day spend of $1,305,088.81. (the-decoder.com)

The reported workload is not a single chatbot session. The Decoder said roughly 100 Codex instances are kept running in the cloud, with different agents assigned to separate software tasks. Those tasks include opening pull requests, reviewing commits, deduplicating issues, watching benchmarks and flagging regressions in Discord, according to the report. (the-decoder.com)

### What are those agents actually doing all day?

The OpenClaw agents are being used across the normal mechanics of a software project rather than for one-off demos. The Decoder reported that some agents write fixes, some review pull requests, and some look for security holes in commits. Others monitor benchmark performance and report regressions, while another class of agents can listen to meetings and start pull requests for features discussed by the team, according to the same report. (the-decoder.com)

That setup matters because it turns token usage into an operating input for engineering work. Steinberger told The Decoder that he was exploring how software would be built if token costs did not matter, framing the bill as part of an R&D experiment rather than a normal commercial budget. He also said disabling “Fast Mode” would cut costs by 70%, according to the report. (the-decoder.com)

### How expensive is GPT-5.5 at list prices?

OpenAI’s API pricing page lists GPT-5.5 at $5 per million input tokens and $30 per million output tokens for standard short-context usage. The same page lists GPT-5.4 at $2.50 per million input tokens and $15 per million output tokens, showing that GPT-5.5 carries a higher published rate than the preceding flagship tier. (the-decoder.com)

Those prices help explain how a large multi-agent coding system can produce a seven-figure monthly bill. The Decoder reported GPT-5.5 was the top model in Steinberger’s setup, and the reported 603 billion-token total indicates that the cost is being driven by sustained, repeated agent activity rather than a single large-context run. That is an inference from the published token totals and OpenAI’s posted pricing. (developers.openai.com)

### Who is paying for the experiment?

The Decoder reported that OpenAI is covering the API bill for the OpenClaw work. The article described the team as about three people “working at OpenAI” while running the open-source project. That detail is important to the economics of the experiment because the published bill is not necessarily a direct out-of-pocket expense for an ordinary software startup. (the-decoder.com)

The report said Steinberger defended the return on investment by pointing to the project’s open-source output and by arguing that the work tests how coding changes when token costs are removed from the equation.

### Why are people linking this to security risk as well as cost?

Anthropic said on April 7 that its Claude Mythos Preview model was capable, in its testing, of identifying and exploiting zero-day vulnerabilities in every major operating system and every major web browser when directed by a user. Anthropic said it launched “Project Glasswing” in response and described the capability jump as a “watershed moment” for security. (the-decoder.com)

The Register reported on May 15 that new benchmark work was testing whether AI agents could move beyond spotting vulnerabilities to producing working exploits. The Decoder separately reported on May 16 that Carnegie Mellon researchers used a benchmark called ExploitBench to test browser exploitation against Google’s V8 engine, and that Claude Mythos outperformed GPT-5.5 while both were able to complete the task in at least some cases. (red.anthropic.com)

OpenAI’s pricing page and Steinberger’s reported usage figures remain the clearest public markers for the OpenClaw experiment. Steinberger’s next public proof points are likely to come through OpenClaw’s open-source releases and any further disclosures about how the roughly 100-agent setup changes code output, review throughput or bug-finding rates. (the-decoder.com) (theregister.com)
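
As a rough sanity check on the usage figures reported earlier, the spend, token and request totals can be combined into a few derived rates. The sketch below is a back-of-envelope calculation, not anything published by Steinberger or OpenAI: the constant and function names are illustrative, it mixes the spend figure reported by OfficeChai with the token and request totals reported by The Decoder, and it assumes all three describe the same 30-day window.

```python
# Back-of-envelope check on the reported 30-day usage snapshot.
# Inputs are figures quoted in the article; derived values are estimates only.

SPEND_USD = 1_305_088.81      # 30-day spend reported by OfficeChai
TOTAL_TOKENS = 603e9          # 603 billion tokens (The Decoder)
TOTAL_REQUESTS = 7.6e6        # 7.6 million requests (The Decoder)
GPT55_INPUT_PER_M = 5.00      # list price, USD per million input tokens
GPT55_OUTPUT_PER_M = 30.00    # list price, USD per million output tokens
FAST_MODE_SAVING = 0.70       # reported cut from disabling "Fast Mode"


def summarize(spend, tokens, requests):
    """Return the blended token rate, tokens per request, and cost per request."""
    return {
        "usd_per_million_tokens": spend / (tokens / 1e6),
        "tokens_per_request": tokens / requests,
        "usd_per_request": spend / requests,
    }


stats = summarize(SPEND_USD, TOTAL_TOKENS, TOTAL_REQUESTS)

# Hypothetical monthly spend if the reported 70% saving applied to the whole bill.
spend_without_fast_mode = SPEND_USD * (1 - FAST_MODE_SAVING)

print(f"Blended rate: ${stats['usd_per_million_tokens']:.2f} per million tokens")
print(f"Tokens per request: {stats['tokens_per_request']:,.0f}")
print(f"Cost per request: ${stats['usd_per_request']:.3f}")
print(f"Spend without Fast Mode: ${spend_without_fast_mode:,.0f}")
```

On these inputs the blended rate works out to about $2.16 per million tokens, with roughly 79,000 tokens and about $0.17 per request, and a bill of roughly $390,000 if the 70% Fast Mode saving applied across the board. The blended rate sits below the $5 input list price for GPT-5.5, so the bill cannot consist entirely of GPT-5.5 tokens charged at the standard rates quoted above. The reporting does not say what accounts for the gap.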
