Codex pricing model
- Developers report many code-model vendors shifting from per-request to token billing, which raised single-prompt costs. (x.com)
- Users cited examples where a single prompt’s cost jumped from about $0.04 to as much as $10 under new token schemes. (x.com)
- The billing change is intensifying debate over whether code assistants should charge per-call, per-token, or per-resolution for predictable budgets.
OpenAI changed Codex billing on April 2 from per-message charges to token-based pricing, tying the cost of coding prompts to how much text the model reads and writes. (developers.openai.com) (help.openai.com) A token is a small chunk of text, so longer prompts, larger code files, and bigger model outputs can all raise the bill on a single coding task. OpenAI’s Codex pricing page says usage is now calculated in credits per million input, cached input, and output tokens. (developers.openai.com) (help.openai.com)

The change applies to new and existing ChatGPT Plus, Pro, and Business customers, and to new Enterprise plans; OpenAI said existing Enterprise customers would stay on a legacy rate card until a later migration. OpenAI also added pay-as-you-go pricing for Business and Enterprise teams in an April 2026 product update. (help.openai.com) (openai.com)

The pricing shift lands as coding assistants move from short autocomplete replies to longer “agentic” jobs that inspect repositories, plan changes, run tools, and generate multi-file edits. OpenAI markets Codex as a system for feature builds, refactors, reviews, and releases, not just single-line suggestions. (openai.com 1) (openai.com 2) That usage pattern favors token billing for vendors because the meter rises with the amount of context and output each task consumes. It also makes budgeting harder for developers who were used to a fixed price per prompt or per seat. (developers.openai.com) (help.openai.com)

OpenAI is not alone in pricing coding-capable models by tokens. Its API pricing page lists per-million-token rates for GPT models, Anthropic lists token rates for Claude Opus models, and Google publishes token-based Gemini API pricing. (openai.com) (anthropic.com) (ai.google.dev) Anthropic has also bundled Claude Code into premium Team and Enterprise seats, showing that vendors are mixing subscription access with usage-based charges rather than settling on one model. Google, meanwhile, sells Gemini Code Assist in Standard and Enterprise editions while separately publishing token pricing for Gemini APIs. (anthropic.com) (docs.cloud.google.com) (ai.google.dev)

The argument now is less about whether coding models cost money and more about which meter developers can predict: per call, per token, per seat, or per completed task. As code assistants take on larger jobs, the bill increasingly depends on how much code the model has to read before it can write. (developers.openai.com) (openai.com)
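
To make the token meter concrete, here is a minimal Python sketch of how a bill could be estimated when usage is priced in credits per million input, cached input, and output tokens. The rate values, the token counts, and the names `RateCard`, `TaskUsage`, and `estimate_credits` are all hypothetical and chosen only for illustration; they are not OpenAI's published rates or any vendor's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RateCard:
    """Credits charged per one million tokens (hypothetical values)."""
    input_per_million: float
    cached_input_per_million: float
    output_per_million: float


@dataclass(frozen=True)
class TaskUsage:
    """Token counts for one coding task (illustrative numbers only)."""
    input_tokens: int          # uncached text the model reads
    cached_input_tokens: int   # previously seen context served from cache
    output_tokens: int         # text the model writes


def estimate_credits(usage: TaskUsage, rates: RateCard) -> float:
    """Sum each token category at its own per-million rate."""
    total = (
        usage.input_tokens * rates.input_per_million
        + usage.cached_input_tokens * rates.cached_input_per_million
        + usage.output_tokens * rates.output_per_million
    )
    return total / 1_000_000


# Assumed rate card: NOT taken from any vendor's pricing page.
EXAMPLE_RATES = RateCard(
    input_per_million=50.0,
    cached_input_per_million=5.0,
    output_per_million=400.0,
)

# A short question versus a long agentic job that reads a large repository.
short_prompt = TaskUsage(input_tokens=2_000, cached_input_tokens=0, output_tokens=500)
repo_refactor = TaskUsage(
    input_tokens=400_000, cached_input_tokens=1_200_000, output_tokens=60_000
)

if __name__ == "__main__":
    for name, usage in [("short prompt", short_prompt), ("repo refactor", repo_refactor)]:
        print(f"{name}: {estimate_credits(usage, EXAMPLE_RATES):.2f} credits")
```

With these made-up rates, the short prompt comes to 0.3 credits and the repository-wide job to 50 credits. Under a flat per-message price both would have cost the same, which is the budgeting gap developers are describing.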