OpenAI pricing split
BenchLM’s April summary shows that OpenAI API pricing now differs across GPT-5.4, GPT-5.2, and GPT-5.1, with separate rates for input and cached input and a discount for the Batch API. (benchlm.ai) The write-up frames model selection as an application-architecture and cost-management decision when adding AI features to products. (benchlm.ai)
OpenAI’s API pricing page now lists GPT-5.4, GPT-5.2, and GPT-5.1 as separate model tiers with different token rates. (openai.com) The page lists GPT-5.4 at $2.50 per 1 million input tokens, $0.25 per 1 million cached input tokens, and $15.00 per 1 million output tokens, and shows batch processing at half price for supported workloads. (openai.com)
OpenAI’s model pages list GPT-5.2 at $1.75 per 1 million input tokens and $14.00 per 1 million output tokens, and describe it as the “previous frontier model for professional work.” The comparison table on the GPT-5.4 model page places GPT-5.2 below GPT-5.4 on price while still positioning GPT-5.4 as the recommended upgrade. (developers.openai.com 1) (developers.openai.com 2)
BenchLM’s April 13, 2026 pricing roundup says many comparison tables miss two details that change the math: cached prompts can cut repeated input costs to 10% of the standard rate, and the Batch API cuts input and output prices by 50%. Its table presents GPT-5.4, GPT-5.2, and GPT-5.1 as distinct cost choices rather than one generic “GPT-5” line item. (benchlm.ai)
That turns model choice into an engineering decision as much as a model-quality decision. A team building a chat product with repeated system prompts, or a back-office workflow that can wait up to 24 hours, can pay very different rates from a team sending every request live at standard prices. (platform.openai.com) (benchlm.ai)
OpenAI’s Batch API documentation says batched jobs are processed asynchronously within 24 hours for a 50% discount. OpenAI’s flex processing guide adds that slower, lower-priority jobs are priced at batch rates and can stack additional savings from prompt caching. (platform.openai.com) (developers.openai.com)
The pricing split also comes with technical boundaries. OpenAI says GPT-5.4 and GPT-5.4 pro have a 1.05 million-token context window, but prompts above about 272,000 input tokens trigger higher pricing for the full session under standard, batch, and flex modes. (developers.openai.com 1) (developers.openai.com 2)
OpenAI’s developer pricing page also shows a 10% surcharge on GPT-5.4 family models for regional processing, which it calls data residency. That means the final bill can depend on where requests are processed as well as which model a developer picks. (developers.openai.com)
The result is a pricing menu that is no longer just “best model versus cheapest model.” OpenAI’s own pages now break cost into standard, cached, batch, flex, long-context, and regional-processing cases, and BenchLM’s April summary argues developers need to design around those levers before they ship new AI features. (developers.openai.com) (benchlm.ai)
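To make the arithmetic concrete, here is a minimal Python sketch of how those levers could combine for a single workload. It uses only the GPT-5.4 figures quoted above ($2.50 input, $0.25 cached input, and $15.00 output per 1 million tokens, a 50% batch discount, and a 10% regional surcharge). The names `Rates` and `estimate_cost` are hypothetical and not part of any OpenAI library. The sketch also assumes, as a simplification, that the batch discount applies evenly to cached input as well, and it leaves out long-context pricing because the figures above do not give that rate.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class Rates:
    """Prices in US dollars per 1 million tokens (hypothetical helper type)."""
    input: float
    cached_input: float
    output: float


# GPT-5.4 rates as quoted in the article.
GPT_5_4 = Rates(input=2.50, cached_input=0.25, output=15.00)

BATCH_MULTIPLIER = 0.5      # 50% batch discount, per the article
REGIONAL_MULTIPLIER = 1.10  # 10% data-residency surcharge, per the article


def estimate_cost(rates, input_tokens, cached_tokens, output_tokens,
                  batch=False, regional=False):
    """Estimate the dollar cost of a workload.

    cached_tokens is the portion of input_tokens billed at the cached rate.
    Assumption: the batch discount applies to every component, including
    cached input. Long-context pricing is not modeled.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    cost = (
        uncached_tokens * rates.input
        + cached_tokens * rates.cached_input
        + output_tokens * rates.output
    ) / TOKENS_PER_MILLION
    if batch:
        cost *= BATCH_MULTIPLIER
    if regional:
        cost *= REGIONAL_MULTIPLIER
    return cost


# Example workload: 1M input tokens and 100k output tokens.
live = estimate_cost(GPT_5_4, 1_000_000, 0, 100_000)
cached = estimate_cost(GPT_5_4, 1_000_000, 800_000, 100_000)
batched = estimate_cost(GPT_5_4, 1_000_000, 0, 100_000, batch=True)

print(f"live, no cache:   ${live:.2f}")     # $4.00
print(f"80% cached input: ${cached:.2f}")   # $2.20
print(f"batch, no cache:  ${batched:.2f}")  # $2.00
```

In this illustrative workload, caching 80% of the input or moving the job to batch processing each cuts the estimate roughly in half, which is the kind of gap the BenchLM roundup says generic comparison tables miss.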