b.ai touts verifiable pricing, 90% savings
- B.AI is pushing a simple pitch: its LLM gateway mirrors official model rates in credits and shows the cache math instead of hiding markup. - The headline number is real but conditional — B.AI’s docs show 90% savings on cache reads, matching OpenAI and Anthropic discount mechanics. - That matters because AI middleware pricing is getting easier to compare, so “trust us” margins are becoming a weaker sales story.
AI API pricing is turning into a trust problem. Everybody says they make models cheaper, but a lot of platforms bundle routing, convenience, and mystery margin into one bill. B.AI is trying to turn that into a product pitch. Its docs now lean hard on “verifiable” pricing — basically, published pass-through rates in a credits system, plus explicit prompt-caching discounts that can cut repeat input costs by as much as 90%. (docs.b.ai) ### What is B.AI actually selling? B.AI is not pitching a new frontier model. It’s pitching a gateway — one API layer that gives developers access to models from OpenAI, Anthropic, Google, DeepSeek, MiniMax, Zhipu, and others, while settling usage through its own credits ledger. On its site, B.AI describes itself as infrastructure for agents and multi-model access, wit(docs.b.ai) of the package. (b.ai) ### What does “verifiable pricing” mean here? In plain English, it means B.AI publishes a conversion rate and token table instead of asking customers to accept a blended invoice. Its docs say 1 USD equals 1,000,000 credits, then list per-token input, cache write, cache read, and output charges for each supported model. For several models, those numbers line up exactly with first-party pricing pages — for ex(b.ai)input, 0.25 cached input, and 15.00 output credits per token-million equivalent, which matches OpenAI’s published rates after the credit conversion. (docs.b.ai) ### Where does the “90% savings” number come from? Mostly from prompt caching — not from B.AI inventing a cheaper base rate. B.AI’s pricing page uses Claude Sonnet 4.6 as the cleanest example: 1,000 cached tokens cost 3,750 credits on the first write, then 300 credits on later cache reads, which it labels as a 90% savings versus standard input pricing. Anthropic uses t(docs.b.ai)pt caching can cut input token costs by up to 90% too. (docs.b.ai) ### So is B.AI actually cheaper? Sometimes yes, but the important point is why. If B.AI is truly matching official rates, then the savings come from architecture — cache hits, model routing, maybe lower-friction access — rather than secret wholesale discounts. That is a more believable claim. But it also means the big “up to 90%” number only shows up when prompts share(docs.b.ai)s. If your prompts are highly variable, the savings shrink fast. (developers.openai.com) ### Why does prompt structure matter so much? Because caching is picky. OpenAI says cache hits depend on exact prefix matches, with static instructions and examples placed first and changing user data pushed to the end. It also only kicks in for prompts 1,024 tokens or longer. So this is not magic discounting — it rewards a specific app design (developers.openai.com)onstantly changing prompts do not. (developers.openai.com) ### Why make pricing transparency the headline now? Because middleware is getting squeezed. Once model vendors publish cleaner pricing tables and caching rules, any wrapper charging extra has to justify the spread with something concrete — reliability, compliance, analytics, support, or workflow tools. “We simplify access” is still useful, but (developers.openai.com)d that, so it’s turning transparency itself into the feature. (docs.b.ai) ### What’s the catch? The catch is that “no markup” is easier to verify on token prices than on the full bill. B.AI also charges for extras like web search on some models, and real-world costs still depend on output length, cache misses, and which provider you route to. So the cleanest reading is not “B.AI made AI 90% cheaper.” It’s “B.AI is exposing the same discount (docs.b.ai)g not to skim on top.” (docs.b.ai) ### Bottom line? This story is less about a breakthrough discount and more about a changing market norm. AI buyers are starting to demand receipts — real rate cards, visible cache math, and fewer black-box margins. B.AI’s bet is that transparency itself will win customers. That feels plausible — but only if the pass-through promise holds up in actual invoices.