Seoul Economic Daily: output tokens 6x

- OpenAI, Anthropic, and Nvidia are leaning harder into token metering, and Seoul Economic Daily says the new center of gravity is charging for every generated word.
- The striking detail is the spread: OpenAI lists GPT-5.5 at $5 per million input tokens and $30 per million output tokens, a 6x gap.
- That shifts the AI race from just bigger models to cheaper inference, especially for output-heavy workloads, agents, and chat products.

AI pricing is getting less abstract. That is the real story here. The big model companies are no longer just selling “access to AI” in a fuzzy SaaS bundle. They are spelling out exactly what costs money, and the expensive part is increasingly the model’s answer, not the prompt. OpenAI’s current pricing page shows that clearly, and Anthropic’s does too. Seoul Economic Daily’s point is basically that Silicon Valley is turning token pricing into the business model itself, not just the billing unit.

### What is a token, really?

A token is the chunk of text the model reads or writes. Not exactly a word, but close enough for intuition. Your prompt burns input tokens. The model’s reply burns output tokens. That split matters because the two sides are now priced very differently, which turns product design into an economics problem as much as a software problem.

### Where does the “6x” come from?

OpenAI’s official API pricing page makes the ratio obvious on some flagship models. (en.sedaily.com) GPT-5.5 is listed at $5 per 1 million input tokens and $30 per 1 million output tokens. Anthropic shows a similar pattern: Claude Opus 4.7 is $5 per million input tokens and $25 per million output tokens, while Sonnet 4.6 is $3 and $15. So the exact multiple varies by model, but “output costs several times more” is not a metaphor anymore. It is the posted price card. (openai.com)

### Why is output more expensive?

Because generating tokens is the costly part of inference. The model has to keep producing the next token, one step at a time, while holding the whole conversational state together. Reading a prompt is expensive too, but long answers, chain-of-thought-heavy tasks, and agent loops can turn output into the real budget killer. That is why a product that looks cheap in demos can get pricey in production, especially if users expect long, polished responses. (openai.com)

### Why does this matter now?

Because AI companies need cleaner monetization.
For a while, the market rewarded growth, model launches, and benchmark wins. But token pricing ties revenue directly to usage. More prompts mean more billable input. More generated text means even more billable output. That makes inference economics visible to customers and to investors at the same time. (en.sedaily.com)

### What changes for builders?

Prompting starts to look like cost engineering. Teams will try to shrink outputs, cap verbosity, cache repeated context, and route easy tasks to cheaper models. They will also push more work to retrieval, rules, or local models before asking a frontier model to write a long answer. Basically, every unnecessary sentence becomes a tiny billable event. (en.sedaily.com)

### Why does Nvidia show up in this story?

Because once pricing is metered per token, hardware efficiency becomes easier to sell. If customers are paying for generated tokens, then faster and cheaper inference infrastructure matters more. Seoul Economic Daily frames Nvidia’s message in exactly that direction: not just “buy GPUs,” but “buy the machinery that lowers token economics.” That also helps explain the growing interest in smaller models, mixture-of-experts designs, and on-device or hybrid setups that avoid expensive cloud output when they can. (openai.com)

### Does this change the AI race?

Yes, a bit. The competition is no longer only about who has the smartest model. It is also about who can deliver useful answers with fewer generated tokens, lower latency, and less compute burn. A model that is slightly worse but much cheaper to run can win real workloads. That is the quiet shift underneath the headline. (en.sedaily.com)

### Bottom line

The important change is not that token pricing exists. It has existed. The change is that the spread between input and output is now explicit enough to shape product strategy, hardware demand, and who actually makes money in AI. Output is becoming the meter that matters most. (en.sedaily.com)
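
To make the price spread concrete, here is a rough sketch of the arithmetic in Python. The prices are the per-million-token figures quoted in this article, not independently verified. The function names and the sample request size (a 2,000-token prompt with a 1,000-token reply) are assumptions for illustration only.

```python
# List prices in USD per 1 million tokens, copied from the figures quoted in
# this article. Treat them as illustrative; check the vendors' pricing pages.
PRICES_PER_MILLION = {
    # model name: (input price, output price)
    "GPT-5.5": (5.00, 30.00),
    "Claude Opus 4.7": (5.00, 25.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
}


def output_to_input_ratio(model):
    """How many times more an output token costs than an input token."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return output_price / input_price


def request_cost(model, input_tokens, output_tokens):
    """Estimated USD cost of one request at the list prices above."""
    input_price, output_price = PRICES_PER_MILLION[model]
    total = input_tokens * input_price + output_tokens * output_price
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 2,000 prompt tokens in, 1,000 reply tokens out.
    for model in PRICES_PER_MILLION:
        ratio = output_to_input_ratio(model)
        cost = request_cost(model, input_tokens=2_000, output_tokens=1_000)
        print(f"{model}: output is {ratio:.0f}x input, ${cost:.4f} per request")
```

Under these assumptions, the GPT-5.5 request costs about $0.04, and roughly $0.03 of that comes from the reply even though it is half the length of the prompt. That is why capping output length tends to move the bill more than trimming the prompt does.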
