OpenAI's True Operating Costs Investigated
Investigative reports highlight the significant compute and infrastructure costs behind free services like ChatGPT, which are often subsidized by venture capital. OpenAI CEO Sam Altman predicts these costs will fall dramatically, but the current economics are driving enterprises to scrutinize their AI stack for efficiency.
- While training costs for models like GPT-4 are a significant one-time expense, estimated to be over $100 million, the daily operational or "inference" costs for running the service are the larger, recurring challenge. In 2023, the daily cost to run ChatGPT was estimated at approximately $700,000, a figure that is now considered conservative.
- The economics of AI are bifurcated into training (a capital expenditure-like cost) and inference (an ongoing operational expenditure). For every dollar spent on training, it is estimated that many more are spent on inference over the model's lifetime, with costs scaling directly with user engagement.
- Hyperscalers are increasingly designing their own custom silicon to optimize for cost and performance, creating a competitive "build vs. buy" landscape. Amazon's AWS, for example, offers Trainium chips for training and Inferentia chips for inference, which they claim provide significantly higher throughput and lower cost per inference compared to general-purpose GPUs.
- AWS has deployed nearly 500,000 of its custom Trainium2 chips to train Anthropic's Claude models, demonstrating a large-scale alternative to NVIDIA-based infrastructure. This custom silicon strategy is part of a broader trend of vertical integration to control the full AI stack, from chip design to cloud service delivery.
- The venture capital landscape for AI hardware is robust, with U.S. semiconductor startups raising a record $6.2 billion in 2025. Notable deals include AI chipmaker Cerebras Systems raising $1.1 billion and Toronto-based startup Taalas, which designs custom chips for specific AI models, raising $169 million.
- A circular investment pattern has emerged where major tech companies and chipmakers like NVIDIA and Microsoft invest in AI startups, who then often use the capital to purchase chips or cloud computing services from their investors.
- Energy consumption is a significant and growing component of operating costs, with the power needed for AI computation doubling roughly every 100 days. U.S. data centers for AI are projected to consume approximately 88 terawatt-hours annually by 2030, which is 1.6 times the electricity consumption of New York City in 2023.
- Despite high current costs, the price of AI inference has been dropping dramatically. Sam Altman has stated that the cost of AI usage is falling by a factor of 10 every year, a trend that could significantly alter the economic landscape for AI applications.
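
The arithmetic behind several of these figures can be sketched in a few lines of Python. This is a rough back-of-the-envelope illustration, not a cost model: the inputs are the estimates quoted above ($100 million is treated as a lower bound on training cost, and $700,000 per day is the 2023 inference estimate), the function names are hypothetical, and it assumes costs stay flat and that the quoted doubling and decline rates hold steadily over a full year.

```python
import math

# Figures quoted in the points above; both are rough public estimates.
TRAINING_COST_USD = 100_000_000   # "over $100 million", used here as a lower bound
DAILY_INFERENCE_USD = 700_000     # 2023 estimate of the daily cost to run ChatGPT


def annual_inference_cost(daily_usd: float, days: int = 365) -> float:
    """Annualize a flat daily inference cost (assumes no growth in usage)."""
    return daily_usd * days


def days_until_inference_matches_training(training_usd: float, daily_usd: float) -> int:
    """First whole day on which cumulative inference spend reaches the training cost."""
    return math.ceil(training_usd / daily_usd)


def annual_growth_from_doubling(doubling_days: float, days: int = 365) -> float:
    """Yearly growth multiple implied by doubling every `doubling_days` days."""
    return 2 ** (days / doubling_days)


def cost_after_years(initial_cost: float, years: float, yearly_factor: float = 10.0) -> float:
    """Cost after `years` if it falls by `yearly_factor` each year."""
    return initial_cost / yearly_factor ** years


if __name__ == "__main__":
    print(annual_inference_cost(DAILY_INFERENCE_USD))
    print(days_until_inference_matches_training(TRAINING_COST_USD, DAILY_INFERENCE_USD))
    print(round(annual_growth_from_doubling(100), 2))
    print(cost_after_years(1.0, 3))
```

Under these assumptions, the 2023 daily estimate annualizes to about $255.5 million, so cumulative inference spend would match a $100 million training run in roughly 143 days. A 100-day doubling time implies power demand growing about 12.5 times per year, while a tenfold yearly price decline would cut the cost of a fixed workload to one thousandth of its starting value after three years.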