Meta Cuts Stock Awards to Fund AI Push
Meta Platforms is cutting employee stock awards by 5% for the second consecutive year to help fund a massive AI infrastructure investment. The company is deploying millions of Nvidia Grace CPUs for a new standalone inference architecture designed for generative AI and recommendation workloads. This AI spending is part of a strategy that has seen a total investment tracking a $135 billion "S-curve," with up to $65 billion spent in 2025 alone, according to reports.
- This architectural shift is one of the first large-scale deployments of standalone, Arm-based CPUs for AI inference, moving away from traditional x86 processors for these specific workloads. The explicit goal is to optimize the performance-per-watt for serving recommendation models and agentic AI products, where the economics of cost-per-query are more critical than raw training speed. - The strategy highlights a growing MLOps trend of creating divergent hardware paths for training and inference. While massive GPU clusters handle model training, this new architecture uses a more cost-effective CPU-based approach for production inference, which prioritizes low-latency and predictable performance for live user requests. - Meta is also developing its own custom silicon, the MTIA 2i chip, which is specifically optimized for the inference workloads of its recommendation models. This dual approach of a deep Nvidia partnership and in-house chip development aims to reduce the total cost of ownership and mitigate risks from unpredictable GPU supply. - The massive investment is being rewarded by investors because its AI-driven recommendation systems are generating immediate returns, increasing ad impressions by 18% and the average price per ad by 6% in a recent quarter. This direct monetization of AI infrastructure is a key reason Wall Street is supporting the high capital expenditures. - In comparison, Google is focusing on vertical integration with its custom Tensor Processing Units (TPUs). Its AI Hypercomputer architecture is designed as a fully integrated system of hardware and software, including the latest "Ironwood" TPUs, giving Google end-to-end control over its AI stack. - The partnership with Nvidia also includes adopting the Spectrum-X Ethernet networking platform, which is critical for interconnecting the massive AI clusters. This high-speed networking fabric is essential for maintaining low-latency and high-throughput communication between thousands of processors during large-scale inference tasks. - This level of spending is part of a broader FAANG trend, with