OpenAI Pushes Global Expansion, Touts NVIDIA Gains
OpenAI CEO Sam Altman is promoting a "Democratic AI" approach while pursuing major expansion in India, which he hailed as a "full stack AI leader." This comes as NVIDIA announced it nearly doubled the output of OpenAI's GPT OSS-120B model through joint software-hardware optimization efforts.
- The "Democratic AI" approach is central to Sam Altman's global tour, on which he advocates international collaboration on AI safety and governance so that the benefits are widely shared. He presents the initiative as a contrast to potential authoritarian control of AI by nations like China and Russia. However, OpenAI has also lobbied to reduce the regulatory burden on its own systems, for example under the E.U.'s AI Act.
- OpenAI's expansion into India includes opening an office in New Delhi and plans for a massive 1-gigawatt data center, part of the larger $500 billion "Stargate" AI infrastructure project backed by SoftBank and Oracle. The move is designed to serve India, OpenAI's second-largest user base, and to address data sovereignty concerns.
- The India strategy aligns with the government's ₹10,371 crore ($1.25 billion) IndiaAI Mission, which aims to build sovereign compute capacity and multilingual AI applications. OpenAI describes a three-pillar strategy for India: ensuring broad access to AI tools, driving adoption in sectors like education and healthcare, and building "AI literacy."
- GPT OSS-120B is a 117-billion-parameter Mixture-of-Experts (MoE) open-weight model with 5.1 billion parameters active per token during inference. It uses MXFP4 quantization for its MoE weights, allowing the entire model to run on a single 80GB GPU such as an NVIDIA H100.
- The performance gains with NVIDIA come from libraries like TensorRT-LLM, which compiles the model into an optimized engine for a specific GPU architecture, such as Hopper or Blackwell. The process involves techniques like kernel fusion, low-precision inference (FP8, INT8), and advanced parallelism to boost throughput and reduce latency.
- For GPT OSS-120B, performance optimization often requires parallelizing across multiple GPUs, using either Tensor Parallelism for lower latency or Expert Parallelism for higher throughput. Further gains are being explored through techniques like speculative decoding.
- India is home to the largest population of students using ChatGPT globally, prompting OpenAI to launch the "OpenAI Learning Accelerator," an initiative to distribute roughly half a million ChatGPT licenses to educators and students.
- Altman's global engagement tour has included meetings with numerous world leaders, including French President Emmanuel Macron and Indian Prime Minister Narendra Modi, to shape the global conversation on AI regulation.
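
The single-GPU claim for GPT OSS-120B can be sanity-checked with a rough weight-memory estimate. The sketch below is a back-of-the-envelope calculation, not OpenAI's or NVIDIA's method. It assumes the standard MXFP4 layout (4-bit elements plus one 8-bit scale shared by each block of 32 values, or 4.25 bits per parameter) and 16-bit storage for all non-MoE weights. The share of parameters that sit in MoE layers is not given in this article, so the values swept here are illustrative assumptions, and the function names are hypothetical.

```python
def mxfp4_bits_per_param(block_size: int = 32,
                         element_bits: int = 4,
                         scale_bits: int = 8) -> float:
    """Effective bits per parameter: 4-bit elements plus a shared per-block scale."""
    return element_bits + scale_bits / block_size


def estimate_weight_gb(total_params: float,
                       moe_fraction: float,
                       other_bits: float = 16.0) -> float:
    """Rough weight footprint in GB (10^9 bytes).

    moe_fraction is the assumed share of parameters stored in MXFP4;
    the remainder is assumed to be kept at `other_bits` per parameter.
    """
    moe_params = total_params * moe_fraction
    other_params = total_params - moe_params
    total_bits = moe_params * mxfp4_bits_per_param() + other_params * other_bits
    return total_bits / 8 / 1e9


TOTAL_PARAMS = 117e9   # parameter count reported in the article
GPU_MEMORY_GB = 80     # e.g. a single NVIDIA H100

# moe_fraction = 0.0 corresponds to an unquantized 16-bit model.
for assumed_moe_fraction in (0.0, 0.90, 0.95, 0.98):
    gb = estimate_weight_gb(TOTAL_PARAMS, assumed_moe_fraction)
    verdict = "fits" if gb < GPU_MEMORY_GB else "does not fit"
    print(f"MoE share {assumed_moe_fraction:.2f}: ~{gb:.1f} GB of weights, {verdict} in {GPU_MEMORY_GB} GB")
```

The estimate covers weights only. KV cache and activations also consume GPU memory, so the real headroom on an 80GB card is smaller than these figures suggest.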