Oracle + NVIDIA expand AI stack
Oracle Cloud Infrastructure and NVIDIA broadened accelerated computing support so customers can run NVIDIA AI Enterprise, RAPIDS for data preparation, and NeMo Retriever RAG pipelines integrated with Oracle Database 23ai on Kubernetes (x.com). The change packages data preparation, retrieval, and model services closer to the database layer, streamlining cloud-native AI workflows for enterprises running Kubernetes (x.com).
Most company artificial intelligence projects break in the same place: the model is not the hard part, the data plumbing is. A document has to be cleaned, turned into vectors, stored somewhere searchable, and fetched fast enough that a chatbot does not feel like it is thinking through molasses. (oracle.com)
Oracle and NVIDIA are trying to collapse that plumbing into one lane inside Oracle Cloud Infrastructure. Their latest expansion puts NVIDIA AI Enterprise, NVIDIA RAPIDS for accelerated data preparation, and NVIDIA NeMo Retriever for retrieval-augmented generation pipelines alongside Oracle Database 23ai and Oracle Kubernetes Engine. (oracle.com)
Retrieval-augmented generation is the trick that keeps a model from guessing. Instead of asking a model to answer from memory, the system first searches company documents, pulls back the relevant passages, and then asks the model to answer with that material in view. (docs.oracle.com)
A vector database is the shelf that makes that search work. Oracle Database 23ai stores vectors inside the database, so a company can keep ordinary business records and the mathematical fingerprints of documents in the same system instead of splitting them across separate tools. (oracle.com)
NVIDIA NeMo Retriever handles the messy part at the front of the line. NVIDIA says its extraction tools can pull text, tables, charts, and images out of complex documents, and its Retriever stack is designed to feed those results into retrieval systems for enterprise questions and answers. (nvidia.com, nvidia.com)
NVIDIA RAPIDS works one step earlier, during data preparation. It uses graphics processing units instead of only central processing units to speed up data science and analytics jobs, which is useful when a company needs to process large piles of logs, spreadsheets, or transaction data before any model sees them. (nvidia.com, oracle.com)
Kubernetes is the software layer that keeps containers running across many servers, like an air-traffic controller for application pieces. Oracle Kubernetes Engine is Oracle’s managed version, and Oracle’s NVIDIA page now pitches these artificial intelligence workflows as something customers can deploy there rather than stitching together by hand. (oracle.com)
This did not start today. In March 2025, Oracle and NVIDIA announced native integration of NVIDIA AI Enterprise with Oracle Cloud Infrastructure and tied NVIDIA inference software into Oracle’s artificial intelligence services, which set up the current move toward a tighter database-centered stack. (nvidianews.nvidia.com, oracle.com)
Oracle had already been showing the pieces in smaller form. Oracle published a 23ai and NeMo Retriever architecture for retrieval pipelines, and NVIDIA published a 2024 technical walkthrough showing Oracle Database generative artificial intelligence workloads accelerated with NVIDIA inference microservices and vector search libraries. (oracle.com, developer.nvidia.com)
The new part is the packaging. Oracle is selling the idea that data preparation, retrieval, inference software, and the database should sit close together on the same cloud platform, so an enterprise team can build a question-answering system without managing four vendors and a maze of connectors. (oracle.com, nvidia.com)
That is a very specific bet on where enterprise artificial intelligence spending is going. The winners may not be the companies with the flashiest chatbot demo, but the companies that make the boring middle layers (ingest, search, orchestration, and security) feel like one product instead of a weekly integration problem. (oracle.com, nvidia.com)
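For readers who want to see the retrieve-then-answer flow described earlier in concrete form, here is a minimal Python sketch. It is a toy under stated assumptions: the word-count `embed` function stands in for a real embedding model, the in-memory `DOCS` list stands in for a vector store such as Oracle Database 23ai, and every function name is hypothetical rather than taken from Oracle or NVIDIA software.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': counts of lowercase words (assumption, not a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, documents, k=2):
    """Step 1: search the documents and pull back the k most similar passages."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, passages):
    """Step 2: put the retrieved passages in front of the model with the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the passages below.\n"
        f"{context}\n"
        f"Question: {question}"
    )


# Hypothetical company documents, invented for illustration only.
DOCS = [
    "Refunds are issued within 14 days of a return request.",
    "The office is closed on public holidays.",
    "Return requests must include the original order number.",
]

if __name__ == "__main__":
    question = "How long do refunds take after a return request?"
    passages = retrieve(question, DOCS, k=2)
    # The resulting prompt is what would be sent to a language model.
    print(build_prompt(question, passages))
```

In a production pipeline the same two steps remain, but the embedding, storage, and ranking would be handled by dedicated components rather than a Python list and word counts.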