OpenAI Details Architecture for "Agentic" AI Systems

OpenAI has published the architecture for its Codex App Server, signaling a shift toward more capable "agentic" AI. The unified architecture is designed to enable AI agents to act across multiple modalities and environments, from cloud platforms to edge devices. This approach aims to create tightly coupled, context-aware systems where AI is integrated into various user interfaces by default.

- The Codex App Server functions as a bidirectional JSON-RPC API, decoupling the agent's core logic from various user interfaces like CLIs, IDE extensions, and web apps. This allows clients to communicate with a single, persistent "harness" that manages tasks like state management, authentication, and tool execution.
- To structure the complex interactions of agentic systems, the architecture defines three core conversation primitives: an "Item" as the atomic unit of input/output, a "Turn" which groups a sequence of items from a single agent action, and a "Thread" which is the durable container for an entire session.
- This architecture is a key component of OpenAI's broader strategy for agentic AI, which also includes the Agents SDK, a toolkit for developers to build agent-based applications with features like safety guardrails, context management, and tracing for debugging.
- OpenAI recently hired Peter Steinberger, the creator of the popular open-source desktop agent OpenClaw, to lead its personal agent development, signaling a push to create agents that can interact with desktop environments and applications.
- The move toward agentic AI is enabled by advances in multimodal models like GPT-4o (the "o" stands for "omni"), which can natively process text, images, and voice in a single model, a capability also being pursued by competitors like Google with Gemini and Anthropic with Claude.
- By targeting edge devices, this architecture taps into a major industry trend of moving AI processing away from the cloud to reduce latency, lower bandwidth costs, and improve privacy for applications in robotics, IoT, and automotive systems.
- The App Server's design supports multiple deployment models, including running as a local child process for IDE extensions like VS Code or being launched inside a container for web clients, which ensures the agent's state is preserved even if a browser tab is closed.
- OpenAI's approach with a specific protocol for its Codex harness complements broader industry efforts like the Agent Client Protocol (ACP), which aims to create a universal standard for connecting any coding agent to any editor, similar to how the Language Server Protocol standardized language tooling.
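
To make the Item/Turn/Thread primitives and the bidirectional JSON-RPC flow described above more concrete, here is a minimal Python sketch. It is illustrative only: the class fields, the method names "turn/start" and "item/completed", and the parameter keys are assumptions made for this example, not the actual Codex App Server schema. It relies only on the general JSON-RPC 2.0 convention that a request carries an "id" and expects a response, while a notification carries no "id".

```python
import json
from dataclasses import dataclass, field
from itertools import count

# NOTE: every name here (fields, "turn/start", "item/completed", param keys)
# is a hypothetical stand-in, not the real Codex App Server protocol.


@dataclass
class Item:
    """Atomic unit of input/output, e.g. a user message or a tool call."""
    kind: str
    content: str


@dataclass
class Turn:
    """Groups the items that belong to a single agent action."""
    turn_id: int
    items: list[Item] = field(default_factory=list)


@dataclass
class Thread:
    """Durable container for an entire session."""
    thread_id: str
    turns: list[Turn] = field(default_factory=list)

    def start_turn(self) -> Turn:
        turn = Turn(turn_id=len(self.turns) + 1)
        self.turns.append(turn)
        return turn


_request_ids = count(1)


def make_request(method: str, params: dict) -> str:
    """JSON-RPC 2.0 request: has an id, so the sender expects a response."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": next(_request_ids), "method": method, "params": params}
    )


def make_notification(method: str, params: dict) -> str:
    """JSON-RPC 2.0 notification: no id, so no response is expected."""
    return json.dumps({"jsonrpc": "2.0", "method": method, "params": params})


def run_demo() -> tuple[Thread, list[str]]:
    """Simulate one turn: a client request followed by server notifications."""
    thread = Thread(thread_id="thread-1")
    wire: list[str] = []

    # Client -> server: the user's input opens a new turn.
    turn = thread.start_turn()
    user_item = Item("user_message", "Rename foo to bar")
    turn.items.append(user_item)
    wire.append(
        make_request("turn/start", {"threadId": thread.thread_id, "input": user_item.content})
    )

    # Server -> client: the harness streams each item the agent produces.
    for item in (Item("tool_call", "grep foo"), Item("agent_message", "Renamed 3 occurrences")):
        turn.items.append(item)
        wire.append(
            make_notification(
                "item/completed",
                {
                    "threadId": thread.thread_id,
                    "turnId": turn.turn_id,
                    "item": {"kind": item.kind, "content": item.content},
                },
            )
        )
    return thread, wire


if __name__ == "__main__":
    _, messages = run_demo()
    for message in messages:
        print(message)
```

In this sketch the thread object outlives any single client connection, which loosely mirrors the summary's point that state lives in the harness rather than in the CLI, IDE extension, or browser tab that happens to be attached.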

Shared from Scout - Be the smartest in the room.