Talk: humans steer, agents execute

A talk titled 'Harness Engineering: How to Build Software When Humans Steer, Agents Execute' outlined a model where humans define intent and constraints while agents carry out bounded tasks like scaffolding and repetitive test generation. The presentation described agents handling refactors and boilerplate while humans retain architecture decisions and quality control. (youtube.com)

A recent OpenAI talk argued that software teams should have humans set direction and rules while coding agents handle bounded execution. (youtube.com) The talk, posted April 17, 2026, featured Ryan Lopopolo of OpenAI and pointed viewers to an OpenAI essay published February 11, 2026. In that essay, Lopopolo said his team spent five months building an internal beta product with “0 lines of manually-written code.” (youtube.com) (openai.com) OpenAI said Codex wrote the application logic, tests, continuous integration configuration, documentation, observability, and internal tooling. The company estimated the product was built in about one-tenth the time hand-coding would have taken. (openai.com) The basic idea is to move engineers up a level. Instead of typing every function, humans define the task, limits, and review process, while the agent does narrow jobs such as scaffolding a repository, generating repetitive tests, and iterating on pull requests. (openai.com) OpenAI described that shift as a change in the engineer’s job from writing code to “design environments, specify intent, and build feedback loops.” The company said early progress was slow until the team made the development environment more explicit for the agent. (openai.com) The numbers in OpenAI’s write-up were specific: the first commit landed in late August 2025, the codebase grew to about 1 million lines, and roughly 1,500 pull requests were opened and merged. OpenAI said three engineers initially drove the system, and throughput later rose as the team expanded to seven engineers. (openai.com) The “harness” in harness engineering is the control system around the model, not the model itself. OpenAI said it used an AGENTS.md file as a short map into repository docs, rather than a giant instruction manual, so agents could find the right rules without flooding their context window. (openai.com) OpenAI also said it wired the Chrome DevTools Protocol into the agent runtime so Codex could inspect pages, take screenshots, navigate the app, reproduce bugs, and verify fixes. For logs and metrics, the company said agents queried local observability stacks with LogQL and PromQL inside isolated worktrees. (openai.com) That setup leaves architecture and quality control with people, even when agents do most of the keystrokes. OpenAI said humans could review pull requests but increasingly pushed review loops to agents, while keeping repository structure, constraints, and source-of-truth docs under human control. (openai.com) The thread running through the talk and the essay was simple: if agents are going to write more code, the scarce resource is no longer typing. OpenAI’s claim was that the bottleneck becomes human attention, and the work shifts to deciding what should be built and how the system checks itself. (youtube.com) (openai.com)

Talk: humans steer, agents execute

Get your own daily briefing