Hard, long experiment with Claude

A principal engineer logged 120 hours using Claude and Codex across an 80k‑line codebase and reported that architecture docs, review loops and judgment mattered more than raw model speed. The thread framed LLMs as amplifiers of engineering decision‑making rather than replacements, and stressed operational practices for integrating generative tools. (x.com)

A principal engineer said 120 hours with Claude and Codex on an 80,000-line codebase changed less about coding speed than about how work gets reviewed and decided. (x.com)

In the post, the engineer said architecture documents, repeated review loops and human judgment did more to improve results than raw model speed. He described the work as a long test across a real production-sized repository, not a short benchmark. (x.com)

Claude Code and Codex are both agent-style coding tools: they read files, edit code, run commands and check outputs inside a development workflow. Anthropic says Claude Code works through a loop of gathering context, taking action and verifying results, and OpenAI has made Codex available alongside Claude in GitHub’s Agent HQ preview. (code.claude.com, github.blog)

That framing cuts against the sales pitch that the fastest model wins. Anthropic’s own documentation tells users to manage context carefully because long sessions fill the model’s working memory fast and performance drops as more files, messages and command output pile up. (code.claude.com)

The practical lesson is old software discipline in new packaging. If a team has weak architecture notes, unclear ownership and no review process, an agent that can type and run tests faster will still inherit those gaps. (x.com, code.claude.com)

That is also why engineers increasingly split work between tools instead of treating one model as a full replacement. GitHub said in February 2026 that both Claude and Codex were being offered in public preview in Agent HQ, a sign that major developer platforms expect teams to mix agents inside one workflow. (github.blog)

Anthropic markets Claude Code as a terminal and desktop tool that can inspect a codebase, edit files, run tests and show visual diffs. Its pricing page says the product is bundled into Pro, Team, Max and Enterprise plans, which puts it directly into day-to-day engineering operations rather than one-off experiments. (claude.com)

The engineer’s account also fits how these systems actually work in large repositories. Anthropic says Claude Code is an “agentic coding tool” that depends on context management and tool use around the model, which means the surrounding process can matter as much as the model itself. (code.claude.com)

After 120 hours, the reported takeaway was narrower than “artificial intelligence replaces engineers.” The claim was that good documents, tight review loops and experienced judgment still set the ceiling, and the models mostly raise the volume. (x.com)

