Engineering shops build CI for AI agents
- Crafting’s March 9 launch turned a fuzzy trend into a product: AI coding agents now get production-like sandboxes, tests, credentials, and shipping paths.
- The sharp detail is where the bottleneck moved: not code generation, but validation and deployment inside real enterprise systems with guardrails.
- That matters because enterprise AI is drifting toward services-heavy adoption, where forward-deployed engineers help customers wire agents into messy workflows.

AI coding agents can already write a lot of code. That part is no longer the surprise. The real problem is everything after the first draft: testing, validation, permissions, deployment, and the ugly reality of enterprise systems. That gap is why this story matters. On March 9, Crafting launched Crafting for Agents and paired it with a $5.5 million seed round, basically betting that the next layer of AI software delivery is CI for agents, not just autocomplete for developers. (thenewstack.io)

### What actually changed?

The concrete news is product, not just vibe. Crafting moved its platform from developer environments for humans into what it calls infrastructure for agentic engineering: isolated environments where agents can build, test, validate, iterate, and even ship code with controlled access to real dependencies and credentials. The company said the product is generally available, and the launch came with funding led by Mischief. (thenewstack.io)

### Why isn’t code generation the hard part anymore?

Because enterprise teams already know how to get code out of a model. The catch is proving that code works in the same tangled world where the real app runs, with actual services, staging tiers, network rules, secrets, and compliance boundaries. Crafting’s own pitch is blunt: the bottleneck has shifted from writing code to validating and shipping it, and most […]ed CI/CD pipelines, and cramped sandboxes. (thenewstack.io, crafting.dev)

### Why do sandboxes matter so much?

A normal coding sandbox is like teaching someone to cook in a toy kitchen. They can practice motions, but they are not dealing with the real stove, pantry, or dinner rush. Enterprise software is the same. If an agent cannot touch the right dependencies, data, and services, it cannot really verify anything important. Crafting’s whole argument is that agents need production-like environments with scoped credentials, not sterile demo boxes. (crafting.dev)

### So is this basically CI/CD for agents?

Yes, but with more guardrails and more orchestration. The platform describes separate stages for running code, testing it, validating it in production-like setups, iterating, and then shipping. That sounds a lot like CI/CD, except the actor moving through the pipeline is increasingly an agent instead of only a human engineer. Microsoft and others are framing the same broader shift as an “AI-led SD[…]ity checks become the control system around autonomous builders. (crafting.dev)

### Are there real signs this is happening already?

Some, yes. Crafting says early customer results included 25% more pull requests quarter over quarter without adding headcount, AI-generated code rising from single digits to 70% over 12 months, and engineers saving about 2.5 hours per week on environment setup. Those are company-supplied numbers, so treat them as directional, but they fit the broader pattern: more generated code creates mo[…] (crafting.dev)

### Where do forward-deployed engineers come in?

They show up because enterprise AI still needs custom wiring. Andreessen Horowitz made the case last year that forward-deployed engineers are becoming central in AI startups because customers do not just buy a model; they need someone to connect it to internal databases, APIs, workflows, and business logic. Basically, if agents are going to act inside real software systems, someone has to map the mess first. (a16z.com)

### Why is this showing up in engineering shops now?

Because the market is maturing one layer down. Six to nine months ago, the focus was faster code generation. Now the pain is orchestration, coordination, resource usage, validation, and safe deployment at scale. That is a pretty classic platform shift: the first wave makes creation cheap, then the second wave builds control systems around the chaos. (thenewstack.io)

### Bottom line?

The new bottleneck in AI-assisted software isn’t writing code.
It’s building the harness that lets agents prove the code is safe enough to merge. The shops that win here may not be the ones with the flashiest model layer — they may be the ones that build the agent sandbox, the validation loop, and the services team that makes the whole thing usable. (thenewstack.io)
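
To make the idea of a harness concrete, here is a minimal Python sketch of a gated loop: an agent's change moves through a sequence of checks, and when one fails the agent revises and starts again until it ships or runs out of attempts. Every name here (`Change`, `Gate`, `run_pipeline`, `revise`) is hypothetical and invented for illustration. This is not Crafting's API, and the checks are stand-in lambdas rather than real tests or production-like validation.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Change:
    """A candidate change proposed by an agent (hypothetical type)."""
    description: str
    revision: int = 0


@dataclass
class Gate:
    """One pipeline stage: a name plus a check that passes or fails."""
    name: str
    check: Callable[[Change], bool]


@dataclass
class Result:
    shipped: bool
    attempts: int
    log: List[str] = field(default_factory=list)


def run_pipeline(
    change: Change,
    gates: List[Gate],
    revise: Callable[[Change, str], Change],
    max_attempts: int = 3,
) -> Result:
    """Run every gate in order; on the first failure, ask the agent to revise."""
    log: List[str] = []
    for attempt in range(1, max_attempts + 1):
        failed_gate = None
        for gate in gates:
            if gate.check(change):
                log.append(f"attempt {attempt}: {gate.name} passed")
            else:
                log.append(f"attempt {attempt}: {gate.name} failed")
                failed_gate = gate.name
                break  # later gates never see a change that failed earlier
        if failed_gate is None:
            log.append(f"attempt {attempt}: shipped")
            return Result(shipped=True, attempts=attempt, log=log)
        change = revise(change, failed_gate)
    return Result(shipped=False, attempts=max_attempts, log=log)


# Stand-in checks: the first draft runs but fails its tests; revision 1 passes.
gates = [
    Gate("run", lambda c: True),
    Gate("test", lambda c: c.revision >= 1),
    Gate("validate", lambda c: c.revision >= 1),
]


def revise(change: Change, failed_gate: str) -> Change:
    """Stand-in for the agent iterating after a failed gate."""
    return Change(change.description, change.revision + 1)


result = run_pipeline(Change("fix login bug"), gates, revise)
for line in result.log:
    print(line)
```

The design point of the sketch is that nothing ships unless every gate passes in the same attempt, and the attempt cap keeps an agent from looping forever. In a real setup each gate would run inside an isolated environment with scoped credentials.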