Podcast: Redefine 'Done' for Engineering Impact
A recent episode of *The CTO Playbook* argues that engineering teams often limit their business impact by having a vague "definition of done." The host suggests that many teams optimize for activity, like writing code, rather than outcomes. The podcast advocates defining "done" as the point when work is in production, users are engaging with it, and telemetry is flowing, ensuring clear accountability and preventing projects from stalling.
- The concept of a "Definition of Done" (DoD) originated with early Agile methodologies like Scrum and Extreme Programming (XP) around 2002 to create a shared understanding of what it means for work to be complete. Initially, it focused on development-centric tasks like code being written, refactored, and passing acceptance tests. However, the rise of DevOps has pushed teams to expand their DoD to include operational readiness, such as successful integration into the main branch and satisfactory feedback from continuous integration systems.
- Frameworks like DORA (DevOps Research and Assessment) provide specific, outcome-oriented metrics that modern engineering teams use to build a more impactful Definition of Done. These metrics include Deployment Frequency, Mean Lead Time for Changes, Mean Time to Recover, and Change Failure Rate, shifting the focus from developer activity to software delivery performance.
- For CTOs at scaling companies, leadership frameworks like CTO Levels offer a structured approach to evolving engineering practices, including the Definition of Done. This framework maps a CTO's focus areas (Speed, Stretch, Shield, and Sales) against company stages defined by team size and budget, guiding how practices should mature.
- In the context of AI agent development, a simple DoD is insufficient; orchestration frameworks like LangGraph are critical for coordinating multiple agents. These frameworks model agentic workflows as state machines or directed graphs, which is essential for managing the complexity of agent-to-agent communication, shared memory, and preventing costly, compounding errors.
- Architectural patterns for multi-agent systems, such as sequential pipelines, coordinator/dispatcher, and parallel fan-out/gather, directly influence reliability and cost. Choosing the wrong architecture can increase LLM calls by 5-15 iterations per task or decrease performance by up to 70% on sequential tasks, making architectural planning a key part of defining "done" for an AI feature.
- China's AI ecosystem, where Pyra operates, has surpassed 700 billion yuan ($97.5 billion) in scale, driven by national strategies like the "Next Generation Artificial Intelligence Development Plan". Local competitors such as Zhipu AI, MiniMax, and Baichuan Intelligence are part of a government-supported push, with over 346 generative AI services registered with the Cyberspace Administration as of March 2025.
- For consumer-facing AI products, the "Definition of Done" must extend to UX principles that ensure user trust and control. Key principles include making AI's memory and decision-making visible, providing clear error handling and recovery paths, and using simple language that avoids technical jargon to build user confidence.
- Shifting the "Definition of Done" from output (like story points) to outcomes requires connecting engineering work to business metrics such as customer satisfaction, revenue growth, or churn rate. This involves creating dashboards that visualize engineering metrics like cycle time alongside product and business KPIs to demonstrate the tangible impact of development efforts.
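
To make the DORA metrics named above concrete, here is a minimal Python sketch of how the four values might be derived from a list of deployment records before they reach a dashboard. The `Deployment` record, its field names, the `dora_metrics` helper, and the sample data are all hypothetical and chosen for illustration; real teams would pull these timestamps from their own CI/CD and incident tooling, and may define each metric somewhat differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment (illustrative fields)."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def _mean_hours(durations):
    """Average a non-empty list of timedeltas, expressed in hours."""
    total = sum(durations, timedelta())
    return total.total_seconds() / 3600 / len(durations)


def dora_metrics(deployments, window_days):
    """Compute the four DORA metrics over a reporting window (a sketch)."""
    if not deployments or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")

    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    recoveries = [
        d.restored_at - d.deployed_at
        for d in failures
        if d.restored_at is not None
    ]

    return {
        "deployment_frequency_per_day": len(deployments) / window_days,
        "mean_lead_time_hours": _mean_hours(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        # None when no failed deployment has a recorded restore time
        "mean_time_to_recover_hours": _mean_hours(recoveries) if recoveries else None,
    }


# Made-up sample data: four deployments over a four-day window.
t0 = datetime(2025, 1, 1, 9, 0)
sample = [
    Deployment(t0, t0 + timedelta(hours=2)),
    Deployment(
        t0 + timedelta(days=1),
        t0 + timedelta(days=1, hours=4),
        failed=True,
        restored_at=t0 + timedelta(days=1, hours=5),
    ),
    Deployment(t0 + timedelta(days=2), t0 + timedelta(days=2, hours=6)),
    Deployment(t0 + timedelta(days=3), t0 + timedelta(days=3, hours=4)),
]

metrics = dora_metrics(sample, window_days=4)
for name, value in metrics.items():
    print(f"{name}: {value}")
```

A dictionary like `metrics` could then be plotted next to product and business KPIs, which is one way to ground an outcome-oriented "done" in numbers rather than in story points.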