New Framework for Controllable AI
A new tutorial details a method for building AI agents that require explicit user approval at each stage of a plan-and-execute loop. The approach combines the LangGraph and Streamlit frameworks and offers a reference pattern for implementing traceable human oversight in safety-critical systems.
- LangGraph is a library from the LangChain ecosystem for building stateful, multi-agent AI applications by structuring workflows as graphs. The graph-based architecture supports more complex and flexible execution flows, including loops and conditional branching, which are difficult to express in LangChain's linear, sequential chains.
- The "plan-and-execute" model separates the AI's reasoning from its actions: the agent first creates a step-by-step plan and then carries out each task. Architectures such as ReAct (Reason + Act) instead interleave reasoning and action at every step, which can be slower for complex workflows.
- Streamlit is an open-source Python framework that lets developers build and share web apps for machine learning and data science projects with minimal web design knowledge. Here it provides the user interface for reviewing, editing, and approving the AI's proposed plan before execution.
- The human-in-the-loop (HITL) approach is critical in safety-focused industries like aerospace, where it is used in applications such as autopilot systems and air traffic control to combine human oversight with automated processes. In AI systems, HITL keeps a human continuously engaged to guide and verify the AI's actions in real time.
- Applying AI in safety-critical aerospace systems presents significant certification challenges under standards like DO-178C. Current standards were not designed for machine learning, which makes verification of AI-driven software a complex issue, especially for high-criticality functions (Levels A-C).
- Combining LangGraph's ability to interrupt a workflow with Streamlit's interface creates a practical method for implementing human oversight. In this approval-gated model the agent pauses at key points and cannot continue until a human signs off, which is stricter than "human-on-the-loop" supervision, where the system runs autonomously while a human monitors it and can intervene. Gated approval of this kind is a key paradigm for ensuring safety and trust in AI systems.
- LangGraph's architecture includes persistent state management, meaning all parts of the graph can access and modify a shared state. This allows an agent to maintain context across multiple steps and interactions, a crucial feature for complex, long-running tasks.
- While human-in-the-loop is considered the gold standard for AI evaluation, its implementation can be costly and time-consuming compared to automated methods like using public benchmarks or a model-as-a-judge approach. The framework presented aims to streamline this by integrating human feedback directly into the operational workflow.
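
The approval-gated plan-and-execute loop described in the points above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the tutorial's code and not the LangGraph or Streamlit API: `AgentState`, `make_plan`, `execute_step`, and `run_with_approval` are hypothetical names, the planner and executor are placeholders for model and tool calls, and the `approve` callback stands in for whatever a user interface would collect from a human reviewer.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AgentState:
    """Shared state that every stage of the loop reads and updates."""
    goal: str
    plan: List[str] = field(default_factory=list)
    results: List[str] = field(default_factory=list)
    audit_log: List[Dict[str, str]] = field(default_factory=list)


def make_plan(state: AgentState) -> List[str]:
    # Placeholder planner (assumption): a real agent would call a model here.
    return [f"research {state.goal}", f"draft {state.goal}", f"review {state.goal}"]


def execute_step(step: str) -> str:
    # Placeholder executor (assumption): a real agent would call tools here.
    return f"done: {step}"


# The approver receives a stage name and the item under review, and
# returns True to approve or False to reject.
Approver = Callable[[str, str], bool]


def _gate(state: AgentState, stage: str, item: str, approve: Approver) -> bool:
    """Ask the human for a decision and record it for traceability."""
    approved = approve(stage, item)
    state.audit_log.append(
        {"stage": stage, "item": item, "decision": "approved" if approved else "rejected"}
    )
    return approved


def run_with_approval(goal: str, approve: Approver) -> AgentState:
    state = AgentState(goal=goal)

    # Plan first; nothing executes until the whole plan is approved.
    proposed = make_plan(state)
    if not _gate(state, "plan", "; ".join(proposed), approve):
        return state
    state.plan = proposed

    # Execute step by step, pausing for approval before each one.
    for step in state.plan:
        if not _gate(state, "step", step, approve):
            break  # a rejection halts the run; later steps never execute
        state.results.append(execute_step(step))
    return state


# Usage: a scripted reviewer that rejects any item starting with "review".
def scripted_reviewer(stage: str, item: str) -> bool:
    return not item.startswith("review")


final = run_with_approval("report", scripted_reviewer)
print(final.results)
print(final.audit_log[-1])
```

Every decision is appended to `audit_log`, so the record of who approved what survives alongside the results. In an interactive application the scripted reviewer would be replaced by input from a person, and the state would need to persist across the pause while the system waits for that input.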