New Open-Source Tool for AI Trust Released
A new open-source tool called TrustPlane has been released to serve as an "AI trust control plane" for LLM-based systems. It's designed to help engineering teams operationalize transparency and risk management, which is critical for products serving children.
TrustPlane is built on an open-source epistemic scoring engine called CognOS, developed by Base76 Research Lab. This engine evaluates every LLM request for epistemic uncertainty before it reaches the user, assigning a score based on a formula that considers prior confidence and uncertainty factors. Based on this score, each request is dispositioned as PASS, REFINE, ESCALATE, or BLOCK. The underlying framework for the scoring model has been peer-reviewed and is detailed in publications on Applied AI Philosophy. This provides a theoretical and auditable foundation for the risk management process.
Every decision made by TrustPlane is assigned a trace ID and recorded in an audit log, creating an immutable record for compliance and legal teams. For situations requiring human intervention, TrustPlane can trigger webhooks to incident management systems when a request is escalated or blocked. This ensures that potential issues are flagged for human review before they impact the end-user. The system supports pluggable providers, including Ollama, OpenAI, Anthropic, and Groq.
The focus on child safety in AI is a growing concern, with organizations and governments calling for a "duty of care" from tech companies. Frameworks like the UK's Online Safety Act and the EU's Digital Services Act are increasing pressure on platforms to implement robust safety measures. For products serving children, this includes conducting clear risk assessments, having formal complaint mechanisms, and ensuring transparency in AI safety policies.
Key risks for children identified in relation to generative AI include exposure to harmful content, sexual grooming, cyberbullying, and sextortion. To mitigate these, experts recommend a combination of robust content filtering, age-assurance mechanisms, and the labeling of AI-generated content. The National Institute of Standards and Technology (NIST) has been directed to develop risk management profiles for AI products likely to be accessed by children.
Building AI for children presents unique challenges, including data privacy concerns and the potential for algorithmic bias to negatively impact developmental assessments. Ensuring the validity and reliability of AI models is critical, as inaccuracies can lead to misinterpretations of a child's learning progress. There's a recognized need for AI developers to engage with child safety experts, psychologists, and educators throughout the design and training process.
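For engineers who want a concrete picture of the disposition step described earlier, the following is a minimal sketch, not TrustPlane's or CognOS's actual code. The article does not give the scoring formula, so the sketch takes an already-computed score between 0 and 1 as input. The threshold values, the `Decision` record, and the `decide` and `disposition_for` helpers are all hypothetical names chosen for illustration; the in-memory list stands in for an audit log and the `notify` callback stands in for a webhook call.

```python
import uuid
from dataclasses import dataclass
from enum import Enum


class Disposition(Enum):
    PASS = "PASS"
    REFINE = "REFINE"
    ESCALATE = "ESCALATE"
    BLOCK = "BLOCK"


# Hypothetical cut-offs on a 0..1 score (higher = more trustworthy).
# These numbers are assumptions, not values taken from TrustPlane or CognOS.
THRESHOLDS = [
    (0.80, Disposition.PASS),
    (0.60, Disposition.REFINE),
    (0.40, Disposition.ESCALATE),
]


@dataclass(frozen=True)
class Decision:
    trace_id: str
    score: float
    disposition: Disposition


def disposition_for(score: float) -> Disposition:
    """Map a score to the first disposition whose floor it meets, else BLOCK."""
    for floor, disposition in THRESHOLDS:
        if score >= floor:
            return disposition
    return Disposition.BLOCK


def decide(score, audit_log, notify):
    """Record a decision under a fresh trace ID and alert on ESCALATE or BLOCK.

    `audit_log` is any list-like object with append(); `notify` is a callable
    standing in for a webhook to an incident management system.
    """
    decision = Decision(
        trace_id=str(uuid.uuid4()),
        score=score,
        disposition=disposition_for(score),
    )
    audit_log.append(decision)
    if decision.disposition in (Disposition.ESCALATE, Disposition.BLOCK):
        notify(decision)
    return decision


audit_log, alerts = [], []
for s in (0.92, 0.65, 0.45, 0.10):
    d = decide(s, audit_log, alerts.append)
    print(d.trace_id, d.score, d.disposition.value)
```

In this sketch all four requests land in the audit log, while only the ESCALATE and BLOCK decisions reach the alert callback. A real deployment would need append-only or tamper-evident storage to back the "immutable record" claim; a plain Python list does not provide that.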