CAISI proposals discussed
U.S. proposals to strengthen the Center for AI Standards and Innovation (CAISI) aim to create stronger auditing and oversight for frontier AI risks, tying technical scrutiny more closely to national-security concerns. Observers such as Miles Brundage have unpacked the proposals publicly, signalling growing appetite in Washington to formalise audit and standards capacities for high-risk models (x.com).
Washington is moving toward a system where the biggest artificial intelligence models do not just get benchmarked for speed or chatbot polish. They get examined more like dual-use machinery, with checks for cyber abuse, biosecurity misuse, and hidden vulnerabilities tied to national security. (nist.gov)
The office at the center of that shift is the Center for AI Standards and Innovation, or CAISI, inside the National Institute of Standards and Technology. Its public mission says it will make voluntary agreements with private developers, run unclassified evaluations of models that may pose national-security risks, and assess both United States and adversary systems. (nist.gov) That is a noticeable change from the older United States AI Safety Institute, which the Commerce Department reorganized into CAISI on June 3, 2025. Outside observers at Data & Society noted that the new framing put more weight on security, international competition, and protecting American firms from foreign regulation. (techpolicy.press)
The practical idea is simple: if a frontier model is powerful enough to help with cyberattacks or dangerous lab work, the government wants a repeatable way to test that before the model spreads through agencies and companies. CAISI’s own mission page says its evaluations focus on demonstrable risks including cybersecurity, biosecurity, and chemical weapons. (nist.gov)
That is why the recent proposals around CAISI are less about writing abstract ethics principles and more about building an audit shop. Miles Brundage and dozens of coauthors defined “frontier AI auditing” in a January 2026 paper as third-party verification of developers’ safety and security claims using secure access to non-public information. (arxiv.org) Washington is already laying the plumbing for that approach.
On January 8, 2026, CAISI published a Request for Information asking for concrete methods to measure the secure development and deployment of artificial intelligence agents, which it described as systems that can take autonomous actions in the real world and may be vulnerable to hijacking or backdoors. (federalregister.gov)
The White House gave CAISI a larger lane in July 2025 when America’s AI Action Plan told the government to build an “AI evaluations ecosystem.” The plan assigned the National Institute of Standards and Technology, including CAISI, to develop the science of measuring and evaluating models and to publish guidance so federal agencies can run their own evaluations. (whitehouse.gov) That language matters because it turns testing from a one-off lab exercise into a procurement rulebook. On March 18, 2026, CAISI signed a memorandum with the General Services Administration so its evaluation methods can be used inside USAi, the federal government’s shared platform for buying and deploying generative artificial intelligence tools. (nist.gov)
So the fight here is not really over whether models should be “safe” in the abstract. It is over who gets to inspect them, what evidence they must hand over, and whether those inspections stay voluntary standards work or harden into a standing audit regime for the most capable systems. (nist.gov) (arxiv.org)
If CAISI keeps expanding on its current path, the United States will have something closer to a test range than a discussion forum. The center is already presenting itself as industry’s main government contact for evaluations, security guidance, and standards work, while coordinating with the Department of Defense, the Department of Energy, the Department of Homeland Security, and the intelligence community. (nist.gov)