AI Pentesting Tool Gets Splunk Rules
A new multi-agent AI penetration testing tool called BlacksmithAI has been published with tailored detection rules for Splunk, Sigma, and Suricata. The tool uses LangGraph orchestration to automate pentesting activities. The release gives SOCs ready-made detections for spotting the tool's use; the tool itself is intended for authorized lab environments only.
BlacksmithAI's architecture, developed by Yohannes Gebrekirstos, intentionally mirrors a human penetration testing team rather than relying on a single "super agent." An orchestrator agent acts as a project lead, delegating tasks to specialized agents for reconnaissance, vulnerability analysis, exploitation, and post-exploitation, distributing the reasoning and execution across defined roles.
The framework operates within a pre-configured, containerized "mini-Kali" environment that includes established security tools like `nmap`, `sqlmap`, `nuclei`, and `impacket`. This design improves resource efficiency by avoiding the need to spin up new containers for each task, while access controls prevent agents from altering the toolset, ensuring consistency.
The tool's multi-agent system is managed by LangGraph, a library designed to orchestrate complex, stateful workflows for LLMs. LangGraph enables the creation of modular and auditable execution flows, allowing each specialized agent to operate independently while the framework manages their overall interaction and state.
The inclusion of Sigma rules is a critical component for detection engineers. Sigma is an open, generic format for SIEM detections, allowing security teams to write one rule that can be converted into queries for various platforms, including Splunk's SPL. This enables rapid deployment of detection logic for the tool's specific techniques.
By providing these detection rules from the outset, the project equips security operations centers to proactively hunt for the tool's indicators. This aligns with a core principle of Zero Trust architecture by enabling continuous verification and detection of advanced, automated threats within an environment, rather than just defending the perimeter.
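The orchestrator-and-specialists pattern described above can be illustrated with a minimal sketch in plain Python. This is not BlacksmithAI's code and it deliberately does not use LangGraph: the `EngagementState` class, the stub agents, and the `orchestrate` function are all hypothetical names chosen for illustration, and each "agent" only records a placeholder result instead of calling an LLM or any security tool.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# The four phases named in the article, in the order a human team would run them.
PHASES = ["reconnaissance", "vulnerability_analysis", "exploitation", "post_exploitation"]


@dataclass
class EngagementState:
    """Shared state handed from agent to agent (hypothetical structure)."""
    target: str
    findings: Dict[str, str] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)


def make_agent(phase: str) -> Callable[[EngagementState], EngagementState]:
    """Build a stub specialist; a real agent would reason with an LLM here."""
    def agent(state: EngagementState) -> EngagementState:
        state.findings[phase] = f"stub result for {state.target}"
        state.log.append(phase)  # audit trail of which role acted, in order
        return state
    return agent


# One specialist per role, keyed by phase name.
AGENTS = {phase: make_agent(phase) for phase in PHASES}


def orchestrate(state: EngagementState) -> EngagementState:
    """Act as project lead: pick the next unfinished phase and delegate it."""
    while True:
        pending = [p for p in PHASES if p not in state.findings]
        if not pending:
            return state
        state = AGENTS[pending[0]](state)


final = orchestrate(EngagementState(target="lab-host.example"))
print(final.log)
```

Keeping the routing decision in one function and the work in separate callables is what makes the flow easy to audit: the log shows exactly which role acted and in what order. A graph library such as LangGraph plays the role of `orchestrate` and the shared state object in a real system.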
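To make the "write one rule, convert to many platforms" idea concrete, the following toy sketch turns a single Sigma-style selection into a Splunk search fragment. It is a simplified illustration only: real conversions are normally done with dedicated Sigma tooling, the `RULE` contents are invented for this example and are not taken from the project's published rules, and the converter handles just one selection block with the `contains`, `startswith`, and `endswith` modifiers.

```python
from typing import Dict, List, Union

Value = Union[str, List[str]]

# Hypothetical Sigma-style rule, already parsed from YAML into a dict.
# Field names and values are illustrative, not the project's actual detections.
RULE = {
    "title": "Scanner Tooling Launched From a Script (illustrative)",
    "logsource": {"category": "process_creation", "product": "linux"},
    "detection": {
        "selection": {
            "Image|endswith": ["/nmap", "/sqlmap", "/nuclei"],
            "ParentImage|contains": "python",
        },
        "condition": "selection",
    },
}


def _term(field: str, modifier: str, value: str) -> str:
    """Render one field/value pair as an SPL comparison, using * wildcards."""
    escaped = value.replace('"', '\\"')
    pattern = {
        "contains": f"*{escaped}*",
        "startswith": f"{escaped}*",
        "endswith": f"*{escaped}",
        "": escaped,
    }[modifier]
    return f'{field}="{pattern}"'


def selection_to_spl(selection: Dict[str, Value]) -> str:
    """AND the fields together; OR the values listed under a single field."""
    clauses = []
    for key, value in selection.items():
        field, _, modifier = key.partition("|")
        values = value if isinstance(value, list) else [value]
        terms = [_term(field, modifier, v) for v in values]
        clauses.append(terms[0] if len(terms) == 1 else "(" + " OR ".join(terms) + ")")
    # SPL treats space-separated search terms as an implicit AND.
    return " ".join(clauses)


spl = selection_to_spl(RULE["detection"]["selection"])
print(spl)
# (Image="*/nmap" OR Image="*/sqlmap" OR Image="*/nuclei") ParentImage="*python*"
```

The sketch mirrors Sigma's matching semantics, where a list of values means "any of these" and multiple fields in one selection mean "all of these". A production pipeline would also need to map the rule's `logsource` to the right Splunk index and sourcetype, which this example omits.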