Mar 30, 2026 Ilya Kabanov

702 Splunk references in DefenseClaw, Cisco's open-source AI agent security tool

Cisco launched DefenseClaw at RSAC 2026 on March 27, an open-source security governance sidecar for OpenClaw agents. It promises that "nothing runs until it's scanned, and anything dangerous is blocked automatically." The tool aims to scan every skill, MCP (Model Context Protocol) server, and plugin before execution, blocking dangerous ones in under two seconds, and streaming every verdict to Splunk.

I looked under the hood of 77,000 lines of Go, Python, and TypeScript to understand how it is built, how it works, and what you need to know about the protections it provides and the risks it creates.

Highlights:

DefenseClaw is a three-component system. A Python CLI for scanning and policy management, a Go gateway daemon (REST API, WebSocket bridge to OpenClaw, OPA (Open Policy Agent) policy engine, SQLite audit store, guardrail LLM proxy), and a TypeScript plugin that runs inside OpenClaw and intercepts every tool call via a before_tool_call hook.
Six built-in scanning engines: CodeGuard (10 regex rules for credentials, unsafe exec, SQL injection, weak crypto, path traversal, outbound HTTP, and unsafe deserialization), ClawShield Malware (signature patterns for reverse shells, cryptominers, C2, and credential harvesting), ClawShield Injection (three-tier prompt injection detection: regex, statistical analysis with base64 decoding, and Unicode analysis for zero-width characters and homoglyphs), ClawShield Secrets (41 provider-specific credential patterns), ClawShield PII (credit cards with Luhn validation, SSNs, and 8 other categories), and ClawShield Vuln (SQL injection, SSRF, path traversal, command injection, and XSS detection).
Two external wrappers invoke Cisco's Skill Scanner and MCP Scanner CLIs with optional LLM and Cisco AI Defense backends (plus VirusTotal for the Skill Scanner).
Three-phase admission gate: block/allow list check, scan with 5-minute timeout, OPA policy evaluation with Go fallback. HIGH/CRITICAL auto-block. Enforcement: quarantine, disable via WebSocket RPC, sandbox policy updates. Guardrail proxy inspects LLM prompts and completions. Audit to SQLite, Splunk HEC (HTTP Event Collector), and OpenTelemetry.

My take:

DefenseClaw is a solid open-source agentic AI governance tool. The architecture is sound: sidecar pattern, OPA policies, SIEM-first audit, streaming guardrail proxy, hot-reloadable enforcement. Some novel ideas, particularly the three-tier Unicode injection detection and an observation mode that proposes firewall rules from live agent behavior.
In my analysis of the TeamPCP supply chain attack, I wrote that vendors' open-source projects are not products, but go-to-market tools. DefenseClaw fits the pattern. The repo ships a Splunk app with 11 dashboards, saved searches, and Docker Compose files that spin up Splunk Enterprise on a 60-day trial. On day 61 you're faced with a pay-or-rip-out choice.
Freemium, fine, we get it. But the tool may create a false sense of security. Its regex-based skills scanner doesn't flag dangerous os.popen, subprocess.Popen, subprocess.run, and os.execv (only os.system is covered) in the patterns list. It also ignores httpx, aiohttp, urllib3, and socket. This is obviously to reduce false positives at the regex stage, but the LLM-based analyzer that should dig deeper is disabled by default!
When the LLM analyzer is enabled, we get another problem: the source code being analyzed IS the untrusted input. A malicious skill can embed prompt injection in its comments or docstrings targeting the LLM judge to manipulate a decision. In the IPI Arena benchmark, the best commercial model (Claude Opus 4.5) had a 0.5% attack success rate and the worst (Gemini 2.5 Pro) hit 8.5%.
For teams evaluating DefenseClaw today, be aware of the power of controls and observability it brings, but also understand the risks and the sales intent the tool carries.

Highlights:

My take:

Sources: