Anthropic reveals its cybersecurity domination strategy. Claude and Gemini scored equally on cybersecurity tasks, but Claude Code scaffold showed its superiority.

Wiz just launched the AI Cyber Model Arena. 257 real-world challenges across five offensive categories: zero-day discovery, CVE detection, API security, web security, and cloud security.

Highlights:

My take:

  1. Biggest surprise. Gemini scored low on software vulnerabilities, despite their heavy investments in AI secure coding with Code Mender.
  2. Expected. Frontier models heavily depend on scaffolding for cybersecurity tasks.
  3. Anthropic's enterprise strategy is clear. Make Claude Code the most powerful and model-agnostic scaffold for major enterprise tasks. It lifts up every model that runs on it. Gemini CLI favors the Gemini model family.

AI Cyber Model Arena

Wiz AI Cyber Model Arena: Claude Code scaffold lifts all models on cybersecurity tasks