5 stories this week that change your decisions (May 11-17, 2026)

TL;DR: Researchers poisoned 3 nodes in a 42-million-node code graph, and 9 frontier models trusted the planted output 100% of the time via MCP. The attack worked when the fake nodes used correct naming and one OWASP reference. Separately, Google's GTIG confirmed adversaries have moved AI into live attack operations, naming a likely AI-built Python 2FA-bypass exploit and PROMPTSPY, an Android trojan that calls Gemini at runtime to keep itself pinned across phone vendors' UIs. And Microsoft's MDASH harness topped CyberGym at 88.45% and dumped 16 fresh Windows CVEs into Patch Tuesday.

1. Researchers poisoned 3 nodes in a 42M-node code graph and 9 frontier models trusted it 100% via MCP

Coding agents treat a graph index of a codebase as ground truth. Any code knowledge graph connected to an AI agent through MCP is an attack surface. No vendor today provides graph-level integrity controls.
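To make "graph-level integrity controls" concrete, here is a minimal sketch of one possible approach, not something any vendor is known to ship: a trusted indexer tags each graph node with an HMAC at index time, and the layer that serves nodes to the agent drops anything whose tag does not verify. The key, the record layout, and every function name below are hypothetical. Note the limit: this only catches nodes planted or altered after indexing, not poisoned source code that the indexer itself ingests.

```python
import hashlib
import hmac
import json

# Hypothetical signing key, held by the trusted indexer and the serving layer,
# never stored alongside the graph itself.
SIGNING_KEY = b"replace-with-a-real-secret"


def node_tag(node: dict) -> str:
    """Return an HMAC-SHA256 tag over a canonical JSON encoding of the node."""
    payload = json.dumps(node, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()


def sign_node(node: dict) -> dict:
    """Wrap a node with its tag at index time (run by the trusted indexer)."""
    return {"node": node, "tag": node_tag(node)}


def verify_node(record: dict) -> bool:
    """Check a record's tag before its content is handed to the agent."""
    node = record.get("node")
    tag = record.get("tag")
    if not isinstance(node, dict) or not isinstance(tag, str):
        return False
    # Constant-time comparison on bytes to avoid leaking tag prefixes.
    return hmac.compare_digest(node_tag(node).encode("utf-8"), tag.encode("utf-8"))


def trusted_nodes(records: list) -> list:
    """Return only the nodes whose tags verify; unverified records are dropped."""
    return [r["node"] for r in records if verify_node(r)]
```

A node written straight into the graph store, however plausible its naming or OWASP reference, carries no valid tag and is filtered out before the agent sees it.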

2. Google confirms adversaries have operationalized AI

GTIG's new report documents AI in live attack operations, not just experiments. Two concrete cases: a Python 2FA-bypass exploit that GTIG concluded was likely AI-written, and PROMPTSPY, an Android trojan that calls Gemini at runtime to keep itself pinned across phone vendors' UIs.

3. Microsoft brings the Azure playbook to AI AppSec with MDASH

MDASH orchestrates 100+ agents across SOTA and distilled models, hits 88.45% on CyberGym, and dumps 16 fresh Windows CVEs into the Patch Tuesday cohort. Microsoft repeats one thesis sixteen times across the post: the harness does the work, the model is one input.

4. OpenAI Daybreak wraps GPT-5.5, Codex Security, and many promises

OpenAI's Daybreak wraps GPT-5.5 and Codex Security into three access tiers, including a KYC-gated GPT-5.5-Cyber preview for authorized red teaming. It's the only frontier-lab cyber offering a buyer can engage with today, but not everyone is excited about it.

5. Mythos vs. curl, one of the most-audited open source codebases

Mythos flagged 5 'Confirmed' vulnerabilities in curl. Only 1 survived maintainer review. curl is the worst-case test for any AI scanner: single-purpose, every line refactored 4+ times, audited by every major tool. Don't generalize this result to typical enterprise code.

Sources: