AI agent security and blast radius management
9/10 CriticalProduction incidents show AI agents leaking internal data, shipping ransomware through plugins, and executing destructive actions (deleting repos). Security shifted from prompt injection to actual agent capabilities and operational risk.
Sources
- AI | 2025 Stack Overflow Developer Survey
- State of the API 2025: API Strategy Is Becoming AI Strategy
- 5 Major Pain Points AI Agent Developers Can't Stop Ranting About ...
- 2025 Recap: The Year Software Development Changed Shape
- 2. Controlled Agency And...
- A Year of MCP: From Internal Experiment to Industry Standard | Pento
- A practical business guide to the OpenAI API in 2025
- MCP in enterprise: real-world applications and challenges - Xenoss
Collection History
In April 2025, Invariant Labs discovered that MCP is vulnerable to tool poisoning, a type of attack where a prompt with malicious instructions is launched at the LLM. The instructions are not visible to humans but understandable to the AI agent.
AI agents remain highly vulnerable to prompt injection and jailbreak attacks, with success rates exceeding 90%. Security researchers discovered the first zero-click attack on AI agents through Microsoft 365 Copilot, where 'attackers hijack the AI assistant just by sending an email... The AI reads the email, follows hidden instructions, steals data, then covers its tracks'. Microsoft took five months to fix this issue.
managing permissions and making sure the bot only pulls information from the right, up-to-date sources is a huge security and maintenance headache. The last thing you want is your IT bot accidentally sharing sensitive HR info.
Agents leaking internal data within minutes, Malicious plugins shipping ransomware, Supply-chain bugs in AI tooling, Agents deleting repos or months of work