andrewbaker.ninja

MCP in 2026: Rise, Fall, and What Every AI User Must Know

3/22/2026Updated 3/30/2026

Excerpt

## 4. The Architectural Problems: Slowness, Bloat, and the Double Hop Tax Security breaches get headlines, but MCP’s architectural limitations are what quietly frustrate developers day to day. These problems don’t cause dramatic incidents. They just make everything slower, heavier, and harder to scale. Understanding them is essential to understanding why alternatives exist. The double hop tax is the most visible performance problem. Every time an AI agent wants to call a tool in MCP, the request doesn’t go directly to the tool. It makes two trips. The agent sends a JSON-RPC request to the MCP Server, the server parses it, reformats it, and forwards it to the actual tool. The tool responds, the MCP Server receives it, reformats it again, and sends it back to the agent. Visually the flow looks like this: … Context window bloat is subtler but hits just as hard in practice. When an MCP client connects to a server, it typically loads the full list of tools that server exposes, including the name, description, and JSON schema for every parameter of every tool. All of that gets injected into the LLM’s context window before the AI even starts thinking about the user’s request. A typical tool schema looks like this: … The stateful session problem becomes painful at scale. MCP’s original design assumed a persistent stateful connection between client and server, a reasonable assumption for a local development tool where one Claude Desktop instance talks to one MCP Server on the same machine. But production deployments route traffic through load balancers across many server instances: ⚡ Powered by CloudScale … When MCP sessions are stateful and the load balancer routes the next request to a different server instance, that instance has no record of the session, and things break. The workarounds, such as sticky sessions, shared Redis session stores, and distributed state management, add operational complexity and cost that teams did not anticipate when they thought they were simply adding MCP support to their stack. The MCP 2026 roadmap explicitly names this as a top priority, but it is a hard problem that was not solved at launch. The wrapper tax is the hidden infrastructure cost that accumulates over time. To expose any tool via MCP, someone has to write and maintain an MCP Server, a dedicated process that wraps the tool’s native API. For a tool with a perfectly good REST API already, the before and after looks like this: ⚡ Powered by CloudScale … That MCP Server needs to be written in Python or TypeScript, hosted somewhere, kept running, updated whenever the underlying tool’s API changes, secured against the vulnerabilities described in the next section, monitored for failures, and scaled if load increases. For a small team, this per tool overhead accumulates fast and becomes a significant ongoing maintenance burden. … This is why enterprise teams running production agentic workflows have been among the loudest voices pushing for MCP to evolve or for alternatives to be considered. ... The Security Crisis: Breaches and Real World Failures The same openness and power that made MCP attractive became its Achilles heel. As adoption scaled into production environments, a pattern familiar from the history of internet protocols repeated itself: when powerful technology moves faster than security practices, breaches follow. In March 2025, security firm Equixly published research finding command injection vulnerabilities in 43% of tested MCP implementations, with another 30% vulnerable to server side request forgery attacks and 22% allowing arbitrary file access. This was not a theoretical paper. It was a survey of real deployed servers. In April 2025, security researcher Simon Willison documented how MCP’s architecture created severe prompt injection risk. Because LLMs process tool outputs as context, a malicious MCP Server, or even a malicious message sent to a user’s WhatsApp that gets processed by an LLM, could hijack the AI’s behavior, extract private data, or execute unauthorized commands. The spec noted that there “should always be a human in the loop,” but in practice many implementations skipped this entirely. … By October 2025, JFrog Security had disclosed critical vulnerabilities in mcp-remote, an OAuth proxy used by hundreds of thousands of environments. CVE-2025-6514 was rated CVSS 9.6 and allowed remote code execution via OS commands embedded in OAuth discovery fields. CVE-2025-6515 enabled what researchers called Prompt Hijacking, where attackers exploiting predictable session IDs could intercept and redirect MCP sessions entirely. And Anthropic’s own developer debugging tool, the MCP Inspector, was found to allow unauthenticated remote code execution, turning a diagnostic tool into a potential remote shell. … Because MCP Servers are distributed via npm and PyPI without universal verification, the ecosystem is exposed to the same supply chain attacks that have plagued web development for years. Tool descriptions can also be modified after a user approves them, a technique researchers call a rug pull, meaning an LLM that was told a tool does one thing can silently be fed a new description instructing it to do something entirely different.

Source URL

https://andrewbaker.ninja/2026/03/22/the-rise-and-relative-fall-of-mcp-what-every-ai-user-needs-to-know-in-2026/

Related Pain Points

Security Vulnerabilities in Repository Configuration and MCP

Three CVEs discovered: malicious code in documents can exfiltrate private data; Model Context Protocol (MCP) allows repository config to override user approval safeguards enabling remote code execution; repository-controlled settings redirect API traffic to attacker servers to steal API keys.

securityClaude CodeModel Context Protocol

Schema Overhead Consumes 16-50% of Context Window

Full tool schemas load into context on every request with no lazy loading, selective injection, or summarization. This causes context window exhaustion before meaningful work begins, with confirmed instances ranging from 45K tokens for a single tool to 1.17M tokens in production deployments.

performanceMCP

MCP supply chain attacks via npm/PyPI distribution

MCP servers are distributed via npm and PyPI without universal verification, exposing the ecosystem to the same supply chain attacks that plague web development. Tool descriptions can be modified post-approval (rug pulls).

securityMCPnpmPyPI

Stateful session routing breaks with load balancers

MCP assumes persistent 1:1 client-server connections, but production deployments with load balancers route requests across instances. When a session routes to a different server without state, connections fail. Workarounds (sticky sessions, Redis, distributed state) add significant operational complexity.

architectureMCPload balancersRedis

Common Security Vulnerabilities in MCP Deployments

Rapid MCP ecosystem growth has revealed common vulnerability patterns in deployed servers including command injection, insufficient input validation, privilege escalation, authentication implementation flaws, and lack of rate limiting.

securityMCP

Double hop performance tax in MCP request routing

Every MCP tool call requires two round trips (agent → MCP server → tool → MCP server → agent) instead of direct calls, adding latency and overhead to each interaction. This architectural inefficiency compounds at scale and makes production deployments slower.

performanceMCPJSON-RPC

MCP server wrapper maintenance overhead

Every tool exposed via MCP requires writing and maintaining a dedicated MCP Server wrapper in Python or TypeScript, plus hosting, updating, securing, monitoring, and scaling. This per-tool overhead accumulates significantly for teams integrating multiple tools.

ecosystemMCPPythonTypeScript