# We tracked 29 MCP pain points across 7 communities. Which one ...
*Excerpt:*
- **Cloudflare's standard MCP server consumed 1.17M tokens** in production. That's not a benchmark — that's an emergency. They shipped a "Code Mode" workaround in February 2026 specifically because of it.
- **Block rebuilt their Linear MCP integration 3 times** for the same underlying reason: context destruction from schema overhead. Three rewrites, same root cause.
- **Perplexity's CTO publicly moved away from MCP**, citing overhead as a core issue.
- One practitioner I found in a GitHub thread: **45K tokens just for GitHub MCP alone** — that's 22.5% of a 200K context window consumed before the agent does a single useful thing.

…

## The 5 patterns that kept coming back

### 1. Schema overhead eating 16–50% of the context window before the conversation starts

*6+ confirmed sightings*

The full tool schema loads into context on every request. There's no lazy loading, no selective injection, no summarization. Just the entire schema, every time.

One developer put it exactly right: *"that's not overhead, that's your context budget gone before the agent does anything."*

The Cloudflare 1.17M-token incident is the extreme version of this. The GitHub MCP 45K-token practitioner is the median version. Both are the same pattern.

### 2. MCP process orphans leaking memory with no standard cleanup hook

*8+ confirmed sightings — most widespread pattern in the dataset*

When an MCP session ends abnormally, the subprocess keeps running. Memory climbs. The port stays bound. No standard lifecycle hook exists in the spec for "clean up after yourself."

Teams are writing custom janitors: cron jobs that kill zombie processes, watchdog scripts, restart-on-threshold automation. Every team reinvents the same janitor.

…

### 3. Agent intent misclassification: wrong tool subset injected silently, runtime fails or burns 2–3x tokens

*3+ independent practitioners, converged on the same root cause*

When the agent chooses the wrong tool, or gets routed to the wrong tool subset, nothing tells you.
There's no explicit failure. The agent just... burns tokens on the wrong path. Or silently fails. Or produces output that looks correct but isn't.

One developer I spoke with described it as their *"biggest incident cost, by a wide margin. Misclassification is per-request and compounding."*

Three different practitioners, building three different things, arrived at the same diagnosis independently. That's a signal.

### 4. MCP OAuth token refresh not handled by any major client

*10+ confirmed users across multiple platforms*

Atlassian, Cursor, Claude Code. Pick your client. OAuth tokens expire, and the standard response is: re-auth manually.

This isn't a 30-minute annoyance for developers. In production agents running overnight jobs, it's a process death with no recovery path. The workflow just stops. You find out in the morning.

…

### 5. Subagent hallucination of MCP tool results instead of failing gracefully

*Persistent open issue — no fix shipped anywhere in the ecosystem*

When a tool call fails, some models hallucinate plausible-looking results rather than surfacing the error.

The worst part isn't the hallucination itself — it's the detectability. As one developer described it: *"hallucinated errors are syntactically plausible but factually incorrect... results look valid, making the bug hard to detect."*
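A minimal sketch of the defense teams keep converging on: wrap every tool call so failures come back as structured data the model can't quietly rephrase into a plausible-looking result, and shape-check successes before they reach the context. The `call_tool_strict` wrapper and the toy `get_issue` tool below are my own illustrative names, not part of any MCP SDK.

```python
def call_tool_strict(tool_fn, args, required_fields):
    """Run a tool call and return an explicit envelope: either a result that
    passed a shape check, or a machine-readable error. A raw exception (or a
    result with missing fields) never reaches the model looking like success."""
    try:
        result = tool_fn(**args)
    except Exception as exc:
        # Surface the failure as data, not prose the model can paper over.
        return {"ok": False, "error": {"type": type(exc).__name__,
                                       "message": str(exc)}}
    missing = [f for f in required_fields if f not in result]
    if missing:
        return {"ok": False, "error": {"type": "SchemaMismatch",
                                       "message": f"missing fields: {missing}"}}
    return {"ok": True, "result": result}

# Demo: a flaky, hypothetical issue-tracker tool.
def get_issue(issue_id):
    if issue_id == "BAD":
        raise ConnectionError("upstream 502")
    return {"id": issue_id, "title": "Fix login bug"}

ok = call_tool_strict(get_issue, {"issue_id": "ENG-42"}, ["id", "title"])
bad = call_tool_strict(get_issue, {"issue_id": "BAD"}, ["id", "title"])
print(ok["ok"], bad["error"]["type"])   # True ConnectionError
```

The point is the envelope: a downstream prompt can branch on `ok` explicitly instead of trusting whatever text the model produces after a failure.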
## Related Pain Points
**AI Agent Hallucination and Factuality Failures**

AI agents confidently generate false information, with hallucination rates of up to 79% in reasoning models and ~70% error rates in real deployments. These failures cause business-critical issues including data loss, liability exposure, and broken user trust.
**Schema Overhead Consumes 16–50% of Context Window**

Full tool schemas load into context on every request, with no lazy loading, selective injection, or summarization. This exhausts the context window before meaningful work begins, with confirmed instances ranging from 45K tokens for a single tool to 1.17M tokens in production deployments.
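To make "selective injection" concrete, here's a rough sketch: rank a tool registry against the user's request and inject only the top matches instead of the whole registry on every turn. The registry contents, the word-overlap scoring, and the 4-characters-per-token heuristic are all illustrative assumptions, not anything from the MCP spec.

```python
def estimate_tokens(text):
    # Crude heuristic (~4 chars/token); a real system would use its tokenizer.
    return len(text) // 4

def select_tools(registry, request, top_k=2):
    """Inject only the schemas whose name/description overlaps the request."""
    words = set(request.lower().split())
    def score(item):
        name, schema = item
        doc = set((name + " " + schema.get("description", "")).lower().split())
        return len(words & doc)
    ranked = sorted(registry.items(), key=score, reverse=True)
    return dict(ranked[:top_k])

# Hypothetical registry; real MCP tool schemas are far larger than this.
registry = {
    "github_create_issue": {"description": "Create a GitHub issue in a repository"},
    "github_merge_pr": {"description": "Merge a GitHub pull request"},
    "linear_search": {"description": "Search Linear issues by text"},
    "calendar_book": {"description": "Book a calendar meeting slot"},
}

subset = select_tools(registry, "create a new issue on github", top_k=2)
full_cost = estimate_tokens(str(registry))
subset_cost = estimate_tokens(str(subset))
print(sorted(subset), subset_cost < full_cost)
```

Even this naive ranking keeps the unrelated calendar schema out of context; production systems would use embeddings or a routing model, but the budget math is the same.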
**Lack of Visibility and Debugging Transparency**

When AI agents fail, developers have no unified visibility across the entire stack. They must stitch together logs from the agent framework, hosting platform, LLM provider, and third-party APIs, creating a debugging nightmare. This makes it impossible to determine whether failures stem from tool calls, prompts, memory logic, model timeouts, or hallucinations.
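One common workaround is a correlation ID threaded through every log line, so agent, tool, and LLM logs can at least be stitched back together per request. A minimal sketch using Python's standard `logging` and `contextvars` modules; the logger names and `handle_request` flow are invented for illustration.

```python
import contextvars
import io
import logging
import uuid

request_id = contextvars.ContextVar("request_id", default="-")

class RequestIdFilter(logging.Filter):
    # Stamp every record with the current request's correlation ID.
    def filter(self, record):
        record.request_id = request_id.get()
        return True

stream = io.StringIO()  # stand-in for a real log sink
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(request_id)s %(name)s %(message)s"))
handler.addFilter(RequestIdFilter())

for name in ("agent", "tool.github", "llm"):
    log = logging.getLogger(name)
    log.addHandler(handler)
    log.setLevel(logging.INFO)

def handle_request(prompt):
    rid = uuid.uuid4().hex[:8]
    request_id.set(rid)  # every layer below logs under this ID
    logging.getLogger("agent").info("planning: %s", prompt)
    logging.getLogger("tool.github").info("tool call start")
    logging.getLogger("llm").info("completion ok")
    return rid

rid = handle_request("triage issue")
lines = stream.getvalue().splitlines()
print(all(line.startswith(rid) for line in lines))   # True
```

With one grep-able ID per request, "was it the tool call or the model?" at least becomes answerable from a single log query.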
**MCP Process Orphans Leak Memory Without Cleanup Hook**

When MCP sessions end abnormally, subprocesses continue running, memory climbs, and ports remain bound. No standard lifecycle hook exists in the spec for cleanup, so teams must write custom janitors using cron jobs and watchdog scripts.
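A janitor of that kind can be small. The sketch below (the function names are mine) tracks every spawned server process and reaps it on exit: terminate politely first, then kill anything that ignores the signal.

```python
import atexit
import subprocess
import sys

_children = []  # every MCP server process we've spawned

def spawn_server(cmd):
    """Track spawned server processes so they can be reaped even on abnormal exit."""
    proc = subprocess.Popen(cmd)
    _children.append(proc)
    return proc

def reap_all(grace_seconds=2.0):
    # SIGTERM first; escalate to SIGKILL if the process ignores it.
    for proc in _children:
        if proc.poll() is None:
            proc.terminate()
            try:
                proc.wait(timeout=grace_seconds)
            except subprocess.TimeoutExpired:
                proc.kill()
                proc.wait()

atexit.register(reap_all)  # the "cleanup hook" the spec doesn't give you

# Demo: a stand-in "server" that would otherwise outlive us by a minute.
proc = spawn_server([sys.executable, "-c", "import time; time.sleep(60)"])
reap_all()
print(proc.poll() is not None)   # True: the child is gone
```

This only covers the clean-ish paths (normal exit, unhandled exceptions); a hard `SIGKILL` of the parent still orphans children, which is why teams layer cron-based zombie sweeps on top.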
**Refresh Token Management and Silent Revocation**

Refresh token expiration intervals vary wildly across providers, some providers revoke tokens silently without notification, and there is no standardized `refresh_expires_in` field. Race conditions occur when multiple requests simultaneously attempt to refresh tokens, and misconfigured token handling cascades into failed jobs and broken integrations.
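A sketch of the usual client-side defense: refresh proactively before expiry and hold a lock so concurrent callers can't race each other into a revoked-token state. `TokenManager` and the fake refresh endpoint are illustrative, not any real client's API, and this doesn't solve silent server-side revocation — only the races and the expiry cliff.

```python
import threading
import time

class TokenManager:
    """Refresh ahead of expiry; serialize refreshes across concurrent requests."""
    def __init__(self, refresh_fn, skew_seconds=300):
        self._refresh_fn = refresh_fn  # returns (access_token, lifetime_seconds)
        self._skew = skew_seconds      # refresh this long before actual expiry
        self._lock = threading.Lock()
        self._token = None
        self._expires_at = 0.0

    def get_token(self):
        with self._lock:  # single-flight: only one refresh at a time
            if self._token is None or time.time() >= self._expires_at - self._skew:
                self._token, lifetime = self._refresh_fn()
                self._expires_at = time.time() + lifetime
            return self._token

# Fake refresh endpoint standing in for a real OAuth provider.
calls = []
def fake_refresh():
    calls.append(1)
    return f"token-{len(calls)}", 3600

mgr = TokenManager(fake_refresh, skew_seconds=300)
threads = [threading.Thread(target=mgr.get_token) for _ in range(8)]
for t in threads: t.start()
for t in threads: t.join()
print(mgr.get_token(), len(calls))   # token-1 1  (one refresh despite 8 callers)
```

For overnight agent jobs, the `skew_seconds` margin is the important part: the refresh happens while the token is still valid, instead of the process dying mid-run and waiting for someone to re-auth in the morning.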