Pains
2403 pains collected
MSAL stability issues with SSO, login, and token validation causing project abandonment
(9) A production application was terminated primarily due to persistent MSAL issues affecting Single Sign-On, login flows, and token validation. The application was otherwise stable, but constant MSAL-related failures caused repeated client dissatisfaction and made it impossible to maintain service quality.
Credential leakage risks in token acquisition flows
(9) MSAL's interactive authentication and client secret flows create opportunities for credential leakage, particularly when credentials are retrieved and stored in application state. Even certificate-based authentication alternatives carry similar risks of credential exposure.
Azure AD reliability and availability issues
(9) Azure AD (Entra ID) experiences frequent outages, which critically impact all Azure and Office 365 services that depend on it. This dependency is treated as an acceptable risk despite the widespread impact.
Azure codebase deterioration preventing bug fixes
(9) Azure's internal codebase has accumulated such severe technical debt that bug fixes are rejected because they risk breaking entire systems, which in turn prevents engineers from refactoring or improving code quality.
React/Next.js serialization vulnerabilities expose TypeScript runtime risks
(9) Critical security vulnerabilities like React2Shell (CVE-2025-55182, CVSS 10.0) in Next.js RSC serialization revealed that full-stack JavaScript and TypeScript lack secure serialization models. These runtime CVEs forced developers to reassess security assumptions in TypeScript/React stacks.
Unreliable and unpredictable framework behavior in production
(9) LangChain exhibits difficult-to-predict behavior with undocumented or poorly explained default settings and intricacies. Developers report erratic behavior such as ConversationRetrievalChain unexpectedly rephrasing input questions, leading to unstable production environments and costly downtime.
Schema Overhead Consumes 16-50% of Context Window
(9) Full tool schemas load into context on every request with no lazy loading, selective injection, or summarization. This causes context window exhaustion before meaningful work begins, with confirmed instances ranging from 45K tokens for a single tool to 1.17M tokens in production deployments.
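To make "selective injection" concrete, here is a minimal Python sketch of one possible approach: only tool schemas that share words with the incoming request are included, and only while they fit a token budget. Everything in it is an assumption for illustration, including the helper names estimate_tokens and select_schemas, the keyword-overlap scoring, and the rough heuristic of about four characters per token. It does not describe how any particular client or server behaves.

```python
import json


def estimate_tokens(schema: dict) -> int:
    # Rough heuristic (assumption): about 4 characters per token.
    return len(json.dumps(schema)) // 4


def select_schemas(tools: list[dict], request: str, budget_tokens: int) -> list[dict]:
    """Return the tool schemas that overlap with the request and fit the budget."""
    words = set(request.lower().split())
    scored = []
    for tool in tools:
        text = f"{tool['name']} {tool.get('description', '')}".lower()
        overlap = len(words & set(text.replace("_", " ").split()))
        if overlap:
            scored.append((overlap, tool))
    # Highest keyword overlap first; the sort is stable for ties.
    scored.sort(key=lambda pair: pair[0], reverse=True)

    chosen, used = [], 0
    for _, tool in scored:
        cost = estimate_tokens(tool)
        if used + cost <= budget_tokens:
            chosen.append(tool)
            used += cost
    return chosen
```

A real implementation would likely use a proper tokenizer and a better relevance signal than word overlap; the point is only that schemas can be filtered before they reach the context window.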
API Route Security Issues in Next.js
(9) Common security issues in Next.js API routes include injection attacks (SQL, NoSQL, command injection), rate-limiting bypass, information disclosure through error messages, and missing input validation.
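As a minimal illustration of the injection and input-validation points, the following Python sketch contrasts string-built SQL with a parameterized query. It uses the standard-library sqlite3 module with an in-memory database. The table, the column names, and the helpers make_db, find_user_unsafe, and find_user_safe are hypothetical and not taken from any Next.js codebase; the same principle applies to whichever database driver an API route actually uses.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    # In-memory database with two hypothetical users.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list[str]:
    # Vulnerable: user input is concatenated into the SQL text.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def find_user_safe(conn: sqlite3.Connection, name: str) -> list[str]:
    # Validate the input shape first, then bind it as a parameter.
    if not isinstance(name, str) or not (1 <= len(name) <= 64):
        raise ValueError("invalid name")
    rows = conn.execute("SELECT name FROM users WHERE name = ?", (name,))
    return [row[0] for row in rows]
```

With the input `x' OR '1'='1`, the unsafe version matches every row, while the parameterized version treats the whole string as a literal name and matches nothing.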
Account suspension without warning or appeals process
(9) User accounts have been suspended without warning within minutes of deployment, with only a vague 'fair use violation' email as notice. Appeals go unanswered for weeks, resulting in lost access to production sites with no recourse or clear explanation.
Silent data errors in GPU computations
(9) Silent data errors (SDEs) in GPUs propagate through calculations without triggering detection mechanisms, potentially compromising results in critical applications. These errors stem from timing violations, thermal stress, electromigration, and voltage fluctuations on modern silicon.
Cross-Site Scripting (XSS) Vulnerabilities in Next.js
(9) XSS attacks can occur in Next.js through improper use of dangerouslySetInnerHTML, unvalidated user input in dynamic content, third-party scripts, and server-side rendering of malicious content.
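The underlying defense is to escape untrusted text before it becomes markup. React does this for ordinary JSX interpolation, and dangerouslySetInnerHTML bypasses it. The sketch below shows the same idea in Python using the standard-library html.escape; Python is used only for illustration, and render_comment is a hypothetical helper, not part of Next.js.

```python
from html import escape


def render_comment(user_text: str) -> str:
    # Escape <, >, &, and quotes so the text cannot introduce new markup.
    return f'<p class="comment">{escape(user_text, quote=True)}</p>'
```

Escaping covers text placed in element content and quoted attributes. Input that is meant to contain HTML needs a dedicated sanitizer instead.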
Power Delivery and Cooling Infrastructure Insufficient for Production Workloads
(9) Teams that plan GPU infrastructure for 6-8 kW per node discover actual power demands of 10-12 kW once higher TDP profiles are enabled in production, which forces renegotiation of physical infrastructure and a redesign of the topology.
Insecure default configurations enabling privilege escalation
(9) Deploying containers with insecure settings (root user, 'latest' image tags, disabled security contexts, overly broad RBAC roles) persists because Kubernetes doesn't enforce strict security defaults. This exposes clusters to container escape, privilege escalation, and unauthorized production changes.
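A few of these settings can be checked mechanically before deployment. The Python sketch below inspects a single container entry from a pod spec, represented as a plain dict, and reports the 'latest' tag and weak securityContext settings. The function name lint_container and the wording of the findings are hypothetical; the field names runAsNonRoot, privileged, and allowPrivilegeEscalation are standard Kubernetes container securityContext fields. RBAC scope is not covered here.

```python
def lint_container(container: dict) -> list[str]:
    """Return human-readable findings for one container spec (a plain dict)."""
    findings = []

    # The tag follows ':' in the last path segment; a missing tag implies 'latest'.
    image = container.get("image", "")
    last_segment = image.rsplit("/", 1)[-1]
    if "@" not in last_segment:  # images pinned by digest are skipped
        tag = last_segment.split(":", 1)[1] if ":" in last_segment else "latest"
        if tag == "latest":
            findings.append("image uses mutable 'latest' tag")

    sc = container.get("securityContext") or {}
    if sc.get("runAsNonRoot") is not True:
        findings.append("runAsNonRoot is not set to true")
    if sc.get("privileged") is True:
        findings.append("container is privileged")
    if sc.get("allowPrivilegeEscalation") is not False:
        findings.append("allowPrivilegeEscalation is not set to false")
    return findings
```

In practice this kind of rule is usually enforced in the cluster by an admission policy rather than an ad hoc script; the sketch only shows what such a rule looks at.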
Long-term memory corruption and data loss
(9) On February 5, 2025, ChatGPT's memory system silently broke, causing users to lose years of accumulated context, including names, timelines, and entire creative projects. Some users lost files and context without warning and with no way to recover the data.
Security vulnerabilities and account hijacking risks
(9) Persistent security vulnerabilities exist in OpenAI's platform, with documented instances of account hijacking and authentication exposure. Developers lack clear security protocols and data privacy safeguards.
Abrupt Free Tier Removal and Quota Slashing Without Notice
(9) Google removed free-tier access to Gemini 2.5 Pro entirely and cut Gemini 2.5 Flash daily limits by 92% (from 250 to 20 requests) with no advance notice, email alerts, or grace period. Production applications broke overnight with HTTP 429 "quota exceeded" errors.
GitHub Actions lacks lockfile dependency management
(9) GitHub Actions has no lockfile system to pin exact versions of third-party actions. Every workflow run re-resolves dependencies from the manifest without recording what was actually chosen, creating non-deterministic builds and enabling supply chain attacks. This is a fundamental gap compared to mature package managers.
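A common partial mitigation is to reference each third-party action by a full 40-character commit SHA instead of a tag or branch. The Python sketch below scans workflow text for `uses:` entries that are not pinned that way. The function name unpinned_actions and the line-based regular expression are assumptions made for illustration; a production check would parse the YAML properly.

```python
import re

# Matches "uses: owner/repo@ref" with an optional list dash and optional quotes.
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*['\"]?([^'\"\s#]+)")
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return action references that are not pinned to a full commit SHA."""
    unpinned = []
    for line in workflow_text.splitlines():
        match = USES_RE.match(line)
        if not match:
            continue
        ref = match.group(1)
        # Local actions and docker images are out of scope for this sketch.
        if ref.startswith("./") or ref.startswith("docker://"):
            continue
        _, _, version = ref.partition("@")
        if not FULL_SHA_RE.match(version):
            unpinned.append(ref)
    return unpinned
```

SHA pinning covers only the references written in the workflow file itself. It does not record actions that a pinned composite action references internally, which is part of the gap a lockfile would close.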
Incomplete or skipped token validation in APIs
(9) APIs frequently validate only that a token is present rather than performing full server-side validation of signature, issuer, audience, expiry, and required scopes, leaving the system vulnerable to forged or expired tokens.
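The checks listed above can be sketched with the Python standard library alone. The example assumes an HS256-signed JWT and a space-delimited `scope` claim; both are simplifying assumptions, since many identity providers sign with asymmetric keys and name the scope claim differently. The functions sign_hs256 and validate_token are hypothetical helpers, and a real service would normally rely on a maintained JWT library and the provider's published signing keys.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_decode(part: str) -> bytes:
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Demo helper: build an HS256 JWT so the validator can be exercised."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def validate_token(token, secret, issuer, audience, required_scopes, now=None):
    """Check signature, issuer, audience, expiry, and scopes; raise ValueError on failure."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(sig_b64)
    except (ValueError, TypeError) as exc:
        raise ValueError("malformed token") from exc
    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise ValueError("malformed token")

    # Signature: accept only the expected algorithm, compare in constant time.
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")

    if claims.get("iss") != issuer:
        raise ValueError("wrong issuer")
    aud = claims.get("aud")
    if audience not in (aud if isinstance(aud, list) else [aud]):
        raise ValueError("wrong audience")

    now = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now >= exp:
        raise ValueError("token expired or missing exp")

    granted = set(str(claims.get("scope", "")).split())
    if not set(required_scopes) <= granted:
        raise ValueError("missing required scope")
    return claims
```

Each check raises on failure, so a handler that receives the returned claims knows that all five conditions held and not merely that a token was present.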
AWS IAM permission model is fundamentally broken for security requirements
(9) AWS IAM's core design prioritizes deterministic permission evaluation over security usability, resulting in a system where CRUD-style permissions cannot be implemented auditably. The architecture relies on low-level API action lists with complex boolean logic (the 'deny sandwich'), strict character limits that force wildcard usage, and new actions added without warning, which makes basic security expectations impossible to meet.
Corrupted or malicious npm package code breaking builds worldwide
(9) Popular npm libraries like Faker.js and Colors.js have had their source code corrupted by maintainers, causing widespread build failures across millions of dependent projects. When heavily-used small modules maintained by 1-2 people break, the impact cascades globally.
95% Failure Rate in Corporate AI Agent Projects
(9) 95% of generative AI business projects fail in production. This systemic failure rate reflects fundamental challenges in building AI agents that remain relevant, adaptable, and trustworthy over time.
Non-deterministic and non-repeatable agent behavior
(9) AI agents behave differently for the exact same input, making repeatability nearly impossible. This non-deterministic behavior is a core reliability issue that prevents developers from confidently shipping features or trusting agents to run autonomously in production.
Poor page rendering performance at scale
(9) Next.js exhibits slow page rendering in production: basic pages take 200-400ms, large dynamic pages exceed 700ms, and crawlers hitting multiple pages simultaneously cause site crashes. Caching is unpredictable across replicas.
Widespread use of end-of-life PHP versions creates security vulnerabilities
(9) 55% of PHP teams are still running at least one EOL version, with 70% of those lacking security confidence. Deprecated versions like PHP 7.1 (44% of WordPress sites) present genuine security risks and are frequent targets for hackers.