Pains

2403 pains collected

mongos (sharding router) crashes frequently under load

The mongos routing layer is unreliable and crashes every few hours to days under load. Some crashes involve assertion failures that don't fully terminate the process, leaving it in a broken state even with restart supervision.

deploy, MongoDB

Breaking change in HTTP query template function usage

Neon's Node.js SDK v19+ introduces a breaking change in how the HTTP query template function can be called. Calling it as a conventional function (with parentheses) now throws an error because that form is an SQL injection risk, so developers must update their applications.

compatibility, Neon, Node.js, SQL
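
The injection risk behind this change can be illustrated with a minimal sketch. This is not Neon's API: it uses Python's standard `sqlite3` module as an analogy, and the table, function names, and payload are hypothetical. It contrasts splicing a value into the SQL text with passing it as a bound parameter, which is the separation a tagged-template call is meant to enforce.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    # Hypothetical in-memory table used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Anti-pattern: the value is spliced into the SQL text itself.
    return conn.execute(f"SELECT id FROM users WHERE name = '{name}'").fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # The value travels separately from the SQL text as a bound parameter.
    return conn.execute("SELECT id FROM users WHERE name = ?", (name,)).fetchall()
```

With a payload such as `x' OR '1'='1`, the unsafe variant matches every row, while the parameterized variant treats the payload as a literal name and matches nothing.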

Docker build reproducibility issues with dependency version changes

Docker builds that pull dependencies from the public internet at build time are not reproducible over time. Subsequent builds may pull different dependency versions, and if the exact versions are no longer available the build fails, blocking deployments.

build, Docker
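
One way to make such drift visible is to pin each downloaded artifact to a content digest and fail fast on a mismatch. The sketch below is an assumption about how a build step might do this, not part of Docker itself; `verify_pinned` and its arguments are hypothetical names.

```python
import hashlib


def verify_pinned(data: bytes, expected_sha256: str) -> bytes:
    """Return the artifact only if its SHA-256 digest matches the pinned value."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        # A changed upstream artifact surfaces here instead of silently
        # producing a different image.
        raise ValueError(f"digest mismatch: expected {expected_sha256}, got {actual}")
    return data
```

Checking a digest does not make a vanished version available again, so it is usually paired with a local mirror or cache of the pinned artifacts.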

Prisma environment variable handling breaks in monorepos and ESM contexts

Prisma struggles to correctly load `.env` files in monorepo setups, doesn't support NODE_ENV-based `.env` switching, and silently pollutes `process.env` without explicit dotenv usage. Recent versions (6.7.0+) have introduced critical ESM-related module resolution failures across Turborepo, Next.js, Remix, and other frameworks.

config, Prisma, Turborepo, Next.js, +2

Self-Hosted Deployment Complexity

Self-hosted Sentry is a distributed system requiring management of PostgreSQL, ClickHouse, Kafka, and Redis. It demands dedicated DevOps/SRE resources for scaling and maintenance, often resulting in total cost of ownership exceeding SaaS pricing.

deploy, Sentry, PostgreSQL, ClickHouse, +2

Project File Access Regression Breaking Existing Workflows

Project file access functionality broke on November 25 after previously working, with no rollback capability. This regression persists during investigation, blocking users who relied on this feature.

deploy, Anthropic Console, project files

Extremely slow bulk delete operations

MongoDB's CUD (Create, Update, Delete) operations are inefficient at scale. Deleting all documents from a 50-million-document collection takes many hours, forcing developers to drop and recreate collections instead. MongoDB lacks a TRUNCATE TABLE equivalent.

performance, MongoDB

Premature Microservices Adoption Creates Operational Complexity

Teams adopt microservices before understanding the business domain, resulting in distributed transactions, data consistency issues, painful debugging, and unnecessary operational complexity that becomes a blocker for scalability rather than an enabler.

architecture, Microservices, Distributed Systems

Sensitive data exposure and authorization complexity

GraphQL's unified endpoint and flexible query structure can inadvertently expose sensitive data. Without strict authentication and authorization checks at the field level, unauthorized users can query restricted information. Field-level security is complex, error-prone, and can cause entire requests to fail.

security, GraphQL
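
A minimal sketch of a field-level check, written in plain Python rather than against any real GraphQL library. The policy table `FIELD_ROLES`, the role names, and `resolve_fields` are hypothetical. It nulls out an unauthorized field and records an error for it, rather than failing the whole request.

```python
from typing import Any, Dict, List, Set

# Hypothetical policy: field name -> role required to read it.
FIELD_ROLES: Dict[str, str] = {"email": "admin", "salary": "admin"}


def resolve_fields(record: Dict[str, Any], requested: List[str], roles: Set[str]) -> Dict[str, Any]:
    """Resolve requested fields, returning None plus an error for forbidden ones."""
    data: Dict[str, Any] = {}
    errors: List[Dict[str, Any]] = []
    for field in requested:
        required = FIELD_ROLES.get(field)
        if required is not None and required not in roles:
            # Deny this one field but keep resolving the rest of the request.
            data[field] = None
            errors.append({"path": [field], "message": "forbidden"})
        else:
            data[field] = record.get(field)
    return {"data": data, "errors": errors}
```

The design choice here is partial results: a caller without the `admin` role still receives the fields it is allowed to see, and the `errors` list says which paths were withheld.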

Sharding fails under high load during chunk migration

Adding a shard to a MongoDB cluster under heavy load is problematic. MongoDB either migrates chunks so aggressively that it causes DoS conditions on production traffic, or refuses to move chunks at all, making it unsuitable for high-traffic sites with heavy write volumes.

deploy, MongoDB

Replication becomes bottleneck on busy servers

Replication on heavily loaded MongoDB servers either causes DoS on the master or replicates so slowly that the operation log is exhausted, requiring very large oplog sizes (e.g., 50GB) and still failing to keep up.

performance, MongoDB
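
As a rough worked example of why a slow secondary exhausts the oplog: the oplog is a fixed-size log, so the time window it covers is approximately its size divided by the sustained write rate. The write rate used below is a hypothetical figure, not one from the report, and `oplog_window_hours` is an illustrative helper.

```python
def oplog_window_hours(oplog_size_gb: float, write_rate_mb_per_s: float) -> float:
    """Approximate hours of history a fixed-size oplog holds at a steady write rate."""
    if write_rate_mb_per_s <= 0:
        raise ValueError("write rate must be positive")
    seconds = (oplog_size_gb * 1024) / write_rate_mb_per_s
    return seconds / 3600


# Hypothetical load: a 50 GB oplog receiving 20 MB/s of oplog entries.
window = oplog_window_hours(50, 20)
```

Under that assumed load the window is well under an hour, so a secondary that falls further behind than that can no longer catch up from the oplog alone.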

Production Database Concurrency Issues

The database integration pattern recommended in the official FastAPI documentation, which uses dependencies, leads to deadlocks under higher concurrency in production environments.

compatibility, FastAPI

Lack of observability makes it impossible to trust agents in production

94% of organizations with agents in production have implemented observability tooling, because agents cannot be trusted without visibility into execution traces and reasoning. Missing observability remains a blocker for production deployment, even though 89% of organizations have attempted to adopt it.

monitoring, observability, tracing, logging

Security vulnerabilities with unbundled dev servers over networks

Unbundled dev servers can expose sensitive files and create unintended access vulnerabilities when exposed over networks for testing, requiring explicit permissions and careful configuration to mitigate risks.

security, Vite

Garbage collection causes unpredictable latency

Go's garbage collector is unpredictable and unsuitable for latency-sensitive environments like high-frequency trading or real-time analytics. GOMEMLIMIT is described as unreliable, allowing requests 10x over the limit.

performance, Go

Global write lock kills performance under heavy write loads

MongoDB requires a global write lock for any write operation. Under write-heavy loads, this severely degrades performance, making it unsuitable for applications with balanced or write-heavy read/write ratios.

performance, MongoDB

Overly broad scopes and long-lived access tokens

Teams define scopes too broadly (e.g., `full_access`, `admin_all`) and issue access tokens valid for hours or days instead of minutes, dramatically increasing the blast radius if a token is stolen.

security, OAuth 2.0
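
A minimal sketch of the opposite practice, assuming an in-process token model: tokens carry a narrow scope set and a lifetime measured in minutes. The `AccessToken` type, the scope names, and the 300-second default are illustrative assumptions, not values from any specific provider.

```python
import secrets
import time
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class AccessToken:
    value: str
    scopes: FrozenSet[str]
    expires_at: float  # Unix timestamp after which the token is rejected


def issue_token(scopes: Iterable[str], ttl_seconds: int = 300,
                now: Optional[float] = None) -> AccessToken:
    """Issue a token limited to the given scopes and a short lifetime."""
    now = time.time() if now is None else now
    return AccessToken(secrets.token_urlsafe(32), frozenset(scopes), now + ttl_seconds)


def is_allowed(token: AccessToken, required_scope: str,
               now: Optional[float] = None) -> bool:
    """Allow a request only if the token is unexpired and holds the exact scope."""
    now = time.time() if now is None else now
    return now < token.expires_at and required_scope in token.scopes
```

A token issued for `invoices:read` is then useless for `invoices:write`, and a stolen token stops working once its few minutes are up.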

Authorization code and access token leakage through redirect vulnerabilities

OAuth implementations risk leaking authorization codes via HTTP `Referer` headers and access tokens through URL hash fragments. Redirect hijacking vulnerabilities enable account takeover, and the optional `state` token that protects against CSRF is frequently ignored in implementations.

security, OAuth 2.0
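
The `state` check can be sketched as follows, assuming the client stores the generated value in the user's session before redirecting and compares it on the callback. The helper names are hypothetical, and session storage is left out.

```python
import hmac
import secrets
from typing import Optional


def new_state() -> str:
    """Generate an unguessable value to send as the `state` parameter."""
    return secrets.token_urlsafe(32)


def state_matches(expected: str, returned: Optional[str]) -> bool:
    """Accept the callback only if the returned state equals the stored one."""
    if not returned:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), returned.encode())
```

A callback with a missing or mismatched `state` should be rejected before the authorization code is exchanged.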

Serverless function timeout limits prevent complex workloads

Vercel's serverless functions have a 10-second timeout limit on the free tier and 60-300 second limits on paid plans, causing issues with complex payment processing, long-running agents, and AI workloads. Documentation claims 300 seconds, but functions time out at 60 seconds under load. Edge functions have even stricter limits and lack full Node.js compatibility.

performance, Vercel, serverless functions, edge functions

Using wrong OAuth 2.0 grant types for the scenario

Developers select inappropriate grant types (e.g., Client Credentials for user authentication, Implicit or Password grant) without considering whether the client can securely store secrets, leading to security vulnerabilities and blurred trust boundaries.

auth, OAuth 2.0

Slow Maintainer Response and PR Review Bottleneck

The FastAPI maintainer (@tiangolo) is a bottleneck for development; most PRs go months without response, require extensive rework, or remain unmerged despite being high-quality. No delegation of merge permissions limits community contribution.

ecosystem, FastAPI

Big-Bang SwiftUI Rewrite Risk for Legacy Applications

Wholesale adoption of SwiftUI to rewrite large, long-lived applications introduces significant business risk. Incremental migration strategies focusing only on the view layer while preserving UIKit navigation are recommended but require more planning than big-bang rewrites.

migration, SwiftUI, UIKit

Frequent pipeline failures in interconnected services

Pipeline failures occur frequently in enterprise environments when changes affect multiple interconnected services, stretching MTTR into hours.

deploy, CI/CD, microservices

Configuration drift from identical dev and prod manifests

Using the same Kubernetes manifests across development, staging, and production without environment-specific customization leads to instability, poor performance, and security gaps. Environment factors like traffic patterns, scaling needs, and access control differ significantly.

config, Kubernetes
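
Environment-specific customization is commonly expressed as a shared base plus a small per-environment overlay. The sketch below shows that idea as a recursive dictionary merge. It is a simplified stand-in for what overlay tools do, not their actual behavior, and the manifest keys and values are hypothetical.

```python
import copy
from typing import Any, Dict


def apply_overlay(base: Dict[str, Any], overlay: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of base with overlay values merged in, recursing into nested dicts."""
    merged = copy.deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_overlay(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical base settings shared by every environment.
BASE = {"replicas": 1, "resources": {"limits": {"cpu": "250m", "memory": "256Mi"}}}
# Hypothetical production overlay: only the values that differ.
PROD = {"replicas": 6, "resources": {"limits": {"cpu": "2"}}}

prod_manifest = apply_overlay(BASE, PROD)
```

Keeping the overlay small makes the differences between environments explicit and reviewable, while the base stays untouched.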