gist.github.com

Don't Use MongoDB

11/7/2011 (Updated 3/23/2026)

Excerpt

**2. MongoDB can lose data in many startling ways**

Here is a list of ways we personally experienced records go missing:

1. They just disappeared sometimes. Cause unknown.
2. Recovery on a corrupt database was not successful, pre transaction log.
3. Replication between master and slave had *gaps* in the oplogs, causing slaves to be missing records the master had. Yes, there is no checksum, and yes, the replication status showed the slaves as current.
4. Replication just stops sometimes, without error. Monitor your replication status!

**3. MongoDB requires a global write lock to issue any write**

Under a write-heavy load, this will kill you. If you run a blog, you maybe don't care b/c your R:W ratio is so high.

**4. MongoDB's sharding doesn't work that well under load**

Adding a shard under heavy load is a nightmare. Mongo either moves chunks between shards so quickly it DoSes the production traffic, or refuses to move chunks altogether. This pretty much makes it a non-starter for high-traffic sites with heavy write volume.

**5. mongos is unreliable**

The mongod/config server/mongos architecture is actually pretty reasonable and clever. Unfortunately, mongos is complete garbage. Under load, it crashed anywhere from every few hours to every few days. Restart supervision didn't always help b/c sometimes it would throw some assertion that would bail out a critical thread, but the process would stay running. Double fail.

…

**7. Things were shipped that should have never been shipped**

Things with known, embarrassing bugs that could cause data problems were in "stable" releases--and often we weren't told about these issues until after they bit us, and then only b/c we had a super duper crazy platinum support contract with 10gen. The response was to send up a hot patch, which they were calling an RC internally, and then run that on our data.

**8. Replication was lackluster on busy servers**

Replication would often, again, either DoS the master, or replicate so slowly that it would take far too long and the oplog would be exhausted (even with a 50G oplog).

…

Unfortunately, it doesn't matter. The real problem is that so many of these problems existed in the first place. Database developers must be held to a higher standard than your average developer. Namely, your priority list should typically be something like:

1. Don't lose data, be very deterministic with data
2. Employ practices to stay available
3. Multi-node scalability
4. Minimize latency at 99% and 95%
5. Raw req/s per resource
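
The replication gaps described under point 2 went unnoticed because the reported replication status looked current while records were missing. One way to catch that kind of silent divergence is to compare content digests from both nodes instead of trusting a status flag. The following is a minimal sketch in Python, not a MongoDB API: it assumes documents have already been fetched from each node as plain dicts with an `_id` field, and the names `digest_docs` and `compare_nodes` are hypothetical.

```python
import hashlib
import json


def digest_docs(docs):
    """Map each document's _id to a SHA-256 digest of its canonical JSON form."""
    digests = {}
    for doc in docs:
        # sort_keys makes the digest independent of field order;
        # default=str is an assumption for values JSON cannot encode natively.
        canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"), default=str)
        digests[doc["_id"]] = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return digests


def compare_nodes(primary_docs, replica_docs):
    """Return (_ids missing on the replica, _ids whose content differs)."""
    primary = digest_docs(primary_docs)
    replica = digest_docs(replica_docs)
    missing = [key for key in primary if key not in replica]
    mismatched = [
        key for key in primary
        if key in replica and primary[key] != replica[key]
    ]
    return missing, mismatched


if __name__ == "__main__":
    # Hypothetical sample data standing in for reads from each node.
    primary_sample = [{"_id": 1, "v": "a"}, {"_id": 2, "v": "b"}, {"_id": 3, "v": "c"}]
    replica_sample = [{"_id": 1, "v": "a"}, {"_id": 3, "v": "x"}]
    print(compare_nodes(primary_sample, replica_sample))  # → ([2], [3])
```

A check like this only covers the documents it is given, so on a large collection it would likely be run over sampled ranges of `_id` values rather than the full data set.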

Source URL

https://gist.github.com/mitio/1343383

Related Pain Points