Don't Use MongoDB
**2. MongoDB can lose data in many startling ways**

Here is a list of ways we personally experienced records go missing:

1. They just disappeared sometimes. Cause unknown.
2. Recovery on a corrupt database was not successful, pre transaction log.
3. Replication between master and slave had *gaps* in the oplogs, causing slaves to be missing records the master had. Yes, there is no checksum, and yes, the replication status said the slaves were current.
4. Replication just stops sometimes, without error. Monitor your replication status!

**3. MongoDB requires a global write lock to issue any write**

Under a write-heavy load, this will kill you. If you run a blog, you maybe don't care b/c your R:W ratio is so high.

**4. MongoDB's sharding doesn't work that well under load**

Adding a shard under heavy load is a nightmare. Mongo either moves chunks between shards so quickly it DOSes the production traffic, or refuses to move chunks altogether. This pretty much makes it a non-starter for high-traffic sites with heavy write volume.

**5. mongos is unreliable**

The mongod/config server/mongos architecture is actually pretty reasonable and clever. Unfortunately, mongos is complete garbage. Under load, it crashed anywhere from every few hours to every few days. Restart supervision didn't always help b/c sometimes it would throw some assertion that would bail out a critical thread, but the process would stay running. Double fail.

…

**7. Things were shipped that should have never been shipped**

Things with known, embarrassing bugs that could cause data problems were in "stable" releases, and often we weren't told about these issues until after they bit us, and then only b/c we had a super duper crazy platinum support contract with 10gen. The response was to send us a hot patch that they were calling an RC internally, and have us run that on our data.

**8. Replication was lackluster on busy servers**

Replication would often, again, either DOS the master, or replicate so slowly that it would take far too long and the oplog would be exhausted (even with a 50G oplog).

…

Unfortunately, it doesn't matter. The real problem is that so many of these problems existed in the first place. Database developers must be held to a higher standard than your average developer. Namely, your priority list should typically be something like:

1. Don't lose data; be very deterministic with data.
2. Employ practices to stay available.
3. Multi-node scalability.
4. Minimize latency at 99% and 95%.
5. Raw req/s per resource.
Source URL: https://gist.github.com/mitio/1343383

**Related Pain Points**
**Unpredictable data loss in production**

MongoDB has exhibited severe data loss issues including unexplained record disappearance, unsuccessful recovery from corruption, replication gaps causing missing records on slaves, and replication stopping without errors.
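The "gaps with no checksum" failure above can be made concrete with a small sketch. This is plain Python with hypothetical helper names, not MongoDB's actual oplog format: it shows how folding each node's sequence of oplog entry identifiers into a digest would expose a silently dropped entry that a "slave is current" status check misses.

```python
import hashlib

def oplog_checksum(entry_ids):
    """Fold a sequence of oplog entry identifiers into one digest;
    identical entry sequences yield identical digests, so a mismatch
    proves the slave's oplog diverged from the master's."""
    h = hashlib.sha256()
    for eid in entry_ids:
        h.update(str(eid).encode("utf-8"))
    return h.hexdigest()

def find_gaps(master_ids, slave_ids):
    """Return entry ids present on the master but missing on the slave."""
    return sorted(set(master_ids) - set(slave_ids))

master = [1, 2, 3, 4, 5]
slave = [1, 2, 4, 5]   # entry 3 was silently dropped during replication

assert oplog_checksum(master) != oplog_checksum(slave)
print(find_gaps(master, slave))  # [3]
```

The point of the digest is that it is cheap to compare periodically, whereas comparing full record sets across nodes is not.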
**Global write lock kills performance under heavy write loads**

MongoDB requires a global write lock for any write operation. Under write-heavy loads, this severely degrades performance, making it unsuitable for applications with balanced or write-heavy read/write ratios.
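Back-of-envelope arithmetic shows why a global write lock caps throughput. The function below is an illustrative model (not a MongoDB benchmark): under one global lock all writes serialize, so throughput is bounded by the reciprocal of the average write latency no matter how many client threads contend; the same numbers with independent locks would scale with the writer count.

```python
def max_write_throughput(avg_write_ms, writers):
    """Model the throughput ceiling of serialized vs. parallel writes.

    Under a single global write lock, only one write runs at a time,
    so the cluster-wide cap is 1000 / avg_write_ms writes/sec.
    With independent locks (e.g. per-collection), the same workload
    could in principle scale linearly with the number of writers.
    """
    serialized_cap = 1000.0 / avg_write_ms
    parallel_cap = serialized_cap * writers
    return serialized_cap, parallel_cap

global_cap, parallel_cap = max_write_throughput(avg_write_ms=2.0, writers=8)
print(global_cap)    # 500.0 writes/sec regardless of writer count
print(parallel_cap)  # 4000.0 writes/sec if writes did not serialize
```

This is why the excerpt notes a read-heavy blog "maybe doesn't care": the serialized ceiling only hurts once your write rate approaches it.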
**Sharding fails under high load during chunk migration**

Adding a shard to a MongoDB cluster under heavy load is problematic. MongoDB either migrates chunks so aggressively that it causes DoS conditions on production traffic, or refuses to move chunks at all, making it unsuitable for high-traffic sites with heavy write volumes.
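The all-or-nothing behavior described above is the absence of rate limiting. As a sketch of the middle ground (not MongoDB's balancer; class and parameter names here are hypothetical), a token-bucket throttle would let chunk migrations proceed at a bounded rate instead of either flooding production traffic or stalling entirely:

```python
class MigrationThrottle:
    """Token bucket: allow at most `rate` chunk moves per second,
    with bursts of up to `burst`, so migrations make steady progress
    without saturating the cluster."""

    def __init__(self, rate, burst):
        self.rate = float(rate)    # tokens refilled per second
        self.burst = float(burst)  # maximum stored tokens
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now):
        # Refill tokens for elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0  # spend one token per chunk move
            return True
        return False

throttle = MigrationThrottle(rate=1, burst=2)  # 1 move/sec, burst of 2
decisions = [throttle.allow(t) for t in [0.0, 0.1, 0.2, 1.2]]
print(decisions)  # [True, True, False, True]
```

The third request is denied because the burst is spent; the fourth succeeds after a second of refill, which is exactly the "steady but bounded" behavior the excerpt found missing.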
**mongos (sharding router) crashes frequently under load**

The mongos routing layer is unreliable and crashes every few hours to days under load. Some crashes involve assertion failures that don't fully terminate the process, leaving it in a broken state even with restart supervision.
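The "double fail" here is instructive: a supervisor that only watches for process exit misses the alive-but-broken state. A minimal sketch of the fix, assuming a hypothetical application-level probe (e.g. a cheap query through the router), pairs the liveness check with a responsiveness check:

```python
def supervise(process_running, probe_ok):
    """Decide whether to restart a routing process.

    A plain supervisor only checks process_running, which misses the
    failure mode where an assertion kills a critical thread but the
    process stays up. Adding an application-level probe catches both.
    """
    if not process_running:
        return "restart: process dead"
    if not probe_ok:
        return "restart: process alive but unresponsive"
    return "ok"

print(supervise(process_running=True, probe_ok=True))   # ok
print(supervise(process_running=False, probe_ok=False)) # restart: process dead
print(supervise(process_running=True, probe_ok=False))
# restart: process alive but unresponsive  <- the case supervision missed
```

In practice the probe would be an end-to-end request through mongos with a timeout, so a bailed-out internal thread shows up as a probe failure rather than staying invisible.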
**Replication becomes a bottleneck on busy servers**

Replication on heavily loaded MongoDB servers either causes DoS on the master or replicates so slowly that the operation log is exhausted, requiring very large oplog sizes (e.g., 50GB) and still failing to keep up.
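"The oplog was exhausted" has a simple arithmetic meaning: the oplog is a fixed-size window, and a slave that lags beyond it can never catch up and must be fully resynced. A rough sizing calculation (illustrative numbers, not measurements from the original post) shows how little slack even a 50 GB oplog buys under sustained write traffic:

```python
def oplog_window_hours(oplog_gb, write_mb_per_sec):
    """Hours of history a fixed-size oplog retains at a sustained
    oplog write rate. A slave lagging beyond this window is
    unrecoverable via replication and needs a full resync."""
    oplog_mb = oplog_gb * 1024.0
    return oplog_mb / write_mb_per_sec / 3600.0

# A 50 GB oplog under 10 MB/s of sustained oplog traffic:
print(round(oplog_window_hours(50, 10), 2))  # 1.42 hours of slack
```

So at that write rate a slave only has about an hour and a half to fall behind before replication is unrecoverable, which is why slow replication on busy servers was fatal rather than merely annoying.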