Monday, December 30, 2013

Unjustified MongoDB Complaints

I have done some research on MongoDB recently. I took their online course for MongoDB developers and read the book MongoDB: The Definitive Guide. I also went through their certification program, took the test and became a Certified MongoDB Developer.

There are lots of heated discussions about the drawbacks of MongoDB in online forums. I am going to discuss some complaints that are not justified.

Complaint 1: MongoDB does not support joins. That is a typical trait of document and column-family NoSQL databases. If we allowed joins between collections, scalability would suffer. The key to MongoDB schema design is understanding the data usage pattern. We usually use "embedding" to store related data together in one document. Because this data is usually accessed together, there is no need for a join operation at runtime. A typical example is an order and its line items. Since the order and its line items are always accessed together, it makes sense to denormalize and store them in one single document. On the other hand, the order has a customer, which probably lives in a separate collection. That join has to be done on the client side. A client can read the customer_id from the order document, and then fetch the data from the customer collection separately as needed.
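
Here is a rough sketch of that order/customer layout in plain Python, with dictionaries standing in for documents and collections. The field names (`customer_id`, `line_items`, `sku`) and the helper functions are made up for illustration; a real application would run two queries through a MongoDB driver instead of two dictionary lookups.

```python
# In-memory stand-ins for two collections, keyed by _id (illustrative data).
customers = {
    "c1": {"_id": "c1", "name": "Alice", "city": "Austin"},
}

orders = {
    "o1": {
        "_id": "o1",
        "customer_id": "c1",  # reference to a document in another collection
        "line_items": [       # embedded: always read together with the order
            {"sku": "A-100", "qty": 2, "price": 9.5},
            {"sku": "B-200", "qty": 1, "price": 20.0},
        ],
    },
}


def order_total(order):
    """Sum the embedded line items; no second lookup is needed."""
    return sum(item["qty"] * item["price"] for item in order["line_items"])


def fetch_order_with_customer(order_id):
    """Client-side join: read the order, then fetch the customer it points to."""
    order = orders[order_id]
    customer = customers.get(order["customer_id"])
    return {"order": order, "customer": customer}


result = fetch_order_with_customer("o1")
print(order_total(result["order"]))
print(result["customer"]["name"])
```

The line items come back with the order in a single read, while the customer costs one extra lookup that the client only pays when it actually needs the customer data.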

Complaint 2: MongoDB does not support transactions. First of all, let's clarify that MongoDB does support atomic operations on a single document, but it does not support atomic transactions across more than one document or more than one collection. There is a way to do a two-phase commit, as described in the MongoDB documentation, but boy, it is really convoluted. The idea is that in most cases, data that is accessed together should be stored in one document using "embedding", so we do not need cross-document transactions.
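
To make the single-document case concrete, here is a minimal sketch that builds one update touching two fields of the same order document. The `$push` and `$inc` operators are real MongoDB update operators; the helper name and the `line_items` and `total` fields are assumptions for illustration. Nothing here talks to a server: the two dictionaries are what you would hand to a driver's update call (for example `update_one` in current PyMongo).

```python
def build_add_item_update(order_id, item):
    """Return (filter, update) that adds a line item and bumps the order total.

    Both changes target the same document, so the server applies them
    as one atomic operation.
    """
    query = {"_id": order_id}
    update = {
        "$push": {"line_items": item},                   # append to embedded array
        "$inc": {"total": item["qty"] * item["price"]},  # adjust running total
    }
    return query, update


query, update = build_add_item_update(
    "o1", {"sku": "C-300", "qty": 3, "price": 4.0}
)
print(query)
print(update)
```

If the line items lived in their own collection, the same change would need two separate writes with no atomicity between them, which is exactly the situation embedding avoids.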

Complaint 3: Map-Reduce in MongoDB is slow because it is single-threaded. That's old news. As of MongoDB 2.4, the SpiderMonkey JavaScript engine has been replaced by the V8 JavaScript engine. There is no longer a global JavaScript lock, which means that multiple Map/Reduce threads can run concurrently. Also, MongoDB 2.2 introduced an aggregation framework that can replace lots of map-reduce jobs. The aggregation framework runs much faster than a similar map-reduce job.
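
As an illustration of the kind of job the aggregation framework can take over, here is a small pipeline that totals order amounts per customer. `$match`, `$group` and `$sum` are real aggregation operators; the field names (`status`, `customer_id`, `amount`) and the sample data are assumptions. The `run_locally` function is only a pure-Python imitation of what the pipeline computes, so the example runs without a server.

```python
from collections import defaultdict

# Pipeline as it would be passed to a collection's aggregate() call.
pipeline = [
    {"$match": {"status": "shipped"}},
    {"$group": {"_id": "$customer_id", "total": {"$sum": "$amount"}}},
]


def run_locally(docs):
    """Pure-Python imitation of the pipeline above, for illustration only."""
    totals = defaultdict(float)
    for doc in docs:
        if doc["status"] == "shipped":                   # the $match stage
            totals[doc["customer_id"]] += doc["amount"]  # the $group/$sum stage
    return dict(totals)


sample_orders = [
    {"customer_id": "c1", "status": "shipped", "amount": 10.0},
    {"customer_id": "c1", "status": "shipped", "amount": 20.0},
    {"customer_id": "c2", "status": "pending", "amount": 99.0},
    {"customer_id": "c2", "status": "shipped", "amount": 5.0},
]

print(run_locally(sample_orders))
```

The same result written as map-reduce would need a JavaScript map function, a reduce function and an output collection, whereas the pipeline is just declarative data.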

Complaint 4: The default write behavior is unsafe. Again, this is old news. In the past, MongoDB's default write concern was unacknowledged, which means MongoDB did not acknowledge the receipt of a write operation. It was a fire-and-forget operation. But this default behavior has been changed. The MongoDB drivers have a new connection class named MongoClient. The default write concern on the new MongoClient class is to acknowledge all write operations.
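
If you want to be explicit about the write concern rather than rely on the default, one option is to spell it out in the connection string. The sketch below only builds the URI; the helper name `mongo_uri` is made up, while `w` and `journal` are standard MongoDB connection string options (`w=0` is the old fire-and-forget behavior, `w=1` is acknowledged).

```python
from urllib.parse import urlencode


def mongo_uri(host="localhost", port=27017, w=1, journal=False):
    """Build a connection string with an explicit write concern."""
    options = {"w": w}
    if journal:
        options["journal"] = "true"  # also wait for the on-disk journal
    return "mongodb://{}:{}/?{}".format(host, port, urlencode(options))


print(mongo_uri())        # acknowledged writes
print(mongo_uri(w=0))     # unacknowledged, fire-and-forget
print(mongo_uri(w="majority", journal=True))
```

The resulting string can be passed to MongoClient in any of the official drivers.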