Thursday, November 28, 2013

Design considerations for a MongoDB Log4j appender

I recently implemented a MongoDB Log4j appender. The motivation behind it is that we are considering expanding our WebLogic cluster to more nodes. Our current logging is file-based, and as more and more nodes are added to the cluster, it becomes hard to check logs that are spread across multiple machines. MongoDB gives me a central location to inspect the logs for the whole cluster. Besides, it is easier to query and analyze log data in MongoDB than in flat files.

There are a few key design considerations I would like to share:

  1. Choose the right collection type. You can choose a capped collection or a TTL collection. A capped collection is a fixed-size collection that supports high-throughput operations that insert, retrieve, and delete documents based on insertion order. When the size limit is exceeded, the oldest documents in the collection are removed automatically. A TTL collection makes it possible to store data in MongoDB and remove outdated documents automatically after a specified number of seconds or at a specific clock time. We chose the capped collection because our current file-based Log4j rotation is also size-based.
  2. Be careful with the schema design. We should extract the relevant information, such as the timestamp, from the log data into individual fields in a JSON document. The timestamp should be stored as a Date type (what the mongo shell displays as ISODate) rather than as a string. This makes your documents smaller and easier to analyze.
  3. Choose the right write concern. If you choose the "unacknowledged" write concern, the DB write operation is asynchronous and returns very fast. The "acknowledged" write concern is much safer, but the speed and throughput are worse. In the Log4j appender design, I decided to use the unacknowledged write concern for info/debug logging and the acknowledged write concern for warn/error logging.
  4. Consider using AsyncAppender. Even though the acknowledged/unacknowledged write-concern mix achieves a good balance between data durability and write speed, it still slows the system down a bit (around 10% overhead in our case). You can consider attaching the MongoDB appender to an AsyncAppender, which hands log events off to a background thread.
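For reference, the two collection types in point 1 can be created from the mongo shell like this (the collection names, size, and expiry period are illustrative, not the ones we use):

```javascript
// Capped collection: fixed size in bytes; once full, the oldest
// documents are evicted automatically in insertion order.
db.createCollection("log4j", { capped: true, size: 100 * 1024 * 1024 })

// TTL alternative: a regular collection plus a TTL index that expires
// documents 7 days after the value of their "timestamp" field.
db.log4j_ttl.ensureIndex({ timestamp: 1 }, { expireAfterSeconds: 604800 })
```

Note that a capped collection bounds total size (like size-based file rotation), while a TTL index bounds age (like time-based rotation); that parallel is what drove our choice.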
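The document shape from point 2 can be sketched with plain Java (the field names here are illustrative). The driver's BasicDBObject is map-like, so the same structure applies; the key detail is that the timestamp is a java.util.Date, which the driver persists as a BSON Date (shown as ISODate in the shell) instead of a larger, harder-to-query string:

```java
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: flatten a logging event into individual fields rather than
// storing one raw, pre-formatted message string.
class LogDocument {
    static Map<String, Object> toDocument(long timestampMillis,
                                          String level,
                                          String logger,
                                          String host,
                                          String message) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("timestamp", new Date(timestampMillis)); // stored as BSON Date (ISODate)
        doc.put("level", level);     // e.g. "INFO", "ERROR" -- queryable on its own
        doc.put("logger", logger);   // originating logger/category
        doc.put("host", host);       // identifies which cluster node wrote the entry
        doc.put("message", message); // the rendered log message only
        return doc;
    }

    public static void main(String[] args) {
        System.out.println(toDocument(System.currentTimeMillis(),
                "INFO", "com.example.App", "node1", "started"));
    }
}
```

With fields like `level` and `host` split out, queries such as "all errors from node3 in the last hour" become simple field matches instead of string parsing.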
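The level-based write-concern split in point 3 boils down to a threshold on the Log4j level value. This sketch returns the concern by name to stay self-contained; a real appender would return the `com.mongodb.WriteConcern` constants instead:

```java
// Sketch: WARN and above get acknowledged (durable, slower) writes;
// INFO/DEBUG get unacknowledged (fire-and-forget, fast) writes.
class ConcernSelector {
    // Matches org.apache.log4j.Level.WARN_INT; DEBUG=10000, INFO=20000,
    // WARN=30000, ERROR=40000, FATAL=50000.
    static final int WARN_INT = 30000;

    static String concernFor(int levelInt) {
        return levelInt >= WARN_INT ? "acknowledged" : "unacknowledged";
    }

    public static void main(String[] args) {
        System.out.println("ERROR -> " + concernFor(40000));
        System.out.println("INFO  -> " + concernFor(20000));
    }
}
```

The rationale: losing a few debug lines if the process dies is acceptable, but a warn/error entry is exactly the record you need after a failure, so it is worth the round trip.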
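Point 4 needs no code of its own: Log4j 1.x ships an org.apache.log4j.AsyncAppender that buffers events and writes them on a background thread. A log4j.xml fragment wrapping the MongoDB appender (the "MONGODB" name is assumed) would look roughly like this:

```xml
<appender name="ASYNC" class="org.apache.log4j.AsyncAppender">
  <!-- number of events buffered before the calling thread blocks -->
  <param name="BufferSize" value="512"/>
  <appender-ref ref="MONGODB"/>
</appender>

<root>
  <priority value="info"/>
  <appender-ref ref="ASYNC"/>
</root>
```

Application threads then return as soon as the event is queued, which hides most of the remaining MongoDB write latency from request processing.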