28 3 / 2013

Yesterday I wrote a post arguing that Redis now needs a binary protocol. It seems people are actively listening, and I got a reply (which I am not ready to believe) in this post saying:

We actually found at Tumblr that the memcache binary protocol was less performant than the text protocol. I also know a number of other large web properties that use the memcache text protocol. I don’t think the ‘benefits’ of a binary protocol are super clear cut for applications like this.

There is so much in my head as an argument, ranging from fewer bytes on the wire (imagine reading a plain 4-byte integer versus reading the string ‘6599384’ and then parsing it into 4 bytes) to really simple jump tables for dispatching operations. Then I thought I would let the evidence speak for itself, so I wrote some Go code, available in this Gist, to benchmark 100K and 1M get operations (I also tried various other counts on various machines). I used Go for testing for several reasons: it saved me from complex configuration, it avoids any VM, it let me simulate a complete client library written purely in one language (assuming the implementations were decent enough), and it got the coding part done really quickly.

It’s a really simple benchmark: it always gets the same key (to remove any other variables) and simply discards the results, trying to measure pure get/parse time. The results were just what I expected: 100K gets took 6.560225s over the ASCII protocol and 5.288102s over the binary protocol, roughly 19% less time. As I scaled the number of gets up to 1M, the time grew linearly (see gist).
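The shape of such a benchmark can be sketched in a few lines of Go. This is an assumed reconstruction of the idea, not the Gist itself: `timeGets`, the key name, and the stub client are hypothetical, and a real run would pass in a get function backed by an ASCII or binary memcached client.

```go
package main

import (
	"fmt"
	"time"
)

// timeGets calls get n times with the same key, discards every value, and
// returns the elapsed wall-clock time. It stops at the first error.
func timeGets(n int, get func(key string) ([]byte, error)) (time.Duration, error) {
	start := time.Now()
	for i := 0; i < n; i++ {
		if _, err := get("benchmark-key"); err != nil {
			return time.Since(start), err
		}
	}
	return time.Since(start), nil
}

func main() {
	// stubGet stands in for a real client call (hypothetical); swap in a
	// function that talks to memcached over either protocol to compare them.
	stubGet := func(key string) ([]byte, error) {
		return []byte("value"), nil
	}

	for _, n := range []int{100000, 1000000} {
		elapsed, err := timeGets(n, stubGet)
		if err != nil {
			fmt.Println("benchmark failed:", err)
			return
		}
		fmt.Printf("%d gets took %v\n", n, elapsed)
	}
}
```

Passing the get function in as a parameter keeps the loop identical for both protocols, so any difference in timing comes from the client and the wire format rather than from the harness.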

In closing, I think an ASCII protocol can never beat a binary protocol (assuming you have not designed a stupid protocol). To me it sounds like interpreting bytecode versus interpreting lines of source code. There is a reason why many NoSQL stores ship either purely on a binary protocol or with a binary protocol as an alternative (e.g. Riak, with Protobufs alongside HTTP). I would love to know which libraries and methods Tumblr used to communicate with memcached over the binary protocol. I am not convinced! If readers of this post have done benchmarks of their own, or have anything that brings up a valid argument, feel free to share!

  1. maxpert posted this