thijsterlouw edited this page Sep 13, 2010 · 8 revisions

This is probably the best-performing Erlang Memcached client out there. It is thoroughly optimized for speed:

  • avoids calls to erlang:now() (which is serialized VM-wide to guarantee unique timestamps) by using NIFs (C code)
  • uses the Memcached binary protocol (sequence numbers + faster parsing)
  • avoids blocking the gen_server that maintains the persistent connections to the Memcached server:
      • it uses {active, true} for the TCP connection, so replies arrive as ordinary Erlang messages
      • if you use {active, false} (which almost every other client does), the gen_server process blocks in gen_tcp:recv/2 while waiting for a reply. The gen_server can then build up a huge inbox, which in turn means that many requests time out on the Erlang side (or you just get very bad performance)
  • it stores the configuration in dynamically compiled code. This is especially important for the Ketama-style consistent-hash ring and the binary-search lookup of a server address based on the hashed key:
      • if you store the config in a process, that process becomes a bottleneck: every lookup is serialized through its mailbox
      • if you read the config from a config file only, you don't get dynamic updates
      • you could store the points on the continuum in an ETS table, but if you store them all under one key, copying the whole term to the requesting process is very slow
      • if you store separate keys in ETS, it is already much faster, but you still have to go through ETS, which is comparatively slow
      • the best solution is to compile the configuration into a module dynamically and have one process responsible for maintaining the state and updating the generated module
      • even then, several options exist for the lookup itself: gb_trees, case matching, etc. It turns out that [erlang's case matching doesn't scale well](http://wiki.github.com/echou/memcached-client/continuum-generation) with many points on the ring (160 points * number of servers); gb_trees scales much better. We are also experimenting with a driver (C code) for this part: it would gain more speed, but makes flexibility a bit harder.
  • this client supports the concept of Memcached pools, so you can say servers A, B and C are together in one pool and servers D, E and F in another. When you have many servers, or you are migrating servers, this is a huge benefit. Fewer servers per pool also means that multi-get requests become faster.
  • supports very fast JSON encoder/decoder via a customized eep0018/yajl extension
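As a hedged illustration of the binary-protocol point above (the module and function names here are invented, not this client's actual API), the fixed 24-byte response header of the Memcached binary protocol can be consumed with a single Erlang binary pattern match; the Opaque field is what carries a request's sequence number back in the reply:

```erlang
-module(mc_binary).
-export([parse_response/1]).

%% Parse one response frame of the Memcached binary protocol.
%% Header layout (fixed by the protocol): magic 16#81 (response),
%% opcode, key length, extras length, data type, status, total body
%% length, opaque (echoed sequence number), CAS.
parse_response(<<16#81, Opcode, KeyLen:16, ExtrasLen:8, _DataType:8,
                 Status:16, BodyLen:32, Opaque:32, CAS:64,
                 Body:BodyLen/binary, Rest/binary>>) ->
    %% The body is extras, then key, then value.
    <<_Extras:ExtrasLen/binary, _Key:KeyLen/binary, Value/binary>> = Body,
    {ok, {Opcode, Status, Opaque, CAS, Value}, Rest};
parse_response(Bin) ->
    %% Incomplete frame: keep the bytes and wait for more data.
    {more, Bin}.
```

Because the header states the total body length up front, the parser never scans for terminators the way the text protocol does.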
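The non-blocking connection handling described above can be sketched as a minimal gen_server (this is an illustration under assumed names, not the client's actual module): a request is written to the socket, the caller is queued, and the reply later arrives as a {tcp, ...} message, so the process never sits in gen_tcp:recv/2:

```erlang
-module(mc_conn).
-behaviour(gen_server).
-export([start_link/2, init/1, handle_call/3, handle_cast/2,
         handle_info/2, terminate/2, code_change/3]).

-record(state, {socket, pending = queue:new()}).

start_link(Host, Port) ->
    gen_server:start_link(?MODULE, {Host, Port}, []).

init({Host, Port}) ->
    %% {active, true}: incoming data is delivered as messages, so this
    %% process never blocks in gen_tcp:recv/2.
    {ok, Socket} = gen_tcp:connect(Host, Port,
                                   [binary, {packet, 0}, {active, true}]),
    {ok, #state{socket = Socket}}.

%% Write the request and queue the caller; reply later, asynchronously.
handle_call({request, Packet}, From, #state{socket = S, pending = Q} = State) ->
    ok = gen_tcp:send(S, Packet),
    {noreply, State#state{pending = queue:in(From, Q)}}.

handle_cast(_Msg, State) -> {noreply, State}.

%% Replies arrive as messages; hand each to the oldest waiting caller.
%% (A real client must also reassemble partial frames and use the binary
%% protocol's opaque field to pair replies with requests.)
handle_info({tcp, _Socket, Data}, #state{pending = Q} = State) ->
    case queue:out(Q) of
        {{value, From}, Q2} ->
            gen_server:reply(From, {ok, Data}),
            {noreply, State#state{pending = Q2}};
        {empty, Q} ->
            {noreply, State}
    end;
handle_info({tcp_closed, _Socket}, State) ->
    {stop, normal, State}.

terminate(_Reason, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.
```

The point of the design: the gen_server's mailbox only ever holds small requests and small {tcp, ...} messages, instead of piling up while one recv call blocks.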
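The "dynamically compiled configuration" trick above can be sketched like this (the module names config_compile and dyn_config are illustrative, not the client's own): the configuration term is turned into the abstract forms of a module whose get/0 returns it, compiled in memory, and hot-loaded. Every process can then read the config with a plain function call, with no message to a registered process and no ETS copy:

```erlang
-module(config_compile).
-export([store/1]).

%% Compile Term into a module dyn_config exporting get/0, and load it.
store(Term) ->
    Forms =
        [{attribute, 1, module, dyn_config},
         {attribute, 2, export, [{get, 0}]},
         {function, 3, get, 0,
          %% erl_parse:abstract/1 turns the term into its abstract form,
          %% so it becomes a literal in the generated function body.
          [{clause, 3, [], [], [erl_parse:abstract(Term)]}]}],
    {ok, dyn_config, Bin} = compile:forms(Forms),
    code:purge(dyn_config),
    {module, dyn_config} = code:load_binary(dyn_config, "dyn_config.erl", Bin),
    ok.
```

Only the one process that owns the configuration calls store/1; readers call dyn_config:get() concurrently, so the sequential-access bottleneck disappears.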
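The gb_trees-based continuum lookup can be sketched as follows (names and the md5-based point function are invented for illustration; assumes an OTP release with gb_trees:iterator_from/2). Each server contributes 160 points on the ring, and a key is routed to the first point at or clockwise from its hash, wrapping around to the smallest point:

```erlang
-module(continuum).
-export([build/1, lookup/2]).

%% Build a tree of HashPoint -> Server, 160 points per server.
build(Servers) ->
    Points = [{point(S, I), S} || S <- Servers, I <- lists:seq(1, 160)],
    gb_trees:from_orddict(orddict:from_list(Points)).

%% One md5-derived 32-bit point per (server, index) pair.
point(Server, I) ->
    <<P:32, _/binary>> = erlang:md5([Server, integer_to_list(I)]),
    P.

%% Route Key to the first point >= its hash; wrap to the smallest
%% point when the hash falls past the last point on the ring.
lookup(Key, Tree) ->
    <<H:32, _/binary>> = erlang:md5(Key),
    case gb_trees:next(gb_trees:iterator_from(H, Tree)) of
        {_Point, Server, _Iter} ->
            Server;
        none ->
            {_Smallest, Server} = gb_trees:smallest(Tree),
            Server
    end.
```

Since gb_trees is a balanced tree, the lookup is O(log N) in the number of points, regardless of how many servers are added, which is exactly where a flat case expression degrades.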

TODO:

  • removing (and adding back) servers that go down (and up), for example in the case of network splits