{"id":696,"date":"2009-10-23T00:33:18","date_gmt":"2009-10-23T07:33:18","guid":{"rendered":"http:\/\/xiehang.com\/blog\/?p=696"},"modified":"2009-10-23T00:47:08","modified_gmt":"2009-10-23T07:47:08","slug":"nosql-start-with-consistent-hashing","status":"publish","type":"post","link":"https:\/\/xiehang.com\/blog\/2009\/10\/23\/nosql-start-with-consistent-hashing\/","title":{"rendered":"NoSQL – start with consistent hashing"},"content":{"rendered":"
Most NoSQL solutions are essentially caches layered over a persistent data store, with or without replication support. One of the key issues in a production environment is using consistent hashing to avoid large-scale cache misses when nodes are added or removed.<\/p>\n
A few days ago I talked to a friend about a memcached deployment problem. He asked what to do when adding new memcached nodes to expand capacity, so as to avoid reloading a large amount of data from the database into the cache nodes. I told him I didn't have any experience with this, but that if I ran into the problem I would try restarting the memcached client machines one by one to pick up the new configuration, so the database is not hit with a massive load all at once; I would also think about changing the hashing function of the memcached client, trying to maximize the number of entries whose partition stays unchanged.<\/p>\n
It turned out my second idea was the right one (I should have read all those articles before talking to him :P). There are a couple of articles discussing this issue, and the best starting point, of course, is wikipedia<\/a>.<\/p>\n
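The idea behind consistent hashing can be sketched in a few lines of Python. This is an illustrative toy, not code from any real memcached client (the class and method names are my own): nodes and keys are hashed onto the same ring, each key belongs to the first node clockwise from its position, and adding a node only takes over the keys that hash between the new node's points and their predecessors.

```python
import bisect
import hashlib

class HashRing:
    """Toy consistent-hash ring with virtual nodes ("replicas")."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual points per physical node; more = smoother balance
        self._points = []         # sorted hash positions on the ring
        self._owner = {}          # hash position -> node name
        for node in nodes:
            self.add(node)

    def _hash(self, value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add(self, node):
        for i in range(self.replicas):
            point = self._hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node):
        for i in range(self.replicas):
            point = self._hash(f"{node}#{i}")
            del self._owner[point]
            self._points.remove(point)

    def get(self, key):
        """The first node clockwise from the key's position owns the key."""
        point = self._hash(key)
        idx = bisect.bisect(self._points, point) % len(self._points)
        return self._owner[self._points[idx]]
```

When a node is added, the only keys that move are the ones whose closest clockwise point now belongs to the new node; every other key keeps its old owner. That is exactly the property the naive `hash(key) % n` scheme lacks, since changing `n` remaps almost every key.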
<table>\n<tbody>\n<tr>\n<th>cluster<\/th>\n<th>capacity<\/th>\n<th>capacity changed<\/th>\n<th>keys moved<\/th>\n<\/tr>\n
<tr>\n<td>4x512M<\/td>\n<td>2G<\/td>\n<td>0%<\/td>\n<td>0%<\/td>\n<\/tr>\n
<tr>\n<td>4x512M + 1x1G<\/td>\n<td>3G<\/td>\n<td>50%<\/td>\n<td>40%<\/td>\n<\/tr>\n
<tr>\n<td>4x512M + 2x1G<\/td>\n<td>4G<\/td>\n<td>33%<\/td>\n<td>30%<\/td>\n<\/tr>\n
<tr>\n<td>4x512M + 3x1G<\/td>\n<td>5G<\/td>\n<td>25%<\/td>\n<td>25%<\/td>\n<\/tr>\n
<tr>\n<td>4x512M + 4x1G<\/td>\n<td>6G<\/td>\n<td>20%<\/td>\n<td>20%<\/td>\n<\/tr>\n
<tr>\n<td>3x512M + 4x1G<\/td>\n<td>5.5G<\/td>\n<td>8%<\/td>\n<td>12%<\/td>\n<\/tr>\n
<tr>\n<td>2x512M + 4x1G<\/td>\n<td>5G<\/td>\n<td>9%<\/td>\n<td>13%<\/td>\n<\/tr>\n
<tr>\n<td>1x512M + 4x1G<\/td>\n<td>4.5G<\/td>\n<td>10%<\/td>\n<td>18%<\/td>\n<\/tr>\n
<tr>\n<td>4x1G<\/td>\n<td>4G<\/td>\n<td>11%<\/td>\n<td>19%<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n
The percentage of keys moved to other partitions stays close to the percentage of capacity changed, which means consistent hashing comes close to the theoretical minimum amount of remapping.<\/p>\n
And the key distribution is pretty even (capacity\/utilization; nodes #1~#4 are 512M, #5~#8 are 1G):<\/p>\n
|
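The numbers in the table above can be approximated with a quick simulation. The sketch below is my own illustrative code with hypothetical node names, assuming a weighted ring where a 1G node gets twice as many virtual points as a 512M node; it counts how many sampled keys change owner between two cluster layouts.

```python
import bisect
import hashlib

def build_ring(nodes, points_per_gb=200):
    """nodes: dict of name -> capacity in GB; bigger nodes get more virtual points."""
    ring = []
    for name, gb in nodes.items():
        for i in range(int(points_per_gb * gb)):
            h = int(hashlib.md5(f"{name}#{i}".encode()).hexdigest(), 16)
            ring.append((h, name))
    ring.sort()
    return ring

def lookup(ring, key):
    """Return the owner of key: the first ring point clockwise from the key's hash."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    idx = bisect.bisect(ring, (h, "")) % len(ring)
    return ring[idx][1]

def moved_fraction(old_nodes, new_nodes, samples=20000):
    """Fraction of sampled keys whose owner differs between the two layouts."""
    old_ring, new_ring = build_ring(old_nodes), build_ring(new_nodes)
    keys = [f"key{i}" for i in range(samples)]
    moved = sum(1 for k in keys if lookup(old_ring, k) != lookup(new_ring, k))
    return moved / samples

# Second row of the table: 4x512M (2G total), then add one 1G node (3G total).
small = {f"s{i}": 0.5 for i in range(4)}            # four hypothetical 512M nodes
frac = moved_fraction(small, {**small, "b1": 1.0})  # add a hypothetical 1G node
# the new node owns ~1/3 of the 3G keyspace, so roughly a third of keys move
```

The simulated fraction tracks the share of the keyspace the new node takes over; the exact percentages depend on the hash function and the virtual-point count, so small deviations from the table are expected.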