{"id":696,"date":"2009-10-23T00:33:18","date_gmt":"2009-10-23T07:33:18","guid":{"rendered":"http:\/\/xiehang.com\/blog\/?p=696"},"modified":"2009-10-23T00:47:08","modified_gmt":"2009-10-23T07:47:08","slug":"nosql-start-with-consistent-hashing","status":"publish","type":"post","link":"https:\/\/xiehang.com\/blog\/2009\/10\/23\/nosql-start-with-consistent-hashing\/","title":{"rendered":"NoSQL – start with consistent hashing"},"content":{"rendered":"

Most NoSQL solutions are essentially caches with a persistent data store, with or without replication support. One of the key issues in a production environment is using consistent hashing to avoid cache failures.

I talked to a friend a few days ago about a memcached deployment problem. He asked what to do when adding new memcached nodes to expand capacity, so as to avoid reloading a huge amount of data from the database into the cache nodes. I told him I didn't have any experience with this, but that if I ran into the problem I would try restarting the memcached client machines one by one to pick up the new configuration, so as not to put a massive load on the database. I would also think about changing the hashing function of the memcached client, trying to maximize the number of entries that keep their partition unchanged.

It turned out my second idea was right (I should have read all those articles before talking to him :P). There are a couple of articles discussing this issue, and the best starting point, of course, is Wikipedia.
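As a refresher, the core idea is a hash ring: nodes and keys are hashed onto the same circular space, and each key belongs to the first node point found clockwise from it. Below is a minimal Python sketch of such a ring; it is my own illustration, not libketama's actual implementation, and the MD5-based positioning and 160 points per node are only assumptions in the spirit of ketama.

```python
import bisect
import hashlib

class HashRing:
    """Toy consistent-hash ring: each node owns many points on a 32-bit circle."""

    def __init__(self, nodes=(), points_per_node=160):
        self.points_per_node = points_per_node
        self.ring = {}          # ring position -> node name
        self.sorted_keys = []   # sorted ring positions
        for node in nodes:
            self.add_node(node)

    def _hash(self, value):
        # 32-bit position taken from an MD5 digest (assumption, ketama-like)
        return int(hashlib.md5(value.encode()).hexdigest()[:8], 16)

    def add_node(self, node):
        for i in range(self.points_per_node):
            pos = self._hash(f"{node}-{i}")
            self.ring[pos] = node
            bisect.insort(self.sorted_keys, pos)

    def remove_node(self, node):
        for i in range(self.points_per_node):
            pos = self._hash(f"{node}-{i}")
            self.ring.pop(pos, None)
            if pos in self.sorted_keys:
                self.sorted_keys.remove(pos)

    def get_node(self, key):
        # Walk clockwise: first node point at or after the key's position
        pos = self._hash(key)
        idx = bisect.bisect(self.sorted_keys, pos) % len(self.sorted_keys)
        return self.ring[self.sorted_keys[idx]]
```

Because each node only owns scattered points on the ring, adding or removing a node remaps just the key ranges adjacent to that node's points rather than reshuffling everything, which is what the numbers below illustrate.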

I tried libketama, and it seems pretty good in terms of key retention. I ran some tests that resemble a (sort of) real-world use case: say we have 4 weak (512M) nodes and want to replace them with new nodes of double the capacity (1G). I add the new nodes to the cluster one by one, then remove the old nodes one by one. Here is what I got (a rough simulation sketch follows the table):
| cluster       | capacity | capacity changed | keys moved |
|---------------|----------|------------------|------------|
| 4x512M        | 2G       | 0%               | 0%         |
| 4x512M + 1x1G | 3G       | 50%              | 40%        |
| 4x512M + 2x1G | 4G       | 33%              | 30%        |
| 4x512M + 3x1G | 5G       | 25%              | 25%        |
| 4x512M + 4x1G | 6G       | 20%              | 20%        |
| 3x512M + 4x1G | 5.5G     | 8%               | 12%        |
| 2x512M + 4x1G | 5G       | 9%               | 13%        |
| 1x512M + 4x1G | 4.5G     | 10%              | 18%        |
| 4x1G          | 4G       | 11%              | 19%        |
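Out of curiosity, here is roughly how such a test can be reproduced with the toy ring above. The key set and node names are hypothetical and there is no capacity weighting, so the exact percentages will differ from libketama's weighted results.

```python
# Reuses the HashRing sketch above: count how many of a sample of keys land on
# a different node after a membership change.
def moved_fraction(before, after, keys):
    return sum(before.get_node(k) != after.get_node(k) for k in keys) / len(keys)

keys = [f"key-{i}" for i in range(100_000)]   # hypothetical key set

old_ring = HashRing([f"512M-{i}" for i in range(4)])
new_ring = HashRing([f"512M-{i}" for i in range(4)] + ["1G-0"])

# With 5 equally weighted nodes, roughly 1/5 of the keys should move.
print(f"{moved_fraction(old_ring, new_ring, keys):.1%} of keys moved")
```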

In relative terms, the percentage of keys that moved to other partitions stays close to the capacity change, which means it is close to the optimal number.

And the key distribution is pretty even. Each cell below shows capacity share / measured utilization; nodes #1~#4 are 512M, nodes #5~#8 are 1G (a sketch of the capacity weighting follows this table):
| cluster       | node #1       | node #2       | node #3       | node #4       | node #5       | node #6       | node #7       | node #8       |
|---------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------|
| 4x512M        | 25.0% / 25.6% | 25.0% / 21.7% | 25.0% / 24.7% | 25.0% / 28.0% | –             | –             | –             | –             |
| 4x512M + 1x1G | 16.7% / 16.9% | 16.7% / 15.2% | 16.7% / 19.0% | 16.7% / 17.7% | 33.3% / 31.1% | –             | –             | –             |
| 4x512M + 2x1G | 12.5% / 13.5% | 12.5% / 10.8% | 12.5% / 13.7% | 12.5% / 12.7% | 25.0% / 24.5% | 25.0% / 24.8% | –             | –             |
| 4x512M + 3x1G | 10.0% / 10.9% | 10.0% / 9.4%  | 10.0% / 11.0% | 10.0% / 8.3%  | 20.0% / 19.6% | 20.0% / 20.0% | 20.0% / 20.9% | –             |
| 4x512M + 4x1G | 8.3% / 8.9%   | 8.3% / 8.3%   | 8.3% / 8.1%   | 8.3% / 7.0%   | 16.7% / 16.7% | 16.7% / 17.1% | 16.7% / 17.9% | 16.7% / 16.1% |
| 3x512M + 4x1G | –             | 9.1% / 9.0%   | 9.1% / 9.6%   | 9.1% / 8.2%   | 18.2% / 17.5% | 18.2% / 18.3% | 18.2% / 19.8% | 18.2% / 17.6% |
| 2x512M + 4x1G | –             | –             | 10.0% / 9.7%  | 10.0% / 8.9%  | 20.0% / 20.3% | 20.0% / 20.5% | 20.0% / 21.9% | 20.0% / 18.6% |
| 1x512M + 4x1G | –             | –             | –             | 11.1% / 9.2%  | 22.2% / 22.3% | 22.2% / 22.2% | 22.2% / 25.2% | 22.2% / 21.1% |
| 4x1G          | –             | –             | –             | –             | 25.0% / 24.2% | 25.0% / 24.5% | 25.0% / 27.2% | 25.0% / 24.1% |
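The even split per gigabyte suggests the nodes were weighted by capacity. Here is a sketch of how such weighting might look on top of the toy ring above; the weights and the API are my assumption, not libketama's interface.

```python
# Builds on the HashRing sketch above (bisect and hashlib are imported there).
from collections import Counter

class WeightedHashRing(HashRing):
    """Ring points proportional to node weight (e.g. 1 per 512M, 2 per 1G)."""

    def __init__(self, weighted_nodes, points_per_node=160):
        super().__init__(points_per_node=points_per_node)
        for node, weight in weighted_nodes:
            self.add_node(node, weight)

    def add_node(self, node, weight=1):
        for i in range(self.points_per_node * weight):
            pos = self._hash(f"{node}-{i}")
            self.ring[pos] = node
            bisect.insort(self.sorted_keys, pos)

# 4x512M (weight 1) plus 4x1G (weight 2), matching the fifth row above
ring = WeightedHashRing([(f"512M-{i}", 1) for i in range(4)] +
                        [(f"1G-{i}", 2) for i in range(4)])

keys = [f"key-{i}" for i in range(100_000)]   # hypothetical key set
counts = Counter(ring.get_node(k) for k in keys)
for node, count in sorted(counts.items()):
    print(f"{node}: {count / len(keys):.1%}")
```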

I still need to try out FNV to see whether it gives a better distribution and/or less key movement; according to the article above, it at least has better performance.
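For reference, FNV-1a itself is tiny; a 32-bit version looks like this (whether this is the exact variant the article benchmarked is something I have not verified):

```python
def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a: XOR each byte in, then multiply by the FNV prime."""
    h = 0x811c9dc5                                     # FNV-1a 32-bit offset basis
    for byte in data:
        h = ((h ^ byte) * 0x01000193) & 0xffffffff     # FNV prime, keep 32 bits
    return h

# e.g. this could be swapped in for the MD5 positioning in the ring sketches above
print(hex(fnv1a_32(b"key-42")))
```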
