2009-10-23 – Flying Bug

Plan for NoSQL

Oct 23 2009

I’ve read too many articles about this NoSQL stuff; now I need a plan to proceed (with what? :-W).

First of all, I’m going to remove the MySQL and OpenLDAP installations from my testing environment :P. MySQL feels kind of slow to me, though I have a fair amount of experience setting it up, including replication, etc., and I will check whether all my applications can be built on a key-value data store (see below). OpenLDAP is another story: I still haven’t figured out how to set it up with replication. The last time I tried was with 2.2/2.3, but 2.4 introduces a whole new approach to replication, so I’m going to leave it untouched for now. Note that I still need to come back to LDAP later, since it is still a perfect solution for running corporate-style applications, such as what I did a couple of times before: integrating mail, IM, wiki, blog, etc. under a single user ID.

OK, back to NoSQL. A couple of things to do:

1. Consistent hashing. I still need to read all those articles and try out different implementations. I don’t think I will write my own, but I need something that works on Linux and Windows (OSX? Don’t think so) and supports the major programming languages (C/C++, PHP/Python, Java). I also need to run tests similar to what I’ve done before, to understand how it affects deployment.
2. Try out different engines. Most likely I won’t try anything too fancy (read: “complicated”). For example, I prefer Redis over memcachedb just because memcachedb’s replication doesn’t look that simple to me, and I believe anything complicated to set up will be a headache to maintain. I will also skip the so-called document stores/graph stores unless they can act as a simple key-value store with the same performance (then those features will be a nice add-on). I don’t have a list so far, but I will get one this coming weekend. Things to be tested include installation, replication, fail-over, backup and recovery, monitoring, etc. Programming language support will be another important factor; I’d like a list similar to the one in item #1.
3. Application. I’m going to sum up the “traditional” web features that involve a data store, and check how to implement them in a distributed key-value data store. For example, user registration, login, and editing preferences/profile form one group of fundamental features, and buddy-related operations (add as buddy, blacklist, check online status, notify a buddy of an event/be notified by a buddy) form another. Things currently on my mind include message features (internal/external IM/mail), post features (threaded posts like a forum; votes/surveys may fall in this category as well), and maybe some search features. I don’t think I can come up with a full list in the coming days, but I will keep posting here.

This is pretty much what’s on my mind. All this stuff seems to be new and almost none of it is well packaged, so after 4~5 years of using yum/apt, I now need to do what I used to do: build everything from scratch. If I have time, I will put together some packages to ease my deployment.

NoSQL – start with consistent hashing


Oct 23 2009

Most NoSQL solutions are a kind of cache with a persistent data store, with or without replication support. One of the key issues in a production environment is using consistent hashing to avoid wholesale cache misses when nodes are added or removed.

I talked to a friend a few days ago about a memcached deployment problem. He asked what to do when adding a new memcached node to expand capacity, so as to avoid reloading a bunch of data from the database into the cache nodes. I told him I didn’t have any experience with that, but if I ran into this problem, I would try restarting the memcached client machines one by one to pick up the new configuration, to avoid putting a massive load on the database. I would also think about changing the hashing function of the memcached client, to maximize the number of entries that keep their partition unchanged.

It turned out my second idea was correct (I should have read all those articles before talking to him :P). There are a couple of articles discussing this issue, and a good starting point, of course, is Wikipedia.
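To see why the client's hashing function matters, here is a minimal sketch of the naive alternative: placing each key by hash modulo the node count. The MD5-based `bucket` function and the `key-N` key names are assumptions for illustration, not what any particular memcached client does.

```python
import hashlib


def bucket(key: str, n_nodes: int) -> int:
    """Naive placement: hash the key, then take it modulo the node count."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % n_nodes


def moved_fraction(keys, old_n: int, new_n: int) -> float:
    """Fraction of keys whose node index changes when the node count changes."""
    moved = sum(1 for k in keys if bucket(k, old_n) != bucket(k, new_n))
    return moved / len(keys)


# Hypothetical key names, just to have something to hash.
keys = [f"key-{i}" for i in range(100_000)]
frac = moved_fraction(keys, 4, 5)
print(f"4 -> 5 nodes with modulo placement: {frac:.1%} of keys moved")
```

With a uniform hash, a key stays put only when its hash gives the same remainder modulo 4 and modulo 5, which happens for about one key in five. So going from 4 to 5 nodes this way invalidates most of the cache, and that is the situation consistent hashing is meant to avoid.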

I tried libketama, and it seems pretty good in terms of retention rate. I did some tests that could be a (sort of) real-world use case. Say we have 4 weak (512M) nodes and want to replace them with new nodes of double the capacity (1G). I add the new nodes to the cluster one by one, then remove the old nodes one by one, and here is what I got:

cluster	capacity	capacity changed	key moved
4x512M	2G	0%	0%
4x512M 1x1G	3G	50%	40%
4x512M 2x1G	4G	33%	30%
4x512M 3x1G	5G	25%	25%
4x512M 4x1G	6G	20%	20%
3x512M 4x1G	5.5G	8%	12%
2x512M 4x1G	5G	9%	13%
1x512M 4x1G	4.5G	10%	18%
4x1G	4G	11%	19%

Relatively speaking, the percentage of keys moved to other partitions is close to the capacity change, which means it is close to the best achievable number.
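The kind of test described above can be sketched with a small weighted hash ring. This is not libketama's code or its exact point layout: the MD5-based `point` function, the `POINTS_PER_512M` constant and the node and key names are all assumptions made for illustration.

```python
import bisect
import hashlib

POINTS_PER_512M = 80  # assumption: virtual points per 512M of capacity


def point(label: str) -> int:
    """Map a label (node point or key) to a position on a 32-bit ring."""
    digest = hashlib.md5(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def build_ring(nodes: dict) -> list:
    """nodes maps node name -> capacity in MB; returns sorted (position, name)."""
    ring = []
    for name, capacity_mb in nodes.items():
        n_points = POINTS_PER_512M * capacity_mb // 512
        for i in range(n_points):
            ring.append((point(f"{name}-{i}"), name))
    ring.sort()
    return ring


def assign(nodes: dict, keys) -> dict:
    """Assign each key to the first ring point at or after its own position."""
    ring = build_ring(nodes)
    positions = [p for p, _ in ring]
    owners = {}
    for k in keys:
        # Wrap around to the first point when the key hashes past the last one.
        idx = bisect.bisect_left(positions, point(k)) % len(ring)
        owners[k] = ring[idx][1]
    return owners


keys = [f"key-{i}" for i in range(20_000)]          # hypothetical keys
old_nodes = {f"old{i}": 512 for i in range(1, 5)}   # 4x512M
new_nodes = dict(old_nodes, new1=1024)              # 4x512M 1x1G

before = assign(old_nodes, keys)
after = assign(new_nodes, keys)
moved = sum(1 for k in keys if before[k] != after[k]) / len(keys)
print(f"keys moved after adding one 1G node: {moved:.1%}")
```

In this sketch the new node holds one third of the ring points, so the moved share should land somewhere around a third, and every moved key ends up on the new node. The exact figure depends on the hash and the number of points per node, so it will not reproduce the numbers in the table.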

And key distribution is pretty even (each cell shows capacity share / utilization; nodes #1~#4 are 512M, #5~#8 are 1G):

node #1	node #2	node #3	node #4	node #5	node #6	node #7	node #8
25.0% 25.6%	25.0% 21.7%	25.0% 24.7%	25.0% 28.0%	–	–	–	–
16.7% 16.9%	16.7% 15.2%	16.7% 19.0%	16.7% 17.7%	33.3% 31.1%	–	–	–
12.5% 13.5%	12.5% 10.8%	12.5% 13.7%	12.5% 12.7%	25.0% 24.5%	25.0% 24.8%	–	–
10.0% 10.9%	10.0% 9.4%	10.0% 11.0%	10.0% 8.3%	20.0% 19.6%	20.0% 20.0%	20.0% 20.9%	–
8.3% 8.9%	8.3% 8.3%	8.3% 8.1%	8.3% 7.0%	16.7% 16.7%	16.7% 17.1%	16.7% 17.9%	16.7% 16.1%
–	9.1% 9.0%	9.1% 9.6%	9.1% 8.2%	18.2% 17.5%	18.2% 18.3%	18.2% 19.8%	18.2% 17.6%
–	–	10.0% 9.7%	10.0% 8.9%	20.0% 20.3%	20.0% 20.5%	20.0% 21.9%	20.0% 18.6%
–	–	–	11.1% 9.2%	22.2% 22.3%	22.2% 22.2%	22.2% 25.2%	22.2% 21.1%
–	–	–	–	25.0% 24.2%	25.0% 24.5%	25.0% 27.2%	25.0% 24.1%
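The first number in each cell is simply the node's share of total cluster capacity. A small sketch of that arithmetic, compared against the utilization figures from the second row of the table (the function and variable names are made up for illustration):

```python
def capacity_shares(capacities_mb):
    """Each node's share of the total cluster capacity, in percent."""
    total = sum(capacities_mb)
    return [round(100.0 * c / total, 1) for c in capacities_mb]


# Second row of the table: 4x512M plus 1x1G.
shares = capacity_shares([512, 512, 512, 512, 1024])
utilization = [16.9, 15.2, 19.0, 17.7, 31.1]  # measured values from the table

# Largest gap between a node's capacity share and its measured key share.
worst = max(abs(u - s) for u, s in zip(utilization, shares))
print(shares)
print(f"largest deviation: {worst:.1f} percentage points")
```

The same function applied to 3x512M plus 4x1G gives 9.1% and 18.2%, matching the capacity column of the sixth row.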

I still need to try out FNV to see if it gives better distribution and/or less key shakiness; according to the article above, it at least has better performance.
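For reference, a minimal sketch of what an FNV hash looks like. It assumes the 32-bit FNV-1a variant; the post does not say which variant the article meant, and this is plain Python rather than any client library's implementation.

```python
FNV32_OFFSET_BASIS = 0x811C9DC5  # standard 32-bit FNV offset basis
FNV32_PRIME = 0x01000193         # standard 32-bit FNV prime


def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a: XOR in each byte, then multiply by the FNV prime."""
    h = FNV32_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV32_PRIME) & 0xFFFFFFFF  # keep the result within 32 bits
    return h


print(hex(fnv1a_32(b"a")))  # 0xe40c292c
```

It is one XOR and one multiplication per input byte, which is why it is usually cheaper than a cryptographic hash such as MD5. Whether it distributes keys better on a ring would still need to be measured.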