I set up a testing environment on a couple of company boxes to see how Cassandra performs on real machines (real here means powerful enough to be a data node). Here are the details of the environment:
- Two client nodes and one server node, all running RHEL 4.x. I use two client nodes because I found during the performance test that a single client machine cannot generate enough load
- All three machines have 8 cores and 16G of memory (memory is not a big deal for my tests)
- Running Cassandra 0.5.0 RC3 (built from svn last night)
- The client is written in Python
Here is the graph for simple request (single key lookup):
The result looks pretty encouraging: the server's queries per second (QPS) grow almost linearly, and at about 5,000 QPS overall CPU utilization is still under 40% (25% user, 12% sys). I cannot get more client boxes to test with, but if the trend holds and we take 80% as the CPU utilization threshold, this kind of box can handle roughly 10K QPS, with latency at around 3ms.
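The extrapolation above is a straight-line scaling of throughput with CPU utilization. A minimal sketch of that arithmetic follows; the function name `extrapolate_qps` is made up for illustration, and the linear relationship is the assumption stated in the post, not something measured beyond 5,000 QPS.

```python
def extrapolate_qps(measured_qps, measured_cpu_pct, target_cpu_pct):
    """Linearly scale a measured throughput to a target CPU utilization.

    Assumes QPS keeps growing in proportion to CPU use, which the test
    only confirmed up to about 5,000 QPS.
    """
    return measured_qps * target_cpu_pct / measured_cpu_pct


# Rounded figure from the post: 5,000 QPS at just under 40% CPU.
print(extrapolate_qps(5000, 40, 80))  # 10000.0

# Same estimate using the user + sys breakdown (25% + 12% = 37%).
print(round(extrapolate_qps(5000, 37, 80)))
```

Either way the estimate lands a little above 10K QPS for this box at 80% CPU.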
Note that CPU utilization, QPS per client, and latency are hard to read in this graph because the overall QPS scale is so large, but you can get some idea from the next graph …
Here is the graph for an application-level operation (login, which does one user lookup and then 10~100 more user lookups, one per buddy, to fetch each buddy’s information):
The result worries me a bit: CPU utilization is already at 70% (45% user and 25% sys), so about 200 QPS seems to be what the cluster can provide. However, a login does a lot of table lookups (55 on average), so 200 logins per second works out to roughly 11,000 lookups per second, which matches the simple-lookup estimate discussed above (10K QPS per box). Latency is around 80ms.
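To make the lookup count concrete, here is a rough sketch of the login flow described above, run against an in-memory dict instead of Cassandra. The record layout, the `login` function, and the sample users are all hypothetical; only the shape (one user lookup, then one lookup per buddy) and the 200 QPS / 55 lookups figures come from the test.

```python
def login(store, username, password, counter):
    """Look up the user, check the password, then fetch every buddy.

    `store` is a plain dict standing in for the key-value store, and
    `counter` tallies how many key lookups the login performed.
    """
    user = store[username]
    counter["lookups"] += 1
    if user["password"] != password:
        return None
    buddies = []
    for buddy_name in user["buddies"]:
        buddies.append(store[buddy_name])  # one lookup per buddy
        counter["lookups"] += 1
    return user, buddies


def lookups_per_second(login_qps, avg_lookups_per_login):
    """Convert login throughput into underlying key-lookup throughput."""
    return login_qps * avg_lookups_per_login


# Hypothetical sample data: alice has two buddies.
store = {
    "alice": {"password": "secret", "buddies": ["bob", "carol"]},
    "bob": {"password": "pw", "buddies": []},
    "carol": {"password": "pw", "buddies": ["alice"]},
}
counter = {"lookups": 0}
login(store, "alice", "secret", counter)
print(counter["lookups"])           # 3: one for alice, two for her buddies
print(lookups_per_second(200, 55))  # 11000
```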
Actually, sys time above 20% is pretty bad. It means the kernel is busy context switching (I didn’t check vmstat during that time, but this is a reasonable guess). Then again, this may be expected: the machine is handling 16 active clients, each sending a bunch of requests, while it has only 8 physical cores, so context switching is unavoidable.
Since everything is linear, I can assume a 4-core box can offer 5,000 QPS with reasonable latency. I will run similar tests with MySQL and memcached, and with multiple data nodes as well, since I got the impression that a multi-node cluster is far slower than a single node (inter-node communication?).
8 Responses to “Cassandra’s read performance”
Did a quick test, and it seems a 2-node cluster (ReplicationFactor=2) loses about 10% performance compared with a single-node cluster, and the loss grows as the number of concurrent clients increases.
Now the weird thing: with ReplicationFactor=1, performance went down another 8~10%. It seems I have to read the code to see how Cassandra deals with multiple replicas, as the behavior looks inconsistent.
The same box running un-tuned MySQL gives me 10X the throughput for the login operation: with 128 concurrent clients, overall QPS is about 2,800 and latency is about 45ms.
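As a sanity check, these throughput and latency figures can be compared using Little's law (average latency ≈ concurrent requests / throughput). The sketch below assumes each client keeps exactly one request in flight with no think time, which the post does not state, and `expected_latency_ms` is just an illustrative helper name.

```python
def expected_latency_ms(concurrent_clients, throughput_qps):
    """Little's law for a closed loop: latency = concurrency / throughput.

    Assumes every client has one outstanding request at all times.
    """
    return 1000.0 * concurrent_clients / throughput_qps


# MySQL login test: 128 clients at about 2,800 QPS.
print(round(expected_latency_ms(128, 2800), 1))  # 45.7

# Cassandra login test: 16 clients at about 200 QPS.
print(expected_latency_ms(16, 200))  # 80.0
```

Under that assumption both results line up with the reported latencies (about 45ms and about 80ms).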
It seems Cassandra is more of a solution for scalability; when the data volume is not that big, MySQL would be the better choice in terms of speed.
I think I need to check out CouchDB and/or MongoDB for the online feature (replication and clustering are the two major concerns), and I will check whether Cassandra can work for offline processing (log processing?).
[…] believe previous test was using a data set that is too small – it has only 100K records and volume is about 120M […]
With 0.5 you should increase KeysCachedFraction to 0.2 or even more, depending on how many rows you have and how “hot” they are. The default of 0.01 is too conservative for many cases.
Also, try to denormalize as much as possible; Cassandra will not perform well if you just try to port a MySQL schema to it straight across.
Let us know on irc or the mailing list if you have any other questions!
I will try out those configurations.
I don’t think denormalization is a concern here, as the “login” application is just a key-value lookup: find the user by user name, verify the password, and that’s all.
I take my words back. “Login” is not just a key-value lookup, as it needs to go through the buddies in the buddy list and do a table lookup for each one. I’m using the same application logic for both Cassandra and MySQL.
However, denormalization is a problem in this case. If I store each buddy’s information inside the user record, then every time a user changes his/her profile, I have to go through all users that have him/her in their buddy list and update them. This would be something like Twitter: I may trigger hundreds or even thousands of updates for a simple change.
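The write fan-out can be illustrated with a small in-memory sketch. Everything here (the record layout, `build_reverse_index`, `update_profile_denormalized`, the sample users) is hypothetical and only meant to show why one profile change turns into many writes once buddy data is embedded in each user record.

```python
def build_reverse_index(buddy_lists):
    """Map each user to the set of users who list them as a buddy."""
    listed_by = {}
    for owner, buddies in buddy_lists.items():
        for buddy in buddies:
            listed_by.setdefault(buddy, set()).add(owner)
    return listed_by


def update_profile_denormalized(records, listed_by, user, new_profile):
    """Update the user's own record and every embedded copy of it.

    Returns the number of record writes the change required.
    """
    records[user]["profile"] = new_profile
    writes = 1
    for owner in listed_by.get(user, ()):
        records[owner]["buddy_profiles"][user] = new_profile
        writes += 1
    return writes


# Hypothetical data: bob appears in two other users' buddy lists.
buddy_lists = {"alice": ["bob", "carol"], "bob": ["alice"], "carol": ["alice", "bob"]}
records = {
    name: {
        "profile": {"nick": name},
        "buddy_profiles": {b: {"nick": b} for b in buddies},
    }
    for name, buddies in buddy_lists.items()
}
listed_by = build_reverse_index(buddy_lists)

writes = update_profile_denormalized(records, listed_by, "bob", {"nick": "bobby"})
print(writes)  # 3: bob's own record plus the copies held by alice and carol
```

With the denormalized layout a login becomes a single read, but the write cost grows with the number of users who list the changed user as a buddy.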
Maybe this kind of application is not a good fit for Cassandra; pure key-value data stores may do better.