I have some questions about Cassandra that I haven’t found answers to yet. I’m writing them down now so I won’t forget them later. All of them are pretty critical for operations:
- How does Cassandra utilize multiple cores? Does it run multiple threads internally and handle different requests in different threads? If the answer is no, it will be pretty ugly, since I would have to run multiple Cassandra instances on one machine.
- Is Cassandra (or maybe Java itself) capable of handling large amounts of memory? To be specific, can it fully utilize the memory of a 64-bit machine (8 GB/16 GB/64 GB)? Again, if the answer is no, I would have to run multiple instances per machine.
- I know Cassandra can replicate data from one colo to another, but as I understand it, the two colos are effectively the same cluster. Is it possible for both colos to hold the full data set, and for a client request to be served only from nodes in the local colo?
- Is it possible to stream updates made to Cassandra out to another destination? What I want is to capture the live data set in another data store (such as an RDBMS) for other purposes, so I would prefer a plug-in style of implementation that lets me grab updates and send them to different downstream systems.
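
To make the last question more concrete, here is a rough sketch of the kind of plug-in hook I have in mind. Everything in it is hypothetical: `Update`, `UpdateFanout`, `register`, `publish` and `rdbms_sink` are names made up for illustration and are not part of any Cassandra API, and the in-memory list only stands in for a real RDBMS.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


# Hypothetical names throughout: none of this is a Cassandra API.
@dataclass(frozen=True)
class Update:
    """One write as a plug-in might see it (assumed shape)."""
    keyspace: str
    key: str
    column: str
    value: bytes
    timestamp: int


Listener = Callable[[Update], None]


class UpdateFanout:
    """Passes every published update to each registered downstream listener."""

    def __init__(self) -> None:
        self._listeners: List[Listener] = []

    def register(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def publish(self, update: Update) -> None:
        # Listeners are called in registration order.
        for listener in self._listeners:
            listener(update)


# Stand-in for an RDBMS table; a real sink would issue INSERT/UPDATE statements.
rdbms_rows: List[Tuple[str, str, str, bytes, int]] = []


def rdbms_sink(update: Update) -> None:
    rdbms_rows.append(
        (update.keyspace, update.key, update.column, update.value, update.timestamp)
    )


fanout = UpdateFanout()
fanout.register(rdbms_sink)
fanout.publish(Update("users", "alice", "email", b"a@example.com", 1))
```

The point of the sketch is only the shape: the data store calls something like `publish` for each applied write, and I can register as many downstream sinks as I need without touching the store itself.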
I will post more questions as they come to mind, and I will post answers (in a separate blog post) if I hear of any.