Jan 13, 2010

Since I keep switching between all sorts of distros, here are a few things I need to write down to keep track.

To bring a Linux/FreeBSD distro up to date:

  • Debian & Ubuntu:
    alias update='sudo apt-get -y update && sudo apt-get -y upgrade && sudo apt-get -y dist-upgrade'
  • CentOS & Fedora:
    alias update='sudo yum -y update'
  • openSUSE:
    alias update='sudo zypper refresh && sudo zypper --no-gpg-checks -n update'
  • Gentoo:
    alias update='sudo emerge --sync && sudo emerge --update --deep --newuse world'
  • FreeBSD (needs portmanager installed):
    alias update='sudo portmanager -u'
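
If the same shell profile is shared across several of these machines, the right alias could be picked automatically. The following is only a rough sketch: the function names update_cmd_for and detect_pkg_manager are made up for illustration, and it assumes that the first package manager found on the PATH is the one that should be used.

```shell
# Map a package manager name to the update command from the list above.
# (update_cmd_for is a hypothetical helper name, not a standard tool.)
update_cmd_for() {
    case "$1" in
        apt-get)     echo 'sudo apt-get -y update && sudo apt-get -y upgrade && sudo apt-get -y dist-upgrade' ;;
        yum)         echo 'sudo yum -y update' ;;
        zypper)      echo 'sudo zypper refresh && sudo zypper --no-gpg-checks -n update' ;;
        emerge)      echo 'sudo emerge --sync && sudo emerge --update --deep --newuse world' ;;
        portmanager) echo 'sudo portmanager -u' ;;
        *)           return 1 ;;  # unknown package manager
    esac
}

# Print the first known package manager that is installed on this machine.
# Assumption: only one of them is present, or the first match is the right one.
detect_pkg_manager() {
    for pm in apt-get yum zypper emerge portmanager; do
        if command -v "$pm" >/dev/null 2>&1; then
            echo "$pm"
            return 0
        fi
    done
    return 1
}

# Define the alias only when a known package manager was found.
if pm_found=$(detect_pkg_manager); then
    alias update="$(update_cmd_for "$pm_found")"
fi
```

Dropped into ~/.bashrc or ~/.profile, this would leave `update` undefined on a machine with none of the five tools instead of defining a broken alias.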
Jan 13, 2010

I have some questions about Cassandra that I haven’t found answers to yet. I’m writing them down now so I won’t forget them later. All of them are pretty critical for operations:

  • How does Cassandra utilize multiple cores? Does it run multiple threads internally, handling different requests in different threads? If not, it will get pretty ugly, since I would have to run multiple Cassandra instances on one machine.
  • Can Cassandra (or maybe Java itself) handle large amounts of memory? To be clear, can it fully utilize the memory of a 64-bit machine (8G/16G/64G)? Again, if the answer is no, I have to run multiple instances per machine.
  • I know Cassandra can replicate data from one colo to another, but my understanding is that the two are virtually the same cluster. Is it possible to have both colos hold the full data set, and have client requests return data only from local nodes?
  • Is it possible to stream updates made to Cassandra to another destination? What I want is to capture the live data set in another data store (such as an RDBMS) for other purposes, so I’d prefer a plug-in style implementation that lets me grab updates and send them to different downstream systems.

I will post more as anything else comes to mind, and will post answers (in a separate blog post) if I hear of any.