Jul 29 2013
 

I have a weird feeling that Redis is more stable than memcached … but I don’t have solid evidence for that.

The case is this: I migrated an application’s backend from memcached to Redis. The migration did include some other bug fixes and trivial performance tuning, but given that the memcached version had been running in production for more than 6 months without any significant problem, I don’t think my changes made any big difference. Also, I changed only my own code and didn’t touch redis/memcached themselves, so I think it’s a fair comparison.

memcached used to hit 6GB of memory (VIRT) after a week to 10 days of running, and then in most cases it would crash because the ops people had set a RAM limit. Ops asked me to restart it once memory utilization hit a certain level, but I found that restarting memcached simply wipes out all my in-memory-only data such as counters. That data is not critical, but losing it makes the application look weird (“no more visitors in the past 7 days???”), and this is what initiated the migration to Redis.

After migrating to Redis, memory utilization started at 4200MB. One of the oldest instances has been running since Jun 4 (50+ days so far), and its memory is at 4300MB. I guess the increase is due to data growth rather than a memory leak, though again, I don’t have evidence.
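One rough way to collect some evidence would be to log Redis’s own memory counters over time. The sketch below parses the text returned by the `INFO` command and converts a field to MB. `used_memory` and `used_memory_rss` are real `INFO` fields; the function names and the idea of feeding it from `redis-cli` are my own assumptions, not something I actually ran for this post.

```python
import sys


def parse_redis_info(text):
    """Parse the 'key:value' text returned by the Redis INFO command."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Memory".
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key] = value
    return info


def memory_mb(info, field="used_memory_rss"):
    """Return one of the byte counters from INFO, converted to MB."""
    return int(info[field]) / (1024.0 * 1024.0)


if __name__ == "__main__":
    # Assumed usage: redis-cli info memory | python redis_mem.py
    info = parse_redis_info(sys.stdin.read())
    print("data: %.0f MB, rss: %.0f MB" % (
        memory_mb(info, "used_memory"), memory_mb(info, "used_memory_rss")))
```

If `used_memory` grows along with the number of keys while the gap to `used_memory_rss` stays roughly constant, that would point to data growth rather than a leak.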

Anyway, I’m not going to go back to memcached in the future, so the difference won’t bother me.

Jul 22 2013
 

Playing with GlusterFS now, here’s the to-do list:

  1. Installation and basic configuration, plus getting familiar with the command line utilities
  2. Set up a RAID-10-like configuration, with geo-replication if possible
  3. Regular routine maintenance. I haven’t got a clear idea yet, but it should include: expanding a volume, shrinking a volume, replacing a brick in a volume, re-balancing data, converting another fs to GlusterFS, recovering from various disasters, etc.
  4. Performance testing with various scenarios, even though people already have some numbers for them, including: a mail server with maildir (large number of small files with a small amount of concurrent access), a file server (medium number of medium-sized files with almost no concurrent access), a video server (small number of large files with a large amount of concurrent access), file-based databases like SQLite and BerkeleyDB (concurrent access with lots of seek operations), and RDBMSs like MySQL/PostgreSQL.

Sounds like a great plan, right :D? Let’s see.
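For the maildir-like scenario in item 4, a first rough measurement could be as simple as writing and re-reading a pile of small files on the mounted volume. This is only a sketch: the function name, file naming, and default sizes are made up for illustration, and the read pass will mostly hit the page cache unless the cache is dropped or the reads happen from another client.

```python
import os
import time


def small_file_benchmark(directory, count=1000, size=4096):
    """Write then read `count` files of `size` bytes; return timings in seconds."""
    payload = b"x" * size

    start = time.time()
    for i in range(count):
        with open(os.path.join(directory, "msg%06d" % i), "wb") as f:
            f.write(payload)
    write_secs = time.time() - start

    start = time.time()
    total = 0
    for i in range(count):
        with open(os.path.join(directory, "msg%06d" % i), "rb") as f:
            total += len(f.read())
    read_secs = time.time() - start

    # total is returned so the caller can check that every byte came back.
    return write_secs, read_secs, total
```

Running it once against a local directory and once against the GlusterFS mount point gives a baseline to compare with.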

Jul 19 2013
 

I was assigned to a Web project that presents data analysis results to users. The original data came from Web logs plus some extra information, went into Hive, and then, after the scientists’ analysis, was turned into statistics files. There are several interesting topics: geo graphs, rendering another web page, and metrics graphs. Roughly speaking, I had no idea about any of these at the beginning of the project.

I think the best decision I made was to use a GD-based solution; actually, that may be the only solution I could think of. I decided to use PHP for the Web part plus Perl for batch processing. That turned out not to be quite right: I’m migrating everything to PHP now, as there is not much “real” batch processing and everything could be done in shell. I also decided to use a server-side DOM model (read: PHP DOM) so as not to slow down the project with my poor JS skills. Actually my PHP skills are not that good either, but my JS is definitely *poor*.

Jul 18 2013
 

PHPlot looks better than Perl’s GD::Graph … I don’t know exactly how to describe the difference, but it feels neat, easy to understand, and easy to control, with better default values.

I’m migrating some BI sites to PHPlot to get rid of the Perl stuff and make things purely PHP-based, so that other PHP guys can take over easily.

Jul 17 2013
 

I haven’t dug into the legal issues yet, but for my hobby project I got everything regarding maps from here:

http://www.gadm.org/country

After getting the shp files from the web site, I use the pyshp module:

https://code.google.com/p/pyshp/

to extract the data to a plain text format so that other programs can read it directly. An old version of pyshp comes with Ubuntu, but it’s sufficient for me.
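A minimal sketch of that extraction step, assuming polygon shapes and a made-up text layout (one header line per ring, then one `x y` line per point). `shapefile.Reader`, `shapeRecords()`, `shape.points` and `shape.parts` are pyshp calls; the function names, the output format, and the choice of which record column holds the name are only illustrative, not the exact script I use.

```python
def shape_to_lines(name, points, parts):
    """Format one shape as text lines.

    `parts` holds the start index of each ring inside `points`,
    which is how pyshp exposes multi-ring polygons.
    """
    lines = []
    bounds = list(parts) + [len(points)]
    for ring, (start, end) in enumerate(zip(bounds[:-1], bounds[1:])):
        lines.append("# %s ring %d" % (name, ring))
        for p in points[start:end]:
            lines.append("%.6f %.6f" % (p[0], p[1]))
    return lines


def dump_shapefile(path, name_field_index=0):
    """Read a .shp file with pyshp and return all shapes as text lines."""
    import shapefile  # pyshp; imported here so the formatter works without it

    reader = shapefile.Reader(path)
    out = []
    for sr in reader.shapeRecords():
        # Assumption: the column at name_field_index is a usable label.
        name = str(sr.record[name_field_index])
        out.extend(shape_to_lines(name, sr.shape.points, sr.shape.parts))
    return out
```

Printing `"\n".join(dump_shapefile(path))` for a downloaded .shp file then gives a file that shell tools or PHP can read line by line.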