Sep 18, 2011
 

Just found that I’ve been away from here for more than a month.

Hmmm … actually I don’t have much to update. I was busy with all the legacy systems: putting monitoring facilities in place, establishing deployment processes, sorting out the on-call schedule, etc. People HATE this – yes, you didn’t hear wrong, people hate this. One of them mentioned it in an IM conference, saying “after the monitoring facilities launched, I never got a good night’s sleep, and was always panicking at phone calls and SMS”.

These systems had never been monitored, so nobody knew if anything was wrong with them 🙂 . In the old days, the only way the project team got notified about a system problem was either customer care ringing the NOC, or another service that relies on the system yelling at them. It seems people feel better with that kind of escalation. 😀

Anyway, they are now doing something more aggressive, which I have to stop from time to time – since they want to know about potential problems in the systems, they put lots of monitoring facilities on, and some of them generate hundreds or even thousands of alerts daily. You probably already see the problem – with way too many alerts, people choose to ignore ALL of them, which turns the monitoring totally useless and just wastes money (bandwidth and SMS cost).
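
To make the alert-flood problem concrete, here is a minimal sketch of one common way to cap the noise: forward only the first few alerts per check within a time window and drop the rest. This is not what we actually run; the class name `AlertThrottle`, the check name `disk_full`, and the limit and window values are all made up for illustration.

```python
from collections import defaultdict


class AlertThrottle:
    """Forward at most `limit` alerts per check within `window` seconds.

    Hypothetical helper for illustration only; names and defaults are assumptions.
    """

    def __init__(self, limit=3, window=3600):
        self.limit = limit
        self.window = window
        # check name -> timestamps (seconds) of alerts that were forwarded
        self._sent = defaultdict(list)

    def should_send(self, check, now):
        # Keep only the forwarded alerts that are still inside the window.
        recent = [t for t in self._sent[check] if now - t < self.window]
        allowed = len(recent) < self.limit
        if allowed:
            recent.append(now)
        self._sent[check] = recent
        return allowed


throttle = AlertThrottle(limit=2, window=3600)
# Three alerts in quick succession, then one more than an hour later.
results = [throttle.should_send("disk_full", t) for t in (0, 10, 20, 4000)]
print(results)  # [True, True, False, True]
```

With something like this in front of the SMS gateway, a check that fires a thousand times a day would page people only a handful of times, which might keep them from ignoring everything.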

There is still a long way to go to get people to understand the right way to do monitoring. I don’t think I have enough time to do this … pity.