Thursday, September 27, 2007

Crash-d-ay 2: Crash or a curse

I wanted to name this post Crash-d-ay 2: Crashdays, mainly because sequels to many animal movies have more creatures and are named in the plural. A few examples are Alien, Anaconda, Garfield (the sequel had two Garfield cats, though neither acted like Garfield), 101 Dalmatians and more. But when things turned chaotic, I had to consider it a curse, and that’s why I chose the current title.

Now hear the news. As if my post triggered the curse, I am facing so many crashes that a day with all the servers running fine has become a dream. Let’s face it. Just two weeks after a production server crash that led to a 48-hour outage, if we face crashes on two other production servers, there must be something suspicious going on.

1. It was last Wednesday night that one of our severity-2 production servers crashed, bringing down some twenty loads. The recovery had to be done very quickly and was completed with just 1 minute to spare before the jobs would have timed out.

2. It was Thursday afternoon, a few minutes before I could leave for lunch, when the application server crashed, terminating all the severity-1 jobs that had been running for about 4 hours and would have completed within the next hour.

3. The same day, I called my roommate and he said "Oh my God!" instead of saying "Hi!" At first, I thought he mistook me for God. But then I found that his system had crashed the very moment he picked up my call. Do you think it’s a coincidence? I very much hope so.

4. Sunday morning, I was woken up by the offshore team with the news of another server crash. With my thoughts on the server all day, we lost both the cricket matches and I was bowled out for a duck in the first match. "I couldn’t concentrate on the game" was all I said to the press after the match.

5. Tuesday night, 2:40 AM. Offshore woke me up about the failure of the same server again. This time it was worse, with the server offline for the whole night. I was left awake, as they would definitely wake me up at any time to start the applications. At last, with offshore’s help, I took short naps and the server was back online at 9 AM. The day was pretty bad, as the server was still not stable and we had to monitor it closely.

6. Wednesday night, 12:11 AM, offshore called. I knew the reason before they could say anything. The server had crashed again. With the on-call UNIX person fixing the problem in less than an hour, I was so happy that I could sleep well. But then the offshore connectivity was interrupted and I had to work till 4 AM. With all loads recovered and offshore trying to help as much as they could, I took a short nap.

7. Thursday morning, 7 AM. I was woken up with the usual greeting: "Server crashed again." The server is still not back to normal and we have to do most of the tasks manually. So here I am, awake, waiting to hear about another crash.


Sorry for making this post more like a system log. After looking at all the server logs, I am still not back in the mood to write a funny post. Now, if any of you are experts in Vaasthu, advise me on the direction in which we should keep the server to prevent further crashes. Or even a ritual is acceptable. No dark magic or voodoo, though.
