colinramsay.co.uk

PC Reliability and Disaster Recovery

25 Apr 2007

Yesterday afternoon I was extracting the enormous archive for the Orca Beta 1 VPC image while watching an episode of 24 on my PC. Without warning, it rebooted. That is never a good sign, but at least it restarted... until it tried to get back into Windows, at which point it rebooted again. Safe Mode and Last Known Good Configuration had the same problem, so I've now got a fried Windows install.

I've got the hard disk in another PC right now, and I'm running chkdsk on it. So far it has taken over 10 hours, though I have managed to retrieve all of my important documents from it. I'm still holding out hope of actually being able to use the disk again without resorting to reinstalling Windows, but maybe I'm just being optimistic.

Why Did This Happen?

As mentioned, I was running a long extraction and watching video, so the load on my PC was pretty high for an extended period of time. The case I have is poor, and my three hard drives are right next to each other and were hot. I'm wondering whether this caused the problem. I've only got one chassis fan too, and there was loads of dust floating round inside.

What Now?

If I have to reinstall Windows, that means I've got to set up Visual Studio and all of the other little pieces that make using my computer bearable. As Ayende mentioned recently, this takes serious time, and is something that I want to avoid or at least speed up. I've heard talk of using Virtual PC images to have a development environment backed up and ready to use, which could be a good way to get up and running fast. However, it does seem that getting to this stage would mean setting up two PCs. And if I'm running VPC images, there will be a performance hit, which on my current machine could be painful. That means I'd have to buy a new PC, and so this particular disaster would end up costing me hundreds of pounds.

Data Policy

Last year, I had another hard disk fail on me and I lost the source code for my Embrace website, Happy and Lost. This annoyed me big-time, and straightaway I began to use SVN for all of my code. However, there are still plenty of documents which I have no backup plan for, and that's probably not wise. Scott Hanselman details his family backup policy, which is something I am going to look at in detail after I get up and running again.
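
For the documents that SVN doesn't cover, even something very simple would be better than nothing. Here is a minimal sketch in Python of the sort of thing I have in mind: it zips a folder into a timestamped archive somewhere else. The function name backup_folder and both paths are made up for illustration, it assumes the backup location lives on a different physical disk (and outside the folder being backed up), and it is not taken from Scott Hanselman's policy.

```python
import shutil
import time
from pathlib import Path


def backup_folder(source: Path, backup_dir: Path) -> Path:
    """Zip `source` into `backup_dir` under a timestamped name.

    Hypothetical helper: assumes `backup_dir` is NOT inside `source`,
    ideally on a different physical disk.
    """
    if not source.is_dir():
        raise NotADirectoryError(f"{source} is not a directory")

    # Create the destination folder if it does not exist yet.
    backup_dir.mkdir(parents=True, exist_ok=True)

    # e.g. "Documents-20070425-153000" (the .zip suffix is added by make_archive).
    stamp = time.strftime("%Y%m%d-%H%M%S")
    base_name = backup_dir / f"{source.name}-{stamp}"

    # make_archive returns the full path of the archive it created.
    archive = shutil.make_archive(str(base_name), "zip", root_dir=str(source))
    return Path(archive)
```

Calling backup_folder(Path("C:/Documents"), Path("E:/Backups")) from a scheduled task would give a dated zip per run; both paths here are placeholders. It keeps every archive forever, so pruning old ones would be the obvious next step.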

In Conclusion

Firstly, I need a new case for my PC. I have my eyes on the Antec P180, which is fairly pricey but gets rave reviews. There are two reasons for this - to improve airflow, and to give me better access to my drives. At the moment my case is all screws and bolts and horror. I need something better.

Secondly, I need to come up with a way of getting up and running fast after a PC disaster. As a software developer I live and die by the hardware which I use, and any downtime could be expensive.

Anyone who says "get a Mac" in response to this may experience the full force of my already frayed temper...

UPDATE: looks like the Subtext guys are feeling my pain.