How I recovered our server

I know, for you IT folks out there it should never come to this. But, it did.

I walked into the server room on September 26th and saw two orange LEDs on RAID Array 1. Not good. This array is a RAID 1 (two drives mirrored). Together these drives make up the C: drive for our Windows Server 2003 R2 domain, so at least one has to be operating at all times. Double not good.

Since a computer won’t boot from a corrupted drive, I had to resort to the Automated System Recovery (ASR) set I made when I first set up the server. So I dug out our bootable floppy and installation CD. The HP server would recognize that a CD was present, but it would never boot from it. I tried all kinds of BIOS settings to force it to boot from the optical drive. No luck. I tried different Windows media. Nothing. I even tried HP’s iLO (Integrated Lights-Out) feature, which allows another computer to share its optical drive with the HP over the network. Still nothing. Things were looking terribad.

In somewhat of a panic I navigated to HP’s website and found an OS installation ISO image. I figured if I couldn’t boot from a Windows disc, perhaps I could boot from an HP disc. The server did boot from the HP media, but my excitement was muted. Since the ASR only worked from the Windows media and I couldn’t get that to boot, the HP media didn’t seem very helpful. I charged ahead anyway and used the HP media to install a fresh copy of Windows Server 2003 R2. I believe the HP process actually slipstreams your existing media with a package of all their drivers and utilities to make a virtual bootable partition on the drive.

In the back of my mind was the knowledge that I would have to build a new domain from scratch and visit every computer in the building to update it. This process takes about 30 minutes per computer and we have over 50 computers, so that is more than 25 hours of work. Yuck.

Nevertheless, I needed an active server. This time, rather than install the OS on a RAID 1 array, I chose a RAID 5 array with one hot spare. In the event of a drive failure the hot spare takes over and the array rebuilds onto it, so I could lose up to two drives, as long as the second failure came after the rebuild had finished. Once the OS install was complete I searched around for the drivers for our Dell 124T Tape Backup Autoloader. If I could get the server to recognize it, I might be able to recover from tape, albeit in the reverse order of a typical restore. I rotated through 5 tapes, each of them failing to pull up a usable backup. That sinking feeling started to make me sick.

Things began looking up though. Our very last backup tape had a full backup that the server recognized. It was dated September 9th, a couple of weeks old, but it would do. I wasn’t quite sure how Server 2003 would handle writing over files that it was actively using. You actually have to enable this feature within NTBackup under Tools…Options. To my delight the process was seamless. NTBackup did its job and then forced a reboot. The domain login prompt came up. This was a very good sign: I hadn’t installed Active Directory on the fresh install yet, so the prompt meant the domain was back.

With the server back in shape I then established a plan to add a secondary domain controller to our network. Thank you to Kimbrough Dental for donating their old Dell PowerEdge 1800. You can check out this post to see how things went.

Lessons learned

  • Check your backups
  • Do a mock restore from your backup device
  • In a domain of any size have a second domain controller (even if it is on a regular PC)
  • Keep trying…keep coming up with alternate ideas to make it work. Keep innovating.
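
The "mock restore" lesson can be partly automated. The sketch below is one possible check, not the procedure used in this recovery: it assumes you have restored a backup into a scratch folder and still have the live copy of the same files to compare against. The function names (file_digest, compare_trees) and the example paths are made up for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(original: Path, restored: Path) -> dict:
    """Compare every file under `original` with its counterpart under `restored`.

    Returns a report listing files missing from the restore, files whose
    contents differ, and a count of files that matched exactly.
    """
    report = {"missing": [], "different": [], "matched": 0}
    for src in sorted(original.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(original)
        dst = restored / rel
        if not dst.is_file():
            report["missing"].append(rel.as_posix())
        elif file_digest(src) != file_digest(dst):
            report["different"].append(rel.as_posix())
        else:
            report["matched"] += 1
    return report


if __name__ == "__main__":
    # Hypothetical paths: the live share and a scratch folder holding the test restore.
    result = compare_trees(Path("D:/Shares"), Path("E:/RestoreTest/Shares"))
    print(result)
```

Files that changed after the backup ran will show up as "different", so a check like this is most useful right after a backup finishes, or against a folder that rarely changes.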