Recently we have been experiencing significant problems with our main server (Windows Server 2003 R2 on an HP DL380 G3). Yes, I know it is old, but we are doing the best we can on a shoestring budget. Long story short, I removed an Adaptec 39160 SCSI card that was used for our tape backup solution. This card began malfunctioning a few weeks ago, and I was not able to update its firmware while it was installed in the HP, so I pulled it to install it in more compatible hardware. Immediately afterward the server ran fine. By Sunday morning, however, the HP was showing errors in the RAID array that houses our shared drive.
Sunday evening, two of the drives became corrupt. This wasn’t entirely bad because we run a RAID 10, which can tolerate at most 2 failed drives. Luckily, the two drives that failed were on opposite sides of the RAID stripe, so each still had a healthy mirror partner. After I hot-swapped in two replacement drives, it took about 2 hours for the controller to rebuild the array. Everything appeared to be fine, even after a reboot.
However, this photo was from Wednesday at 12:30 or so. As you can see, 3 drives have become corrupt. Our RAID 10 does not have fault tolerance for 3 drives. The array was lost, and with it the partitions that stored our main file shares.
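As a rough illustration of why two failures were survivable and three were not, here is a minimal Python sketch. It assumes a four-drive RAID 10 built from two mirrored pairs, which is consistent with the two-drive fault tolerance described above but is not confirmed. The drive numbering and the `raid10_survives` helper are hypothetical.

```python
# Hypothetical layout: four drives in two mirrored pairs, striped together.
# The real drive count and bay numbering are assumptions.
MIRROR_PAIRS = [(0, 1), (2, 3)]


def raid10_survives(failed, pairs=MIRROR_PAIRS):
    """Return True if every mirrored pair still has at least one healthy drive."""
    failed = set(failed)
    return all(any(d not in failed for d in pair) for pair in pairs)


print(raid10_survives({0, 2}))     # True: one drive lost from each pair
print(raid10_survives({0, 1}))     # False: both drives of one pair lost
print(raid10_survives({0, 2, 3}))  # False: with 3 failures, a whole pair is gone
```

With only two pairs, any three failures must wipe out at least one complete pair, which is why the third failure took the array down.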
We run multiple backups each evening. One goes directly to disk on an external WD USB hard drive. The other is a network backup to a Dell PowerEdge 1800. Even though the USB restore would have been faster, I chose to use the network backup since I was already confident in restoring from USB. I began the restore just before 1:00 PM. Everything went smoothly and finished at 3:47 PM.
There was some data loss, though. The backup was from 3:21 AM and the failure happened around 12:20 PM, so we lost about 9 hours of work.
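The arithmetic behind that figure, and the restore time, can be checked with a short Python sketch. The calendar date is a placeholder (the post gives only times of day), and the restore start is assumed to be exactly 1:00 PM even though it actually began slightly earlier.

```python
from datetime import datetime

# Placeholder date: the post gives only times of day, not the actual date.
DAY = (2000, 1, 5)

last_backup = datetime(*DAY, 3, 21)    # backup taken at 3:21 AM
failure = datetime(*DAY, 12, 20)       # array lost around 12:20 PM
restore_start = datetime(*DAY, 13, 0)  # assumed; really "just before 1:00 PM"
restore_end = datetime(*DAY, 15, 47)   # restore finished at 3:47 PM

data_loss_window = failure - last_backup
restore_duration = restore_end - restore_start

print(data_loss_window)  # 8:59:00
print(restore_duration)  # 2:47:00
```

So the nightly backup left a window of just under 9 hours of unsaved work, and the network restore took a little under 3 hours.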
I was especially happy, though, that our entire Active Directory, antivirus, and print services are all located on the first RAID array (the two drives on the far right). This meant that DHCP, DNS, AV, and AD all performed as usual, so the staff could still access their email, printers, and other web content.
- Have a backup plan that you are confident in!
- Never put your AD across multiple arrays (unless across multiple hardware platforms)
- For under $100, an external USB drive is great peace of mind
- Dell OEM SCSI cards seem to work best in Dell hardware (not HP, go figure)