After the Circus

contact me
jean at geemoo dot ca

May 27, 2008
For all the people who say RAID is not a backup solution, and for all the people who still haven't listened... let me say it for you.

RAID is _not_ a backup solution

It's amazing how many people don't realize this, and how many people fall for it. At work, we had a pair of Linksys 1U rackmount 4-drive NASes, set up in a RAID5 with two LVM volumes. One was the primary, and the second was a backup of the primary, which was copied to each night. All was well until a bunch of networking upgrades. The backup procedure broke when the IP networks changed, but... "Ahh it'll be ok, it's a RAID drive." Luckily for me, I had tried to convince them otherwise, but that fateful day finally arrived when one of the volumes on the NAS vanished and our backup was 2 months old.
Much time was spent on the phone with Linksys, and finally we got an admission from them...
Them: "If you mount the drives into a Linux box, you can recover the data that way."
Us: "How do we go about doing that?"
Them: "Uh, sorry sir, we're not trained in Linux support."
Oh well. At least that's a starting point. So we tracked down (went out and bought) a motherboard with 4 SATA ports, plugged in the array and began exploring. As it turns out, and great kudos to Linksys here, they don't do anything sneaky or tricky with the array. The NSS6000 runs Linux, and the drives work just fine in Linux.

1) The NAS uses the entire drive (/dev/sda), not a partition on the drive (/dev/sda1), so when you assemble your RAID array (in our case, a RAID5), you have to pass the full drives as args to mdadm: mdadm -A /dev/md0 /dev/sda /dev/sdb /dev/sdc /dev/sdd

2) On top of the RAID array, Linksys puts an LVM2 setup, which allows you to resize your volumes, which theoretically can be handy. Type lvm to get into the lvm prompt, then pvscan to detect your physical volumes, and then vgchange -ay to activate your logical volumes. This will create your dev entries inside /dev/NAME_OF_VOLUME_GROUP, in our case /dev/RAIDB/, and our volume was named PUB.

3) At this point, it would be wise to make a backup of your volume. If you have the drive space to do that, you can dd if=/dev/RAIDB/PUB of=/dev/sde, where sde is the spare 500 gig SATA drive that I was making my backup onto. A note to those trying this: it is SLOW. We copied 400 gig and it took at least 8 hours, perhaps a little more. I suspect my rescue disk might not have been using the fastest SATA drivers available, or maybe it just takes that long to copy that much data, but it takes a while.

4) Now you can mount your volumes. Linksys uses XFS on the volumes (again for the resize functionality, I suppose). In our case, the XFS filesystem had become corrupted and was unmountable. I had to run xfs_repair -L /dev/RAIDB/PUB, which zeroed out the replay log (as it was corrupted), did a bunch of repairs on the filesystem and eventually brought me back to a prompt. After that, I could mount -t xfs /dev/RAIDB/PUB /mnt/nas and copy our data off the drive onto a backup disk.

5) Celebration!
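
For anyone who wants steps 1 to 4 in one place, here is a rough sketch of them strung together as a shell script. It is not what I actually ran (I typed the commands by hand), and a few things in it are my own assumptions rather than part of the original recovery: the run helper and DRY_RUN switch, calling pvscan and vgchange through the lvm binary instead of the interactive prompt, the bs=1M on dd (dd defaults to 512-byte blocks, so a bigger block size may help with the slowness, but I didn't test that), and the mkdir for the mount point. The device names, RAIDB/PUB and /mnt/nas are the ones from our box, so yours will likely differ. By default it only prints the commands; set DRY_RUN=0 once you have triple-checked the device names, especially the dd target.

```shell
#!/bin/sh
# Sketch of recovery steps 1-4. All names below are from this post's setup.
# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 executes them.

DRY_RUN="${DRY_RUN:-1}"
MEMBERS="/dev/sda /dev/sdb /dev/sdc /dev/sdd"  # whole disks, not partitions
ARRAY="/dev/md0"
VOLUME="/dev/RAIDB/PUB"   # /dev/<volume group>/<logical volume>
SPARE="/dev/sde"          # spare disk that gets OVERWRITTEN by the image
MOUNTPOINT="/mnt/nas"

# Hypothetical helper: echo the command, and run it unless in dry-run mode.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

recover_nas() {
    # 1) Assemble the RAID array from the full drives.
    #    $MEMBERS is left unquoted on purpose so it splits into four args.
    run mdadm -A "$ARRAY" $MEMBERS || return 1

    # 2) Detect the LVM physical volumes and activate the logical volumes.
    run lvm pvscan || return 1
    run lvm vgchange -ay || return 1

    # 3) Image the volume onto the spare disk before touching anything.
    run dd if="$VOLUME" of="$SPARE" bs=1M || return 1

    # 4) Repair and mount. -L throws away the XFS log, so it is only for
    #    the case where the log is corrupt and the volume will not mount.
    run xfs_repair -L "$VOLUME" || return 1
    run mkdir -p "$MOUNTPOINT" || return 1
    run mount -t xfs "$VOLUME" "$MOUNTPOINT"
}

recover_nas
```

If your filesystem mounts fine after step 2, skip the xfs_repair line entirely. I only needed -L because the log itself was corrupted.
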
Tags: linux, nas, diskrecovery