Ruby, iOS, and Other Development

A place to share useful code snippets, ideas, and techniques

All code in posted articles shall be considered public domain unless otherwise noted.
Comments remain the property of their authors.

2010-11-10

Why Software RAID?

The first question, of course, is why RAID at all? (Okay, the first question might be "what is RAID?" but I'm not going to address that here. See the Wikipedia article on RAID if you aren't familiar with it. For the purpose of this post I'm only going to be talking about RAID1. You may also find this posting on why RAID5 is the wrong choice interesting.) Some data you can always get again. Installed software (including the OS) is the big one, but also purchased (or free) audio/video (e.g. stuff bought through iTunes, music/video legally ripped from your own CDs/DVDs, downloaded podcasts, etc.). There is also plenty of stuff you could recreate, possibly even better the second time, such as various configuration files and user customizations. RAID is for everything else: pictures and video of your baby, programs you've written, all the email you've saved over the years, etc.

While RAID doesn't replace regular backups (including offsite backups), it protects you from minor disasters. Anything in the chain of hardware making your data available, including the disk, disk enclosure, cable, disk adapter, motherboard, power supply, memory, network adapter, network cable, etc. can fail. These components are all relatively inexpensive (the I in RAID) these days. Replacing the failed component brings you back up and running in no time, unless it's the disk with your data on it. RAID makes recovering from a failed disk as much of a simple drop-in replacement as a network cable (well, mostly — more on that later).

RAID is becoming increasingly popular in consumer-level machines, but the way machines tend to be preconfigured is exactly wrong for consumers. Suppose I buy a desktop machine with two internal hard drives connected to a hardware RAID card that is preconfigured in BIOS, and it automatically boots into Windows with everything just working. It's dead easy, until a drive actually dies and needs replacing. Even if the user is tech-savvy, it's a matter of shutting the machine down, opening the case, pulling out the dead drive, replacing it, and possibly poking at the BIOS on boot to make it aware of the new drive. There is no good reason for that downtime. Furthermore, moving to a new computer means copying all the data from the old computer rather than just transferring the physical drives. Worse, if the hardware RAID controller dies there is no guarantee that it will be possible to find another RAID controller that understands the old controller's disk format. There is no standard disk layout for RAID, so each manufacturer's controllers use their own proprietary format. The format may vary from model to model or even version to version from the same manufacturer.

To be fair, some products make it much easier to replace drives and move to a new computer. The Drobo is a good example. Where it fails is on both price and proprietary format. If your Drobo fails, you need a new Drobo to get to your data. There is no other manufacturer you can turn to, and if Data Robotics goes out of business then you will be obligated to pay whatever the asking price is on eBay to replace it, or lose your data. A good backup plan will let you restore from a backup to some other manufacturer's system, but now you're talking about serious downtime.

My setup uses software RAID under Debian GNU/Linux. I have an old machine I bought for $200 (including shipping!) from some surplus storefront on the web. I have a no-name eSATA PCI card, a pair of no-name eSATA enclosures, and a pair of disks which are probably either Seagate or Western Digital. When I get the chance, I'm going to add a third drive and enclosure as a hot spare. I keep around spare cables, and I can buy a replacement component, even the computer, for little cost and receive it quickly. Replacing a disk doesn't require any downtime, just unplugging the dead one, plugging in a new one, and letting Linux know about the new one. I know that I will always be able to move my RAID to a new machine because I will always be able to get the same (or backward-compatible) RAID implementation under Linux.
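To make "letting Linux know about the new one" a little more concrete, here is a minimal sketch of the usual mdadm sequence for swapping a RAID1 member: mark the old disk failed, remove it from the array, then add the replacement. It only builds and prints the command lines rather than running them. The device names (/dev/md0, /dev/sdb1, /dev/sdc1) and the helper name replacement_commands are assumptions for illustration, not details of my actual setup, and the real commands need to be run as root.

```python
def replacement_commands(array, failed, replacement):
    """Build the mdadm command lines for swapping one RAID1 member.

    All three arguments are hypothetical device paths supplied by the caller.
    Nothing is executed here; the commands are only returned as lists.
    """
    return [
        # Mark the dying disk as failed so the array stops using it.
        ["mdadm", "--manage", array, "--fail", failed],
        # Detach the failed disk from the array.
        ["mdadm", "--manage", array, "--remove", failed],
        # Add the new disk; the kernel then resyncs the mirror onto it.
        ["mdadm", "--manage", array, "--add", replacement],
    ]


if __name__ == "__main__":
    for cmd in replacement_commands("/dev/md0", "/dev/sdb1", "/dev/sdc1"):
        print(" ".join(cmd))
```

Adding a disk to an array that already has all of its members is also how a hot spare gets attached, so the last command covers that case too. Resync progress shows up in /proc/mdstat.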

There are tradeoffs, of course. Software RAID will never be as fast as hardware RAID (at least not for a reasonable price). There is some manual configuration of software RAID under Linux, and it will never be as simple as a Drobo or a preconfigured system. If you don't use the Linux RAID machine as your primary computer, there is configuration involved in setting up network shares (especially in making them secure). If you prioritize availability, maintainability, and price over simplicity, as I do, Linux software RAID is the right choice. If you have other priorities and are fully aware of the tradeoffs, however, it may not be the right choice for you. Enjoy!
