<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
<META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.21">
<TITLE>The Software-RAID HOWTO: Hardware issues</TITLE>
<LINK HREF="Software-RAID-HOWTO-5.html" REL=next>
<LINK HREF="Software-RAID-HOWTO-3.html" REL=previous>
<LINK HREF="Software-RAID-HOWTO.html#toc4" REL=contents>
</HEAD>
<BODY>
<A HREF="Software-RAID-HOWTO-5.html">Next</A>
<A HREF="Software-RAID-HOWTO-3.html">Previous</A>
<A HREF="Software-RAID-HOWTO.html#toc4">Contents</A>
<HR>
<H2><A NAME="s4">4.</A> <A HREF="Software-RAID-HOWTO.html#toc4">Hardware issues</A></H2>
<P><B>This HOWTO is deprecated; the Linux RAID HOWTO is maintained as a wiki by the
linux-raid community at
<A HREF="http://raid.wiki.kernel.org/">http://raid.wiki.kernel.org/</A></B></P>
<P>This section will mention some of the hardware concerns involved when
running software RAID.</P>
<P>If you are after high performance, you should make sure that the
bus(ses) to the drives are fast enough. You should not put 14 UW-SCSI
drives on one UW bus if each drive can deliver 20 MB/s and the bus can
only sustain 160 MB/s. Also, you should only have one device per IDE
bus: running disks as master/slave is horrible for performance, since
IDE is really bad at accessing more than one drive per bus. Of course,
all newer motherboards have two IDE busses, so you can set up two disks in
RAID without buying more controllers. Extra IDE controllers are rather
cheap these days, so setting up 6-8 disk systems with IDE is easy and
affordable.</P>
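<P>As a rough sanity check of such numbers, you can measure the sustained
read speed of each drive and compare the sum to what the bus can carry.
This is only a sketch; the device names are examples, and the figures
reported by hdparm are approximate:
<PRE>
# Measure sustained (buffered) disk read speed of each drive
hdparm -t /dev/hda
hdparm -t /dev/hdc

# Example arithmetic: 14 drives at 20 MB/s each is 280 MB/s, well
# above what a 160 MB/s UW bus can sustain.
</PRE>
</P>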
<H2><A NAME="ss4.1">4.1</A> <A HREF="Software-RAID-HOWTO.html#toc4.1">IDE Configuration</A>
</H2>
<P>It is indeed possible to run RAID over IDE disks. And excellent
performance can be achieved too. In fact, today's price on IDE drives
and controllers does make IDE something to be considered, when setting
up new RAID systems.
<UL>
<LI><B>Physical stability:</B> IDE drives have traditionally
been of lower mechanical quality than SCSI drives. Even today, the
warranty on IDE drives is typically one year, whereas it is often
three to five years on SCSI drives. Although it is not fair to say
that IDE drives are by definition poorly made, one should be aware
that IDE drives of <EM>some</EM> brands <EM>may</EM> fail more often
than similar SCSI drives. However, other brands use the exact same
mechanical setup for both SCSI and IDE drives. It all boils down to:
all disks fail, sooner or later, and one should be prepared for that.</LI>
<LI><B>Data integrity:</B> Earlier, IDE had no way of assuring
that the data sent onto the IDE bus would be the same as the data
actually written to the disk, due to a total lack of parity,
checksums, and so on. With the Ultra-DMA standard, IDE drives now
checksum the data they receive, so it becomes highly unlikely that
data gets corrupted. The PCI bus, however, has neither parity nor
checksums, and that bus is used by both IDE and SCSI systems.</LI>
<LI><B>Performance:</B> I am not going to write thoroughly about
IDE performance here. The really short story is:
<UL>
<LI>IDE drives are fast, although they are not (as of this writing)
found in 10,000 or 15,000 RPM versions like their SCSI counterparts</LI>
<LI>IDE has more CPU overhead than SCSI (but who cares?)</LI>
<LI>Only use <B>one</B> IDE drive per IDE bus, slave disks spoil
performance</LI>
</UL>
</LI>
<LI><B>Fault survival:</B> The IDE driver usually survives a failing
IDE device. The RAID layer will mark the disk as failed (see the
/proc/mdstat example after this list), and if you are running RAID
levels 1 or above, the machine should work just fine until you can
take it down for maintenance.</LI>
</UL>
</P>
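<P>A quick way to see whether the RAID layer has marked a disk as failed
is to look at /proc/mdstat. The output below is only illustrative
(array layout and device names will differ on your system); a failed
member shows up with an (F) after it:
<PRE>
cat /proc/mdstat

Personalities : [raid1]
md0 : active raid1 hdc1[1] hda1[0](F)
      4200896 blocks [2/1] [_U]
</PRE>
</P>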
<P>It is <B>very</B> important, that you only use <B>one</B> IDE disk
per IDE bus. Not only would two disks ruin the performance, but the
failure of a disk often guarantees the failure of the bus, and
therefore the failure of all disks on that bus. In a fault-tolerant
RAID setup (RAID levels 1, 4 and 5), the failure of one disk can be
handled, but the failure of two disks (the two disks on the bus that
fails because of the first failed disk) will render the array
unusable. Also, when the master drive on a bus fails, the slave or the
IDE controller may get awfully confused. One bus, one drive, that's
the rule.</P>
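<P>If you are unsure which drives share an IDE bus, the /proc/ide
directory (present in the 2.2 and 2.4 kernels) shows how drives map to
interfaces. The listing below is only an illustration; your interface
and drive names will differ:
<PRE>
ls /proc/ide/
# drivers  hda  hdc  ide0  ide1

ls /proc/ide/ide0/
# channel  config  hda  mate  model
</PRE>
</P>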
<P>There are cheap PCI IDE controllers out there. You often get two or
four busses for around $80. Considering the much lower price of IDE
disks versus SCSI disks, an IDE disk array can often be a really nice
solution if one can live with the relatively low number (around 8
probably) of disks one can attach to a typical system.</P>
<P>IDE has major cabling problems when it comes to large arrays. Even if
you had enough PCI slots, it's unlikely that you could fit much more
than 8 disks in a system and still get it running without data
corruption caused by too long IDE cables.</P>
<P>Furthermore, some of the newer IDE drives come with a restriction that
they are only to be used a given number of hours per day. These drives
are meant for desktop usage, and it <B>can</B> lead to severe
problems if these are used in a 24/7 server RAID environment.</P>
<H2><A NAME="ss4.2">4.2</A> <A HREF="Software-RAID-HOWTO.html#toc4.2">Hot Swap</A>
</H2>
<P>Although hot swapping of drives is supported to some extent, it is
still not something one can do easily.</P>
<H3>Hot-swapping IDE drives</H3>
<P><B>Don't!</B> IDE doesn't handle hot swapping at all. Sure, it may
work for you, if your IDE driver is compiled as a module (only
possible in the 2.2 series of the kernel), and you re-load it after
you've replaced the drive. But you may just as well end up with a
fried IDE controller, and you'll be looking at a lot more down-time
than just the time it would have taken to replace the drive on a
downed system.</P>
<P>The main problem, except for the electrical issues that can destroy
your hardware, is that the IDE bus must be re-scanned after disks are
swapped. While newer Linux kernels do support re-scan of an IDE bus
(with the help of the hdparm utility), re-detecting partitions is
still something that is lacking. If the new disk is 100% identical to
the old one (with respect to geometry, etc.), it <EM>may</EM> work, but
really, you are walking the bleeding edge here.</P>
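<P>If you do decide to experiment, the relevant knobs are in the hdparm
utility. The following is only a sketch; the device name is an example,
and whether your hdparm build offers the (dangerous) interface
un-register/register options at all varies, so read hdparm(8) first:
<PRE>
# Ask the kernel to re-read the partition table of the replaced drive
# (example device name)
hdparm -z /dev/hdc

# Some hdparm versions also offer -U (un-register an IDE interface)
# and -R (register an IDE interface). Both are marked DANGEROUS in the
# man page and take hardware-specific arguments -- consult hdparm(8).
</PRE>
</P>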
<H3>Hot-swapping SCSI drives</H3>
<P>Normal SCSI hardware is not hot-swappable either. It <B>may</B>,
however, work: if your SCSI driver supports re-scanning the bus, and
removing and adding devices, you may be able to hot-swap
devices. However, on a normal SCSI bus you probably shouldn't unplug
devices while your system is still powered up. But then again, it may
just work (and you may end up with fried hardware).</P>
<P>The SCSI layer <B>should</B> survive if a disk dies, but not all
SCSI drivers handle this yet. If your SCSI driver dies when a disk
goes down, your system will go with it, and hot-plug isn't really
interesting then.</P>
<H3>Hot-swapping with SCA</H3>
<P>With SCA, it is possible to hot-plug devices. Unfortunately, this is
not as simple as it should be, but it is both possible and safe.</P>
<P>Replace the RAID device, disk device, and host/channel/id/lun numbers
with the appropriate values in the example below:</P>
<P>
<UL>
<LI>Dump the partition table from the drive, if it is still
readable:
<PRE>
sfdisk -d /dev/sdb > partitions.sdb
</PRE>
</LI>
<LI>Remove the drive to replace from the array:
<PRE>
raidhotremove /dev/md0 /dev/sdb1
</PRE>
</LI>
<LI>Look up the Host, Channel, ID and Lun of the drive to replace,
by looking in
<PRE>
/proc/scsi/scsi
</PRE>
</LI>
<LI>Remove the drive from the bus:
<PRE>
echo "scsi remove-single-device 0 0 2 0" > /proc/scsi/scsi
</PRE>
</LI>
<LI>Verify that the drive has been correctly removed, by looking in
<PRE>
/proc/scsi/scsi
</PRE>
</LI>
<LI>Unplug the drive from your SCA bay, and insert a new drive</LI>
<LI>Add the new drive to the bus:
<PRE>
echo "scsi add-single-device 0 0 2 0" > /proc/scsi/scsi
</PRE>
(this should spin up the drive as well)</LI>
<LI>Re-partition the drive using the previously dumped partition
table:
<PRE>
sfdisk /dev/sdb &lt; partitions.sdb
</PRE>
</LI>
<LI>Add the drive to your array (you can then follow the rebuild in
/proc/mdstat, as shown below):
<PRE>
raidhotadd /dev/md0 /dev/sdb1
</PRE>
</LI>
</UL>
</P>
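<P>After the last step, the kernel starts reconstructing the new member.
A minimal way to follow the rebuild, assuming the array is /dev/md0 as
in the example above, is to watch /proc/mdstat:
<PRE>
cat /proc/mdstat

# or, to refresh the view every few seconds:
watch -n 5 cat /proc/mdstat
</PRE>
</P>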
<P>The arguments to the "scsi remove-single-device" and
"scsi add-single-device" commands are: Host, Channel, Id and Lun. These
numbers are found in the "/proc/scsi/scsi" file, as illustrated below.</P>
<P>The above steps have been tried and tested on a system with IBM SCA
disks and an Adaptec SCSI controller. If you encounter problems or
find easier ways to do this, please discuss this on the linux-raid
mailing list.</P>
<HR>
<A HREF="Software-RAID-HOWTO-5.html">Next</A>
<A HREF="Software-RAID-HOWTO-3.html">Previous</A>
<A HREF="Software-RAID-HOWTO.html#toc4">Contents</A>
</BODY>
</HTML>