<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
<META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.21">
<TITLE>The Software-RAID HOWTO: Detecting, querying and testing</TITLE>
<LINK HREF="Software-RAID-HOWTO-7.html" REL=next>
<LINK HREF="Software-RAID-HOWTO-5.html" REL=previous>
<LINK HREF="Software-RAID-HOWTO.html#toc6" REL=contents>
</HEAD>
<BODY>
<A HREF="Software-RAID-HOWTO-7.html">Next</A>
<A HREF="Software-RAID-HOWTO-5.html">Previous</A>
<A HREF="Software-RAID-HOWTO.html#toc6">Contents</A>
<HR>
<H2><A NAME="s6">6.</A> <A HREF="Software-RAID-HOWTO.html#toc6">Detecting, querying and testing</A></H2>

<P><B>This HOWTO is deprecated; the Linux RAID HOWTO is maintained as a wiki by the
linux-raid community at
<A HREF="http://raid.wiki.kernel.org/">http://raid.wiki.kernel.org/</A></B></P>
<P>This section is about life with a software RAID system:
communicating with the arrays and tinkering with them.</P>

<P>Note that when manipulating md devices, you should always
remember that you are working with entire filesystems. So, although
there may be some redundancy keeping your files alive, you must
proceed with caution.</P>

<H2><A NAME="ss6.1">6.1</A> <A HREF="Software-RAID-HOWTO.html#toc6.1">Detecting a drive failure</A>
</H2>

<P>No mystery here. A quick look at the standard log and
stat files is enough to notice a drive failure.</P>
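<P>For a quick check, a simple search of the log usually surfaces disk trouble; this is
only a rough sketch, since the log path and the exact message wording depend on your
distribution and kernel:
<PRE>
grep -iE 'error|fail' /var/log/messages | less
</PRE>
</P>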
<P><CODE>/var/log/messages</CODE> always fills screens with
tons of error messages, no matter what has happened. But when it comes to
a disk crash, huge numbers of kernel errors are reported.
Some nasty examples, for the masochists,
<PRE>
kernel: scsi0 channel 0 : resetting for second half of retries.
kernel: SCSI bus is being reset for host 0 channel 0.
kernel: scsi0: Sending Bus Device Reset CCB #2666 to Target 0
kernel: scsi0: Bus Device Reset CCB #2666 to Target 0 Completed
kernel: scsi : aborting command due to timeout : pid 2649, scsi0, channel 0, id 0, lun 0 Write (6) 18 33 11 24 00
kernel: scsi0: Aborting CCB #2669 to Target 0
kernel: SCSI host 0 channel 0 reset (pid 2644) timed out - trying harder
kernel: SCSI bus is being reset for host 0 channel 0.
kernel: scsi0: CCB #2669 to Target 0 Aborted
kernel: scsi0: Resetting BusLogic BT-958 due to Target 0
kernel: scsi0: *** BusLogic BT-958 Initialized Successfully ***
</PRE>

Most often, disk failures look like these,
<PRE>
kernel: scsidisk I/O error: dev 08:01, sector 1590410
kernel: SCSI disk error : host 0 channel 0 id 0 lun 0 return code = 28000002
</PRE>

or these
<PRE>
kernel: hde: read_intr: error=0x10 { SectorIdNotFound }, CHS=31563/14/35, sector=0
kernel: hde: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
</PRE>

And, as expected, a look at the classic <CODE>/proc/mdstat</CODE> will also reveal problems,
<PRE>
Personalities : [linear] [raid0] [raid1] [translucent]
read_ahead not set
md7 : active raid1 sdc9[0] sdd5[8] 32000 blocks [2/1] [U_]
</PRE>

Later in this section we will learn how to monitor RAID with mdadm so we
can receive alert reports about disk failures. Now it's time to learn more
about interpreting <CODE>/proc/mdstat</CODE>.</P>

<H2><A NAME="ss6.2">6.2</A> <A HREF="Software-RAID-HOWTO.html#toc6.2">Querying the arrays status</A>
</H2>

<P>You can always take a look at <CODE>/proc/mdstat</CODE>. It won't hurt. Let's learn
how to read the file. For example,
<PRE>
Personalities : [raid1]
read_ahead 1024 sectors
md5 : active raid1 sdb5[1] sda5[0]
      4200896 blocks [2/2] [UU]

md6 : active raid1 sdb6[1] sda6[0]
      2104384 blocks [2/2] [UU]

md7 : active raid1 sdb7[1] sda7[0]
      2104384 blocks [2/2] [UU]

md2 : active raid1 sdc7[1] sdd8[2] sde5[0]
      1052160 blocks [2/2] [UU]

unused devices: &lt;none&gt;
</PRE>

To identify the spare devices, first look for the [#/#] value on a line.
The first number is the number of devices in a complete raid device, as defined.
Let's say it is "n".
The raid role numbers [#] following each device indicate its
role, or function, within the raid set. Any device with role number "n" or
higher is a spare disk; devices with role numbers 0,1,..,n-1 make up the working array.</P>

<P>Also, if you have a failure, the failed device will be marked with (F)
after the [#]. The spare that replaces this device will be the device
with the lowest role number n or higher that is not marked (F). Once the
resync operation is complete, the devices' role numbers are swapped (see
the hypothetical sketch below).</P>
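<P>Tying this to the example above: md2 is defined for n=2 devices, sde5[0] and sdc7[1]
hold roles 0 and 1 and form the working array, while sdd8[2] has role number 2 and is
therefore a spare. As a hypothetical sketch, following the rules just described (the
exact layout varies with the kernel version), the same array would look roughly like
this after sde5 failed and while the spare is being rebuilt:
<PRE>
md2 : active raid1 sdd8[2] sdc7[1] sde5[0](F)
      1052160 blocks [2/1] [_U]
</PRE>
</P>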

<P>The order in which the devices appear in the <CODE>/proc/mdstat</CODE> output
means nothing.</P>

<P>Finally, remember that you can always use raidtools or mdadm to
check the arrays out.
<PRE>
mdadm --detail /dev/mdx
lsraid -a /dev/mdx
</PRE>

These commands will show spare and failed disks loud and clear.</P>

<H2><A NAME="ss6.3">6.3</A> <A HREF="Software-RAID-HOWTO.html#toc6.3">Simulating a drive failure</A>
</H2>

<P>If you plan to use RAID to get fault tolerance, you may also want to
test your setup, to see if it really works. Now, how does one
simulate a disk failure?</P>

<P>The short story is that you can't, except perhaps by putting a fire
axe through the drive you want to "simulate" the fault on. You can never
know what will happen if a drive dies. It may electrically take down the
bus it is attached to, rendering all drives on that bus
inaccessible. I have never heard of that happening, though it is
entirely possible. The drive may also just report a read/write fault
to the SCSI/IDE layer, which in turn lets the RAID layer handle the
situation gracefully. Fortunately, this is the way things often go.</P>

<P>Remember that you must be running RAID-{1,4,5} for your array to be
able to survive a disk failure. Linear or RAID-0 will fail completely
when a device is missing.</P>

<H3>Force-fail by hardware</H3>

<P>If you want to simulate a drive failure, you can simply unplug the drive.
You should do this with the power off. If you are interested in testing
whether your data can survive with one disk less than the usual number,
there is no point in being a hot-plug cowboy here. Take the system
down, unplug the disk, and boot it up again.</P>

<P>Look in the syslog, and look at <CODE>/proc/mdstat</CODE> to see how the RAID is
doing. Did it work?</P>

<P>Faulty disks should appear marked with an <CODE>(F)</CODE> if you look at
<CODE>/proc/mdstat</CODE>.
Also, users of mdadm should see the device state as <CODE>faulty</CODE>.</P>

<P>When you've re-connected the disk (with the power off, of
course), you can add the "new" device to the RAID again
with the raidhotadd command, as shown below.</P>
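<P>For example, assuming the disk you pulled was /dev/sdc2 and it belonged to /dev/md1
(adjust the names to your setup):
<PRE>
raidhotadd /dev/md1 /dev/sdc2
</PRE>

Users of mdadm can do the same with <CODE>mdadm /dev/md1 -a /dev/sdc2</CODE>, as described
later in this section.</P>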

<H3>Force-fail by software</H3>

<P>Newer versions of raidtools come with a <CODE>raidsetfaulty</CODE> command.
By using <CODE>raidsetfaulty</CODE> you can simulate a drive failure without
unplugging anything.</P>

<P>Just running the command
<PRE>
raidsetfaulty /dev/md1 /dev/sdc2
</PRE>

should be enough to fail the disk /dev/sdc2 of the array /dev/md1.
If you are using mdadm, just type
<PRE>
mdadm --manage --set-faulty /dev/md1 /dev/sdc2
</PRE>

Now things get interesting. First, you should see something
like the first line below in your system's log. Something like the
second line will appear if you have spare disks configured.
<PRE>
kernel: raid1: Disk failure on sdc2, disabling device.
kernel: md1: resyncing spare disk sdb7 to replace failed disk
</PRE>

Checking <CODE>/proc/mdstat</CODE> will show the degraded array. If a
spare disk was available, reconstruction should have started.</P>
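<P>As a rough sketch (the exact layout varies with the kernel version, and the second
active member is invented here for illustration), the <CODE>/proc/mdstat</CODE> entry
for /dev/md1 could now look something like
<PRE>
md1 : active raid1 sdb7[2] sda1[1] sdc2[0](F)
      4200896 blocks [2/1] [_U]
</PRE>

with sdc2 marked failed and the spare sdb7 holding role number 2 until the resync finishes.</P>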

<P>Another recent addition to raidtools is <CODE>lsraid</CODE>. Try
<PRE>
lsraid -a /dev/md1
</PRE>

Users of mdadm can run the command
<PRE>
mdadm --detail /dev/md1
</PRE>

and enjoy the view.</P>
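<P>The most interesting part of the <CODE>mdadm --detail</CODE> output in this situation is
the state summary. Heavily trimmed, and with the exact wording depending on your mdadm
version, it should contain lines along these lines:
<PRE>
          State : clean, degraded
 Failed Devices : 1
  Spare Devices : 1
</PRE>
</P>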

<P>Now you've seen how it goes when a device fails. Let's fix things up.</P>

<P>First, we will remove the failed disk from the array. Run the command
<PRE>
raidhotremove /dev/md1 /dev/sdc2
</PRE>

Users of mdadm can run the command
<PRE>
mdadm /dev/md1 -r /dev/sdc2
</PRE>

Note that <CODE>raidhotremove</CODE> cannot pull a good disk out of a running array.
For obvious reasons, only failed disks can be hot-removed from an
array (running raidstop and unmounting the device won't help; if you want to
remove a healthy disk, mark it as failed first).</P>

<P>Now we have a /dev/md1 which has just lost a device. This could be
a degraded RAID, or perhaps a system in the middle of a reconstruction
process. We wait until recovery ends before setting things back to normal.</P>
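<P>If you want to follow the reconstruction as it progresses, repeatedly reading
<CODE>/proc/mdstat</CODE> does the job; for instance, assuming the common <CODE>watch</CODE>
utility is installed:
<PRE>
watch cat /proc/mdstat
</PRE>
</P>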

<P>So the trip ends when we send /dev/sdc2 back home.
<PRE>
raidhotadd /dev/md1 /dev/sdc2
</PRE>

As usual, you can use mdadm instead of raidtools. This is the
command
<PRE>
mdadm /dev/md1 -a /dev/sdc2
</PRE>

As the prodigal son returns to the array, we'll see it becoming an active
member of /dev/md1 if necessary. If not, it will be marked as a spare disk.
That's management made easy.</P>

<H2><A NAME="ss6.4">6.4</A> <A HREF="Software-RAID-HOWTO.html#toc6.4">Simulating data corruption</A>
</H2>

<P>RAID (be it hardware or software) assumes that if a write to a disk
doesn't return an error, then the write was successful. Therefore, if
your disk corrupts data without returning an error, your data
<EM>will</EM> become corrupted. This is of course very unlikely to
happen, but it is possible, and it would result in a corrupt
filesystem.</P>

<P>RAID cannot, and is not supposed to, guard against data corruption on
the media. Therefore, it doesn't make any sense to purposely
corrupt data (using <CODE>dd</CODE> for example) on a disk to see how the
RAID system will handle that. It is most likely (unless you corrupt
the RAID superblock) that the RAID layer will never find out about the
corruption, but your filesystem on the RAID device will be corrupted.</P>

<P>This is the way things are supposed to work. RAID is not a guarantee
of data integrity; it just allows you to keep your data if a disk
dies (that is, with RAID levels 1 and above, of course).</P>

<H2><A NAME="ss6.5">6.5</A> <A HREF="Software-RAID-HOWTO.html#toc6.5">Monitoring RAID arrays</A>
</H2>

<P>You can run mdadm as a daemon by using the follow/monitor mode.
If needed, that will make mdadm send email alerts to the system
administrator when arrays encounter errors or fail. Also, follow mode
can be used to trigger contingency commands if a disk fails, such as
giving a second chance to a failed disk by removing and reinserting it,
so a non-fatal failure could be solved automatically.</P>

<P>Let's see a basic example.
Running
<PRE>
mdadm --monitor --mail=root@localhost --delay=1800 /dev/md2
</PRE>

should start an mdadm daemon to monitor /dev/md2.
The delay parameter means that polling will be done in intervals of
1800 seconds. Finally, critical events and fatal errors will be
e-mailed to the system manager. That's RAID monitoring made easy.</P>

<P>Finally, the <CODE>--program</CODE> or <CODE>--alert</CODE> parameters
specify the program to be run whenever an event is detected.</P>
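<P>For example, the following (with a hypothetical script path) should watch every array
listed in <CODE>/etc/mdadm.conf</CODE> and run the script whenever an event is detected;
the script is called with the event name, the md device and, when relevant, the component
device as arguments:
<PRE>
mdadm --monitor --scan --delay=1800 --program=/usr/local/sbin/raid-event.sh
</PRE>
</P>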

<P>Note that the mdadm daemon will never exit once it decides that
there are arrays to monitor, so it should normally be run in the
background. Remember that you are running a daemon, not a
shell command.</P>
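<P>A minimal sketch of running it in the background with ordinary shell tools (newer
mdadm versions can also daemonise themselves; check your manual page):
<PRE>
nohup mdadm --monitor --scan --mail=root@localhost --delay=1800 &amp;
</PRE>
</P>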

<P>Using mdadm to monitor a RAID array is simple and effective. However,
there are fundamental problems with that kind of monitoring - what
happens, for example, if the mdadm daemon stops? In order to overcome
this problem, one should look towards "real" monitoring
solutions. There are a number of free software, open source, and
commercial solutions available which can be used for software RAID
monitoring on Linux. A search on
<A HREF="http://freshmeat.net">FreshMeat</A> should return a good number of matches.</P>

<HR>
<A HREF="Software-RAID-HOWTO-7.html">Next</A>
<A HREF="Software-RAID-HOWTO-5.html">Previous</A>
<A HREF="Software-RAID-HOWTO.html#toc6">Contents</A>
</BODY>
</HTML>