<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
 <META NAME="GENERATOR" CONTENT="SGML-Tools 1.0.9">
 <TITLE>Software-RAID HOWTO: Troubleshooting Install Problems</TITLE>
 <LINK HREF="Software-RAID-0.4x-HOWTO-6.html" REL=next>
 <LINK HREF="Software-RAID-0.4x-HOWTO-4.html" REL=previous>
 <LINK HREF="Software-RAID-0.4x-HOWTO.html#toc5" REL=contents>
</HEAD>
<BODY>
<A HREF="Software-RAID-0.4x-HOWTO-6.html">Next</A>
<A HREF="Software-RAID-0.4x-HOWTO-4.html">Previous</A>
<A HREF="Software-RAID-0.4x-HOWTO.html#toc5">Contents</A>
<HR>
<H2><A NAME="s5">5. Troubleshooting Install Problems</A></H2>

<P>
<OL>
<LI><B>Q</B>:
What is the current best known-stable patch for RAID in the
2.0.x series kernels?

<BLOCKQUOTE>
<B>A</B>:
As of 18 Sept 1997, it is
"2.0.30 + pre-9 2.0.31 + Werner Fink's swapping patch
+ the alpha RAID patch".  As of November 1997, it is
2.0.31 + ... !?
</BLOCKQUOTE>

</LI>
<LI><B>Q</B>:
The RAID patches will not install cleanly for me.  What's wrong?
<BLOCKQUOTE>
<B>A</B>:
Make sure that <CODE>/usr/include/linux</CODE> is a symbolic link to
<CODE>/usr/src/linux/include/linux</CODE>.

Make sure that the new files <CODE>raid5.c</CODE>, etc.
have been copied to their correct locations.  Sometimes
the patch command will not create new files.  Try the
<CODE>-f</CODE> flag on <CODE>patch</CODE>.
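<P>
A minimal sketch of these checks, assuming the kernel tree is unpacked in
<CODE>/usr/src/linux</CODE> and that <CODE>raid5.c</CODE> belongs under
<CODE>drivers/block</CODE> (an assumption; check the patch itself for
the real paths):
<PRE>
    # show where /usr/include/linux points
    ls -ld /usr/include/linux

    # list the hunks that patch could not apply
    find /usr/src/linux -name '*.rej' -print

    # confirm that the new file was actually created (path is an assumption)
    ls -l /usr/src/linux/drivers/block/raid5.c

</PRE>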
</BLOCKQUOTE>

</LI>
<LI><B>Q</B>:
While compiling raidtools 0.42, compilation stops trying to
include &lt;pthread.h&gt; but it doesn't exist in my system.
How do I fix this?

<BLOCKQUOTE>
<B>A</B>:
raidtools-0.42 requires linuxthreads-0.6 from:
<A HREF="ftp://ftp.inria.fr/INRIA/Projects/cristal/Xavier.Leroy">ftp://ftp.inria.fr/INRIA/Projects/cristal/Xavier.Leroy</A>
Alternately, use glibc v2.0.
</BLOCKQUOTE>

</LI>
<LI><B>Q</B>:
I get the message: <CODE>mdrun -a /dev/md0: Invalid argument</CODE>

<BLOCKQUOTE>
<B>A</B>:
Use <CODE>mkraid</CODE> to initialize the RAID set prior to the first use.
<CODE>mkraid</CODE> ensures that the RAID array is initially in a
consistent state by erasing the RAID partitions. In addition,
<CODE>mkraid</CODE> will create the RAID superblocks.
</BLOCKQUOTE>

</LI>
<LI><B>Q</B>:
I get the message: <CODE>mdrun -a /dev/md0: Invalid argument</CODE>
The setup was:
<UL>
<LI>RAID built as a kernel module</LI>
<LI>normal install procedure followed ... mdcreate, mdadd, etc.</LI>
<LI><CODE>cat /proc/mdstat</CODE> shows
<PRE>
    Personalities :
    read_ahead not set
    md0 : inactive sda1 sdb1 6313482 blocks
    md1 : inactive
    md2 : inactive
    md3 : inactive

</PRE>
</LI>
<LI><CODE>mdrun -a</CODE> generates the error message
<CODE>/dev/md0: Invalid argument</CODE></LI>
</UL>


<BLOCKQUOTE>
<B>A</B>:
Try <CODE>lsmod</CODE> (or, alternately, <CODE>cat
/proc/modules</CODE>) to see if the raid modules are loaded.
If they are not, you can load them explicitly with
the <CODE>modprobe raid1</CODE> or <CODE>modprobe raid5</CODE>
command.  Alternately, if you are using the autoloader
and expected <CODE>kerneld</CODE> to load them but it did not,
your loader is probably missing the information it needs to
load the modules.  Edit <CODE>/etc/conf.modules</CODE> and add
the following lines:

<PRE>
    alias md-personality-3 raid1
    alias md-personality-4 raid5

</PRE>
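<P>
As a rough sketch, the check-and-load sequence might look like this for
RAID-1 (substitute <CODE>raid5</CODE> as needed).  Once a personality is
loaded, the <CODE>Personalities</CODE> line of
<CODE>/proc/mdstat</CODE> should no longer be empty:
<PRE>
    # is a raid personality module already loaded?
    lsmod | grep raid

    # if not, load it by hand (as root)
    modprobe raid1

    # look at the Personalities line again
    cat /proc/mdstat

</PRE>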

</BLOCKQUOTE>

</LI>
<LI><B>Q</B>:
While doing <CODE>mdadd -a</CODE> I get the error:
<CODE>/dev/md0: No such file or directory</CODE>.  Indeed, there
seems to be no <CODE>/dev/md0</CODE> anywhere.  Now what do I do?

<BLOCKQUOTE>
<B>A</B>:
The raid-tools package will create these devices when
you run <CODE>make install</CODE> as root.  Alternately,
you can do the following:
<PRE>
    cd /dev
    ./MAKEDEV md

</PRE>
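<P>
If <CODE>MAKEDEV</CODE> on your system does not know about
<CODE>md</CODE>, the nodes can also be made by hand.  A sketch, assuming
the standard assignment of block major number 9 to the md driver:
<PRE>
    # run as root; one node per md device, minor numbers 0 to 3
    mknod /dev/md0 b 9 0
    mknod /dev/md1 b 9 1
    mknod /dev/md2 b 9 2
    mknod /dev/md3 b 9 3

</PRE>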
</BLOCKQUOTE>


</LI>
<LI><B>Q</B>:
After creating a raid array on <CODE>/dev/md0</CODE>,
I try to mount it and get the following error:
<CODE> mount: wrong fs type, bad option, bad superblock on /dev/md0,
or too many mounted file systems</CODE>. What's wrong?
<BLOCKQUOTE>
<B>A</B>:
You need to create a file system on <CODE>/dev/md0</CODE>
before you can mount it.  Use <CODE>mke2fs</CODE>.
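<P>
For example, assuming an ext2 file system and a mount point of
<CODE>/mnt</CODE> (both chosen only for illustration).  Note that
<CODE>mke2fs</CODE> destroys any data already on the device:
<PRE>
    # create the file system on the running md device
    mke2fs /dev/md0

    # then mount it
    mount -t ext2 /dev/md0 /mnt

</PRE>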
</BLOCKQUOTE>


</LI>
<LI><B>Q</B>:
Truxton Fulton wrote:
<BLOCKQUOTE>
On my Linux 2.0.30 system, while doing a <CODE>mkraid</CODE> for a
RAID-1 device,
during the clearing of the two individual partitions, I got
<CODE>"Cannot allocate free page"</CODE> errors appearing on the console,
and <CODE>"Unable to handle kernel paging request at virtual address ..."</CODE>
errors in the system log.  At this time, the system became quite
unusable, but it appears to recover after a while.  The operation
appears to have completed with no other errors, and I am
successfully using my RAID-1 device.  The errors are disconcerting
though.  Any ideas?
</BLOCKQUOTE>


<BLOCKQUOTE>
<B>A</B>:
This was a well-known bug in the 2.0.30 kernels.  It is fixed in
the 2.0.31 kernel; alternately, fall back to 2.0.29.
</BLOCKQUOTE>

</LI>
<LI><B>Q</B>:
I'm not able to <CODE>mdrun</CODE> a RAID-1, RAID-4 or RAID-5 device.
If I try to <CODE>mdrun</CODE> an <CODE>mdadd</CODE>'ed device I get
the message ''<CODE>invalid raid superblock magic</CODE>''.

<BLOCKQUOTE>
<B>A</B>:
Make sure that you've run the <CODE>mkraid</CODE> part of the install
procedure.
</BLOCKQUOTE>

</LI>
<LI><B>Q</B>:
When I access <CODE>/dev/md0</CODE>, the kernel spits out a
lot of errors like <CODE>md0: device not running, giving up !</CODE>
and <CODE>I/O error...</CODE>. I've successfully added my devices to
the virtual device.

<BLOCKQUOTE>
<B>A</B>:
To be usable, the device must be running. Use
<CODE>mdrun -px /dev/md0</CODE> where x is l for linear, 0 for
RAID-0 or 1 for RAID-1, etc.
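<P>
As an illustration, a two-disk RAID-0 set might be brought up as
follows; the partition names <CODE>/dev/sda1</CODE> and
<CODE>/dev/sdb1</CODE> are placeholders for your own:
<PRE>
    # bind the partitions to the md device
    mdadd /dev/md0 /dev/sda1 /dev/sdb1

    # start the device with the RAID-0 personality
    mdrun -p0 /dev/md0

    # md0 should no longer be listed as inactive
    cat /proc/mdstat

</PRE>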
</BLOCKQUOTE>

</LI>
<LI><B>Q</B>:
I've created a linear md-dev with 2 devices.
<CODE>cat /proc/mdstat</CODE> shows
the total size of the device, but <CODE>df</CODE> only shows the size of the first
physical device.

<BLOCKQUOTE>
<B>A</B>:
You must <CODE>mkfs</CODE> your new md-dev before using it
the first time, so that the filesystem will cover the whole device.
</BLOCKQUOTE>

</LI>
<LI><B>Q</B>:
I've set up <CODE>/etc/mdtab</CODE> using mdcreate, I've
<CODE>mdadd</CODE>'ed, <CODE>mdrun</CODE> and <CODE>fsck</CODE>'ed
my two <CODE>/dev/mdX</CODE> partitions.  Everything looks
okay before a reboot.  As soon as I reboot, I get an
<CODE>fsck</CODE> error on both partitions: <CODE>fsck.ext2: Attempt
to read block from filesystem resulted in short read while trying to
open /dev/md0</CODE>.  Why?! How do
I fix it?!

<BLOCKQUOTE>
<B>A</B>:
During the boot process, the RAID partitions must be started
before they can be <CODE>fsck</CODE>'ed.  This must be done
in one of the boot scripts.  For some distributions,
<CODE>fsck</CODE> is called from <CODE>/etc/rc.d/rc.S</CODE>, for others,
it is called from <CODE>/etc/rc.d/rc.sysinit</CODE>. Change this
file to run <CODE>mdadd -ar</CODE> <EM>before</EM> <CODE>fsck -A</CODE>
is executed.  Better yet, run
<CODE>ckraid</CODE> if <CODE>mdadd</CODE> returns with an
error.  How to do this is discussed in greater detail in
question 14 of the section ''Error Recovery''.
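<P>
A sketch of such a boot-script fragment follows.  The
<CODE>ckraid --fix</CODE> call and the <CODE>/etc/raid1.conf</CODE>
file name are assumptions for a RAID-1 set; adapt them to your own
configuration:
<PRE>
    # add and run all md devices before any file system check
    mdadd -ar
    if [ $? -ne 0 ]; then
        # startup failed: try to repair the array, then start it again
        ckraid --fix /etc/raid1.conf
        mdadd -ar
    fi

    # only now check the file systems
    fsck -A

</PRE>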
</BLOCKQUOTE>

</LI>
<LI><B>Q</B>:
I get the message <CODE>invalid raid superblock magic</CODE> while
trying to run an array which consists of partitions which are
bigger than 4GB.

<BLOCKQUOTE>
<B>A</B>:
This bug is now fixed. (September 97)  Make sure you have the latest
raid code.
</BLOCKQUOTE>

</LI>
<LI><B>Q</B>:
I get the message <CODE>Warning: could not write 8 blocks in inode table starting at 2097175</CODE> while trying to run mke2fs on
a partition which is larger than 2GB.

<BLOCKQUOTE>
<B>A</B>:
This seems to be a problem with <CODE>mke2fs</CODE>
(November 97).  A temporary work-around is to get the mke2fs
code, and add <CODE>#undef HAVE_LLSEEK</CODE> to
<CODE>e2fsprogs-1.10/lib/ext2fs/llseek.c</CODE> just before the
first <CODE>#ifdef HAVE_LLSEEK</CODE> and recompile mke2fs.
</BLOCKQUOTE>

</LI>
<LI><B>Q</B>:
<CODE>ckraid</CODE> currently isn't able to read <CODE>/etc/mdtab</CODE>

<BLOCKQUOTE>
<B>A</B>:
The RAID0/linear configuration file format used in
<CODE>/etc/mdtab</CODE> is obsolete, although it will be supported
for a while longer.  The up-to-date config files
are named <CODE>/etc/raid1.conf</CODE>, etc.
</BLOCKQUOTE>

</LI>
<LI><B>Q</B>:
The personality modules (<CODE>raid1.o</CODE>) are not loaded automatically;
they have to be manually modprobe'd before mdrun. How can this
be fixed?

<BLOCKQUOTE>
<B>A</B>:
To autoload the modules, we can add the following to
<CODE>/etc/conf.modules</CODE>:
<PRE>
    alias md-personality-3 raid1
    alias md-personality-4 raid5

</PRE>

</BLOCKQUOTE>

</LI>
<LI><B>Q</B>:
I've <CODE>mdadd</CODE>'ed 13 devices, and now I'm trying to
<CODE>mdrun -p5 /dev/md0</CODE> and get the message:
<CODE>/dev/md0: Invalid argument</CODE>

<BLOCKQUOTE>
<B>A</B>:
The default configuration for software RAID allows at most 8 real
devices. Edit <CODE>linux/md.h</CODE>, change
<CODE>#define MAX_REAL 8</CODE> to a larger number, and
rebuild the kernel.
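<P>
A sketch of the steps, assuming the kernel tree is in
<CODE>/usr/src/linux</CODE> and the usual 2.0.x build targets; adjust
the image target and installation step to your own setup:
<PRE>
    cd /usr/src/linux

    # find the current limit
    grep -n MAX_REAL include/linux/md.h

    # after editing the value, rebuild the kernel and the modules
    make dep
    make clean
    make zImage
    make modules
    make modules_install

</PRE>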
</BLOCKQUOTE>

</LI>
<LI><B>Q</B>:
I can't make <CODE>md</CODE> work with partitions on our
latest SPARCstation 5.  I suspect that this has something
to do with disk-labels.

<BLOCKQUOTE>
<B>A</B>:
Sun disk-labels sit in the first 1K of a partition.
For RAID-1, the Sun disk-label is not an issue since
<CODE>ext2fs</CODE> will skip the label on every mirror.
For other raid levels (0, linear and 4/5), this
appears to be a problem; it has not yet (Dec 97) been
addressed.
</BLOCKQUOTE>

</LI>
</OL>
<HR>
<A HREF="Software-RAID-0.4x-HOWTO-6.html">Next</A>
<A HREF="Software-RAID-0.4x-HOWTO-4.html">Previous</A>
<A HREF="Software-RAID-0.4x-HOWTO.html#toc5">Contents</A>
</BODY>
</HTML>