<!--startcut ======================================================= -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head>
<META NAME="generator" CONTENT="lgazmail v1.4F.n">
<TITLE>The Answer Gang 76: Hard Disk: BadCRC errors from dma_intr on bootup...</TITLE>
</HEAD><BODY BGCOLOR="#FFFFFF" TEXT="#000000"
LINK="#3366FF" VLINK="#A000A0">
<!--endcut ========================================================= -->
<P> <hr>
<!--startcut ======================================================= -->
<CENTER>
<!-- *** BEGIN navbar *** -->
<!-- *** END navbar *** -->
</CENTER>
</p>
<!--endcut ========================================================= -->
<!--startcut ======================================================= -->
<P> <hr>
<!-- begin tagnav ::::::::::::::::::::::::::::::::::::::::::::::::::-->
<p align="center">
<table width="100%" border="0"><tr>
<td align="right" valign="center"
><IMG ALT="" SRC="../../gx/navbar/left.jpg"
WIDTH="14" HEIGHT="45" BORDER="0" ALIGN="middle" border="0"
><A HREF="..//"
><IMG SRC="../../gx/navbar/toc.jpg" align="middle"
ALT="[ Table Of Contents ]" border="0"></A
><A HREF="../lg_answer.html"
><IMG SRC="../../gx/dennis/answertoc.jpg" align="middle"
ALT="[ Answer Guy Current Index ]" border="0"></A></td>
<td align="center" valign="center"><A HREF="../lg_answer.html#greeting"><img align="middle"
src="../../gx/dennis/smily.gif" alt="greetings" border="0"></A> &nbsp;
<A HREF="bios.html">Meet&nbsp;the&nbsp;Gang</A> &nbsp;
<A HREF="1.html">1</A> &nbsp;
<A HREF="2.html">2</A> &nbsp;
<A HREF="3.html">3</A> &nbsp;
<A HREF="4.html">4</A> &nbsp;
<A HREF="5.html">5</A> &nbsp;
<A HREF="6.html">6</A> &nbsp;
<A HREF="7.html">7</A> &nbsp;
<A HREF="8.html">8</A> &nbsp;
<A HREF="9.html">9</A> &nbsp;
<A HREF="10.html">10</A> &nbsp;
<A HREF="11.html">11</A> &nbsp;
<A HREF="12.html">12</A>
</td>
<td align="left" valign="center"><A HREF="../../tag/kb.html"
><IMG SRC="../../gx/dennis/answerpast.jpg" align="middle"
ALT="[ Index of Past Answers ]" border="0"></A
><IMG ALT="" SRC="../../gx/navbar/right.jpg" align="middle"
WIDTH="14" HEIGHT="45" BORDER="0"></td></tr></table>
</p>
<!-- end tagnav ::::::::::::::::::::::::::::::::::::::::::::::::::::-->
<!--endcut ========================================================= -->
<P> <hr> <P>
<!-- ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: -->
<center>
<H1><A NAME="answer">
<img src="../../gx/dennis/qbubble.gif" alt="(?)"
border="0" align="middle">
<font color="#B03060">The Answer Gang</font>
<img src="../../gx/dennis/bbubble.gif" alt="(!)"
border="0" align="middle">
</A></H1>
<BR>
<H4>By Jim Dennis, Ben Okopnik, Dan Wilder, Breen, Chris, and...
(<a href="bios.html">meet the Gang</a>) ...
the Editors of Linux Gazette...
and You!
<br>Send questions (or interesting answers) to
The Answer Gang
for possible publication
(but read the <a href="ask-the-gang.html">guidelines</a> first)
</H4>
</center>
<!-- ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: -->
<p><hr><p>
<!-- begin 10 -->
<H3 align="left"><img src="../../gx/dennis/qbubble.gif"
height="50" width="60" alt="(?) " border="0"
>Hard Disk: BadCRC errors from dma_intr on bootup...</H3>
<p><strong>From Karthik Subramanian
</strong></p>
<p align="right"><strong>Answered By Jay R. Ashworth, Chris Gianakopoulos, Didier Heyden, Johan H
</strong></p>
<P><STRONG>
Before I start, many thanks for the good work
<IMG SRC="../../gx/dennis/smily.gif" ALT=":-)"
height="24" width="20" align="middle">
</STRONG></P>
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Jay]
We try.
</blockQuote>
<blockQuote>
Some of us are very trying, but you're expected to not notice.
<IMG SRC="../../gx/dennis/smily.gif" ALT=":-)"
height="24" width="20" align="middle">
</blockQuote>
<P><STRONG>
<IMG SRC="../../gx/dennis/qbub.gif" ALT="(?)"
HEIGHT="28" WIDTH="50" BORDER="0"
>
I have a Samsung SV2042H (20 GB) as my primary master, and an ATAPI CD-ROM
of unknown make as my primary slave.
</STRONG></P>
<P><STRONG>
I recently noticed the following messages on bootup:
(extract from my <TT>/var/log/boot.msg</TT>)
</STRONG></P>
<pre><strong>&lt;4&gt;Freeing unused kernel memory: 112k freed
&lt;4&gt;hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
&lt;4&gt;hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
&lt;4&gt;hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
&lt;4&gt;hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
&lt;4&gt;hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
&lt;4&gt;hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
&lt;4&gt;hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
&lt;4&gt;hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
&lt;4&gt;hdb: DMA disabled
&lt;4&gt;ide0: reset: success
</strong></pre>
<P><STRONG>
1) What do the dma_intr messages mean? Does my HDD go to the junk heap,
or is it possible for me to continue working with it? I have had no
problems with it so far, despite the error messages.
</STRONG></P>
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Jay]
I've been seeing something similar, with the same results, i.e. nothing.
</blockQuote>
<blockQuote>
I think the IDE drivers got changed...
</blockQuote>
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Didier]
The DMA interrupt handler in the IDE driver seems to detect a data
transfer failure (BadCRC) four times in a row. All drives present on
the corresponding IDE interface are then reset; in such a case (at least
if you run a 2.4.x kernel), (U)DMA is disabled on <EM>both</EM> drives, even
though you're told so only for your <TT>/dev/hdb</TT> CD-ROM (don't ask me why
the kernel people have chosen to do so - one would have expected the
faulty drive, hda, to be mentioned in a `DMA disabled' message as well
<IMG SRC="../../gx/dennis/smily.gif" ALT=":)"
height="24" width="20" align="middle">).
</blockQuote>
<blockQuote>
The fact that everything works fine (?) after the reset (no more awful
messages and your system <EM>does</EM> boot, obviously) is reassuring: whatever
your hard drive may be ready for, it is not (yet) ready to be
sold to your worst enemy
<IMG SRC="../../gx/dennis/smily.gif" ALT=";)"
height="24" width="20" align="middle"> Be careful, though...
</blockQuote>
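<blockQuote>
As a rough illustration of how the status and error bytes in the messages
above map to the names printed in braces, here is a minimal Python sketch.
The bit tables are an assumption based on the labels the IDE driver of that
era printed, and the function names (decode_status, decode_error) are made up
for this example; check your own kernel sources before relying on them.
</blockQuote>

```python
# Bit tables: an assumption reconstructed from the labels seen in kernel
# messages of that era, not copied from any particular kernel version.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

ERROR_BITS = [
    (0x04, "DriveStatusError"),
    (0x80, "BadCRC"),
    (0x40, "UncorrectableError"),
    (0x10, "SectorIdNotFound"),
    (0x02, "TrackZeroNotFound"),
    (0x01, "AddrMarkNotFound"),
]


def decode(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for mask, name in table if value & mask]


def decode_status(value):
    """Decode the byte shown after `status=` in a dma_intr message."""
    return decode(value, STATUS_BITS)


def decode_error(value):
    """Decode the byte shown after `error=` in a dma_intr message."""
    return decode(value, ERROR_BITS)


if __name__ == "__main__":
    # The two values from the log extract in the question.
    print("status=0x51", decode_status(0x51))
    print("error=0x84", decode_error(0x84))
```

<blockQuote>
Run as is, it prints the decoded names for the two values from the log
extract, status=0x51 and error=0x84.
</blockQuote>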
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Johan]
If DMA is enabled on a controller that is not well supported, these
errors can appear. (I had it on a VIA KT266A with kernel 2.2;
upgrading to kernel 2.4 fixed it beautifully.)
</blockQuote>
<blockQuote>
If you are sure that the IDE controller is supported, the drive is on
its way out. You can run fsck with the bad-block check turned on to
mark these blocks as bad... As a rule, once these errors start, we throw
the disk away (this is a high-availability production environment).
</blockQuote>
<blockQuote>
If you don't mind that the disk may crash in the near future, make a
backup and continue using it; it might work for a long time to come.
</blockQuote>
<blockQuote>
If the disk is under guarantee... take it back, it is not worth risking
data loss if the drive can be replaced for free.
</blockQuote>
<blockQuote>
This is how you hunt for bad blocks and mark them as unusable.
</blockQuote>
<blockquote><pre># e2fsck -c /dev/hda1
</pre></blockquote>
<blockQuote>
Make sure that you have a backup: badblocks scans can destroy data
when run with certain switches.
</blockQuote>
<blockquote><pre># man badblocks &amp;&amp; man e2fsck    # and read them carefully
</pre></blockquote>
<blockQuote>
To turn off DMA per drive
</blockQuote>
<blockquote><pre># hdparm -d0 /dev/hd[a-d]
</pre></blockquote>
<blockQuote>
To list the DMA settings
</blockQuote>
<blockquote><pre># hdparm -d /dev/hd[a-d]
</pre></blockquote>
<blockQuote>
To turn DMA on
</blockQuote>
<blockquote><pre># hdparm -d1 /dev/hd[a-d]
</pre></blockquote>
<blockQuote>
Where hd[a-d] is hda, hdb, hdc, hdd.
</blockQuote>
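<blockQuote>
If you would rather script these hdparm calls than type them, something like
the following Python sketch could build the command lines. Only the -d, -d0
and -d1 switches come from the commands above; the helper names
(hdparm_dma_args, all_ide_drives) are hypothetical, and the sketch only
prints the commands, since actually running them needs root and real
hardware.
</blockQuote>

```python
def hdparm_dma_args(device, enable=None):
    """Build the argument list for querying or setting DMA on one drive.

    enable=None  -> query the current setting (hdparm -d)
    enable=True  -> turn DMA on               (hdparm -d1)
    enable=False -> turn DMA off              (hdparm -d0)
    """
    if enable is None:
        flag = "-d"
    else:
        flag = "-d1" if enable else "-d0"
    return ["hdparm", flag, device]


def all_ide_drives():
    """Expand the hd[a-d] shorthand into explicit device paths."""
    return ["/dev/hd" + letter for letter in "abcd"]


if __name__ == "__main__":
    for dev in all_ide_drives():
        # Print only; pass the list to subprocess.run() yourself if you
        # really want to execute it.
        print(" ".join(hdparm_dma_args(dev)))
```

<blockQuote>
Keeping the arguments as a list rather than one string avoids shell quoting
surprises if the list is later handed to subprocess.run().
</blockQuote>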
<P><STRONG>
<IMG SRC="../../gx/dennis/qbub.gif" ALT="(?)"
HEIGHT="28" WIDTH="50" BORDER="0"
>
2) I didn't see any options to turn DMA off for the peripherals in my BIOS
options - so why/how is DMA being disabled for hdb? (I put an 'hdparm
-d1 <TT>/dev/hdb</TT>' in my <TT>/etc/boot.local</TT> to enable DMA for hdb.)
</STRONG></P>
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Didier]
You can pass an `ide=nodma' option to the boot loader to achieve this.
Note that in the present case you'd better remove the `hdparm' line
from your bootup script (-d1 is for forcing DMA on). Unfortunately I
don't think it can be done on a per-drive basis (nor even on a
per-interface basis).
</blockQuote>
<blockQuote>
To clarify, (U)DMA at kernel startup can only be <EM>globally</EM> disabled.
You will then have to fiddle with the hdparm utility
to change this for a given drive (at your own risk).
</blockQuote>
<blockQuote>
There doesn't seem to be any `hdx=nodma'
(x = 'a', 'b', 'c' or 'd') nor `idex=nodma' (x = '0' or '1') kernel
options available at present -- the so-called note has been inserted
at a wrong place
<IMG SRC="../../gx/dennis/smily.gif" ALT=":)"
height="24" width="20" align="middle">
</blockQuote>
<blockQuote>
Apart from this you could try setting your CDROM drive as master on the
IDE1 interface.
</blockQuote>
<P><STRONG>
<IMG SRC="../../gx/dennis/qbub.gif" ALT="(?)"
HEIGHT="28" WIDTH="50" BORDER="0"
>
3) What does the number 4 prepended to the messages in <TT>/var/log/boot.msg</TT>
(there are other numbers for the other messages) mean?
</STRONG></P>
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Jay]
You're running Mandrake, aren't you?
<IMG SRC="../../gx/dennis/smily.gif" ALT=":-)"
height="24" width="20" align="middle">
</blockQuote>
<blockQuote>
It's got something to do with the "debug level" that produces that
particular line of printk output, I believe.
</blockQuote>
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Didier]
This number is most probably the log level associated with the given kernel
message (&lt;4&gt; is usually the default value and corresponds to the
KERN_WARNING level). A log level of &lt;0&gt; is for emergency conditions
(system unusable) and &lt;7&gt; for debug messages.
</blockQuote>
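<blockQuote>
To make the numbering concrete, here is a small Python sketch that strips the
&lt;N&gt; prefix from a line and looks up the matching kernel log level name.
The eight level names are the standard KERN_* constants; the function name
split_level is just an illustrative choice.
</blockQuote>

```python
import re

# Kernel log levels, from most to least severe.
LOG_LEVELS = {
    0: "KERN_EMERG",
    1: "KERN_ALERT",
    2: "KERN_CRIT",
    3: "KERN_ERR",
    4: "KERN_WARNING",
    5: "KERN_NOTICE",
    6: "KERN_INFO",
    7: "KERN_DEBUG",
}

# Matches a leading "<N>" where N is a single digit from 0 to 7.
PREFIX = re.compile(r"^<([0-7])>")


def split_level(line):
    """Return (level_name, message); level_name is None without a prefix."""
    match = PREFIX.match(line)
    if match is None:
        return None, line
    return LOG_LEVELS[int(match.group(1))], line[match.end():]


if __name__ == "__main__":
    print(split_level("<4>ide0: reset: success"))
```

<blockQuote>
Lines without a prefix are passed through unchanged, so the function can be
applied to a whole log file without special-casing.
</blockQuote>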
<blockQuote>
<IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
> [Chris]
Hi there,
I believe that dma_intr means that a DMA interrupt occurred that is
associated with your hard disk controller. The status line reports an
error, and the error line blames it on a bad CRC. In other words, either your media (the
actual sectors of your hard disk platter) might be corrupt, or you might
have a problem with your cabling.
</blockQuote>
<blockQuote>
Before I trashed the drive, I would unplug and replug the IDE cable from
your disk controller AND your hard drive. Your disk controller might
reside on your motherboard, and in that case, you would unplug the cable
from the motherboard. You might also try a different IDE cable (the 40-pin
ribbon cable) between your disk and the disk controller.
</blockQuote>
<blockQuote>
I start to worry when I see the BadCRC error messages, because when that
happened to me, the hard disk eventually became useless. Make sure that
you back up any data that you want to keep.
</blockQuote>
<blockQuote>
I saw those error messages on my son's computer when I gave him one of my
hard drives that happened to be lying around. It was a 2 GB hard drive.
At first, the messages were an annoyance during boot up. As time passed,
we could not even get the system to boot up without running through fsck.
Finally, things got so bad that fsck couldn't fix the filesystems. The
drive is now on display, in parts, so that my son can show off the disk
platters to his friends.
</blockQuote>
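<blockQuote>
One way to notice whether the messages are getting more frequent, as they did
in the story above, is to count them per drive each time you look at the
boot log. The Python sketch below is only an assumption about how one might
do that: the regular expression is modelled on the log extract in the
question, and the name count_bad_crc is hypothetical.
</blockQuote>

```python
import re
from collections import Counter

# Matches e.g. "hda: dma_intr: error=0x84 { DriveStatusError BadCRC }"
BAD_CRC = re.compile(
    r"(hd[a-z]): dma_intr: error=0x[0-9a-fA-F]+ \{[^}]*BadCRC[^}]*\}"
)

# Shortened sample in the format of the boot.msg extract from the question.
SAMPLE = """\
<4>hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
<4>hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
<4>hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
<4>hda: dma_intr: error=0x84 { DriveStatusError BadCRC }
<4>hdb: DMA disabled
<4>ide0: reset: success
"""


def count_bad_crc(log_text):
    """Count BadCRC error lines per drive name in the given log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = BAD_CRC.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for drive, count in sorted(count_bad_crc(SAMPLE).items()):
        print(drive, count)
```

<blockQuote>
To use it on a real log, read the file's contents into a string and pass
that to count_bad_crc() instead of the sample.
</blockQuote>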
<blockQuote>
Good luck, and don't forget to back up your data.
</blockQuote>
<!-- end 10 -->
<P> <hr> </p>
<!-- *** BEGIN copyright *** -->
<H5 align="center">This page edited and maintained by the Editors
of <I>Linux Gazette</I>
<a href="http://www.linuxgazette.com/copying.html"
>Copyright &copy;</a> 2002
<BR>Published in issue 76 of <I>Linux Gazette</I> March 2002</H5>
<H6 ALIGN="center">HTML script maintained by
<A HREF="mailto:star@starshine.org">Heather Stern</a> of
Starshine Technical Services,
<A HREF="http://www.starshine.org/">http://www.starshine.org/</A>
</H6>
<!-- *** END copyright *** -->
<!--startcut ======================================================= -->
</BODY></HTML>
<!--endcut ========================================================= -->