<!--startcut ======================================================= -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head>
<META NAME="generator" CONTENT="lgazmail v1.1pre6">
<TITLE>The Answer Guy 29: Kernel crashes </TITLE>
</head>

<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#A000A0"
ALINK="#FF0000">
<!--endcut ========================================================= -->
<H4>"Linux Gazette...<I>making Linux just a little more fun!</I>"
</H4>
<P> <hr> <P>

<!-- =============================================================== -->
<H1 align="center"><A NAME="answer">
<img src="../gx/dennis/qbubble.gif" alt="" border="0" align="middle">
<a href="./index.html">The Answer Guy</a>
<img src="../gx/dennis/bbubble.gif" alt="" border="0" align="middle">
</A></H1> <BR>
<H4 align="center">By James T. Dennis,
<a href="mailto:linux-questions-only@ssc.com">linux-questions-only@ssc.com</a><BR>
Starshine Technical Services,
<A HREF="http://www.starshine.org/">http://www.starshine.org/</A> </H4>
<p><hr><p>
<H3><img src="../gx/dennis/qbub.gif" alt="(?)" width="50" height="28"
align="left" border="0">Kernel crashes </H3>

<p><strong>From David W. MacDougall on 18 May 1998

<br><br>
Hello Answer Guy,

<br><br>
Since December, I have tried posting this problem to
newsgroups and the responses I received were not
helpful.

<br><br>
I am running Red Hat 5.0 on a Pentium 233 processor with 128mb of EDO RAM,
<a href="http://www.award.com/nofrhome.htm">Award</a> BIOS, Award PnP BIOS,
an <a href="http://www.adaptec.com/">Adaptec</a> 2940 Host Adapter,
and a <a href="http://www.quantum.com/">Quantum</a> SCSI 3.2gig hard drive.
One of the boot up messages I get from the BIOS is "Checking DMI pool data."

<br><br>
I have tried running Slackware and Red Hat, both
with kernel version 2.0.31 and 2.0.31. No matter what I
do, it seems the kernel will "see" only about 14 mb of
RAM (see meminfo file below). The first advice I got
was to use an append command in LILO "<tt>mem=128mb</tt>" and
several variations thereof. I tried entering that
command every possible way, and though it did increase
the amount of RAM available, the system would crash
almost immediately, or as soon as I tried to access a
disk drive. I would get a kernel panic message, then
some messages about the SCSI hosts and then a complete
freeze-up.

</strong></p>
<blockquote><img src="../gx/dennis/bbub.gif" width="50" height="28" alt="(!)"
align="left" border="0">
First, that append directive should be:

<blockquote><code>
append="mem=128M"

</code></blockquote>
Note that the quotes are required to assign this value to
the append directive because of the "<tt>=</tt>" sign that
is contained within the value. Also note that this
value should be written with a capital letter "<tt>M</tt>"
and not as "<tt>mb</tt>" (I'm hoping that your question has
a typo that's not reflected in your actual <tt>lilo.conf</tt>
file). You could also specify the size in hexadecimal
(precede it with "<tt>0x</tt>") and I suspect that you could
specify it as something like <tt>mem=134217728</tt> (which is
128M in decimal). However, I've never seen anyone
do that.
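
<br><br>
To show that the three spellings mentioned above describe the same amount of
memory, here is a minimal sketch in Python. It is not the kernel's or lilo's
own parser; <tt>mem_to_bytes</tt> is a hypothetical helper, and I am assuming
it only needs to handle a trailing <tt>K</tt> or <tt>M</tt>, a <tt>0x</tt>
prefix, or a plain decimal number.

<pre>
```python
# Hypothetical helper, not kernel or lilo code: converts a mem= value to bytes.
# Assumption: only a trailing capital K or M, a 0x prefix, or plain decimal
# digits are accepted, matching the forms discussed in the text.

def mem_to_bytes(value):
    """Return the number of bytes described by a mem= value such as '128M'."""
    text = value.strip()
    multipliers = {"K": 1024, "M": 1024 * 1024}
    suffix = text[-1:]
    if suffix in multipliers:
        number, scale = text[:-1], multipliers[suffix]
    else:
        number, scale = text, 1
    # int(..., 0) accepts both decimal and 0x-prefixed hexadecimal digits.
    return int(number, 0) * scale


if __name__ == "__main__":
    for form in ("128M", "0x8000000", "134217728"):
        print(form, mem_to_bytes(form))
```
</pre>

All three forms come out to 134217728 bytes, while a value such as
<tt>128mb</tt> is rejected by this sketch with a <tt>ValueError</tt>.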

<br><br>
Note also: You can pass a parameter such as <tt>mem=128M</tt>
to your kernel by simply typing it on the <tt>lilo</tt>
command line during boot. In other words, when you
pause your system at the '<tt>lilo</tt>' prompt during the
boot cycle you can manually "append" kernel parameters
as you select which kernel/lilo image to load.

<br><br>
For example, most of my lilo configurations have <tt>org</tt>
("original"), <tt>old</tt>, <tt>cur</tt> ("current"), and <tt>new</tt>
stanzas (images). So I could temporarily limit this machine
(canopus) to use only 32 of its 64MB of RAM by typing something like:

<blockquote><code>
new mem=32M
</code></blockquote>

... at the lilo boot prompt. A line like:

<blockquote><code>
old root=/dev/sdb1 single mem=8M
</code></blockquote>

... would run my "old" kernel, mounting the first partition
on my second SCSI disk as the root filesystem, and limiting the
kernel to only use eight megabytes of RAM (and I'd better hope
that the rc scripts on <tt>/dev/sdb1</tt> give me a bit of swap space
or those first few minutes will be painful with only eight
MB). You can also pass these parameters to your kernel via
<tt>LOADLIN.EXE</tt>.

<br><br>
You can learn all about the other boot prompt parameters
in the BootPrompt-HOWTO at any LDP mirror:
<br><a href="http://www.linuxresources.com/LDP/HOWTO/BootPrompt-HOWTO.html"
>http://www.linuxresources.com/LDP/HOWTO/BootPrompt-HOWTO.html</a>

<br><br>
If your actual attempts have been in the correct syntax
(and the error is just in your question) then it sounds
suspiciously like a memory hole or a chunk of address space
being used as a framebuffer (like a weird video card).

<br><br>
In the early 386 days (many years and three or
four generations of processor ago) there were some
systems that had 'top memory' or a 'memory hole'
--- a chunk of address space used by the motherboard's
chipset at about the 16MB line (since <strong>nobody</strong> would
have more than 16MB of RAM <img src="../gx/dennis/smily.gif"
alt="<g>" width="20" height="24" align="absmiddle">).
Back in those days I was on the Quarterdeck tech support team (Linux
didn't exist yet) and we used to use the QEMM "<tt>notop</tt>"
switch (if I remember it correctly) to work around the problems
when this oddity wasn't automatically detected by QEMM.

<br><br>
However, I don't think any Pentium or PII motherboard
would have such a problem (it would <strong>definitely</strong> be
considered a design flaw these days).

<br><br>
Are you saying that the kernel seems to be stable
when you run it with only 14 or 15MB of RAM --- but
crashes soon after you force your kernel to use
the rest? Have you tried setting the append line
to 64MB (or just manually passing a <tt>mem=64M</tt> parameter
to your kernel)?

<br><br>
You basically want to narrow down exactly where the
crash is occurring. Once you think you have it ---
it's worth taking out your RAM DIMMs (SIMMs or
whatever they are in your case) and swapping them
(put all the DIMMs from one bank into the other
slots and vice versa). If the crash "moves" with
the chips then replace the RAM modules (the things
might even work in another box --- since you can
sometimes see some inexplicable "timing" glitches
that amount to: "this board doesn't 'like' those
chips").

<br><br>
(Sounds scientific, doesn't it!).

<br><br>
Other than that you might consider removing all the
cards in the system --- putting in the cheapest,
plainest video card and IDE/multi-function card and
drive you can find and testing it with that configuration.
(Then you re-introduce your preferred adapters until
the problems recur. In this way you isolate the
problem to a specific hardware or software component.)

</blockquote>
<p><strong><img src="../gx/dennis/qbub.gif" width="50" height="28" alt="(?)"
align="left" border="0">
My question is this: Where is my problem? Is there
something inherent in the kernel that won't let it see
any more than 14mb of RAM on this system without my
having to add append statements to LILO? And why does
the kernel panic once it does see more memory? I have
tried adjusting the amount of memory stipulated in the
command line to lower levels, tried entering it in HEX,
and in bytes. The results are always the same.

</strong></p>
<blockquote><img src="../gx/dennis/bbub.gif" width="50" height="28" alt="(!)"
align="left" border="0">
There is definitely nothing that limits Linux to
14MB of RAM. My oldest system ('<em>antares</em>' --- the
old 33MHz 386 that handles my <tt>uucp</tt> mail, INN netnews,
fax, dial-in and dial-out modem, is the household
web and POP server, and is the backup masquerading
router when my ISDN goes out) has used 32MB of RAM
(auto-detected) for several years --- from Linux 0.99p10
through my current 2.0.33. That machine is about
a decade old now. '<em>Betelgeuse</em>' and '<em>Canopus</em>' each
have 64MB. I've managed and configured systems with
128MB and more (although most PC hardware tops out at
128 or 256MB).

<br><br>
Most systems will need the <tt>mem=</tt> parameter if they have
more than 64MB since there is no standard way to
detect more than that. The 2.1/2.2 kernels may auto-detect
large memory configurations on <strong>some</strong> systems, but
I've heard that some of the methods for probing for this
sort of RAM can lock up some other systems (something that
prevents users from installing NT on some of those "unsupported"
systems on the Microsoft "bad boy" list --- or so I've heard).
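
<br><br>
Whichever way the memory gets detected, it helps to check what the running
kernel actually ended up with. As a rough sketch (in Python, with
<tt>mem_total_kb</tt> being a hypothetical helper name), this pulls the
<tt>MemTotal</tt> figure out of text in the format of the meminfo listing at
the end of your message. I'm assuming the line looks like
<tt>MemTotal: NNN kB</tt>, as it does there.

<pre>
```python
# Sketch: report how much RAM the kernel detected, from meminfo-style text.
# Assumption: the text has a "MemTotal: NNN kB" line like the listing quoted
# at the end of this message; mem_total_kb is a hypothetical helper name.

def mem_total_kb(meminfo_text):
    """Return the MemTotal value in kB, or None if the line is absent."""
    for line in meminfo_text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[0] == "MemTotal:" and fields[2] == "kB":
            return int(fields[1])
    return None


if __name__ == "__main__":
    # Two lines copied from the reader's listing, used here as sample input.
    sample = "MemTotal: 13676 kB\nMemFree: 204 kB\n"
    print("kernel sees", mem_total_kb(sample), "kB")
```
</pre>

On a live system you would pass it the contents of <tt>/proc/meminfo</tt>
instead of the sample string, and compare the result against the
<tt>mem=</tt> value you booted with.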

</blockquote>
<p><strong><img src="../gx/dennis/qbub.gif" width="50" height="28" alt="(?)"
align="left" border="0">
Or is my problem more with my Adaptec host adapter?
Why will it work just fine with only 14mb of RAM but
fall to pieces with more? (I also have an SCSI CD-ROM
drive and a second 3.2 SCSI Quantum hard drive (which I
use for Windows 95 in this dual-boot system).)

</strong></p>
<blockquote><img src="../gx/dennis/bbub.gif" width="50" height="28" alt="(!)"
align="left" border="0">
As I've said, I don't know. It could be any
component of the system. However, I've used Adaptec
2940s in 128MB systems with Linux and
<a href="http://www.freebsd.org/">FreeBSD</a>.
So that, by itself, is not likely to be the problem.

<br><br>
It's also almost inconceivable that the model of hard drive would
affect this situation. Basically any SCSI hard drive will usually
work on any <strong>supported</strong> SCSI controller without
crashing the kernel. (A couple of SCSI devices can fight with one
another and crash the SCSI bus --- particularly if you mix
"differential" devices with others. However, you'd get kernel
messages that would clearly indicate the subsystem involved --- and
it would be independent of the memory layout (in every case
I can think of)).

</blockquote>
<p><strong><img src="../gx/dennis/qbub.gif" width="50" height="28" alt="(?)"
align="left" border="0">
I am reluctant to keep fiddling with the append parameters because every
time the system crashes, it eats a hole or two in the filesystem. A
Linux-knowledgeable friend suggested I might want to try upgrading to the
<a href="http://www.kernel.org/pub/linux/kernel/v2.1/">development kernel</a>
(2.1.??) to see if that would cure the problem. He also suggested I write to
Linus Torvalds, but I thought I should try you first.

</strong></p>
<blockquote><img src="../gx/dennis/bbub.gif" width="50" height="28" alt="(!)"
align="left" border="0">
I would definitely not bother Linus Torvalds directly with a problem
of this sort. If I could isolate it to a specific module or block
of code I <strong>might</strong> --- but I'd probably just post it
to the <a href="http://linuxwww.db.erau.edu/mail_archives/linux-kernel/">Linux Kernel mailing list</a> in <strong>any</strong> event.
Linus is quite active on that list
--- and it would simply be rude to request his personal
attention to something that any kernel developer might
be able to handle.

</blockquote>
<p><strong><img src="../gx/dennis/qbub.gif" width="50" height="28" alt="(?)"
align="left" border="0">
I feel sad sitting here with the world's greatest
OS and all this RAM I can't use. Any guidance or
direction you could give me would be greatly
appreciated!

</strong></p>
<blockquote><img src="../gx/dennis/bbub.gif" width="50" height="28" alt="(!)"
align="left" border="0">
Again, double and triple check the syntax of your
<tt>mem=</tt> parameter. Then try different values --- 64MB,
96MB, etc. --- to isolate the specific limit. Read the
<a href="http://sunsite.unc.edu/LDP/HOWTO/BootPrompt-HOWTO.html"
>BootPrompt-HOWTO</a>. Try taking out that Matrox Mystique
and using just a plain VGA card for testing. Disable
all "Plug-n-Pray" features on your motherboard.
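
<br><br>
Since every crash costs you a bit of filesystem, it's worth picking the trial
values so that you need as few reboots as possible. One way to do that is to
bisect between a size you know is stable and one you know crashes. Here is a
minimal sketch of that idea in Python; <tt>largest_stable_mb</tt> and its
<tt>boots_ok</tt> callback are hypothetical names, and in real life the
"callback" is you, rebooting with the suggested <tt>mem=</tt> value and
noting whether the machine stays up.

<pre>
```python
# Sketch: bisect mem= sizes to find the largest one that stays stable.
# boots_ok is a hypothetical callback standing in for a manual reboot test;
# it should return True if the system was stable with mem set to that size.

def largest_stable_mb(boots_ok, good_mb, bad_mb, step_mb=1):
    """Narrow the gap between a known-good and a known-bad size, in MB."""
    while bad_mb - good_mb > step_mb:
        middle = (good_mb + bad_mb) // 2
        if boots_ok(middle):
            good_mb = middle
        else:
            bad_mb = middle
    return good_mb


if __name__ == "__main__":
    # Simulated machine that fails above 80MB (an invented figure for the demo).
    print(largest_stable_mb(lambda mb: 80 >= mb, good_mb=14, bad_mb=128))
```
</pre>

With the simulated limit this prints 80 after seven trials, which is roughly
what the range from 14MB to 128MB would cost you in reboots.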

<br><br>
The first steps in all troubleshooting are to precisely
describe and isolate the problem. In the worst case
you might have a bad motherboard or some bad memory chips.

</blockquote>
<p><strong><img src="../gx/dennis/qbub.gif" width="50" height="28" alt="(?)"
align="left" border="0">
Thank you,
<br>Dave
</strong></p>
<pre>
--------------
/proc/meminfo
--------------

        total:    used:    free:  shared: buffers:  cached:
Mem:  14004224 13795328   208896  8200192   180224  4214784
Swap: 41119744 17133568 23986176
MemTotal:     13676 kB
MemFree:        204 kB
MemShared:     8008 kB
Buffers:        176 kB
Cached:        4116 kB
SwapTotal:    40156 kB
SwapFree:     23424 kB


--------------------
/proc/cpuinfo
--------------------

processor       : 0
cpu             : 586
model           : 4
vendor_id       : GenuineIntel
stepping        : 3
fdiv_bug        : no
hlt_bug         : no
fpu             : yes
fpu_exception   : yes
cpuid           : yes
wp              : yes
flags           : fpu vme de pse tsc msr mce cx8 mmx
bogomips        : 348.16


----------
/proc/scsi
----------

Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: QUANTUM  Model: FIREBALL_TM3200S Rev: 300X
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 01 Lun: 00
  Vendor: QUANTUM  Model: FIREBALL ST3.2S  Rev: 0F0C
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 06 Lun: 00
  Vendor: MATSHITA Model: CD-ROM CR-508    Rev: XS03
  Type:   CD-ROM                           ANSI SCSI revision: 02


---------------
/proc/pci
---------------

PCI devices found:
  Bus 0, device 19, function 0:
    VGA compatible controller: Matrox Mystique (rev 3).
      Medium devsel. Fast back-to-back capable. IRQ 10. Master Capable. Latency=32.
      Prefetchable 32 bit memory at 0xe0000000.
      Non-prefetchable 32 bit memory at 0xe1000000.
      Non-prefetchable 32 bit memory at 0xe0800000.
  Bus 0, device 18, function 0:
    SCSI storage controller: Adaptec AIC-7881U (rev 0).
      Medium devsel. Fast back-to-back capable. IRQ 11. Master Capable. Latency=32. Min Gnt=8.Max Lat=8.
      I/O at 0x6000.
      Non-prefetchable 32 bit memory at 0xe1004000.
  Bus 0, device 7, function 1:
    IDE interface: Intel 82371SB Natoma/Triton II PIIX3 (rev 0).
      Medium devsel. Fast back-to-back capable. Master Capable. Latency=32.
      I/O at 0xf000.
  Bus 0, device 7, function 0:
    ISA bridge: Intel 82371SB Natoma/Triton II PIIX3 (rev 1).
      Medium devsel. Fast back-to-back capable. Master Capable. No bursts.
  Bus 0, device 0, function 0:
    Host bridge: Intel 82437VX Triton II (rev 2).
      Medium devsel. Master Capable. Latency=32.

</pre>

<!--================================================================-->
<P> <hr> <P>
<H5 align="center"><a href="http://www.linuxgazette.com/copying.html"
>Copyright ©</a> 1998, James T. Dennis <BR>
Published in <I>Linux Gazette</I> Issue 29 June 1998</H5>
<P> <hr>
<!--================================================================-->
<p align="center"><table width="95%"><tr align="center">
<td rowspan="4"><A HREF="lg_answer29.html"><IMG
SRC="../gx/dennis/answernew.gif"
ALT="[ Answer Guy Index ]"
align="left"></A></td>
</tr><tr align="center">

<!-- begins -->
<td><A HREF="tag_versions.html">versions</A></td>
<td><A HREF="tag_lilo.html">lilo</A></td>
<td><A HREF="tag_virtdom.html">virtdom</a></td>
<td><A HREF="tag_kernel.html">kernel</A></td>
<td><A HREF="tag_winmodem.html">winmodem</a></td>
<td><A HREF="tag_basicmail.html">basicmail</a></td>
<td><A HREF="tag_betterbak.html">betterbak</a></td>
</tr><tr align="center">

<td><A HREF="tag_shadow.html">shadow</a></td>
<td><A HREF="tag_dell.html">dell</a></td>
<td><A HREF="tag_dumbterm.html">dumbterm</a></td>
<td><A HREF="tag_whylinux.html">whylinux</a></td>
<td><A HREF="tag_redhat.html">redhat</a></td>
<td><A HREF="tag_netcard.html">netcard</a></td>
<td><A HREF="tag_macrovir.html">macrovir</a></td>
</tr><tr align="center">

<td><A HREF="tag_newlook.html">newlook</a></td>
<td><A HREF="tag_tacacs.html">tacacs</a></td>
<td><A HREF="tag_sendmail.html">sendmail</a></td>
<td><A HREF="tag_dialdppp.html">dialdppp</a></td>
<td><A HREF="tag_ppp233.html">ppp233</a></td>
<td><A HREF="tag_msmail.html">msmail</a></td>
<td><A HREF="tag_procmail.html">procmail</a></td>
<!-- ends -->
</tr></table>

</P> <hr> <P>
<!--================================================================-->
<A HREF="./index.html"><IMG SRC="../gx/indexnew.gif"
ALT="[ Table Of Contents ]"></A>
<A HREF="../index.html"><IMG SRC="../gx/homenew.gif"
ALT="[ Front Page ]"></A>
<A HREF="lg_bytes29.html"><IMG SRC="../gx/back2.gif"
ALT="[ Previous Section ]"></A>
<A HREF="./hamilton.html"><IMG SRC="../gx/fwd.gif"
ALT="[ Next Section ]"></A>
<!--startcut ======================================================= -->
</body>
</html>
<!--endcut ========================================================= -->