old-www/LDP/LG/issue78/tag/2.html

206 lines
8.9 KiB
HTML

<!--startcut ======================================================= -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head>
<META NAME="generator" CONTENT="lgazmail v1.4F.r">
<TITLE>The Answer Gang 78: Watchdog daemon</TITLE>
</HEAD><BODY BGCOLOR="#FFFFFF" TEXT="#000000"
LINK="#3366FF" VLINK="#A000A0">
<!--endcut ========================================================= -->
<P> <hr>
<!--startcut ======================================================= -->
<CENTER>
<!-- *** BEGIN navbar *** -->
<!-- *** END navbar *** -->
</CENTER>
</p>
<!--endcut ========================================================= -->
<!--startcut ======================================================= -->
<P> <hr>
<!-- begin tagnav ::::::::::::::::::::::::::::::::::::::::::::::::::-->
<p align="center">
<table width="100%" border="0"><tr>
<td align="right" valign="center"
><IMG ALT="" SRC="../../gx/navbar/left.jpg"
WIDTH="14" HEIGHT="45" BORDER="0" ALIGN="middle" border="0"
><A HREF="../index.html"
><IMG SRC="../../gx/navbar/toc.jpg" align="middle"
ALT="[ Table Of Contents ]" border="0"></A
><A HREF="../lg_answer.html"
><IMG SRC="../../gx/dennis/answertoc.jpg" align="middle"
ALT="[ Answer Guy Current Index ]" border="0"></A></td>
<td align="center" valign="center"><A HREF="../lg_answer.html#greeting"><img align="middle"
src="../../gx/dennis/smily.gif" alt="greetings" border="0"></A> &nbsp;
<A HREF="../tag/bios.html">Meet&nbsp;the&nbsp;Gang</A> &nbsp;
<A HREF="1.html">1</A> &nbsp;
<A HREF="2.html">2</A> &nbsp;
<A HREF="3.html">3</A> &nbsp;
<A HREF="4.html">4</A> &nbsp;
<A HREF="5.html">5</A> &nbsp;
<A HREF="6.html">6</A> &nbsp;
<A HREF="7.html">7</A>
</td>
<td align="left" valign="center"><A HREF="../../tag/kb.html"
><IMG SRC="../../gx/dennis/answerpast.jpg" align="middle"
ALT="[ Index of Past Answers ]" border="0"></A
><IMG ALT="" SRC="../../gx/navbar/right.jpg" align="middle"
WIDTH="14" HEIGHT="45" BORDER="0"></td></tr></table>
</p>
<!-- end tagnav ::::::::::::::::::::::::::::::::::::::::::::::::::::-->
<!--endcut ========================================================= -->
<P> <hr> <P>
<!-- ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: -->
<center>
<H1><A NAME="answer">
<img src="../../gx/dennis/qbubble.gif" alt="(?)"
border="0" align="middle">
<font color="#B03060">The Answer Gang</font>
<img src="../../gx/dennis/bbubble.gif" alt="(!)"
border="0" align="middle">
</A></H1>
<BR>
<H4>By Jim Dennis, Ben Okopnik, Dan Wilder, Breen, Chris, and...
(<a href="tag/bios.html">meet the Gang</a>) ...
the Editors of Linux Gazette...
and You!
<br>Send questions (or interesting answers) to
The Answer Gang
for possible publication
(but read the <a href="../tag/ask-the-gang.html">guidelines</a> first)
</H4>
</center>
<!-- ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: -->
<p><hr><p>
<!-- begin 2 -->
<H3 align="left"><img src="../../gx/dennis/bbubble.gif"
height="50" width="60" alt="(!) " border="0"
>Watchdog daemon</H3>
<p align="right"><strong>Answer By James T. Dennis
<p></strong></p>
<blockQuote>
The Linux kernel supports a class of devices called "watchdog"
drivers. These are programmable timers which are wired to a system's
reset or power lines. They are common on non-PC servers and workstations
and in embedded devices and are increasing included in PC PCI chipsets.
There are also PC adapter cards that can function as watchdog timers,
some of them are included in adapters with other functions (such as the
PC Weasel 2000, or some high precision real-time clocks?) and some of
them have electronics to monitor CPU or case temperature, power supply
voltages, etc.
</blockQuote>
<blockQuote>
These all have one function in common, they can be set to some time
interval (60 seconds by default, under Linux) and will count down
towards zero. If they ever reach zero they'll strobe the reset line
and force the hardware to reboot. Thus the require period "petting"
or they'll "bite" you.
</blockQuote>
<blockQuote>
The Linux kernel supports a variety of watchdog hardware, and also
includes one which is a software emulation of what a watchdog timer
does. (Those are a bit less robust since some forms of kernel panic
or failure <EM>might</EM> leave the system wedged and unable to execute the
softdog code). (The Linux kernel can be set to reset after a time delay
in case of panic --- the default is to dump a message and registers to the
the console and wait for a human to read them and reboot. Read the
bootparam(7) man pages and search for panic= for details on how to
over-ride that).
</blockQuote>
<blockQuote>
All of this is of no use unless you also have a daemon or utility that
can set the watchdog, monitor the system, and periodically "pet the
dog." (Some texts on this topic use the more abusive "kicking" analogy
--- but I find that distasteful).
</blockQuote>
<blockQuote>
Of course one can write one's own daemon, or even a cron job (if one
over-rode the default 60 second value to be a bit longer, to account for
possible cron delays). However, it's best to start with one that's
already written and reasonably well proven. The <A HREF="http://www.debian.org/">Debian</A> project has one
that's simply called "watchdog." Although it is a Debian package it can
be adapted for use on any Linux distribution.
</blockQuote>
<blockQuote>
This particular daemon performs up to 10 internal system tests
(most are optional) and it can be configured to execute a custom suite
of tests --- your own script or binary which must return a zero exit
value on success (and should run in under some liberal time limit).
In other words, it's extensible. On failure it can attempt to execute
a custom "repair" script or binary, then it can try a soft reboot
(with statically compile code -- NOT by calling the normal 'shutdown'
or 'reboot' binaries). Failing all of that, it will simply fail to
write to the <TT>/dev/watchdog</TT> which will cause the kernel to fail to
"pet the dog" (hardware) or cause the kernel to reboot (softdog).
</blockQuote>
<blockQuote>
In (almost) any event a system failure should result in a reboot
instead of a hang. That can be good for systems that are remotely
located and hard to get reach. Of course Linux is pretty robust and
reliable: so it's rare that the watchdog will be needed; and of course
it <EM>could</EM> be that the watchdog will cause some spurious reboots,
sometimes --- especially when initially configuring and tuning it.
But there are cases where it's worth the risk and effort.
</blockQuote>
<!-- end 2 -->
<P> <hr> </p>
<!-- *** BEGIN copyright *** -->
<H5 align="center">This page edited and maintained by the Editors
of <I>Linux Gazette</I>
<a href=""
>Copyright &copy;</a> 2002
<BR>Published in issue 78 of <I>Linux Gazette</I> May 2002</H5>
<H6 ALIGN="center">HTML script maintained by
<A HREF="mailto:star@starshine.org">Heather Stern</a> of
Starshine Technical Services,
<A HREF="http://www.starshine.org/">http://www.starshine.org/</A>
</H6>
<!-- *** END copyright *** -->
<!--startcut ======================================================= -->
<P> <hr>
<!-- begin tagnav ::::::::::::::::::::::::::::::::::::::::::::::::::-->
<p align="center">
<table width="100%" border="0"><tr>
<td align="right" valign="center"
><IMG ALT="" SRC="../../gx/navbar/left.jpg"
WIDTH="14" HEIGHT="45" BORDER="0" ALIGN="middle" border="0"
><A HREF="../index.html"
><IMG SRC="../../gx/navbar/toc.jpg" align="middle"
ALT="[ Table Of Contents ]" border="0"></A
><A HREF="../lg_answer.html"
><IMG SRC="../../gx/dennis/answertoc.jpg" align="middle"
ALT="[ Answer Guy Current Index ]" border="0"></A></td>
<td align="center" valign="center"><A HREF="../lg_answer.html#greeting"><img align="middle"
src="../../gx/dennis/smily.gif" alt="greetings" border="0"></A> &nbsp;
<A HREF="../tag/bios.html">Meet&nbsp;the&nbsp;Gang</A> &nbsp;
<A HREF="1.html">1</A> &nbsp;
<A HREF="2.html">2</A> &nbsp;
<A HREF="3.html">3</A> &nbsp;
<A HREF="4.html">4</A> &nbsp;
<A HREF="5.html">5</A> &nbsp;
<A HREF="6.html">6</A> &nbsp;
<A HREF="7.html">7</A>
</td>
<td align="left" valign="center"><A HREF="../../tag/kb.html"
><IMG SRC="../../gx/dennis/answerpast.jpg" align="middle"
ALT="[ Index of Past Answers ]" border="0"></A
><IMG ALT="" SRC="../../gx/navbar/right.jpg" align="middle"
WIDTH="14" HEIGHT="45" BORDER="0"></td></tr></table>
</p>
<!-- end tagnav ::::::::::::::::::::::::::::::::::::::::::::::::::::-->
<!--endcut ========================================================= -->
<P> <hr>
<!--startcut ======================================================= -->
<CENTER>
<!-- *** BEGIN navbar *** -->
<!-- *** END navbar *** -->
</CENTER>
</p>
<!--endcut ========================================================= -->
<!--startcut ======================================================= -->
</BODY></HTML>
<!--endcut ========================================================= -->