
<section><title>Monitoring and administration</title>
<para>
Once the Usenet News system is in place and running, the news administrator
is aided in monitoring it by various reports that the system generates. The
administrator also needs to make regular checks of specific directories and
files to ascertain that the system is working smoothly.
</para>

<section><title>The <literal>newsdaily</literal> report</title>
<para>
This report is generated by <literal>newsdaily</literal>, which is typically
run through cron. I shall enumerate some of the problems it reports, based on
what I have seen.
</para>

<itemizedlist>
<listitem><para>bad input batches:
    This reports articles that have been processed and
    declared bad and hence not digested. The reason for it is not mentioned. You
    are expected to check the article and determine the cause.
</para></listitem>

<listitem><para>leading unknown newsgroups by articles:
    This gives a list of newsgroups
    whose hierarchy has been subscribed to, but which do not themselves
    appear in the active file. You could add such a newsgroup to the active
    file if you think it is important enough.
</para></listitem>

<listitem><para>leading unsubscribed newsgroups:
    This gives a list of newsgroups
    that have not been subscribed to, but for which the news server receives
    the largest number of articles. You really cannot do much about this
    except to subscribe to them if they are required.
</para></listitem>

<listitem><para>leading sites sending bad headers:
    This will list your NDNs who
    are sending articles with malformed/insufficient headers.
</para></listitem>

<listitem><para>leading sites sending stale/future/misdated news:
    This will list your NDNs who are sending you articles that are older than
    the date you have specified for accepting feeds, or that carry dates in
    the future or dates that are otherwise wrong.
</para></listitem>

<listitem><para>Some of the reports generated by us:
    We have modified the newsdaily script to include some more statistics.
    <itemizedlist>
    <listitem><para>disk usage:
	This reports the size in bytes of the <literal>$NEWSARTS</literal>
	area. If you are receiving feeds regularly, you should see this figure
	increasing.
    </para></listitem>

    <listitem><para>incoming feed statistics:
	This reports the number of articles and the total bytes received from
	each of your NDNs.
    </para></listitem>

    <listitem><para>NNTP traffic report:
	The output of <literal>nestor</literal> has also been included in
	this report. It gives details of each NNTP connection and the overall
	performance of the network connection, as read from the newslog file.
	To understand the format, read the <literal>nestor</literal> manual page.
    </para></listitem>
    </itemizedlist>
</para></listitem>

<listitem><para>Error reporting from the errorlog file:
    This reports errors logged in the errorlog file. Usually these are file
    ownership or missing file problems, which can be easily handled.
</para></listitem>
</itemizedlist>
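
<para>
The disk usage figure described above can also be obtained outside
<literal>newsdaily</literal>. The following Python sketch is not part of
C-News and is not the code we added to our own script; it only illustrates
the idea of summing file sizes under the article tree. The function name
<literal>tree_size_bytes</literal> is hypothetical, and the path you pass
to it is whatever your <literal>$NEWSARTS</literal> points to.
</para>

<programlisting>
import os
import stat


def tree_size_bytes(root):
    """Return the total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat() so that symbolic links are not followed
                st = os.lstat(path)
            except OSError:
                # the article may have expired while we were walking
                continue
            if stat.S_ISREG(st.st_mode):
                total += st.st_size
    return total
</programlisting>

<para>
Calling <literal>tree_size_bytes()</literal> once a day on the article
area and comparing the result with the previous day's figure gives a quick
indication of whether feeds are still arriving.
</para>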
</section>

<section><title>Crisis reports from <literal>newswatch</literal></title>
<para>
Most of the problems reported to me are either space shortages or
persistent locks. There are instances when the scripts have created lock files
and have aborted or terminated without removing them. Sometimes these locks are
innocuous enough to be deleted, but this should be determined only after a
careful analysis. They could be an indication that some part of the system is
not working correctly. For example, I would receive this error message when
sendbatches terminated abnormally while trying to transmit huge togo files. I
had to determine why sendbatches was failing so often.
</para>
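
<para>
Before deleting a lock by hand, it helps to know how long it has been
lying around. The Python sketch below is only an illustration and is not
how <literal>newswatch</literal> itself works. It assumes that lock files
sit directly in <literal>$NEWSCTL</literal> and have names beginning with
<literal>LOCK</literal>; check this against your installation. The
function name <literal>stale_locks</literal> and the one-hour default
threshold are our own choices.
</para>

<programlisting>
import glob
import os
import time


def stale_locks(newsctl, max_age_seconds=3600, now=None):
    """Return (path, age_in_seconds) pairs for locks older than the limit."""
    if now is None:
        now = time.time()
    stale = []
    # Assumption: lock files live in NEWSCTL and are named LOCK*.
    for path in sorted(glob.glob(os.path.join(newsctl, "LOCK*"))):
        try:
            age = now - os.stat(path).st_mtime
        except OSError:
            # the lock was released between the glob and the stat
            continue
        if age > max_age_seconds:
            stale.append((path, int(age)))
    return stale
</programlisting>

<para>
The sketch only reports old locks. Whether a reported lock can be removed
safely still has to be decided by looking at what the owning script was
doing when it stopped.
</para>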

<para>
The space shortage issue has to be addressed immediately. You could
delete unwanted articles by running doexpire or add more disk space at the OS
level. The latter seems a better option.
</para>
</section>

<section><title>Disk space</title>
<para>
The <literal>$NEWSBIN</literal> area occupies a fixed amount of space. Since the
binaries do not grow once installed, you do not have to worry about disk
shortage here. The areas that take up more space as feeds come in are
<literal>$NEWSCTL</literal> and <literal>$NEWSARTS</literal>.
<literal>$NEWSCTL</literal> holds log files that keep growing with each feed,
and <literal>$NEWSARTS</literal> continues to grow as articles are digested in
huge numbers. If articles are being archived on expiry, you will need space
for the archive as well. Allocate a few GB of disk space for
<literal>$NEWSARTS</literal>, depending on the number of hierarchies you
subscribe to and the volume of feeds that come in every day.
<literal>$NEWSCTL</literal> grows more slowly than
<literal>$NEWSARTS</literal>. Allocate space for it accordingly.
</para>
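
<para>
A simple way to keep an eye on the two growing areas is to check how much
of their filesystems is still free. The following Python sketch is an
illustration under our own assumptions, not a C-News facility: the
function names and the 10 per cent default threshold are hypothetical, and
you would pass in the actual locations of <literal>$NEWSCTL</literal> and
<literal>$NEWSARTS</literal> on your system.
</para>

<programlisting>
import shutil


def free_fraction(path):
    """Return the free share (0.0 to 1.0) of the filesystem holding path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # some pseudo filesystems report no capacity at all
        return 0.0
    return usage.free / usage.total


def areas_short_of_space(paths, min_free=0.10):
    """Return the paths whose filesystem has less than min_free still free."""
    return [p for p in paths if min_free > free_fraction(p)]
</programlisting>

<para>
If both areas live on the same filesystem, both paths will be reported
together, which is still a useful warning.
</para>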
</section>

<section><title>CPU load and RAM usage</title>
<para>With modern C-News and NNTPd, there is very little usage of these
system resources for processing news article flow. Key components like
<literal>newsrun</literal> or <literal>sendbatches</literal> do not load
the system much, except for cases where you have a very heavy flow of
compressed outgoing batches and the compression utility is run by
<literal>sendbatches</literal> frequently. <literal>newsrun</literal> is
amazingly efficient in the current C-News release. Even when it takes
half an hour to digest a large consignment of batches, it hardly loads a
slow 200 MHz Pentium CPU or consumes much RAM on a 64 MB
system.</para>

<para>One thing which does slow down a system is a large number of
users connecting through NNTP to browse newsgroups. We do not have
measured figures at hand to offer as guidance for the resource
consumption involved, but we have found that the load on the
CPU and RAM for a certain number of active users invoking
<literal>nntpd</literal> is more than with an equal number of
users connecting to the POP3 port of the same system to pull
out their mailboxes. A few hundred active NNTP users can really slow down
a dual-P-III Intel Linux server, for instance. This load does not
depend on whether you are using INN or <literal>nntpd</literal>;
both have practically identical implementations for NNTP
<emphasis>reading</emphasis> and differ only in their handling of
feeds.</para>

<para>Another situation which will slow down your Usenet news server, as
mentioned before, is when downstream servers connect to you and fetch their
NNTP feeds using the pull method. This can really load your server's I/O
system and CPU.</para>

</section>

<section><title>The <literal>in.coming/bad</literal> directory</title>
<para>
The <literal>in.coming</literal> directory is where batches and articles reside
after you have received feeds from your NDN and before they are processed.
Checking this directory regularly to see if there are batches is a good way of
determining that feeds are coming in. Batches and articles follow different
naming conventions. Names like <literal>nntp.GxhsDj</literal> indicate batches,
while individual articles have names beginning with digits, like
<literal>0.10022643380.t</literal>.
</para>

<para>
The <literal>bad</literal> sub-directory under <literal>in.coming</literal>
holds batches and articles that encountered errors while they were being
processed by relaynews. You will have to examine the files in this directory
to find the cause. Ideally, this directory should be empty.
</para>
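
<para>
The regular check of <literal>in.coming</literal> can be sketched in a few
lines of Python. This is only an illustration built on the naming
conventions described above, that is, names beginning with
<literal>nntp.</literal> for batches and names beginning with a digit for
individual articles. The function name
<literal>summarize_incoming</literal> is hypothetical, and anything that
fits neither pattern is simply counted as <literal>other</literal>.
</para>

<programlisting>
import os


def summarize_incoming(incoming):
    """Count pending batches, pending articles and bad items in in.coming."""
    counts = {"batches": 0, "articles": 0, "other": 0, "bad": 0}
    for name in os.listdir(incoming):
        path = os.path.join(incoming, name)
        if name == "bad" and os.path.isdir(path):
            # everything under bad/ failed processing and needs a look
            counts["bad"] = len(os.listdir(path))
        elif not os.path.isfile(path):
            continue
        elif name.startswith("nntp."):
            counts["batches"] += 1
        elif name[0].isdigit():
            counts["articles"] += 1
        else:
            counts["other"] += 1
    return counts
</programlisting>

<para>
A non-zero <literal>bad</literal> count is the figure to watch. A batch
count that stays at zero for a long time may mean that feeds have stopped
coming in.
</para>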
</section>

<section><title>Long pending queues in <literal>out.going</literal></title>

<para>TO BE ADDED.</para>

</section>

<section><title>Problems with <literal>nntpxmit</literal> and <literal>nntpsend</literal></title>

<para>TO BE ADDED.</para>

</section>

<section><title>The <literal>junk</literal> and <literal>control</literal> groups</title>
<para>
Control messages are those that carry a newgroup, rmgroup, cancel or
checkgroups command in their <literal>Control:</literal> header (or, for
backward compatibility, in a subject line beginning with
<literal>cmsg</literal>). Such messages result in relaynews calling the
appropriate script, and on execution a message is mailed to the admin about the
action taken. These control messages are stored in the control directory of
<literal>$NEWSARTS</literal>. For such messages to propagate, one must
subscribe to the control hierarchy.
</para>

<para>
When your news system determines that an article belongs to a newsgroup you
have not subscribed to, it is 'junked', i.e. the article is filed in the junk
directory. This directory plays a key role in transferring articles to your
NDNs, as they would subscribe to the junk hierarchy to receive feeds. If you
are a leaf node, there is no reason why articles should pile up here. Delete
them on a daily basis.
</para>

</section>
</section>