This commit is contained in:
gferg 2002-07-30 14:54:25 +00:00
parent f5992572cc
commit a244760233
14 changed files with 3180 additions and 0 deletions


@ -0,0 +1,66 @@
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V4.1//EN" [
<!ENTITY what SYSTEM "what.sgml">
<!ENTITY pop SYSTEM "pop.sgml">
<!ENTITY software SYSTEM "software.sgml">
<!ENTITY settingup SYSTEM "settingup.sgml">
<!ENTITY inn SYSTEM "inn.sgml">
<!ENTITY mail2news SYSTEM "mail2news.sgml">
<!ENTITY accesscontrol SYSTEM "accesscontrol.sgml">
<!ENTITY components SYSTEM "components.sgml">
<!ENTITY monitoring SYSTEM "monitoring.sgml">
<!ENTITY clients SYSTEM "clients.sgml">
<!ENTITY perspective SYSTEM "perspective.sgml">
<!ENTITY doc SYSTEM "doc.sgml">
<!ENTITY conclusion SYSTEM "conclusion.sgml">
]>
<book>
<bookinfo>
<title>Usenet News HOWTO</title>
<authorgroup>
<author>
<firstname>Shuvam</firstname>
<surname>Misra</surname>
</author>
<author>
<firstname>Hema</firstname>
<surname>Kariyappa</surname>
<othername>
<emphasis>(usenet@starcomsoftware.com)</emphasis>
</othername>
</author>
</authorgroup>
<address>Starcom Software Private Limited.
starcomsoftware.com
<city>Mumbai</city>, <country>India</country>
</address>
<revhistory>
<revision>
<revnumber>2.0</revnumber>
<date>2002-07-30</date>
<authorinitials>sm</authorinitials>
<revremark>Major update by new authors.</revremark>
</revision>
<revision>
<revnumber>1.4</revnumber>
<date>1995-11-29</date>
<authorinitials>vs</authorinitials>
<revremark>Original document; authored by Vince Skahan.</revremark>
</revision>
</revhistory>
</bookinfo>
<toc></toc>
&what;
&pop;
&software;
&settingup;
&inn;
&mail2news;
&accesscontrol;
&components;
&monitoring;
&clients;
&perspective;
&doc;
&conclusion;
</book>


@ -0,0 +1,52 @@
<chapter><title>Access control in NNTPd</title>
<para>
The original NNTPd had host-based authentication which allowed clients
connecting from a particular IP address to read only certain newsgroups.
This was very clearly inadequate for enterprise deployment on an
Intranet, where each desktop computer has a different IP address, often
DHCP-assigned, and the mapping between person and desktop is not static.
</para>
<para>
What was needed was user-based authentication, where a username and
password could be used to authenticate the user. Even this was provided
as an extension to NNTPd, but more was needed. The corporate IS manager
needs to ensure that certain Usenet discussion groups remain visible only
to certain people. This authorisation layer was not available in NNTPd.
Once authenticated, all users could read all newsgroups.
</para>
<para>
We have extended the user-based authentication facility in NNTPd in some
(we hope!) useful ways, and we have added an entire authorisation layer
which lets the administrator specify which newsgroups each user can
read. With this infrastructure, we feel NNTPd is fit for enterprise
deployment and can be used to handle corporate document repositories,
messages, and discussion archives. Details are given below.
</para>
<section><title>Host-based access control</title>
<para></para>
</section>
<section><title>User authentication and authorisation</title>
<para></para>
<section><title>The NNTPd password file</title>
<para></para>
</section>
<section><title>Mapping users to newsgroups</title>
<para></para>
</section>
<section><title>The <literal>X-Authenticated-Author</literal> article header</title>
<para></para>
</section>
<section><title>Other article header additions</title>
<para></para>
</section>
</section>
</chapter>


@ -0,0 +1,67 @@
<chapter><title>Usenet news clients</title>
<para>
This HOWTO was written to help a Linux system administrator provide the
Usenet news service to its readers. The rest of this HOWTO
focuses on the server-end software and systems, but one chapter
dedicated to the clients does not seem disproportionate, considering
that the <emphasis>raison d'&ecirc;tre</emphasis> of Usenet news servers is to serve
these clients.
</para>
<para>
The overwhelming majority of clients are software programs which access
the article database, either by reading <literal>/var/spool/news</literal> on a
Unix system or over NNTP, and allow their human users to read and post
articles. We can therefore probably term this class of programs UUA, for
Usenet User Agents, along the lines of MUA for Mail User Agents.
</para>
<para>
There are other special-purpose clients, which either pull out articles
to copy or transfer somewhere else, or analyse them, <emphasis>e.g.</emphasis> a
search engine which allows you to search a Usenet article archive, as Google
(<literal>www.google.com</literal>) does.
</para>
<para>
This chapter will discuss issues in UUA software design, bringing out
essential features along with efficiency and management concerns. What this
chapter will certainly <emphasis>never</emphasis> attempt to do is catalogue all
the different UUA programs available in the world --- that is best left to
specialised catalogues on the Internet.
</para>
<para>
This chapter will also briefly cover special-purpose clients which
transfer articles or do other special-purpose things with them.
</para>
<section><title>Usenet User Agents</title>
<section><title>Accessing articles: NNTP or spool area?</title>
<para></para>
</section>
<section><title>Threading</title>
<para></para>
</section>
<section><title>Quick reading features</title>
<para></para>
</section>
</section>
<section><title>Clients that transfer articles</title>
<para>
We will discuss Suck and <literal>nntpxfer</literal> from the NNTP server
distribution here. Suck has already been discussed earlier. We will be happy
to take contributed additions that discuss other client software.
</para>
</section>
<section><title>Special clients</title>
<para></para>
</section>
</chapter>


@ -0,0 +1,373 @@
<chapter><title>Components of a running system</title>
<para>
This chapter reviews the components of a running CNews+NNTPd server.
Analogous components will be found in an INN-based system too. We invite
additions from readers familiar with INN to add their pieces to this
chapter.
</para>
<section><title><literal>/var/lib/news</literal>: the CNews control area</title>
<para>
This directory is more popularly known as <literal>$NEWSCTL</literal>. It
contains configuration, log and status files. There are no
articles or binaries kept here. Let's see what some of the
files are meant for.
</para>
<itemizedlist>
<listitem><para><literal>sys</literal>:
One line per system/NDN, listing all the newsgroup
hierarchies each system subscribes to. Each line is prefixed with the system
name; the one beginning with ME: indicates what we ourselves receive.
Look up the manpage of <literal>newssys</literal>.
</para></listitem>
<listitem><para><literal>explist</literal>:
This file has entries indicating which newsgroups' articles
expire and when, and whether they have to be archived. The order in
which the newsgroups are listed is important. See the manpage of
<literal>expire</literal> for the file format.
</para></listitem>
<listitem><para><literal>batchparms</literal>:
Details of how to feed other sites/NDNs, like the size of
batches and the mode of transmission (UUCP/NNTP), are specified here.
Manpage to refer to: <literal>newsbatch</literal>.
</para></listitem>
<listitem><para><literal>controlperm</literal>:
If you wish to authenticate a control message before any
action is taken on it, you must enter authentication-related information
here. The <literal>controlperm</literal> manpage will list all the fields
in detail.
</para></listitem>
<listitem><para><literal>mailpaths</literal>:
It lists the e-mail address of the moderator for each
moderated newsgroup, who is responsible for approving or rejecting
articles posted to it. The sample
<literal>mailpaths</literal> file in the <literal>tar</literal> will
give you an idea of how entries are made.
</para></listitem>
<listitem><para><literal>nntp_access/user_access</literal>:
These files contain entries for server names
and usernames to whom restrictions apply when accessing newsgroups.
Again, the sample file in the tarball explains the format.
</para></listitem>
<listitem><para><literal>log, errlog</literal>:
These are log files that keep growing large with each batch
that is received. The <literal>log</literal> file has one entry per
<literal>article</literal> telling you if it
has been accepted by your news server or rejected. To understand the
format of this file, refer to Chapter 2.2 of the <literal>CNews</literal>
guide. Errors, if any, while digesting the articles are
logged in <literal>errlog</literal>. These
log files have to be rolled as the files hog a lot of disk space.
</para></listitem>
<listitem><para><literal>nntplog</literal>:
This file logs information from the <literal>nntp daemon</literal>, giving
details of when a connection was established/broken and what commands were
issued. This file needs to be configured in syslog, and the syslog
<literal>daemon</literal> should be running.
</para></listitem>
<listitem><para><literal>active</literal>:
This file has one line per newsgroup to be found in your news
server. Besides other things, it tells you how many articles are
currently present in each newsgroup. It is updated when each batch is
digested or when articles are expired. The <literal>active</literal>
manpage will furnish more details about other parameters.
</para></listitem>
<listitem><para><literal>history</literal>:
This file, again, contains one line per <literal>article</literal>, mapping
<literal>message-id</literal> to newsgroup name and also giving its
associated <literal>article</literal> no. in that newsgroup. It is updated
each time a feed is digested
and when <literal>doexpire</literal> is run. It plays a key role in
loop-detection and serves as an article database. Read the manpages of
<literal>newsdb</literal> and <literal>doexpire</literal> for the file format.
<listitem><para><literal>newsgroups</literal>:
It has a one-line description for each newsgroup, explaining
what kind of posts go into each of them. Ideally speaking, it should cover
all the newsgroups found in the <literal>active</literal> file.
</para></listitem>
<listitem><para>Miscellaneous files:
Files like <literal>mailname</literal>, <literal>organisation</literal>,
<literal>whoami</literal> contain information required for forming some of
the headers of an <literal>article</literal>. The contents of
<literal>mailname</literal> form the <literal>From:</literal> header and
that of <literal>organisation</literal> form the
<literal>Organisation:</literal> header. <literal>whoami</literal> contains
the name of the news system. Refer to chapter 2.1 of
<literal>guide.ps</literal> for a detailed list of files in the
<literal>$NEWSCTL</literal> area. Read <literal>RFC 1036</literal> for a
description of article headers.
</para></listitem>
</itemizedlist>
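To make the most important of these control files concrete, here is a hypothetical two-line <literal>sys</literal> file. The site name <literal>ndn1</literal> and its flag are invented for illustration; the <literal>newssys</literal> manpage remains the authoritative reference for the field layout:

```
# what we ourselves are willing to receive
ME:all/all::
# a hypothetical downstream NDN fed comp and sci, batched via files (f flag)
ndn1:comp,sci/all:f:
```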
</section>
<section><title><literal>/var/spool/news</literal>: the article repository</title>
<para>
This is also known as the <literal>$NEWSARTS</literal> or
<literal>$NEWSSPOOL</literal> directory. This is where the
articles reside on your disk. No binaries or control files
should belong here. Enough space should be allocated to this
directory as the number of articles keeps increasing with each
batch that is digested. An explanation of the following sub-directories will
give you an overview of this directory:
<itemizedlist>
<listitem><para><literal>in.coming</literal>:
Feeds/batches/articles from NDNs reside here on their arrival,
before being processed. After processing, they
appear in
<literal>$NEWSARTS</literal> or in its <literal>bad</literal> sub-directory
if there were errors.
</para></listitem>
<listitem><para><literal>out.going</literal>:
This directory contains batches/feeds to be sent to your
NDNs, i.e. feeds to be pushed to your neighbouring sites reside here
before they are transmitted. It contains one sub-directory per NDN mentioned
in the <literal>sys</literal> file. These sub-directories contain files
called <literal>togo</literal>,
which hold information about each <literal>article</literal> queued for
transmission, like its <literal>message-id</literal> or article no.
</para></listitem>
<listitem><para><anchor id="newsgroupdir"/>newsgroup directories:
For each newsgroup hierarchy that the news server
has subscribed to, a directory is created under
<literal>$NEWSARTS</literal>.
Further sub-directories are created under the parent to hold
articles of specific newsgroups. For instance, for a
newsgroup like <literal>comp.music.compose</literal>, the parent directory
<literal>comp</literal> will appear in <literal>$NEWSARTS</literal> and a
sub-directory called <literal>music</literal> will be created under
<literal>comp</literal>. The <literal>music</literal> sub-directory
shall contain a further sub-directory called <literal>compose</literal> and
all articles of <literal>comp.music.compose</literal>
shall reside here. In effect,
<literal>article</literal> 242 of newsgroup
<literal>comp.music.compose</literal> shall map to file
<literal>$NEWSARTS/comp/music/compose/242</literal>.
</para></listitem>
<listitem><para>control:
The control directory houses only the control messages that
have been received by this site. The control messages could be any of the
following: <literal>newgroup, rmgroup, checkgroup</literal> and
<literal>cancel</literal>
appearing in the subject line of the <literal>article</literal>.
</para></listitem>
<listitem><para><literal>junk</literal>:
The <literal>junk</literal> directory contains all
articles that the news
server has received and has decided, after processing, do not
belong to any of the hierarchies it has subscribed to. The news server
transfers/passes all <literal>articles</literal> in this directory to NDNs
that have subscribed to the <literal>junk</literal> hierarchy.
</para></listitem>
</itemizedlist>
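The newsgroup-to-file mapping described above is mechanical enough to sketch in shell. This is a minimal illustration; the <literal>$NEWSARTS</literal> location is the conventional one assumed throughout this chapter:

```shell
#!/bin/sh
# Sketch of the newsgroup-to-file mapping described above.
# The $NEWSARTS location is the conventional one used in this chapter.
NEWSARTS=/var/spool/news

# group_to_path GROUP ARTICLE-NUMBER: dots become directory separators
group_to_path() {
    echo "$NEWSARTS/$(echo "$1" | tr '.' '/')/$2"
}

group_to_path comp.music.compose 242
# prints: /var/spool/news/comp/music/compose/242
```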
</section>
<section><title><literal>/usr/lib/newsbin</literal>: the executables</title>
<para></para>
</section>
<section id="cronjobs"><title><literal>crontab and cron jobs </literal></title>
<para>
The heart of the Usenet news server is the set of scripts that run at regular
intervals, processing articles, digesting or rejecting them, and
transmitting them to NDNs. We shall try to enumerate the ones that are
important enough to be cronned. :)
</para>
<itemizedlist>
<listitem><para><literal>newsrun</literal>:
The key script. It picks up the batches in the
<literal>in.coming</literal> directory, uncompresses them if necessary and
feeds them to <literal>relaynews</literal>, which then processes each
<literal>article</literal>, digesting and
batching it and logging any errors. This script needs to run through cron
as frequently as you want the feeds to be digested. Every half hour should
suffice for a non-critical requirement.
</para></listitem>
<listitem><para><literal>sendbatches</literal>:
This script is run to transmit the togo files formed in
the <literal>out.going</literal> directory to your NDNs. It reads the
<literal>batchparms</literal> file to know
exactly how and to whom the batches need to be transmitted. The frequency,
again, can be set according to your requirements. Once an hour should be
sufficient.
</para></listitem>
<listitem><para><literal>newsdaily</literal>:
This script does maintenance chores like rolling logs and
saving them, reporting errors/anomalies and doing cleanup jobs.
It should typically run once a day.
</para></listitem>
<listitem><para><literal>newswatch</literal>:
This looks for news problems at a more detailed level than
<literal>newsdaily</literal>: persistent lock files, whether there is
enough space for a minimum no. of files, whether there is a huge queue of
unattended batches, and the like. This should typically run once every hour.
For more on this and the above, read the <literal>newsmaint</literal>
manpage.
</para></listitem>
<listitem><para><literal>doexpire</literal>:
This script expires old articles as determined by the
control file <literal>explist</literal> and updates the
<literal>active</literal> file. This is necessary if you do not
want unnecessary/unwanted articles hogging your disk space. Run it once
a day. Manpage: <literal>expire</literal>.
</para></listitem>
<listitem><para><literal>newsrunning off/on</literal>:
This script stops or starts the news server for you.
You could choose to add it to your cron jobs if you think the news server
takes up a lot of CPU time during peak hours and you wish to keep a check on
it.
</para></listitem>
</itemizedlist>
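The scripts above might be tied together in the news user's crontab roughly as follows. The paths and timings here are illustrative assumptions matching the suggested frequencies, not a prescribed setup:

```
# illustrative crontab for user news; adjust paths to your installation
0,30 * * * *  /usr/lib/newsbin/input/newsrun
15   * * * *  /usr/lib/newsbin/batch/sendbatches
45   * * * *  /usr/lib/newsbin/maint/newswatch
10   1 * * *  /usr/lib/newsbin/maint/newsdaily
20   2 * * *  /usr/lib/newsbin/expire/doexpire
```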
</section>
<section><title><literal>newsrun</literal> and <literal>relaynews</literal>: digesting received articles </title>
<para>
The heart and soul of the Usenet News system, <literal>newsrun</literal> just picks up the batches/
articles in the <literal>in.coming</literal> directory of
<literal>$NEWSARTS</literal> and uncompresses them (if required) and calls
<literal>relaynews</literal>. It should run from cron.
</para>
<para>
<literal>relaynews</literal> picks up each <literal>article</literal> one by one
through stdin, determines whether it belongs to a subscribed group by looking up the
<literal>sys</literal> file, looks in the <literal>history</literal> file
to determine that it does not already exist locally, digests it, updating the
<literal>active</literal> and <literal>history</literal> files, and batches it
for neighbouring sites. It logs errors on encountering problems while processing
the <literal>article</literal> and takes appropriate action if it happens to be
a control message. More info in the manpage of <literal>relaynews</literal>.
</para>
</section>
<section><title><literal>doexpire</literal> and <literal>expire</literal>: removing old articles </title>
<para>
A good way to get rid of unwanted/old articles from the
<literal>$NEWSARTS</literal> area is to run doexpire once a day. It reads the
<literal>explist</literal> file from the <literal>$NEWSCTL</literal> directory
to determine what articles expire today. It can archive the
said <literal>article</literal> if so configured. It then updates the
<literal>active</literal> and the <literal>history</literal> file accordingly.
If you wish to retain the <literal>article</literal> entry in the
<literal>history</literal> file, to avoid re-digesting it as a new
article after it has expired, add a special /expired/; line
in the control file. More on the options and functioning in the expire manpage.
</para>
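As an illustration, an <literal>explist</literal> might contain entries like the following. The field layout shown (pattern, category, retention days, archive directory) is a sketch from memory, so verify it against the <literal>expire</literal> manpage before use:

```
# pattern            cat  days  archive
comp.sources.unix     x    60   /var/spool/news/archive
/expired/             x    14   -
all                   x     7   -
```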
</section>
<section><title><literal>nntpd</literal> and <literal>msgidd</literal>: managing the NNTP interface </title>
<para>
As has already been discussed in the chapter on setting up the software,
<literal>nntpd</literal> is a TCP-based server daemon which runs under
<literal>inetd</literal>. It is fired by <literal>inetd</literal>
whenever there's an incoming connection on the NNTP port, and it takes
over the dialogue from there. It reads the C-News configuration and data
files in <literal>$NEWSCTL</literal>, article files from
<literal>$NEWSARTS</literal>, and receives incoming posts and
transfers. These it dutifully queues in
<literal>$NEWSARTS/in.coming</literal>, either as batch files or single
article files.</para>
<para>It is important that <literal>inetd</literal> be configured to
fire <literal>nntpd</literal> as user <literal>news</literal>, not as
<literal>root</literal> like it does for other daemons like
<literal>telnetd</literal> or <literal>ftpd</literal>. If this is not
done correctly, a lot of problems can be caused in the functioning of
the C-News system later.</para>
<para><literal>nntpd</literal> is fired each time a new NNTP connection
is received, and dies once the NNTP client closes its connection. Thus,
if one <literal>nntpd</literal> receives a few articles by an incoming
batch feed (not a <literal>POST</literal> but an <literal>XFER</literal>),
then another <literal>nntpd</literal> will not know about the receipt of
these articles till the batches are digested. This will hamper
duplicate newsfeed detection if there are multiple upstream NDNs feeding
our server with the same set of articles over NNTP. To fix this,
<literal>nntpd</literal> uses an ally: <literal>msgidd</literal>, the
message ID daemon. This
daemon is fired once at server bootup time through
<literal>newsboot</literal>, and keeps running quietly in the
background, listening on a named Unix socket in the
<literal>$NEWSCTL</literal> area. It keeps in its memory a list of all
message IDs which various incarnations of <literal>nntpd</literal> have
asked it to remember.</para>
<para>Thus, when one copy of <literal>nntpd</literal> receives an
incoming feed of news articles, it updates <literal>msgidd</literal>
with the message IDs of these messages through the Unix socket. When
another copy of <literal>nntpd</literal> is fired later and the NNTP
client tries to feed it some more articles, the <literal>nntpd</literal>
checks each message ID against <literal>msgidd</literal>. Since
<literal>msgidd</literal> stores all these IDs in memory, the lookup is
very fast, and duplicate articles are blocked at the NNTP interface
itself.</para>
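The duplicate check can be sketched in shell, with a flat file standing in for the daemon's in-memory table. This is a conceptual illustration of the idea only, not the real msgidd protocol:

```shell
#!/bin/sh
# Conceptual sketch of the duplicate check msgidd performs for nntpd,
# with a flat file standing in for the daemon's in-memory table.
# This illustrates the idea only; it is NOT the real msgidd protocol.
SEEN=$(mktemp)

# offer_msgid ID: returns 0 (accept) if new, 1 (reject) if already seen
offer_msgid() {
    if grep -qxF "$1" "$SEEN"; then
        return 1                      # duplicate: block at the interface
    fi
    printf '%s\n' "$1" >> "$SEEN"     # new: remember it
    return 0
}

offer_msgid '<abc@example.com>' && echo accepted
offer_msgid '<abc@example.com>' || echo duplicate
# prints: accepted, then duplicate
```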
<para>On a running system, expect to see one instance of
<literal>nntpd</literal> for each active NNTP connection, and just one
instance of <literal>msgidd</literal> running quietly in the background,
hardly consuming any CPU resources. Our <literal>nntpd</literal> is
configured to die if the NNTP connection is more than a few minutes
idle, thus conserving server resources. This does not inconvenience the
user because modern NNTP clients simply re-connect. If an
<literal>nntpd</literal> instance is found to be running for days, it is
either hung due to a network error, or is receiving a very long incoming
NNTP feed from your upstream server. We used to receive our primary
incoming feed from our service provider through NNTP sessions lasting 18
to 20 hours without a break, every day.</para>
</section>
<section><title><literal>nov</literal>, the News Overview system</title>
<para>NOV, the News Overview System, is a recent augmentation to the
C-News and NNTP systems and to the NNTP protocol. This subsystem
maintains a file for each active newsgroup, in which it maintains one
line per current article. This line of text contains some key meta-data
about the article, <emphasis>e.g.</emphasis> the contents of the
<literal>From</literal>, <literal>Subject</literal> and
<literal>Date</literal> headers, and the article size and message ID. This speeds
up NNTP response enormously. The <literal>nov</literal> library has been
integrated into the <literal>nntpd</literal> code, and into key binaries
of C-News, thus providing seamless maintenance of the News Overview
database when articles are added or deleted from the repository.</para>
<para>When <literal>newsrun</literal> adds an article into
<literal>starcom.test</literal>, it also updates
<literal>$NEWSARTS/starcom/test/.overview</literal> and adds a line with
the relevant data, tab-separated, into it. When <literal>nntpd</literal>
comes to life with an NNTP client, and it sees the
<literal>XOVER</literal> NNTP command, it reads this
<literal>.overview</literal> file, and returns the relevant lines to the
NNTP client. When <literal>expire</literal> deletes an article, it also
removes the corresponding line from the <literal>.overview</literal>
file. Thus, the maintenance of the NOV database is seamless.</para>
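Reading a <literal>.overview</literal> file can be sketched with awk. The tab-separated field order shown (number, Subject, From, Date, Message-ID, References, bytes, lines) is the conventional NOV layout; check the newsoverview(5) manpage to confirm it for your installation:

```shell
#!/bin/sh
# Sketch: reading a .overview file. The tab-separated field order
# assumed here (number, Subject, From, Date, Message-ID, References,
# bytes, lines) is the conventional NOV layout; see newsoverview(5).
OV=$(mktemp)
printf '241\tRe: chord voicings\ta@b.c\t30 Jul 2002\t<1@b.c>\t\t1024\t20\n' > "$OV"
printf '242\tNew compositions\td@e.f\t30 Jul 2002\t<2@e.f>\t\t2048\t41\n' >> "$OV"

# an XOVER-style listing of article number and subject
awk -F'\t' '{ print $1, $2 }' "$OV"
# prints: 241 Re: chord voicings
#         242 New compositions
```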
</section>
<section><title>Batching feeds with UUCP and NNTP</title>
<para>Some information about batching feeds has been provided in earlier
sections. More will be added later here in this document.</para>
</section>
</chapter>


@ -0,0 +1,96 @@
<chapter><title>Wrapping up</title>
<section><title>Acknowledgements</title>
<para>
This HOWTO is a by-product of many years of experience setting up and
managing Usenet news servers. We have learned a lot from those who have
trod the path ahead of us. Some of them include the team of the ERNET
Project, which brought the Internet technology to India's academic
institutions in the early nineties. We especially remember what we have
learned from the SIGSys Group of the Department of Computer Science of
the Indian Institute of Technology, Mumbai. We have also benefited
enormously from the guidance we received from the Networking Group at
the NCST in Mumbai, especially from Geetanjali Sampemane.
</para>
<para>On a wider scale, our learning along the path of systems and
networks started with Unix, without which our appreciation of computer
systems would have remained very fragmented and superficial. Our
insight into Unix came from our village elders at the Department
of Computer Science of the IIT at Mumbai, especially from ``Hattu,''
``Sathe,'' and ``Sapre,'' none of whom are with the IIT today, and from
Professor D. B. Phatak and others, many of whom, luckily, are still with
the IIT.</para>
<para>Coming down to specifics, all the members of Starcom Software who
have worked on various problems with networking, Linux, and Usenet news
installations, have helped the authors in understanding what works and
what doesn't. Without their work, this HOWTO would have been a dry text
book.</para>
</section>
<section><title>Comments invited</title>
<para>Your comments and contributions are invited. We cannot possibly
write all sections of this HOWTO based on our knowledge alone. Please
contribute all you can, starting with minor corrections and bug fixes
and going on to entire sections and chapters. Your contributions will be
acknowledged in the HOWTO.</para>
</section>
<section><title>Copyright</title>
<para>
Copyright (c) 2002 by Starcom Software Private Limited, India
</para>
<para>Please freely copy and distribute (sell or give away) this
document in any format. It is requested that corrections and/or
comments be forwarded to the document maintainer, reachable at
<literal>usenet@starcomsoftware.com</literal>. When these comments
and contributions are incorporated into this document and released
for distribution in future versions of this HOWTO, the content of the
incorporated text will become the copyright of Starcom Software Private
Limited. By submitting your contributions to us, you implicitly agree
to these terms.</para>
<para>You may create a derivative work and distribute it provided that
you:</para>
<orderedlist>
<listitem><para>
Send your derivative work (in the most suitable format such as SGML) to the
LDP (Linux Documentation Project) or the like for posting on the Internet.
If not the LDP, then let the LDP know where it is available.
</para></listitem>
<listitem><para>
License the derivative work with this same license or use GPL. Include a
copyright notice and at least a pointer to the license used.
</para></listitem>
<listitem><para>
Give due credit to previous authors and major contributors.
If you're considering making a derived work other than a
translation, it is requested that you discuss your plans with the
current maintainer.
</para></listitem>
</orderedlist>
</section>
<section><title>About Starcom Software Private Limited</title>
<para>
<emphasis role="bold">starcom</emphasis> (Starcom Software Private
Limited, <literal>www.starcomsoftware.com</literal>) has been building
products and solutions using Linux and Web technology since 1996. Our
entire office runs on Linux, and we have built mission-critical
solutions for some of the top corporate entities in India and abroad.
Our client list includes arguably the world's largest securities
depository (The National Securities Depository of India Limited), one of
the world's top five stock exchanges in terms of trading volumes (The
National Stock Exchange of India Limited), and one of India's premier
financial institutions, which is listed on the NYSE. In all these cases,
we have introduced them to Linux, and in many cases, we have built them
their first mission-critical business applications on Linux. Contact the
authors or check the Website for more information about the work we have done.
</para>
</section>
</chapter>


@ -0,0 +1,218 @@
<chapter><title>Documentation and information</title>
<section><title>The manpages</title>
<para>The following manpages are installed automatically when our
integrated software distribution is compiled and installed, listed here
in no particular order:</para>
<itemizedlist>
<listitem><para><literal>badexpiry:</literal>
utility to look for articles with bad explicit Expiry headers
</para></listitem>
<listitem><para><literal>checkactive:</literal>
utility to perform some sanity checks on the <literal>active</literal>
file
</para></listitem>
<listitem><para><literal>cnewsdo:</literal>
utility to perform some checks and then run C-News maintenance commands
</para></listitem>
<listitem><para><literal>controlperm:</literal>
configuration file for controlling responses to Usenet control messages
</para></listitem>
<listitem><para><literal>expire:</literal>
utility to expire old articles
</para></listitem>
<listitem><para><literal>explode:</literal>
internal utility to convert a master batch file to ordinary batch files
</para></listitem>
<listitem><para><literal>inews:</literal>
the program which forms the entry point for fresh postings to be
injected into the Usenet system
</para></listitem>
<listitem><para><literal>mergeactive:</literal>
utility to merge one site's newsgroups to another site's
<literal>active</literal> file
</para></listitem>
<listitem><para><literal>mkhistory:</literal>
utility to rebuild news <literal>history</literal> file
</para></listitem>
<listitem><para><literal>news(5):</literal>
description of Usenet news article file and batch file formats
</para></listitem>
<listitem><para><literal>newsaux:</literal>
a collection of C-News utilities used by its own scripts and by the
Usenet news administrator for various maintenance purposes
</para></listitem>
<listitem><para><literal>newsbatch:</literal>
covers all the utilities and programs which are part of the news
batching system of C-News
</para></listitem>
<listitem><para><literal>newsctl:</literal>
describes the file formats and uses of all the files in
<literal>$NEWSCTL</literal> other than the two key files,
<literal>sys</literal> and <literal>active</literal>
</para></listitem>
<listitem><para><literal>newsdb:</literal>
describes the key files and directories for news articles, including the
structure of <literal>$NEWSARTS</literal>, the <literal>active</literal>
file, the <literal>active.times</literal> file, and the
<literal>history</literal> file.
</para></listitem>
<listitem><para><literal>newsflag:</literal>
utility to change the flag or type column of a newsgroup in the
<literal>active</literal> file
</para></listitem>
<listitem><para><literal>newsmail:</literal>
utility scripts used to send and receive newsfeeds by email. This is
different from a mail-to-news gateway, since this is for communication
between two Usenet news servers.
</para></listitem>
<listitem><para><literal>newsmaint:</literal>
utility scripts used by Usenet administrator to manage and maintain
C-News system
</para></listitem>
<listitem><para><literal>newsoverview(5):</literal>
file formats for the NOV database
</para></listitem>
<listitem><para><literal>newsoverview(8):</literal>
library functions of the NOV library and the utilities which use them
</para></listitem>
<listitem><para><literal>newssys:</literal>
the important <literal>sys</literal> file of C-News
</para></listitem>
<listitem><para><literal>relaynews:</literal>
the <literal>relaynews</literal> program of C-News
</para></listitem>
<listitem><para><literal>report:</literal>
utility to generate and send email reports of errors and events from
C-News scripts
</para></listitem>
<listitem><para><literal>rnews:</literal>
receive news batches and queue them for processing
</para></listitem>
<listitem><para><literal>nntpd:</literal>
The NNTP daemon
</para></listitem>
<listitem><para><literal>nntpxmit:</literal>
The NNTP batch transmit program for outgoing push feeds
</para></listitem>
</itemizedlist>
</section>
<section><title>The C-News guide</title>
<para>This document is part of the C-News source, and is available in
the <literal>c-news/doc</literal> directory of the source tree. The
<literal>makefile</literal> here uses <literal>troff</literal> to
generate <literal>guide.ps</literal> from the source files. The C-News
Guide is a very well-written introduction to the functioning of
C-News.</para>
</section>
<section><title>O'Reilly's books on Usenet news</title>
<para>O'Reilly and Associates published an excellent book that can lay
the foundations for understanding C-News and Usenet news in general,
titled ``Managing UUCP and Usenet,'' dated 1992. It came to be
considered a bit dated because it did not cover INN or the Internet
protocols.</para>
<para>They have subsequently published a more recent book, titled
``Managing Usenet,'' written by Henry Spencer, the co-author of C-News,
and David Lawrence, one of the most respected Usenet veterans and
administrators today. The book was published in 1998 and includes both
C-News and INN.</para>
<para>We have a distinct preference for books published by O'Reilly; we
usually find them the best books on their subjects. We make no attempts
to hide this bias. We recommend both books. We believe that there is
very little in this HOWTO of value to someone who studies one of these
books and then peruses information on the Internet.</para>
</section>
<section><title>Usenet-related RFCs</title>
<para>TO BE ADDED</para>
</section>
<section><title>The source code</title>
<para>TO BE ADDED</para>
</section>
<section><title>Usenet newsgroups</title>
<para>There are many discussion groups on the Usenet dedicated to the
technical and non-technical issues in managing a Usenet server and
service. These are:</para>
<itemizedlist>
<listitem><para><literal>news.admin.technical:</literal>
Discusses technical issues in administering Usenet news
</para></listitem>
<listitem><para><literal>news.admin.policy:</literal>
Discusses policy issues of Usenet news
</para></listitem>
<listitem><para><literal>news.software.b:</literal>
Discusses the source, configuration and bugs (if any) of C-News; no
separate newsgroup was created when B-News gave way to C-News
</para></listitem>
</itemizedlist>
<para>MORE WILL BE ADDED LATER</para>
</section>
<section><title>We</title>
<para>We, at Starcom Software, offer the services of our Usenet news
team, providing assistance by email to the Linux and Usenet
administrator community, on a best-effort basis.</para>
<para>We will endeavour to answer all queries sent to
<literal>usenet@starcomsoftware.com</literal>, pertaining to the source
distribution we have put together and its configuration and maintenance,
and also pertaining to general technical issues related to running a
Usenet news service off a Unix or Linux server.</para>
<para>We may not be in a position to assist with software components we
are not familiar with, <emphasis>e.g.</emphasis> Leafnode, or platforms
we do not have access to, <emphasis>e.g.</emphasis> SGI IRIX. Intel
Linux will be supported as long as our group is alive; our entire office
runs on Linux servers and diskless Linux desktops.</para>
<para>You need not be dependent on us, because we have neither
proprietary knowledge nor proprietary closed-source software. All the
extensions we are currently making to C-News and NNTPd will
immediately be made available to the Internet.</para>
</section>
</chapter>

<chapter><title>Setting up INN</title>
<section><title>Getting the source</title>
<para>INN has been maintained and archived by the ISC (Internet Software
Consortium, <literal>www.isc.org</literal>) since 1996, and the INN
homepage is at <literal>http://www.isc.org/products/INN/</literal>. The
latest release of INN as of this writing is INN v2.3.3,
released 7 May 2002. The full sources can be downloaded from
<literal>ftp://ftp.isc.org/isc/inn/inn-2.3.3.tar.gz</literal>.</para>
</section>
<section><title>Compiling and installing</title>
<para>TO BE ADDED LATER.</para>
</section>
<section><title>Configuring the system</title>
<para>TO BE ADDED LATER.</para>
</section>
<section><title>Setting up <literal>pgpverify</literal></title>
<para>TO BE ADDED LATER.</para>
</section>
<section><title>Feeding off an upstream neighbour</title>
<para>TO BE ADDED LATER.</para>
</section>
<section><title>Setting up outgoing feeds</title>
<para>TO BE ADDED LATER.</para>
</section>
<section id=innefficiency>
<title>Efficiency issues and advantages</title>
<para>TO BE ADDED LATER.</para>
</section>
</chapter>

<chapter><title>Connecting email with Usenet news</title>
<para>
Usenet news and mailing lists constantly remind us of each other. And the
parallels are so strong that many mailing lists are gatewayed two-way
with corresponding Usenet newsgroups, in the <literal>bit</literal> hierarchy
which maps onto the old BITNET, and elsewhere.
</para>
<para>
There are probably ten different situations where a mailing list is
better, and ten others where the newsgroup approach works better. The
point to recognise is that the system administrator needs the choice of
gatewaying one with the other whenever the tradeoffs justify it. Rather
than getting into the tradeoffs themselves, this chapter focuses on the
mechanisms of gatewaying between the two worlds.
</para>
<para>
One clear and recurring use we find for this gatewaying is for mailing
lists which are of general interest to many employees in a corporate
network. For instance, in a stockbroking company, many employees may
like to subscribe to a business news mailing list. If each employee
subscribed to the mailing list independently, it would waste mail spool
area and perhaps bandwidth. In such situations, we receive the mailing
list into an internal newsgroup, so that individual mailboxes are not
overloaded. Everyone can then read the newsgroup, and messages are also
archived till they expire.
</para>
<section><title>Feeding Usenet news to email</title>
<para>
In C-News, this is trivially done by adding one line to the
<literal>sys</literal> file, defining a new outgoing feed that lists all
the relevant groups and distributions, and specifying the command line
to be executed to send each outgoing message to that ``feed.'' This
command, in our case, should be a mail-sending program,
<emphasis>e.g.</emphasis>
<literal>/bin/mail user@somewhere.com</literal>. This is often adequate to get
the job done. We are sure almost every Usenet news software system will have
an equally easy way of piping the feed of a newsgroup to an email address.
</para>
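<para>As an illustrative sketch (the exact field layout and flag
semantics are documented in <literal>newssys</literal>; the feed name,
group and address below are our inventions), such a feed entry might
look like this:</para>

```
# hypothetical sys entry: pipe each article in starcom.lists.biznews,
# distribution starcom, into a mail command
bizmail:starcom.lists.biznews/starcom::/bin/mail user@somewhere.com
```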
</section>
<section><title>Feeding email to news: the <literal>mail2news gateway</literal></title>
<para>Integrated with our Usenet software sources is a set of
scripts which we have been using internally for at least five years.
This set of scripts is called <literal>mail2news</literal>. It contains
one shell script, also called <literal>mail2news</literal>, which takes
an email message from <literal>stdin</literal>, processes it, and feeds
the processed version to <literal>inews</literal>, the
<literal>stdin</literal>-based news article injection utility of C-News.
The <literal>inews</literal> utility accepts a new article post on its
<literal>stdin</literal> and queues it for digestion by
<literal>newsrun</literal> whenever it runs next.</para>
<para>To use <literal>mail2news</literal>, we assume you are using
Sendmail to process incoming email. Our instructions can easily be
modified to adapt to any Mail Transport Agent (MTA) of your choice. You
will have to configure Sendmail or any other MTA to redirect incoming
mails for the gateway to a program called <literal>m2nmailer</literal>,
a Perl script which accepts the incoming message on its standard input
and a space-separated list of newsgroup names on its command line.
Sendmail can be easily configured to trigger <literal>m2nmailer</literal>
this way by defining a new mailer in <literal>sendmail.cf</literal>,
and directing all incoming emails meant for the Usenet news system to
this mailer. Once you set up the appropriate rulesets for Sendmail,
it automatically triggers <literal>m2nmailer</literal> each time an
incoming email comes for the <literal>mail2news</literal>
gateway.</para>
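<para>For illustration only, the mailer definition in
<literal>sendmail.cf</literal> might look like the sketch below. The
path and flags are assumptions on our part; treat the chapter titled
``Setting up C-News + NNTPd'' as authoritative.</para>

```
# hypothetical mailer definition; adjust P= to where m2nmailer lives
Mm2n,  P=/usr/lib/newsbin/m2nmailer, F=lsDFMu, S=10, R=20,
       A=m2nmailer $u
```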
<para>The precise configuration changes to Sendmail have already been
specified in the chapter titled ``Setting up C-News + NNTPd.''</para>
</section>
<section><title>Using GNU Mailman as an email-NNTP gateway</title>
<para>TO BE ADDED LATER</para>
<section><title>GNU's all-singing all-dancing MLM</title>
<para>TO BE ADDED LATER</para>
</section>
<section><title>Features of GNU Mailman</title>
<para>TO BE ADDED LATER</para>
</section>
<section><title>Gateway features connecting NNTP and email</title>
<para>TO BE ADDED LATER</para>
</section>
</section>
</chapter>

<chapter><title>Monitoring and administration</title>
<para>
Once the Usenet news system is in place and running, the news
administrator is aided in monitoring it by various reports generated by
the system. He also needs to make regular checks in specific directories
and files to ascertain that the system is working smoothly.
</para>
<section><title>The <literal>newsdaily</literal> report</title>
<para>
This report is generated by <literal>newsdaily</literal>, which is
typically run through <literal>cron</literal>. We enumerate below some
of the problems reported, based on what we have seen.
</para>
<itemizedlist>
<listitem><para>bad input batches:
This reports articles that have been processed, declared bad, and hence
not digested. The reason is not mentioned; you are expected to check
the article and determine the cause.
</para></listitem>
<listitem><para>leading unknown newsgroups by articles:
This gives a list of newsgroups whose hierarchies have been subscribed
to, but which do not themselves appear in the active file. You could
add such a newsgroup to the active file if you think it important
enough.
</para></listitem>
<listitem><para>leading unsubscribed newsgroups:
This lists the unsubscribed newsgroups for which the news server
receives the largest number of articles. You cannot really do much
about this, except to subscribe to them if they are required.
</para></listitem>
<listitem><para>leading sites sending bad headers:
This lists the NDNs which are sending you articles with malformed or
insufficient headers.
</para></listitem>
<listitem><para>leading sites sending stale/future/misdated news:
This lists the NDNs which are sending you articles whose dates fall
outside the window you have specified for accepting feeds.
</para></listitem>
<listitem><para>Some of the reports generated by us:
We have modified the newsdaily script to include some more statistics.
<itemizedlist>
<listitem><para>disk usage:
This reports the size in bytes of the <literal>$NEWSARTS</literal>
area. If you are receiving feeds regularly, you should see this figure
increasing.
</para></listitem>
<listitem><para>incoming feed statistics:
This reports the number of articles and total bytes received from each
of your NDNs.
</para></listitem>
<listitem><para>NNTP traffic report:
The output of <literal>nestor</literal> has also been included in this
report; it gives details of each NNTP connection and of the overall
performance of the network connection, read from the newslog file. To
understand the format, read the manpage of <literal>nestor</literal>.
</para></listitem>
</itemizedlist>
</para></listitem>
<listitem><para>Error reporting from the errorlog file:
Reports errors logged in the errorlog file. Usually these are
file-ownership or missing-file problems, which can be easily handled.
</para></listitem>
</itemizedlist>
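<para>For illustration, assuming <literal>newsdaily</literal> lives in
<literal>$NEWSBIN/maint</literal> (the path below is an assumption), a
crontab entry for the news user might read:</para>

```
# run newsdaily once every night
59 23 * * *    /usr/lib/newsbin/maint/newsdaily
```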
</section>
<section><title>Crisis reports from <literal>newswatch</literal></title>
<para>
Most of the problems reported to us are ones of either space shortage
or persistent locks. There are instances when the scripts have created
lock files and have aborted or terminated without removing them.
Sometimes these are innocuous enough to be deleted, but this should be
determined after careful analysis; they could be an indication of some
part of the system not working correctly. For example, we would receive
this error message whenever <literal>sendbatches</literal> terminated
abnormally while trying to transmit huge <literal>togo</literal> files,
and we had to determine why <literal>sendbatches</literal> was failing
so often.
</para>
<para>
The space shortage issue has to be addressed immediately. You could
delete unwanted articles by running <literal>doexpire</literal>, or add
more disk space at the OS level. The latter seems the better option.
</para>
</section>
<section><title>Disk space</title>
<para>
The <literal>$NEWSBIN</literal> area occupies a fixed amount of space.
Since the binaries do not grow once installed, you do not have to worry
about disk shortage here. The areas that take up more space as feeds
come in are <literal>$NEWSCTL</literal> and
<literal>$NEWSARTS</literal>. <literal>$NEWSCTL</literal> has log files
that keep growing with each feed, and as articles are digested in huge
numbers, <literal>$NEWSARTS</literal> continues to grow. Also, if
articles are being archived on expiry, you will need space for that.
Allocate a few GB of disk space for <literal>$NEWSARTS</literal>,
depending on the number of hierarchies you subscribe to and the feeds
that come in every day. <literal>$NEWSCTL</literal> grows in lesser
proportion than <literal>$NEWSARTS</literal>; allocate space for it
accordingly.
</para>
</section>
<section><title>CPU load and RAM usage</title>
<para>With modern C-News and NNTPd, there is very little usage of these
system resources for processing news article flow. Key components like
<literal>newsrun</literal> or <literal>sendbatches</literal> do not load
the system much, except for cases where you have a very heavy flow of
compressed outgoing batches and the compression utility is run by
<literal>sendbatches</literal> frequently. <literal>newsrun</literal> is
amazingly efficient in the current C-News release. Even when it takes
half an hour to digest a large consignment of batches, it hardly loads
a slow 200 MHz Pentium CPU or consumes much RAM in a 64 MB
system.</para>
<para>One thing which does slow down a system is a large number of
users connecting over NNTP to browse newsgroups. We do not have
empirical figures off-hand to provide guidance on
resource consumption for this, but we have found that the load on the
CPU and RAM for a certain number of active users invoking
<literal>nntpd</literal> is more than with an equal number of
users connecting to the POP3 port of the same system for pulling
out mailboxes. A few hundred active NNTP users can really slow down
a dual-P-III Intel Linux server, for instance. This loading has no
bearing on whether you are using INN or <literal>nntpd</literal>;
both have practically identical implementations for NNTP
<emphasis>reading</emphasis> and differ only in their handling of
feeds.</para>
<para>Another situation which will slow down your Usenet news server is
when downstream servers connect to you for pulling out NNTP feeds using
the pull method. This has been mentioned before. This can really load
your server's I/O system and CPU.</para>
</section>
<section><title>The <literal>in.coming/bad</literal> directory</title>
<para>
The <literal>in.coming</literal> directory is where batches and
articles reside after you have received feeds from your NDNs and before
they are processed. Checking this directory regularly to see if there
are batches is a good way of determining that feeds are coming in.
Batches and individual articles follow different nomenclature: names
like <literal>nntp.GxhsDj</literal> indicate batches, while individual
articles have names beginning with digits, like
<literal>0.10022643380.t</literal>.
</para>
<para>
The <literal>bad</literal> sub-directory under
<literal>in.coming</literal> holds batches and articles that
encountered errors while being processed by
<literal>relaynews</literal>. You will have to look into this directory
to determine the cause. Ideally, this directory should be empty.
</para>
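<para>A quick check can be scripted. This is only a sketch, assuming
the conventional spool location; substitute your own
<literal>$NEWSARTS</literal>:</para>

```shell
#!/bin/sh
# Count pending and bad batches in the C-News spool.
# The default spool path is an assumption; override with $NEWSARTS.
SPOOL=${NEWSARTS:-/var/spool/news}
pending=$(ls "$SPOOL/in.coming" 2>/dev/null | grep -v '^bad$' | grep -c '')
bad=$(ls "$SPOOL/in.coming/bad" 2>/dev/null | grep -c '')
echo "pending batches: $pending, bad batches: $bad"
```

<para>A steadily non-zero pending count means feeds are coming in; a
non-zero bad count calls for investigation.</para>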
</section>
<section><title>Long pending queues in <literal>out.going</literal></title>
<para>TO BE ADDED.</para>
</section>
<section><title>Problems with <literal>nntpxmit</literal> and <literal>nntpsend</literal></title>
<para>TO BE ADDED.</para>
</section>
<section><title>The <literal>junk</literal> and <literal>control</literal> groups</title>
<para>
Control messages are those that carry a newgroup, rmgroup, cancel or
checkgroup request. Such messages result in
<literal>relaynews</literal> calling the appropriate script, and on
execution a message is mailed to the administrator about the action
taken. These control messages are stored in the control directory of
<literal>$NEWSARTS</literal>. For the propagation of such messages, one
must subscribe to the control hierarchy.
</para>
<para>
When your news system determines that a certain article belongs to no
newsgroup you have subscribed to, it is ``junked,''
<emphasis>i.e.</emphasis> the article appears in the junk directory.
This directory plays a key role in transferring articles to your NDNs,
as they would subscribe to the junk hierarchy to receive feeds. If you
are a leaf node, there is no reason why articles should pile up here;
keep deleting them on a daily basis.
</para>
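<para>A daily sweep is easy to script. This is a sketch, with the spool
path assumed; it can be run from <literal>cron</literal> as the news
user:</para>

```shell
#!/bin/sh
# Remove junked articles more than a day old.
# The spool path is an assumption; override with $NEWSARTS.
JUNK=${NEWSARTS:-/var/spool/news}/junk
if [ -d "$JUNK" ]; then
    find "$JUNK" -type f -mtime +1 -exec rm -f {} \;
fi
echo "junk sweep of $JUNK done"
```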
</section>
</chapter>

<chapter><title>Our perspective</title>
<para>
This chapter has been added to allow us to share our perspective on
certain technical choices. Issues which are more a matter of opinion
than of detail are discussed here.
</para>
<section id=feedefficiency><title>Efficiency issues of NNTP</title>
<para>
To understand why NNTP is often an inappropriate choice for
newsfeeds, we need to understand TCP's sliding window protocol
and the nature of NNTP. NNTP is an appalling waste of bandwidth
for most bulk article transfer situations, for the
following simple reasons:
</para>
<itemizedlist>
<listitem><para>
<emphasis>No compression</emphasis>: articles are transferred in plain text.
</para></listitem>
<listitem><para>
<emphasis>No article transmission restart</emphasis>: if a
connection breaks halfway through an article, the next round
will have to start with the beginning of the article.
</para></listitem>
<listitem><para>
<emphasis>Ping-pong protocol</emphasis>: NNTP is unsuitable for
bulk streaming data transfer.
</para></listitem>
</itemizedlist>
<para>
A word of explanation on the ping-pong issue is perhaps
needed here. TCP uses a sliding window mechanism to pump out
data in one direction very rapidly, and can achieve near
wire speeds under most circumstances. However, this only
works if the application layer protocol can aggregate a
large amount of data and pump it out without having to stop
every so often, waiting for an ack or a response from the
other end's application layer. This is precisely why sending
one file of 100 Mbytes by FTP takes so much less clock time
than 10,000 files of 10 Kbytes each, all other parameters
remaining unchanged. The trick is to keep the sliding window
sliding smoothly over the outgoing data, blasting packets
out as fast as the wire will carry them, without ever
allowing the window to empty out while you wait for an ack.
Protocols which require short bursts of data from either end
constantly, <emphasis>e.g.</emphasis> in the case of remote
procedure calls, are called ``ping pong protocols'' because they
remind you of a table-tennis ball.
</para>
<para>
With NNTP, this is precisely the problem. The average size
of Usenet news messages, including header and body, is
3 Kbytes. When thousands of such articles are sent out by
NNTP, the sending server has to send the message ID of the
first article, then wait for the receiving server to respond
with a ``yes'' or ``no.'' Once the sending server gets the
``yes,'' it sends out that article, and waits for an ``ok''
from the receiving server. Then it sends out the message ID
of the second article, and waits for another ``yes'' or
``no.'' And so on. The TCP sliding window never gets to do
its job.
</para>
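<para>A back-of-the-envelope calculation shows how low the ceiling is.
The 3 Kbyte average article size is from the discussion above; the
300 ms round-trip time and the two round trips per article are our
illustrative assumptions:</para>

```shell
#!/bin/sh
# Throughput ceiling of a strict offer/response article dialogue.
awk 'BEGIN {
    art_kb = 3      # average article size, Kbytes (from the text)
    rtt    = 0.3    # assumed round-trip time, seconds
    trips  = 2      # assumed: one RTT for the offer, one for the transfer
    rate = 1 / (trips * rtt)                  # articles per second
    printf "ceiling: %.1f articles/s, about %.1f Kbytes/s\n",
           rate, rate * art_kb
}'
```

<para>About 5 Kbytes/s, no matter how fast the wire is: the link sits
idle while each end waits for the other, which is exactly the sliding
window running dry.</para>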
<para>
This sub-optimal use of TCP's data pumping ability, coupled with
the absence of compression, make for a protocol which is great
for synchronous connectivity, <emphasis>e.g.</emphasis> for news
reading or real-time
updates, but very poor for batched transfer of data which can be
delayed and pumped out. All these are precisely reversed in the
case of UUCP over TCP.
</para>
<para>
To decide which protocol, UUCP over TCP or NNTP, is appropriate
for your server, you must address two questions:
</para>
<orderedlist>
<listitem><para>
How much time can your server afford to wait from the time
your upstream server receives an article to the time it
passes it on to you?
</para></listitem>
<listitem><para>
Are you receiving the same set of hierarchies from multiple
next-door neighbour servers, <emphasis>i.e.</emphasis> is your
newsfeed flow pattern a mesh instead of a tree?
</para></listitem>
</orderedlist>
<para>
If your answers to the two questions above are ``messages cannot
wait'' and ``we operate in a mesh'', then NNTP is the correct
protocol for your server to receive its primary feed(s).
</para>
<para>
In most cases, carrier-class servers operated by major service
providers do not want to accept even a minute's delay from the
time they receive an article to the time they retransmit it out.
They also operate in a mesh with other servers operated by their
own organisations (<emphasis>e.g.</emphasis> for redundancy) or
others. They usually
sit very close to the Internet backbone,
<emphasis>i.e.</emphasis> with Tier 1 ISPs,
and have extremely fast Internet links, usually more than
10 Mbits/sec. The amount of data that flows out of such servers
in outgoing feeds is more than the amount that comes in, because
each incoming article is retained, not for local consumption,
but for retransmission to others lower down in the flow. And
these servers boast of a retransmission latency of less than 30
seconds, <emphasis>i.e.</emphasis> I will retransmit an article
to you within 30 seconds of my having received it.
</para>
<para>
However, if your server is used by a company for making Usenet
news available for its employees, or by an institute to make the
service available for its students and teachers, then you are
not operating your server in a mesh pattern, nor do you mind it
if messages take a few hours to reach you from your upstream
neighbour.
</para>
<para>
In that case, you have enormous bandwidth to conserve by moving
to UUCP. Even if, in this Internet-dominated era, you have no
one to supply you with a newsfeed using dialup point-to-point
links, you can pick up a compressed batched newsfeed using UUCP
over TCP, over the Internet.
</para>
<para>
In this context, we want to mention Taylor UUCP, an excellent
UUCP implementation available under GNU GPL. We use this UUCP
implementation in preference to the bundled UUCP systems offered
by commercial Unix vendors, even for dialup connections, because
it is far more stable, offers higher performance, and always supports
file transfer restart. Over TCP/IP, Taylor is the only one we
have tried, and we have no wish to try any others.
</para>
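<para>For illustration, a Taylor UUCP <literal>sys</literal> entry for
picking up batches over TCP might look like the sketch below. The
system name, host name and login details are all invented; consult the
Taylor UUCP documentation for the authoritative syntax.</para>

```
system upstreamnews
time any
port type tcp
address news.upstream.example.com
chat "" \r ogin: Unews word: feedsecret
```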
<para>
Apart from its robustness, Taylor UUCP has one invaluable
feature critical to large Usenet batch transfers: file transfer
restart. If it is transferring a 10 MB batch, and the connection
breaks after 8 MB, it will restart precisely where it left off
last time. Therefore, no bytes of bandwidth are wasted, and
queues never get stuck forever. </para>
<para>
Over NNTP, since there is no batching, transfers happen one
article at a time. Considering the (relatively) small size of an
article compared to multi-megabyte UUCP batches, one would
expect that an article would never pose a major problem while
being transported; if it can't be pushed across in one attempt,
it'll surely be copied the next time. However, we have
experienced entire NNTP feeds getting stuck for days on end
because of one article, with logs showing the same article
breaking the connection over and over again while being
transferred <footnote><para>
This lack of a restart facility is something NNTP shares with
its older cousin, SMTP, and we have often seen email messages
getting stuck in a similar fashion over flaky data links. In
many such networks which we manage for our clients, we have
moved the inter-server mail transfer to Taylor UUCP, using UUCP
over TCP.</para></footnote>. Some rare articles can be
more than a megabyte in size, particularly in
<literal>comp.binaries</literal>. In each such incident, we have
had to manually edit the queue file on the transmitting server
and remove the offending article from the head of the queue.
Taylor UUCP, on the other hand, has never given us a single
hiccup with blocked queues.
</para>
<para>
We feel that the overwhelming majority of servers offering the
Usenet news service are at the leaf nodes of the Usenet news
flow, not at the heart. These servers are usually connected in a
tree, with each server having one upstream ``parent node'', and
multiple downstream ``child nodes.'' These servers receive their
bulk incoming feed from their upstream server, and their users
can tolerate a delay of a few hours for articles to move in and
out. If your server is in this class, we feel you should
consider using UUCP over TCP and transfer compressed batches.
This will minimise bandwidth usage, and if you operate using
dialup Internet connections, it will directly reduce your
expenses.
</para>
<para>
A word about the link between mesh-patterned newsfeed flow and
the need to use NNTP. If your server is receiving primary ---
as against trickle --- feeds from multiple next-door neighbours,
then you have to use NNTP to receive these feeds. The reason
lies in the way UUCP batches are accepted. UUCP batches are
received in their entirety into your server, and then they are
uncompressed and processed. When the sending server is giving
you the batch, it is not getting a chance to go through the
batch article by article and ask your server whether you have or
don't have each article. This way, if multiple servers give you
large feeds for the same hierarchies, then you will be bound to
receive multiple copies of each article if you go the UUCP way.
All the gains of compressed batches will then be neutralised.
NNTP's <literal>IHAVE</literal> and <literal>SENDME</literal>
dialogue in effect
permits precisely this double-check for each article, and thus
you don't receive even a single article twice.
</para>
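<para>Schematically, the NNTP side of this dialogue (response codes are
as defined in RFC 977; the message-IDs are invented) runs like
this:</para>

```
sender:   IHAVE <first.article@example.com>
receiver: 335 send article to be transferred
sender:   (article text, ending with a line containing a single dot)
receiver: 235 article transferred ok
sender:   IHAVE <second.article@example.com>
receiver: 435 article not wanted - do not send it
```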
<para>
For Usenet servers which connect to the Internet periodically
using dialup connections to fetch news, the UUCP option is
especially important. Their primary incoming newsfeed cannot be
pushed into them using queued NNTP feeds, for reasons described
in the <link linkend="dialupnonntp">paragraph</link> above. These
hapless servers are usually forced to pull out their articles
using a pull NNTP feed, which is often very slow. This may lead
to long connect times, repeat attempts after every line break,
and high Internet connection charges.
</para>
<para>
On the other hand, we have been using UUCP over TCP and
<literal>gzip</literal>'d batches for more than five years now
in a variety of sites. Even today, a full feed of all eight
standard hierarchies, plus the full
<literal>microsoft</literal>, <literal>gnu</literal>
and <literal>netscape</literal> hierarchies, minus
<literal>alt</literal> and <literal>comp.binaries</literal>, can
comfortably be handled in just a few hours of connect time every
night, dialing up to the
Internet at 33.6 or 56 Kbits/sec. We believe that the proverbial
`full feed' with all hierarchies including
<literal>alt</literal> can be handled comfortably with a 24-hour
link at 56 Kbits/sec, provided you forget about NNTP feeds. We
usually get compression ratios of 4:1 using
<literal>gzip -9</literal> on our news batches, incidentally.
</para>
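<para>The arithmetic behind the ``few hours'' claim is easy to redo for
your own volumes. The 4:1 compression ratio is from our experience
above; the 200 MB/day raw volume and the 3.5 Kbytes/s of effective
modem payload are illustrative assumptions:</para>

```shell
#!/bin/sh
# Nightly connect time for a compressed batched feed.
awk 'BEGIN {
    raw_mb    = 200    # assumed raw article volume per day, MB
    ratio     = 4      # gzip -9 compression ratio (from the text)
    link_kb_s = 3.5    # assumed effective payload of a 33.6K modem
    compressed_kb = raw_mb * 1024 / ratio
    hours = compressed_kb / link_kb_s / 3600
    printf "%.0f MB compressed, about %.1f hours of connect time\n",
           compressed_kb / 1024, hours
}'
```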
</section>
<section><title>C-News+NNTPd or INN?</title>
<para>
INN and CNews are the two most popular free software implementations
of Usenet news. Of these two, we prefer CNews, primarily because
we have been using it across a very large range of Unixen for more
than one decade, starting from its earliest release --- the so-called
``Shellscript release'' --- and we have yet to see a need to
change.<footnote><para>One of us did his first installation with
B-News, actually, at IIT Mumbai. Then we rapidly moved from there to
the C-News Shellscript Release, then the C-News Performance Release and
the C-News Cleanup Release, and our current release has fixed some bugs
in the latest Cleanup Release.</para></footnote>
</para>
<para>
We have seen INN, and we are not comfortable with a software
implementation which puts so much functionality inside one
executable. This reminds us of Windows NT, Netscape Communicator,
and other complex and monolithic systems, which make us uncomfortable
with their opaqueness. We feel that CNews' architecture, which comprises
many small programs, intuitively fits the Unix approach of building
large and complex systems, where each piece can be understood, debugged,
and if needed, replaced, individually.
</para>
<para>
Secondly, we seem to see the move towards INN accompanied by a move
towards NNTP as a primary newsfeed mechanism. This is no fault of INN;
we suspect it is a sort of cultural difference between INN users and
CNews users. We find the issue of UUCP versus NNTP for batched newsfeeds
a far more serious issue than the choice of CNews versus INN. We simply
cannot agree with the idea that NNTP is an appropriate protocol for bulk
Usenet feeds for most sites. Unfortunately, we seem to find that most
sites which are more comfortable using INN seem to also prefer NNTP over
UUCP, for reasons not clear to us.
</para>
<para>
Our comments should not be taken as expressing any reservation about
INN's quality or robustness. Its popularity is testimony to its
quality; it most certainly ``gets the job done'' as well as anything
else. In addition, there are a large number of commercial Usenet news
server implementations which have started with the INN code; we do not
know of any which have started with the CNews code. The Netwinsite DNews
system and the Cyclone Typhoon, we suspect, both are INN-spired.
</para>
<para>
We will recommend CNews and NNTPd over INN, because we are more
comfortable with the CNews architecture for reasons given above, and we
do not run carrier-class sites. We will continue to support, maintain and
extend this software base, at least for Linux. And we see no reason for
the overwhelming majority of Usenet sites to be forced to use anything
else. Your viewpoints are welcome.
</para>
<para>
Had we been setting up and managing carrier-class sites with their
near-real-time throughput requirements, we would probably not have
chosen CNews. And for those situations, our opinion of NNTP versus
compressed UUCP has been discussed in <xref linkend="feedefficiency"/>
</para>
<para>
Suck and Leafnode have their place in the range of options, where they
appear to be attractive for novices who are intimidated by the ``full
blown'' appearance of CNews+NNTPd or INN. However, we run CNews + NNTPd
even on Linux laptops. We suspect INN can be used this way too. We do
not find these ``full blown'' implementations any more resource
hungry than their simpler cousins. Therefore, other than administration
and configuration familiarity, we don't see any reason why even a
solitary end-user would choose Leafnode or Suck over CNews+NNTPd. As
always, contrary opinions are invited.
</para>
</section>
</chapter>

<chapter><title>Principles of Operation</title>
<para>Here we discuss the basic concepts behind the operation of a Usenet news
system.</para>
<section><title>Newsgroups and articles </title>
<para>A Usenet news article sits in a file or in some other on-disk
data structure on the disks of a Usenet server, and its contents look
like this:</para>
<programlisting>
<![CDATA[
Xref: news.starcomsoftware.com starcom.tech.misc:211 starcom.tech.security:452
Newsgroups: starcom.tech.misc,starcom.tech.security
Path: news.starcomsoftware.com!purva!shuvam
From: Shuvam <shuvam@starcomsoftware.com>
Subject: "You just throw up your hands and reboot" (fwd)
Content-Type: TEXT/PLAIN; charset=US-ASCII
Distribution: starcom
Organization: Starcom Software Pvt Ltd, India
Message-ID: <Pine.LNX.4.31.0107022153490.30462-100000@starcomsoftware.com>
Mime-Version: 1.0
Date: Mon, 2 Jul 2001 16:27:57 GMT

Interesting quote, and interesting article.

Incidentally, comp.risks may be an interesting newsgroup to follow. We
must be receiving the feed for this group on our server, since we
receive all groups under comp.*, unless specifically cancelled. Check it
out sometime.

comp.risks tracks risks in the use of computer technology, including
issues in protecting ourselves from failures of such stuff.

Shuvam

> Date: Thu, 14 Jun 2001 08:11:00 -0400
> From: "Chris Norloff" <cnorloff@norloff.com>
> Subject: NYSE: "Throw up your hands and reboot"
>
> When the New York Stock Exchange computer systems crashed for 85
> minutes (8 Jun 2001), Andrew Brooks, chief of equity trading at
> Baltimore mutual fund giant T. Rowe Price, was quoted as saying "Hey,
> we're all subject to the vagaries of technology. It happens on your
> own PC at home. You just throw up your hands and reboot."
>
> http://www.washingtonpost.com/ac3/ContentServer?articleid=A42885-2001Jun8&pagename=article
>
> Chris Norloff
>
>
> This is from --
>
> From: risko@csl.sri.com (RISKS List Owner)
> Newsgroups: comp.risks
> Subject: Risks Digest 21.48
> Date: Mon, 18 Jun 2001 19:14:57 +0000 (UTC)
> Organization: University of California, Berkeley
>
> RISKS-LIST: Risks-Forum Digest Monday 19 June 2001
> Volume 21 : Issue 48
>
> FORUM ON RISKS TO THE PUBLIC IN COMPUTERS AND RELATED SYSTEMS (comp.risks)
> ACM Committee on Computers and Public Policy,
> Peter G. Neumann, moderator
>
> This issue is archived at <URL:http://catless.ncl.ac.uk/Risks/21.48.html>
> and by anonymous ftp at ftp.sri.com, cd risks .
>
]]>
</programlisting>
<para>A Usenet article's header is very interesting if you want to learn
about the functioning of the Usenet. The <literal>From</literal>,
<literal>Subject</literal>, and <literal>Date</literal> headers are
familiar to anyone who has used email. The <literal>Message-ID</literal>
header contains a unique ID for each message, and is present in each
email message, though not many non-technical email users know about it.
The <literal>Content-Type</literal> and <literal>Mime-Version</literal>
headers are used for MIME encoding of articles, attaching files, and
so on, just like in email messages.</para>
<para>The <literal>Organization</literal> header is an informational header
which is supposed to carry some information identifying the organisation
to which the author of the article belongs. What remains now are the
<literal>Newsgroups</literal>, <literal>Xref</literal>,
<literal>Path</literal> and <literal>Distribution</literal> headers.
These are special to Usenet articles and are very important.</para>
<para>The <literal>Newsgroups</literal> header specifies which newsgroups
this article should belong to. The <literal>Distribution</literal>
header, sadly under-utilised in today's globalised Internet world,
allows the author of an article to specify how far the article will be
re-transmitted. The author of an article, working in conjunction with
well-configured networks of Usenet servers, can control the ``radius'' of
replication of his article, thus posting an article of local significance
into a newsgroup and setting the <literal>Distribution</literal> header to
some suitable value, <emphasis>e.g.</emphasis> <literal>local</literal>
or <literal>starcom</literal>, to prevent the article from being relayed
to servers outside the specified domain.</para>
<para>The <literal>Xref</literal> header specifies the precise
<emphasis role="strong">article number</emphasis> of this article in each of the
newsgroups in which it is inserted, for the current server. When an
article is copied from one server to another as part of a newsfeed,
the receiving server throws away the old <literal>Xref</literal> header
and inserts its own, with its own article numbers. This indicates an
interesting feature of the Usenet system: each article in a Usenet server
has a unique number (an integer) for each newsgroup it is a part of.
Our sample above has been added to two newsgroups on our server, and has
the article numbers 211 and 452 in those groups. Therefore, any Usenet
client software can query our server and ask for article number 211 in
the newsgroup <literal>starcom.tech.misc</literal> and get this article.
Asking for article number 452 in <literal>starcom.tech.security</literal>
will fetch the article too. On another server, the numbers may be very
different.</para>
<para>The <literal>Path</literal> header specifies the list of machines through
which this article has travelled before it has reached the current
server. UUCP-style syntax is used for this string. The current
example indicates that a user called <literal>shuvam</literal> first
wrote this article and posted it onto a computer which calls itself
<literal>purva</literal>, and this computer then transferred this article
by a newsfeed to <literal>news.starcomsoftware.com</literal>. The
<literal>Path</literal> header is critical for breaking loops in
newsfeeds, and will be discussed in detail later.</para>
<para>Our sample article will sit in the two newsgroups listed above
forever, unless expired. The Usenet software on a server is usually
configured to expire articles based on certain conditions,
<emphasis>e.g.</emphasis> after it's older than a certain number of
days. The C-News software we use allows expiry control based on the
newsgroup hierarchy and the type of newsgroup, <emphasis>i.e.</emphasis>
moderated or unmoderated. Against each class of newsgroups, it allows
the administrator to specify a number of days after which the article
will be expired. It is possible for an article to control its own
expiry, by carrying an <literal>Expires</literal> header specifying a
date and time. Unless overridden in the Usenet server software, the
article will be expired only after its explicit expiry time is
reached.</para>
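<para>Since a Usenet article follows the same RFC-822-style layout as an
email message, its headers can be examined with entirely ordinary tools.
The following Python sketch (purely illustrative, using an abbreviated
copy of the sample article above; it is not part of any news software)
pulls out the headers discussed here:</para>

```python
# Minimal sketch: parse the headers of a Usenet article with
# Python's standard email parser (the format is RFC-822-like).
from email.parser import Parser

raw_article = """\
Newsgroups: starcom.tech.misc,starcom.tech.security
From: Shuvam <shuvam@starcomsoftware.com>
Subject: "You just throw up your hands and reboot" (fwd)
Message-ID: <Pine.LNX.4.31.0107022153490.30462-100000@starcomsoftware.com>
Distribution: starcom
Date: Mon, 2 Jul 2001 16:27:57 GMT

Interesting quote, and interesting article.
"""

msg = Parser().parsestr(raw_article)
groups = [g.strip() for g in msg["Newsgroups"].split(",")]
print(groups)             # the newsgroups this article is filed under
print(msg["Message-ID"])  # the globally unique article identifier
```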
</section>
<section><title>Of readers and servers</title>
<para>Computers which access Usenet articles are broadly of two classes:
the readers and the servers. A Usenet server carries a repository of
articles, manages them, handles newsfeeds, and offers its repository to
authorised readers to read. A Usenet reader is merely a computer with
the appropriate software to allow a user to access a server, fetch
articles, post new articles, and keep track of which articles it has
read in each newsgroup. In terms of functionality, Usenet reading
software is less interesting to a Usenet administrator than a Usenet
server software. However, in terms of lines of code, the Usenet reader
software can often be much larger than Usenet server software, primarily
because of the complexities of modern GUI code.</para>
<para>Most modern computers almost exclusively access Usenet servers using
the NNTP (Network News Transfer Protocol) for reading and posting. This
protocol can also be used for inter-server communication, but those
aspects will be discussed later. The NNTP protocol, like any other
well-designed TCP-based Internet protocol, carries ASCII commands and
responses terminated with <literal>CR-LF</literal>, and comprises a
sequence of commands, somewhat reminiscent of the POP3 protocol for
email. Using NNTP, a Usenet reader program connects to a Usenet server,
asks for a list of active newsgroups, and receives this (often huge)
list. It then sets the ``current newsgroup'' to one of these, depending
on what the user wants to browse through. Having done this, it gets the
meta-data of all current articles in the group, including the author,
subject line, date, and size of each article, and displays an index of
articles to the user.</para>
<para>The user then scans through this list, selects an article, and
asks the reader to fetch it. The reader gives the article number of
this article to the server, and fetches the full article for the user
to read through. Once the user finishes his NNTP session, he exits,
and the reader program closes the NNTP socket. It then (usually)
updates a local file in the user's home area, keeping track of which
news articles the user has read. These articles are typically not shown
to the user next time, thus allowing the user to progress rapidly to new
articles in each session. The reader software is helped along in this
endeavour by the <literal>Xref</literal> header, using which it knows
all the different identities by which a single article is identified
in the server. Thus, if you read the sample article given above by
accessing <literal>starcom.tech.misc</literal>, you'll never be shown
this article again when you access <literal>starcom.tech.misc</literal>
or <literal>starcom.tech.security</literal>; your reader software will
do this by tracking the <literal>Xref</literal> header and mapping
article numbers.</para>
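<para>The bookkeeping just described can be sketched in a few lines of
Python. This is illustrative only, not actual newsreader code: it splits
an <literal>Xref</literal> header into (newsgroup, article number) pairs
so that every identity of an article can be marked read at once:</para>

```python
# Sketch: map an Xref header to (newsgroup, article-number) pairs.
# Xref format: "<servername> group1:num1 group2:num2 ..."
def parse_xref(xref):
    parts = xref.split()
    server, entries = parts[0], parts[1:]
    pairs = []
    for entry in entries:
        group, number = entry.rsplit(":", 1)
        pairs.append((group, int(number)))
    return server, pairs

server, pairs = parse_xref(
    "news.starcomsoftware.com "
    "starcom.tech.misc:211 starcom.tech.security:452")
# Marking both pairs as read ensures the article is never shown
# again, whichever of the two groups the user browses next.
```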
<para>When a user posts an article, he first composes his message using
the user interface of his reader software. When he finally gives the
command to send the article, the reader software contacts the Usenet
server using the pre-existing NNTP connection and sends the article to
it. The article carries a <literal>Newsgroups</literal> header with the
list of newsgroups to post to, often a <literal>Distribution</literal>
header with a distribution specification, and other headers
like <literal>From</literal>, <literal>Subject</literal>
<emphasis>etc.</emphasis> These headers are used by the server
software to do the right thing. Special and rare headers like
<literal>Expires</literal> and <literal>Approved</literal> are acted upon
when present. The server assigns a new article number to the article for
each newsgroup it is posted to, and creates a new <literal>Xref</literal>
header for the article.</para>
<para>Transfer of articles between servers is done in various ways, and
is discussed in quite a bit of detail in the section titled
``Newsfeeds'' below.</para>
</section>
<section ><title>Newsfeeds </title>
<section><title> Fundamental concepts</title>
<para>When we try to analyse newsfeeds in real life, we begin to see
that, for most sites, traffic flow is not symmetrical in both
directions. We usually find that one server will feed the bulk
of the world's articles to one or more secondary servers every
day, and receive a few articles written by the users of those
secondary servers in exchange. Thus, we usually find that
articles flow down from the stem to the branches to the leaves
of the worldwide Usenet server network, and not exactly in a totally
balanced mesh flow pattern. Therefore, we use the term
``upstream server'' to refer to the server from which we receive
the bulk of our daily dose of articles, and ``downstream
server'' to refer to those servers which receive the bulk dose
of articles from us.</para>
<para>Newsfeeds relay articles from one server to their ``next door
neighbour'' servers, metaphorically speaking. Therefore, articles
move around the globe, not by a massive number of single-hop
transfers from the originating server to every other server in
the world, but in a sequence of hops, like passing the baton in
a relay race. This increases the latency time for an article
to reach a remote tertiary server after, say, ten hops, but
it allows tighter control of what gets relayed at every hop,
and helps in redundancy, decentralisation of server loads,
and conservation of network bandwidth. In this respect, Usenet
newsfeeds are more complex than HTTP data flows, which
typically use single-hop techniques.</para>
<para>Each Usenet news server therefore has to worry about
newsfeeds each time it receives an article, either by a fresh post
or from an incoming newsfeed. When the Usenet server digests this
article and files it away in its repository, it simultaneously
looks through its database to see which other server it should
feed the article to. In order to do this, it carries out a
sequence of checks, described below.</para>
<para>Each server knows which other servers are its ``next door
neighbours;'' this information is kept in its newsfeed
configuration information. Against each of its ``next door
neighbours,'' there will be a list of newsgroups which it
wants, and a list of distributions. The new article's list of
newsgroups will be matched against the newsgroup list of the
``next door neighbour'' to see whether there's even a single
common newsgroup which makes it necessary to feed the article to
it. If there's a matching newsgroup, and the server's distribution
list matches the article's distribution, then the article is
marked for feeding to this neighbour.</para>
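<para>The matching step above can be sketched as follows. This is a
simplification for illustration: real news software matches against
wildcard pattern lists (<emphasis>e.g.</emphasis>
<literal>comp.*</literal>), whereas this sketch uses exact names:</para>

```python
# Sketch of the outgoing-feed check: an article is queued for a
# neighbour if at least one of its newsgroups is wanted by that
# neighbour AND its distribution is acceptable to it.
def should_feed(article_groups, article_dists, wanted_groups, wanted_dists):
    group_ok = any(g in wanted_groups for g in article_groups)
    dist_ok = any(d in wanted_dists for d in article_dists)
    return group_ok and dist_ok

queue_it = should_feed(
    ["starcom.tech.misc", "starcom.tech.security"],  # Newsgroups header
    ["starcom"],                                     # Distribution header
    wanted_groups={"starcom.tech.misc"},             # neighbour's wants
    wanted_dists={"starcom", "local"})
```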
<para>When the neighbour receives the article as part of the
feed, it performs some sanity checks of its own. The first check
it performs is on the <literal>Newsgroups</literal> header of
the new article. If none of the newsgroups listed there are part
of the active newsgroups list of this server, then the article
can be rejected. An article rejected thus may even be queued for
outgoing feeds to other servers, but will not be digested for
incorporation into the local article repository.</para>
<para>The next check performed is against the
<literal>Path</literal> header of the incoming article. If this
header lists the name of the current Usenet server anywhere,
it indicates that it has already passed through this server at
least once before, and is now re-appearing here erroneously because
of a newsfeed loop. Such loops are quite often configured into
newsfeed topologies for redundancy: ``I'll get the articles from
Server X if not Server Y, and may the first one in win.'' The
Usenet server software automatically detects a duplicate feed
of an article and rejects it.</para>
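<para>The <literal>Path</literal> check amounts to a single membership
test, sketched below for illustration (the real servers do the
equivalent in C):</para>

```python
# Sketch of the Path-based loop check: reject the article if our own
# host name already appears as an element of the Path header, which
# uses UUCP bang syntax ("hostA!hostB!user").
def seen_here_before(path_header, my_name):
    return my_name in path_header.split("!")

rejected = seen_here_before(
    "news.starcomsoftware.com!purva!shuvam",
    "news.starcomsoftware.com")
```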
<para>The next check is against what is called the server's
<emphasis>history database</emphasis>. Every Usenet server has
a history database, which is a list of the message IDs of all
current articles in the local repository. Oftentimes the history
database also carries the message IDs of all messages recently
expired. If the incoming article's message ID matches any of the
entries in the database, then again it is rejected without being
filed in the local repository. This is a second loop detection
method. Sometimes, the mere checking of the article's
<literal>Path</literal> header does not detect all
potential problems, because the problem may be a re-insertion
instead of a loop. A re-insertion happens when the same incoming
batch of news articles is re-fed into the local server, perhaps
after recovering the system's data from tapes after a system
crash. In such cases, there's no newsfeed loop, but there's
still the risk that one article may be digested into the local
server twice. The history database prevents this.</para>
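<para>At its core, the history check is a lookup in a set of message
IDs, as this illustrative sketch shows (real history databases are
on-disk files with expiry timestamps, not in-memory sets):</para>

```python
# Sketch of the history-database check: a set of message IDs of all
# articles currently held (and recently expired) catches duplicates
# that the Path check cannot, such as a re-fed batch.
history = {
    "<Pine.LNX.4.31.0107022153490.30462-100000@starcomsoftware.com>",
}

def accept(message_id, history):
    if message_id in history:
        return False          # duplicate: already digested once
    history.add(message_id)   # remember it for future checks
    return True

first = accept("<abc123@example.org>", history)
again = accept("<abc123@example.org>", history)  # re-insertion attempt
```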
<para>All these simple checks are very effective, and work
across server and software types, as per the Internet standards.
Together, they allow robust and fail-safe Usenet article flow
across the world.</para>
</section>
<section><title>Types of newsfeeds</title>
<para>This section explains the basics of newsfeeds, without getting
into details of software and configuration files.</para>
<section><title>Queued feeds</title>
<para>
This is the commonest method of sending articles from one server
to another, and is followed whenever large volumes of articles
are to be transferred per day. This approach needs a one-time
modification to the upstream server's configuration for each
outgoing feed, to define a new <emphasis>queue.</emphasis>
</para>
<para>
In essence all queued feeds work in the following way. When the
sending server receives an article, it processes it for
inclusion into its local repository, and also checks through all
its outgoing feed definitions to see whether the article needs
to be queued for any of the feeds. If yes, it is added to a
<emphasis>queue file</emphasis> for each outgoing feed. The
precise details
of the queue file can change depending on the software
implementation, but the basic processes remain the same. A queue
file is a list of queued articles, but does not contain the
article contents. Typical queue files are ASCII text files with
one line per article giving the path to a copy of the article in
the local spool area.
</para>
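<para>A queue file of this kind is trivial to build and consume, as the
sketch below illustrates (the exact line format varies between
implementations; some variants also carry the article size or message
ID on each line):</para>

```python
# Sketch: a queue file is one line per queued article, giving the
# article's relative path in the local spool area. The contents of
# the articles themselves are NOT copied into the queue file.
queued = [
    "starcom/tech/misc/211",
    "starcom/tech/security/452",
]
queue_file = "\n".join(queued) + "\n"

# A batcher process would later read the paths back, open each
# article in the spool, and concatenate them into outgoing batches.
paths = queue_file.splitlines()
```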
<para>
Later, a separate process picks up each queue file and creates
one or more <emphasis>batches</emphasis> for each outgoing feed.
A <emphasis>batch</emphasis> is a large file containing multiple
Usenet news
articles. Once the batches are created, various transport
mechanisms can be used to move the files from sending server to
receiving server. You can even use scripted FTP. You only need
to ensure that the batch is picked up from the upstream server
and somehow copied into a designated incoming batch directory in
the downstream server.
</para>
<para>
UUCP has traditionally been the mechanism of choice for batch
movement, because it predates the Internet and wide availability
of fast packet-switched data networks. Today, with TCP/IP
everywhere, UUCP once again emerges as the most logical choice
of batch movement, because it too has moved with the times: it
can work over TCP.
</para>
<para>
NNTP is the <emphasis>de facto</emphasis> mechanism of choice
for moving
queued newsfeeds for carrier-class Usenet servers on the
Internet, and unfortunately, for a lot of other Usenet servers
as well. The reason why we find this choice unfortunate is
discussed in <xref linkend="feedefficiency"/> below. But in NNTP
feeds, an intermediate step of building batches out of queue
files can be eliminated --- this is both its strength and its
weakness.
</para>
<para>
In the case of queued NNTP feeds, articles get added to queue
files as described above. An NNTP transmit process periodically
wakes up, picks up a queue file, and makes an NNTP connection to
the downstream server. It then begins a processing loop where,
for each queued article, it uses the NNTP
<literal>IHAVE</literal>
command to inform the downstream server of the article's
message ID. The downstream server checks its local repository to
see whether it already has the message. If not, it responds with
a <literal>SENDME</literal> response. The transmitting server
then pumps
out the article contents in plaintext form. When all articles
in the queue have been thus processed, the sending server closes
the connection. If the NNTP connection breaks in between due to
any reason, the sending server truncates the queue file and
retains only those articles which are yet to be transmitted,
thus minimising repeat transmissions.
</para>
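<para>The decision logic of this exchange can be sketched without any
real sockets. In the sketch below, <literal>downstream_has</literal>
stands in for the remote server's history lookup; it is an illustrative
model of the protocol, not actual transmit code:</para>

```python
# Pure-logic sketch of the IHAVE exchange: only articles the
# downstream server lacks are actually transmitted.
def transmit(queue, downstream_has):
    sent = []
    for message_id, article in queue:
        # "IHAVE <message_id>" -> downstream checks its repository.
        if message_id in downstream_has:
            continue          # downstream refuses: it already has it
        # Downstream asks for it, so pump out the article text.
        sent.append(article)
        downstream_has.add(message_id)
    return sent

sent = transmit(
    [("<a@x>", "article A"), ("<b@x>", "article B")],
    downstream_has={"<a@x>"})
```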
<para><anchor id="dialupnonntp"/>
A queued NNTP feed works with the sending server making an NNTP
connection to the receiving server. This implies that the
receiving server must have an IP address which is known to the
sending server or can be looked up in the DNS. If the receiving
server connects to the Internet periodically using a dialup
connection and works with a dynamically assigned IP address,
this can get tricky. UUCP feeds suffer no such problems because
the sending server for the newsfeed can be the UUCP server,
<emphasis>i.e.</emphasis>
passive. The receiving server for the feed can be the UUCP
master, <emphasis>i.e.</emphasis> the active party. So the
receiving server can then
initiate the UUCP connection and connect to the sending server.
Thus, if even one of the two parties has a static IP address,
UUCP queued feeds can work fine.
</para>
<para>
Thus, NNTP feeds can be sent out a little faster than the
batched transmission processes used for UUCP and other older
methods, because no batches need to be constructed. However,
NNTP is often used in newsfeeds where it is not necessary and it
results in colossal waste of bandwidth. Before we study
efficiency issues of NNTP versus batched feeds, we will cover
another way feeds can be organised using NNTP: the pull feeds.
</para>
</section>
<section><title>Pull feeds</title>
<para>
This method of transferring a set of articles works only over
NNTP, and requires absolutely no configuration on the
transmitting, or upstream, server. In fact, the upstream server
cannot even easily detect that the downstream server is pulling
out a feed --- it appears to be just a heavy and thorough
newsreader, that's all.
</para>
<para>
This pull feed works by the downstream server pulling out
articles one by one, just like any NNTP newsreader, using the
NNTP <literal>ARTICLE</literal> command with the Message-ID as
parameter.
The interesting detail is how it gets the message IDs to begin
with. For this, it uses an NNTP command, specially designed for
pull feeds, called <literal>NEWNEWS</literal>. This command
takes a newsgroup pattern, a date and a time, for example:
<screen> NEWNEWS comp.* 970815 000000 GMT </screen>
</para>
<para>
This command is sent by the downstream server over NNTP to the
upstream server, and in effect asks the upstream server to list
out all news articles which are newer than 15 August 1997 in the
<literal>comp</literal> hierarchy. The upstream server responds
with a
(often huge) list of message IDs, one per line, ending with a
period on a line by itself.
</para>
<para>
The pulling server then compares each newly received message ID
with its own article database and makes a (possibly shorter)
list of all articles which it does not have, thus eliminating
duplicate fetches. That done, it begins fetching articles one
by one, using the NNTP <literal>ARTICLE</literal> command as
mentioned above.
</para>
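<para>Parsing the response and eliminating duplicate fetches is
straightforward, as this illustrative sketch shows (a real pulling
server would consult its on-disk history database rather than an
in-memory set):</para>

```python
# Sketch of the pull-feed bookkeeping: parse the NEWNEWS response
# (one message ID per line, terminated by a lone ".") and keep only
# the IDs we do not already hold.
response = "<a@x>\r\n<b@x>\r\n<c@x>\r\n.\r\n"

remote_ids = []
for line in response.splitlines():
    if line == ".":
        break                 # end of the multi-line NNTP response
    remote_ids.append(line)

already_have = {"<a@x>"}
to_fetch = [mid for mid in remote_ids if mid not in already_have]
# Each ID in to_fetch would now be retrieved with "ARTICLE <id>".
```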
<para>
In addition, there is another NNTP command,
<literal>NEWGROUPS</literal>,
which allows the NNTP client --- <emphasis>i.e.</emphasis> the
downstream server in
this case --- to ask its upstream server what were the new
newsgroups created since a given date. This allows the
downstream server to add the new groups to its
<literal>active</literal> file.
</para>
<para>
The <literal>NEWNEWS</literal> based approach is usually one of
the most inefficient methods of pulling out a large Usenet feed.
By inefficiency, here we refer to the CPU loads and RAM
utilisation on the upstream server, not on bandwidth usage. This
inefficiency is because most Usenet news servers do not keep
their article databases indexed by hierarchy and date; CNews
certainly does not. This means that a <literal>NEWNEWS</literal>
command issued to an upstream server will put that server into a
sequential search of its article database, to see which articles
fit into the hierarchy given and are newer than the given date.
</para>
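<para>To make the point concrete, here is one conceivable shape for such
an index, keyed by arrival date. This is purely illustrative design on
our part; C-News implements nothing of the sort, and the names are
invented for the sketch:</para>

```python
# Illustrative only: a per-day index of message IDs would let an
# upstream server answer NEWNEWS without scanning its whole
# article database.
from collections import defaultdict

index = defaultdict(list)           # arrival date -> message IDs

def record(date, message_id):
    index[date].append(message_id)

def newnews_since(since):
    # Gather IDs for every day on or after the cutoff date.
    return [mid for day in sorted(index) if day >= since
            for mid in index[day]]

record("1997-08-14", "<old@x>")
record("1997-08-16", "<new@x>")
recent = newnews_since("1997-08-15")
```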
<para>
If pull feeds were to become the most common way of sending out
articles, then all upstream servers would badly need an
efficient way of sorting their article databases to allow each
<literal>NEWNEWS</literal> command to rapidly generate its list
of matching articles. A slow upstream server today might take
minutes to begin responding to a <literal>NEWNEWS</literal>
command, and
the downstream server may time out and close its NNTP connection
in the meanwhile. We have often seen this happening, till we
tweak timeouts.
</para>
<para>
There are basic efficiency issues of bandwidth utilisation
involved in NNTP for news feeds, which are applicable for both
queued and pull feeds. But the problem with
<literal>NEWNEWS</literal> is unique to pull feeds, and relates
to server loads, not bandwidth wastage.
</para>
</section>
</section>
</section>
<section id="controlmsg"> <title>Control messages</title>
<para>
(Discuss control messages. Show examples of actual control messages
if possible. Discuss security issues in the form of control message
storms, and how digital signatures are being used to tackle it. This
sets the ground for <literal>pgpverify</literal> later on.)
</para>
</section>
</chapter>

<chapter><title>Setting up CNews + NNTPd</title>
<section><title>Getting the sources and stuff</title>
<section><title>The sources</title>
<para>C-News software can be obtained from
<literal>ftp://ftp.uu.net/networking/news/transport/cnews/cnews.tar.Z</literal>
and will need to be uncompressed using the BSD
<literal>uncompress</literal> utility or a compatible program. The
tarball is about 650 KBytes in size. It has its own highly intelligent
configuration and installation processes, which are very well
documented. The version that is available is Cleanup Release revision G,
on which our own version is based.</para>
<para>NNTPd is available from
<literal>ftp://ftp.uu.net/networking/news/nntp/nntp.1.5.12.1.tar.Z</literal>.
It has no automatic scripts and processes to configure itself. After
fetching the sources, you will have to follow a set of directions given
in the documentation and configure some C header files. These
configuration settings must be done keeping in mind what you have
specified when you built the C-News sources, because NNTPd and C-News
must work together. Therefore, some key file formats, directory paths,
<emphasis>etc.</emphasis>, will have to be specified identically in both
software systems.</para>
<para>The third software system we use is Nestor. This too is to be
found in the same place where the NNTPd software is kept, at
<literal>ftp://ftp.uu.net/networking/news/nntp/nestor.tar.Z</literal>.
This software compiles to one binary program, which must be run
periodically to process the logs of <literal>nntpd</literal>, the NNTP
server which is part of NNTPd, and report usage statistics to the
administrator. We have integrated Nestor into our source base.</para>
<para>The fourth piece of the puzzle, without which no Usenet server
administrator dares venture out into the wild world of public Internet
newsfeeds, is <literal>pgpverify</literal>.</para>
<para>We have been working with C-News and NNTPd for many years now, and
have fixed a few bugs in both packages. We have also integrated the four
software systems listed above, and added a few features here and there to
make things work more smoothly. We offer our entire source base to
anyone for free download from
<literal>http://www.starcomsoftware.com/proj/news/src/news.tar.gz</literal>.
There are no licensing restrictions on our sources; they are as freely
redistributable as the original components we started with.</para>
<para>When you download our software distribution, you will extract it
to find a directory tree with the following subdirectories and files:</para>
<itemizedlist>
<listitem><para><literal>c-news</literal>: the source tree of the CR.G
software release, with our additions like
<literal>pgpverify</literal> integration, our scripts like
<literal>mail2news</literal>, and pre-created configuration
files.
</para></listitem>
<listitem><para><literal>nntp-1.5.12.1</literal>: the source tree of the
original NNTPd release, with header files pre-configured to fit in
with our configuration of C-News, and our addition of bits and
pieces like Nestor, the log analysis program.
</para></listitem>
<listitem><para><literal>howto</literal>: this document, and its SGML
sources and Makefile.
</para></listitem>
<listitem><para><literal>archives</literal>: a directory containing the
tarballs of the original C-News, NNTPd, Nestor and
<literal>pgpverify</literal> source distributions, in case you want
them. Strictly speaking, the <literal>archives</literal> directory is
not necessary unless you want to study what changes we have made,
what files we have added, to the original sources.
</para></listitem>
<listitem><para><literal>build.sh</literal>: a shellscript you can run
to compile the entire combined source tree and install binaries in the
right places, if you are lucky and all goes well.
</para></listitem>
</itemizedlist>
<para>Needless to say, we believe that our source tree is a better
place to start with than the original components, especially if you
are installing a Usenet server on a Linux box for the first time.
We will be available on email to provide technical assistance should
you run into trouble.</para>
</section>
<section><title>The key configuration files</title>
<para>Once you get the sources, you will need some key configuration
files to seed your C-News system. These configuration files are
actually database tables, and are changing frequently, whenever
newsgroups are created, modified or deleted. These files specify
the list of active newsgroups in the ``public'' Usenet. You can,
and should, add your organisation's internal newsgroups to this
list when you set up your own server, but you will need to know
the list of public standard newsgroups to begin with. This list
can be obtained from the same FTP server by downloading the files
<literal>active.gz</literal> and <literal>newsgroups.gz</literal> from
<literal>ftp://ftp.uu.net/networking/news/config/</literal>. You
can create your own <literal>active</literal> and
<literal>newsgroups</literal> files by retaining a subset of the entries
in these two files. Both these are ASCII text files.</para>
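<para>Since each <literal>active</literal> line has the form
<literal>group himark lomark flag</literal>, trimming the file down to
the hierarchies you want to carry is a simple filter. The sketch below
is illustrative (the group names and counters are made up), not a
replacement for C-News's own tools:</para>

```python
# Sketch: derive a trimmed "active" file by keeping only chosen
# hierarchies. Each line is: <group> <highest-art> <lowest-art> <flag>
active_lines = [
    "comp.risks 0000000452 0000000001 m",
    "comp.os.linux.misc 0000009999 0000000001 y",
    "alt.flame 0000000120 0000000001 y",
]

keep_prefixes = ("comp.",)   # hierarchies we want to carry

subset = [line for line in active_lines
          if line.split()[0].startswith(keep_prefixes)]
```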
<para>Getting the sources from our server will not obviate the need to
get the latest versions of these files from
<literal>ftp.uu.net</literal>. We do not (yet) maintain an up-to-date
copy of these files on our server, and we will add no value to the
original by just mirroring them.</para>
</section>
</section>
<section><title>Compiling and installing</title>
<para>
To install, first make sure you have an entry for a user called
<literal>news</literal> in your <literal>/etc/passwd</literal> file, and
add one if not present. This makes <literal>news</literal> the owner of
the news database. Now download the source from us and untar it in the
home directory of <literal>news</literal>. This creates two main
directories, <emphasis>viz.</emphasis> <literal>c-news</literal> and
<literal>nntp</literal>. To compile and install, run the script
<literal>build.sh</literal> as <literal>root</literal> from the
directory that contains it. It is important that the script run as
<literal>root</literal> because it sets ownerships, and installs and
compiles as <literal>news</literal>, so it needs adequate permissions to
do this. This is a one-step process that puts in place both the C-News
and the NNTP software, setting correct permissions and paths. Following
is a brief description of what <literal>build.sh</literal> does:
</para>
<itemizedlist>
<listitem><para>
Checks for the <literal>OS</literal> platform and exits if
it is not <literal>Linux</literal>.
</para></listitem>
<listitem><para>
Again, exits if you are not running as
<literal>root</literal>.
</para></listitem>
<listitem><para>
Exits if it cannot find the above two directories.
</para></listitem>
<listitem><para>
Compiles <literal>C-News</literal> and exits on error. This builds
all the software. Errors are written to a file called
<literal>make.out</literal>; read it to determine the cause. If the
compilation was successful, it also performs regression tests; these do
not cause an exit on error, but a warning is printed asking you to read
the error file <literal>make.out.r</literal> and fix the problems.
</para></listitem>
<listitem><para>
Performs the above operation in the <literal>nntp</literal> directory, too.
</para></listitem>
<listitem><para>
Checks for the presence of the three key directories:
<literal>$NEWSARTS</literal> (<literal>/var/spool/news</literal>),
which houses the articles; <literal>$NEWSCTL</literal>
(<literal>/var/lib/news</literal>), which contains the configuration,
log and status files; and <literal>$NEWSBIN</literal>
(<literal>/usr/lib/newsbin</literal>), which contains the binaries and
executables needed for the working of the Usenet News system. Tries to
create them if non-existent, and exits on failure.
</para></listitem>
<listitem><para>
Changes the ownership of these directories to
<literal>news.news</literal>. This is important since the entire Usenet
News system runs as user <literal>news</literal>; it will not function
properly as any other user.
</para></listitem>
<listitem><para>
Then starts the installation of C News. It runs
<literal>make install</literal> to install binaries at the right locations;
<literal>make setup</literal> to set the correct paths and umask, create
directories for newsgroups, and determine who will receive reports;
<literal>make ui</literal> to set up inews and injnews; and
<literal>make readpostcheck</literal> to use the readnews, postnews and
checknews utilities provided by C News. The errors, if any, are to be found
in the respective make.out files; <emphasis>e.g.</emphasis>
<literal>make setup</literal> writes errors to <literal>make.out.setup</literal>.
</para></listitem>
<listitem><para>
<literal>newsspool</literal>, which queues incoming
batches in the <literal>$NEWSARTS/in.coming</literal> directory, should run
set-user-id and set-group-id; the script arranges this.
</para></listitem>
<listitem><para>
A symbolic link is made to <literal>/var/lib/news</literal> from
<literal>/usr/lib/news</literal>.
</para></listitem>
<listitem><para>
The NNTP software is installed.
</para></listitem>
<listitem><para>
Sets up the manpages for C News and makes them world-readable.
(The NNTP manpages get installed when that software is installed.)
Compiles the C News documentation, <literal>guide.ps</literal>, and makes
it readable and available in <literal>/usr/doc/packages/news</literal> or
<literal>/usr/doc/news</literal>.
</para></listitem>
<listitem><para>
Checks for the PGP binary and asks the administrator to get
it if not found.
</para></listitem>
</itemizedlist>
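The first two sanity checks above can be sketched in shell as follows. This is a hypothetical rewrite for illustration, not the actual <literal>build.sh</literal> code:

```shell
#!/bin/sh
# Hypothetical sketch of build.sh's first two sanity checks;
# not the actual build.sh code.
check_env() {
    # $1 = OS name (from uname -s), $2 = numeric user id (from id -u)
    if [ "$1" != "Linux" ]; then
        echo "error: this script supports only Linux" >&2
        return 1
    fi
    if [ "$2" -ne 0 ]; then
        echo "error: please run as root (needed to chown to news.news)" >&2
        return 1
    fi
    return 0
}

if check_env "$(uname -s)" "$(id -u)"; then
    echo "OK to proceed with the build"
else
    echo "pre-flight checks failed"
fi
```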
</section>
<section><title>Configuring the system: what to configure and how</title>
<para>Once installed, you now have to configure the system to accept feeds
and batch them for neighbours. You will have to do the following:</para>
<itemizedlist>
<listitem><para><literal>nntpd</literal>:
Copy the compiled nntpd into a directory where
executables are kept and activate it. It runs on port 119 as a daemon
through inetd, unless you have compiled it as stand-alone.
An entry in the services file for nntp would look like this:
<programlisting>nntp 119/tcp # Network News Transfer Protocol</programlisting>
An entry in the inetd.conf file will be:
<programlisting>nntp stream tcp nowait news path-to-tcpd path-to-nntpd</programlisting>
The last two fields in the inetd.conf entry are the paths to the binaries
of the TCP wrapper daemon and the nntp daemon respectively.
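For instance, assuming tcpd lives in /usr/sbin and the nntp daemon was installed as /usr/lib/newsbin/nntpd (both paths are assumptions; substitute your own), the /etc/services line and the /etc/inetd.conf line might read:

```
# /etc/services
nntp    119/tcp         # Network News Transfer Protocol

# /etc/inetd.conf
nntp  stream  tcp  nowait  news  /usr/sbin/tcpd  /usr/lib/newsbin/nntpd
```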
</para></listitem>
<listitem><para><emphasis role=bold>Configuring control files:</emphasis>
There are plenty of control files in <literal>$NEWSCTL</literal> that will
need to be configured before you can
start using the news system. The files mentioned here are explained in some
detail in chapter 8, section 8.1. The files to be configured are dealt with
in detail below.
</para>
<itemizedlist>
<listitem><para><literal>sys</literal>:
One line per system/NDN listing all the
newsgroup hierarchies each system subscribes to. Each line is prefixed
with the system name, and the one beginning with <literal>ME:</literal>
indicates what we are going to receive. The following are typical entries
for this file:
<programlisting>ME:comp,news,misc,netscape</programlisting>
This line indicates which newsgroups your server (as named in the
<literal>whoami</literal> file) has subscribed to and will receive.
<programlisting>server/server.starcomsoftware.com:all,!general/all:f</programlisting>
This is a list of newsgroups this site will pass on to its NDN.
The newsgroups should be specified as a comma-separated list, with no
spaces anywhere in the line. The <literal>f</literal> flag indicates that
the newsgroup name and article number, along with the article's size, will
form one entry in the togo file in the
<literal>$NEWSARTS/out.going</literal> directory.
</para></listitem>
<listitem><para><literal>explist</literal>:
This file has entries indicating
which articles expire, when they expire, and whether they have to be
archived. The order in which the newsgroups are listed is important. An
example follows:
<programlisting>comp.lang.java.3d x 60 /var/spool/news/Archive</programlisting>
This means that articles in comp.lang.java.3d expire after 60 days and
shall be archived in the directory given as the fourth field.
Archiving is optional. The second field, <emphasis>x</emphasis>, indicates
that this line applies to both moderated and unmoderated newsgroups;
<emphasis>m</emphasis> would specify moderated and <emphasis>u</emphasis>
unmoderated groups. If you want to specify an extremely large number as the
expiry period, you can use the word 'never'.
</para></listitem>
<listitem><para><literal>batchparms</literal>:
<literal>sendbatches</literal> is a program that
administers batched transmission of news to other sites. To do this, it
consults the batchparms file. Each line in the file specifies the
behaviour for one site, using five fields.</para>
<screen>server u 100000 100 batcher | gzip -9 | viauux -d gunzip</screen>
<para>
The first field is the site name which matches the entry in the sys
file and has a corresponding directory in $NEWSARTS/out.going by that
name.
</para>
<para>
The second field is the class of the site, 'u' for UUCP and 'n' for
NNTP feeds. A '!' in this field means that batching for this site has
been disabled.
</para>
<para>
The third field is the size of batches to be prepared in bytes.
</para>
<para>
The fourth field is the maximum length of the output queue for
transmission to that site.
</para>
<para>
The fifth field is the command line to be used to build, compress and
transmit batches to that site. It receives the contents of the togo file
on standard input.
</para>
</listitem>
<listitem><para><literal>controlperm</literal>:
This file controls how the news
system responds to control messages. Each line consists of 4-5 fields
separated by white space.</para>
<programlisting>comp,sci tale@uunet.uu.net nrc pv news.announce.newsgroups</programlisting>
<para>
The first field is a newsgroup pattern to which the line applies.
</para>
<para>
The second field is either 'any' or an e-mail address. The latter
specifies that the line applies only to control messages from that
author.
</para>
<para>
The third field is a set of opcode letters indicating which control
operations this line applies to, when they are requested in messages from
the e-mail address mentioned in the second field: 'n' stands for creating
a newsgroup, 'r' for deleting a newsgroup, and 'c' for checkgroup.
</para>
<para>
The fourth field is a set of flag letters indicating how to respond to
a control message that meets all the applicability tests:
<screen>
y Do it.
n Don't do it.
v Report it and include the entire control
message in the report.
q Don't report it.
p Do it iff the control message carries a valid PGP signature.
</screen>
Exactly one of y, n or p must be present.
</para>
<para>
The fifth field, which is optional, will be used if the fourth field
contains a 'p'. It must contain the PGP key ID of the public key to be
used for signature verification.
</para>
</listitem>
<listitem><para><literal>mailpaths</literal>:
This file describes how to reach
the moderators of various hierarchies of newsgroups by mail. Each line
consists of two fields: a news group pattern and an e-mail address. The
first line whose group pattern matches the newsgroup is used. As an
example:
<screen>
comp.lang.java.3d somebody@mydomain.com
all %s@moderators.uu.net
</screen>
In the second line, the %s gets replaced with the group name, with all
dots appearing in the name substituted with dashes.
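This substitution can be sketched in shell as follows (a toy illustration; the address pattern is taken from the example above):

```shell
# Toy sketch of the mailpaths "%s" substitution described above:
# dots in the group name become dashes, and the result replaces %s.
group="comp.lang.java.3d"
moderator_alias="$(printf '%s' "$group" | tr '.' '-')@moderators.uu.net"
echo "$moderator_alias"
```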
</para></listitem>
<listitem><para><emphasis role=bold>Miscellaneous files:</emphasis>
The other files to be modified are:
<itemizedlist>
<listitem><para><literal>mailname:</literal>
Contains the Internet domain name of the
news system. Consider getting one if you don't have it.
</para></listitem>
<listitem><para><literal>organization:</literal>
Contains the default value for the
Organization: header for postings originating locally.
</para></listitem>
<listitem><para><literal>whoami:</literal>
Contains the name of the news system. This
is the site name used in the Path: headers and hence should concur
with the names your neighbours use in their sys files.
</para></listitem>
</itemizedlist>
</para></listitem>
<listitem><para><literal>active </literal>file:
This file specifies one line for each
newsgroup (not just the hierarchy) to be found on your news system. You
will have to get the most recent copy of the active file from
<literal>ftp://ftp.isc.org/usenet/CONFIG/active</literal> and prune it
to delete
newsgroups that you have not subscribed to. Run the script "addgroup"
for each newsgroup in this file; it will create the relevant directories
in the <literal>$NEWSARTS</literal> area. The "addgroup" script takes
two parameters: the name of the newsgroup being created and a flag. The
flag can be any one of the following:
<screen>
y local postings are allowed
n no local postings, only remote ones
m postings to this group must be approved
by the moderator
j articles in this group are only passed and not kept
x posting to this newsgroup is disallowed
=foo.bar articles are locally filed in
"foo.bar" group
</screen>
An entry in this file looks like this:
<programlisting>comp.lang.java.3d 0000003716 01346 m </programlisting>
The first field is the name of the newsgroup. The second field is the
highest article number that has been used in that newsgroup. The
third field is the lowest article number in the group. The fourth
field is the flag, as explained above.
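The per-group "addgroup" runs can be generated from the pruned active file with a short script like the following (a hypothetical helper, shown here on two sample lines; inspect its output before piping it to sh):

```shell
# Hypothetical helper: emit one "cnewsdo addgroup" command per line of a
# pruned active file (fields: group, high article, low article, flag).
cmds=$(awk '{ printf "cnewsdo addgroup %s %s\n", $1, $4 }' <<'EOF'
comp.lang.java.3d 0000003716 01346 m
misc.test 0000000001 00001 y
EOF
)
echo "$cmds"
```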
</para></listitem>
<listitem><para><literal>newsgroups </literal>file:
This contains a one line description
of each newsgroup to be found in the active file. You will have to
get the most recent file from
<literal>ftp://ftp.isc.org/usenet/CONFIG/newsgroups</literal>
and prune it to remove unwanted information. As an example:
<programlisting>comp.lang.java.3d 3D Graphics APIs for the Java language</programlisting>
</para></listitem>
<listitem><para><emphasis role=bold>Create aliases: </emphasis>
These aliases are required for trouble reporting.
Once the system is in place and scripts are run, anomalies and problems
are reported to addresses in the /etc/aliases file. These entries
include email addresses for <literal>newsmaster</literal>,
<literal>newscrisis</literal>, <literal>news</literal>,
<literal>usenet</literal> and <literal>newsmap</literal>.
They should ideally point to an email address that will be
looked at regularly. Arrange for emails to "newsmap" to be
discarded, to minimize the effect of "sendsys bombing" by practical
jokers.
</para></listitem>
<listitem><para><emphasis role=bold>Cron jobs:</emphasis>
Certain scripts, like newsrun, which picks up incoming
batches, and various maintenance scripts, should run through the
news-database owner's cron. A more detailed treatment can be found in
<xref linkend="cronjobs"/>. The cron entries will ideally be for the
following:
<orderedlist>
<listitem><para><literal>newsrun: </literal>
This script processes incoming batches of
articles. Run it as frequently as you want articles to be digested.
</para></listitem>
<listitem><para><literal>sendbatches:</literal>
This script transmits batches to the
NDNs. Set the frequency according to your requirements.
</para></listitem>
<listitem><para><literal>newsdaily:</literal>
This should ideally be run once a day,
since it reports errors and anomalies in the news system.
</para></listitem>
<listitem><para><literal>newswatch:</literal>
This looks for errors and anomalies at a more detailed level, and hence
should be run at least once every hour.
</para></listitem>
<listitem><para><literal>doexpire:</literal>
This script expires old articles as
determined by the explist file. Run this once a day.
</para></listitem>
</orderedlist>
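Put together, the crontab for user news might look like the following. The schedules and the $NEWSBIN subdirectory paths are illustrative assumptions; adjust both to your installation:

```
# crontab for user "news" (vixie-cron syntax;
# paths assume $NEWSBIN = /usr/lib/newsbin)
*/15 *  * * *   /usr/lib/newsbin/input/newsrun
0    *  * * *   /usr/lib/newsbin/batch/sendbatches
30   *  * * *   /usr/lib/newsbin/maint/newswatch
15   0  * * *   /usr/lib/newsbin/maint/newsdaily
45   4  * * *   /usr/lib/newsbin/expire/doexpire
```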
</para></listitem>
<listitem><para>newslog:
Make an entry in the system's syslog.conf
file to log the messages emitted by the nntp daemon into "newslog".
This file should be located in <literal>$NEWSCTL</literal>. The entry will
look like this:
<programlisting>news.debug -/var/lib/news/newslog</programlisting>
</para></listitem>
<listitem><para>Newsboot:
Have newsboot run (as "news", the
news-database owner) when the system boots to clear out debris left
around by crashes.
</para></listitem>
<listitem><para>Add a Usenet mailer in sendmail:
The mail2news program provided as
part of the source code is a handy tool to send an e-mail to a newsgroup
which gets digested as an article. You will have to add the following
ruleset and mailer definition in your sendmail.cf file:</para>
<itemizedlist>
<listitem><para>Under SParse1, add the following:
<programlisting>
R$+ . USENET < @ $=w . > $#usenet $: $1
</programlisting>
</para></listitem>
<listitem><para>Under mailer definitions, define the mailer Usenet as:
<screen>
MUsenet P=/usr/lib/newsbin/mail2news/m2nmailer, F=lsDFMmn,
S=10, R=0, M=2000000, T=X-Usenet/X-Usenet/X-Unix, A=m2nmailer $u
</screen>
</para></listitem>
</itemizedlist>
<para>In order to send a mail to a newsgroup you will now have to suffix
the
newsgroup name with usenet <emphasis>i.e.</emphasis> your To: header
will look like this:
<screen>To: misc.test.usenet@yourdomain.</screen>
The mailer definition of usenet will intercept this mail and post it to
the respective newsgroup, in this case, misc.test</para>
</listitem>
</itemizedlist>
<para>
This, more or less, completes the configuration part.
</para>
</listitem>
</itemizedlist>
</section>
<section><title>Testing the system</title>
<para>
To locally test the system, follow the steps given below:
</para>
<itemizedlist>
<listitem><para>post an article:
Create a local newsgroup
<screen>
cnewsdo addgroup mysite.test y
</screen>
and using <literal>postnews </literal>post an article to it.
</para></listitem>
<listitem><para>Has it arrived in <literal>$NEWSARTS</literal>/in.coming?:
The article should show up in the directory mentioned. Note the nomenclature
of the article.
</para></listitem>
<listitem><para>When newsrun runs:
When newsrun runs through cron, the article disappears from in.coming
directory and appears in <literal>$NEWSARTS</literal>/mysite/test. Look at
how the newsgroups, active, log and history files (but not the errorlog),
and the <literal>.overview</literal> file in
<literal>$NEWSARTS/mysite/test</literal>, reflect the digestion of the
article into the news system.
</para></listitem>
<listitem><para>reading the article:
Try to read the article through readnews or any
news client. If you can read it, you have set up almost everything right.
</para></listitem>
</itemizedlist>
</section>
<section><title><literal>pgpverify</literal> and <literal>controlperms</literal></title>
<para>
As mentioned in <xref linkend="controlmsg"/>, it becomes necessary to
authenticate control messages to protect yourself from being attacked by
pranksters. For this, you will have to configure the
<literal>$NEWSCTL</literal>/controlperm file to declare whose control
messages you are willing to honour, and for which newsgroups, along with
their public key IDs. The controlperm manpage will give you details of the
format.
</para>
<para>
This will work only in association with <literal> pgpverify </literal> which
verifies the Usenet control messages that have been signed using the
<literal>signcontrol</literal> process. The script can be found at
<literal>ftp://ftp.isc.org/pub/pgpcontrol/pgpverify</literal>.
<literal>pgpverify</literal> internally uses the PGP binary, which
will have to be made available in the default executables directory. If you
wish to send control messages for your local news system, you will have to
digitally sign them using the above mentioned "signcontrol" program which is
available at
<literal>ftp://ftp.isc.org/pub/pgpcontrol/signcontrol</literal>. You will
also have to configure the signcontrol program accordingly.
</para>
</section>
<section><title>Feeding off an upstream neighbour</title>
<para>
For external feeds, commercial customers will have to buy them
from a regular News Provider like <literal>dejanews.com</literal>
or <literal>newsfeeds.com</literal>. You will have to specify
to them what hierarchies you want and decide on the mode of
transmission, <emphasis>i.e.</emphasis> UUCP or NNTP, based on
your requirements. Once that is done, you will have to ask them to
initiate feeds, and check <literal>$NEWSARTS/in.coming</literal>
directory to see if feeds are coming in.
</para>
<para>
If your organisation belongs to the academic community or is
otherwise lucky enough to have an NDN server somewhere which is
willing to provide you a free newsfeed, then the payment issue goes
out of the picture, but the rest of the technical requirements
remain the same.
</para>
<para>
One problem with incoming NNTP feeds is that it is far easier to use
(relatively) efficient NNTP inflows if you have a server with a
permanent Internet connection and a fixed IP address. If you are a
small office with a dialup Internet connection, this may not be
possible. In that case, the only way to get incoming newsfeeds by
NNTP may be by using a highly inefficient pull feed.
</para>
</section>
<section><title>Configuring outgoing feeds</title>
<para>
If you are a leaf node, you will only have to send feeds back to your
news provider for your postings in public newsgroups to propagate
to the outside world. To enable this, you need one line in the
<literal>sys</literal> and <literal>batchparms</literal> files
and one directory in <literal>$NEWSARTS/out.going</literal>. If
you are willing to transmit articles to your neighbouring
sites, you will have to configure <literal>sys</literal> and
<literal>batchparms</literal> with more entries. The number of directories
in <literal>$NEWSARTS/out.going</literal> shall increase, too. Refer
to chapter 8, sections 8.1 and 8.2 for a better understanding of
outgoing feeds. Again, you will have to determine how you wish to
transmit the feed: UUCP or NNTP.
</para>
<section><title>By UUCP</title>
<para>For outgoing feeds by UUCP, we recommend that you start with
Taylor UUCP. In fact, this is the UUCP version which forms part
of the GNU Project and is the default UUCP on Linux
systems.</para>
<para>A full treatment of UUCP configuration is beyond the scope of
this document. However, the basic steps will be as follows. First,
you will have to define a ``system'' in your Usenet server for the
NDN (next door neighbour) host. This definition will include various
parameters, including the manner in which your server will call the
remote server, the protocol it will use, <emphasis>etc.</emphasis>
Then an identical process will have to be followed on the NDN
server's UUCP configuration, for your server, so that
<emphasis>that</emphasis> server can recognize
<emphasis>your</emphasis> Usenet server.</para>
<para>Finally, you will need to set up appropriate
<literal>cron</literal> jobs for the user <literal>uucp</literal>
to run <literal>uucico</literal> periodically. Taylor UUCP comes with
a script called <literal>uusched</literal> which may be modified to
your requirements; this script calls <literal>uucico</literal>. One
<literal>uucico</literal> connection will both upload and download
news batches. Smaller sites can run <literal>uusched</literal> even
once or twice a day.</para>
<para>Later versions of this document will include the
<literal>uusched</literal> scripts that we use in Starcom. We use
UUCP over TCP/IP, and we run the <literal>uucico</literal>
connection through an SSH tunnel, to prevent transmission of
UUCP passwords in plain text over the Internet, and our SSH tunnel
is established using public-key cryptography, without passwords
being used anywhere.</para>
</section>
<section><title>By NNTP</title>
<para>For NNTP feeds, you will have to decide whether your server
will be the connection initiator or connection recipient. If you are
the connection initiator, you can send outgoing NNTP feeds more
easily. If you are the connection recipient, then outgoing feeds
will have to be pulled out of your server using the NNTP
<literal>NEWNEWS</literal> command, which will place heavy loads on
your server. This is not recommended.</para>
<para>Connecting to your NDN server for pushing out outgoing feeds
will require the use of the <literal>nntpsend.sh</literal> script,
which is part of the NNTPd source tree. This script will perform
some housekeeping, and internally call the
<literal>nntpxmit</literal> binary to actually send the queued set
of articles out. You may have to provide authentication information
like usernames and passwords to <literal>nntpxmit</literal> to allow
it to connect to your NDN server, in case that server insists on
checking the identity of incoming connections. (You can't be too
careful in today's world.) <literal>nntpsend.sh</literal> will clean
up after an <literal>nntpxmit</literal> connection finishes, and
will requeue any unsent articles for the next session. Thus, even if
there is a network problem, typically nothing is lost and all
pending articles are transmitted next time.</para>
<para>Thus, pushing feeds out <emphasis>via</emphasis> NNTP may mean
setting up <literal>nntpsend.sh</literal> properly, and then
invoking it periodically from <literal>cron</literal>. If your
Usenet server connects to the Internet only intermittently, then the
process which sets up the Internet connection should be extended or
modified to fire <literal>nntpsend.sh</literal> whenever the Internet
link is established. For instance, if you are using the Linux
<literal>pppd</literal>, you can add statements to the
<literal>/etc/ppp/ip-up</literal> script to change user to
<literal>news</literal> and run <literal>nntpsend.sh</literal>.</para>
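For example, a line like the following could be appended to the /etc/ppp/ip-up script. The path to nntpsend.sh is an assumption; use the location where you installed it:

```
# in /etc/ppp/ip-up: push out queued articles when the PPP link comes up
su news -c /usr/lib/newsbin/nntpsend.sh &
```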
</section>
</section>
</chapter>
<chapter><title>Usenet news software</title>
<section><title>CNews and NNTPd</title>
<para>
Once upon a time, when Usenet news was a term not yet invented, the
first recorded attempt to use a UUCP-based email backbone to maintain a
replicated message repository, was called A-News. It connected four
servers in four universities, and was written as Unix shell
scripts.</para>
<para>The designers of A-News had not anticipated how much load users
would put on their simplistic system. A far superior, more sophisticated,
and faster implementation of Usenet news was written later, called
B-News. This was a mix of C and shell scripts, and was designed
much better than A-News, to allow handling of much larger volumes of
messages. B-News v2.x was the current version in around 1990. By 1992 or
so, it had been surpassed by C-News.</para>
<para>C-News was written by Henry Spencer and Geoff Collyer of the
Department of Zoology, University of Toronto, almost entirely in shell
and <literal>awk</literal>, as a replacement for B-News. Once again, the
focus was on adding some extra features and a lot of performance. The
first release was called Shellscript Release, which was deployed by a very
large number of servers worldwide, as a natural upgrade to B-News. This
version of C-News even had upward compatibility with B-News meta-data,
<emphasis>e.g.</emphasis> history files. This was the version of C-News
which was initially rolled out in 1992 or so at the National Centre for
Software Technology (NCST, <literal>http://www.ncst.ernet.in</literal>)
and the Indian Institute of Technologies in India as part of the Indian
ERNET network.</para>
<para>The Shellscript Release was soon followed by a re-write with a lot
more C code, called Performance Release, and then a set of cleanup and
component integration steps leading to the last release called the
Cleanup Release. This Cleanup Release was revised many times, and the
last one was CR.G (Cleanup Release revision G). The version of C-News
discussed in this HOWTO is a set of bug fixes on CR.G.</para>
<para>Since C-News came from shellscript-based antecedents, its
architecture followed the set-of-programs style so typical of Unix,
rather than large monolithic software systems traditional to some other
OSs. All pieces had well-defined roles, and therefore could be easily
replaced with other pieces as needed. This allowed easy adaptation and
upgrading, and never hurt performance, because key components
which did a lot of work at high speed, <emphasis>e.g.</emphasis>
<literal>newsrun</literal>, had been rewritten in C by that time. Even
within the shellscripts, crucial components which handled binary data,
<emphasis>e.g.</emphasis> a component called <literal>dbz</literal>
to manipulate efficient on-disk hash arrays, were C programs with
command-line interfaces, called from scripts.</para>
<para>C-News was born in a world with widely varying network line speeds,
where bandwidth utilisation was a big issue and dialup links with UUCP
file transfers were common. Therefore, it has very strong support for
batched feeds, especially with a variety of compression techniques and
over a variety of fast and slow transport channels. C-News is virtually
unaware of the existence of TCP/IP, other than one or two tiny batch
transport programs like <literal>viarsh</literal>. However, its design
was so modular that there was absolutely no problem in plugging in NNTP
functionality using a separate set of C programs without modifying
a single line of C-News. This was done by a program suite called
NNTPd.</para>
<para>This software suite could work with B-News and C-News article
repositories, and provided the full NNTP functionality. Since B-News
died a gradual death, the combination of C-News and NNTPd became a freely
redistributable, portable, modern, extensible, and high-performance
software suite for Unix Usenet servers. Further refinements were
added later, <emphasis>e.g.</emphasis> <literal>nov</literal>, the News
Overview package and <literal>pgpverify</literal>, a public-key-based
digital signature module to protect Usenet news servers against
fraudulent control messages.</para>
</section>
<section><title>INN</title>
<para>
INN is one of the two most widely used Usenet news server solutions. It
was written by Rich Salz for Unix systems which have a socket API ---
probably all Unix systems do, today.
</para>
<para>
INN has an architecture diametrically opposite to CNews. It is a
monolithic program, which is started at bootup time, and keeps running
till your server OS is shut down. This is like the way high performance
HTTP servers are run in most cases, and allows INN to cache a lot of
things in its memory, including message-IDs of recently posted messages,
<emphasis>etc.</emphasis> This architecture has been discussed
in an interesting paper by the author, where he explains the problems
of the older BNews and CNews systems that he tried to address. Anyone
interested in Usenet software in general and INN in particular should
study this paper.</para>
<para>
INN addresses a Usenet news world which revolves around NNTP, though it
has support for UUCP batches --- a fact that not many INN administrators
seem to talk about. The primary situation where INN works more
efficiently than the CNews-NNTPd combination is in processing
multiple incoming NNTP feeds. For multiple
readers reading and posting news over NNTP, there is no difference
between the efficiency of INN and NNTPd. <xref linkend="innefficiency"/>
discusses the efficiency issues of INN over the earlier CNews
architecture, based on Rich Salz' paper and our analyses of usage
patterns.
</para>
<para>
INN's architecture has inspired a lot of high-performance Usenet news
software, including a lot of commercial systems which address the
``carrier class'' market. That is the market for which the INN
architecture has clear advantages over C-News.
</para>
</section>
<section><title>Leafnode</title>
<para>
This is an interesting software system, to set up a ``small'' Usenet
news server on one computer which only receives newsfeeds but does not
have the headache of sending out bulk feeds to other sites,
<emphasis>i.e.</emphasis> it is a ``leaf node'' in the newsfeed flow
diagram.</para>
<para>This software is a sort of combination of article repository and
NNTP news server, and receives articles, digests and stores them on the
local hard disks, expires them periodically, and serves them to an NNTP
reader. It is claimed that it is simple to manage and is ideal for
installation on a desktop-class Unix or Linux box, since it does not
take up many resources.</para>
<para>Leafnode is based on an appealing idea, but we find no problem
using C-News and NNTPd on a desktop-class box. Its resource consumption is
somewhat proportional to the volume of articles you want it to process,
and the number of groups you'll want to retain for a small team of users
will be easily handled by C-News on a desktop-class computer. An office
of a hundred users can easily use C-News and NNTPd on a desktop computer
running Linux, with 64 MBytes of RAM, IDE drives, and sufficient disk
space. Of course, ease of configuration and management is dependent on
familiarity, and we are more familiar with C-News than with Leafnode. We
hope this HOWTO will help you in that direction.</para>
<para>TO BE EXTENDED AND CORRECTED.</para>
</section>
<section><title>Suck</title>
<para>Suck is a program which lets you pull out an NNTP feed from an NNTP
server and file it locally. It does not contain any article repository
management software, expecting you to do it using some other
software system, <emphasis>e.g.</emphasis> C-News or INN. It can
create batchfiles which can be fed to C-News, for instance. (Well,
to be fair, Suck <emphasis>does</emphasis> have an option to store the
fetched articles in a spool directory tree very much like what is used
by C-News or INN in their article area, with one file per article. You
can later read this raw message spool area using a mail client which
supports the <literal>msgdir</literal> file layout for mail folders,
like MH, perhaps. We don't find this option useful if you're running
Suck on a Usenet server.) Suck finally boils down to a single
command-line program which is invoked periodically, typically from
<literal>cron</literal>. It has a zillion command-line options which
are confusing at first, but later show how mature and finely tunable
the software is.</para>
<para>If you need an NNTP pull feed, then we know of no better programs
than Suck for the job. The <literal>nntpxfer</literal> program which
forms part of the NNTPd package also implements an NNTP pull feed, for
instance, but does not have one-tenth of the flexibility and fine-tuning
of Suck. One of the banes of the NNTP pull feed is connection timeouts;
Suck allows a lot of special tuning to handle this problem. If we had
to set up a Usenet server with an NNTP pull feed, we'd use Suck right
away.</para>
<para>TO BE EXTENDED AND CORRECTED.</para>
</section>
<section><title>Carrier class software</title>
<para>We have touched upon the characteristics of carrier-class Usenet
software in the section where we discuss NNTP efficiency issues. As that
bit shows, the requirements of carrier-class Usenet servers are very
different from those of servers run within organisations and institutes
to provide internal service to their members.</para>
<para>Carrier-class servers are expected to handle a complete feed of all
articles in all newsgroups, including a lot of groups which have what we
call a ``high noise-to-signal ratio.'' They do not have the luxury of
choosing a ``useful'' subset like administrators of internal corporate
Usenet servers do. Secondly, carrier-class servers are expected to turn
articles around very fast, <emphasis>i.e.</emphasis> they are expected to
have very low latency from the moment they receive an article to the
time they retransmit it by NNTP to downstream servers. Third, they are
supposed to provide very high availability, <emphasis>i.e.</emphasis>
they are supposed to be like other carrier class services. This usually
means that they have parallel arrays of computers in load sharing
configurations. And fourth, they usually do not cater to retail
connections for reading and posting articles by human users. Usenet news
carriers usually reserve separate computers to handle retail
connections.</para>
<para>Thus, carrier-class servers do not need to maintain a repository
of articles with the usual residence times of days or weeks, and expire
articles after they age. They only need to focus on super-efficient
re-transmission. These highly specialised servers run software
which receives an article over NNTP, parses it, and immediately re-queues
it for outward transmission to dozens or hundreds of other servers. And
since they work at these high throughputs, their downstream servers
are also expected to be live on the Internet round the clock to receive
incoming NNTP connections from the carrier servers. Therefore, no
batching or long queueing is needed; indeed, batching cannot be used. In
fact, some carrier class servers state that if you wish to receive feeds
from them, then your servers need to be available round the clock and
connected with lines fast enough to take the blast of a full feed. If
you do not fulfil these conditions, your servers will lose articles,
and the carrier is not responsible for the loss.</para>
<para>Therefore, one can almost say that carrier-class servers have
neither article repositories nor queues other than the current message(s)
being re-transmitted. If they fail to connect to five of their fifty
downstream neighbours, or fail to push an article through due to
a transmit error, those five neighbours will never receive that
article later from this server; the article is simply dropped from its
queues. Retries are not part of the game. Therefore, carrier-class
Usenet servers are more like packet routers than servers with
repositories.</para>
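<para>The fire-and-forget behaviour described above can be sketched in a
few lines of Python. This is purely illustrative: the peer names and the
<literal>send</literal> transport below are invented for the example,
and real carrier-class software is vastly more elaborate.</para>

```python
def relay_article(article, peers, send):
    """Offer 'article' to each downstream peer exactly once via
    send(peer, article).  Peers whose transmission fails are simply
    skipped: there is no retry queue, mirroring the router-like
    behaviour of carrier-class servers."""
    delivered = []
    for peer in peers:
        try:
            send(peer, article)
            delivered.append(peer)
        except OSError:
            pass  # transmit error: this peer never sees the article
    return delivered

# Demonstration with a fake transport that refuses one peer.
def flaky_send(peer, article):
    if peer == "down.example.net":
        raise OSError("connection refused")

peers = ["news-a.example.net", "down.example.net", "news-b.example.net"]
print(relay_article("<article@example>", peers, flaky_send))
```

<para>Note that the failed peer is forgotten, not queued; a repository-based
server like C-News or INN would instead batch the article for a later
retry.</para>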
<para>It can be seen why carrier-class software cannot hope to do its
job using batch-oriented repository management software like C-News and
why it needs a totally NNTP-oriented implementation. Therefore, the INN
antecedents of some of these systems are to be expected. We would
<emphasis>love</emphasis> to hear from any Linux HOWTO reader whose
Usenet server needs include carrier-class behaviour.</para>
<para>As far as we know, there is no freely redistributable software
implementation of carrier-class Usenet news servers. There is no reason
why such services cannot be offered on Linux, even Intel Linux, provided
you have fast network links and arrays of servers. Linux as an OS platform
is not an issue here, but free software has not yet been made available
for this niche. Presumably it is because the users of such software are
service providers who earn money using it, and therefore are expected
to be willing to pay for it.</para>
<para>TO BE EXTENDED AND CORRECTED.</para>
</section>
</chapter>

<chapter> <title>What is the Usenet?</title>
<section> <title>Discussion groups </title>
<para>The Usenet is a huge worldwide collection of discussion
groups. Each discussion group has a name, <emphasis>e.g.</emphasis>
<literal>comp.os.linux.announce</literal>, and a collection of messages.
These messages, usually called <emphasis>articles</emphasis>, are posted
by readers like you and me who have access to Usenet servers, and are
then stored on the Usenet servers.</para>
<para>This ability to both read and write into a Usenet newsgroup makes
the Usenet very different from the bulk of what people today call ``the
Internet.'' The Internet has become a colloquial term to refer to the
World Wide Web, and the Web is (largely) read-only. There are online
discussion groups with Web interfaces, and there are mailing lists, but
Usenet is probably more convenient than either of these for most large
discussion communities. This is because the articles get replicated to
your local Usenet server, thus allowing you to read and post articles
without accessing the global Internet, something which is of great value
for those with slow Internet links. Usenet articles also conserve
bandwidth because they do not come and sit in each member's mailbox,
unlike email-based mailing lists. With a mailing list, twenty members
in one office will have twenty copies of each message delivered to
their mailboxes; with a Usenet discussion group and a local Usenet
server, there's just one copy of each article, and it does not fill up
anyone's mailbox.</para>
<para>Another nice feature of having your own local Usenet server is
that articles stay on the server even after you've read them. You can't
accidentally delete a Usenet article the way you can delete a message
from your mailbox. This way, a Usenet server is an
<emphasis>excellent</emphasis> way to archive articles of a group
discussion on a local server without placing the onus of archiving on
any group member. This makes local Usenet servers very valuable as
archives of internal discussion messages within corporate Intranets,
provided the article expiry configuration of the Usenet server software
has been set up for sufficiently long expiry periods.</para>
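<para>For instance, if your local server happens to run INN, this
long-retention policy is expressed in its <literal>expire.ctl</literal>
file. The group names and retention figures below are only a sketch to
be adapted to your own hierarchy:</para>

```
## Sketch of an INN expire.ctl for archival of internal groups.
## Remember seen Message-IDs for 30 days.
/remember/:30
## Default: keep articles in all groups for 30 days.
*:A:1:30:30
## Never expire the internal discussion hierarchy.
local.*:A:never:never:never
```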
</section>
<section> <title>How it works, loosely speaking</title>
<para> Usenet news works by the reader first firing up a Usenet news
program, which in today's GUI world will very likely be something like
Netscape Messenger or Microsoft's Outlook Express. There are a lot of
proven, well-designed character-based Usenet news readers, but a proper
review of the user agent software is outside the scope of this HOWTO, so
we will just assume that you are using whatever software you like. The
reader then selects a Usenet newsgroup from the hundreds or thousands of
newsgroups which are hosted by her local server, and accesses all unread
articles. These articles are displayed to her. She can then decide to
respond to some of them.</para>
<para>When the reader writes an article, either in response to an
existing one or as a start of a brand-new thread of discussion, her
software <emphasis>posts</emphasis> this article to the Usenet server.
The article contains a list of newsgroups into which it is to be posted.
Once it is accepted by the server, it becomes available for other users
to read and respond to. The article is automatically
<emphasis>expired</emphasis> or deleted by the server from its internal
archives based on expiry policies set in its software; the author of the
article usually can do little or nothing to control the expiry of her
articles.</para>
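<para>Concretely, a posted article is just text: a few header lines,
a blank line, then the body, with the <literal>Newsgroups</literal>
header carrying the list of groups it is to be posted into. The short
Python sketch below builds such an article; the author, group names and
helper function are invented for illustration, and real news readers
add further headers (<literal>Message-ID</literal>,
<literal>Date</literal>, and so on).</para>

```python
def make_article(author, groups, subject, body):
    """Build a minimal Usenet article in the classic RFC 1036 shape:
    header lines, a blank line, then the body.  The Newsgroups header
    lists every group the article is to be posted into."""
    headers = [
        "From: " + author,
        "Newsgroups: " + ",".join(groups),
        "Subject: " + subject,
    ]
    return "\r\n".join(headers) + "\r\n\r\n" + body + "\r\n"

print(make_article("reader@example.com",
                   ["comp.os.linux.misc", "comp.os.linux.help"],
                   "Test posting", "Hello, world."))
```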
<para>A Usenet server rarely works on its own. It forms a part of
a collection of servers, which automatically exchange articles with
each other. The flow of articles from one server to another is called a
<emphasis>newsfeed</emphasis>. In a simplistic case, one can imagine a
worldwide network of servers, all configured to replicate articles with
each other, busily passing along copies across the network as soon as one
of them receives a new article posted by a human reader. This replication
is done by powerful and fault-tolerant processes, and gives the Usenet
network its power. Your local Usenet server literally has a copy of all
current articles in all relevant newsgroups.</para>
</section>
<section> <title>About sizes, volumes, and so on </title>
<para>Any would-be Usenet server administrator or creator
<emphasis>must</emphasis> read the <quote>Periodic Posting about the basic steps
involved in configuring a machine to store Usenet news,</quote> also known as
the Site Setup FAQ, available from
<literal>ftp://rtfm.mit.edu/pub/usenet/news.answers/usenet/site-setup</literal>
or
<literal>ftp://ftp.uu.net/usenet/news.answers/news/site-setup.Z</literal>.
It was last updated in 1997, but trends haven't changed much since
then, though absolute volume figures have.</para>
<para>If you want your Usenet server to be a repository for all articles
in all newsgroups, you will probably not be reading this HOWTO, or even
if you do, you will rapidly realise that anyone who needs to read this
HOWTO may not be ready to set up such a server. This is because the
volumes of articles on the Usenet have reached a point where very
specialised networks, very high end servers, and large disk arrays
are required for handling such Usenet volumes. Those setups are called
``carrier-class'' Usenet servers, and will be discussed a bit later on in
this HOWTO. Administering such an array of hardware may not be the job
of the new Usenet administrator, for whom this HOWTO (and most Linux
HOWTOs) are written.</para>
<para>Nevertheless, it may be interesting to understand what volumes we
are talking about. Usenet news article volumes have been doubling every
fourteen months or so, going by what we hear in comments from
carrier class Usenet administrators. In the beginning of 1997, this
volume was 1.2 GBytes of articles a day. Thus, the volumes should have
undergone roughly five doublings, or grown 32 times, by the time we reach
mid-2002, at the time of this writing. This gives us a volume of 38.4
GBytes per day. Assume that this transfer happens using uncompressed
NNTP (the norm), and add 50% extra for the overheads of NNTP, TCP,
and IP. This gives you a raw data transfer volume of 57.6 GBytes/day or
about 460 Gbits/day. If you have to transfer such volumes of data in 24
hours (86400 seconds), you'll need raw bandwidth of about 5.3 Mbits per
second just to <emphasis>receive all these articles</emphasis>. You'll
need more bandwidth to send out feeds to other neighbouring Usenet
servers, and then you'll need bandwidth to allow your readers to access
your servers and read and post articles in retail quantities. Clearly,
these volume figures are outside the network bandwidths of most
corporate organisations or educational institutions, and therefore only
those who are in the business of offering Usenet news can afford
it.</para>
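<para>The arithmetic above can be checked in a few lines of Python; the
starting volume and doubling count are the figures quoted in the
text.</para>

```python
# Back-of-the-envelope check of the feed-volume figures above.
start_gbytes_per_day = 1.2        # volume at the beginning of 1997
doublings = 5                     # ~14-month doubling, 1997 to mid-2002
volume = start_gbytes_per_day * 2 ** doublings   # ~38.4 GBytes/day
with_overheads = volume * 1.5                    # +50% NNTP/TCP/IP overhead
gbits_per_day = with_overheads * 8               # ~460 Gbits/day
mbits_per_second = gbits_per_day * 1000 / 86400  # bandwidth just to receive
print(round(mbits_per_second, 1))                # about 5.3 Mbit/s
```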
<para>At the other end of the scale, it is perfectly feasible for a
small office to subscribe to a well-trimmed subset of Usenet newsgroups,
and exclude most of the high-volume newsgroups. Starcom Software, where
the authors of this HOWTO work, has run a fairly large subset of
600 newsgroups, which is still a tiny fraction of the 15,000+ newsgroups
that the carrier-class services offer. Your office or college may not
even need 600 groups. And our company has excluded specific high-volume
but low-usefulness newsgroups like the <literal>talk</literal>,
<literal>comp.binaries</literal>, and <literal>alt</literal>
hierarchies. With the pruned subset, the total volume of articles per
day may amount to barely a hundred MBytes a day or so, and can be easily
handled by most small offices and educational institutions. And in such
situations, a single Intel Linux server can deliver excellent performance
as a Usenet server.</para>
<para>Then there's the <emphasis>internal</emphasis> Usenet service. By
internal here, we mean a private set of Usenet newsgroups, not a private
computer network. Every company or university which runs a Usenet news
service creates its own hierarchy of internal newsgroups, whose articles
never leave the campus or office, and which therefore do not consume
Internet bandwidth. These newsgroups are often the ones most hotly
accessed, and will carry more <emphasis>internally generated</emphasis>
traffic within your organisation than all the ``public'' newsgroups you
may subscribe to. After all, how often does a guy have something to say
which is relevant to the world at large, unless he's discussing a globally
relevant topic like ``Unix rules!''? If such internal newsgroups are the
focus of your Usenet servers, then you may find that fairly modest
hardware and Internet bandwidth will suffice, depending on the size of
your organisation.</para>
<para>The new Usenet server administrator has to undertake a sizing
exercise to ensure that he does not bite off more than he, or his
network resources, can chew. We hope we have provided sufficient
information for him to get started with the right questions.</para>
</section>
</chapter>