<HTML
><HEAD
><TITLE
>How Does Usenet Handle News?</TITLE
><META
NAME="GENERATOR"
CONTENT="Modular DocBook HTML Stylesheet Version 1.57"><LINK
REL="HOME"
TITLE="Linux Network Administrators Guide"
HREF="index.html"><LINK
REL="UP"
TITLE="Netnews"
HREF="x-087-2-news.html"><LINK
REL="PREVIOUS"
TITLE="What Is Usenet, Anyway?"
HREF="x-087-2-news.usenet.html"><LINK
REL="NEXT"
TITLE="C News"
HREF="x-087-2-cnews.html"></HEAD
><BODY
CLASS="SECT1"
BGCOLOR="#FFFFFF"
TEXT="#000000"
LINK="#0000FF"
VLINK="#840084"
ALINK="#0000FF"
><DIV
CLASS="NAVHEADER"
><TABLE
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TH
COLSPAN="3"
ALIGN="center"
>Linux Network Administrators Guide</TH
></TR
><TR
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="bottom"
><A
HREF="x-087-2-news.usenet.html"
>Prev</A
></TD
><TD
WIDTH="80%"
ALIGN="center"
VALIGN="bottom"
>Chapter 20. Netnews</TD
><TD
WIDTH="10%"
ALIGN="right"
VALIGN="bottom"
><A
HREF="x-087-2-cnews.html"
>Next</A
></TD
></TR
></TABLE
><HR
ALIGN="LEFT"
WIDTH="100%"></DIV
><DIV
CLASS="SECT1"
><H1
CLASS="SECT1"
><A
NAME="X-087-2-NEWS.ALGORITHM"
>20.3. How Does Usenet Handle News?</A
></H1
><P
>Today, Usenet has grown to enormous proportions. Sites that carry the
whole of Netnews usually transfer something like a paltry 60 MB a day.<A
NAME="X-087-2-FNUN02"
HREF="#FTN.X-087-2-FNUN02"
>[1]</A
> Of course, this requires much
more than pushing files around. So let's take a look at the way most Unix
systems handle Usenet news.</P
><P
>News begins when users create and post articles. Each user enters a
message into a special application called a newsreader, which
formats it appropriately for transmission to the local
news server. In Unix environments, the newsreader commonly uses the
<B
CLASS="COMMAND"
>inews</B
> command to transmit articles to the
news server over TCP/IP. It is also possible to
write the article directly into a file in a special directory called
the news spool. Once the posting is delivered to the local
news server, the server takes responsibility for delivering the article to
other news users.</P
><P
>News is distributed through the net by various transports. The medium used to be UUCP, but today the main traffic is carried by Internet
sites. The routing algorithm used is called <I
CLASS="EMPHASIS"
>flooding</I
>.
Each site maintains a number of links (<I
CLASS="EMPHASIS"
>news feeds</I
>) to
other sites. Any article generated or received by the local news system is
forwarded to each of them, unless the article has already passed through that site, in which case it
is not sent there again. A site can find out which other sites the article has
already traversed by looking at the <TT
CLASS="LITERAL"
>Path:</TT
> header field. This
header lists all systems through which the article has been
forwarded, in bang path notation (site names separated by exclamation marks).</P
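><P
>The flooding decision described above can be sketched in a few lines of
Python. This is only an illustration, not code taken from any news system:
the function names and the
<TT
CLASS="LITERAL"
>.example</TT
> site names are invented, and a real server applies further checks before
forwarding an article.</P
><PRE
CLASS="PROGRAMLISTING"
>
```python
def sites_in_path(path_header):
    """Split a bang path such as 'b.example!a.example' into site names."""
    return [site for site in path_header.split("!") if site]


def feeds_to_forward(path_header, feeds):
    """Return the feeds that do not yet appear in the article's Path: header."""
    seen = set(sites_in_path(path_header))
    return [feed for feed in feeds if feed not in seen]


def prepend_self(path_header, local_site):
    """Put the local site at the front of the path before forwarding."""
    return local_site + "!" + path_header


# Hypothetical article that has already visited b.example and a.example.
path = "b.example!a.example!origin.example"
targets = feeds_to_forward(path, ["a.example", "c.example", "d.example"])
new_path = prepend_self(path, "local.example")
print(targets)
print(new_path)
```
</PRE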
><P
>To distinguish articles and recognize duplicates, Usenet articles have
to carry a message ID (specified in the <TT
CLASS="LITERAL"
>Message-Id:</TT
> header
field), which combines the posting site's name and a serial number into
&lt;<TT
CLASS="REPLACEABLE"
><I
>serial</I
></TT
>@<TT
CLASS="REPLACEABLE"
><I
>site</I
></TT
>&gt;.
For each article processed, the news system logs this ID into a
<I
CLASS="EMPHASIS"
>history</I
> file, against which all newly arrived articles
are checked.</P
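><P
>As a rough sketch of the duplicate check, the history file can be modeled as
a set of message IDs held in memory. The class and function names below are
hypothetical, real news systems keep the history on disk, and the enclosing
angle brackets of the message ID are left out of the strings to keep the
listing short.</P
><PRE
CLASS="PROGRAMLISTING"
>
```python
class History:
    """In-memory stand-in for the history file: a set of message IDs."""

    def __init__(self):
        self._seen = set()

    def accept(self, message_id):
        """Log the ID and return True if it is new, False for a duplicate."""
        if message_id in self._seen:
            return False
        self._seen.add(message_id)
        return True


def make_message_id(serial, site):
    """Combine a serial number and a site name into serial@site."""
    return "{}@{}".format(serial, site)


history = History()
first = make_message_id(1042, "news.example")
print(history.accept(first))   # first arrival
print(history.accept(first))   # same ID arriving again over another feed
```
</PRE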
><P
>The flow between any two sites may be limited by two criteria. First,
an article is assigned a distribution (in the <TT
CLASS="LITERAL"
>Distribution:</TT
>
header field), which may be used to confine it to a certain group of
sites. Second, the newsgroups exchanged may be limited by
both the sending and receiving systems. The set of newsgroups and
distributions allowed to be transmitted to a site is usually kept in the
<TT
CLASS="FILENAME"
>sys</TT
> file.</P
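><P
>The two criteria can be pictured as a small filter applied per feed. The
sketch below does not parse the real
<TT
CLASS="FILENAME"
>sys</TT
> file syntax; it assumes a simplified, hypothetical feed description (a
dictionary with shell-style newsgroup patterns and a set of distributions)
and assumes that an article without a distribution counts as
<TT
CLASS="LITERAL"
>world</TT
>.</P
><PRE
CLASS="PROGRAMLISTING"
>
```python
from fnmatch import fnmatchcase


def may_send(article, feed):
    """Return True if the article passes both the newsgroup and the
    distribution limits of the given (hypothetical) feed description."""
    groups_ok = any(
        fnmatchcase(group, pattern)
        for group in article["newsgroups"]
        for pattern in feed["groups"]
    )
    # Assumption: a missing Distribution: header is treated as "world".
    distribution = article.get("distribution", "world")
    return groups_ok and distribution in feed["distributions"]


feed = {"groups": ["comp.*"], "distributions": {"world", "de"}}
article = {"newsgroups": ["comp.os.linux.misc"], "distribution": "world"}
print(may_send(article, feed))
```
</PRE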
><P
>The sheer number of articles usually requires that improvements be made
to the above scheme. On UUCP networks, systems collect articles over a period
of time and combine them into a single file, which is compressed and sent to
the remote site. This is called
<I
CLASS="EMPHASIS"
>batching</I
>.</P
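><P
>A minimal sketch of batching follows. The separator line is modeled on the
<TT
CLASS="LITERAL"
>#! rnews</TT
> convention, but treat the exact batch layout here as an assumption, and note
that
<TT
CLASS="LITERAL"
>gzip</TT
> merely stands in for whatever compressor a site actually uses. The function
names are invented for this example.</P
><PRE
CLASS="PROGRAMLISTING"
>
```python
import gzip


def make_batch(articles):
    """Join articles (bytes) into one file and compress it."""
    parts = []
    for body in articles:
        # Each article is preceded by a line giving its length in bytes.
        parts.append(b"#! rnews " + str(len(body)).encode("ascii") + b"\n")
        parts.append(body)
    return gzip.compress(b"".join(parts))


def split_batch(batch):
    """Inverse of make_batch: recover the list of articles."""
    data = gzip.decompress(batch)
    articles = []
    pos = 0
    while pos != len(data):
        end_of_header = data.index(b"\n", pos)
        size = int(data[pos:end_of_header].split()[-1])
        start = end_of_header + 1
        articles.append(data[start:start + size])
        pos = start + size
    return articles


queued = [b"Subject: one\n\nhello\n", b"Subject: two\n\nworld\n"]
batch = make_batch(queued)
print(split_batch(batch) == queued)
```
</PRE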
><P
>An alternative technique is the <I
CLASS="EMPHASIS"
>ihave/sendme</I
> protocol, which
prevents duplicate articles from being transferred,
thus saving network bandwidth. Instead of putting all articles in batch
files and sending them along, only the message IDs of articles are
combined into a giant “ihave” message and sent to the remote
site. The remote site reads this message, compares it to its history file,
and returns the list of articles it wants in a “sendme” message.
Only the requested articles are sent.</P
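><P
>The exchange can be sketched as three small steps. This is a simulation
under simplifying assumptions, not the wire format: messages are plain Python
lists, the history is a set, the spool is a dictionary keyed by message ID,
and all names are hypothetical.</P
><PRE
CLASS="PROGRAMLISTING"
>
```python
def build_ihave(spool):
    """Sending site: offer the message IDs of all queued articles."""
    return sorted(spool)


def build_sendme(ihave_ids, history):
    """Receiving site: ask only for IDs missing from its history."""
    return [mid for mid in ihave_ids if mid not in history]


def articles_to_send(sendme_ids, spool):
    """Sending site: transfer only the requested articles."""
    return [spool[mid] for mid in sendme_ids if mid in spool]


spool = {"1@a.example": "first article", "2@a.example": "second article"}
remote_history = {"1@a.example"}

ihave = build_ihave(spool)
sendme = build_sendme(ihave, remote_history)
print(sendme)
print(articles_to_send(sendme, spool))
```
</PRE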
><P
>Of course, ihave/sendme makes sense only if it involves two big sites
that receive news from several independent feeds each, and that poll each
other often enough for an efficient flow of news.</P
><P
>Sites that are on the Internet generally rely on TCP/IP-based software that
uses the Network News Transfer Protocol (NNTP). NNTP is described in RFC-977;
it is responsible for the transfer of news between news servers and provides
Usenet access to single users on remote hosts.</P
><P
>NNTP offers three different ways to transfer news. One is a real-time version
of ihave/sendme, also referred to as <I
CLASS="EMPHASIS"
>pushing</I
> news. The
second technique is called <I
CLASS="EMPHASIS"
>pulling</I
> news, in which the
client requests a list of articles in a given newsgroup or hierarchy that have
arrived at the server's site after a specified date, and chooses those it
cannot find in its history file. The third technique is for interactive
newsreading and allows you or your newsreader to retrieve articles from
specified newsgroups, as well as post articles with incomplete header
information.</P
><P
>At each site, news is kept in a directory hierarchy below
<TT
CLASS="FILENAME"
>/var/spool/news</TT
>, each article in a separate file, and
each newsgroup in a separate directory. The directory name is made up of the
newsgroup name, with each dot-separated component becoming a path component. Thus,
<SPAN
CLASS="SYSTEMITEM"
>comp.os.linux.misc</SPAN
> articles are kept
in <TT
CLASS="FILENAME"
>/var/spool/news/comp/os/linux/misc</TT
>. The articles in a
newsgroup are assigned numbers in the order they arrive. This number serves as
the file's name. The range of numbers of articles currently online is kept
in a file called <TT
CLASS="FILENAME"
>active</TT
>, which at the same time serves
as a list of newsgroups your site knows.</P
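><P
>The mapping from a newsgroup name to its spool directory is simple enough to
show directly. The helper names below are made up for this illustration, and
the article number is an arbitrary example.</P
><PRE
CLASS="PROGRAMLISTING"
>
```python
import posixpath

SPOOL = "/var/spool/news"


def group_directory(newsgroup, spool=SPOOL):
    """Turn each dot-separated component into a path component."""
    return posixpath.join(spool, *newsgroup.split("."))


def article_file(newsgroup, number, spool=SPOOL):
    """The article number within the group serves as the file name."""
    return posixpath.join(group_directory(newsgroup, spool), str(number))


print(group_directory("comp.os.linux.misc"))
print(article_file("comp.os.linux.misc", 1042))
```
</PRE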
><P
>Since disk space is a finite resource, you have to start throwing away
articles after some time.<A
NAME="X-087-2-FNUN03"
HREF="#FTN.X-087-2-FNUN03"
>[2]</A
> This is called
<I
CLASS="EMPHASIS"
>expiring</I
>. Usually, articles from certain groups and
hierarchies are expired a fixed number of days after they arrive. This
may be overridden by the poster by specifying a date of expiration in the
<TT
CLASS="LITERAL"
>Expires:</TT
> field of the article header.</P
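><P
>The expiry rule can be sketched as follows. This is an illustration under
stated assumptions: the function names are invented, the
<TT
CLASS="LITERAL"
>Expires:</TT
> value is taken to be already parsed into a
<TT
CLASS="LITERAL"
>datetime</TT
>, and the retention period of seven days is an arbitrary example.</P
><PRE
CLASS="PROGRAMLISTING"
>
```python
from datetime import datetime, timedelta
from operator import ge  # ge(a, b) means "a is at or after b"


def expiry_time(arrival, keep_days, expires_header=None):
    """Use the poster's Expires: date if present, else arrival + keep_days."""
    if expires_header is not None:
        return expires_header
    return arrival + timedelta(days=keep_days)


def is_expired(now, arrival, keep_days, expires_header=None):
    """Return True once the article's expiry time has been reached."""
    return ge(now, expiry_time(arrival, keep_days, expires_header))


arrival = datetime(2000, 1, 1)
now = datetime(2000, 1, 10)
print(is_expired(now, arrival, 7))
print(is_expired(now, arrival, 7, expires_header=datetime(2000, 2, 1)))
```
</PRE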
><P
>You now have enough information to choose what to read next. UUCP users
should read about C News in <A
HREF="x-087-2-cnews.html"
>Chapter 21</A
>. If you're using
a TCP/IP network, read about NNTP in <A
HREF="x-087-2-nntp.html"
>Chapter 22</A
>. If you
need to transfer moderate amounts of news over TCP/IP, the server described
in that chapter may be enough for you. To install a heavy-duty news server
that can handle huge volumes of material, go on to read about InterNet News
in <A
HREF="x-087-2-inn.html"
>Chapter 23</A
>.</P
></DIV
><H3
CLASS="FOOTNOTES"
>Notes</H3
><TABLE
BORDER="0"
CLASS="FOOTNOTES"
WIDTH="100%"
><TR
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="5%"
><A
NAME="FTN.X-087-2-FNUN02"
HREF="x-087-2-news.algorithm.html#X-087-2-FNUN02"
>[1]</A
></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="95%"
><P
>Wait a minute: 60
Megs at 9,600 bps, that's 60 times 1,024 times 1,024 bytes of 8 bits each, that
is… mutter, mutter… Hey! That's more than 14 hours!</P
></TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="5%"
><A
NAME="FTN.X-087-2-FNUN03"
HREF="x-087-2-news.algorithm.html#X-087-2-FNUN03"
>[2]</A
></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="95%"
><P
>Some people claim that Usenet is a conspiracy by modem and hard disk vendors.</P
></TD
></TR
></TABLE
><DIV
CLASS="NAVFOOTER"
><HR
ALIGN="LEFT"
WIDTH="100%"><TABLE
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
><A
HREF="x-087-2-news.usenet.html"
>Prev</A
></TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="index.html"
>Home</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
><A
HREF="x-087-2-cnews.html"
>Next</A
></TD
></TR
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
>What Is Usenet, Anyway?</TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="x-087-2-news.html"
>Up</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
>C News</TD
></TR
></TABLE
></DIV
></BODY
></HTML
>