old-www/LDP/LG/issue38/tag/17.html

346 lines
14 KiB
HTML

<!--startcut ======================================================= -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head>
<META NAME="generator" CONTENT="lgazmail v1.1I.e">
<TITLE>The Answer Guy 38: Proxying over PPP</TITLE>
</HEAD><BODY BGCOLOR="#FFFFFF" TEXT="#000000"
LINK="#3366FF" VLINK="#A000A0">
<!-- ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: -->
<H4>"The Linux Gazette...<I>making Linux just a little more fun!</I>"</H4>
<P> <hr> <P>
<!-- ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: -->
<center>
<H1><A NAME="answer">
<img src="../../gx/dennis/qbubble.gif" alt="(?)"
border="0" align="middle">
<font color="#B03060">The Answer Guy</font>
<img src="../../gx/dennis/bbubble.gif" alt="(!)"
border="0" align="middle">
</A></H1>
<BR>
<H4>By James T. Dennis,
<a href="mailto:linux-questions-only@ssc.com">linux-questions-only@ssc.com</a><BR>
Starshine Technical Services,
<A HREF="http://www.starshine.org/">http://www.starshine.org/</A>
</H4>
</center>
<p><hr><p>
<!-- endcut ======================================================= -->
<!-- begin 17 -->
<H3 align="left"><img src="../../gx/dennis/qbubble.gif"
height="50" width="60" alt="(?) " border="0"
>Proxying over PPP</H3>
<p><strong>From prashant on Thu, 11 Feb 1999
</strong></p>
<!-- ::
Proxying over PPP
~~~~~~~~~~~~~~~~~
:: -->
<P><STRONG><IMG SRC="../../gx/dennis/qbub.gif" ALT="(?)"
HEIGHT="28" WIDTH="50" BORDER="0"
>
Hi Answerguy,
</STRONG></P>
<P><STRONG>
I am using <A HREF="http://www.redhat.com/">Red Hat</A> Linux.And I want to install a proxy server.
I have a modem can configure ppp over that.
</STRONG></P>
<P><STRONG>
But i want that proxy to do the following functions:
</STRONG></P>
<P><STRONG><ol>
<li>It should optimize my ppp connection.
(<tt>webproxy-1.3</tt> does provides this)
<li>As this webproxy doesn't handle cache.A cache manager
'Squid' must be installed.
<li>Also it doesn't supports many protocols. So
I want a router linked it
</ol></STRONG></P>
<P><STRONG>
I dont know how i am going to do this please help me.
</STRONG></P>
<P><STRONG>
yours
<br>Prashant Deshpande.
</STRONG></P>
<BLOCKQUOTE><IMG SRC="../../gx/dennis/bbub.gif" ALT="(!)"
HEIGHT="28" WIDTH="50" BORDER="0"
>
Your list mixes needs with conclusions. I don't
recommend that when doing "requirements analysis" as
you'll probably end up with some inappropriate constraints.
</BLOCKQUOTE>
<BLOCKQUOTE>
If I understand it correctly you want to "optimize"
your PPP connection in the sense that you want to
minimize the traffic flowing over it, and the latency
between requests and responses.
</BLOCKQUOTE>
<BLOCKQUOTE>
I'm not familiar with a package named "<tt>webproxy-1.3</tt>" --- but
any caching/proxy will tend to lessen the traffic depending
on your usage patterns and the co-operation of the sites
that you access over these protocols. Squid is probably the
most advanced caching proxying available --- and it's
designed to peer with other ICP (Internet Caching Protocol)
servers, (potentially minimizing traffic over other links,
further out on the Internet, beyond your PPP link while also
minimizing latency).
</BLOCKQUOTE>
<BLOCKQUOTE>
I don't understand item three at all. What doesn't support
many protcocols? Squid supports a number of protocols
(all those that are amenable to caching, that I can think
of). Also the conclusion: "So I want a router linked it"
is completely bogus. A <EM>router</EM> does <EM>routing</EM>, a
proxy does proxying and caching. These functions operate
at different (though sometimes blurred) levels in the
OSI reference model.
</BLOCKQUOTE>
<BLOCKQUOTE>
If you use your Linux system as a "gateway" to the Internet
for any systems other than itself (if it has an ethernet
and a PPP link or any other combination of two or more
non-loopback interfaces) than it probably <EM>is</EM> acting as a
router.
</BLOCKQUOTE>
<BLOCKQUOTE>
So, let's step back from the constraints implied by these
extraneous comments and focus on what you want.
</BLOCKQUOTE>
<BLOCKQUOTE>
You could do some protocol analysis on your PPP link to
determine what protocols are consuming which percentages
of the bandwidth; and to determine the average latency
among various protocols. This would help you focus on
which protocols are likely to benefit the most from
caching. It's also possible you might find other ways
to help improve your utilization.
</BLOCKQUOTE>
<BLOCKQUOTE>
Without going into gory details of using '<tt>tcpdump</tt>' and
performing data analysis on that we can suggest that you
start with the basics.
</BLOCKQUOTE>
<BLOCKQUOTE>
Run a caching nameserver on your PPP/router. This should
immediately improve response time and reduce bandwidth
utilization by obviating the need to forward/route DNS
queries across the link. Make sure to configure the
<TT>/etc/resolv.conf</TT> (or its equivalent on your non-Unix
systems) to actually use your caching nameserver. That
includes the <tt>resolv.conf</tt> on the router/gateway itself!
</BLOCKQUOTE>
<BLOCKQUOTE>
Install Squid and configure your web browsers and any
gopher, WAIS, or other supported clients to use it. That
should help with those web sites that don't egregiously
prevent caching. Note that some sites use HTTP headers
(Pragmas) to eliminate or minimize caching of their pages.
This is often done by "advertising" supported sites as part
of their "imprint" accounting and to support their high
traffic claims (to their customers). That is BAD for the
Internet as a whole (since it forces every link between
those sites and all of their clients to carry redundant
traffic). Oh well! There goes the neighborhood!
</BLOCKQUOTE>
<BLOCKQUOTE>
After you've taken these two steps (and provided your
caching proxy/router with LOTS of disk space and memory) you
should monitor the line performance (informally) to see if
that meets your needs. You've probably gained 80-90% of the
potential efficiency gains already --- so additional work
will have diminishing returns.
</BLOCKQUOTE>
<BLOCKQUOTE>
You can install DeleGate for FTP proxying (I don't know how
to make "normal" FTP clients talk to Squid's FTP proxying
--- but they can be configured to use DeleGate as you'd use
any SOCKS proxy, and you can "manually" traverse a DeleGate
FTP or telnet proxy in a way that's conceptually similar to
the old TIS FWTK (though completely different, and much
cleaner, in syntax).
</BLOCKQUOTE>
<BLOCKQUOTE>
That's probably about as far as you can go with simple
proxying. From there you'll have to change the mixture of
protocols you run, and/or optimize the way you work. For
example if you have e-mail flowing over that PPP link you
might reconfigure that to "Hold" (as "expensive") and queue
it for delivery during off peak hours.
</BLOCKQUOTE>
<BLOCKQUOTE>
You might even reconfigure your e-mail and any netnews
traffic (both outgoing and incoming) to go through UUCP.
UUCP allows you to "grade" your traffic, and to schedule the
delivery and receipt. This can include file transfers as
well as mail and news. Naturally you'd have to arrange for
some ISP to provide your UUCP batching for you. There are
still some ISPs that specialize in this, and there are still
some co-operative arrangements available in some localities.
</BLOCKQUOTE>
<BLOCKQUOTE>
These techniques have a very steep learning curve. No one
has been providing WYSI new front ends to make the
configuration of UUCP links as easy as common PPP scenarios
are today. Also there are very few ISPs with the expertise
and interest to provide these services. In addition the
entire discussion is moot if you aren't carrying netnews,
email, or file-transfer traffic over your link (if you don't
read netnews, you've arranged ISP POP accounts on the other
side of your link and your file transfers can't be scheduled
and automated with UUCP).
</BLOCKQUOTE>
<BLOCKQUOTE>
Another option is to look at your work and access patterns.
If you know that you're going to want to read "Linux Weekly
News" every Thursday morning when you come in, create a <tt>cron</tt>
job to '<tt>wget</tt>' or do a '<tt>lynx -traversal</tt>' of
<A HREF="http://www.lwn.net"
>http://www.lwn.net</A> every Thursday morning at 3:00am (before
you come in, but still in the "dead of the night). The LWN
crew seems to consistently have that up by about midnight
(U.S. Mountain time). You could have similar daily jobs for
your "Dilbert" fix (<A HREF="http://www.unitedmedia.com/dilbert"
>http://www.unitedmedia.com/dilbert</A>)
etc.
</BLOCKQUOTE>
<BLOCKQUOTE>
There are some tricks you can do to minimize the amount of
your bandwidth you devote to downloading advertising and
graphics. One method is to use Lynx (which doesn't download
<EM>any</EM> graphics by default, and therefore filters out most
banner ads). Another is to create your own "localhost"
aliases for some sites like "<tt>click.net</tt>" --- sites which are
used exclusively to serve banner ads that are embedded in
the HTML of the sites you visit. Of course, the
advertisers, web site maintainers (like Yahoo!) and
click.net itself might complain that you are "depriving"
them of revenue by viewing these advertiser supported pages
while filtering out the advertsing.
</BLOCKQUOTE>
<BLOCKQUOTE>
If a statistically significant number of users employ these
strategies then we'll see a resulting "arms race" to force
the advertisments down your throat. They'll increasingly
"mix" the advertising and content as inextricably as
possible --- meaning that text browsers and search engines
will become useless.
</BLOCKQUOTE>
<BLOCKQUOTE>
It's a pity that more of us don't consider the implications
of advertiser supported media on our lives. Your broadcast
news, TV, radio, newspapers and other periodical
publications are all completely funded by advertising and
therefore fundamentally suspect in regards to content and
focus. Its not a "conspiracy" theory --- merely and
economic fact. You get what was paid for. Since you didn't
"pay for" the content that you're receiving through
traditional media (and increasingly for Internet "content")
--- you have little or no say in what's provided over them.
</BLOCKQUOTE>
<BLOCKQUOTE>
You have obscure indirect effects by your selection of
products and services and somewhat more by complaint (to
government and regulatory bodies and to sponsors). It's all
very "negative" (in a philosophical sense). It's a pity we
haven't come up with a better way to do things --- though
the Internet's netnews, mailing lists, and the personally
and "activist" run and maintained web sites continue to be a
"ray of hope."
</BLOCKQUOTE>
<BLOCKQUOTE>
In any event: That's about all there is to caching
and proxying for small sites over PPP and other
low-bandwidth links. Larger internetwork sites might
benefit from more elaborate ICP arrangments (peering
among departmental Squid servers and creating a whole
caching hierarchy).
</BLOCKQUOTE>
<BLOCKQUOTE>
Remember that this is not a magic bullet.
</BLOCKQUOTE>
<BLOCKQUOTE>
It's possible that your usage patterns actually won't
benefit from caching or proxying. If everyone on your
network is always visiting <EM>different</EM> sites, and they only
visit sites that change frequently --- then the cache will
be a waste of your systems memory and disk space.
</BLOCKQUOTE>
<BLOCKQUOTE>
Best of luck!
</BLOCKQUOTE>
<!-- sig -->
<!-- end 17 -->
<!--startcut ======================================================= -->
<P> <hr> <P>
<H5 align="center"><a href="http://www.linuxgazette.com/copying.html"
>Copyright &copy;</a> 1999, James T. Dennis
<BR>Published in <I>The Linux Gazette</I> Issue 38 March 1999</H5>
<P> <hr> <P>
<!-- begin tagnav ::::::::::::::::::::::::::::::::::::::::::::::::::-->
<TABLE WIDTH="96%"><TR VALIGN="center" ALIGN="center">
<TD ROWSPAN="3" COLSPAN="4" WIDTH="24%"><A HREF="../lg_answer38.html"
><IMG SRC="../../gx/dennis/answernew.gif"
ALT="[ Answer Guy Index ]"></A></td>
<TD WIDTH="6%"><A HREF="1.html">1</A></TD>
<TD WIDTH="6%"><A HREF="2.html">2</A></TD>
<TD WIDTH="6%"><A HREF="3.html">3</A></TD>
<TD WIDTH="6%"><A HREF="4.html">4</A></TD>
<TD WIDTH="6%"><A HREF="5.html">5</A></TD>
<TD WIDTH="6%"><A HREF="6.html">6</A></TD>
<TD WIDTH="6%"><A HREF="7.html">7</A></TD>
<TD WIDTH="6%"><A HREF="8.html">8</A></TD>
<TD WIDTH="6%"><A HREF="9.html">9</A></TD>
<TD WIDTH="6%"><A HREF="10.html">10</A></TD>
<TD WIDTH="6%"><A HREF="11.html">11</A></TD>
</TR><TR VALIGN="center" ALIGN="center">
<TD><A HREF="12.html">12</A></TD>
<TD>&nbsp;</TD>
<TD><A HREF="14.html">14</A></TD>
<TD>&nbsp;</TD>
<TD><A HREF="16.html">16</A></TD>
<TD><A HREF="17.html">17</A></TD>
<TD><A HREF="18.html">18</A></TD>
<TD><A HREF="19.html">19</A></TD>
<TD>&nbsp;</TD>
<TD><A HREF="21.html">21</A></TD>
<TD><A HREF="22.html">22</A></TD>
</TR><TR VALIGN="center" ALIGN="center">
<TD><A HREF="23.html">23</A></TD>
<TD><A HREF="24.html">24</A></TD>
<TD>&nbsp;</TD>
<TD><A HREF="26.html">26</A></TD>
<TD>&nbsp;</TD>
<TD><A HREF="28.html">28</A></TD>
<TD><A HREF="29.html">29</A></TD>
<TD><A HREF="30.html">30</A></TD>
<TD><A HREF="31.html">31</A></TD>
<TD><A HREF="32.html">32</A></TD>
<TD>&nbsp;</TD>
</TR></TABLE>
<!-- end tagnav ::::::::::::::::::::::::::::::::::::::::::::::::::::-->
<P> <hr> <P>
<!-- begin lgnav ::::::::::::::::::::::::::::::::::::::::::::::::::: -->
<A HREF="../index.html"
><IMG SRC="../../gx/indexnew.gif" ALT="[ Table Of Contents ]"></A>
<A HREF="../../index.html"
><IMG SRC="../../gx/homenew.gif" ALT="[ Front Page ]"></A>
<A HREF="../lg_bytes38.html"
><IMG SRC="../../gx/back2.gif" ALT="[ Previous Section ]"></A>
<A HREF="../lg_tips38.html"
><IMG SRC="../../gx/fwd.gif" ALT="[ Next Section ]"></A>
<!-- end lgnav ::::::::::::::::::::::::::::::::::::::::::::::::::::: -->
<!-- ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: -->
</BODY></HTML>
<!--endcut ========================================================= -->