old-www/HOWTO/TransparentProxy-2.html

106 lines
5.8 KiB
HTML

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
<META NAME="GENERATOR" CONTENT="SGML-Tools 1.0.9">
<TITLE>Transparent Proxy with Linux and Squid mini-HOWTO: Overview of Transparent Proxying</TITLE>
<LINK HREF="TransparentProxy-3.html" REL=next>
<LINK HREF="TransparentProxy-1.html" REL=previous>
<LINK HREF="TransparentProxy.html#toc2" REL=contents>
</HEAD>
<BODY>
<A HREF="TransparentProxy-3.html">Next</A>
<A HREF="TransparentProxy-1.html">Previous</A>
<A HREF="TransparentProxy.html#toc2">Contents</A>
<HR>
<H2><A NAME="s2">2. Overview of Transparent Proxying</A></H2>
<H2><A NAME="ss2.1">2.1 Motivation</A>
</H2>
<P>In ``ordinary'' proxying, the client specifies the hostname and port number
of a proxy in his web browsing software. The browser then makes requests to
the proxy, and the proxy forwards them to the origin servers. This is all fine
and good, but sometimes one of several situations arise. Either
<P>
<UL>
<LI>You want to force clients on your network to use the proxy, whether they
want to or not.</LI>
<LI>You want clients to use a proxy, but don't want them to know they're being
proxied.</LI>
<LI>You want clients to be proxied, but don't want to go to all the work of
updating the settings in hundreds or thousands of web browsers.</LI>
</UL>
<P>This is where transparent proxying comes in. A web request can be intercepted
by the proxy, transparently. That is, as far as the client software knows,
it is talking to the origin server itself, when it is really talking to
the proxy server. (Note that the transparency only applies to the client;
the server knows that a proxy is involved, and will see the IP address of
the proxy, not the IP address of the user. Although, squid may pass
an X-Forwarded-For header, so that the server can determine the original
user's IP address if it groks that header).
<P>Cisco routers support transparent proxying. So do many switches. But,
(surprisingly enough)
Linux can act as a router, and can perform transparent proxying by redirecting
TCP connections to local ports. However, we also need to make our web proxy
aware of the affect of the redirection, so that it can make connections to
the proper origin servers. There are two general ways this works:
<P>The first is when your web proxy is not transparent proxy aware. You can
use a nifty little daemon called transproxy that sits in front of your web
proxy and takes care of all the messy details for you. transproxy was written
by John Saunders, and is available from
<P>
<A HREF="ftp://ftp.nlc.net.au/pub/linux/www/">ftp://ftp.nlc.net.au/pub/linux/www/</A>
or your local metalab mirror. transproxy will not be discussed further in this
document.
<P>A cleaner solution is to get a web proxy that is aware of transparent proxying
itself. The one we are going to focus on here is squid. Squid is an Open Source
caching proxy server for Unix systems. It is available from
<A HREF="http://www.squid-cache.org">www.squid-cache.org</A><P>Alternatively, instead of redirecting the connections to local ports, we could redirect the connections to remote ports. This is discussed in the
<A HREF="TransparentProxy-6.html#twoboxes">Transparent Proxy to a Remote Box</A> section. Readers interested in this approach should skip down to that section. Readers interested on doing everything on one box can safely ignore that section.
<P>
<H2><A NAME="ss2.2">2.2 Scope of this document</A>
</H2>
<P>This document will focus on squid version 2.4 and Linux kernel version
2.4, the most current stable releases as of this writing (August 2002). It
should also work with most of the later 2.3 kernels. If you need information
about earlier releases of squid or Linux, you can find some earlier
documents at
<A HREF="http://users.gurulink.com/transproxy/">http://users.gurulink.com/transproxy/</A>. Note that this site has moved from it's previous location.
<P>If you are using a development kernel or a development version of squid, you are on your own. This document may help you, but YMMV.
<P>Note that this document focuses only on HTTP proxing. I get many emails asking
about transparent FTP proxying. Squid can't do it. Now, allegedly a program
called Frox can. I have not tried this myself, so I cannot say how well it
works. You can find it at
<A HREF="http://www.hollo32.fsnet.co.uk/frox/">http://www.hollo32.fsnet.co.uk/frox/</A>.
<P>I only focus on squid here, but Apache can also function as a caching proxy
server. (If you are not sure which to use, I recommend squid, since it was
built from the ground up to be
a caching proxy server, Apache's caching proxy features are more of
afterthought additions to an already existing system.)
If you want use Apache instead of squid: follow all the instructions in this
document that pertain to the kernel and iptables rules. Ignore the squid
specific sections, and instead look at
<A HREF="http://lupo.campus.uniroma2.it/progetti/mod_tproxy/">http://lupo.campus.uniroma2.it/progetti/mod_tproxy/</A> for source code and
instructions for a transparent proxy module for Apache (thanks to Cristiano Paris (c.paris@libero.it) for contributing this).
<H2><A NAME="ss2.3">2.3 HTTPS</A>
</H2>
<P>Finally, as far as transparently proxing HTTPS (e.g. secure web pages using
SSL, TSL, etc.), you can't do it. Don't even ask. For the explanation, do a
search for 'man-in-the-middle attack'. Note that you probably don't
really need to transparently proxy HTTPS anyway, since squid can not
cache secure pages.
<H2><A NAME="ss2.4">2.4 Proxy Authentication</A>
</H2>
<P>You cannot use Proxy Authentication transparently. See the
<A HREF="http://www.squid-cache.org/Doc/FAQ/FAQ.html">Squid FAQ</A>
for (slightly) more details.
<HR>
<A HREF="TransparentProxy-3.html">Next</A>
<A HREF="TransparentProxy-1.html">Previous</A>
<A HREF="TransparentProxy.html#toc2">Contents</A>
</BODY>
</HTML>