563 lines
23 KiB
Plaintext
563 lines
23 KiB
Plaintext
Transparent Proxy with Linux and Squid mini-HOWTO
|
||
Daniel Kiracofe
|
||
v1.15, August 2002
|
||
|
||
This document provides information on how to setup a transparent
|
||
caching HTTP proxy server using only Linux and squid.
|
||
______________________________________________________________________
|
||
|
||
Table of Contents
|
||
|
||
|
||
1. Introduction
|
||
|
||
1.1 Comments
|
||
1.2 Copyrights and Trademarks
|
||
1.3 #include <disclaimer.h>
|
||
|
||
2. Overview of Transparent Proxying
|
||
|
||
2.1 Motivation
|
||
2.2 Scope of this document
|
||
2.3 HTTPS
|
||
2.4 Proxy Authentication
|
||
|
||
3. Configuring the Kernel
|
||
|
||
4. Setting up squid
|
||
|
||
5. Setting up iptables (Netfilter)
|
||
|
||
6. Transparent Proxy to a Remote Box
|
||
|
||
6.1 First method (simpler, but does not work for some esoteric cases)
|
||
6.2 Second method (more complicated, but more general)
|
||
6.3 Method One: What if iptables-box is on a dynamic IP?
|
||
|
||
7. Transparent Proxy With Bridging
|
||
|
||
8. Put it all together
|
||
|
||
9. Troubleshooting
|
||
|
||
10. Further Resources
|
||
|
||
|
||
|
||
______________________________________________________________________
|
||
|
||
1. Introduction
|
||
|
||
1.1. Comments
|
||
|
||
Comments and general feedback on this mini HOWTO are welcome and can
|
||
be directed to its author, Daniel Kiracofe, at drk@unxsoft.com.
|
||
|
||
1.2. Copyrights and Trademarks
|
||
|
||
Copyright 2000-2002 by Daniel Kiracofe
|
||
|
||
This manual may be reproduced in whole or in part, without fee,
|
||
subject to the following restrictions:
|
||
|
||
|
||
· The copyright notice above and this permission notice must be
|
||
preserved complete on all complete or partial copies
|
||
|
||
· Translation to another language is permitted, provided that the
|
||
author is notified prior to the translation.
|
||
|
||
· Any derived work must be approved by the author in writing before
|
||
distribution.
|
||
|
||
· If you distribute this work in part, instructions for obtaining the
|
||
complete version of this manual must be included, and a means for
|
||
obtaining a complete version provided.
|
||
|
||
· Small portions may be reproduced as illustrations for reviews or
|
||
quotes in other works without this permission notice if proper
|
||
citation is given.
|
||
|
||
Exceptions to these rules may be granted for academic purposes: Write
|
||
to the author and ask. These restrictions are here to protect us as
|
||
authors, not to restrict you as learners and educators. Any source
|
||
code (aside from the SGML this document was written in) in this
|
||
document is placed under the GNU General Public License, available via
|
||
anonymous FTP from the GNU archive.
|
||
|
||
1.3. #include <disclaimer.h>
|
||
|
||
No warranty, expressed or implied, etc, etc, etc...
|
||
|
||
2. Overview of Transparent Proxying
|
||
|
||
2.1. Motivation
|
||
|
||
In ``ordinary'' proxying, the client specifies the hostname and port
|
||
number of a proxy in his web browsing software. The browser then makes
|
||
requests to the proxy, and the proxy forwards them to the origin
|
||
servers. This is all fine and good, but sometimes one of several
|
||
situations arise. Either
|
||
|
||
|
||
· You want to force clients on your network to use the proxy, whether
|
||
they want to or not.
|
||
|
||
· You want clients to use a proxy, but don't want them to know
|
||
they're being proxied.
|
||
|
||
· You want clients to be proxied, but don't want to go to all the
|
||
work of updating the settings in hundreds or thousands of web
|
||
browsers.
|
||
|
||
This is where transparent proxying comes in. A web request can be
|
||
intercepted by the proxy, transparently. That is, as far as the client
|
||
software knows, it is talking to the origin server itself, when it is
|
||
really talking to the proxy server. (Note that the transparency only
|
||
applies to the client; the server knows that a proxy is involved, and
|
||
will see the IP address of the proxy, not the IP address of the user.
|
||
Although, squid may pass an X-Forwarded-For header, so that the server
|
||
can determine the original user's IP address if it groks that header).
|
||
|
||
Cisco routers support transparent proxying. So do many switches. But,
|
||
(surprisingly enough) Linux can act as a router, and can perform
|
||
transparent proxying by redirecting TCP connections to local ports.
|
||
However, we also need to make our web proxy aware of the affect of the
|
||
redirection, so that it can make connections to the proper origin
|
||
servers. There are two general ways this works:
|
||
|
||
The first is when your web proxy is not transparent proxy aware. You
|
||
can use a nifty little daemon called transproxy that sits in front of
|
||
your web proxy and takes care of all the messy details for you.
|
||
transproxy was written by John Saunders, and is available from
|
||
ftp://ftp.nlc.net.au/pub/linux/www/ or your local metalab mirror.
|
||
transproxy will not be discussed further in this document.
|
||
|
||
A cleaner solution is to get a web proxy that is aware of transparent
|
||
proxying itself. The one we are going to focus on here is squid. Squid
|
||
is an Open Source caching proxy server for Unix systems. It is
|
||
available from www.squid-cache.org
|
||
|
||
Alternatively, instead of redirecting the connections to local ports,
|
||
we could redirect the connections to remote ports. This is discussed
|
||
in the ``Transparent Proxy to a Remote Box'' section. Readers
|
||
interested in this approach should skip down to that section. Readers
|
||
interested on doing everything on one box can safely ignore that
|
||
section.
|
||
|
||
|
||
2.2. Scope of this document
|
||
|
||
This document will focus on squid version 2.4 and Linux kernel version
|
||
2.4, the most current stable releases as of this writing (August
|
||
2002). It should also work with most of the later 2.3 kernels. If you
|
||
need information about earlier releases of squid or Linux, you can
|
||
find some earlier documents at http://users.gurulink.com/transproxy/.
|
||
Note that this site has moved from it's previous location.
|
||
|
||
If you are using a development kernel or a development version of
|
||
squid, you are on your own. This document may help you, but YMMV.
|
||
|
||
Note that this document focuses only on HTTP proxing. I get many
|
||
emails asking about transparent FTP proxying. Squid can't do it.
|
||
Now, allegedly a program called Frox can. I have not tried this
|
||
myself, so I cannot say how well it works. You can find it at
|
||
http://www.hollo32.fsnet.co.uk/frox/.
|
||
|
||
I only focus on squid here, but Apache can also function as a caching
|
||
proxy server. (If you are not sure which to use, I recommend squid,
|
||
since it was built from the ground up to be a caching proxy server,
|
||
Apache's caching proxy features are more of afterthought additions to
|
||
an already existing system.) If you want use Apache instead of squid:
|
||
follow all the instructions in this document that pertain to the
|
||
kernel and iptables rules. Ignore the squid specific sections, and
|
||
instead look at http://lupo.campus.uniroma2.it/progetti/mod_tproxy/
|
||
for source code and instructions for a transparent proxy module for
|
||
Apache (thanks to Cristiano Paris (c.paris@libero.it) for contributing
|
||
this).
|
||
|
||
2.3. HTTPS
|
||
|
||
Finally, as far as transparently proxing HTTPS (e.g. secure web pages
|
||
using SSL, TSL, etc.), you can't do it. Don't even ask. For the
|
||
explanation, do a search for 'man-in-the-middle attack'. Note that
|
||
you probably don't really need to transparently proxy HTTPS anyway,
|
||
since squid can not cache secure pages.
|
||
|
||
2.4. Proxy Authentication
|
||
|
||
You cannot use Proxy Authentication transparently. See the Squid FAQ
|
||
for (slightly) more details.
|
||
|
||
3. Configuring the Kernel
|
||
|
||
First, we need to make sure all the proper options are set in your
|
||
kernel. If you are using a stock kernel from your distribution,
|
||
transparent proxying may or may not be enabled. If you are unsure,
|
||
the best way to tell is to simply skip this section, and if the
|
||
commands in the next section give you weird errors, it's probably
|
||
because the kernel wasn't configured properly.
|
||
|
||
If your kernel is not configured for transparent proxying, you will
|
||
need to recompile. Recompiling a kernel is a complex process (at least
|
||
at first), and it is beyond the scope of this document. If you need
|
||
help compiling a kernel, please see The Kernel HOWTO
|
||
|
||
The options you need to set in your configuration are as follows
|
||
(Note: if you prefer modules, some (but not all) of these can be built
|
||
as modules. Luckily, everything that is not modularizable is probably
|
||
got in your kernel anyway.)
|
||
|
||
|
||
· Under General Setup
|
||
|
||
· Networking support
|
||
|
||
· Sysctl support
|
||
|
||
· Under Networking Options
|
||
|
||
· Network packet filtering
|
||
|
||
· TCP/IP networking
|
||
|
||
· Under Networking Options -> IP: Netfilter Configuration
|
||
|
||
· Connection tracking
|
||
|
||
· IP tables support
|
||
|
||
· Full NAT
|
||
|
||
· REDIRECT target support
|
||
|
||
· Under File Systems
|
||
|
||
· /proc filesystem support
|
||
|
||
You must say NO to ``Fast switching'' under Networking Options.
|
||
|
||
Once you have your new kernel up and running, you may need to enable
|
||
IP forwarding. IP forwarding allows your computer to act as a router.
|
||
Since this is not what the average user wants to do, it is off by
|
||
default and must be explicitly enabled at run-time. However, your
|
||
distribution might do this for you already. To check, do ``cat
|
||
/proc/sys/net/ipv4/ip_forward''. If you see ``1'' you're good.
|
||
Otherwise, do ``echo '1' > /proc/sys/net/ipv4/ip_forward''. You will
|
||
then want to add that command to your appropriate bootup scripts
|
||
(depending on your distribution, these may live in /etc/rc.d,
|
||
/etc/init.d, or maybe somewhere else entirely).
|
||
|
||
4. Setting up squid
|
||
|
||
Now, we need to get squid up and running. Download the latest source
|
||
tarball from www.squid-cache.org. Make sure you get a STABLE version,
|
||
not a DEVEL version. The latest as of this writing was
|
||
squid-2.4.STABLE4.tar.gz. Note that AFAIK, you must have squid-2.4
|
||
for linux kernel 2.4. The reason is that the mechanism by which the
|
||
process determines the original destination address has changed from
|
||
linux 2.2, and only squid-2.4 has this new code in it. (For those of
|
||
you who are interested, previously the getsockname() call was hacked
|
||
to provide the original destination address, but now the call is
|
||
getsockopt() with a level of SOL_IP and an option of SO_ORIGINAL_DST).
|
||
|
||
|
||
Now, untar and gunzip the archive (use ``tar -xzf <filename>''). Run
|
||
the autoconfiguration script and tell it to include netfilter code
|
||
(``./configure --enable-linux-netfilter''), compile (``make'') and
|
||
then install (``make install'').
|
||
|
||
Now, we need to edit the default squid.conf file (installed to
|
||
/usr/local/squid/etc/squid.conf, unless you changed the defaults). The
|
||
squid.conf file is heavily commented. In fact, some of the best
|
||
documentation available for squid is in the squid.conf file. After you
|
||
get it all up and running, you should go back and reread the whole
|
||
thing. But for now, let's just get the minimum required. Find the
|
||
following directives, uncomment them, and change them to the
|
||
appropriate values:
|
||
|
||
|
||
· httpd_accel_host virtual
|
||
|
||
· httpd_accel_port 80
|
||
|
||
· httpd_accel_with_proxy on
|
||
|
||
· httpd_accel_uses_host_header on
|
||
|
||
Next, look at the cache_effective_user and cache_effective_group
|
||
directives. Unless the default nobody/nogroup has been created on
|
||
your system (AFAIK, it is not created out of the box on many popular
|
||
distributions, including RH7.1), you'll either need to create those,
|
||
or create another username/group for squid to run under. I strongly
|
||
recommend that you create a username/group of squid/squid and run
|
||
under that, but you could use any existing user/group if you want.
|
||
|
||
Finally, look at the http_access directive. The default is usually
|
||
``http_access deny all''. This will prevent anyone from accessing
|
||
squid. For now, you can change this to ``http_access allow all'', but
|
||
once it is working, you will probably want to read the directions on
|
||
ACLs (Access Control Lists), and setup the cache such that only people
|
||
on your local network (or whatever) can access the cache. This may
|
||
seem silly, but you should put some kind of restrictions on access to
|
||
your cache. People behind filtering firewalls (such as porn filters,
|
||
or filters in nations where speech is not very free) often ``hijack''
|
||
onto wide open proxies and eat up your bandwidth.
|
||
|
||
Initialize the cache directories with ``squid -z'' (if this is a not a
|
||
new installation of squid, you should skip this step).
|
||
|
||
Now, run squid using the RunCache script in the /usr/local/squid/bin/
|
||
directory. If it works, you should be able to set your web browser's
|
||
proxy settings to the IP of the box and port 3128 (unless you changed
|
||
the default port number) and access squid as a normal proxy.
|
||
|
||
For additional help configuring squid, see the squid FAQ at www.squid-
|
||
cache.org
|
||
|
||
5. Setting up iptables (Netfilter)
|
||
|
||
iptables is a new thing for Linux kernel 2.4 that replaces ipchains.
|
||
If your distribution came with a 2.4 kernel, it probably has iptables
|
||
already installed. If not, you'll have to download it (and possibly
|
||
compile it). The homepage is netfilter.samba.org. You make be able
|
||
to find binary RPMs elsewhere, I haven't looked. For the curious,
|
||
there is plenty of documentation on the netfilter site.
|
||
|
||
To set up the rules, you will need to know two things, the interface
|
||
that the to-be-proxied requests are coming in on (I'll use eth0 as an
|
||
example) and the port squid is running on (I'll use the default of
|
||
3128 as an example).
|
||
Now, the magic words for transparent proxying:
|
||
|
||
|
||
· iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -j REDIRECT
|
||
--to-port 3128
|
||
|
||
You will want to add the above commands to your appropriate bootup
|
||
script under /etc/rc.d/. Readers upgrading from 2.2 kernels should
|
||
note that this is the only command needed. 2.2 kernels required two
|
||
extra commands in order to prevent forwarding loops. The
|
||
infastructure of netfilter is much nicer, and only this command is
|
||
needed.
|
||
|
||
6. Transparent Proxy to a Remote Box
|
||
|
||
Now, the question naturally arises, if we can do all this nifty stuff
|
||
redirecting HTTP connections to local ports, could we do the same
|
||
thing but to a remote box (e.g., the machine with squid running is
|
||
not the same machine as iptables is running on). The answer is yes,
|
||
but it takes a little different magic words. If you only want to
|
||
redirect to the local box (the normal case), skip this section.
|
||
|
||
For the purposes of example commands, let's assume we have two boxes
|
||
called squid-box and iptables-box, and that they are on the network
|
||
local-network. In the commands below, replace these strings with the
|
||
actual IP addresses or name of your machines and network.
|
||
|
||
I will present two different approaches here.
|
||
|
||
6.1. First method (simpler, but does not work for some esoteric
|
||
cases)
|
||
|
||
|
||
First, we need to machine that squid will be running on, squid-box.
|
||
You do not need iptables or any special kernel options on this
|
||
machine, just squid. You *will*, however, need the 'http_accel'
|
||
options as described above. (Previous version of this HOWTO suggested
|
||
that you did not need those options. That was a mistake. Sorry to
|
||
have confused people...)
|
||
|
||
Now, the machine that iptables will be running on, iptables-box You
|
||
will need to configure the kernel as described in section 3 above,
|
||
except that you don't need the REDIRECT target support). Now, for the
|
||
iptables commands. You need three:
|
||
|
||
|
||
· iptables -t nat -A PREROUTING -i eth0 -s ! squid-box -p tcp --dport
|
||
80 -j DNAT --to squid-box:3128
|
||
|
||
· iptables -t nat -A POSTROUTING -o eth0 -s local-network -d squid-
|
||
box -j SNAT --to iptables-box
|
||
|
||
· iptables -A FORWARD -s local-network -d squid-box -i eth0 -o eth0
|
||
-p tcp --dport 3128 -j ACCEPT
|
||
|
||
The first one sends the packets to squid-box from iptables-box. The
|
||
second makes sure that the reply gets sent back through iptables-box,
|
||
instead of directly to the client (this is very important!). The last
|
||
one makes sure the iptables-box will forward the appropriate packets
|
||
to squid-box. It may not be needed. YMMV. Note that we specified '-i
|
||
eth0' and then '-o eth0', which stands for input interface eth0 and
|
||
output interface eth0. If your packets are entering and leaving on
|
||
different interfaces, you will need to adjust the commands
|
||
accordingly.
|
||
|
||
|
||
Add these commands to your appropriate startup scripts under
|
||
/etc/rc.d/
|
||
|
||
(Thanks to Giles Coochey for help writing this section).
|
||
|
||
6.2. Second method (more complicated, but more general)
|
||
|
||
Our first shot at this works good, but there is a minor drawback in
|
||
that HTTP/1.0 connections without the Host header do not get handled
|
||
properly. Connections that are fully or partially HTTP/1.1 compliant
|
||
work fine. As most modern web browsers send the Host header, this is
|
||
not a problem for most people. However, some small programs or
|
||
embedded devices may send only very simple HTTP/1.0 requests. If you
|
||
want to support these, we'll need to do a little more work. Namely,
|
||
on iptables-box we'll need the following options enabled in the kernel
|
||
in addition to what was specified above:
|
||
|
||
|
||
· IP: advanced router
|
||
|
||
· IP: policy routing
|
||
|
||
· IP: use netfilter MARK value as routing key
|
||
|
||
· IP: Netfilter Configuration -> Packet mangling
|
||
|
||
· IP: Netfilter Configuration -> MARK target support
|
||
|
||
You'll also need the iproute2 tools. Your distribution probably
|
||
already has them installed, but if not, look at
|
||
ftp://ftp.inr.ac.ru/ip-routing/
|
||
|
||
You'll want to use the following set of commands on iptables-box:
|
||
|
||
· iptables -t mangle -A PREROUTING -j ACCEPT -p tcp --dport 80 -s
|
||
squid-box
|
||
|
||
· iptables -t mangle -A PREROUTING -j MARK --set-mark 3 -p tcp
|
||
--dport 80
|
||
|
||
· ip rule add fwmark 3 table 2
|
||
|
||
· ip route add default via squid-box dev eth1 table 2
|
||
|
||
Note that the choice of firewall mark (3) and routing table (2) was
|
||
fairly arbitrary. If you are already using policy routing or
|
||
firewall marking for some other purpose, make sure you choose
|
||
unique numbers here. Otherwise, don't worry about it.
|
||
|
||
|
||
Next, squid-box. Use this command, which should look remarkably
|
||
similar to a command we've seen previously.
|
||
|
||
· iptables -A PREROUTING -t nat -i eth0 -p tcp --dport 80 -j REDIRECT
|
||
--to-port 3128
|
||
|
||
As before, add all of these commands to the appropriate startup
|
||
scripts.
|
||
|
||
Here is a brief explanation of how this works: in method one, we used
|
||
Network Address Translation to get the packets to the other box. The
|
||
result of this is that the packet gets altered. This alteration is
|
||
what causes some kinds of clients mentioned above to fail. In method
|
||
two, we use a magic thing called policy routing. The first thing we
|
||
do is to select the packets we want. Thus, all packets on port 80,
|
||
except those coming from squid-box itself, are MARKed. Then, when the
|
||
kernel goes to make a routing decision, the MARKed packets aren't
|
||
routing using the normal routing table that you access with the
|
||
``route'' command but with a special table. This special table has
|
||
only one entry, a default gateway to squid-box. Thus, the packet is
|
||
sent merrily on it's way without every having been altered. So, even
|
||
HTTP/1.0 connections can be handled perfectly. (Thanks to Michal
|
||
Svoboda for suggesting and helping to write this section)
|
||
|
||
|
||
|
||
6.3. Method One: What if iptables-box is on a dynamic IP?
|
||
|
||
If the iptables-box is on a dynamic IP address (e.g. a dialup PPP
|
||
connection, or a DHCP assigned IP address from a cable modem, etc.),
|
||
then you will want to make a slight change to the above commands.
|
||
Replace the second command with this one:
|
||
|
||
|
||
· iptables -t nat -A POSTROUTING -o eth0 -s local-network -d squid-
|
||
box -j MASQUERADE
|
||
|
||
This change avoids having to specify the IP address of iptables-box in
|
||
the command. Since it will change often, you'd have to change your
|
||
commands to reflect it. This will save you a lot of hassle.
|
||
|
||
7. Transparent Proxy With Bridging
|
||
|
||
Warning, this is really esoteric stuff. If you need it, you'll know.
|
||
If not, skip this section. Thanks to Lewis Shobbrook
|
||
(lshobbrook@fasttrack.net.au) for contributing to this section.
|
||
|
||
If you are trying to setup a transparent proxy on a Linux machine that
|
||
has been configured as a bridge, you will need to add one additional
|
||
iptables command to what we had in section 5. Specifically, you need
|
||
to explicitly allow connections to the machine on port 3128 (or any
|
||
other port squid is listening on), otherwise the machine will just
|
||
forward them over to the other interface like a good little bridge.
|
||
Here's the magic words:
|
||
|
||
· iptables -A INPUT -i interface -p tcp -d your_bridge_ip -s local-
|
||
network --dport 3128 -m state --state NEW,ESTABLISHED -j ACCEPT
|
||
|
||
Replacing interface with the interface that corresponds to
|
||
your_bridge_ip (typically eth0 or eth1). First time bridge users
|
||
should also note that you'll probably want to repeat the same
|
||
command with ``3128'' replaced by ``telnet'' if you want to
|
||
administer your bridge remotely.
|
||
|
||
8. Put it all together
|
||
|
||
If everything has gone well so far, go to another machine, change it's
|
||
gateway to the IP of the box with iptables running on it, and surf
|
||
away. To make sure that requests are really being forwarded through
|
||
your proxy instead of straight to the origin server, check the log
|
||
file /usr/local/squid/logs/access.log
|
||
|
||
9. Troubleshooting
|
||
|
||
There is one problem that occurs often enough to mention here. If you
|
||
get the following error:
|
||
|
||
/lib/modules/2.4.2-2/kernel/net/ipv4/netfilter/ip_tables.o init_mod
|
||
ules: Device or resource busy Hints: insmod errors can be caused by
|
||
incorrect module parameters; including invalid IO or IRQ parameters.
|
||
|
||
|
||
perhaps iptables or your kernel needs to be upgraded...
|
||
|
||
|
||
|
||
then you are probably running Red Hat 7.x. The folks at Red Hat, in
|
||
all their wisdom, decided to load the ipchains module by default on
|
||
startup. I guess this was for backwards compatibility for those who
|
||
haven't learned iptables yet. However, the problem is that ipchains
|
||
and iptables are mutually incompatible. Since ipchains has been
|
||
secretly loaded by RH, you cannot use iptables commands. To see if
|
||
this is your problem, do the command ``lsmod'' and look for the module
|
||
named ``ipchains''. If you see it, that is your problem. The quick
|
||
fix is to execute the command ``rmmod ipchains'' before you issue any
|
||
iptables commands. To permanently remove these commands from your
|
||
startup scripts, the following command should work: ``/sbin/chkconfig
|
||
--level 2345 ipchains off''. (Thanks to Rasmus Glud for pointing this
|
||
command out to me).
|
||
|
||
|
||
10. Further Resources
|
||
|
||
Should you still need assistance, you may wish to check the squid FAQ
|
||
or the squid mailing list at www.squid-cache.org. You may also e-mail
|
||
me at drk@unxsoft.com, and I'll try to answer your questions if time
|
||
permits (sometimes it does, but sometimes it doesn't). Please,
|
||
please, please, send the output of ``iptables -t nat -L'' and relavent
|
||
portions of any configuration files in your e-mail, or else I will
|
||
probably not be able to help you out much. And please make sure
|
||
you've read the whole HOWTO before asking a question. Regrettably,
|
||
even though this document has been translated to many different
|
||
languages, I can only answer questions asked in English.
|
||
|
||
|
||
|