SQUID Frequently Asked Questions
© 2004 Team Squid, info@squid-cache.org
Frequently Asked Questions (with answers!) about the Squid Internet
Object Cache software.
You can download the FAQ as
HTML,
PDF,
compressed Postscript,
plain text,
linuxdoc SGML source or as a
compressed tar of HTML.
Squid is a high-performance proxy caching server for web clients,
supporting FTP, gopher, and HTTP data objects. Unlike traditional
caching software, Squid handles all requests in a single,
non-blocking, I/O-driven process.
Squid keeps
meta data and especially hot objects cached in RAM, caches
DNS lookups, supports non-blocking DNS lookups, and implements
negative caching of failed requests.
Squid supports SSL, extensive
access controls, and full request logging. By using the
lightweight Internet Cache Protocol, Squid caches can be arranged
in a hierarchy or mesh for additional bandwidth savings.
Squid consists of a main server program squid, a Domain Name System
lookup program dnsserver, some optional programs for rewriting
requests and performing authentication, and some management and client
tools. When squid starts up, it spawns a configurable number of
dnsserver processes, each of which can perform a single, blocking
Domain Name System (DNS) lookup. This reduces the amount of time the
cache waits for DNS lookups.
Squid is derived from the ARPA-funded
Harvest project.
Internet object caching is a way to store requested Internet objects
(i.e., data available via the HTTP, FTP, and gopher protocols) on a
system closer to the requesting site than to the source. Web browsers
can then use the local Squid cache as a proxy HTTP server, reducing
access time as well as bandwidth consumption.
Harris' Lament says, ``All the good ones are taken.''
We needed to distinguish this new version from the Harvest
cache software. Squid was the code name for initial
development, and it stuck.
Squid is updated often; please see
the Squid home page
for the most recent versions.
Squid is the result of efforts by numerous individuals from
the Internet community.
Duane Wessels
of the National Laboratory for Applied Network Research (funded by
the National Science Foundation) leads code development.
Please see
the CONTRIBUTORS file
for a list of our excellent contributors.
You can download Squid via FTP from
the primary FTP site
or one of the many worldwide
mirror sites.
Many sushi bars also have Squid.
The software is designed to operate on any modern Unix system, and
is known to work on at least the following platforms:
- Linux
- FreeBSD
- NetBSD
- OpenBSD
- BSDI
- Mac OS X
- OSF/Digital Unix/Tru64
- IRIX
- SunOS/Solaris
- NeXTStep
- SCO Unix
- AIX
- HP-UX
- OS/2
For more specific information, please see
platforms.php.
If you encounter any platform-specific problems, please
let us know by registering an entry in our
bug database.
Recent versions of Squid will compile and run on Windows/NT
with the
Cygwin /
Mingw packages.
Guido Serassio
maintains the native NT port of Squid and is actively working on having the needed changes integrated into the standard Squid distribution. The port is partially based on an earlier NT port by
Romeo Anghelache.
LogiSense
has ported Squid to Windows NT and sells a supported
version. You can also download the source from
their FTP site.
Thanks to LogiSense for making the code available as required by the GPL terms.
- squid-users@squid-cache.org: general discussions about the
Squid cache software. Subscribe via
squid-users-subscribe@squid-cache.org.
Previous messages are available for browsing at
the Squid Users Archive,
and also at
theaimsgroup.com.
- squid-users-digest: digested (daily) version of
above. Subscribe via
squid-users-digest-subscribe@squid-cache.org.
- squid-announce@squid-cache.org: A receive-only list for
announcements of new versions.
Subscribe via
squid-announce-subscribe@squid-cache.org.
- squid-bugs@squid-cache.org:
A closed list for sending us bug reports.
Bug reports received here are given priority over
those mentioned on squid-users.
- squid@squid-cache.org:
A closed list for sending us feed-back and ideas.
- squid-faq@squid-cache.org:
A closed list for sending us feed-back, updates, and additions to
the Squid FAQ.
We also have a few other mailing lists which are not strictly
Squid-related.
- cache-snmp@ircache.net:
A public list for discussion of Web Caching and SNMP issues and developments.
Eventually we hope to put forth a standard Web Caching MIB.
- icp-wg@ircache.net:
Mostly-idle mailing list for the nonexistent ICP Working Group within
the IETF. It may be resurrected some day, you never know!
All of our mailing lists have ``-subscribe'' and ``-unsubscribe''
addresses that you must
use for subscribe and unsubscribe requests. To unsubscribe from
the squid-users list, you send a message to squid-users-unsubscribe@squid-cache.org.
As of version 2.5, Squid can terminate SSL connections. This is perhaps
only useful in a surrogate (http accelerator) configuration. You must
run configure with --enable-ssl. See https_port in
squid.conf for more information.
Squid also supports these encrypted protocols by ``tunnelling''
traffic between clients and servers. In this case, Squid can relay
the encrypted bits between a client and a server.
Normally, when your browser comes across an https URL, it
does one of two things:
- The browser opens an SSL connection directly to the origin
server.
- The browser tunnels the request through Squid with the
CONNECT request method.
The CONNECT method is a way to tunnel any kind of
connection through an HTTP proxy. The proxy doesn't
understand or interpret the contents. It just passes
bytes back and forth between the client and server.
For the gory details on tunnelling and the CONNECT
method, please see
RFC 2817
and
Tunneling TCP based protocols through Web proxy servers (expired).
Squid is
copyrighted
by the University of California San Diego.
Squid uses some
code developed by others.
Squid is
Free Software.
Squid is licensed under the terms of the
GNU General Public License.
We think so. Squid uses the Unix time format for all internal time
representations. Potential problem areas are in printing and
parsing other time representations. We have made the following
fixes to address the year 2000:
- cache.log timestamps use 4-digit years instead of just 2 digits.
- parse_rfc1123() assumes years less than "70" are after 2000.
- parse_iso3307_time() checks all four year digits.
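The two-digit year rule in the list above can be sketched as a small shell function. This is only an illustration of the pivot at 70, not Squid's parse_rfc1123() code; the name expand_year is hypothetical.

```shell
# expand_year: hypothetical helper, not part of Squid.
# Maps a two-digit year to four digits using the pivot described above:
# values below 70 are taken as 20xx, everything else as 19xx.
expand_year() {
    yy=${1#0}      # drop one leading zero so "05" is read as 5
    yy=${yy:-0}    # a bare "0" becomes empty after stripping; restore it
    if [ "$yy" -lt 70 ]; then
        echo $((2000 + yy))
    else
        echo $((1900 + yy))
    fi
}
```

For instance, expand_year 69 prints 2069 while expand_year 70 prints 1970.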
Year-2000 fixes were applied to the following Squid versions:
- squid-2.1:
Year parsing bug fixed for dates in the "Wed Jun 9 01:29:59 1993 GMT"
format (Richard Kettlewell).
- squid-1.1.22:
Fixed likely year-2000 bug in ftpget's timestamp parsing (Henrik Nordstrom).
- squid-1.1.20:
Misc fixes (Arjan de Vet).
Patches:
Squid-2.2 and earlier versions have a
New Year bug. This is not strictly a Year-2000 bug; it would happen on the first day of any year.
Yep. Please see the
commercial support page.
The following people have made contributions to this document:
Please send corrections, updates, and comments to:
squid-faq@squid-cache.org.
This document is copyrighted (2000) by Duane Wessels.
This document was written in SGML and converted with the
SGML-Tools package.
The most current version of this document can always be found at
http://www.squid-cache.org/Doc/FAQ/ in HTML, Plain Text, Postscript and SGML formats.
Want to contribute? Please write in SGML...
It is easier for us if you send us text which is close to "correct" SGML.
The SQUID FAQ currently uses the LINUXDOC DTD. It's probably easiest
to follow the examples in this file.
Here are the basics:
Use the <url> tag for links, instead of HTML <A HREF ...>
<url url="http://www.squid-cache.org" name="Squid Home Page">
Use <em> for emphasis, config options, and pathnames:
<em>/usr/local/squid/etc/squid.conf</em>
<em/cache_peer/
Here is how you do lists:
<itemize>
<item>foo
<item>bar
</itemize>
Use <verb>, just like HTML's <PRE> to show
unformatted text.
You must download a source archive file of the form
squid-x.y.z-src.tar.gz (eg, squid-1.1.6-src.tar.gz) from
the Squid home page, or
the Squid FTP site.
Context diffs are available for upgrading to new versions.
These can be applied with the patch program (available from
the GNU FTP site).
For Squid-1.0 and Squid-1.1 versions, you can just
type make from the top-level directory after unpacking
the source files. For example:
% tar xzf squid-1.1.21-src.tar.gz
% cd squid-1.1.21
% make
For Squid-2 you must run the configure script yourself
before running make:
% tar xzf squid-2.0.RELEASE-src.tar.gz
% cd squid-2.0.RELEASE
% ./configure
% make
To compile Squid, you will need an ANSI C compiler. Almost all
modern Unix systems come with pre-installed compilers which work
just fine. The old SunOS compilers do not have support for ANSI
C, and the Sun compiler for Solaris is a product which
must be purchased separately.
If you are uncertain about your system's C compiler, the GNU C compiler is
available at
the GNU FTP site.
In addition to gcc, you may also want or need to install the binutils
package.
You will need
Perl installed
on your system.
The developers do not have the resources to make pre-compiled
binaries available. Instead, we invest effort into making
the source code very portable. Some people have made
binary packages available. Please see our
Platforms Page.
The
SGI Freeware site
has pre-compiled packages for SGI IRIX.
Squid binaries for
FreeBSD on Alpha and Intel.
Squid binaries for
NetBSD on everything
Gurkan Sengun has some
Sparc/Solaris packages
available.
You need the patch program. You should probably duplicate the
entire directory structure before applying the patch. For example, if
you are upgrading from squid-2.5.STABLE3 to squid-2.5.STABLE4, you would run
these commands:
cd squid-2.5.STABLE3
mkdir ../squid-2.5.STABLE4
find . -depth -print | cpio -pdv ../squid-2.5.STABLE4
cd ../squid-2.5.STABLE4
patch -p1 < /tmp/squid-2.5.STABLE3-STABLE4.diff
or alternatively
cp -rl squid-2.5.STABLE3 squid-2.5.STABLE4
cd squid-2.5.STABLE4
zcat /tmp/squid-2.5.STABLE3-STABLE4.diff.gz | patch -p1
After the patch has been applied, you must rebuild Squid from the
very beginning, i.e.:
make distclean
./configure ...
make
make install
If your patch program seems to complain or refuses to work,
you should get a more recent version, from the
GNU FTP site, for example.
The configure script can take numerous options. The most
useful is --prefix to install it in a different directory.
The default installation directory is /usr/local/squid/. To
change the default, you could do:
% cd squid-x.y.z
% ./configure --prefix=/some/other/directory/squid
Type
% ./configure --help
to see all available options. You will need to specify some
of these options to enable or disable certain features.
Some options which are used often include:
--prefix=PREFIX install architecture-independent files in PREFIX
[/usr/local/squid]
--enable-dlmalloc[=LIB] Compile & use the malloc package by Doug Lea
--enable-gnuregex Compile GNUregex
--enable-splaytree Use SPLAY trees to store ACL lists
--enable-xmalloc-debug Do some simple malloc debugging
--enable-xmalloc-debug-trace
Detailed trace of memory allocations
--enable-xmalloc-statistics
Show malloc statistics in status page
--enable-carp Enable CARP support
--enable-async-io Do ASYNC disk I/O using threads
--enable-icmp Enable ICMP pinging
--enable-delay-pools Enable delay pools to limit bandwidth usage
--enable-mem-gen-trace Do trace of memory stuff
--enable-useragent-log Enable logging of User-Agent header
--enable-kill-parent-hack
Kill parent on shutdown
--enable-snmp Enable SNMP monitoring
--enable-cachemgr-hostname[=hostname]
Make cachemgr.cgi default to this host
--enable-arp-acl Enable use of ARP ACL lists (ether address)
--enable-htcp Enable HTCP protocol
--enable-forw-via-db Enable Forw/Via database
--enable-cache-digests Use Cache Digests
see http://www.squid-cache.org/Doc/FAQ/FAQ-16.php
--enable-err-language=lang
Select language for Error pages (see errors dir)
by
Kevin Sartorelli
and
Andreas Doering.
Probably you've recently installed bind 8.x. There is a mismatch between
the header files and DNS library that Squid has found. There are a couple
of things you can try.
First, try adding -lbind to XTRA_LIBS in src/Makefile.
If -lresolv is already there, remove it.
If that doesn't seem to work, edit your arpa/inet.h file and comment out the following:
#define inet_addr __inet_addr
#define inet_aton __inet_aton
#define inet_lnaof __inet_lnaof
#define inet_makeaddr __inet_makeaddr
#define inet_neta __inet_neta
#define inet_netof __inet_netof
#define inet_network __inet_network
#define inet_net_ntop __inet_net_ntop
#define inet_net_pton __inet_net_pton
#define inet_ntoa __inet_ntoa
#define inet_pton __inet_pton
#define inet_ntop __inet_ntop
#define inet_nsap_addr __inet_nsap_addr
#define inet_nsap_ntoa __inet_nsap_ntoa
If you have source for BIND, you can modify it as indicated in the diff
below. It causes the global variable _dns_ttl_ to be set with the TTL
of the most recent lookup. Then, when you compile Squid, the configure
script will look for the _dns_ttl_ symbol in libresolv.a. If found,
dnsserver will return the TTL value for every lookup.
This hack was contributed by
Endre Balint Nagy.
diff -ru bind-4.9.4-orig/res/gethnamaddr.c bind-4.9.4/res/gethnamaddr.c
--- bind-4.9.4-orig/res/gethnamaddr.c Mon Aug 5 02:31:35 1996
+++ bind-4.9.4/res/gethnamaddr.c Tue Aug 27 15:33:11 1996
@@ -133,6 +133,7 @@
} align;
extern int h_errno;
+int _dns_ttl_;
#ifdef DEBUG
static void
@@ -223,6 +224,7 @@
host.h_addr_list = h_addr_ptrs;
haveanswer = 0;
had_error = 0;
+ _dns_ttl_ = -1;
while (ancount-- > 0 && cp < eom && !had_error) {
n = dn_expand(answer->buf, eom, cp, bp, buflen);
if ((n < 0) || !(*name_ok)(bp)) {
@@ -232,8 +234,11 @@
cp += n; /* name */
type = _getshort(cp);
cp += INT16SZ; /* type */
- class = _getshort(cp);
- cp += INT16SZ + INT32SZ; /* class, TTL */
+ class = _getshort(cp);
+ cp += INT16SZ; /* class */
+ if (qtype == T_A && type == T_A)
+ _dns_ttl_ = _getlong(cp);
+ cp += INT32SZ; /* TTL */
n = _getshort(cp);
cp += INT16SZ; /* len */
if (class != C_IN) {
And here is a patch for BIND-8:
*** src/lib/irs/dns_ho.c.orig Tue May 26 21:55:51 1998
--- src/lib/irs/dns_ho.c Tue May 26 21:59:57 1998
***************
*** 87,92 ****
--- 87,93 ----
#endif
extern int h_errno;
+ int _dns_ttl_;
/* Definitions. */
***************
*** 395,400 ****
--- 396,402 ----
pvt->host.h_addr_list = pvt->h_addr_ptrs;
haveanswer = 0;
had_error = 0;
+ _dns_ttl_ = -1;
while (ancount-- > 0 && cp < eom && !had_error) {
n = dn_expand(ansbuf, eom, cp, bp, buflen);
if ((n < 0) || !(*name_ok)(bp)) {
***************
*** 404,411 ****
cp += n; /* name */
type = ns_get16(cp);
cp += INT16SZ; /* type */
! class = ns_get16(cp);
! cp += INT16SZ + INT32SZ; /* class, TTL */
n = ns_get16(cp);
cp += INT16SZ; /* len */
if (class != C_IN) {
--- 406,416 ----
cp += n; /* name */
type = ns_get16(cp);
cp += INT16SZ; /* type */
! class = _getshort(cp);
! cp += INT16SZ; /* class */
! if (qtype == T_A && type == T_A)
! _dns_ttl_ = _getlong(cp);
! cp += INT32SZ; /* TTL */
n = ns_get16(cp);
cp += INT16SZ; /* len */
if (class != C_IN) {
cache_cf.c: In function `parseConfigFile':
cache_cf.c:1353: yacc stack overflow before `token'
...
You may need to upgrade your gcc installation to a more recent version.
Check your gcc version with
gcc -v
If it is earlier than 2.7.2, you might consider upgrading.
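The comparison against 2.7.2 can be scripted. The sketch below is a hypothetical helper (version_lt is not part of Squid or gcc) that compares the first three numeric fields of two dotted version strings; it assumes every field is a plain number.

```shell
# version_lt A B: succeeds when dotted version A is older than version B.
# Hypothetical helper; compares the first three numeric fields only.
version_lt() {
    for field in 1 2 3; do
        # Pad with ".0.0" so that short versions such as "3" still
        # yield three fields.
        a=$(echo "$1.0.0" | cut -d. -f"$field")
        b=$(echo "$2.0.0" | cut -d. -f"$field")
        [ "$a" -lt "$b" ] && return 0
        [ "$a" -gt "$b" ] && return 1
    done
    return 1    # versions are equal, so A is not older
}
```

One possible use, assuming your gcc supports the -dumpversion option, is: version_lt "$(gcc -dumpversion)" 2.7.2 && echo "consider upgrading gcc".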
The following error occurs on Solaris systems using gcc when the Solaris C
compiler is not installed:
/usr/bin/rm -f libmiscutil.a
/usr/bin/false r libmiscutil.a rfc1123.o rfc1738.o util.o ...
make[1]: *** [libmiscutil.a] Error 255
make[1]: Leaving directory `/tmp/squid-1.1.11/lib'
make: *** [all] Error 1
Note on the second line the /usr/bin/false. This is supposed
to be a path to the ar program. If configure cannot find ar
on your system, then it substitutes false.
To fix this you either need to:
- Add /usr/ccs/bin to your PATH. This is where the ar
command should be. You need to install SUNWbtool if ar
is not there. Otherwise,
- Install the binutils package from
the GNU FTP site.
This package includes programs such as ar, as, and ld.
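Before running configure you can check whether ar is reachable. The following is a minimal sketch; find_tool is a hypothetical helper name, and the optional second argument is an extra directory to try (such as /usr/ccs/bin on Solaris) when the tool is not on the PATH.

```shell
# find_tool NAME [DIR]: print the location of NAME, looking on the PATH
# first and then in the optional directory DIR. Fails if neither has it.
# Hypothetical helper, not part of the Squid distribution.
find_tool() {
    extra_dir=${2:-}
    if command -v "$1" >/dev/null 2>&1; then
        command -v "$1"
    elif [ -n "$extra_dir" ] && [ -x "$extra_dir/$1" ]; then
        echo "$extra_dir/$1"
    else
        return 1
    fi
}
```

For example, find_tool ar /usr/ccs/bin || echo "ar not found; install SUNWbtool or binutils" reports the problem before configure silently substitutes false.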
Please check the
page of platforms
on which Squid is known to compile. Your problem might be listed
there together with a solution. If it isn't listed there, mail
us what you are trying, your Squid version, and the problems
you encounter.
Warnings are usually not a big concern, and can be common with software
designed to operate on multiple platforms. If you feel like fixing
compile-time warnings, please do so and send us the patches.
by
Doug Nazar
In order to compile Squid, you need to have a reasonable facsimile of a
Unix system installed. This includes bash, make, sed,
emx, various file utilities and a few more. I've set up a TVFS
drive that matches a Unix file system but this probably isn't strictly
necessary.
I made a few modifications to the pristine EMX 0.9d install.
- added defines for strcasecmp() & strncasecmp() to string.h
- changed all occurrences of time_t to signed long instead
of unsigned long
- hacked ld.exe
- to search for both xxxx.a and libxxxx.a
- to produce the correct filename when using the
-Zexe option
You will need to run scripts/convert.configure.to.os2 (in the
Squid source distribution) to modify
the configure script so that it can search for the various programs.
Next, you need to set a few environment variables (see EMX docs
for meaning):
export EMXOPT="-h256 -c"
export LDFLAGS="-Zexe -Zbin -s"
Now you are ready to configure squid:
./configure
Compile everything:
make
and finally, install:
make install
This will, by default, install into /usr/local/squid. If you wish
to install somewhere else, see the --prefix option for configure.
Now, don't forget to set EMXOPT before running squid each time. I
recommend using the -Y and -N options.
There are no hard-and-fast rules. The most important resource
for Squid is physical memory. Your processor does not need
to be ultra-fast. Your disk system will be the major bottleneck,
so fast disks are important for high-volume caches. Do not use
IDE disks if you can help it.
In late 1998, if you are buying a new machine for
a cache, I would recommend the following configuration:
- 300 MHz Pentium II CPU
- 512 MB RAM
- Five 9 GB UW-SCSI disks
Your system disk, and logfile disk can probably be IDE without losing
any cache performance.
Also, see
Squid Sizing for Intel Platforms by Martin Hamilton. This is a
very nice page summarizing system configurations people are using for
large Squid caches.
After
compiling Squid, you can install it
with this simple command:
% make install
If you have enabled the
ICMP features
then you will also want to type
% su
# make install-pinger
After installing, you will want to edit and customize
the squid.conf file. By default, this file is
located at /usr/local/squid/etc/squid.conf.
Also, a QUICKSTART guide has been included with the source
distribution. Please see the directory where you
unpacked the source archive.
The squid.conf file defines the configuration for
squid. The configuration includes (but is not limited to) the
HTTP port number, the ICP request port number, incoming and outgoing
requests, information about firewall access, and various timeout
information.
Yes, after you make install, a sample squid.conf file will
exist in the ``etc'' directory under the Squid installation directory.
The sample squid.conf file contains comments explaining each
option.
First you need to make your Squid configuration. The Squid configuration
can be found in /usr/local/squid/etc/squid.conf and by default includes
documentation on all directives.
In the Squid distribution there is a small QUICKSTART guide indicating
which directives you need to look closer at and why. At an absolute minimum
you need to change the http_access configuration to allow access from
your clients.
To verify your configuration file you can use the -k parse option
% /usr/local/squid/sbin/squid -k parse
If this outputs any errors then these are syntax errors or other fatal
misconfigurations and need to be corrected before you continue. If it is
silent and immediately gives back the command prompt then your squid.conf
is syntactically correct and could be understood by Squid.
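One way to use the -k parse check is to refuse to reload a running Squid when the configuration does not parse. This is a sketch under assumptions: safe_reconfigure is a hypothetical name, and the default path simply follows the installation prefix used in this FAQ.

```shell
# safe_reconfigure [SQUID_BINARY]: run "-k parse" first and only send
# "-k reconfigure" when the parse succeeded. Hypothetical wrapper.
safe_reconfigure() {
    squid_bin=${1:-/usr/local/squid/sbin/squid}
    if "$squid_bin" -k parse; then
        "$squid_bin" -k reconfigure
    else
        echo "squid.conf has errors; not reconfiguring" >&2
        return 1
    fi
}
```

Calling safe_reconfigure with no argument uses the default binary; pass another path if Squid is installed elsewhere.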
After you've finished editing the configuration file, you can
start Squid for the first time. The procedure depends a little
bit on which version you are using.
First, you must create the swap directories. Do this by
running Squid with the -z option:
% /usr/local/squid/sbin/squid -z
NOTE: If you run Squid as root then you may need to first create
/usr/local/squid/var/logs and your cache_dir directories and assign ownership
of these to the cache_effective_user configured in your squid.conf.
Once the creation of the cache directories completes, you can start Squid
and try it out. Probably the best thing to do is run it from your terminal
and watch the debugging output. Use this command:
% /usr/local/squid/sbin/squid -NCd1
If everything is working okay, you will see the line:
Ready to serve requests.
If you want to run squid in the background, as a daemon process,
just leave off all options:
% /usr/local/squid/sbin/squid
NOTE: depending on which http_port you select you may need to start
squid as root (http_port <1024).
NOTE: In Squid-2.4 and earlier Squid was installed in bin by default, not sbin.
Squid-2 has a restart feature built in. This greatly simplifies
starting Squid and means that you don't need to use RunCache
or inittab. At the minimum, you only need to enter the
pathname to the Squid executable. For example:
/usr/local/squid/sbin/squid
Squid will automatically background itself and then spawn
a child process. In your syslog messages file, you
should see something like this:
Sep 23 23:55:58 kitty squid[14616]: Squid Parent: child process 14617 started
That means that process ID 14616 is the parent process which monitors the child
process (pid 14617). The child process is the one that does all of the
work. The parent process just waits for the child process to exit. If the
child process exits unexpectedly, the parent will automatically start another
child process. In that case, syslog shows:
Sep 23 23:56:02 kitty squid[14616]: Squid Parent: child process 14617 exited with status 1
Sep 23 23:56:05 kitty squid[14616]: Squid Parent: child process 14619 started
If there is some problem, and Squid can not start, the parent process will give up
after a while. Your syslog will show:
Sep 23 23:56:12 kitty squid[14616]: Exiting due to repeated, frequent failures
When this happens you should check your syslog messages and
cache.log file for error messages.
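The parent/child behaviour described above can be imitated in a few lines of shell. This is a simplified sketch, not Squid's implementation: supervise is a hypothetical function, and it gives up after a fixed number of failures, whereas Squid also takes into account how quickly the failures follow each other.

```shell
# supervise MAX COMMAND [ARGS...]: run COMMAND, restarting it each time it
# exits with a non-zero status, and give up after MAX failures.
# Hypothetical illustration of the parent process; timing is ignored.
supervise() {
    max_failures=$1
    shift
    failures=0
    while [ "$failures" -lt "$max_failures" ]; do
        "$@" && return 0    # child exited cleanly; nothing to restart
        failures=$((failures + 1))
        echo "child failed ($failures/$max_failures)" >&2
    done
    echo "Exiting due to repeated failures" >&2
    return 1
}
```

A command that always fails, such as supervise 3 false, is started three times before the function gives up.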
When you look at a process (ps command) listing, you'll see two squid processes:
24353 ?? Ss 0:00.00 /usr/local/squid/bin/squid
24354 ?? R 0:03.39 (squid) (squid)
The first is the parent process, and the child process is the one called ``(squid)''.
Note that if you accidentally kill the parent process, the child process will not
notice.
If you want to run Squid from your terminal and prevent it from
backgrounding and spawning a child process, use the -N command
line option.
/usr/local/squid/bin/squid -N
From inittab
On systems which have an /etc/inittab file (Digital Unix,
Solaris, IRIX, HP-UX, Linux), you can add a line like this:
sq:3:respawn:/usr/local/squid/sbin/squid.sh < /dev/null >> /tmp/squid.log 2>&1
We recommend using a squid.sh shell script, but you could instead call
Squid directly with the -N option and other options you may require. A sample squid.sh script is shown below:
#!/bin/sh
C=/usr/local/squid
PATH=/usr/bin:$C/bin
TZ=PST8PDT
export PATH TZ
# User to notify on restarts
notify="root"
# Squid command line options
opts=""
cd $C
umask 022
sleep 10
while [ -f /var/run/nosquid ]; do
sleep 1
done
/usr/bin/tail -20 $C/logs/cache.log \
| Mail -s "Squid restart on `hostname` at `date`" $notify
exec bin/squid -N $opts
From rc.local
On BSD-ish systems, you will need to start Squid from the ``rc'' files,
usually /etc/rc.local. For example:
if [ -f /usr/local/squid/sbin/squid ]; then
echo -n ' Squid'
/usr/local/squid/sbin/squid
fi
From init.d
Squid ships with an init.d type startup script in contrib/squid.rc which
works on most init.d type systems. Or you can write your own using any
normal init.d script found in your system as a template and add the
start/stop fragments shown below.
Start:
/usr/local/squid/sbin/squid
Stop:
/usr/local/squid/sbin/squid -k shutdown
n=120
while /usr/local/squid/sbin/squid -k check && [ $n -gt 0 ]; do
sleep 1
echo -n .
n=`expr $n - 1`
done
You can use the squidclient program:
% squidclient http://www.netscape.com/ > test
There are other command-line HTTP client programs available
as well. Two that you may find useful are
wget
and
echoping.
Another way is to use Squid itself to see if it can signal a running
Squid process:
% squid -k check
And then check the shell's exit status variable.
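Checking the exit status can be wrapped in a function so that scripts get a readable answer. A minimal sketch follows; squid_status is a hypothetical name, and it assumes that squid -k check exits with status zero only when it can signal a running Squid process.

```shell
# squid_status [SQUID_BINARY]: print "running" or "not running" based on
# the exit status of "squid -k check". Hypothetical wrapper.
squid_status() {
    squid_bin=${1:-squid}
    if "$squid_bin" -k check >/dev/null 2>&1; then
        echo "running"
    else
        echo "not running"
    fi
}
```

With no argument the function looks for squid on the PATH; pass a full path such as /usr/local/squid/sbin/squid otherwise.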
Also, check the log files, most importantly the access.log and
cache.log files.
These are the command line options for Squid-2:
- -a
Specify an alternate port number for incoming HTTP requests.
Useful for testing a configuration file on a non-standard port.
- -d
Debugging level for ``stderr'' messages. If you use this
option, then debugging messages up to the specified level will
also be written to stderr.
- -f
Specify an alternate squid.conf file instead of the
pathname compiled into the executable.
- -h
Prints the usage and help message.
- -k reconfigure
Sends a HUP signal, which causes Squid to re-read
its configuration files.
- -k rotate
Sends a USR1 signal, which causes Squid to
rotate its log files. Note, if logfile_rotate
is set to zero, Squid still closes and re-opens
all log files.
- -k shutdown
Sends a TERM signal, which causes Squid to
wait briefly for current connections to finish and then
exit. The amount of time to wait is specified with
shutdown_lifetime.
- -k interrupt
Sends an INT signal, which causes Squid to
shut down immediately, without waiting for
current connections.
- -k kill
Sends a KILL signal, which causes the Squid
process to exit immediately, without closing
any connections or log files. Use this only
as a last resort.
- -k debug
Sends a USR2 signal, which causes Squid
to generate full debugging messages until the
next USR2 signal is received. Obviously
very useful for debugging problems.
- -k check
Sends a ``ZERO'' signal to the Squid process.
This simply checks whether or not the process
is actually running.
- -s
Sends debugging (level 0 only) messages to syslog.
- -u
Specify an alternate port number for ICP messages.
Useful for testing a configuration file on a non-standard port.
- -v
Prints the Squid version.
- -z
Creates disk swap directories. You must use this option when
installing Squid for the first time, or when you add or
modify the cache_dir configuration.
- -D
Do not make initial DNS tests. Normally, Squid looks up
some well-known DNS hostnames to ensure that your DNS
name resolution service is working properly.
- -F
If the swap.state logs are clean, then the cache is
rebuilt in the ``foreground'' before any requests are
served. This will decrease the time required to rebuild
the cache, but HTTP requests will not be satisfied during
this time.
- -N
Do not automatically become a background daemon process.
- -R
Do not set the SO_REUSEADDR option on sockets.
- -V
Enable virtual host support for the httpd-accelerator mode.
This is identical to writing httpd_accel_host virtual
in the config file.
- -X
Enable full debugging while parsing the config file.
- -Y
Return ICP_OP_MISS_NOFETCH instead of ICP_OP_MISS while
the swap.state file is being read. If your cache has
mostly child caches which use ICP, this will allow your
cache to rebuild faster.
- Check the cache.log file in your logs directory. It logs
interesting (and boring) things as a part of its normal operation.
- Install and use the
Cache Manager.
Squid is a single process application and can not make use of SMP.
If you want to make Squid benefit from a SMP system you will need to run
multiple instances of Squid and find a way to distribute your users on the
different Squid instances just as if you had multiple Squid boxes.
Having two CPUs is indeed nice for running other CPU intensive
tasks on the same server as the proxy, such as if you have a lot of logs
and need to run various statistics collections during peak hours.
The authentication and group helpers barely use any CPU and do
not benefit from a dual-CPU configuration.
RAID1 is fine, and so are separate drives.
RAID0 (striping) with Squid only gives you the drawback that if
you lose one of the drives the whole stripe set is lost. There is no
benefit in performance as Squid already distributes the load on the drives
quite nicely.
Squid is the worst case application for RAID5, whether hardware or
software, and will absolutely kill the performance of a RAID5. Once the
cache has been filled, Squid uses a lot of small random writes, which is the
worst case workload for RAID5, effectively reducing write speed to only
little more than that of one single drive.
Generally seek time is what you want to optimize for Squid, or
more precisely the total amount of seeks/s your system can sustain.
Choosing the wrong RAID solution generally decreases the amount of seeks/s
your system can sustain significantly.
To place your cache in a hierarchy, use the cache_peer
directive in squid.conf to specify the parent and sibling
nodes.
For example, the following squid.conf file on
childcache.example.com configures its cache to retrieve
data from one parent cache and two sibling caches:
# squid.conf - On the host: childcache.example.com
#
# Format is: hostname type http_port udp_port
#
cache_peer parentcache.example.com parent 3128 3130
cache_peer childcache2.example.com sibling 3128 3130
cache_peer childcache3.example.com sibling 3128 3130
The cache_peer_domain directive allows you to specify that
certain caches are siblings or parents for certain domains:
# squid.conf - On the host: sv.cache.nlanr.net
#
# Format is: hostname type http_port udp_port
#
cache_peer electraglide.geog.unsw.edu.au parent 3128 3130
cache_peer cache1.nzgate.net.nz parent 3128 3130
cache_peer pb.cache.nlanr.net parent 3128 3130
cache_peer it.cache.nlanr.net parent 3128 3130
cache_peer sd.cache.nlanr.net parent 3128 3130
cache_peer uc.cache.nlanr.net sibling 3128 3130
cache_peer bo.cache.nlanr.net sibling 3128 3130
cache_peer_domain electraglide.geog.unsw.edu.au .au
cache_peer_domain cache1.nzgate.net.nz .au .aq .fj .nz
cache_peer_domain pb.cache.nlanr.net .uk .de .fr .no .se .it
cache_peer_domain it.cache.nlanr.net .uk .de .fr .no .se .it
cache_peer_domain sd.cache.nlanr.net .mx .za .mu .zm
The configuration above indicates that the cache will use
pb.cache.nlanr.net and it.cache.nlanr.net
for domains uk, de, fr, no, se and it, sd.cache.nlanr.net
for domains mx, za, mu and zm, cache1.nzgate.net.nz
for domains au, aq, fj, and nz, and electraglide.geog.unsw.edu.au
for domain au.
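To make the mapping concrete, the cache_peer_domain lines above can be restated as a lookup from a request hostname to the domain-restricted parents. This is only an illustration and not how Squid selects peers; peers_for_host is a hypothetical name. The two siblings (uc and bo) have no cache_peer_domain line, so they are not limited to particular domains and are left out of the lookup.

```shell
# peers_for_host HOSTNAME: print the domain-restricted parents from the
# example configuration that may be used for HOSTNAME.
# Hypothetical illustration; peers without a cache_peer_domain line
# are not listed because they apply to every domain.
peers_for_host() {
    case $1 in
        *.au)
            echo "electraglide.geog.unsw.edu.au cache1.nzgate.net.nz" ;;
        *.aq|*.fj|*.nz)
            echo "cache1.nzgate.net.nz" ;;
        *.uk|*.de|*.fr|*.no|*.se|*.it)
            echo "pb.cache.nlanr.net it.cache.nlanr.net" ;;
        *.mx|*.za|*.mu|*.zm)
            echo "sd.cache.nlanr.net" ;;
        *)
            echo "" ;;    # no restricted parent matches this domain
    esac
}
```

For example, peers_for_host www.example.de prints the two parents pb.cache.nlanr.net and it.cache.nlanr.net.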
We have a simple set of
guidelines for joining
the NLANR cache hierarchy.
The NLANR hierarchy can provide you with an initial source for parent or
sibling caches. Joining the NLANR global cache system will frequently
improve the performance of your caching service.
Just enable these options in your squid.conf and you'll be
registered:
cache_announce 24
announce_to sd.cache.nlanr.net:3131
NOTE: announcing your cache is not the same thing as
joining the NLANR cache hierarchy.
You can join the NLANR cache hierarchy without registering, and
you can register without joining the NLANR cache hierarchy.
Visit the NLANR cache
registration database
to discover other caches near you. Keep in mind that just because
a cache is registered in the database does not mean they
are willing to be your parent/sibling/child. But it can't hurt to ask...
- Your site will not be listed if your cache IP address does not have
a DNS PTR record. If we can't map the IP address back to a domain
name, it will be listed as ``Unknown.''
- The registration messages are sent with UDP. We may not be receiving
your announcement message due to firewalls which block UDP, or
dropped packets due to congestion.
This entry has been moved to
a different section.
Note: The information here is current for version 2.2.
If you are behind a firewall then you can't make direct connections
to the outside world, so you must use a
parent cache. Squid doesn't use ICP queries for a request if it's
behind a firewall or if there is only one parent.
You can use the never_direct access list in
squid.conf to specify which requests must be forwarded to
your parent cache outside the firewall, and the always_direct access list
to specify which requests must not be forwarded. For example, if Squid
must connect directly to all servers that end with mydomain.com, but
must use the parent for all others, you would write:
acl INSIDE dstdomain .mydomain.com
always_direct allow INSIDE
never_direct allow all
You could also specify internal servers by IP address
acl INSIDE_IP dst 1.2.3.0/24
always_direct allow INSIDE_IP
never_direct allow all
Note, however, that when you use IP addresses, Squid must
perform a DNS lookup to convert URL hostnames to an
address. Your internal DNS servers may not be able to
look up external domains.
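The effect of the two access lists can be pictured with a short Python sketch. It only models the dstdomain example above. The function name route and its return values are invented for illustration, and real Squid evaluates full ACLs rather than a simple suffix test.

```python
# Illustrative model of:
#   acl INSIDE dstdomain .mydomain.com
#   always_direct allow INSIDE
#   never_direct allow all
INSIDE_DOMAIN = ".mydomain.com"


def route(host):
    """Return 'direct' or 'parent' for a request to the given host."""
    host = host.lower()
    # Assumption: ".mydomain.com" covers the domain itself and any subdomain.
    inside = host == INSIDE_DOMAIN[1:] or host.endswith(INSIDE_DOMAIN)
    if inside:
        return "direct"  # always_direct allow INSIDE
    return "parent"      # never_direct allow all


print(route("intranet.mydomain.com"))
print(route("www.squid-cache.org"))
```

Anything inside the local domain is fetched directly; everything else has to go through the parent outside the firewall.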
If you use never_direct and you have multiple parent caches,
then you probably will want to mark one of them as a default
choice in case Squid can't decide which one to use. That is
done with the default keyword on a cache_peer
line. For example:
cache_peer xyz.mydomain.com parent 3128 0 default
Note: The information here is current for version 2.2.
First, you need to give Squid a parent cache. Second, you need
to tell Squid it cannot connect directly to origin servers. This is done
with three configuration file lines:
cache_peer parentcache.foo.com parent 3128 0 no-query default
acl all src 0.0.0.0/0.0.0.0
never_direct allow all
Note, with this configuration, if the parent cache fails or becomes
unreachable, then every request will result in an error message.
In case you want to be able to use direct connections when all the
parents go down you should use a different approach:
cache_peer parentcache.foo.com parent 3128 0 no-query
prefer_direct off
The default behaviour of Squid in the absence of positive ICP, HTCP, etc
replies is to connect to the origin server instead of using parents.
The prefer_direct off directive tells Squid to try parents first.
The dnsserver processes are used by squid because the gethostbyname(3) library routines used to
convert web site names to their internet addresses
block until the function returns (i.e., the process that calls
them has to wait for a reply). Since there is only one squid
process, everyone who uses the cache would have to wait each
time the routine was called. This is why the dnsserver is
a separate process, so that these processes can block,
without causing blocking in squid.
It's very important that there are enough dnsserver
processes to cope with every access you will need, otherwise
squid will stop occasionally. A good rule of thumb is to
make sure you have at least the maximum number of dnsservers
squid has ever needed on your system,
and probably add two to be on the safe side. In other words, if
you have only ever seen at most three dnsserver processes
in use, make at least five. Remember that a dnsserver is
small and, if unused, will be swapped out.
First, find out if you have enough dnsserver processes running by
looking at the Cachemanager dns output. Ideally, you should see
that the first dnsserver handles a lot of requests, the second one
less than the first, etc. The last dnsserver should have serviced
relatively few requests. If there is not an obvious decreasing trend, then
you need to increase the number of dns_children in the configuration
file. If the last dnsserver has zero requests, then you definitely
have enough.
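The check described above can be expressed as a small Python sketch. The request counts are made-up numbers, not real Cachemanager output, and the function name and the 1% threshold for "relatively few" are assumptions chosen for this example.

```python
def enough_dnsservers(requests_per_server, few_fraction=0.01):
    """Guess whether enough dnsserver processes are configured.

    requests_per_server lists the request count of each dnsserver,
    in the order shown by the Cachemanager dns output.
    """
    total = sum(requests_per_server)
    if total == 0:
        return True  # nothing has been looked up yet
    last = requests_per_server[-1]
    # The last dnsserver should have serviced relatively few requests.
    return last <= total * few_fraction


# Hypothetical counts for five dnsservers.
print(enough_dnsservers([9500, 420, 60, 15, 0]))           # last one idle
print(enough_dnsservers([2100, 2000, 1950, 1900, 2050]))   # no decreasing trend
```

When the function returns False, the advice above applies: increase dns_children and look at the counts again later.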
Another factor which affects the dnsserver service time is the
proximity of your DNS resolver. Normally we do not recommend running
Squid and named on the same host. Instead you should try to use a
DNS resolver (named) on a different host, but on the same LAN.
If your DNS traffic must pass through one or more routers, this could
be causing unnecessary delays.
Before you run the configure script, simply set the CACHE_HTTP_PORT
environment variable.
setenv CACHE_HTTP_PORT 8080
./configure
make
make install
With Squid-1.1 it is NOT possible. Each cache_dir is assumed
to be the same size. The cache_swap setting defines the size of
all cache_dir's taken together. If you have N cache_dir's
then each one will hold cache_swap ÷ N Megabytes.
Most people have a disk partition dedicated to the Squid cache.
You don't want to use the entire partition size. You have to leave
some extra room. Currently, Squid is not very tolerant of running
out of disk space.
Let's say you have a 9GB disk.
Remember that disk manufacturers lie about the space available.
A so-called 9GB disk usually results in about 8.5GB of raw, usable space.
First, put a filesystem on it, and mount
it. Then check the ``available space'' with your df program.
Note that you lose some disk space to filesystem overheads, like superblocks,
inodes, and directory entries. Also note that Unix normally keeps
10% free for itself. So with a 9GB disk, you're probably down to
about 8GB after formatting.
Next, I suggest taking off another 10%
or so for Squid overheads, and a "safe buffer." Squid normally puts
its swap.state files in each cache directory. These grow in size
until you rotate the logs, or restart squid.
Also note that Squid performs better when there is
more free space. So if performance is important to you, then take off
even more space. Typically, for a 9GB disk, I recommend a cache_dir
setting of 6000 to 7500 Megabytes:
cache_dir ... 7000 16 256
It's better to start out conservative. After the cache becomes full,
look at the disk usage. If you think there is plenty of unused space,
then increase the cache_dir setting a little.
If you're getting ``disk full'' write errors, then you definitely need
to decrease your cache size.
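The arithmetic above can be written out as a small Python sketch. The 10% figure is the rough allowance for Squid overheads mentioned in the text, not an exact value, and the function name is made up for this example.

```python
def suggested_cache_dir_mb(available_mb, overhead_percent=10):
    """Estimate a conservative cache_dir size in megabytes.

    available_mb is the "available space" reported by df after the
    filesystem has been created and mounted.
    """
    usable = available_mb * (100 - overhead_percent) // 100
    # Round down to a multiple of 100 MB to stay on the conservative side.
    return usable // 100 * 100


# A "9GB" disk with roughly 8000 MB available after formatting.
print(suggested_cache_dir_mb(8000))
```

For the 9GB disk discussed above this lands inside the recommended 6000 to 7500 Megabyte range. Pass a larger overhead_percent if performance matters more than capacity.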
With Squid-1.1, yes, you will lose your cache. This is because
version 1.1 uses a simplistic algorithm to distribute files
between cache directories.
With Squid-2, you will not lose your existing cache.
You can add and delete cache_dir's without affecting
any of the others.
Several people on both the fwtk-users and the
squid-users mailing lists have asked
about using Squid in combination with http-gw from the
TIS toolkit.
The most elegant way in my opinion is to run an internal Squid caching
proxyserver which handles client requests and let this server forward
its requests to the http-gw running on the firewall. Cache hits won't
need to be handled by the firewall.
In this example Squid runs on the same server as the http-gw, Squid uses
8000 and http-gw uses 8080 (web). The local domain is home.nl.
Firewall configuration:
Either run http-gw as a daemon from the /etc/rc.d/rc.local (Linux
Slackware):
exec /usr/local/fwtk/http-gw -daemon 8080
or run it from inetd like this:
web stream tcp nowait.100 root /usr/local/fwtk/http-gw http-gw
I increased the watermark to 100 because a lot of people run into
problems with the default value.
Make sure you have at least the following line in
/usr/local/etc/netperm-table:
http-gw: hosts 127.0.0.1
You could add the IP-address of your own workstation to this rule and
make sure the http-gw by itself works, like:
http-gw: hosts 127.0.0.1 10.0.0.1
Squid configuration:
The following settings are important:
http_port 8000
icp_port 0
cache_peer localhost.home.nl parent 8080 0 default
acl HOME dstdomain .home.nl
always_direct allow HOME
never_direct allow all
This tells Squid to use the parent for all domains other than home.nl.
Below, access.log entries show what happens if you do a reload on the
Squid-homepage:
872739961.631 1566 10.0.0.21 ERR_CLIENT_ABORT/304 83 GET http://www.squid-cache.org/ - DEFAULT_PARENT/localhost.home.nl -
872739962.976 1266 10.0.0.21 TCP_CLIENT_REFRESH/304 88 GET http://www.nlanr.net/Images/cache_now.gif - DEFAULT_PARENT/localhost.home.nl -
872739963.007 1299 10.0.0.21 ERR_CLIENT_ABORT/304 83 GET http://www.squid-cache.org/Icons/squidnow.gif - DEFAULT_PARENT/localhost.home.nl -
872739963.061 1354 10.0.0.21 TCP_CLIENT_REFRESH/304 83 GET http://www.squid-cache.org/Icons/Squidlogo2.gif - DEFAULT_PARENT/localhost.home.nl
http-gw entries in syslog:
Aug 28 02:46:00 memo http-gw[2052]: permit host=localhost/127.0.0.1 use of gateway (V2.0beta)
Aug 28 02:46:00 memo http-gw[2052]: log host=localhost/127.0.0.1 protocol=HTTP cmd=dir dest=www.squid-cache.org path=/
Aug 28 02:46:01 memo http-gw[2052]: exit host=localhost/127.0.0.1 cmds=1 in=0 out=0 user=unauth duration=1
Aug 28 02:46:01 memo http-gw[2053]: permit host=localhost/127.0.0.1 use of gateway (V2.0beta)
Aug 28 02:46:01 memo http-gw[2053]: log host=localhost/127.0.0.1 protocol=HTTP cmd=get dest=www.squid-cache.org path=/Icons/Squidlogo2.gif
Aug 28 02:46:01 memo http-gw[2054]: permit host=localhost/127.0.0.1 use of gateway (V2.0beta)
Aug 28 02:46:01 memo http-gw[2054]: log host=localhost/127.0.0.1 protocol=HTTP cmd=get dest=www.squid-cache.org path=/Icons/squidnow.gif
Aug 28 02:46:01 memo http-gw[2055]: permit host=localhost/127.0.0.1 use of gateway (V2.0beta)
Aug 28 02:46:01 memo http-gw[2055]: log host=localhost/127.0.0.1 protocol=HTTP cmd=get dest=www.nlanr.net path=/Images/cache_now.gif
Aug 28 02:46:02 memo http-gw[2055]: exit host=localhost/127.0.0.1 cmds=1 in=0 out=0 user=unauth duration=1
Aug 28 02:46:03 memo http-gw[2053]: exit host=localhost/127.0.0.1 cmds=1 in=0 out=0 user=unauth duration=2
Aug 28 02:46:04 memo http-gw[2054]: exit host=localhost/127.0.0.1 cmds=1 in=0 out=0 user=unauth duration=3
To summarize:
Advantages:
- http-gw allows you to selectively block ActiveX and Java, and its
primary design goal is security.
- The firewall doesn't need to run large applications like Squid.
- The internal Squid-server still gives you the benefit of caching.
Disadvantages:
- The internal Squid proxyserver can't (and shouldn't) work with other
parent or neighbor caches.
- Initial requests are slower because they go through http-gw, and http-gw
also does reverse lookups. Run a nameserver on the firewall or use an
internal nameserver.
--
Rodney van den Oever
When a proxy-cache is used, a server does not see the connection
coming from the originating client. Many people like to implement
access controls based on the client address.
To accommodate these people, Squid adds its own request header
called "X-Forwarded-For" which looks like this:
X-Forwarded-For: 128.138.243.150, unknown, 192.52.106.30
Entries are always IP addresses, or the word unknown if the address
could not be determined or if it has been disabled with the
forwarded_for configuration option.
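For readers who want to inspect this header on the server side, here is a minimal Python sketch that splits the value into its entries. The function name is hypothetical, and since the header is supplied by the client side, the result should be treated as untrusted input.

```python
def parse_x_forwarded_for(value):
    """Split an X-Forwarded-For header value into its entries.

    Entries are IP address strings, or None where the header says "unknown".
    """
    entries = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        entries.append(None if item.lower() == "unknown" else item)
    return entries


print(parse_x_forwarded_for("128.138.243.150, unknown, 192.52.106.30"))
```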
We must note that access controls based on this header are extremely
weak and simple to fake. Anyone may hand-enter a request with any IP
address whatsoever. This is perhaps the reason why client IP addresses
have been omitted from the HTTP/1.1 specification.
Because of the weakness of this header, support for access controls based
on X-Forwarded-For is not yet available in any officially released version of
squid. However, unofficial patches are available from the
follow_xff
Squid development project and may be integrated into later versions of Squid
once a suitable trust model has been developed.
Yes it can; however, the way of doing it has changed from earlier versions
of squid. As of squid-2.2 a more customisable method has been introduced.
Please follow the instructions for the version of squid that you are using.
As a default, no anonymizing is done.
If you choose to use the anonymizer you might wish to investigate the forwarded_for
option to prevent the client address being disclosed. Failure to turn off the
forwarded_for option will reduce the effectiveness of the anonymizer. Finally,
if you filter the User-Agent header, using the fake_user_agent option can
prevent some user problems, as some sites require the User-Agent header.
Squid 2.2
With the introduction of squid 2.2 the anonymizer has become more customisable.
It now allows specification of exactly which headers will be allowed to pass.
This is further extended in Squid-2.5 to allow headers to be anonymized conditionally.
For details see the documentation of the http_header_access and header_replace
directives in squid.conf.default.
References:
Anonymous WWW
Sure, just use the always_direct access list.
For example, if you want Squid to connect directly to hotmail.com servers,
you can use these lines in your config file:
acl hotmail dstdomain .hotmail.com
always_direct allow hotmail
Sure, there are a few things you can do.
You can use the no_cache access list to make Squid never cache any response:
acl all src 0/0
no_cache deny all
With Squid-2.4 and later you can use the ``null'' storage module to avoid
having a cache directory:
cache_dir null /tmp
Note: a null cache_dir does not disable caching, but it does save you from
creating a cache structure if you have disabled caching with no_cache.
Note: the directory (e.g., /tmp) must exist so that squid
can chdir to it, unless you also use the coredump_dir option.
To configure Squid for the ``null'' storage module, specify it
on the configure command line:
./configure --enable-storeio=ufs,null ...
You can set the global reply_body_max_size parameter. This option
controls the largest HTTP message body that will be sent to a cache
client for one request.
If the HTTP response coming from the server has a Content-length
header, then Squid compares the content-length value to the
reply_body_max_size value. If the content-length is larger,
the server connection is closed and the user receives an error
message from Squid.
Some responses don't have Content-length
headers. In this case, Squid counts how many bytes are written
to the client. Once the limit is reached, the client's connection
is simply closed.
Note that ``creative'' user-agents will still be able to download
really large files through the cache using HTTP/1.1 range requests.
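The two cases described above (with and without a Content-length header) can be modelled in a few lines of Python. This is a sketch of the described behaviour, not Squid's implementation: the function name, the return values, and the exact number of bytes delivered before the connection is cut are assumptions.

```python
def check_reply(content_length, body_chunks, max_size):
    """Model the size check for one reply.

    content_length is an int, or None when the header is absent.
    Returns ("error", 0) when the reply is refused up front,
    ("closed", n) when the connection is cut after n bytes,
    or ("ok", n) when the whole body of n bytes is delivered.
    """
    if content_length is not None and content_length > max_size:
        return ("error", 0)
    sent = 0
    for chunk in body_chunks:
        if sent + len(chunk) > max_size:
            # Assumed: deliver up to the limit, then close the connection.
            return ("closed", max_size)
        sent += len(chunk)
    return ("ok", sent)


# Hypothetical 1000-byte limit.
print(check_reply(5000, [], 1000))
print(check_reply(None, [b"x" * 600, b"x" * 600], 1000))
```

The first call is refused immediately because the declared length is known; the second is only cut off once the byte counter passes the limit.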
Most web browsers available today support proxying and are easily configured
to use a Squid server as a proxy. Some browsers support advanced features
such as lists of domains or URL patterns that shouldn't be fetched through
the proxy, or JavaScript automatic proxy configuration.
Select Network Preferences from the
Options menu. On the Proxies
page, click the radio button next to Manual Proxy
Configuration and then click on the View
button. For each protocol that your Squid server supports (by default,
HTTP, FTP, and gopher) enter the Squid server's hostname or IP address
and put the HTTP port number for the Squid server (by default, 3128) in
the Port column. For any protocols that your Squid
does not support, leave the fields blank.
Here is a
screen shot of the Netscape Navigator manual proxy
configuration screen.
Netscape Navigator's proxy configuration can be automated with
JavaScript (for Navigator versions 2.0 or higher). Select
Network Preferences from the Options
menu. On the Proxies page, click the radio button
next to Automatic Proxy Configuration and then
fill in the URL for your JavaScript proxy configuration file in the
text box. The box is too small, but the text will scroll to the
right as you go.
Here is a
screen shot
of the Netscape Navigator automatic proxy configuration screen.
You may also wish to consult Netscape's documentation for the Navigator
JavaScript proxy configuration
Here is a sample auto configuration JavaScript from Oskar Pearson:
//We (www.is.co.za) run a central cache for our customers that they
//access through a firewall - thus if they want to connect to their intranet
//system (or anything in their domain at all) they have to connect
//directly - hence all the "fiddling" to see if they are trying to connect
//to their local domain.
//Replace each occurrence of company.com with your domain name
//and if you have some kind of intranet system, make sure
//that you put its name in place of "internal" below.
//We also assume that your cache is called "cache.company.com", and
//that it runs on port 8080. Change it down at the bottom.
//(C) Oskar Pearson and the Internet Solution (http://www.is.co.za)
function FindProxyForURL(url, host)
{
    //If they have only specified a hostname, go directly.
    if (isPlainHostName(host))
        return "DIRECT";
    //These connect directly if the machine they are trying to
    //connect to starts with "intranet" - ie http://intranet
    //Connect directly if it is intranet.*
    //If you have another machine that you want them to
    //access directly, replace "internal*" with that
    //machine's name
    if (shExpMatch(host, "intranet*") ||
        shExpMatch(host, "internal*"))
        return "DIRECT";
    //Connect directly to our domains (NB for Important News)
    if (dnsDomainIs(host, "company.com") ||
        //If you have another domain that you wish to connect to
        //directly, put it in here
        dnsDomainIs(host, "sistercompany.com"))
        return "DIRECT";
    //So the error message "no such host" will appear through the
    //normal Netscape box - less support queries :)
    if (!isResolvable(host))
        return "DIRECT";
    //We only cache http, ftp and gopher
    if (url.substring(0, 5) == "http:" ||
        url.substring(0, 4) == "ftp:" ||
        url.substring(0, 7) == "gopher:")
        //Change the ":8080" to the port that your cache
        //runs on, and "cache.company.com" to the machine that
        //you run the cache on
        return "PROXY cache.company.com:8080; DIRECT";
    //We don't cache WAIS
    if (url.substring(0, 5) == "wais:")
        return "DIRECT";
    else
        return "DIRECT";
}
For Mosaic and Lynx, you can set environment variables
before starting the application. For example (assuming csh or tcsh):
% setenv http_proxy http://mycache.example.com:3128/
% setenv gopher_proxy http://mycache.example.com:3128/
% setenv ftp_proxy http://mycache.example.com:3128/
For Lynx you can also edit the lynx.cfg file to configure
proxy usage. This has the added benefit of causing all Lynx users on
a system to access the proxy without making environment variable changes
for each user. For example:
http_proxy:http://mycache.example.com:3128/
ftp_proxy:http://mycache.example.com:3128/
gopher_proxy:http://mycache.example.com:3128/
There's one nasty side-effect to using auto-proxy scripts: if you start
the web browser it will try and load the auto-proxy-script.
If your script isn't available either because the web server hosting the
script is down or your workstation can't reach the web server (e.g.
because you're working off-line with your notebook and just want to
read a previously saved HTML-file) you'll get different errors depending
on the browser you use.
The Netscape browser will just return an error after a timeout (after
that it tries to find the site 'www.proxy.com' if the script you use is
called 'proxy.pac').
Microsoft Internet Explorer, on the other hand, won't even start: no
window is displayed, and only after about one minute does it show a window asking
whether you want to go on with or without proxy configuration.
The point is that your workstations always need to locate the
proxy-script. I created some extra redundancy by hosting the script on
two web servers (actually Apache web servers on the proxy servers
themselves) and adding the following records to my primary nameserver:
proxy CNAME proxy1
CNAME proxy2
The clients just refer to 'http://proxy/proxy.pac'. This script looks like this:
function FindProxyForURL(url,host)
{
    // Hostname without domainname or host within our own domain?
    // Try them directly:
    // http://www.domain.com actually lives before the firewall, so
    // make an exception:
    if ((isPlainHostName(host) || dnsDomainIs(host, ".domain.com")) &&
        !localHostOrDomainIs(host, "www.domain.com"))
        return "DIRECT";
    // First try proxy1 then proxy2. One server mostly caches '.com'
    // to make sure both servers are not
    // caching the same data in the normal situation. The other
    // server caches the other domains normally.
    // If one of 'm is down the client will try the other server.
    else if (shExpMatch(host, "*.com"))
        return "PROXY proxy1.domain.com:8080; PROXY proxy2.domain.com:8081; DIRECT";
    return "PROXY proxy2.domain.com:8081; PROXY proxy1.domain.com:8080; DIRECT";
}
I made sure every client domain has the appropriate 'proxy' entry.
The clients are automatically configured with two nameservers using
DHCP.
--
Rodney van den Oever
The
Sharp Super Proxy Script page
contains a lot of good information about hash-based proxy auto-configuration
scripts. With these you can distribute the load between a number
of caching proxies.
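To give a feel for the idea, here is a generic Python sketch of hash-based selection. It is not the algorithm used by the Super Proxy Script. The proxy names are borrowed from the sample script above, and the use of CRC-32 is simply an assumption that keeps the example short.

```python
import zlib

PROXIES = ["proxy1.domain.com:8080", "proxy2.domain.com:8081"]


def proxy_order(host, proxies=PROXIES):
    """Return a PAC-style result string with a stable, host-dependent order."""
    # crc32 is deterministic, so a given host always maps to the same proxy.
    start = zlib.crc32(host.lower().encode("ascii", "ignore")) % len(proxies)
    ordered = proxies[start:] + proxies[:start]
    return "; ".join("PROXY " + p for p in ordered) + "; DIRECT"


print(proxy_order("www.squid-cache.org"))
```

Because every client computes the same hash for the same host, each proxy ends up caching its own share of sites, and the remaining proxies still serve as fallbacks.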
Select Options from the View
menu. Click on the Connection tab. Tick the
Connect through Proxy Server option and hit the
Proxy Settings button. For each protocol that
your Squid server supports (by default, HTTP, FTP, and gopher)
enter the Squid server's hostname or IP address and put the HTTP
port number for the Squid server (by default, 3128) in the
Port column. For any protocols that your Squid
does not support, leave the fields blank.
Here is a
screen shot of the Internet Explorer proxy
configuration screen.
Microsoft is also starting to support Netscape-style JavaScript
automated proxy configuration. As of now, only MSIE version 3.0a
for Windows 3.1 and Windows NT 3.51 supports this feature (i.e.,
as of version 3.01 build 1225 for Windows 95 and NT 4.0, the feature
was not included).
If you have a version of MSIE that does have this feature, select
Options from the View menu.
Click on the Advanced tab. In the lower left-hand
corner, click on the Automatic Configuration
button. Fill in the URL for your JavaScript file in the dialog
box it presents you. Then exit MSIE and restart it for the changes
to take effect. MSIE will reload the JavaScript file every time
it starts.
Netmanage WebSurfer supports manual proxy configuration and exclusion
lists for hosts or domains that should not be fetched via proxy
(this information is current as of WebSurfer 5.0). Select
Preferences from the Settings
menu. Click on the Proxies tab. Select the
Use Proxy options for HTTP, FTP, and gopher. For
each protocol, enter the Squid server's hostname or IP address
and put the HTTP port number for the Squid server (by default,
3128) in the Port boxes. For any protocols that
your Squid does not support, leave the fields blank.
Take a look at this
screen shot
if the instructions confused you.
On the same configuration window, you'll find a button to bring up
the exclusion list dialog box, which will let you enter some hosts
or domains that you don't want fetched via proxy. It should be
self-explanatory, but you might look at this
screen shot
just for fun anyway.
Select Proxy Servers... from the Preferences menu. Check each
protocol that your Squid server supports (by default, HTTP, FTP, and
Gopher) and enter the Squid server's address as hostname:port (e.g.
mycache.example.com:3128 or 123.45.67.89:3128). Click on Okay to accept the
setup.
Notes:
- Opera 2.12 doesn't support gopher on its own, but requires a proxy; therefore
Squid's gopher proxying can extend the utility of your Opera immensely.
- Unfortunately, Opera 2.12 chokes on some HTTP requests, for example
abuse.net.
At the moment I think it has something to do with cookies. If you have
trouble with a site, try disabling the HTTP proxying by unchecking
that protocol in the Preferences|Proxy Servers... dialogue. Opera will
remember the address, so reenabling is easy.
--
Hume Smith
Insert your username in the host part of the URL, for example:
ftp://joecool@ftp.foo.org/
Squid should then prompt you for your account password. Alternatively,
you can specify both your username and password in the URL itself:
ftp://joecool:secret@ftp.foo.org/
However, we certainly do not recommend this, as it could be very
easy for someone to see or grab your password.
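To see which parts of such a URL carry the login details, here is a small Python sketch using the standard urllib.parse module. The helper name ftp_credentials is made up for this example.

```python
from urllib.parse import urlsplit


def ftp_credentials(url):
    """Return (username, password, hostname) from an FTP URL."""
    parts = urlsplit(url)
    return parts.username, parts.password, parts.hostname


print(ftp_credentials("ftp://joecool@ftp.foo.org/"))
print(ftp_credentials("ftp://joecool:secret@ftp.foo.org/"))
```

In the first form the password is absent, which is why you are prompted for it. In the second form it travels in clear text as part of the URL.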
by
Mark Reynolds
You may like to start by reading the
Expired Internet-Draft
that describes WPAD.
After reading the 8 steps below, if you don't understand any of the
terms or methods mentioned, you probably shouldn't be doing this.
Implementing wpad requires you to fully understand:
- web server installations and modifications.
- squid proxy server (or others) installation etc.
- Domain Name System maintenance etc.
Please don't bombard the squid list with web server or dns questions. See
your system administrator, or do some more research on those topics.
This is not a recommendation for any product or version. As far as I
know IE5 is the only browser out now implementing wpad. I think wpad
is an excellent feature that will return several hours of life per month.
Hopefully, all browser clients will implement it as well. But it will take
years for all the older browsers to fade away.
I have only focused on the domain name method, to the exclusion of the
DHCP method. I think the dns method might be easier for most people.
I don't currently, and may never, fully understand wpad and IE5, but this
method worked for me. It may work for you.
But if you'd rather just have a go ...
1. Create a standard
netscape auto proxy config file. The sample provided there is more than
adequate to get you going. No doubt all the other load balancing
and backup scripts will be fine also.
2. Store the resultant file in the document root directory of a
handy web server as wpad.dat (not proxy.pac as you
may have previously done).
Andrei Ivanov
notes that you should be able to use an HTTP redirect if you
want to store the wpad.dat file somewhere else. You can probably
even redirect wpad.dat to proxy.pac:
Redirect /wpad.dat http://racoon.riga.lv/proxy.pac
3. If you do nothing more, a URL like
http://www.your.domain.name/wpad.dat should bring up
the script text in your browser window.
4. Insert the following entry into your web server mime.types file,
in addition to your pac file type if you've done this before:
application/x-ns-proxy-autoconfig dat
Then restart your web server for the new mime type to take effect.
5. Assuming Internet Explorer 5, under Tools, Internet
Options, Connections, Settings or LAN
Settings, set ONLY Use Automatic Configuration Script
to be the URL for where your new wpad.dat file can be found,
i.e.
http://www.your.domain.name/wpad.dat. Test that
it all works as per your script and network. There's no point
continuing until this works ...
6. Create/install/implement a DNS record so that
wpad.your.domain.name resolves to the host above where
you have a functioning auto config script running. You should
now be able to use http://wpad.your.domain.name/wpad.dat
as the Auto Config Script location in step 5 above.
7. And finally, go back to the setup screen detailed in step 5 above,
and choose nothing but the Automatically Detect Settings
option, turning everything else off. Best to restart IE5, as
you normally do with any Microsoft product... And it should all
work. Did for me anyway.
8. One final question might be 'Which domain name does the client
(IE5) use for the wpad... lookup?' It uses the hostname from
the control panel setting. It starts the search by adding the
hostname "WPAD" to the current fully-qualified domain name. For
instance, a client in a.b.Microsoft.com would search for a WPAD
server at wpad.a.b.microsoft.com. If it could not locate one,
it would remove the bottom-most domain and try again; for
instance, it would try wpad.b.microsoft.com next. IE 5 would
stop searching when it found a WPAD server or reached the
third-level domain, wpad.microsoft.com.
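The search order described in the last step can be sketched in Python as follows. This only reproduces the sequence of names given in the text. The function name and the min_labels parameter are assumptions for illustration, and the sketch says nothing about how a particular browser version really behaves.

```python
def wpad_candidates(client_domain, min_labels=2):
    """List the WPAD hostnames a client would try, most specific first."""
    labels = client_domain.lower().strip(".").split(".")
    candidates = []
    while len(labels) >= min_labels:
        candidates.append("wpad." + ".".join(labels))
        labels = labels[1:]  # drop the bottom-most domain and try again
    return candidates


for name in wpad_candidates("a.b.Microsoft.com"):
    print("http://" + name + "/wpad.dat")
```

This is handy for working out which DNS records you need to create so that clients in every subdomain find your wpad.dat file.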
Anybody using these steps to install and test, please feel free to make
notes, corrections or additions for improvements, and post back to the
squid list...
There are probably many more tricks and tips which hopefully will be
detailed here in the future. Things like wpad.dat files being served
from the proxy server themselves, maybe with a round robin dns setup
for the WPAD host.
You can also use DHCP to configure browsers for WPAD.
This technique allows you to set any URL as the PAC
URL. For ISC DHCPD, enter a line like this in your
dhcpd.conf file:
option wpad code 252 = text;
option wpad "http://www.example.com/proxy.pac";
Replace the hostname with the name or address of your
own server.
Ilja Pavkovic notes that the DHCP mode does not work reliably with
every version of Internet Explorer. The DNS name method to find
wpad.dat is more reliable.
Another user adds that IE 6.01 seems to strip the last character
from the URL. By adding a trailing newline, he is able to make
it work with both IE 5.0 and 6.0:
option wpad "http://www.example.com/proxy.pac\n";
by
Reuben Farrelly
There was a bug in the 5.0x releases of Internet Explorer in which IE
cropped any trailing slash off an FTP URL. The URL showed up correctly in
the browser's ``Address:'' field, however squid logs show that the trailing
slash was being taken off.
An example of where this impacted squid was a setup where squid
would go direct for FTP directory listings but forward a request to a
parent for FTP file transfers. This was useful if your upstream proxy was
an older version of Squid or another vendor's software which displayed
directory listings with broken icons and you wanted your own local version
of squid to generate proper FTP directory listings instead.
The workaround for this is to add a double slash to any directory listing
in which the slash was important, or else upgrade to IE 5.5. (Or use Netscape)
When using authentication with Internet Explorer 6 SP1, you may
encounter issues when you first launch Internet Explorer.
The problem shows itself when you first authenticate: you will
receive a "Page Cannot Be Displayed" error. However, if you click
refresh, the page will be correctly displayed.
This only happens immediately after you authenticate.
This is not a Squid error or bug. Microsoft broke the Basic
Authentication when they put out IE6 SP1.
There is a knowledgebase article
(KB 331906)
regarding this issue, which contains a link to a downloadable
"hot fix." They do warn that this code is not "regression tested"
but so far there have not been any reports of this breaking anything
else. The problematic file is wininet.dll. Please note that this
hotfix is included in the latest security update.
Lloyd Parkes notes that the article references another article,
KB 312176.
He says that you must not have the registry entry that KB
312176 encourages users to add to their registry.
According to Joao Coutinho, this simple solution also corrects the problem:
- Go to Tools/Internet Options
- Go to the Advanced tab
- UNSELECT "Show friendly HTTP error messages" under Browsing.
Another possible workaround to these problems is to make the
ERR_CACHE_ACCESS_DENIED error page larger than 1460 bytes. This should trigger
IE to handle the authentication in a slightly different manner.
The logs are a valuable source of information about Squid workloads and
performance. The logs record not only access information, but also system
configuration errors and resource consumption (e.g., memory, disk
space). There are several log files maintained by Squid. Some have to be
explicitly activated during compile time, others can safely be deactivated
during run-time.
There are a few basic points common to all log files. The time stamps
logged into the log files are usually UTC seconds unless stated otherwise.
The initial time stamp usually contains a millisecond extension.
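As a quick aid for reading these time stamps, here is a minimal Python sketch that converts one into a UTC date and time. The sample value is taken from the access.log entries shown earlier in this FAQ, and the function name is made up for this example.

```python
from datetime import datetime, timedelta, timezone


def parse_log_time(stamp):
    """Convert a 'seconds.milliseconds' log time stamp to a UTC datetime."""
    seconds, _, millis = stamp.partition(".")
    base = datetime.fromtimestamp(int(seconds), tz=timezone.utc)
    return base + timedelta(milliseconds=int(millis or 0))


print(parse_log_time("872739961.631").isoformat())
```

The seconds and the millisecond extension are parsed separately as integers so that no precision is lost to floating point rounding.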
If you run your Squid from the RunCache script, a file
squid.out contains the Squid startup times, and also all fatal
errors, e.g. as produced by an assert() failure. If you are not
using RunCache, you will not see such a file.
The cache.log file contains the debug and error messages that Squid
generates. If you start your Squid using the default RunCache script,
or start it with the -s command line option, a copy of certain
messages will go into your syslog facilities. It is a matter of personal
preferences to use a separate file for the squid log data.
For automatic log file analysis, the cache.log file does not have much
to offer. You will usually look into this file for automated error
reports, when programming Squid, testing new features, or searching for
the cause of a perceived misbehaviour.
The user agent log file is only maintained if
- you configured the compile time --enable-useragent-log
option, and
- you pointed the useragent_log configuration option to a
file.
From the user agent log file you can find out the distribution of
browsers among your clients. Enabling this option on a heavily loaded
production Squid might not be the best of ideas.
The store.log file covers the objects currently kept on disk or
removed ones. As a kind of transaction log it is usually used for
debugging purposes. A definitive statement on whether an object resides
on your disks is only possible after analysing the complete log file.
The release (deletion) of an object may be logged at a later time than the
swap out (save to disk).
The store.log file may be of interest to log file analysis which
looks into the objects on your disks and the time they spend there, or how
many times a hot object was accessed. The latter may be covered by another
log file, too. With knowledge of the cache_dir configuration option,
this log file allows for a URL to filename mapping without recursing your
cache disks. However, the Squid developers recommend treating
store.log primarily as a debug file, and so should you, unless you
know what you are doing.
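The file number to filename mapping mentioned above might be sketched as
follows. This assumes the "ufs" storage layout of Squid-2, where the two
directory levels are derived from the file number and the L1 and L2
values of the cache_dir line (16 and 256 are the usual defaults). The
function name and the cache path are hypothetical; check the formula
against the source of your Squid version before relying on it.

```python
def ufs_swap_path(cache_dir: str, file_number: int,
                  l1: int = 16, l2: int = 256) -> str:
    """Build the on-disk path for a file number (assumed ufs layout)."""
    first = (file_number // l2 // l2) % l1   # first-level directory
    second = (file_number // l2) % l2        # second-level directory
    return "%s/%02X/%02X/%08X" % (cache_dir, first, second, file_number)


# store.log prints the file number in hexadecimal, so convert it first.
# The cache path below is an assumption made for this example.
path = ufs_swap_path("/usr/local/squid/cache", int("00012345", 16))
print(path)  # /usr/local/squid/cache/01/23/00012345
```

A file number of FFFFFFFF has no file on disk and should be skipped
before calling such a function.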
The print format for a store log entry (one line) consists of twelve
space-separated columns; compare with the storeLog() function in file
src/store_log.c:
"%9d.%03d %-7s %02d %08X %4d %9d %9d %9d %s %d/%d %s %s\n"
- time
The timestamp when the line was logged in UTC with a millisecond fraction.
- action
The action applied to the object, compare with src/store_log.c:
- CREATE Seems to be unused.
- RELEASE The object was removed from the cache (see also
file number).
- SWAPOUT The object was saved to disk.
- SWAPIN The object existed on disk and was read into memory.
- dir number
The cache_dir number this object was stored into, starting at 0 for your first
cache_dir line.
- file number
The file number for the object storage file. Please note that the path to
this file is calculated according to your cache_dir configuration.
A file number of FFFFFFFF denotes "memory only" objects. Any
action code for such a file number refers to an object which existed only
in memory, not on disk. For instance, if a RELEASE code was logged
with file number FFFFFFFF, the object existed only in memory, and was
released from memory.
- status
The HTTP reply status code.
- datehdr
The value of the HTTP "Date: " reply header.
- lastmod
The value of the HTTP "Last-Modified: " reply header.
- expires
The value of the HTTP "Expires: " reply header.
- type
The HTTP "Content-Type" major value, or "unknown" if it cannot be
determined.
- sizes
This column consists of two slash separated fields:
- The advertised content length from the HTTP "Content-Length: " reply
header.
- The size actually read.
If the advertised (or expected) length is missing, it will be set to
zero. If the advertised length is not zero, but not equal to the real
length, the object will be released from the cache.
- method
The request method for the object, e.g. GET.
- key
The key to the object, usually the URL.
The columns datehdr through expires are all expressed in UTC seconds.
The actual values are parsed from the HTTP reply headers. An unparsable
header is represented by a value of -1, and a missing header is
represented by a value of -2.
The key column usually contains just the URL of the object. Some
objects, though, will never become public. For these objects the key is
said to include a unique integer number and the request method in
addition to the URL.
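The column layout described above can be read with a few lines of
Python. This is only a sketch: the class and function names are
assumptions, and the sample line is made up to match the print format,
not taken from a real log.

```python
from typing import NamedTuple

MEMORY_ONLY = 0xFFFFFFFF  # file number of "memory only" objects


class StoreLogEntry(NamedTuple):
    time: float
    action: str
    dir_number: int
    file_number: int
    status: int
    datehdr: int   # UTC seconds, -1 unparsable, -2 missing
    lastmod: int
    expires: int
    content_type: str
    advertised_length: int
    actual_length: int
    method: str
    key: str


def parse_store_log_line(line: str) -> StoreLogEntry:
    # Columns are padded with blanks, so split on runs of whitespace.
    # The key is kept whole as the last column.
    cols = line.split(None, 11)
    if len(cols) != 12:
        raise ValueError("expected 12 columns, got %d" % len(cols))
    advertised, _, actual = cols[9].partition("/")
    return StoreLogEntry(
        time=float(cols[0]),
        action=cols[1],
        dir_number=int(cols[2]),
        file_number=int(cols[3], 16),  # printed in hexadecimal
        status=int(cols[4]),
        datehdr=int(cols[5]),
        lastmod=int(cols[6]),
        expires=int(cols[7]),
        content_type=cols[8],
        advertised_length=int(advertised),
        actual_length=int(actual),
        method=cols[10],
        key=cols[11].rstrip(),
    )


# Hypothetical sample line for illustration only.
sample = ("1000000000.123 SWAPOUT 00 0000002A  200 1000000000 "
          " 999990000        -2 text/html 1024/1024 GET "
          "http://www.example.com/\n")
entry = parse_store_log_line(sample)
print(entry.action, entry.file_number, entry.expires)
```

Comparing entry.file_number with MEMORY_ONLY tells you whether the
logged action refers to an object that never reached the disk.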
The hierarchy.log file exists for Squid-1.0 only. The format is
[date] URL peerstatus peerhost
Most log file analysis programs are based on the entries in
access.log. Currently, there are two file formats possible for the log
file, depending on your configuration for the emulate_httpd_log
option. By default, Squid will log in its native log file format. If the
above option is enabled, Squid will log in the common log file format as
defined by the CERN web daemon.
The common log file format contains different, and less, information
than the native log file format. The native format contains more
information for the admin interested in cache evaluation.
The common log file format
The Common Logfile Format is used by numerous HTTP servers. This format
consists of the following seven fields:
remotehost rfc931 authuser [date] "method URL" status bytes
It is parsable by a variety of tools. The common format contains different
information than the native log file format. The HTTP version is logged,
which is not logged in native log file format.
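As an illustration of how easily the common format can be parsed, here
is a small Python sketch based on the seven fields listed above. The
pattern and function names are assumptions, and the sample line is
invented for the example.

```python
import re

# One named group per field of the common log file format.
CLF_PATTERN = re.compile(
    r'(?P<remotehost>\S+) (?P<rfc931>\S+) (?P<authuser>\S+) '
    r'\[(?P<date>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def parse_common_log_line(line: str) -> dict:
    """Return the seven fields of a common log format line as strings."""
    match = CLF_PATTERN.match(line)
    if match is None:
        raise ValueError("not a common log format line: %r" % line)
    return match.groupdict()


# Hypothetical sample line for illustration only.
sample = ('192.0.2.1 - - [09/Sep/2001:01:46:40 +0000] '
          '"GET http://www.example.com/ HTTP/1.0" 200 1024')
fields = parse_common_log_line(sample)
print(fields["remotehost"], fields["status"], fields["request"])
```

The request field is kept as one string here; split it on blanks if you
need the method, the URL and the HTTP version separately.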
The native log file format
The native format is different for different major versions of Squid. For
Squid-1.0 it is:
time elapsed remotehost code/status/peerstatus bytes method URL
For Squid-1.1, the information from the hierarchy.log was moved
into access.log. The format is:
time elapsed remotehost code/status bytes method URL rfc931 peerstatus/peerhost type
For Squid-2 the columns stay the same, though the content within may change
a little.
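The ten columns of the Squid-1.1 and Squid-2 native format can be split
apart as in the following Python sketch. The names are assumptions made
for this example, and the sample line is invented to match the column
list above, not copied from a real access.log.

```python
from typing import NamedTuple


class AccessLogEntry(NamedTuple):
    time: float
    elapsed: int
    remotehost: str
    code: str
    status: int
    num_bytes: int
    method: str
    url: str
    rfc931: str
    peerstatus: str
    peerhost: str
    content_type: str


def parse_native_access_line(line: str) -> AccessLogEntry:
    # Columns may be padded with blanks, so split on runs of whitespace.
    cols = line.split()
    if len(cols) != 10:
        raise ValueError("expected 10 columns, got %d" % len(cols))
    code, _, status = cols[3].partition("/")
    peerstatus, _, peerhost = cols[8].partition("/")
    return AccessLogEntry(
        time=float(cols[0]),
        elapsed=int(cols[1]),
        remotehost=cols[2],
        code=code,
        status=int(status),
        num_bytes=int(cols[4]),
        method=cols[5],
        url=cols[6],
        rfc931=cols[7],
        peerstatus=peerstatus,
        peerhost=peerhost,
        content_type=cols[9],
    )


# Hypothetical sample line for illustration only.
sample = ("1000000000.123    250 192.0.2.1 TCP_MISS/200 1024 GET "
          "http://www.example.com/ - DIRECT/www.example.com text/html")
hit = parse_native_access_line(sample)
print(hit.code, hit.status, hit.peerstatus, hit.peerhost)
```

The two slash-separated columns are split into separate fields so that
the result code and the peer status can be counted on their own.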
The native log file f |