+Privoxy User Manual
-Junkbuster User Manual
-
- By: Junkbuster Developers
-
- $Id: user-manual.sgml,v 1.20 2001/10/24 23:58:25 hal9 Exp $
-
- The user manual gives the users information on how to install and
- configure Internet Junkbuster. Internet Junkbuster is an application
- that provides privacy and security to users of the World Wide Web.
-
- You can find the latest version of the user manual at
- [1]http://ijbswa.sourceforge.net/user-manual/.
-
- Feel free to send a note to the developers at
- <[2]ijbswa-developers@lists.sourceforge.net>.
- _________________________________________________________________
-
- Table of Contents
- 1. [3]Introduction
-
- 1.1. [4]New Features
-
- 2. [5]Installation
-
- 2.1. [6]Source
- 2.2. [7]Red Hat
- 2.3. [8]SuSE
- 2.4. [9]OS/2
- 2.5. [10]Windows
- 2.6. [11]Other
-
- 3. [12]Junkbuster Configuration
-
- 3.1. [13]The Main Configuration File
- 3.2. [14]The Actions File
- 3.3. [15]The Filter File
-
- 4. [16]Quickstart to Using Junkbuster
- 5. [17]Contact the Developers
- 6. [18]Copyright and History
-
- 6.1. [19]License
- 6.2. [20]History
-
- 7. [21]See also
- 8. [22]Appendix
-
- 8.1. [23]Regular Expressions
-
-1. Introduction
+By: Privoxy Developers
+
+$Id: user-manual.sgml,v 1.99 2002/04/28 16:59:05 swa Exp $
+
+The user manual gives users information on how to install, configure and use
+Privoxy.
+
+Privoxy is a web proxy with advanced filtering capabilities for protecting
+privacy, filtering web page content, managing cookies, controlling access, and
+removing ads, banners, pop-ups and other obnoxious Internet junk. Privoxy has a
+very flexible configuration and can be customized to suit individual needs and
+tastes. Privoxy has application for both stand-alone systems and multi-user
+networks.
- Internet Junkbuster is a web proxy with advanced filtering
- capabilities for protecting privacy, filtering web page content,
- managing cookies, controlling access, and removing ads, banners,
- pop-ups and other obnoxious Internet Junk. Junkbuster has a very
- flexible configuration and can be customized to suit individual needs
- and tastes. Internet Junkbuster has application for both stand-alone
- systems and multi-user networks.
-
- This documentation is included with the current development version of
- Internet Junkbuster and is incomplete at this point. The most up to
- date reference for the time being is still the comments in the source
- files and in the individual configuration files. Development of
- version 3.0 is currently underway, and includes many significant
- changes and enhancements over earlier verions. The target release date
- for stable v3.0 is December 2001.
-
- Since this is a development version, some features are in the process
- of being implemented. This documentation may be slightly out of sync
- as a result. And there are bugs, though hopefully not many!
- _________________________________________________________________
-
-1.1. New Features
-
- In addition to Junkbuster's traditional features of ad and banner
- blocking and cookie management, this is a list of new features
- currently under development:
-
- * A browser based configuration utility (WIP at [24]http://i.j.b).
- * Modularized configuration that will allow for system wide
- settings, and individual user settings. (not implemented yet)
- * Blocking of annoying pop-up browser windows (previously available
- as a patch).
- * Support for HTTP/1.1 (partially implemented at this point).
- * Support for Perl Compatible Regular Expressions in the
- configuration files, and generally a more sophisticated
- configuration syntax over previous versions.
- * Web page content filtering.
- * Multi-threaded.
+Privoxy is based on Internet Junkbuster (tm).
+
+You can find the latest version of the user manual at
+http://www.privoxy.org/user-manual/. Please see the Contact section on how to
+contact the developers.
+
+-------------------------------------------------------------------------------
+
+Table of Contents
+
+1. Introduction
+
+ 1.1. Features
+
+3. Installation
+
+ 3.1. Red Hat and SuSE RPMs
+ 3.2. Debian
+ 3.3. Windows
+ 3.4. Solaris, NetBSD, FreeBSD, HP-UX
+ 3.5. OS/2
+ 3.6. Mac OSX
+ 3.7. AmigaOS
+
+4. Note to Upgraders
+5. Quickstart to Using Privoxy
+6. Starting Privoxy
+
+ 6.1. RedHat and Debian
+ 6.2. SuSE
+ 6.3. Windows
+ 6.4. Solaris, NetBSD, FreeBSD, HP-UX and others
+ 6.5. OS/2
+ 6.6. Mac OSX
+ 6.7. AmigaOS
+ 6.8. Command Line Options
+
+7. Privoxy Configuration
+
+ 7.1. Controlling Privoxy with Your Web Browser
+ 7.2. Configuration Files Overview
+
+8. The Main Configuration File
+
+ 8.1. Configuration and Log File Locations
+
+ 8.1.1. confdir
+ 8.1.2. logdir
+ 8.1.3. actionsfile
+ 8.1.4. filterfile
+ 8.1.5. logfile
+ 8.1.6. jarfile
+ 8.1.7. trustfile
+ 8.1.8. user-manual
+
+ 8.2. Local Set-up Documentation
+
+ 8.2.1. trust-info-url
+ 8.2.2. admin-address
+ 8.2.3. proxy-info-url
+
+ 8.3. Debugging
+
+ 8.3.1. debug
+ 8.3.2. single-threaded
+
+ 8.4. Access Control and Security
+
+ 8.4.1. listen-address
+ 8.4.2. toggle
+ 8.4.3. enable-remote-toggle
+ 8.4.4. enable-edit-actions
+ 8.4.5. ACLs: permit-access and deny-access
+ 8.4.6. buffer-limit
- In addition, the configuration is more versatile overall.
- _________________________________________________________________
+ 8.5. Forwarding
+
+ 8.5.1. forward
+ 8.5.2. forward-socks4 and forward-socks4a
+ 8.5.3. Advanced Forwarding Examples
+
+ 8.6. Windows GUI Options
-2. Installation
-
- Junkbuster is available as raw source code, or pre-compiled binaries.
- See the [25]Junkbuster Home Page for current release info. Junkbuster
- is also available via [26]CVS. This is the recommended approach at
- this time. But please be aware that CVS is constantly changing, and it
- may break in mysterious ways.
- _________________________________________________________________
+9. Actions Files
-2.1. Source
-
- For gzipped tar archives, unpack the source:
+ 9.1. Finding the Right Mix
+ 9.2. How to Edit
+ 9.3. How Actions are Applied to URLs
+ 9.4. Patterns
+
+ 9.4.1. The Domain Pattern
+ 9.4.2. The Path Pattern
+
+ 9.5. Actions
+
+ 9.5.1. +add-header
+ 9.5.2. +block
+ 9.5.3. +deanimate-gifs
+ 9.5.4. +downgrade-http-version
+ 9.5.5. +fast-redirects
+ 9.5.6. +filter
+ 9.5.7. +hide-forwarded-for-headers
+ 9.5.8. +hide-from-header
+ 9.5.9. +hide-referer
+ 9.5.10. +hide-user-agent
+ 9.5.11. +handle-as-image
+ 9.5.12. +set-image-blocker
+ 9.5.13. +limit-connect
+ 9.5.14. +prevent-compression
+ 9.5.15. +session-cookies-only
+ 9.5.16. +prevent-reading-cookies
+ 9.5.17. +prevent-setting-cookies
+ 9.5.18. +kill-popups
+ 9.5.19. +send-vanilla-wafer
+ 9.5.20. +send-wafer
+ 9.5.21. Summary
+ 9.5.22. Sample Actions Files
+
+ 9.6. Aliases
- tar zxvf ijb_source_2.9*
- cd ijb_source_2.9*
-
- For retrieving the current CVS sources, you'll need the CVS package
- installed first. To download CVS source:
+10. The Filter File
- cvs -d:pserver:anonymous@cvs.ijbswa.sourceforge.net:/cvsroot/ijbswa login
- cvs -z3 -d:pserver:anonymous@cvs.ijbswa.sourceforge.net:/cvsroot/ijbswa co cu
-rrent
- cd current
-
- This will create a directory named current/, which will contain the
- source tree.
+ 10.1. The +filter Action
- Then, in either case, to build from source:
+11. Templates
+12. Contacting the Developers, Bug Reporting and Feature Requests
- autoconf #recommended for CVS source
- ./configure
- make
- su
- make install
-
- For Redhat and SuSE Linux RPM packages, see below.
- _________________________________________________________________
+ 12.1. Get Support
+ 12.2. Report bugs
+ 12.3. Request new features
+ 12.4. Report ads or other filter problems
+ 12.5. Other
-2.2. Red Hat
-
- To build Redhat RPM packages, install source as above. Then:
+13. Copyright and History
+
+ 13.1. Copyright
+ 13.2. History
+
+14. See Also
+15. Appendix
+
+ 15.1. Regular Expressions
+ 15.2. Privoxy's Internal Pages
+
+ 15.2.1. Bookmarklets
+
+ 15.3. Chain of Events
+ 15.4. Anatomy of an Action
- autoconf #recommended for CVS source
- ./configure
- make redhat-dist
+-------------------------------------------------------------------------------
+
+1. Introduction
+
+This documentation is included with the current beta version of Privoxy,
+v.2.9.14, and is mostly complete at this point. The most up to date reference
+for the time being is still the comments in the source files and in the
+individual configuration files. Development of version 3.0 is currently nearing
+completion, and includes many significant changes and enhancements over earlier
+versions. The target release date for stable v3.0 is "soon" ;-).
+
+Since this is a beta version, not all new features are well tested. This
+documentation may be slightly out of sync as a result (especially with CVS
+sources). And there may be bugs, though hopefully not many!
+
+-------------------------------------------------------------------------------
+
+1.1. Features
+
+In addition to Internet Junkbuster's traditional features of ad and banner
+blocking and cookie management, Privoxy provides new features, some of them
+currently under development:
- This will create both binary and src RPMs in the usual places.
- Example:
+ * FIXME: complete the list of features. change the order: most important
+ features to the top of the list. prefix new features with "NEW".
- /usr/src/redhat/RPMS/i686/junkbuster-2.9.8-1.i686.rpm
+ * Integrated browser based configuration and control utility at
+ http://config.privoxy.org/ (shortcut: http://p.p/). Browser-based tracing
+ of rule and filter effects. Remote toggling.
- /usr/src/redhat/SRPMS/junkbuster-2.9.9-1.src.rpm
+ * Blocking of annoying pop-up browser windows.
- To install, of course:
+ * HTTP/1.1 compliant (but not all optional 1.1 features are supported).
- rpm -Uvv /usr/src/redhat/RPMS/i686/junkbuster-2.9.9-1.i686.rpm
-
- This will place the Junkbuster configuration files in
- /etc/junkbuster/, and log files in /var/log/junkbuster/.
- _________________________________________________________________
+ * Support for Perl Compatible Regular Expressions in the configuration files,
+ and generally a more sophisticated and flexible configuration syntax over
+ previous versions.
-2.3. SuSE
-
- To build SuSE RPM packages, install source as above. Then:
+ * GIF de-animation.
+
+ * Web page content filtering (removes banners based on size, invisible
+ "web-bugs", JavaScript and HTML annoyances, pop-ups, etc.)
+
+ * Bypass many click-tracking scripts (avoids script redirection).
+
+ * Multi-threaded (POSIX and native threads).
+
+ * Auto-detection and re-reading of config file changes.
+
+ * User-customizable HTML templates (e.g. 404 error page).
+
+ * Improved cookie management features (e.g. session based cookies).
- autoconf #recommended for CVS source
- ./configure
- make suse-dist
+ * Improved signal handling, and a true daemon mode (Unix).
+
+ * Every feature now controllable on a per-site or per-location basis,
+ configuration more powerful and versatile over-all.
+
+ * Many smaller new features added, limitations and bugs removed, and security
+ holes fixed.
+
+-------------------------------------------------------------------------------
+
+3. Installation
+
+Privoxy is available both in convenient pre-compiled packages for a wide range
+of operating systems, and as raw source code. For most users, we recommend
+using the packages, which can be downloaded from our Privoxy Project Page. For
+installing and compiling the source code, please look into our Developer
+Manual.
+
+If you like to live on the bleeding edge and are not afraid of using possibly
+unstable development versions, you can check out the up-to-the-minute version
+directly from the CVS repository or simply download the nightly CVS tarball.
+Again, we refer you to the Developer Manual.
+
+At present, Privoxy is known to run on Windows (95, 98, ME, 2000, XP), Linux
+(RedHat, SuSE, Debian), Mac OSX, OS/2, AmigaOS, FreeBSD, NetBSD, BeOS, and many
+more flavors of Unix.
+
+Note: If you have a previous Junkbuster or Privoxy installation on your system,
+you will need to remove it. Some platforms do this for you as part of their
+installation procedure. (See below for your platform).
- This will create both binary and src RPMs in the usual places.
- Example:
+In any case be sure to back up your old configuration if it is valuable to you.
+See the note to upgraders section below.
+
+-------------------------------------------------------------------------------
+
+3.1. Red Hat and SuSE RPMs
+
+RPMs can be installed with rpm -Uvh privoxy-2.9.14-1.rpm, and will use
+/etc/privoxy for the location of configuration files.
+
+Note that on Red Hat, Privoxy will not be automatically started on system boot.
+You will need to enable that using chkconfig, ntsysv, or similar methods. Note
+that SuSE will automatically start Privoxy in the boot process.
+
+If you have problems with failed dependencies, try rebuilding the SRC RPM:
+rpm --rebuild privoxy-2.9.14-1.src.rpm. This will use your locally installed
+libraries and RPM version.
+
+Also note that if you have a Junkbuster RPM installed on your system, you need
+to remove it first, because the packages conflict. Otherwise, RPM will try to
+remove Junkbuster automatically, before installing Privoxy.
+
+-------------------------------------------------------------------------------
+
+3.2. Debian
+
+FIXME.
+
+-------------------------------------------------------------------------------
+
+3.3. Windows
+
+Just double-click the installer, which will guide you through the installation
+process. You will find the configuration files in the same directory as you
+installed Privoxy in. We do not use the registry of Windows.
+
+-------------------------------------------------------------------------------
+
+3.4. Solaris, NetBSD, FreeBSD, HP-UX
+
+Create a new directory, cd to it, then unzip and untar the archive. For the
+most part, you'll have to figure out where things go. FIXME.
+
+-------------------------------------------------------------------------------
+
+3.5. OS/2
+
+First, make sure that no previous installations of Junkbuster and / or Privoxy
+are left on your system. You can do this by
+
+Then, just double-click the WarpIN self-installing archive, which will guide
+you through the installation process. A shadow of the Privoxy executable will
+be placed in your startup folder so it will start automatically whenever OS/2
+starts.
+
+The directory you choose to install Privoxy into will contain all of the
+configuration files.
+
+-------------------------------------------------------------------------------
+
+3.6. Mac OSX
+
+Unzip the downloaded package (you can either double-click on the file in the
+finder, or on the desktop if you downloaded it there). Then, double-click on
+the package installer icon and follow the installation process. Privoxy will be
+installed in the subdirectory /Applications/Privoxy.app. Privoxy will set
+itself up to start automatically on system bring-up via
+/System/Library/StartupItems/Privoxy.
+
+-------------------------------------------------------------------------------
+
+3.7. AmigaOS
+
+Copy and then unpack the lha archive to a suitable location. All necessary
+files will be installed into the Privoxy directory, including all
+configuration and log files. To uninstall, just remove this directory.
+
+Start Privoxy (with RUN <>NIL:) in your startnet script (AmiTCP), in
+s:user-startup (RoadShow), as startup program in your startup script (Genesis),
+or as startup action (Miami and MiamiDx). Privoxy will automatically quit when
+you quit your TCP/IP stack (just ignore the harmless warning your TCP/IP stack
+may display that Privoxy is still running).
+
+-------------------------------------------------------------------------------
+
+4. Note to Upgraders
+
+There are very significant changes from older versions of Junkbuster to the
+current Privoxy. Configuration is substantially changed. Junkbuster 2.0.x and
+earlier configuration files will not migrate. The functionality of the old
+blockfile, cookiefile and imagelist are now combined into the "actions files".
+default.action is the main actions file. Local exceptions should best be put
+into user.action.
+
+A "filter file" (typically default.filter) is new as of Privoxy 2.9.x, and
+provides some of the new sophistication (explained below). config is much the
+same as before.
+
+If upgrading from a 2.0.x version, you will have to use the new config files,
+and possibly adapt any personal rules from your older files. When porting
+personal rules over from the old blockfile to the new actions files, please
+note that even the pattern syntax has changed. If upgrading from 2.9.x
+development versions, it is still recommended to use the new configuration
+files.
+
+A quick list of things to be aware of before upgrading:
+
+ * The default listening port is now 8118 due to a conflict with another
+ service (NAS).
- /usr/src/suse/RPMS/i686/junkbuster-2.9.9-1.i686.rpm
+ * Some installers may remove earlier versions completely. Save any important
+ configuration files!
- /usr/src/suse/SRPMS/junkbuster-2.9.9-1.src.rpm
+ * Privoxy is controllable with a web browser at the special URL:
+ http://config.privoxy.org/ (Shortcut: http://p.p/). Many aspects of
+ configuration can be done here, including temporarily disabling Privoxy.
- To install, of course:
+ * The primary configuration file for cookie management, ad and banner
+ blocking, and many other aspects of Privoxy configuration is in the actions
+ files. It is strongly recommended to become familiar with the new actions
+ concept below, before modifying these files. Locally defined rules should
+ go into user.action.
- rpm -Uvv /usr/src/suse/RPMS/i686/junkbuster-2.9.9-1.i686.rpm
+ * Some installers may not automatically start Privoxy after installation.
+
+-------------------------------------------------------------------------------
+
+5. Quickstart to Using Privoxy
- This will place the Junkbuster configuration files in
- /etc/junkbuster/, and log files in /var/log/junkbuster/.
- _________________________________________________________________
+ * Install Privoxy. See the section Installation.
-2.4. OS/2
+ * Start Privoxy. See the section Starting Privoxy.
+
+ * Change your browser's configuration to use the proxy localhost on port
+ 8118. See the section Starting Privoxy.
+
+ * Enjoy surfing with enhanced comfort and privacy. Please see the section
+ Contacting the Developers on how to report bugs or problems with websites
+ or to get help. You may want to change the file user.action to further
+ tweak your new browsing experience.
+
+-------------------------------------------------------------------------------
+
+6. Starting Privoxy
+
+Before launching Privoxy for the first time, you will want to configure your
+browser(s) to use Privoxy as a HTTP and HTTPS proxy. The default is localhost
+for the proxy address, and port 8118 (earlier versions used port 8000). This is
+the one configuration step that must be done!
+
+With Netscape (and Mozilla), this can be set under Edit -> Preferences ->
+Advanced -> Proxies -> HTTP Proxy. For Internet Explorer: Tools -> Internet
+Properties -> Connections -> LAN Setting. Then, check "Use Proxy" and fill in
+the appropriate info (Address: localhost, Port: 8118). Enter the same values
+for the HTTPS proxy setting too.
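The same proxy settings can also be applied from a script rather than a browser. The following is a minimal sketch in Python, assuming Privoxy is listening on the default localhost:8118 described above. The helper name make_privoxy_opener is hypothetical and is not part of Privoxy.

```python
import urllib.request

# Assumption: Privoxy listens on the default address given in this manual.
PRIVOXY = "http://localhost:8118"


def make_privoxy_opener(proxy_url=PRIVOXY):
    """Return (opener, handler) that route HTTP and HTTPS requests via the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler), handler


opener, handler = make_privoxy_opener()
# With Privoxy running, opener.open("http://p.p/") should fetch its built-in page.
```

No request is sent until opener.open() is called, so the snippet can be loaded even when Privoxy is not running.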
+
+After doing this, flush your browser's disk and memory caches to force a
+re-reading of all pages and to get rid of any ads that may be cached. You are
+now ready to start enjoying the benefits of using Privoxy!
+
+Privoxy is typically started by specifying the main configuration file to be
+used on the command line. If no configuration file is specified on the command
+line, Privoxy will look for a file named config in the current directory
+(except on Win32, where it will try config.txt).
+
+-------------------------------------------------------------------------------
+
+6.1. RedHat and Debian
+
+Privoxy is started with an init script, which uses the file
+/etc/privoxy/config as its main configuration file. Note that RedHat does not
+start Privoxy at boot by default. FIXME: Debian??
+
+ # /etc/rc.d/init.d/privoxy start
+
+-------------------------------------------------------------------------------
+
+6.2. SuSE
+
+Privoxy is started with a script, which uses the file /etc/privoxy/config as
+its main configuration file. Note that SuSE starts Privoxy upon booting your PC.
+
+ # rcprivoxy start
+
+-------------------------------------------------------------------------------
+
+6.3. Windows
+
+Click on the Privoxy Icon to start Privoxy. If no configuration file is
+specified on the command line, Privoxy will look for a file named config.txt.
+Note that Windows will automatically start Privoxy upon booting your PC.
+
+-------------------------------------------------------------------------------
+
+6.4. Solaris, NetBSD, FreeBSD, HP-UX and others
+
+Example Unix startup command:
+
+ # /usr/sbin/privoxy /etc/privoxy/config
+
+-------------------------------------------------------------------------------
+
+6.5. OS/2
+
+FIXME.
+
+-------------------------------------------------------------------------------
+
+6.6. Mac OSX
- The OS/2 version of Junkbuster requires the EMX runtime library to be
- installed. The EMX runtime library is available on the hobbes OS/2
- archive, among many other locations:
- [27]http://hobbes.nmsu.edu/cgi-bin/h-search?sh=1&button=Search&key=emx
- rt.zip&stype=all&sort=type&dir=%2Fpub%2Fos2%2Fdev%2Femx%2Fv0.9d
+FIXME.
+
+-------------------------------------------------------------------------------
+
+6.7. AmigaOS
+
+FIXME.
+
+-------------------------------------------------------------------------------
+
+6.8. Command Line Options
+
+Privoxy may be invoked with the following command-line options:
+
+ * --version
+
+ Print version info and exit. Unix only.
- Junkbuster is packaged in a WarpIN self- installing archive. The
- self-installing program will be named depending on the release
- version, something like: ijbos123.exe. In order to install it, simply
- run this executable or double-click on its icon and follow the WarpIN
- installation panels. A shadow of the Junkbuster executable will be
- placed in your startup folder so it will start automatically whenever
- OS/2 starts.
+ * --help
- The directory you choose to install Junkbuster into will contain all
- of the configuration files.
+ Print short usage info and exit. Unix only.
- If you would like to build binary images on OS/2 yourself, you will
- need a working EMX/GCC environment, plus several Unix-like tools. The
- Hobbes OS/2 archive is a good place to start when building such an
- environment. A set of Unix-like tools named gnupack is located here:
- [28]http://hobbes.nmsu.edu/cgi-bin/h-search?sh=1&key=gnupack&stype=all
- &sort=type&dir=%2Fpub%2Fos2%2Fapps
+ * --no-daemon
- Once you have the source code unpacked as above, you can build the
- binaries from the current/ directory:
+ Don't become a daemon, i.e. don't fork and become process group leader, and
+ don't detach from controlling tty. Unix only.
- autoconf
- sh configure
- make
- _________________________________________________________________
+ * --pidfile FILE
-2.5. Windows
+ On startup, write the process ID to FILE. Delete the FILE on exit. Failure
+ to create or delete the FILE is non-fatal. If no FILE option is given, no
+ PID file will be used. Unix only.
+
+ * --user USER[.GROUP]
+
+ After (optionally) writing the PID file, assume the user ID of USER, and if
+ included the GID of GROUP. Exit if the privileges are not sufficient to do
+ so. Unix only.
+
+ * configfile
+
+ If no configfile is included on the command line, Privoxy will look for a
+ file named "config" in the current directory (except on Win32 where it will
+ look for "config.txt" instead). Specify full path to avoid confusion. If no
+ config file is found, Privoxy will fail to start.
+
+-------------------------------------------------------------------------------
+
+7. Privoxy Configuration
+
+All Privoxy configuration is stored in text files. These files can be edited
+with a text editor. Many important aspects of Privoxy can also be controlled
+easily with a web browser.
+
+-------------------------------------------------------------------------------
+
+7.1. Controlling Privoxy with Your Web Browser
+
+Privoxy's user interface can be reached through the special URL
+http://config.privoxy.org/ (shortcut: http://p.p/), which is a built-in page
+and works without Internet access. You will see the following section:
+
+ Privoxy Menu
+ * View & change the current configuration
+ * View the source code version numbers
+ * View the request headers.
+ * Look up which actions apply to a URL and why
+ * Toggle Privoxy on or off
+
+
+This should be self-explanatory. Note the first item leads to an editor for the
+"actions list", which is where the ad, banner, cookie, and URL blocking magic
+is configured as well as other advanced features of Privoxy. This is an easy
+way to adjust various aspects of Privoxy configuration. The actions file, and
+other configuration files, are explained in detail below.
+
+"Toggle Privoxy On or Off" is handy for sites that might have problems with
+your current actions and filters. You can in fact use it as a test to see
+whether it is Privoxy causing the problem or not. Privoxy continues to run as a
+proxy in this case, but all filtering is disabled. There is even a toggle
+Bookmarklet offered, so that you can toggle Privoxy with one click from your
+browser.
+
+-------------------------------------------------------------------------------
+
+7.2. Configuration Files Overview
+
+For Unix, *BSD and Linux, all configuration files are located in /etc/privoxy/
+by default. For MS Windows, OS/2, and AmigaOS these are all in the same
+directory as the Privoxy executable. The name and number of configuration files
+have changed from previous versions, and are subject to change as development
+progresses.
+
+The installed defaults provide a reasonable starting point, though some
+settings may be aggressive by some standards. For the time being, the principal
+configuration files are:
+
+ * The main configuration file is named config on Linux, Unix, BSD, OS/2, and
+ AmigaOS and config.txt on Windows. This is a required file.
+
+ * default.action (the main actions file) is used to define the default
+ settings for various "actions" relating to images, banners, pop-ups, access
+ restrictions and cookies.
+
+ Multiple actions files may be defined in config. These are processed in the
+ order they are defined. Local customizations and locally preferred
+ exceptions to the default policies as defined in default.action are
+ probably best applied in user.action, which should be preserved across
+ upgrades. standard.action is also included. This is mostly for Privoxy's
+ internal use.
+
+ There is also a web based editor that can be accessed from
+ http://config.privoxy.org/show-status/ (Shortcut: http://p.p/show-status/)
+ for the various actions files.
+
+ * default.filter (the filter file) can be used to re-write the raw page
+ content, including viewable text as well as embedded HTML and JavaScript,
+ and whatever else lurks on any given web page. The filtering jobs are only
+ pre-defined here; whether to apply them or not is up to the actions files.
+
+All files use the "#" character to denote a comment (the rest of the line will
+be ignored) and understand line continuation through placing a backslash ("\")
+as the very last character in a line. If the # is preceded by a backslash, it
+loses its special function. Placing a # in front of an otherwise valid
+configuration line to prevent it from being interpreted is called "commenting
+out" that line.
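The comment and continuation rules can be illustrated with a small sketch in Python. This is not Privoxy's own parser. It assumes that continuation lines are joined before comments are stripped, which the manual does not specify, and the function names are hypothetical.

```python
import re


def _strip_comment(line):
    # Cut at the first "#" that is not preceded by a backslash.
    line = re.sub(r"(?<!\\)#.*", "", line)
    # An escaped "\#" stands for a literal "#".
    return line.replace("\\#", "#").strip()


def logical_lines(text):
    """Yield non-empty lines with continuations joined and comments removed."""
    pending = ""
    for raw in text.splitlines():
        if raw.endswith("\\"):
            # Backslash as the very last character: join with the next line.
            pending += raw[:-1]
            continue
        line = _strip_comment(pending + raw)
        pending = ""
        if line:
            yield line
    # A dangling continuation on the final line is still emitted (assumption).
    line = _strip_comment(pending)
    if line:
        yield line
```

A line that is commented out yields nothing, which matches the "commenting out" behaviour described above.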
+
+The actions files and default.filter can use Perl style regular expressions for
+maximum flexibility.
+
+After making any changes, there is no need to restart Privoxy in order for the
+changes to take effect. Privoxy detects such changes automatically. Note,
+however, that it may take one or two additional requests for the change to take
+effect. When changing the listening address of Privoxy, these "wake up"
+requests must obviously be sent to the old listening address.
+
+While under development, the configuration content is subject to change. The
+below documentation may not be accurate by the time you read this. Also, what
+constitutes a "default" setting, may change, so please check all your
+configuration files on important issues.
+
+-------------------------------------------------------------------------------
+
+8. The Main Configuration File
+
+Again, the main configuration file is named config on Linux/Unix/BSD and OS/2,
+and config.txt on Windows. Configuration lines consist of an initial keyword
+followed by a list of values, all separated by whitespace (any number of spaces
+or tabs). For example:
- Click-click. (I need help on this. Not a clue here. Also for
- configuration section below. HB.)
- _________________________________________________________________
+ confdir /etc/privoxy
-2.6. Other
- Some quick notes on other Operating Systems.
+Assigns the value /etc/privoxy to the option confdir and thus indicates that
+the configuration directory is named "/etc/privoxy/".
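As a rough illustration of the "keyword followed by values" layout, the following Python sketch splits a configuration line on any run of spaces or tabs. The function name parse_config_line is hypothetical, and the sketch ignores comments and line continuation.

```python
def parse_config_line(line):
    """Split 'keyword value ...' on whitespace; return None for a blank line."""
    fields = line.split()  # str.split() with no argument splits on runs of spaces/tabs
    if not fields:
        return None
    keyword, values = fields[0], fields[1:]
    return keyword, values


# The example line from the manual:
keyword, values = parse_config_line("confdir /etc/privoxy")
```

Here keyword is the option name (confdir) and values holds the list of values assigned to it.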
+
+All options in the config file except for confdir and logdir are optional.
+Watch out in the below description for what happens if you leave them unset.
+
+The main config file controls all aspects of Privoxy's operation that are not
+location dependent (i.e. they apply universally, no matter where you may be
+surfing).
+
+-------------------------------------------------------------------------------
+
+8.1. Configuration and Log File Locations
+
+Privoxy can (and normally does) use a number of other files for additional
+configuration, help and logging. This section of the configuration file tells
+Privoxy where to find those other files.
+
+-------------------------------------------------------------------------------
+
+8.1.1. confdir
+
+Specifies:
+
+ The directory where the other configuration files are located
+
+Type of value:
+
+ Path name
+
+Default value:
+
+ /etc/privoxy (Unix) or Privoxy installation dir (Windows)
+
+Effect if unset:
+
+ Mandatory
+
+Notes:
- For FreeBSD (and other *BSDs?), the build will need gmake instead of
- the included make. gmake is available from [29]http://www.gnu.org. The
- rest should be the same as above for Linux/Unix.
- _________________________________________________________________
+ No trailing "/", please
-3. Junkbuster Configuration
+ When development goes modular and multi-user, the blocker, filter, and
+ per-user config will be stored in subdirectories of "confdir". For now, the
+ configuration directory structure is flat, except for confdir/templates,
+ where the HTML templates for CGI output reside (e.g. Privoxy's 404 error
+ page).
+
+-------------------------------------------------------------------------------
+
+8.1.2. logdir
- For Unix, *BSD and Linux, all configuraton files are located in
- /etc/junkbuster/ by default. For MS Windows and OS/2, these are all in
- the same directory as the Junkbuster executable. The name and number
- of configuration files has changed from previous versions, and is
- subject to change as development progresses.
+Specifies:
- The installed defaults provide a reasonable starting point. For the
- time being, there are only three default configuration files (this
- will change in time):
+ The directory where all logging takes place (i.e. where logfile and jarfile
+ are located)
- * The main configuration file is named config on Linux, Unix, BSD,
- and OS/2, and junkbustr.txt on Windows. On Amiga, it is
- AmiTCP:db/junkbuster/config.
- * The actionsfile file is used to define various "actions" relating
- to images, banners, pop-ups, access restrictions, banners and
- cookies. There is a CGI based editor for this file that can be
- accessed via [30]http://i.j.b./. This is the easiest method of
- configuring actions. (Still under active development.)
- * The re_filterfile file can be used to rewrite the raw page
- content, including text as well as embedded HTML and JavaScript.
-
- actionsfile and re_filterfile can use Perl style regular expressions
- for maximum flexibility. All files use the "#" character to denote a
- comment. Such lines are not processed by Junkbuster. After making any
- changes, restart Junkbuster in order for the changes to take effect.
+Type of value:
+
+ Path name
+
+Default value:
+
+ /var/log/privoxy (Unix) or Privoxy installation dir (Windows)
+
+Effect if unset:
+
+ Mandatory
- While under development, the configuration content is subject to
- change. The below documentation may not be accurate by the time you
- read this. Also, what constitutes a "default" setting, may change, so
- please check all your configuration files on important issues.
- _________________________________________________________________
+Notes:
-3.1. The Main Configuration File
+ No trailing "/", please
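+
+ As a minimal sketch, assuming a Unix installation that keeps the documented
+ defaults, the two directory options would appear in the main configuration
+ file as:
+
+ confdir /etc/privoxy
+ logdir /var/log/privoxy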
+
+-------------------------------------------------------------------------------
+
+8.1.3. actionsfile
- Again, the main configuration file is named config on Linux/Unix/BSD
- and OS/2, and junkbustr.txt on Windows. Configuration lines consist of
- an initial keyword followed by a list of values, all separated by
- whitespace (any number of spaces or tabs). For example:
+Specifies:
+
+ The actions file(s) to use
+
+Type of value:
- blockfile blocklist.ini
+ File name, relative to confdir
- Indicates that the blockfile is named "blocklist.ini".
+Default value:
- A "#" indicates a comment. Any part of a line following a "#" is
- ignored, except if the "#" is preceded by a "\".
+ standard # Internal purposes, editing not recommended
+
+ default # Main actions file
+
+ user # User customizations
- Thus, by placing a "#" at the start of an existing configuration line,
- you can make it a comment and it will be treated as if it weren't
- there. This is called "commenting out" an option and can be useful to
- turn off features: If you comment out the "logfile" line, junkbuster
- will not log to a file at all. Watch for the "default:" section in
- each explanation to see what happens if the option is left unset (or
- commented out).
+Effect if unset:
- Long lines can be continued on the next line by using a "\" as the
- very last character.
+ No actions are taken at all. Simple neutral proxying.
- There are various aspects of Junkbuster behavior that can be tuned.
- _________________________________________________________________
+Notes:
-3.1.1. Defining Other Configuration Files
+ Multiple actionsfile lines are permitted, and are in fact recommended!
+
+ The default values include standard.action, which is used for internal
+ purposes and should be loaded, default.action, which is the "main" actions
+ file maintained by the developers, and user.action, where you can make your
+ personal additions.
+
+ Actions files are where all the per site and per URL configuration is done
+ for ad blocking, cookie management, privacy considerations, etc. There is
+ no point in using Privoxy without at least one actions file.
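+
+ A minimal sketch of the corresponding config lines, assuming the three
+ default actions files named above are all present in confdir:
+
+ actionsfile standard # Internal purposes, should be loaded
+ actionsfile default # Main actions file
+ actionsfile user # User customizations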
+
+-------------------------------------------------------------------------------
+
+8.1.4. filterfile
- Junkbuster can use a number of other files to tell it what ads to
- block, what cookies to accept, etc. This section of the configuration
- file tells Junkbuster where to find all those other files.
+Specifies:
- On Windows, Junkbuster looks for these files in the same directory as
- the executable. On Unix and OS/2, Junkbuster looks for these files in
- the current working directory. In either case, an absolute path name
- can be used to avoid problems.
+ The filter file to use
- When development goes modular and multiuser, the blocker, filter, and
- per-user config will be stored in subdirectories of "confdir". For
- now, only confdir/templates is used for storing HTML templates for CGI
- results.
+Type of value:
- The location of the configuration files:
+ File name, relative to confdir
- confdir /etc/junkbuster # No trailing /, please.
+Default value:
- The directory where all logging (i.e. logfile and jarfile) takes
- place. No trailing "/", please:
+ default.filter (Unix) or default.filter.txt (Windows)
- logdir /var/log/junkbuster
+Effect if unset:
- Note that all file specifications below are relative to the above two
- directories!
+ No textual content filtering takes place, i.e. all +filter{name} actions in
+ the actions files are turned off
- The "actionsfile" contains patterns to specify the actions to apply to
- requests for each site. Default: Cookies to and from all destinations
- are filtered. Popups are disabled for all sites. All sites are
- filtered if re_filterfile specified. No sites are blocked. An empty
- image is displayed for filtered ads and other images (formerly
- "tinygif"). The syntax of this file is explained in detail [31]below.
+Notes:
- actionsfile actionsfile
+ The "default.filter" file contains content modification rules that use
+ "regular expressions". These rules permit powerful changes on the content
+ of Web pages, e.g., you could disable your favorite JavaScript annoyances,
+ re-write the actual displayed text, or just have some fun replacing
+ "Microsoft" with "MicroSuck" wherever it appears on a Web page.
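+
+ For illustration only, a sketch assuming the Unix default file name. The
+ config line would be:
+
+ filterfile default.filter
+
+ A substitution rule inside the filter file might then look roughly like the
+ following. The filter name "fun" and the rule itself are hypothetical and
+ are not copied from the shipped default.filter:
+
+ FILTER: fun
+ s/Microsoft/MicroSuck/g
+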
- The "re_filterfile" file contains content modification rules. These
- rules permit powerful changes on the content of Web pages, e.g., you
- could disable your favourite JavaScript annoyances, rewrite the actual
- content, or just have some fun replacing "Microsoft" with "MicroSuck"
- wherever it appears on a Web page. Default: No content modification,
- or whatever the developers are playing with :-/
+-------------------------------------------------------------------------------
+
+8.1.5. logfile
+
+Specifies:
+
+ The log file to use
+
+Type of value:
- re_filterfile re_filterfile
+ File name, relative to logdir
- The logfile is where all logging and error messages are written. The
- logfile can be useful for tracking down a problem with Junkbuster
- (e.g., it's not blocking an ad you think it should block) but in most
- cases you probably will never look at it.
+Default value:
- Your logfile will grow indefinitely, and you will probably want to
- periodically remove it. On Unix systems, you can do this with a cron
- job (see "man cron"). For Redhat, a logrotate script has been
- included.
+ logfile (Unix) or privoxy.log (Windows)
- On SuSE Linux systems, you can place a line like
- "/var/log/junkbuster.* +1024k 644 nobody.nogroup" in /etc/logfiles,
- with the effect that cron.daily will automatically archive, gzip, and
- empty the log, when it exceeds 1M size.
-
- Default: Log to the a file named logfile. Comment out to disable
- logging.
-
- logfile logfile
+Effect if unset:
- The "jarfile" defines where Junkbuster stores the cookies it
- intercepts. Note that if you use a "jarfile", it may grow quite large.
- Default: Don't store intercepted cookies.
+ No log file is used, all log messages go to the console (stderr).
- #jarfile jarfile
+Notes:
- If you specify a "trustfile", Junkbuster will only allow access to
- sites that are named in the trustfile. You can also mark sites as
- trusted referrers, with the effect that access to untrusted sites will
- be granted, if a link from a trusted referrer was used. The link
- target will then be added to the "trustfile". This is a very
- restrictive feature that typical users most propably want to leave
- disabled. Default: Disabled, don't use the trust mechanism.
+ The Windows version will additionally log to the console.
- #trustfile trust
+ The logfile is where all logging and error messages are written. The level
+ of detail and number of messages are set with the debug option (see below).
+ The logfile can be useful for tracking down a problem with Privoxy (e.g.,
+ it's not blocking an ad you think it should block) but in most cases you
+ probably will never look at it.
- If you use the trust mechanism, it is a good idea to write up some
- online documentation about your blocking policy and to specify the
- URL(s) here. They will appear on the page that your users receive when
- they try to access untrusted content. Use multiple times for multiple
- URLs. Default: Don't display links on the "untrusted" info page.
+ Your logfile will grow indefinitely, and you will probably want to
+ periodically remove it. On Unix systems, you can do this with a cron job
+ (see "man cron"). For Red Hat, a logrotate script has been included.
- trust-info-url http://www.your-site.com/why_we_block.html
- trust-info-url http://www.your-site.com/what_we_allow.html
- _________________________________________________________________
+ On SuSE Linux systems, you can place a line like "/var/log/privoxy.* +1024k
+ 644 nobody.nogroup" in /etc/logfiles, with the effect that cron.daily will
+ automatically archive, gzip, and empty the log when it exceeds 1 MB in size.
-3.1.2. Other Configuration Options
+-------------------------------------------------------------------------------
- This part of the configuration file contains options that control how
- Junkbuster operates.
+8.1.6. jarfile
+
+Specifies:
+
+ The file to store intercepted cookies in
+
+Type of value:
+
+ File name, relative to logdir
- "Admin-address" should be set to the email address of the proxy
- administrator. It is used in many of the proxy-generated pages.
- Default: fill@me.in.please.
+Default value:
+
+ jarfile (Unix) or privoxy.jar (Windows)
+
+Effect if unset:
+
+ Intercepted cookies are not stored at all.
+
+Notes:
+
+ The jarfile may grow to ridiculous sizes over time.
+
+-------------------------------------------------------------------------------
+
+8.1.7. trustfile
+
+Specifies:
- #admin-address fill@me.in.please
-
- "Proxy-info-url" can be set to a URL that contains more info about
- this Junkbuster installation, it's configuration and policies. It is
- used in many of the proxy-generated pages and its use is highly
- recommended in multi-user installations, since your users will want to
- know why certain content is blocked or modified. Default: Don't show a
- link to online documentation.
+ The trust file to use
- proxy-info-url http://www.your-site.com/proxy.html
-
- "Listen-address" specifies the address and port where Junkbuster will
- listen for connections from your Web browser. The default is to listen
- on the localhost port 8000, and this is suitable for most users. (In
- your web browser, under proxy configuration, list the proxy server as
- "localhost" and the port as "8000").
-
- If you already have another service running on port 8000, or if you
- want to serve requests from other machines (e.g. on your local
- network) as well, you will need to override the default. The syntax is
- "listen-address [<ip-address>]:<port>". If you leave out the IP
- address, junkbuster will bind to all interfaces (addresses) on your
- machine and may become reachable from the Internet. In that case,
- consider using access control lists (acl's) (see "aclfile" above), or
- a firewall.
-
- For example, suppose you are running Junkbuster on a machine which has
- the address 192.168.0.1 on your local private network (192.168.0.0)
- and has another outside connection with a different address. You want
- it to serve requests from inside only:
-
- listen-address 192.168.0.1:8000
-
- If you want it to listen on all addresses (including the outside
- connection):
-
- listen-address :8000
-
- If you do this, consider using ACLs (see "aclfile" above). Note: you
- will need to point your browser(s) to the address and port that you
- have configured here. Default: localhost:8000 (127.0.0.1:8000).
-
- The debug option sets the level of debugging information to log in the
- logfile (and to the console in the Windows version). A debug level of
- 1 is informative because it will show you each request as it happens.
- Higher levels of debug are probably only of interest to developers.
-
- debug 1 # GPC = show each GET/POST/CONNECT request
- debug 2 # CONN = show each connection status
- debug 4 # IO = show I/O status
- debug 8 # HDR = show header parsing
- debug 16 # LOG = log all data into the logfile
- debug 32 # FRC = debug force feature
- debug 64 # REF = debug regular expression filter
- debug 128 # = debug fast redirects
- debug 256 # = debug GIF deanimation
- debug 512 # CLF = Common Log Format
- debug 1024 # = debug kill popups
- debug 4096 # INFO = Startup banner and warnings.
- debug 8192 # ERROR = Non-fatal errors
-
- It is highly recommended that you enable ERROR reporting (debug 8192),
- at least until the next stable release.
+Type of value:
- The reporting of FATAL errors (i.e. ones which crash JunkBuster) is
- always on and cannot be disabled.
+ File name, relative to confdir
- If you want to use CLF (Common Log Format), you should set "debug 512"
- ONLY, do not enable anything else.
+Default value:
- Multiple "debug" directives, are OK - they're logical-OR'd together.
+ Unset (commented out). When activated: trust (Unix) or trust.txt (Windows)
- debug 15 # same as setting the first 4 listed above
+Effect if unset:
- Default:
+ The whole trust mechanism is turned off.
- debug 1 # URLs
- debug 4096 # Info
- debug 8192 # Errors - *we highly recommended enabling this*
+Notes:
- Junkbuster normally uses "multi-threading", a software technique that
- permits it to handle many different requests simultaneously. In some
- cases you may wish to disable this -- particularly if you're trying to
- debug a problem. The "single-threaded" option forces Junkbuster to
- handle requests sequentially. Default: Multi-threaded mode.
+ The trust mechanism is an experimental feature for building white-lists and
+ should be used with care. It is NOT recommended for the casual user.
- #single-threaded
+ If you specify a trust file, Privoxy will only allow access to sites that
+ are named in the trustfile. You can also mark sites as trusted referrers
+ (with +), with the effect that access to untrusted sites will be granted,
+ if a link from a trusted referrer was used. The link target will then be
+ added to the "trustfile". Possible applications include limiting Internet
+ access for children.
- "toggle" allows you to temporarily disable all Junkbuster's filtering.
- Just set "toggle 0".
+ If you use the + operator in the trust file, it may grow considerably over
+ time.
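+
+ A sketch under stated assumptions: the config line uses the Unix default
+ file name, and the host names in the trust file are hypothetical
+ placeholders. In the main configuration file:
+
+ trustfile trust
+
+ In the trust file itself:
+
+ # A trusted site
+ www.example.com
+ # A trusted referrer: links followed from here are added to this file
+ +portal.example.org
+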
- The Windows version of Junkbuster puts an icon in the system tray,
- which also allows you to change this option. If you right-click on
- that icon (or select the "Options" menu), one choice is "Enable".
- Clicking on enable toggles Junkbuster on and off. This is useful if
- you want to temporarily disable Junkbuster, e.g., to access a site
- that requires cookies which you normally have blocked. This can also
- be toggled via a web browser at the Junkbuster internal address of
- [32]http://i.j.b./ on any platform.
+-------------------------------------------------------------------------------
+
+8.1.8. user-manual
+
+Specifies:
- "toggle 1" means Junkbuster runs normally, "toggle 0" means that
- Junkbuster becomes a non-anonymizing non-blocking proxy. Default: 1
- (on).
+ Location of the Privoxy User Manual.
- toggle 1
+Type of value:
- For content filtering, i.e. the "+filter" and "+deanimate-gif"
- actions, it is neccessary that Junkbuster buffers the entire document
- body. This can be potentially dangerous, since a server could just
- keep sending data indefinitely and wait for your RAM to exhaust. With
- nasty consequences.
+ A fully qualified URI
- The buffer-limit option lets you set the maximum size in Kbytes that
- each buffer may use. When the documents buffer exceeds this size, it
- is flushed to the client unfiltered and no further attempt to filter
- the rest of it is made. Remember that there may multiple threads
- running, which might require increasing the "buffer-limit" Kbytes
- each, unless you have enabled "single-threaded" above.
+Default value:
- buffer-limit 4069
+ http://www.privoxy.org/user-manual/
- To enable the web-based actionsfile editor set enable-edit-actions to
- 1, or 0 to disable. Note that you must have compiled JunkBuster with
- support for this feature, otherwise this option has no effect. This
- internal page can be reached at [33]http://i.j.b./.
+Effect if unset:
- Security note: If this is enabled, anyone who can use the proxy can
- edit the actions file, and their changes will affect all users. For
- shared proxies, you probably want to disable this. Default: enabled.
+ The default will be used.
- enable-edit-actions 1
+Notes:
- Allow JunkBuster to be toggled on and off remotely, using your web
- browser. Set "enable-remote-toggle"to 1 to enable, and 0 to disable.
- Note that you must have compiled JunkBuster with support for this
- feature, otherwise this option has no effect.
+ The User Manual is used for help hints from some of the internal CGI pages.
+ It is normally packaged with the binary distributions, so it makes more
+ sense to point this option at a locally installed copy.
- Security note: If this is enabled, anyone who can use the proxy can
- toggle it on or off (see [34]http://i.j.b./), and their changes will
- affect all users. For shared proxies, you probably want to disable
- this. Default: enabled.
+ A more useful example (Unix):
- enable-remote-toggle 1
- _________________________________________________________________
+ user-manual file:///usr/share/doc/privoxy-2.9.14/user-manual/
-3.1.3. Access Control List (ACL)
+-------------------------------------------------------------------------------
+
+8.2. Local Set-up Documentation
+
+If you intend to operate Privoxy for more users than just yourself, it might be
+a good idea to let them know how to reach you, what you block and why you do
+that, your policies, etc.
- Access controls are included at the request of some ISPs and systems
- administrators, and are not usually needed by individual users. Please
- note the warnings in the FAQ that this proxy is not intended to be a
- substitute for a firewall or to encourage anyone to defer addressing
- basic security weaknesses.
+-------------------------------------------------------------------------------
+
+8.2.1. trust-info-url
+
+Specifies:
+
+ A URL to be displayed in the error page that users will see if access to an
+ untrusted page is denied.
+
+Type of value:
- If no access settings are specified, the proxy talks to anyone that
- connects. If any access settings file are specified, then the proxy
- talks only to IP addresses permitted somewhere in this file and not
- denied later in this file.
+ URL
- Summary -- if using an ACL:
+Default value:
- Client must have permission to receive service.
+ Two example URLs are provided
- LAST match in ACL wins.
+Effect if unset:
- Default behavior is to deny service.
+ No links are displayed on the "untrusted" error page.
- The syntax for an entry in the Access Control List is:
+Notes:
- ACTION SRC_ADDR[/SRC_MASKLEN] [ DST_ADDR[/DST_MASKLEN] ]
+ The value of this option only matters if the experimental trust mechanism
+ has been activated. (See trustfile above.)
- Where the individual fields are:
+ If you use the trust mechanism, it is a good idea to write up some on-line
+ documentation about your trust policy and to specify the URL(s) here. Use
+ multiple times for multiple URLs.
- ACTION = "permit-access" or "deny-access"
- SRC_ADDR = client hostname or dotted IP address
- SRC_MASKLEN = number of bits in the subnet mask for the source
- DST_ADDR = server or forwarder hostname or dotted IP address
- DST_MASKLEN = number of bits in the subnet mask for the target
+ The URL(s) should be added to the trustfile as well, so users don't end up
+ locked out from the information on why they were locked out in the first
+ place!
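+
+ For instance, with two policy pages (the URLs are hypothetical
+ placeholders; substitute your own), the option is simply repeated:
+
+ trust-info-url http://www.example.com/why_we_block.html
+ trust-info-url http://www.example.com/what_we_allow.html
+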
- The field separator (FS) is whitespace (space or tab).
+-------------------------------------------------------------------------------
+
+8.2.2. admin-address
+
+Specifies:
- IMPORTANT NOTE: If the junkbuster is using a forwarder (see below) or
- a gateway for a particular destination URL, the DST_ADDR that is
- examined is the address of the forwarder or the gateway and NOT the
- address of the ultimate target. This is necessary because it may be
- impossible for the local Junkbuster to determine the address of the
- ultimate target (that's often what gateways are used for).
+ An email address to reach the proxy administrator.
- Here are a few examples to show how the ACL features work:
+Type of value:
- "localhost" is OK -- no DST_ADDR implies that ALL destination
- addresses are OK:
+ Email address
- permit-access localhost
+Default value:
- A silly example to illustrate permitting any host on the class-C
- subnet with Junkbuster to go anywhere:
+ Unset
- permit-access www.junkbusters.com/24
+Effect if unset:
- Except deny one particular IP address from using it at all:
+ No email address is displayed on error pages and the CGI user interface.
- deny-access ident.junkbusters.com
+Notes:
- You can also specify an explicit network address and subnet mask.
- Explicit addresses do not have to be resolved to be used.
+ If both admin-address and proxy-info-url are unset, the whole "Local
+ Privoxy Support" box on all generated pages will not be shown.
+
+-------------------------------------------------------------------------------
+
+8.2.3. proxy-info-url
+
+Specifies:
- permit-access 207.153.200.0/24
+ A URL to documentation about the local Privoxy setup, configuration or
+ policies.
- A subnet mask of 0 matches anything, so the next line permits
- everyone.
+Type of value:
- permit-access 0.0.0.0/0
+ URL
- Note, you cannot say:
+Default value:
- permit-access .org
+ Unset
- to allow all *.org domains. Every IP address listed must resolve
- fully.
+Effect if unset:
- An ISP may want to provide a Junkbuster that is accessible by "the
- world" and yet restrict use of some of their private content to hosts
- on its internal network (i.e. its own subscribers). Say, for instance
- the ISP owns the Class-B IP address block 123.124.0.0 (a 16 bit
- netmask). This is how they could do it:
+ No link to local documentation is displayed on error pages and the CGI user
+ interface.
- permit-access 0.0.0.0/0 0.0.0.0/0 # other clients can go anywhere
- # with the following exceptions
- :
+Notes:
- deny-access 0.0.0.0/0 123.124.0.0/16 # block all external request
- s for
- # sites on the ISP's network
- permit 0.0.0.0/0 www.my_isp.com # except for the ISP's main
- # web site
- permit 123.124.0.0/16 0.0.0.0/0 # the ISP's clients can go
- # anywhere
+ If both admin-address and proxy-info-url are unset, the whole "Local
+ Privoxy Support" box on all generated pages will not be shown.
- Note that if some hostnames are listed with multiple IP addresses, the
- primary value returned by DNS (via gethostbyname()) is used. Default:
- Anyone can access the proxy.
- _________________________________________________________________
+ This URL shouldn't be blocked ;-)
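+
+ A sketch of both local support options set together, so that the "Local
+ Privoxy Support" box is shown. The address and URL are hypothetical
+ placeholders:
+
+ admin-address privoxy-admin@example.com
+ proxy-info-url http://www.example.com/proxy-service.html
+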
-3.1.4. Forwarding
+-------------------------------------------------------------------------------
+
+8.3. Debugging
+
+These options are mainly useful when tracing a problem. Note that you might
+also want to invoke Privoxy with the --no-daemon command line option when
+debugging.
+
+-------------------------------------------------------------------------------
- This feature allows chaining of HTTP requests via multiple proxies. It
- can be used to better protect privacy and confidentiality when
- accessing specific domains by routing requests to those domains to a
- special purpose filtering proxy such as lpwa.com. Or to use a caching
- proxy to speed up browsing.
+8.3.1. debug
+
+Specifies:
+
+ Key values that determine what information gets logged.
- It can also be used in an environment with multiple networks to route
- requests via multiple gateways allowing transparent access to multiple
- networks without having to modify browser configurations.
+Type of value:
- Also specified here are SOCKS proxies. Junkbuster SOCKS 4 and SOCKS
- 4A. The difference is that SOCKS 4A will resolve the target hostname
- using DNS on the SOCKS server, not our local DNS client.
+ Integer values
- The syntax of each line is:
+Default value:
- forward target_domain[:port] http_proxy_host[:port]
- forward-socks4 target_domain[:port] socks_proxy_host[:port]
- http_proxy_host[:port]
- forward-socks4a target_domain[:port] socks_proxy_host[:port]
- http_proxy_host[:port]
+ 12289 (i.e. 1 + 4096 + 8192: URLs plus informational, warning, and
+ non-fatal error messages)
- If http_proxy_host is ".", then requests are not forwarded to a HTTP
- proxy but are made directly to the web servers.
+Effect if unset:
- Lines are checked in sequence, and the last match wins.
+ Nothing gets logged.
- There is an implicit line equivalent to the following, which specifies
- that anything not finding a match on the list is to go out without
- forwarding or gateway protocol, like so:
+Notes:
- forward .* . # implicit
+ The available debug levels are:
- In the following common configuration, everything goes to Lucent's
- LPWA, except SSL on port 443 (which it doesn't handle):
+ debug 1 # show each GET/POST/CONNECT request
+ debug 2 # show each connection status
+ debug 4 # show I/O status
+ debug 8 # show header parsing
+ debug 16 # log all data into the logfile
+ debug 32 # debug force feature
+ debug 64 # debug regular expression filter
+ debug 128 # debug fast redirects
+ debug 256 # debug GIF de-animation
+ debug 512 # Common Log Format
+ debug 1024 # debug kill pop-ups
+ debug 4096 # Startup banner and warnings.
+ debug 8192 # Non-fatal errors
- forward .* lpwa.com:8000
- forward :443 .
+ To select multiple debug levels, you can either add them or use multiple
+ debug lines.
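+
+ As a sketch of the two equivalent forms, using the documented default of
+ 12289 as the worked example:
+
+ debug 12289 # 1 + 4096 + 8192, all in one line
+
+ or, spelled out one level per line:
+
+ debug 1 # show each GET/POST/CONNECT request
+ debug 4096 # Startup banner and warnings
+ debug 8192 # Non-fatal errors
+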
- See the FAQ for instructions on how to automate the login procedure
- for LPWA. Some users have reported difficulties related to LPWA's use
- of "." as the last element of the domain, and have said that this can
- be fixed with this:
+ A debug level of 1 is informative because it will show you each request as
+ it happens. 1, 4096 and 8192 are highly recommended so that you will notice
+ when things go wrong. The other levels are probably only of interest if you
+ are hunting down a specific problem. They can produce a huge amount of output
+ (especially 16).
- forward lpwa. lpwa.com:8000
+ The reporting of fatal errors (i.e. ones which crash Privoxy) is always on
+ and cannot be disabled.
- (NOTE: the syntax for specifiying target_domain has changed since the
- previous paragraph was written -- it will not work now. More
- information is welcome.)
+ If you want to use CLF (Common Log Format), you should set "debug 512" ONLY
+ and not enable anything else.
+
+-------------------------------------------------------------------------------
+
+8.3.2. single-threaded
+
+Specifies:
- In this fictitious example, everything goes via an ISP's caching
- proxy, except requests to that ISP:
+ Whether to run only one server thread
- forward .* caching.myisp.net:8000
- forward myisp.net .
+Type of value:
- For the @home network, we're told the forwarding configuration is
- this:
+ None
- forward .* proxy:8080
+Default value:
- Also, we're told they insist on getting cookies and JavaScript, so you
- need to add home.com to the cookie file. We consider JavaScript a
- security risk. Java need not be enabled.
+ Unset
- In this example direct connections are made to all "internal" domains,
- but everything else goes through Lucent's LPWA by way of the company's
- SOCKS gateway to the Internet.
+Effect if unset:
- forward_socks4 .* lpwa.com:8000 firewall.my_company.com:1080
- forward my_company.com .
+ Multi-threaded (or, where unavailable: forked) operation, i.e. the ability
+ to serve multiple requests simultaneously.
- This is how you could set up a site that always uses SOCKS but no
- forwarders:
+Notes:
- forward_socks4a .* . firewall.my_company.com:1080
+ This option is only there for debugging purposes and you should never need to
+ use it. It will drastically reduce performance.
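+
+ A possible debugging set-up, sketched under the assumption that the main
+ configuration file lives at /etc/privoxy/config (adjust the path to your
+ installation). In the config file:
+
+ single-threaded
+ debug 1
+ debug 4096
+ debug 8192
+
+ Privoxy could then be started in the foreground with:
+
+ privoxy --no-daemon /etc/privoxy/config
+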
- An advanced example for network administrators:
+-------------------------------------------------------------------------------
+
+8.4. Access Control and Security
+
+This section of the config file controls the security-relevant aspects of
+Privoxy's configuration.
+
+-------------------------------------------------------------------------------
+
+8.4.1. listen-address
+
+Specifies:
- If you have links to multiple ISPs that provide various special
- content to their subscribers, you can configure forwarding to pass
- requests to the specific host that's connected to that ISP so that
- everybody can see all of the content on all of the ISPs.
+ The IP address and TCP port on which Privoxy will listen for client
+ requests.
- This is a bit tricky, but here's an example:
+Type of value:
- host-a has a PPP connection to isp-a.com. And host-b has a PPP
- connection to isp-b.com. host-a can run a Junkbuster proxy with
- forwarding like this:
+ [IP-Address]:Port
- forward .* .
- forward isp-b.com host-b:8000
+Default value:
- host-b can run a Junkbuster proxy with forwarding like this:
+ localhost:8118
- forward .* .
- forward isp-a.com host-a:8000
+Effect if unset:
- Now, anyone on the Internet (including users on host-a and host-b) can
- set their browser's proxy to either host-a or host-b and be able to
- browse the content on isp-a or isp-b.
+ Bind to localhost (127.0.0.1), port 8118. This is suitable and recommended
+ for home users who run Privoxy on the same machine as their browser.
- Here's another practical example, for University of Kent at Canterbury
- students with a network connection in their room, who need to use the
- University's Squid web cache.
+Notes:
- forward *. ssbcache.ukc.ac.uk:3128 # Use the proxy, except for:
- forward .ukc.ac.uk . # Anything on the same domain as us
- forward * . # Host with no domain specified
- forward 129.12.*.* . # A dotted IP on our /16 network.
- forward 127.*.*.* . # Loopback address
- forward localhost.localdomain . # Loopback address
- forward www.ukc.mirror.ac.uk . # Specific host
+ You will need to configure your browser(s) to use this proxy address and
+ port.
- If you intend to chain Junkbuster and squid locally, then chain as
- browser -> squid -> junkbuster is the recommended way.
+ If you already have another service running on port 8118, or if you want to
+ serve requests from other machines (e.g. on your local network) as well,
+ you will need to override the default.
- Your squid configuration could then look like this:
+ If you leave out the IP address, Privoxy will bind to all interfaces
+ (addresses) on your machine and may become reachable from the Internet. In
+ that case, consider using access control lists (ACLs) (see "ACLs" below),
+ or a firewall.
- # Define junkbuster as parent cache
+Example:
- cache_peer 127.0.0.1 parent 8000 0 no-query
+ Suppose you are running Privoxy on a machine which has the address
+ 192.168.0.1 on your local private network (192.168.0.0) and has another
+ outside connection with a different address. You want it to serve requests
+ from inside only:
- # Define ACL for protocol FTP
- acl FTP proto FTP
- # Do not forward ACL FTP to junkbuster
- always_direct allow FTP
- # Do not forward ACL CONNECT (https) to junkbuster
- always_direct allow CONNECT
- # Forward the rest to junkbuster
- never_direct allow all
- _________________________________________________________________
+ listen-address 192.168.0.1:8118
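+
+ By contrast, a sketch of the form that omits the IP address, so that
+ Privoxy binds to all interfaces (the port is assumed to stay at the default
+ 8118):
+
+ listen-address :8118
+
+ As noted above, a proxy configured this way may be reachable from the
+ Internet and is best combined with ACLs or a firewall.
+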
-3.1.5. Windows GUI Options
+-------------------------------------------------------------------------------
- Junkbuster has a number of options specific to the Windows GUI
- interface:
+8.4.2. toggle
+
+Specifies:
+
+ Initial state of "toggle" status
- If "activity-animation" is set to 1, the Junkbuster icon will animate
- when "Junkbuster" is active. To turn off, set to 0.
+Type of value:
- activity-animation 1
+ 1 or 0
- If "log-messages" is set to 1, Junkbuster will log messages to the
- console window:
+Default value:
- log-messages 1
+ 1
- If "log-buffer-size" is set to 1, the size of the log buffer, i.e. the
- amount of memory used for the log messages displayed in the console
- window, will be limited to "log-max-lines" (see below).
+Effect if unset:
- Warning: Setting this to 0 will result in the buffer to grow
- infinitely and eat up all your memory!
+ Act as if toggled on
- log-buffer-size 1
+Notes:
- log-max-lines is the maximum number of lines held in the log buffer.
- See above.
+ If set to 0, Privoxy will start in "toggled off" mode, i.e. behave like a
+ normal, content-neutral proxy. See enable-remote-toggle below. This is not
+ really useful anymore, since toggling is much easier via the web interface
+ than via editing the conf file.
- log-max-lines 200
+ The Windows version will only display the toggle icon in the system tray if
+ this option is present.
- If "log-highlight-messages" is set to 1, Junkbuster will highlight
- portions of the log messages with a bold-faced font:
+-------------------------------------------------------------------------------
+
+8.4.3. enable-remote-toggle
+
+Specifies:
- log-highlight-messages 1
+ Whether or not the web-based toggle feature may be used
- The font used in the console window:
+Type of value:
- log-font-name Comic Sans MS
+ 0 or 1
- Font size used in the console window:
+Default value:
- log-font-size 8
+ 1
- "show-on-task-bar" controls whether or not Junkbuster will appear as a
- button on the Task bar when minimized:
+Effect if unset:
- show-on-task-bar 0
+ The web-based toggle feature is disabled.
- If "close-button-minimizes" is set to 1, the Windows close button will
- minimize Junkbuster instead of closing the program (close with the
- exit option on the File menu).
+Notes:
- close-button-minimizes 1
+ When toggled off, Privoxy acts like a normal, content-neutral proxy, i.e.
+ it acts as if none of the actions applied to any URL.
- The "hide-console" option is specific to the MS-Win console version of
- JunkBuster. If this option is used, Junkbuster will disconnect from
- and hide the command console.
+ For the time being, access to the toggle feature can not be controlled
+ separately by "ACLs" or HTTP authentication, so that everybody who can
+ access Privoxy (see "ACLs" and listen-address above) can toggle it for all
+ users. So this option is not recommended for multi-user environments with
+ untrusted users.
- #hide-console
- _________________________________________________________________
+ Note that you must have compiled Privoxy with support for this feature,
+ otherwise this option has no effect.
-3.2. The Actions File
+-------------------------------------------------------------------------------
- The "actionsfile" is used to define what actions Junkbuster takes, and
- thus determines how images, cookies and various other aspects of HTTP
- content and transactions are handled. Images can be anything you want,
- including ads, banners, or just some obnoxious image that you would
- rather not see. Cookies can be accepted or rejected. The default file
- is in fact named actionsfile.
+8.4.4. enable-edit-actions
+
+Specifies:
+
+ Whether or not the web-based actions file editor may be used
- To determine which actions apply to a request, the URL of the request
- is compared to all patterns in this file. Every time it matches, the
- list of applicable actions for the URL is incrementally updated. You
- can trace this process by visiting [35]http://i.j.b/show-url-info.
+Type of value:
- The actions file can be edited with a browser by loading
- [36]http://i.j.b, and then select "Edit Actions".
+ 0 or 1
- There are four types of lines in this file: comments (begin with a "#"
- character), actions, aliases and patterns, all of which are explained
- below, as well as the configuration file syntax that Junkbuster
- understands.
- _________________________________________________________________
+Default value:
-3.2.1. URL Domain and Path Syntax
+ 1
+
+Effect if unset:
+
+ The web-based actions file editor is disabled.
+
+Notes:
+
+ For the time being, access to the editor can not be controlled separately
+ by "ACLs" or HTTP authentication, so that everybody who can access Privoxy
+ (see "ACLs" and listen-address above) can modify its configuration for all
+ users. So this option is not recommended for multi-user environments with
+ untrusted users.
+
+ Note that you must have compiled Privoxy with support for this feature,
+ otherwise this option has no effect.
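+
+ As a minimal sketch, an administrator of a shared installation who does not
+ trust all users might switch off both web-based features in the main
+ configuration file (whether this suits a given site is an assumption):
+
+ enable-remote-toggle 0
+ enable-edit-actions 0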
+
+-------------------------------------------------------------------------------
- Generally, a pattern has the form <domain>/<path>, where both the
- <domain> and <path> part are optional. If you only specify a domain
- part, the "/" can be left out:
+8.4.5. ACLs: permit-access and deny-access
+
+Specifies:
+
+ Who can access what.
- www.example.com - is a domain only pattern and will match any request
- to "www.example.com".
+Type of value:
- www.example.com/ - means exactly the same.
+ src_addr[/src_masklen] [dst_addr[/dst_masklen]]
- www.example.com/index.html - matches only the single document
- "/index.html" on "www.example.com".
+ Where src_addr and dst_addr are IP addresses in dotted decimal notation or
+ valid DNS names, and src_masklen and dst_masklen are subnet masks in CIDR
+ notation, i.e. integer values from 2 to 30 representing the length (in
+ bits) of the network address. The masks and the whole destination part are
+ optional.
- /index.html - matches the document "/index.html", regardless of the
- domain.
+Default value:
- index.html - matches nothing, since it would be interpreted as a
- domain name and there is no top-level domain called ".html".
+ Unset
- The matching of the domain part offers some flexible options: if the
- domain starts or ends with a dot, it becomes unanchored at that end.
- For example:
+Effect if unset:
- .example.com - matches any domain that ENDS in ".example.com".
+ Don't restrict access further than implied by listen-address
- www. - matches any domain that STARTS with "www".
+Notes:
- Additionally, there are wildcards that you can use in the domain names
- themselves. They work pretty similar to shell wildcards: "*" stands
- for zero or more arbitrary characters, "?" stands for any single
- character. And you can define charachter classes in square brackets
- and they can be freely mixed:
+ Access controls are included at the request of ISPs and systems
+ administrators, and are not usually needed by individual users. For a
+ typical home user, it will normally suffice to ensure that Privoxy only
+ listens on the localhost or internal (home) network address by means of the
+ listen-address option.
- ad*.example.com - matches "adserver.example.com", "ads.example.com",
- etc but not "sfads.example.com".
+ Please see the warnings in the FAQ that this proxy is not intended to be a
+ substitute for a firewall or to encourage anyone to defer addressing basic
+ security weaknesses.
- *ad*.example.com - matches all of the above, and then some.
+ Multiple ACL lines are OK. If any ACLs are specified, then Privoxy
+ talks only to IP addresses that match at least one permit-access line and
+ don't match any subsequent deny-access line. In other words, the last match
+ wins, with the default being deny-access.
- .?pix.com - matches "www.ipix.com", "pictures.epix.com",
- "a.b.c.d.e.upix.com", etc.
+ If Privoxy is using a forwarder (see forward below) for a particular
+ destination URL, the dst_addr that is examined is the address of the
+ forwarder and NOT the address of the ultimate target. This is necessary
+ because it may be impossible for the local Privoxy to determine the IP
+ address of the ultimate target (that's often what gateways are used for).
- www[1-9a-ez].example.com - matches "www1.example.com",
- "www4.example.com", "wwwd.example.com", "wwwz.example.com", etc., but
- not "wwww.example.com".
+ You should prefer using IP addresses over DNS names, because the address
+ lookups take time. All DNS names must resolve! You can not use domain
+ patterns like "*.org" or partial domain names. If a DNS name resolves to
+ multiple IP addresses, only the first one is used.
- If Junkbuster was compiled with "pcre" support (default), Perl
- compatible regular expressions can be used. See the pcre/docs/
- direcory or "man perlre" (also available on
- [37]http://www.perldoc.com/perl5.6/pod/perlre.html) for details. A
- brief discussion of regular expressions is in the [38]Appendix. For
- instance:
+ Denying access to particular sites by ACL may have undesired side effects
+ if the site in question is hosted on a machine which also hosts other
+ sites.
- /.*/advert[0-9]+\.jpe?g - would match a URL from any domain, with any
- path that includes "advert" followed immediately by one or more
- digits, then a "." and ending in either "jpeg" or "jpg". So we match
- "example.com/ads/advert2.jpg", and
- "www.example.com/ads/banners/advert39.jpeg", but not
- "www.example.com/ads/banners/advert39.gif" (no gifs in the example
- pattern).
+Examples:
- Please note that matching in the path is case INSENSITIVE by default,
- but you can switch to case sensitive at any point in the pattern by
- using the "(?-i)" switch:
+ Explicitly define the default behavior if no ACL and listen-address are
+ set: "localhost" is OK. The absence of a dst_addr implies that all
+ destination addresses are OK:
- www.example.com/(?-i)PaTtErN.* - will match only documents whose path
- starts with "PaTtErN" in exactly this capitalization.
- _________________________________________________________________
+ permit-access localhost
-3.2.2. Actions
+ Allow any host on the same class C subnet as www.privoxy.org access to
+ nothing but www.example.com:
+
+ permit-access www.privoxy.org/24 www.example.com/32
+
+ Allow access from any host on the 26-bit subnet 192.168.45.64 to anywhere,
+ with the exception that 192.168.45.73 may not access
+ www.dirty-stuff.example.com:
+
+ permit-access 192.168.45.64/26
+ deny-access 192.168.45.73 www.dirty-stuff.example.com
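+
+ A further sketch, using hypothetical addresses: allow the whole 24-bit
+ subnet 192.168.1.0 but lock out the single host 192.168.1.99. Since the
+ last match wins, the deny-access line has to follow the permit-access line
+ that it overrides:
+
+ permit-access 192.168.1.0/24
+ deny-access 192.168.1.99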
+
+-------------------------------------------------------------------------------
- Actions are enabled if preceded with a "+", and disabled if preceded
- with a "-". Actions are invoked by enclosing the action name in curly
- braces (e.g. {+some_action}), followed by a list of URLs to which the
- action applies. There are three classes of actions:
+8.4.6. buffer-limit
+
+Specifies:
- * Boolean (e.g. "+/-block"):
- {+name} # enable this action
- {-name} # disable this action
-
- * Parameterized (e.g. "+/-hide-user-agent"):
- {+name{param}} # enable action and set parameter to "param"
- {-name} # disable action
-
- * Multi-value (e.g. "{+/-add-header{Name: value}}",
- "{+/-wafer{name=value}}"):
- {+name{param}} # enable action and add parameter "param"
- {-name{param}} # remove the parameter "param"
- {-name} # disable this action totally
-
- If nothing is specified in this file, no "actions" are taken. So in
- this case JunkBuster would just be a normal, non-blocking,
- non-anonymizing proxy. You must specifically enable the privacy and
- blocking features you need (although the provided default actionsfile
- file will give a good starting point).
+ Maximum size of the buffer for content filtering.
- Later defined actions always over-ride earlier ones. For multi-valued
- actions, the actions are applied in the order they are specified.
+Type of value:
- The list of valid Junkbuster "actions" are:
+ Size in Kbytes
- * Add the specified HTTP header, which is not checked for validity.
- You may specify this many times to specify many different headers:
- +add-header{Name: value}
-
- * Block this URL totally.
- +block
-
- * De-animate all animated GIF images, i.e. reduce them to their last
- frame. This will also shrink the images considerably (in bytes,
- not pixels!). If the option "first" is given, the first frame of
- the animation is used as the replacement. If "last" is given, the
- last frame of the animation is used instead, which propably makes
- more sense for most banner animations, but also has the risk of
- not showing the entire last frame (if it is only a delta to an
- earlier frame).
- +deanimate-gifs{last}
- +deanimate-gifs{first}
-
- * "+downgrade" will downgrade HTTP/1.1 client requests to HTTP/1.0
- and downgrade the responses as well. Use this action for servers
- that use HTTP/1.1 protocol features that Junkbuster doesn't handle
- well yet. HTTP/1.1 is only partially implemented. Default is not
- to downgrade requests.
- +downgrade
-
- * Many sites, like yahoo.com, don't just link to other sites.
- Instead, they will link to some script on their own server, giving
- the destination as a parameter, which will then redirect you to
- the final target. URLs resulting from this scheme typically look
- like: http://some.place/some_script?http://some.where-else.
- Sometimes, there are even multiple consecutive redirects encoded
- in the URL. These redirections via scripts make your web browing
- more traceable, since the server from which you follow such a link
- can see where you go to. Apart from that, valuable bandwidth and
- time is wasted, while your browser ask the server for one redirect
- after the other. Plus, it feeds the advertisers.
- The "+fast-redirects" option enables interception of these
- requests by Junkbuster, who will cut off all but the last valid
- URL in the request and send a local redirect back to your browser
- without contacting the remote site.
- +fast-redirects
-
- * Filter the website through the re_filterfile:
- +filter{filename}
-
- * Block any existing X-Forwarded-for header, and do not add a new
- one:
- +hide-forwarded
-
- * If the browser sends a "From:" header containing your e-mail
- address, this either completely removes the header ("block"), or
- changes it to the specified e-mail address.
- +hide-from{block}
- +hide-from{spam@sittingduck.xqq}
-
- * Don't send the "Referer:" (sic) header to the web site. You can
- block it, forge a URL to the same server as the request (which is
- preferred because some sites will not send images otherwise) or
- set it to a constant string of your choice.
- +hide-referer{block}
- +hide-referer{forge}
- +hide-referer{http://nowhere.com}
-
- * Alternative spelling of "+hide-referer". It has the same
- parameters, and can be freely mixed with, "+hide-referer".
- ("referrer" is the correct English spelling, however the HTTP
- specification has a bug - it requires it to be spelled "referer".)
- +hide-referrer{...}
-
- * Change the "User-Agent:" header so web servers can't tell your
- browser type. Warning! This breaks many web sites. Specify the
- user-agent value you want. Example, pretend to be using Netscape
- on Linux:
- +hide-user-agent{Mozilla (X11; I; Linux 2.0.32 i586)}
-
- * Treat this URL as an image. This only matters if it's also
- "+block"ed, in which case a "blocked" image can be sent rather
- than a HTML page. See "+image-blocker{}" below for the control
- over what is actually sent.
- +image
-
- * Decides what to do with URLs that end up tagged with "{+block
- +image}". There are 4 options. "-image-blocker" will send a HTML
- "blocked" page, usually resulting in a "broken image" icon.
- "+image-blocker{logo}" will send a "JunkBuster" image.
- "+image-blocker{blank}" will send a 1x1 transparent GIF image. And
- finally, "+image-blocker{http://xyz.com}" will send a HTTP
- temporary redirect to the specified image. This has the advantage
- of the icon being being cached by the browser, which will speed up
- the display.
- +image-blocker{logo}
- +image-blocker{blank}
- +image-blocker{http://i.j.b/send-banner}
+Default value:
+
+ 4096
+
+Effect if unset:
+
+ Use a 4MB (4096 KB) limit.
+
+Notes:
+
+ For content filtering, i.e. the +filter and +deanimate-gifs actions, it is
+ necessary that Privoxy buffers the entire document body. This can be
+ potentially dangerous, since a server could just keep sending data
+ indefinitely and wait for your RAM to exhaust -- with nasty consequences.
+ Hence this option.
+
+ When a document buffer size reaches the buffer-limit, it is flushed to the
+ client unfiltered and no further attempt to filter the rest of the document
+ is made. Remember that there may be multiple threads running, which might
+ require up to buffer-limit Kbytes each, unless you have enabled
+ "single-threaded" above.
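+
+ For illustration only, a machine with plenty of RAM might raise the limit
+ to 8 MB (the figure is an arbitrary assumption; the value is given in
+ Kbytes):
+
+ buffer-limit 8192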
+
+-------------------------------------------------------------------------------
+
+8.5. Forwarding
+
+This feature allows routing of HTTP requests through a chain of multiple
+proxies. It can be used to better protect privacy and confidentiality when
+accessing specific domains by routing requests to those domains through an
+anonymous public proxy (see e.g. http://www.multiproxy.org/anon_list.htm), or
+to use a caching proxy to speed up browsing. Chaining to a parent proxy may
+also be necessary because the machine that Privoxy runs on has no direct
+Internet access.
+
+Also specified here are SOCKS proxies. Privoxy supports the SOCKS 4 and SOCKS
+4A protocols.
+
+-------------------------------------------------------------------------------
+
+8.5.1. forward
+
+Specifies:
+
+ To which parent HTTP proxy specific requests should be routed.
+
+Type of value:
+
+ target_domain[:port] http_parent[:port]
+
+ Where target_domain is a domain name pattern (see the chapter on domain
+ matching in the default.action file), http_parent is the address of the
+ parent HTTP proxy as an IP address in dotted decimal notation or as a
+ valid DNS name (or "." to denote "no forwarding"), and the optional port
+ parameters are TCP ports, i.e. integer values from 1 to 65535.
+
+Default value:
+
+ Unset
+
+Effect if unset:
+
+ Don't use parent HTTP proxies.
+
+Notes:
+
+ If http_parent is ".", then requests are not forwarded to another HTTP
+ proxy but are made directly to the web servers.
+
+ Multiple lines are OK; they are checked in sequence, and the last match
+ wins.
+
+Examples:
+
+ Everything goes to an example anonymizing proxy, except SSL on port 443
+ (which it doesn't handle):
+
+ forward .* anon-proxy.example.org:8080
+ forward :443 .
+
+ Everything goes to our example ISP's caching proxy, except for requests to
+ that ISP's sites:
+
+ forward .*. caching-proxy.example-isp.net:8000
+ forward .example-isp.net .
+
+-------------------------------------------------------------------------------
+
+8.5.2. forward-socks4 and forward-socks4a
+
+Specifies:
+
+ Through which SOCKS proxy (and to which parent HTTP proxy) specific
+ requests should be routed.
+
+Type of value:
+
+ target_domain[:port] socks_proxy[:port] http_parent[:port]
+
+ Where target_domain is a domain name pattern (see the chapter on domain
+ matching in the default.action file), http_parent and socks_proxy are IP
+ addresses in dotted decimal notation or valid DNS names (http_parent may be
+ "." to denote "no HTTP forwarding"), and the optional port parameters are
+ TCP ports, i.e. integer values from 1 to 65535.
+
+Default value:
+
+ Unset
+
+Effect if unset:
+
+ Don't use SOCKS proxies.
+
+Notes:
+
+ Multiple lines are OK; they are checked in sequence, and the last match
+ wins.
+
+ The difference between forward-socks4 and forward-socks4a is that in the
+ SOCKS 4A protocol, the DNS resolution of the target hostname happens on the
+ SOCKS server, while in SOCKS 4 it happens locally.
+
+ If http_parent is ".", then requests are not forwarded to another HTTP
+ proxy but are made (HTTP-wise) directly to the web servers, albeit through
+ a SOCKS proxy.
+
+Examples:
+
+ From the company example.com, direct connections are made to all "internal"
+ domains, but everything outbound goes through their ISP's proxy by way of
+ example.com's corporate SOCKS 4A gateway to the Internet.
+
+ forward-socks4a .*. socks-gw.example.com:1080 www-cache.example-isp.net:8080
+ forward .example.com .
+
+ A rule that uses a SOCKS 4 gateway for all destinations but no HTTP parent
+ looks like this:
+
+ forward-socks4 .*. socks-gw.example.com:1080 .
+
+-------------------------------------------------------------------------------
+
+8.5.3. Advanced Forwarding Examples
+
+If you have links to multiple ISPs that provide various special content only to
+their subscribers, you can configure multiple Privoxies which have connections
+to the respective ISPs to act as forwarders to each other, so that your users
+can see the internal content of all ISPs.
+
+Assume that host-a has a PPP connection to isp-a.net. And host-b has a PPP
+connection to isp-b.net. Both run Privoxy. Their forwarding configuration can
+look like this:
+
+host-a:
+
+ forward .*. .
+ forward .isp-b.net host-b:8118
+
+host-b:
+
+ forward .*. .
+ forward .isp-a.net host-a:8118
+
+Now, your users can set their browser's proxy to use either host-a or host-b
+and be able to browse the internal content of both isp-a and isp-b.
+
+If you intend to chain Privoxy and squid locally, then chaining as browser ->
+squid -> privoxy is the recommended way.
+
+Assuming that Privoxy and squid run on the same box, your squid configuration
+could then look like this:
+
+ # Define Privoxy as parent proxy (without ICP)
+ cache_peer 127.0.0.1 parent 8118 7 no-query
+
+ # Define ACL for protocol FTP
+ acl ftp proto FTP
+
+ # Do not forward FTP requests to Privoxy
+ always_direct allow ftp
+
+ # Forward all the rest to Privoxy
+ never_direct allow all
+
+You would then need to change your browser's proxy settings to squid's address
+and port. Squid normally uses port 3128. If unsure, consult http_port in
+squid.conf.
+
+-------------------------------------------------------------------------------
+
+8.6. Windows GUI Options
+
+Privoxy has a number of options specific to the Windows GUI interface:
+
+If "activity-animation" is set to 1, the Privoxy icon will animate when
+"Privoxy" is active. To turn off, set to 0.
+
+ activity-animation 1
+
+
+If "log-messages" is set to 1, Privoxy will log messages to the console window:
+
+ log-messages 1
+
+
+If "log-buffer-size" is set to 1, the size of the log buffer, i.e. the amount
+of memory used for the log messages displayed in the console window, will be
+limited to "log-max-lines" (see below).
+
+Warning: Setting this to 0 will cause the buffer to grow without limit and eat
+up all your memory!
+
+ log-buffer-size 1
+
+
+log-max-lines is the maximum number of lines held in the log buffer. See above.
+
+ log-max-lines 200
+
+
+If "log-highlight-messages" is set to 1, Privoxy will highlight portions of the
+log messages with a bold-faced font:
+
+ log-highlight-messages 1
+
+
+The font used in the console window:
+
+ log-font-name Comic Sans MS
+
+
+Font size used in the console window:
+
+ log-font-size 8
+
+
+"show-on-task-bar" controls whether or not Privoxy will appear as a button on
+the Task bar when minimized:
+
+ show-on-task-bar 0
+
+
+If "close-button-minimizes" is set to 1, the Windows close button will minimize
+Privoxy instead of closing the program (close with the exit option on the File
+menu).
+
+ close-button-minimizes 1
+
+
+The "hide-console" option is specific to the MS-Win console version of Privoxy.
+If this option is used, Privoxy will disconnect from and hide the command
+console.
+
+ #hide-console
+
+
+-------------------------------------------------------------------------------
+
+9. Actions Files
+
+The actions files are used to define what actions Privoxy takes for which URLs,
+and thus determine how ad images, cookies and various other aspects of HTTP
+content and transactions are handled, and on which sites (or even parts
+thereof). There are three such files included with Privoxy (as of version
+2.9.15), with differing purposes:
+
+ * standard.action - is used by the web based editor, to set various
+ pre-defined sets of rules for the default actions section in
+ default.action. These have increasing levels of aggressiveness. Editing
+ this file is not recommended.
+
+ * default.action - is the primary action file that sets the initial values
+ for all actions. It is intended to provide a base level of functionality
+ for Privoxy's array of features. So it is a set of broad rules that should
+ work reasonably well for users everywhere. This is the file that the
+ developers are keeping updated, and making available to users.
+
+ * user.action - is intended to be for local site preferences and exceptions.
+ As an example, if your ISP or your bank has specific requirements and needs
+ special handling, this kind of thing should go here. This file will not be
+ upgraded.
+
+The list of actions files to be used is defined in the main configuration
+file, and are processed in the order they are defined. The content of these can
+all be viewed and edited from http://config.privoxy.org/show-status.
+
+An actions file typically has sections. Near the top, "aliases" are optionally
+defined (discussed below), then the default set of rules which will apply
+universally to all sites and pages. And then below that, exceptions to the
+defined universal policies.
+
+Actions can be used to block anything you want, including ads, banners, or just
+some obnoxious URL that you would rather not see. Cookies can be accepted or
+rejected, or accepted only during the current browser session (i.e. not written
+to disk), content can be modified, JavaScripts tamed, user-tracking fooled, and
+much more. See below for a complete list of actions.
+
+-------------------------------------------------------------------------------
+
+9.1. Finding the Right Mix
+
+Note that some actions, like cookie suppression or script disabling, may render
+some sites unusable that rely on these techniques to work properly. Finding the
+right mix of actions is not always easy and certainly a matter of personal
+taste. In general, it can be said that the more "aggressive" your default
+settings (in the top section of the actions file) are, the more exceptions for
+"trusted" sites you will have to make later. If, for example, you want to kill
+popup windows per default, you'll have to make exceptions from that rule for
+sites that you regularly use and that require popups for actually useful
+content, like maybe your bank, favorite shop, or newspaper.
+
+We have tried to provide you with reasonable rules to start from in the
+distribution actions files. But there is no general rule of thumb on these
+things. There just are too many variables, and sites are constantly changing.
+Sooner or later you will want to change the rules (and read this chapter again
+:).
+
+-------------------------------------------------------------------------------
+
+9.2. How to Edit
+
+The easiest way to edit the "actions" files is with a browser by using our
+browser-based editor, which can be reached from
+http://config.privoxy.org/show-status.
+
+If you prefer plain text editing to GUIs, you can of course also directly edit
+the actions files.
+
+-------------------------------------------------------------------------------
+
+9.3. How Actions are Applied to URLs
+
+Actions files are divided into sections. There are special sections, like the
+"alias" sections which will be discussed later. For now let's concentrate on
+regular sections: They have a heading line (often split up to multiple lines
+for readability) which consists of a list of actions, separated by whitespace
+and enclosed in curly braces. Below that, there is a list of URL patterns, each
+on a separate line.
+
+To determine which actions apply to a request, the URL of the request is
+compared to all patterns in this file. Every time it matches, the list of
+applicable actions for the URL is incrementally updated, using the heading of
+the section in which the pattern is located. If multiple matches for the same
+URL set the same action differently, the last match wins. If not, the effects
+are aggregated (e.g. a URL might match both the "+handle-as-image" and "+block"
+actions).
+
+You can trace this process by visiting http://config.privoxy.org/show-url-info.
+
+More detail on this is provided in the Appendix, Anatomy of an Action.
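+
+As a rough sketch of this process (the patterns and the choice of actions
+below are illustrative assumptions, not taken from the distributed actions
+files), consider three sections:
+
+ {+block}
+ .example.com
+
+ {+handle-as-image}
+ .example.com/.*\.gif
+
+ {-block}
+ www.example.com
+
+Under the rules described above, a request for ads.example.com/banner.gif
+matches the first two sections, so both "+block" and "+handle-as-image" apply.
+A request for www.example.com/logo.gif matches all three; the last section
+sets "-block" after the first one set "+block", so the request is not blocked,
+while "+handle-as-image" is unaffected.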
+
+-------------------------------------------------------------------------------
+
+9.4. Patterns
+
+Generally, a pattern has the form <domain>/<path>, where both the <domain> and
+<path> are optional. (This is why the pattern / matches all URLs).
+
+www.example.com/
+
+ is a domain-only pattern and will match any request to www.example.com,
+ regardless of which document on that server is requested.
+
+www.example.com
+
+ means exactly the same. For domain-only patterns, the trailing / may be
+ omitted.
+
+www.example.com/index.html
+
+ matches only the single document /index.html on www.example.com.
+
+/index.html
+
+ matches the document /index.html, regardless of the domain, i.e. on any web
+ server.
+
+index.html
+
+ matches nothing, since it would be interpreted as a domain name and there
+ is no top-level domain called .html.
+
+-------------------------------------------------------------------------------
+
+9.4.1. The Domain Pattern
+
+The matching of the domain part offers some flexible options: if the domain
+starts or ends with a dot, it becomes unanchored at that end. For example:
+
+.example.com
+
+ matches any domain that ENDS in .example.com
+
+www.
+
+ matches any domain that STARTS with www.
+
+.example.
+
+ matches any domain that CONTAINS .example. (Correctly speaking: It matches
+ any FQDN that contains example as a domain.)
+
+Additionally, there are wild-cards that you can use in the domain names
+themselves. They work much like shell wild-cards: "*" stands for zero
+or more arbitrary characters, "?" stands for any single character, you can
+define character classes in square brackets and all of that can be freely
+mixed:
+
+ad*.example.com
+
+ matches "adserver.example.com", "ads.example.com", etc., but not
+ "sfads.example.com"
+
+*ad*.example.com
+
+ matches all of the above, and then some.
+
+.?pix.com
+
+ matches www.ipix.com, pictures.epix.com, a.b.c.d.e.upix.com etc.
+
+www[1-9a-ez].example.c*
+
+ matches www1.example.com, www4.example.cc, wwwd.example.cy,
+ wwwz.example.com etc., but not wwww.example.com.
+
+-------------------------------------------------------------------------------
+
+9.4.2. The Path Pattern
+
+Privoxy uses Perl compatible regular expressions (through the PCRE library) for
+matching the path.
+
+There is an Appendix with a brief quick-start into regular expressions, and
+full (very technical) documentation on PCRE regex syntax is available on-line
+at http://www.pcre.org/man.txt. You might also find the Perl man page on
+regular expressions (man perlre) useful, which is available on-line at
+http://www.perldoc.com/perl5.6/pod/perlre.html.
+
+Note that the path pattern is automatically left-anchored at the "/", i.e. it
+matches as if it started with a "^" (regular expression speak for the
+beginning of a line).
+
+Please also note that matching in the path is case INSENSITIVE by default, but
+you can switch to case sensitive at any point in the pattern by using the
+"(?-i)" switch: www.example.com/(?-i)PaTtErN.* will match only documents whose
+path starts with PaTtErN in exactly this capitalization.
+
+-------------------------------------------------------------------------------
+
+9.5. Actions
+
+All actions are disabled by default, until they are explicitly enabled
+somewhere in an actions file. Actions are turned on if preceded with a "+", and
+turned off if preceded with a "-". So a "+action" means "do that action", e.g.
+"+block" means "please block the following URL patterns".
+
+Actions are invoked by enclosing the action name in curly braces (e.g.
+{+some_action}), followed by a list of URLs (or patterns that match URLs) to
+which the action applies. There are three classes of actions:
+
+ * Boolean, i.e. the action can only be "on" or "off". Examples:
+
+ {+name} # enable this action
+ {-name} # disable this action
+
+
+ * Parameterized, e.g. "+/-hide-user-agent{ Mozilla 1.0 }", where some value
+ is required in order to enable this type of action. Examples:
+
+ {+name{param}} # enable action and set parameter to "param"
+ {-name} # disable action (the parameter can be omitted)
+
+
+ * Multi-value, e.g. "{+/-add-header{Name: value}}" or
+ "{+/-send-wafer{name=value}}", where some value needs to be defined in
+ addition to simply enabling the action. Examples:
+
+ {+name{param=value}} # enable action and set "param" to "value"
+ {-name{param=value}} # remove the parameter "param" completely
+ {-name} # disable this action totally and remove param too
+
+
+If nothing is specified in any actions file, no "actions" are taken. So in this
+case Privoxy would just be a normal, non-blocking, non-anonymizing proxy. You
+must specifically enable the privacy and blocking features you need (although
+the provided default actions files will give a good starting point).
+
+Later defined actions always over-ride earlier ones. So exceptions to any rules
+you make should come in the latter part of the file (or in a file that is
+processed later when using multiple actions files). For multi-valued actions,
+the actions are applied in the order they are specified. Actions files are
+processed in the order they are defined in config (the default installation has
+three actions files). It is also quite possible for any given URL pattern to
+match more than one action!
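+
+A minimal sketch of such an exception (the host names are placeholders): the
+first section blocks a whole domain, and the later section switches blocking
+off again for a single host within it.
+
+ {+block}
+ .example.com
+
+ # Defined later, so it wins for this one host:
+ {-block}
+ www.example.com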
+
+The valid Privoxy "actions" are:
+
+-------------------------------------------------------------------------------
+
+9.5.1. +add-header
+
+Type:
+
+ Multi-value.
+
+Typical uses:
+
+ Send a user defined HTTP header to the web server.
+
+Possible values:
+
+ Any value is possible. Validity of the defined HTTP headers is not checked.
+
+Example usage:
+
+ {+add-header{X-User-Tracking: sucks}}
+ .example.com
+
+
+Notes:
+
+ This action may be specified multiple times, in order to define multiple
+ headers. This is rarely needed for the typical user. If you don't know what
+ "HTTP headers" are, you definitely don't need to worry about this one.
+
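+ A sketch of defining two headers at once (both header names and values are
+ invented for illustration):
+
+ {+add-header{X-Example-One: first} +add-header{X-Example-Two: second}}
+ .example.com
+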
+-------------------------------------------------------------------------------
+
+9.5.2. +block
+
+Type:
+
+ Boolean.
+
+Typical uses:
+
+ Used to block a URL from reaching your browser. Any URL may be blocked, but
+ this action is typically used to block ads or other obnoxious content.
+
+Possible values:
+
+ N/A
+
+Example usage:
+
+ {+block}
+ .banners.example.com
+ .ads.r.us
+
+
+Notes:
+
+ If a URL matches one of the blocked patterns, Privoxy will intercept the
+ URL and display its special "BLOCKED" page instead. If there is sufficient
+ space, a large red banner will appear with a friendly message about why the
+ page was blocked, and a way to go there anyway. If there is insufficient
+ space a smaller "BLOCKED" page will appear without the red banner. Click
+ here to view the default blocked HTML page (Privoxy must be running for
+ this to work as intended!).
+
+ A very important exception is if the URL matches both "+block" and
+ "+handle-as-image", then it will be handled by "+set-image-blocker" (see
+ below). It is important to understand this process, in order to understand
+ how Privoxy is able to deal with ads and other objectionable content.
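+
+ A sketch of that combination, using a placeholder ad server name. Both
+ actions match the same URL, so the request is answered according to
+ "+set-image-blocker" rather than with the "BLOCKED" page:
+
+ {+block +handle-as-image}
+ .ads.example.com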
+
+ The "+filter" action can also perform some of the same functionality as
+ "+block", but by virtue of very different programming techniques, and is
+ most often used for different reasons.
+
+-------------------------------------------------------------------------------
+
+9.5.3. +deanimate-gifs
+
+Type:
+
+ Parameterized.
+
+Typical uses:
+
+ To stop those annoying, distracting animated GIF images.
+
+Possible values:
+
+ "last" or "first"
+
+Example usage:
+
+ {+deanimate-gifs{last}}
+ .example.com
+
+
+Notes:
+
+ De-animate all animated GIF images, i.e. reduce them to a single frame.
+ This will also shrink the images considerably (in bytes, not pixels!). If
+ the option "first" is given, the first frame of the animation is used as
+ the replacement. If "last" is given, the last frame of the animation is
+ used instead, which probably makes more sense for most banner animations,
+ but also has the risk of not showing the entire last frame (if it is only a
+ delta to an earlier frame).
+
+-------------------------------------------------------------------------------
+
+9.5.4. +downgrade-http-version
+
+Type:
+
+ Boolean.
+
+Typical uses:
+
+ "+downgrade-http-version" will downgrade HTTP/1.1 client requests to
+ HTTP/1.0 and downgrade the responses as well.
+
+Possible values:
+
+ N/A
+
+Example usage:
+
+ {+downgrade-http-version}
+ .example.com
+
+
+Notes:
+
+ Use this action for servers that use HTTP/1.1 protocol features that
+ Privoxy doesn't handle well yet. HTTP/1.1 is only partially implemented.
+ Default is not to downgrade requests. This is an infrequently needed
+ action, and is used to help with rare problem sites only.
+
+-------------------------------------------------------------------------------
+
+9.5.5. +fast-redirects
+
+Type:
+
+ Boolean.
+
+Typical uses:
+
+ The "+fast-redirects" action enables interception of "redirect" requests
+ from one server to another, which are used to track users. Privoxy can cut
+ off all but the last valid URL in a redirect request and send a local
+ redirect back to your browser without contacting the intermediate site(s).
+
+Possible values:
+
+ N/A
+
+Example usage:
+
+ {+fast-redirects}
+ .example.com
+
+
+Notes:
+
+ Many sites, like yahoo.com, don't just link to other sites. Instead, they
+ will link to some script on their own server, giving the destination as a
+ parameter, which will then redirect you to the final target. URLs resulting
+ from this scheme typically look like:
+ http://some.place/some_script?http://some.where-else.
+
+ Sometimes, there are even multiple consecutive redirects encoded in the
+ URL. These redirections via scripts make your web browsing more traceable,
+ since the server from which you follow such a link can see where you go to.
+ Apart from that, valuable bandwidth and time are wasted while your browser
+ asks the server for one redirect after the other. Plus, it feeds the
+ advertisers.
+
+ This feature is normally "on", and often requires exceptions for sites
+ that are sensitive to defeating this mechanism.
+
+-------------------------------------------------------------------------------
+
+9.5.6. +filter
+
+Type:
+
+ Parameterized.
+
+Typical uses:
+
+ Apply page filtering as defined by named sections of the default.filter
+ file to the specified site(s). "Filtering" can be any modification of the
+ raw page content, including re-writing or deletion of content.
+
+Possible values:
+
+ "+filter" must include the name of one of the section identifiers from
+ default.filter (or whatever filterfile is specified in config).
+
+Example usage (from the current default.filter):
+
+ +filter{html-annoyances}: Get rid of particularly annoying HTML abuse.
+
+ +filter{js-annoyances}: Get rid of particularly annoying JavaScript abuse
+
+ +filter{content-cookies}: Kill cookies that come in the HTML or JS content
+
+ +filter{popups}: Kill all popups in JS and HTML
+
+ +filter{frameset-borders}: Give frames a border and make them resizable
+
+ +filter{webbugs}: Squish WebBugs (1x1 invisible GIFs used for user
+ tracking)
+
+ +filter{refresh-tags}: Kill automatic refresh tags (for dial-on-demand
+ setups)
+
+ +filter{fun}: Text replacements for subversive browsing fun!
+
+ +filter{nimda}: Remove Nimda (virus) code.
+
+ +filter{banners-by-size}: Kill banners by size (very efficient!)
+
+ +filter{shockwave-flash}: Kill embedded Shockwave Flash objects
+
+ +filter{crude-parental}: Kill all web pages that contain the words "sex" or
+ "warez"
+
+Notes:
+
+ This is potentially a very powerful feature! It requires a knowledge of
+ regular expressions if you want to "roll your own". Filtering operates on a
+ line by line basis throughout the entire page.
+
+ Filtering requires buffering the page content, which may appear to slow
+ down page rendering since nothing is displayed until all content has passed
+ the filters. (It does not really take longer, but seems that way since the
+ page is not incrementally displayed.) This effect will be more noticeable
+ on slower connections.
+
+ Filtering can achieve some of the same effects as the "+block" action, i.e.
+ it can be used to block ads and banners. In the overall scheme of things,
+ filtering is one of the first things "Privoxy" does with a web page. So
+ most other actions are applied to the already "filtered" page.
+
+-------------------------------------------------------------------------------
+
+9.5.7. +hide-forwarded-for-headers
+
+Type:
+
+ Boolean.
+
+Typical uses:
+
+ Block any existing X-Forwarded-for HTTP header, and do not add a new one.
+
+Possible values:
+
+ N/A
+
+Example usage:
+
+ {+hide-forwarded-for-headers}
+ .example.com
+
+
+Notes:
+
+ It is fairly safe to leave this on. It does not seem to break many sites.
+
+-------------------------------------------------------------------------------
+
+9.5.8. +hide-from-header
+
+Type:
+
+ Parameterized.
+
+Typical uses:
+
+ To block the browser from sending your email address in a "From:" header.
+
+Possible values:
+
+ Keyword: "block", or any user defined value.
+
+Example usage:
+
+ {+hide-from-header{block}}
+ .example.com
+
+
+Notes:
+
+ The keyword "block" will completely remove the header (not to be confused
+ with the "+block" action). Alternately, you can specify any value you
+ prefer to send to the web server.
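+
+ For instance, to send a fixed placeholder address instead of removing the
+ header (the address below is made up for illustration):
+
+ {+hide-from-header{nobody@example.org}}
+ .example.com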
+
+-------------------------------------------------------------------------------
+
+9.5.9. +hide-referer
+
+Type:
+
+ Parameterized.
+
+Typical uses:
+
+ Don't send the "Referer:" (sic) HTTP header to the web site. Alternatively,
+ send a forged header instead.
+
+Possible values:
+
+ Prevent the header from being sent with the keyword "block". Or use "forge"
+ to send a URL from the same server as the request. Or set it to a user
+ defined value of your choice.
+
+Example usage:
+
+ {+hide-referer{forge}}
+ .example.com
+
+
+Notes:
+
+ "forge" is the preferred option here, since some servers will not send
+ images back otherwise.
+
+ "+hide-referrer" is an alternate spelling of "+hide-referer". It has the
+ exact same parameters, and can be freely mixed with "+hide-referer".
+ ("referrer" is the correct English spelling, however the HTTP specification
+ has a bug - it requires it to be spelled as "referer".)
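+
+ A sketch of the three kinds of value side by side (the host names and the
+ fixed URL are placeholders):
+
+ {+hide-referer{block}}
+ .example.com
+
+ {+hide-referer{forge}}
+ .example.net
+
+ {+hide-referer{http://www.example.org/}}
+ .example.org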
+
+-------------------------------------------------------------------------------
+
+9.5.10. +hide-user-agent
+
+Type:
+
+ Parameterized.
+
+Typical uses:
+
+ To change the "User-Agent:" header so web servers can't tell your browser
+ type. Whose business is it anyway?
+
+Possible values:
+
+ Any user defined string.
+
+Example usage:
+
+ {+hide-user-agent{Netscape 6.1 (X11; I; Linux 2.4.18 i686)}}
+ .msn.com
+
+
+Notes:
+
+ Warning! This breaks many web sites that depend on this in order to
+ determine how the target browser will respond to various requests. Use with
+ caution.
+
+-------------------------------------------------------------------------------
+
+9.5.11. +handle-as-image
+
+Type:
+
+ Boolean.
+
+Typical uses:
+
+ To define what Privoxy should automatically treat as an image. This is an
+ important ingredient of how ads are handled.
+
+Possible values:
+
+ N/A
+
+Example usage:
+
+ {+handle-as-image}
+ /.*\.(gif|jpg|jpeg|png|bmp|ico)
+
+
+Notes:
+
+ This only has meaning if the URL (or pattern) also is "+block"ed, in which
+ case a user definable image can be sent rather than an HTML page. This is
+ integral to the whole concept of ad blocking: the URL must match both a
+ "+block" rule, and "+handle-as-image". (See "+set-image-blocker" below for
+ control over what will actually be displayed by the browser.)
+
+ There is little reason to change the default definition for this action.
+
+-------------------------------------------------------------------------------
+
+9.5.12. +set-image-blocker
+
+Type:
+
+ Parameterized.
+
+Typical uses:
+
+ Decide what to do with URLs that end up tagged with both "+block" and
+ "+handle-as-image", e.g. an advertisement.
+
+Possible values:
+
+ There are four available options: "-set-image-blocker" will send an HTML
+ "blocked" page, usually resulting in a "broken image" icon.
+ "+set-image-blocker{blank}" will send a 1x1 transparent GIF image.
+ "+set-image-blocker{pattern}" will send a checkerboard type pattern (the
+ default). And finally, "+set-image-blocker{http://xyz.com}" will send an
+ HTTP temporary redirect to the specified image. This has the advantage of
+ the icon being cached by the browser, which will speed up the display.
+
+Example usage:
+
+ {+set-image-blocker{blank}}
+ .example.com
+
+
+Notes:
+
+ If you want invisible ads, their URLs need to match both "+block" and
+ "+handle-as-image". Then "+set-image-blocker" should be set to
+ "blank" for invisibility. Note you cannot treat HTML pages as images in
+ most cases. For instance, frames require an HTML page to display. So a
+ frame that is an ad, typically cannot be treated as an image. Forcing an
+ "image" in this situation just will not work reliably.
+
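+ A sketch of the invisible-ad setup described above (the ad server name is
+ a placeholder):
+
+ {+block +handle-as-image +set-image-blocker{blank}}
+ .ads.example.com
+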
+-------------------------------------------------------------------------------
+
+9.5.13. +limit-connect
+
+Type:
+
+ Parameterized.
+
+Typical uses:
+
+ By default, Privoxy only allows HTTP CONNECT requests to port 443 (the
+ standard, secure HTTPS port). Use "+limit-connect" to disable this
+ altogether, or to allow more ports.
+
+Possible values:
+
+ Any valid port number, or port number range.
+
+Example usages:
+
+ # This is the default and need not be specified:
+ +limit-connect{443}
+ # Ports 80 and 443 are OK:
+ +limit-connect{80,443}
+ # Ports less than 3, 7, 20 to 100 and above 500 are OK:
+ +limit-connect{-3, 7, 20-100, 500-}
+
+
+Notes:
+
+ The CONNECT method exists in HTTP to allow access to secure websites
+ (https:// URLs) through proxies. It works very simply: the proxy connects
+ to the server on the specified port, and then short-circuits its
+ connections to the client and to the remote server. This can be a big
+ security hole, since CONNECT-enabled proxies can be abused as TCP relays
+ very easily.
+
+ If you want to allow CONNECT for more ports than this, or want to forbid
+ CONNECT altogether, you can specify a comma separated list of ports and
+ port ranges (the latter using dashes, with the minimum defaulting to 0 and
+ max to 65K).
+
+ If you don't know what any of this means, there probably is no reason to
+ change this one.
+
+-------------------------------------------------------------------------------
+
+9.5.14. +prevent-compression
+
+Type:
+
+ Boolean.
+
+Typical uses:
+
+ Prevent the specified websites from compressing HTTP data.
+
+Possible values:
+
+ N/A
+
+Example usage:
+
+ {+prevent-compression}
+ .example.com
+
+
+Notes:
+
+ Some websites compress their data, which can be a problem for Privoxy, since
+ "+filter", "+kill-popups" and "+deanimate-gifs" will not work on compressed
+ data. Preventing compression will slow down connections to those websites,
+ though. The default typically is to turn "+prevent-compression" on.
+
+-------------------------------------------------------------------------------
+
+9.5.15. +session-cookies-only
+
+Type:
+
+ Boolean.
+
+Typical uses:
+
+ Allow cookies for the current browser session only.
+
+Possible values:
+
+ N/A
+
+Example usage (disabling):
+
+ {-session-cookies-only}
+ .example.com
+
+
+Notes:
+
+ If websites set cookies, "+session-cookies-only" will make sure they are
+ erased when you exit and restart your web browser. This makes profiling
+ cookies useless, but won't break sites which require cookies so that you
+ can log in for transactions. This is generally turned on for all sites, and
+ is the recommended setting.
+
+ "+prevent-*-cookies" actions should be turned off as well (see below), for
+ "+session-cookies-only" to work. Otherwise, no cookies will get through at
+ all. For "persistent" cookies that survive across browser sessions, see
+ below as well.
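+
+ A sketch of how the three cookie actions fit together, assuming a
+ placeholder shop site that needs persistent cookies:
+
+ # Session cookies only, for all sites:
+ {-prevent-setting-cookies -prevent-reading-cookies +session-cookies-only}
+ /
+
+ # Persistent cookies for one site: all three cookie actions are off.
+ {-prevent-setting-cookies -prevent-reading-cookies -session-cookies-only}
+ .shop.example.com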
+
+-------------------------------------------------------------------------------
+
+9.5.16. +prevent-reading-cookies
+
+Type:
+
+ Boolean.
+
+Typical uses:
+
+ Explicitly prevent the web server from reading any cookies on your system.
+
+Possible values:
+
+ N/A
+
+Example usage:
+
+ {+prevent-reading-cookies}
+ .example.com
+
+
+Notes:
+
+ Often used in conjunction with "+prevent-setting-cookies" to disable
+ cookies completely. Note that "+session-cookies-only" requires these to
+ both be disabled (or else it never gets any cookies to cache).
+
+ For "persistent" cookies to work (i.e. they survive across browser sessions
+ and reboots), all three cookie settings should be "off" for the specified
+ sites.
+
+-------------------------------------------------------------------------------
+
+9.5.17. +prevent-setting-cookies
+
+Type:
+
+ Boolean.
+
+Typical uses:
+
+ Explicitly block the web server from storing cookies on your system.
+
+Possible values:
+
+ N/A
+
+Example usage:
+
+ {+prevent-setting-cookies}
+ .example.com
+
+
+Notes:
+
+ Often used in conjunction with "+prevent-reading-cookies" to disable
+ cookies completely (see above).
+
+-------------------------------------------------------------------------------
+
+9.5.18. +kill-popups
+
+Type:
+
+ Boolean.
+
+Typical uses:
+
+ Stop those annoying JavaScript pop-up windows!
+
+Possible values:
+
+ N/A
+
+Example usage:
+
+ {+kill-popups}
+ .example.com
+
+
+Notes:
+
+ "+kill-popups" uses a built in filter to disable pop-ups that use the
+ window.open() function, etc. This is one of the first actions processed by
+ Privoxy as it contacts the remote web server. This action is not always
+ 100% reliable, and is supplemented by "+filter{popups}".
+
+-------------------------------------------------------------------------------
+
+9.5.19. +send-vanilla-wafer
+
+Type:
+
+ Boolean.
+
+Typical uses:
+
+ Sends a cookie for every site stating that you do not accept any copyright
+ on cookies sent to you, and asking them not to track you.
+
+Possible values:
+
+ N/A
+
+Example usage:
+
+ {+send-vanilla-wafer}
+ .example.com
+
+
+Notes:
+
+ This action only applies if you are using a jarfile for saving cookies. Of
+ course, this is a (relatively) unique header and could conceivably be used
+ to track you.
+
+-------------------------------------------------------------------------------
+
+9.5.20. +send-wafer
+
+Type:
+
+ Multi-value.
+
+Typical uses:
+
+ This allows you to send an arbitrary, user definable cookie.
+
+Possible values:
+
+ User specified cookie name and corresponding value.
+
+Example usage:
+
+ {+send-wafer{name=value}}
+ .example.com
+
+
+Notes:
+
+ This can be specified multiple times in order to add as many cookies as you
+ like.
+
+-------------------------------------------------------------------------------
+
+9.5.21. Summary
+
+Note that many of these actions have the potential to cause a page to
+misbehave, possibly even not to display at all. There are many ways a site
+designer may choose to design his site, and what HTTP header content, and other
+criteria, he may depend on. There is no way to have hard and fast rules for all
+sites. See the Appendix for a brief example on troubleshooting actions.
+
+-------------------------------------------------------------------------------
+
+9.5.22. Sample Actions Files
+
+Remember that the meaning of any of the above references is reversed by
+preceding the action with a "-", in place of the "+". Also, some actions
+are turned on in the default section of the actions file, and require little to
+no additional configuration. These are just "on".
+
+But, other actions that are turned on in the default section do typically
+require exceptions to be listed in the latter sections of one of our actions
+files. For instance, by default no URLs are "blocked" (i.e. in the default
+definitions of default.action). We need exceptions to this in order to enable
+ad blocking in the lower sections. But we need to be very selective about what
+we do block. Thus, the default is "off" for blocking.
+
+Below is a liberally commented sample default.action file to demonstrate how
+all the pieces come together, and to show how exceptions to the default
+policies can be handled. This is followed by a brief user.action with similar
+examples.
+
+# Sample default.action file <developers@privoxy.org>
+
+# Settings -- Don't change! For internal Privoxy use ONLY.
+{{settings}}
+for-privoxy-version=3.0
+
+
+##########################################################################
+# Aliases must be defined *before* they are used. These are
+# easier to remember, and can combine several actions into one. Once
+# defined they can be used just like any built-in action -- but within
+# this file only! Aliases do not require a + or - sign.
+##########################################################################
+
+# Some useful aliases.
+{{alias}}
+# Alias to turn off cookie handling, i.e. allow all cookies unmolested.
+ -prevent-cookies = -prevent-setting-cookies -prevent-reading-cookies \
+ -session-cookies-only
+
+# Alias to both block and treat as if an image for ad blocking
+# purposes.
+ +imageblock = +block +handle-as-image
+
+# Fragile sites should have the minimum changes:
+ fragile = -block -deanimate-gifs -fast-redirects -filter -hide-referer \
+ -prevent-cookies -kill-popups
+
+# Shops should be allowed to set persistent cookies
+ shop = -filter -prevent-cookies -session-cookies-only
+
+
+##########################################################################
+# Begin default action settings. Anything in this section will match
+# all URLs -- UNLESS we have exceptions that also match, defined below this
+# section. We will show all potential actions here whether they are on
+# or off. We could omit any disabled action if we wanted, since all
+# actions are 'off' by default anyway. Shown for completeness only.
+# Actions are enabled if preceded by a '+', otherwise they are disabled
+# (unless an alias has been defined without this).
+##########################################################################
+ { \
+ -add-header \
+ -block \
+ -deanimate-gifs \
+ -downgrade-http-version \
+ +fast-redirects \
+ +filter{html-annoyances} \
+ +filter{js-annoyances} \
+ -filter{content-cookies} \
+ -filter{popups} \
+ +filter{webbugs} \
+ -filter{refresh-tags} \
+ -filter{fun} \
+ +filter{nimda} \
+ +filter{banners-by-size} \
+ -filter{shockwave-flash} \
+ -filter{crude-parental} \
+ +hide-forwarded-for-headers \
+ +hide-from-header{block} \
+ -hide-referrer \
+ -hide-user-agent \
+ -handle-as-image \
+ +set-image-blocker{pattern} \
+ -limit-connect \
+ +prevent-compression \
+ -session-cookies-only \
+ -prevent-reading-cookies \
+ -prevent-setting-cookies \
+ -kill-popups \
+ -send-vanilla-wafer \
+ -send-wafer \
+ }
+ / # forward slash will match *all* potential URL patterns.
+
+##########################################################################
+# Default behavior is now set. Now we will define some exceptions to our
+# default action policies.
+##########################################################################
+
+# These sites are very complex and require very minimal interference.
+# We'll disable most actions with our 'fragile' alias:
+ { fragile }
+ .office.microsoft.com # surprise, surprise!
+ .windowsupdate.microsoft.com
+
+
+# Shopping sites - not as fragile but require some special
+# handling. We still want to block ads, and we will allow
+# persistent cookies via the 'shop' alias:
+ { shop }
+ .quietpc.com
+ .worldpay.com # for quietpc.com
+ .jungle.com
+ .scan.co.uk
+
+
+# These sites require pop-ups too :( We'll combine our 'shop'
+# alias with two other actions into one rule to allow all popups.
+ { shop -kill-popups -filter{popups} }
+ .dabs.com
+ .overclockers.co.uk
+
+
+# The 'Fast-redirects' action breaks some sites. Disable this action
+# for these known sensitive sites:
+ { -fast-redirects }
+ login.yahoo.com
+ edit.europe.yahoo.com
+ .google.com
+ .altavista.com/.*(like|url|link):http
+ .altavista.com/trans.*urltext=http
+ .nytimes.com
+
+
+# Define which file types will be treated as images. Important
+# for ad blocking.
+ { +handle-as-image }
+ /.*\.(gif|jpe?g|png|bmp|ico)
+
+
+# Now let's list some domains that are known ad generators. And
+# our alias that we use here will block these as well as force
+# them to be treated as images. This combination of actions is
+# important for ad blocking. What the browser will show instead is
+# determined by the setting of "+set-image-blocker"
+ { +imageblock }
+ ar.atwola.com
+ .ad.doubleclick.net
+ .a.yimg.com/(?:(?!/i/).)*$
+ .a[0-9].yimg.com/(?:(?!/i/).)*$
+ bs*.gsanet.com
+ bs*.einets.com
+ .qkimg.net
+ ad.*.doubleclick.net
+
+
+# These will just simply be blocked. They will generate the BLOCKED
+# banner page, if matched. Heavy use of wildcards and regular
+# expressions in this example. Enable block action:
+ { +block }
+ ad*.
+ .*ads.
+ banner?.
+ count*.
+ /.*count(er)?\.(pl|cgi|exe|dll|asp|php[34]?)
+ /(?:.*/)?(publicite|werbung|rekla(ma|me|am)|annonse|maino(kset|nta|s)?)/
+ .hitbox.com
+
+
+# The above block section will probably inadvertently catch some
+# sites we DO NOT want blocked via the wildcards and regular expressions.
+# Now let's set exceptions to the exceptions so the good guys get better
+# treatment. Disable block action:
+ { -block }
+ advogato.org
+ adsl.
+ ad[ud]*.
+ advice.
+# Let's just trust all .edu top level domains.
+ .edu
+ www.ugu.com/sui/ugu/adv
+# We'll need access to path names containing 'download'
+ .*downloads.
+ /downloads/
+# 'adv' is for globalintersec and means advanced, not advertisement
+ www.globalintersec.com/adv
+
+
+# Don't filter *anything* from our friends at sourceforge.
+# Notice we don't have to name the individual filter
+# identifiers -- we just turn them all off in one fell swoop.
+# Disable all filters for this one site:
+ { -filter }
+ .sourceforge.net
+
+
+So far we are painting with a broad brush by setting general policies. The
+above would be a reasonable starting point for many situations. Now, we want to
+be more specific and have customized rules that are more suitable to our
+personal habits and preferences. These would be for narrowly defined situations
+like your ISP or your bank, and should be placed in user.action, which is
+parsed after all other actions files and should not be clobbered by upgrades.
+So any settings here will have the last word and over-ride any previously
+defined actions.
+
+Now a few examples of some things that one might do with a user.action file.
+
+# Sample user.action file.
+
+# Any aliases you want to use need to be re-defined here.
+{{alias}}
+# Alias to turn off cookie handling, i.e. allow all cookies unmolested.
+ -prevent-cookies = -prevent-setting-cookies -prevent-reading-cookies \
+ -session-cookies-only
+
+# Fragile sites should have the minimum changes:
+ fragile = -block -deanimate-gifs -fast-redirects -filter -hide-referer \
+ -prevent-cookies -kill-popups
+
+# Allow persistent cookies for a few regular sites that we
+# trust via our above alias. These will be saved from one browser session
+# to the next. We are explicitly turning off any and all cookie handling,
+# even though the prevent-*-cookie settings were disabled in our above
+# default.action anyway. So cookies from these domains will come through
+# unmolested.
+ { -prevent-cookies }
+ .sun.com
+ .yahoo.com
+ .msdn.microsoft.com
+ .redhat.com
+
+
+# My ISP uses obnoxious self promoting images on many pages.
+# Nuke them :) Note that "+handle-as-image" need not be specified,
+# since all URLs ending in .gif will be tagged as images by the
+# general rules in default.action anyway.
+ { +block }
+ www.my-isp-example.com/logo[0-9].gif
+
+
+# Say the site where you do your homebanking needs to open
+# popup windows, but you have chosen to kill popups by
+# default. This will allow it for my-example-bank.com:
+#
+ { -filter{popups} -kill-popups }
+ .my-example-bank.com
+
+
+# This site is delicate, and requires kid-glove
+# treatment.
+ { fragile }
+ .forbes.com
+
+
+-------------------------------------------------------------------------------
+
+9.6. Aliases
+
+Custom "actions", known to Privoxy as "aliases", can be defined by combining
+other "actions". These can in turn be invoked just like the built-in "actions".
+Currently, an alias name can contain any character except space, tab, "=", "{"
+or "}". But please use only "a"-"z", "0"-"9", "+", and "-". Alias names are not
+case sensitive, and must be defined before other actions in the actions file!
+And there can only be one set of "aliases" defined per file. Each actions file
+may have its own aliases, but they are only visible within that file. Aliases
+do not require a "+" or "-" sign in front, since they are merely expanded.
+
+Now let's define a few aliases:
+
+ # Useful custom aliases we can use later. These must come first!
+ {{alias}}
+ +prevent-cookies = +prevent-setting-cookies +prevent-reading-cookies
+ -prevent-cookies = -prevent-setting-cookies -prevent-reading-cookies
+ fragile = -block -prevent-cookies -filter -fast-redirects -hide-referer \
+ -kill-popups
+ shop = -prevent-cookies -filter -fast-redirects
+ +imageblock = +block +handle-as-image
+
+ # Aliases defined from other aliases, for people who don't like to type
+ # too much: ;-)
+ c0 = +prevent-cookies
+ c1 = -prevent-cookies
+ #... etc. Customize to your heart's content.
+
+
+Some examples using our "shop" and "fragile" aliases from above. These would
+appear in the lower sections of an actions file as exceptions to the default
+actions (as defined in the upper section):
+
+ # These sites are very complex and require
+ # minimal interference.
+ {fragile}
+ .office.microsoft.com
+ .windowsupdate.microsoft.com
+ .nytimes.com
+
+ # Shopping sites - but we still want to block ads.
+ {shop}
+ .quietpc.com
+ .worldpay.com # for quietpc.com
+ .scan.co.uk
+
+ # These shops require pop-ups also
+ {shop -kill-popups}
+ .dabs.com
+ .overclockers.co.uk
+
+
+The "shop" and "fragile" aliases are often used for "problem" sites that
+require most actions to be disabled in order to function properly.
+
+-------------------------------------------------------------------------------
+
+10. The Filter File
+
+Any web page can be dynamically modified with the filter file. This
+modification can be removal, or re-writing, of any web page content, including
+tags and non-visible content. The default filter file is oddly enough
+default.filter, located in the config directory.
+
+This is potentially a very powerful feature, and requires knowledge of both
+"regular expressions" and HTML in order to create custom filters. But there are
+a number of useful filters included with Privoxy for many common situations.
+
+The included example file is divided into sections. Each section begins with
+the FILTER keyword, followed by the identifier for that section, e.g. "FILTER:
+webbugs". Each section performs a similar type of filtering, such as
+"html-annoyances".
+
+This file uses regular expressions to alter or remove any string in the target
+page. The expressions can only operate on one line at a time. Some examples
+from the included default default.filter:
+
+Undo some particularly annoying HTML abuse, such as browser windows that cannot
+be resized or that hide the location and status bars:
+
+ FILTER: html-annoyances
+
+ # New browser windows should be resizeable and have a location and status
+ # bar. Make it so.
+ #
+ s/resizable="?(no|0)"?/resizable=1/ig s/noresize/yesresize/ig
+ s/location="?(no|0)"?/location=1/ig s/status="?(no|0)"?/status=1/ig
+ s/scrolling="?(no|0|Auto)"?/scrolling=1/ig
+ s/menubar="?(no|0)"?/menubar=1/ig
+
+ # The <BLINK> tag was a crime!
+ #
+ s*<blink>|</blink>**ig
+
+ # Is this evil?
+ #
+ #s/framespacing="?(no|0)"?//ig
+ #s/margin(height|width)=[0-9]*//gi
+
+
+Just for kicks, replace any occurrence of "Microsoft" with "MicroSuck", and
+have a little fun with topical buzzwords:
+
+ FILTER: fun
+
+ s/microsoft(?!.com)/MicroSuck/ig
+
+ # Buzzword Bingo:
+ #
+ s/industry-leading|cutting-edge|award-winning/<font color=red><b>BINGO!</b></font>/ig
+
+
+Kill those pesky little web-bugs:
+
+ # webbugs: Squish WebBugs (1x1 invisible GIFs used for user tracking)
+ FILTER: webbugs
+
+ s/<img\s+[^>]*?(width|height)\s*=\s*['"]?1\D[^>]*?(width|height)\s*=\s*['"]?1(\D[^>]*?)?>/<!-- Squished WebBug -->/sig
+
+
+-------------------------------------------------------------------------------
+
+10.1. The +filter Action
+
+Filters are enabled with the "+filter" action from within one of the actions
+files. "+filter" requires one parameter, which should match one of the section
+identifiers in the filter file itself. Example:
+
+ +filter{html-annoyances}
+
+This would activate that particular filter. Similarly, "+filter" can be turned
+off for selected sites as: "-filter{html-annoyances}". Remember too, all
+actions are off by default, unless they are explicitly enabled in one of the
+actions files.
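+
+As a minimal sketch of how this might look in an actions file (the host names
+below are hypothetical, and the sketch assumes that a later section overrides
+an earlier one for the URLs it matches):
+
+ # Enable the html-annoyances filter for a hypothetical site ...
+ {+filter{html-annoyances}}
+ .example.com
+
+ # ... but leave one of its hosts unfiltered.
+ {-filter{html-annoyances}}
+ cvs.example.com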
+
+-------------------------------------------------------------------------------
+
+11. Templates
+
+When Privoxy displays one of its internal pages, such as a 404 Not Found error
+page, it uses the appropriate template. (Privoxy must be running for these
+pages to display.) On Linux, BSD, and Unix, the templates are located in
+/etc/privoxy/templates by default. These may be customized, if desired.
+cgi-style.css is used to control the HTML attributes (fonts, etc).
+
+The default Blocked banner page, with the bright red top banner, is called
+just "blocked". This may be customized or replaced with something else if
+desired.
+
+-------------------------------------------------------------------------------
+
+12. Contacting the Developers, Bug Reporting and Feature Requests
+
+We value your feedback. However, to provide you with the best support, please
+note the following sections.
+
+-------------------------------------------------------------------------------
+
+12.1. Get Support
+
+To get support, use the Sourceforge Support Forum:
+
+ http://sourceforge.net/tracker/?group_id=11118&atid=211118
+
+-------------------------------------------------------------------------------
+
+12.2. Report bugs
+
+To submit bugs, use the Sourceforge Bug Forum:
+
+ http://sourceforge.net/tracker/?group_id=11118&atid=111118
+
+Make sure that the bug has not already been submitted. Please try to verify
+that it is a Privoxy bug, and not a browser or site bug first. If you are using
+your own custom configuration, please try the stock configs to see if the
+problem is a configuration related bug. And if not using the latest development
+snapshot, please try the latest one. Or even better, CVS sources. Please be
+sure to include the Privoxy version, platform, browser, any pertinent log data,
+any other relevant details (please be specific) and, if possible, some way to
+reproduce the bug.
+
+-------------------------------------------------------------------------------
+
+12.3. Request new features
+
+To submit ideas on new features, use the Sourceforge feature request forum:
+
+ http://sourceforge.net/tracker/?atid=361118&group_id=11118&func=browse
+
+-------------------------------------------------------------------------------
+
+12.4. Report ads or other filter problems
+
+You can also send feedback on websites that Privoxy has problems with. Please
+bookmark the following link: "Privoxy - Submit Filter Feedback". Once you surf
+to a page with problems, use the bookmark to send us feedback. We will look
+into the issue as soon as possible.
+
+New, improved default.action files will occasionally be made available based on
+your feedback. These will be announced on the ijbswa-announce list.
+
+-------------------------------------------------------------------------------
+
+12.5. Other
+
+For any other issues, feel free to use the mailing lists:
+
+ http://sourceforge.net/mail/?group_id=11118
+
+Anyone interested in actively participating in development and related
+discussions can also join the appropriate mailing list. Archives are available,
+too. See the page on Sourceforge.
+
+-------------------------------------------------------------------------------
+
+13. Copyright and History
+
+13.1. Copyright
+
+Privoxy is free software; you can redistribute it and/or modify it under the
+terms of the GNU General Public License as published by the Free Software
+Foundation; either version 2 of the License, or (at your option) any later
+version.
+
+This program is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
+PARTICULAR PURPOSE. See the GNU General Public License for more details, which
+is available from the Free Software Foundation, Inc, 59 Temple Place - Suite
+330, Boston, MA 02111-1307, USA.
+
+You should have received a copy of the GNU General Public License along with
+this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+Place, Suite 330, Boston, MA 02111-1307 USA.
+
+-------------------------------------------------------------------------------
+
+13.2. History
+
+Privoxy evolved from, and is derived from, the Internet Junkbuster, with many
+improvements and enhancements over the original.
+
+Junkbuster was originally written by Anonymous Coders and Junkbusters
+Corporation, and was released as free open-source software under the GNU GPL.
+Stefan Waldherr made many improvements, and started the SourceForge project
+Privoxy to rekindle development. There are now several active developers
+contributing. The last stable release of Junkbuster was v2.0.2, which has now
+grown whiskers ;-).
+
+-------------------------------------------------------------------------------
+
+14. See Also
+
+Other references and sites of interest to Privoxy users:
+
+http://www.privoxy.org/, The Privoxy Home page.
+
+http://sourceforge.net/projects/ijbswa, the Project Page for Privoxy on
+Sourceforge.
+
+http://p.p/, access Privoxy from your browser. Alternately,
+http://config.privoxy.org may work in some situations where the first does not.
+
+http://p.p/, and select "Privoxy - Submit Filter Feedback" to submit "misses"
+to the developers.
+
+http://www.junkbusters.com/ht/en/cookies.html
+
+http://www.waldherr.org/junkbuster/
+
+http://privacy.net/analyze/
+
+http://www.squid-cache.org/
+
+
+
+-------------------------------------------------------------------------------
+
+15. Appendix
+
+15.1. Regular Expressions
+
+Privoxy can use "regular expressions" in various config files, assuming support
+for "pcre" (Perl Compatible Regular Expressions) is compiled in, which is the
+default. Such configuration directives do not require regular expressions, but
+they can be used to increase flexibility by matching a pattern with wild-cards
+against URLs.
+
+If you are reading this, you probably don't understand what "regular
+expressions" are, or what they can do. So this will be a very brief
+introduction only. A full explanation would require a book ;-)
+
+"Regular expressions" are a way of matching one character expression against
+another to see if it matches or not. One of the "expressions" is a literal
+string of readable characters (letters, numbers, etc), and the other is a
+complex string of literal characters combined with wild-cards, and other
+special characters, called meta-characters. The "meta-characters" have special
+meanings and are used to build the complex pattern to be matched against. Perl
+Compatible Regular Expressions is an enhanced form of the regular expression
+language with backward compatibility.
+
+To make a simple analogy, we do something similar when we use wild-card
+characters when listing files with the dir command in DOS. *.* matches all
+filenames. The "special" character here is the asterisk which matches any and
+all characters. We can be more specific and use ? to match just individual
+characters. So "dir file?.txt" would match "file1.txt", "file2.txt", etc. We
+are pattern matching, using a similar technique to "regular expressions"!
+
+Regular expressions do essentially the same thing, but are much, much more
+powerful. There are many more "special characters" and ways of building complex
+patterns however. Let's look at a few of the common ones, and then some
+examples:
+
+. - Matches any single character, e.g. "a", "A", "4", ":", or "@".
+
+? - The preceding character or expression is matched ZERO or ONE times. Either/
+or.
+
++ - The preceding character or expression is matched ONE or MORE times.
+
+* - The preceding character or expression is matched ZERO or MORE times.
+
+\ - The "escape" character denotes that the following character should be taken
+literally. This is used where one of the special characters (e.g. ".") needs to
+be taken literally and not as a special meta-character. Example: "example
+\.com", makes sure the period is recognized only as a period (and not expanded
+to its meta-character meaning of any single character).
+
+[] - Characters enclosed in brackets will be matched if any of the enclosed
+characters are encountered. For instance, "[0-9]" matches any numeric digit
+(zero through nine). As an example, we can combine this with "+" to match any
+digit one or more times: "[0-9]+".
+
+() - Parentheses are used to group a sub-expression, or multiple
+sub-expressions.
+
+| - The "bar" character works like an "or" conditional statement. A match is
+successful if the sub-expression on either side of "|" matches. As an example:
+"/(this|that) example/" uses grouping and the bar character and would match
+either "this example" or "that example", and nothing else.
+
+s/string1/string2/g - This is used to rewrite strings of text. "string1" is
+replaced by "string2" in this example. There must of course be a match on
+"string1" first.
+
+These are just some of the ones you are likely to use when matching URLs with
+Privoxy, and are a long way from a definitive list. This is enough to get us
+started with a few simple examples which may be more illuminating:
+
+/.*/banners/.* - A simple example that uses the common combination of "." and
+"*" to denote any character, zero or more times. In other words, any string at
+all. So we start with a literal forward slash, then our regular expression
+pattern (".*") another literal forward slash, the string "banners", another
+forward slash, and lastly another ".*". We are building a directory path here.
+This will match any file with the path that has a directory named "banners" in
+it. The ".*" matches any characters, and this could conceivably be more forward
+slashes, so it might expand into a much longer looking path. For example, this
+could match: "/eye/hate/spammers/banners/annoy_me_please.gif", or just
+"/banners/annoying.html", or almost an infinite number of other possible
+combinations, just so it has "banners" in the path somewhere.
+
+And now something a little more complex:
+
+/.*/adv((er)?ts?|ertis(ing|ements?))?/ - We have several literal forward
+slashes again ("/"), so we are building another expression that is a file path
+statement. We have another ".*", so we are matching against any conceivable
+sub-path, just so it matches our expression. The only true literal that must
+match our pattern is adv, together with the forward slashes. What comes after
+the "adv" string is the interesting part.
+
+Remember the "?" means the preceding expression (either a literal character or
+anything grouped with "(...)" in this case) can exist or not, since this means
+either zero or one match. So "((er)?ts?|ertis(ing|ements?))" is optional, as
+are the individual sub-expressions: "(er)", "(ing|ements?)", and the "s". The
+"|" means "or". We have two of those. For instance, "(ing|ements?)", can expand
+to match either "ing" OR "ements?". What is being done here, is an attempt at
+matching as many variations of "advertisement", and similar, as possible. So
+this would expand to match just "adv", or "advert", or "adverts", or
+"advertising", or "advertisement", or "advertisements". You get the idea. But
+it would not match "advertizements" (with a "z"). We could fix that by changing
+our regular expression to: "/.*/adv((er)?ts?|erti(s|z)(ing|ements?))?/", which
+would then match either spelling.
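+
+The same change could also be sketched with a character class, since "[sz]"
+matches a single "s" or "z" just as "(s|z)" does. As an untested illustration:
+
+ /.*/adv((er)?ts?|erti[sz](ing|ements?))?/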
+
+/.*/advert[0-9]+\.(gif|jpe?g) - Again another path statement with forward
+slashes. Anything in the square brackets "[]" can be matched. This is using
+"0-9" as a shorthand expression to mean any digit zero through nine. It is the
+same as saying "0123456789". So any digit matches. The "+" means one or more of
+the preceding expression must be included. The preceding expression here is
+what is in the square brackets -- in this case, any digit zero through nine.
+Then, at the end, we have a grouping: "(gif|jpe?g)". This includes a "|", so
+this needs to match the expression on either side of that bar character also. A
+simple "gif" on one side, and the other side will in turn match either "jpeg"
+or "jpg", since the "?" means the letter "e" is optional and can be matched
+once or not at all. So we are building an expression here to match GIF or JPEG
+image files. It must include the literal string "advert", then one or more
+digits, and a "." (which is now a literal, and not a special character, since
+it is escaped with "\"), and lastly either "gif", or "jpeg", or "jpg". Some
+possible matches would include: "//advert1.jpg",
+"/nasty/ads/advert1234.gif", "/banners/from/hell/advert99.jpg". It would not
+match "advert1.gif" (no leading slash), or "/adverts232.jpg" (the expression
+does not include an "s"), or "/advert1.jsp" ("jsp" is not in the expression
+anywhere).
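+
+Path patterns like the ones above would typically sit under a "+block" section
+of an actions file. A minimal sketch, assuming a "+block" action that takes no
+parameter; the selection of patterns is only illustrative:
+
+ # Block anything under a "banners" directory, plus numbered advert images.
+ {+block}
+ /.*/banners/.*
+ /.*/advert[0-9]+\.(gif|jpe?g)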
+
+s/microsoft(?!.com)/MicroSuck/i - This is a substitution. "MicroSuck" will
+replace any occurrence of "microsoft". The "i" at the end of the expression
+means ignore case. The "(?!.com)" means the match should fail if "microsoft" is
+followed by ".com". In other words, this acts like a "NOT" modifier. In case
+this is a hyperlink, we don't want to break it ;-).
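+
+Note that the "." inside "(?!.com)" is unescaped, so it stands for any single
+character, and the substitution is also skipped for text such as
+"microsoft-com". A stricter variant, shown here only as a sketch under a
+hypothetical section name, escapes the period:
+
+ FILTER: fun-strict
+
+ # Leave "microsoft.com" alone, but rewrite every other occurrence.
+ s/microsoft(?!\.com)/MicroSuck/ig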
+
+We are barely scratching the surface of regular expressions here so that you
+can understand the default Privoxy configuration files, and maybe use this
+knowledge to customize your own installation. There is much, much more that can
+be done with regular expressions. Now that you know enough to get started, you
+can learn more on your own :/
+
+More reading on Perl Compatible Regular expressions:
+http://www.perldoc.com/perl5.6/pod/perlre.html
+
+-------------------------------------------------------------------------------
+
+15.2. Privoxy's Internal Pages
+
+Since Privoxy proxies each requested web page, it is easy for Privoxy to trap
+certain special URLs. In this way, we can talk directly to Privoxy, and see how
+it is configured, see how our rules are being applied, change these rules and
+other configuration options, and even turn Privoxy's filtering off, all with a
+web browser.
+
+The URLs listed below are the special ones that allow direct access to Privoxy.
+Of course, Privoxy must be running to access these. If not, you will get a
+friendly error message. Internet access is not necessary either.
+
+ * Privoxy main page:
+
+ http://config.privoxy.org/
- * By default (i.e. in the absence of a "+limit-connect" action),
- Junkbuster will only allow CONNECT requests to port 443, which is
- the standard port for https as a precaution.
- The CONNECT methods exists in HTTP to allow access to secure
- websites (https:// URLs) through proxies. It works very simply:
- the proxy connects to the server on the specified port, and then
- short-circuits its connections to the client and to the remote
- proxy. This can be a big security hole, since CONNECT-enabled
- proxies can be abused as TCP relays very easily.
- If you want to allow CONNECT for more ports than this, or want to
- forbid CONNECT altogether, you can specify a comma separated list
- of ports and port ranges (the latter using dashes, with the
- minimum defaulting to 0 and max to 65K):
- +limit-connect{443} # This is the default and need no be
- specified.
- +limit-connect{80,443} # Ports 80 and 443 are OK.
- +limit-connect{-3, 7, 20-100, 500-} # Port less than 3, 7, 20 to
- 100
- #and above 500 are OK.
+ Alternately, this may be reached at http://p.p/, but this variation may not
+ work as reliably as the above in some configurations.
+
+ * Show information about the current configuration, including viewing and
+ editing of actions files:
+
+ http://config.privoxy.org/show-status
- * "+no-compression" prevents the website from compressing the data.
- Some websites do this, which can be a problem for Junkbuster,
- since "+filter", "+no-popup" and "+gif-deanimate" will not work on
- compressed data. This will slow down connections to those
- websites, though. Default is "nocompression" is turned on.
- +nocompression
+ * Show the source code version numbers:
+
+ http://config.privoxy.org/show-version
- * Prevent the website from reading cookies:
- +no-cookies-read
+ * Show the browser's request headers:
+
+ http://config.privoxy.org/show-request
- * Prevent the website from setting cookies:
- +no-cookies-set
+ * Show which actions apply to a URL and why:
+
+ http://config.privoxy.org/show-url-info
- * Filter the website through a built-in filter to disable those
- obnoxious JavaScript pop-up windows via window.open(), etc. The
- two alternative spellings are equivalent.
- +no-popup
- +no-popups
+ * Toggle Privoxy on or off. In this case, "Privoxy" continues to run, but
+ only as a pass-through proxy, with no actions taking place:
+
+ http://config.privoxy.org/toggle
- * This action only applies if you are using a jarfile for saving
- cookies. It sends a cookie to every site stating that you do not
- accept any copyright on cookies sent to you, and asking them not
- to track you. Of course, this is a (relatively) unique header they
- could use to track you.
- +vanilla-wafer
+ Short cuts. Turn off, then on:
+
+ http://config.privoxy.org/toggle?set=disable
- * This allows you to add an arbitrary cookie. It can be specified
- multiple times in order to add as many cookies as you like.
- +wafer{name=value}
+ http://config.privoxy.org/toggle?set=enable
- The meaning of any of the above is reversed by preceding the action
- with a "-", in place of the "+".
-
- Some examples:
-
- Turn off cookies by default, then allow a few through for specified
- sites:
-
- # Turn off all cookies
- { +no-cookies-read }
- { +no-cookies-set }
- # Execeptions to the above, sites that need cookies
- { -no-cookies-read }
- { -no-cookies-set }
- .javasoft.com
- .sun.com
- .yahoo.com
- .msdn.microsoft.com
- .redhat.com
- # Alternative way of saying the same thing
- {-no-cookies-set -no-cookies-read}
- .sourceforge.net
- .sf.net
-
- Now turn off "fast redirects", and then we allow two exceptions:
-
- # Turn them off!
- {+fast-redirects}
-
- # Reverse it for these two sites, which don't work right without it.
- {-fast-redirects}
- www.ukc.ac.uk/cgi-bin/wac\.cgi\?
- login.yahoo.com
-
- Turn on page filtering, with one exception for sourceforge:
-
- # Run everything through the default filter file (re_filterfile):
- {+filter}
-
- # But please don't re_filter code from sourceforge!
- {-filter}
- .cvs.sourceforge.net
-
- Now some URLs that we want "blocked", ie we won't see them. Many of
- these use regular expressions that will expand to match multiple URLs:
-
- # Blocklist:
- {+block}
- /.*/(.*[-_.])?ads?[0-9]?(/|[-_.].*|\.(gif|jpe?g))
- /.*/(.*[-_.])?count(er)?(\.cgi|\.dll|\.exe|[?/])
- /.*/(ng)?adclient\.cgi
- /.*/(plain|live|rotate)[-_.]?ads?/
- /.*/(sponsor)s?[0-9]?/
- /.*/_?(plain|live)?ads?(-banners)?/
- /.*/abanners/
- /.*/ad(sdna_image|gifs?)/
- /.*/ad(server|stream|juggler)\.(cgi|pl|dll|exe)
- /.*/adbanners/
- /.*/adserver
- /.*/adstream\.cgi
- /.*/adv((er)?ts?|ertis(ing|ements?))?/
- /.*/banner_?ads/
- /.*/banners?/
- /.*/banners?\.cgi/
- /.*/cgi-bin/centralad/getimage
- /.*/images/addver\.gif
- /.*/images/marketing/.*\.(gif|jpe?g)
- /.*/popupads/
- /.*/siteads/
- /.*/sponsor.*\.gif
- /.*/sponsors?[0-9]?/
- /.*/advert[0-9]+\.jpg
- /Media/Images/Adds/
- /ad_images/
- /adimages/
- /.*/ads/
- /bannerfarm/
- /grafikk/annonse/
- /graphics/defaultAd/
- /image\.ng/AdType
- /image\.ng/transactionID
- /images/.*/.*_anim\.gif # alvin brattli
- /ip_img/.*\.(gif|jpe?g)
- /rotateads/
- /rotations/
- /worldnet/ad\.cgi
- /cgi-bin/nph-adclick.exe/
- /.*/Image/BannerAdvertising/
- /.*/ad-bin/
- /.*/adlib/server\.cgi
- /autoads/
- _________________________________________________________________
-
-3.2.3. Aliases
-
- Custom "actions", known to Junkbuster as "aliases", can be defined by
- combining other "actions". These can in turn be invoked just like the
- built-in "actions". Currently, an alias can contain any character
- except space, tab, "=", "{" or "}". But please use only "a"- "z",
- "0"-"9", "+", and "-". Alias names are not case sensitive, and must be
- defined before anything else in actionsfile! And there can only be one
- set of "aliases" defined.
-
- Now let's define a few aliases:
-
- # Useful customer aliases we can use later. These must come first!
- {{alias}}
- +no-cookies = +no-cookies-set +no-cookies-read
- -no-cookies = -no-cookies-set -no-cookies-read
- fragile = -block -no-cookies -filter -fast-redirects -hide-refere
- r -no-popups
- shop = -no-cookies -filter -fast-redirects
- +imageblock = +block +image
- #For people who don't like to type too much: ;-)
- c0 = +no-cookies
- c1 = -no-cookies
- c2 = -no-cookies-set +no-cookies-read
- c3 = +no-cookies-set -no-cookies-read
- #... etc. Customize to your heart's content.
-
- Some examples using our "shop" and "fragile" aliases from above:
-
- # These sites are very complex and require
- # minimal interference.
- {fragile}
- .office.microsoft.com
- .windowsupdate.microsoft.com
- .nytimes.com
- # Shopping sites - still want to block ads.
- {shop}
- .quietpc.com
- .worldpay.com # for quietpc.com
- .jungle.com
- .scan.co.uk
- # These shops require pop-ups
- {shop -no-popups}
- .dabs.com
- .overclockers.co.uk
- _________________________________________________________________
-
-3.3. The Filter File
-
- The filter file defines what filtering of web pages Junkbuster does.
- The default filter file is re_filterfile, located in the config
- directory. In this file, any document content, whether viewable text
- or embedded non-visible content, can be changed.
-
- This file uses regular expressions to alter or remove any string in
- the target page. Some examples from the included default
- re_filterfile:
-
- Stop web pages from displaying annoying messages in the status bar by
- deleting such references:
-
- # The status bar is for displaying link targets, not pointless buzzwo
- rds.
- # Again, check it out on http://www.airport-cgn.de/.
- s/status='.*?';*//ig
-
- Just for kicks, replace any occurrence of "Microsoft" with
- "MicroSuck":
-
- s/microsoft(?!.com)/MicroSuck/ig
-
- Kill those auto-refresh tags:
-
- # Kill refresh tags. I like to refresh myself. Manually.
- # check it out on http://www.airport-cgn.de/ and go to the arrivals p
- age.
- #
- s/<meta[^>]*http-equiv[^>]*refresh.*URL=([^>]*?)"?>/<link rev="x-refr
- esh" href=$1>/i
- s/<meta[^>]*http-equiv="?page-enter"?[^>]*content=[^>]*>/<!--no page
- enter for me-->/i
- _________________________________________________________________
-
-4. Quickstart to Using Junkbuster
-
- Install package, then run and enjoy! Junbuster accepts only one
- command line option -- the configuration file to be used. Example Unix
- startup command:
-
-
- # /usr/sbin/junkbuster /etc/junkbuster/config &
-
-
- If no configuration file is specified on the command line, Junkbuster
- will look for a file named config in the current directory. Except on
- Amiga where it will look for AmiTCP:db/junkbuster/config and Win32
- where it will try junkbstr.txt. If no file is specified on the command
- line and no default configuration file can be found, Junkbuster will
- fail to start.
-
- Be sure your browser is set to use the proxy which is by default at
- localhost, port 8000. With Netscape (and Mozilla), this can be set
- under Edit -> Preferences -> Advanced -> Proxies -> HTTP Proxy. For
- Internet Explorer: Tools > Internet Properties -> Connections -> LAN
- Setting. Then, check "Use Proxy" and fill in the appropriate info
- (Address: localhost, Port: 8000). Include if HTTPS proxy support too.
-
- The included default configuration files should give a reasonable
- starting point, though may be somewhat aggressive in blocking junk.
- You will probably want to keep an eye out for sites that require
- cookies, and add these to actionsfile as needed. By default, most of
- these will be blocked until you add them to the configuration. If you
- want the browser to handle this instead, you will need to edit
- actionsfile and disable this feature. If you use more than one
- browser, it would make more sense to let Junkbuster handle this. In
- which case, the browser(s) should be set to accept all cookies.
-
- If a particular site shows problems loading properly, try adding it to
- the {fragile} section of actionsfile. This will turn off most actions
- for this site.
-
- HTTP/1.1 support is not fully implemented. If browsers that support
- HTTP/1.1 (like Mozilla or recent versions of I.E.) experience
- problems, you might try to force HTTP/1.0 compatiblity. For Mozilla,
- look under Edit -> Preferences -> Debug -> Networking. Or set the
- "+downgrade" config option in actionsfile.
-
- After running Junkbuster for a while, you can start to fine tune the
- configuration to suit your personal, or site, preferences and
- requirements. There are many, many aspects that can be customized.
- "Actions" (from actionsfile) can be adjusted by pointing your browser
- to [39]http://i.j.b./, and then follow the link to "edit the actions
- list". (This is an internal page and does not require Internet
- access.)
-
- In fact, various aspects of Junkbuster configuration can be viewed
- from this page, including current configuration parameters, source
- code version numbers, the browser's request headers, and "actions"
- that apply to a given URL. In addition to the actionsfile editor
- mentioned above, Junkbuster can also be turned "on" and "off" from
- this page.
-
- If you encounter problems, please verify it is a Junkbuster bug, by
- disabling Junkbuster, and then trying the same page. Also, try another
- browser if possible to eliminate browser or site problems. Before
- reporting it as a bug, see if there is not a configuration option that
- is enabled that is causing the page not to load. You can then add an
- exception for that page or site. If a bug, please report it to the
- developers (see below).
- _________________________________________________________________
-
-5. Contact the Developers
-
- Feature requests and other questions should be posted to the
- [40]Feature request page at SourceForge. There is also an archive
- there.
-
- Anyone interested in actively participating in development and related
- discussions can join the appropriate mailing list [41]here. Archives
- are available here too.
-
- Please report bugs, using the form at [42]Sourceforge. Please try to
- verify that it is a Junkbuster bug, and not a browser or site bug
- first. Also, check to make sure this is not already a known bug.
- _________________________________________________________________
-
-6. Copyright and History
-
-6.1. License
-
- Internet Junkbuster is free software; you can redistribute it and/or
- modify it under the terms of the GNU General Public License as
- published by the Free Software Foundation; either version 2 of the
- License, or (at your option) any later version.
-
- This program is distributed in the hope that it will be useful, but
- WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- General Public License for more details, which is available from
- [43]the Free Software Foundation, Inc, 59 Temple Place - Suite 330,
- Boston, MA 02111-1307, USA.
- _________________________________________________________________
-
-6.2. History
-
- Junkbuster was originally written by Anonymous Coders and
- [44]JunkBusters Corporation, and was released as free open-source
- software under the GNU GPL. [45]Stefan Waldherr made many
- improvements, and started the [46]SourceForge project to rekindle
- development. The last stable release was v2.0.2, which has now grown
- whiskers ;-).
- _________________________________________________________________
-
-7. See also
-
- [47]http://sourceforge.net/projects/ijbswa
-
- [48]http://ijbswa.sourceforge.net/
-
- [49]http://i.j.b./
-
- [50]http://www.junkbusters.com/ht/en/cookies.html
-
- [51]http://www.waldherr.org/junkbuster/
-
- [52]http://privacy.net/analyze/
-
- [53]http://www.squid-cache.org/
- _________________________________________________________________
-
-8. Appendix
-
-8.1. Regular Expressions
-
- Junkbuster can use "regular expressions" in various config files.
- Assuming support for "pcre" (Perl Compatible Regular Expressions) is
- compiled in, which is the default. Such configuration directives do
- not require regular expressions, but they can be used to increase
- flexibility by matching a pattern with wildcards against URLs.
-
- If you are reading this, you probably don't understand what "regular
- expressions" are, or what they can do. So this will be a very brief
- introduction only. A full explanation would require a book ;-)
-
- "Regular expressions" is a way of matching one character expression
- against another to see if it matches or not. One of the "expressions"
- is a literal string of readable characters (letter, numbers, etc), and
- the other is a complex string of literal characters combined with
- wildcards, and other special characters, called metacharacters. The
- "metacharacters" have special meanings and are used to build the
- complex pattern to be matched against. Perl Compatible Regular
- Expressions is an enhanced form of the regular expression language
- with backward compatibility.
-
- To make a simple analogy, we do something similar when we use wildcard
- characters when listing files with the dir command in DOS. *.* matches
- all filenames. The "special" character here is the asterik which
- matches any and all characters. We can be more specific and use ? to
- match just individual characters. So "dir file?.text" would match
- "file1.txt", "file2.txt", etc. We are pattern matching, using a
- similar technique to "regular expressions"!
-
- Regular expressions do essentially the same thing, but are much, much
- more powerful. There are many more "special characters" and ways of
- building complex patterns however. Let's look at a few of the common
- ones, and then some examples:
-
- . - Matches any single character, e.g. "a", "A", "4", ":", or "@".
-
- ? - The preceding character or expression is matched ZERO or ONE
- times. Either/or.
-
- + - The preceding character or expression is matched ONE or MORE
- times.
-
- * - The preceding character or expression is matched ZERO or MORE
- times.
-
- \ - The "escape" character denotes that the following character should
- be taken literally. This is used where one of the special characters
- (e.g. ".") needs to be taken literally and not as a special
- metacharacter.
-
- [] - Characters enclosed in brackets will be matched if any of the
- enclosed characters are encountered.
-
- () - Pararentheses are used to group a sub-expression, or multiple
- sub-expressions.
-
- | - The "bar" character works like an "or" conditional statement. A
- match is successful if the sub-expression on either side of "|"
- matches.
-
- s/string1/string2/g - This is used to rewrite strings of text.
- "string1" is replaced by "string2" in this example.
-
- These are just some of the ones you are likely to use when matching
- URLs with Junkbuster, and are a long way from a definitive list. This
- is enough to get us started with a few simple examples which may be
- more illuminating:
-
- /.*/banners/.* - A simple example that uses the common combination of
- "." and "*" to denote any character, zero or more times. In other
- words, any string at all. So we start with a literal forward slash,
- then our regular expression pattern (".*") another literal forward
- slash, the string "banners", another forward slash, and lastly another
- ".*". We are building a directory path here. This will match any file
- with the path that has a directory named "banners" in it. The ".*"
- matches any characters, and this could conceivably be more forward
- slashes, so it might expand into a much longer looking path. For
- example, this could match:
- "/eye/hate/spammers/banners/annoy_me_please.gif", or just
- "/banners/annoying.html", or almost an infinite number of other
- possible combinations, just so it has "banners" in the path somewhere.
-
- And now something a little more complex:
-
- /.*/adv((er)?ts?|ertis(ing|ements?))?/ - We have several literal
- forward slashes again ("/"), so we are building another expression
- that is a file path statement. We have another ".*", so we are
- matching against any conceivable sub-path, just so it matches our
- expression. The only true literal that must match our pattern is adv,
- together with the forward slashes. What comes after the "adv" string
- is the interesting part.
-
- Remember the "?" means the preceding expression (either a literal
- character or anything grouped with "(...)" in this case) can exist or
- not, since this means either zero or one match. So
- "((er)?ts?|ertis(ing|ements?))" is optional, as are the individual
- sub-expressions: "(er)", "(ing|ements?)", and the "s". The "|" means
- "or". We have two of those. For instance, "(ing|ements?)", can expand
- to match either "ing" OR "ements?". What is being done here, is an
- attempt at matching as many variations of "advertisement", and
- similar, as possible. So this would expand to match just "adv", or
- "advert", or "adverts", or "advertising", or "advertisement", or
- "advertisements". You get the idea. But it would not match
- "advertizements" (with a "z"). We could fix that by changing our
- regular expression to: "/.*/adv((er)?ts?|erti(s|z)(ing|ements?))?/",
- which would then match either spelling.
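-
- As a rough check of the expansion described above, the two patterns can be
- tried in Python. This is only a sketch: Python's re module stands in for
- PCRE (the two agree for patterns this simple), the "/site/" prefix and the
- word list are made up for illustration, and requiring the whole path to
- match is an assumption of the sketch.
-

```python
import re

# The pattern from the text, and the variant that also accepts a "z".
ADV = re.compile(r"/.*/adv((er)?ts?|ertis(ing|ements?))?/")
ADV_Z = re.compile(r"/.*/adv((er)?ts?|erti(s|z)(ing|ements?))?/")

# Hypothetical directory names to try.
WORDS = ["adv", "advert", "adverts", "advertising",
         "advertisement", "advertisements", "advertizements"]


def matched_words(pattern):
    """Return the words w for which the path "/site/w/" matches in full."""
    return [w for w in WORDS if pattern.fullmatch(f"/site/{w}/")]


print(matched_words(ADV))    # every word except the "z" spelling
print(matched_words(ADV_Z))  # every word, including the "z" spelling
```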
-
- /.*/advert[0-9]+\.(gif|jpe?g) - Again another path statement with
- forward slashes. Anything in the square brackets "[]" can be matched.
- This is using "0-9" as a shorthand expression to mean any digit zero
- through nine. It is the same as saying "0123456789". So any digit
- matches. The "+" means one or more of the preceding expression must be
- included. The preceding expression here is what is in the square
- brackets -- in this case, any digit zero through nine. Then, at the
- end, we have a grouping: "(gif|jpe?g)". This includes a "|", so this
- needs to match the expression on either side of that bar character.
- A simple "gif" is on one side, and the other side will in turn
- match either "jpeg" or "jpg", since the "?" means the letter "e" is
- optional and can be matched once or not at all. So we are building an
- expression here to match GIF or JPEG image files. It must
- include the literal string "advert", then one or more digits, and a
- "." (which is now a literal, and not a special character, since it is
- escaped with "\"), and lastly either "gif", or "jpeg", or "jpg". Some
- possible matches would include: "//advert1.jpg",
- "/nasty/ads/advert1234.gif", "/banners/from/hell/advert99.jpg". It
- would not match "advert1.gif" (no leading slash), or "/adverts232.jpg"
- (the expression does not include an "s"), or "/advert1.jsp" ("jsp" is
- not in the expression anywhere).
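-
- The matches and non-matches listed above can be checked with a short
- Python sketch. Python's re module is used here in place of PCRE, and
- requiring the whole path to match (fullmatch) is an assumption of this
- sketch, not a statement about how Junkbuster anchors its patterns.
-

```python
import re

ADVERT = re.compile(r"/.*/advert[0-9]+\.(gif|jpe?g)")


def matches(path):
    """True if the entire path matches the advert pattern."""
    return ADVERT.fullmatch(path) is not None


# Paths taken from the examples in the text.
should_match = ["//advert1.jpg",
                "/nasty/ads/advert1234.gif",
                "/banners/from/hell/advert99.jpg"]
should_not_match = ["advert1.gif", "/adverts232.jpg", "/advert1.jsp"]

for path in should_match + should_not_match:
    print(path, matches(path))
```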
-
- s/microsoft(?!.com)/MicroSuck/i - This is a substitution. "MicroSuck"
- will replace any occurrence of "microsoft". The "i" at the end of the
- expression means ignore case. The "(?!.com)" means the match should
- fail if "microsoft" is followed by ".com". In other words, this acts
- like a "NOT" modifier. (Strictly, the unescaped "." here matches any
- character; "\." would be needed to match only a literal dot.) In case
- this is a hyperlink, we don't want to break it ;-).
-
- We are barely scratching the surface of regular expressions here so
- that you can understand the default Junkbuster configuration files,
- and maybe use this knowledge to customize your own installation. There
- is much, much more that can be done with regular expressions. Now that
- you know enough to get started, you can learn more on your own :/
-
- More reading on Perl Compatible Regular expressions:
- [54]http://www.perldoc.com/perl5.6/pod/perlre.html
-
-References
-
- 1. http://ijbswa.sourceforge.net/user-manual/
- 2. mailto:ijbswa-developers@lists.sourceforge.net
- 3. file://localhost/home/swa/sf/current/doc/source/tmp.html#INTRODUCTION
- 4. file://localhost/home/swa/sf/current/doc/source/tmp.html#AEN27
- 5. file://localhost/home/swa/sf/current/doc/source/tmp.html#INSTALLATION
- 6. file://localhost/home/swa/sf/current/doc/source/tmp.html#INSTALLATION-SOURCE
- 7. file://localhost/home/swa/sf/current/doc/source/tmp.html#INSTALLATION-RH
- 8. file://localhost/home/swa/sf/current/doc/source/tmp.html#INSTALLATION-SUSE
- 9. file://localhost/home/swa/sf/current/doc/source/tmp.html#INSTALLATION-OS2
- 10. file://localhost/home/swa/sf/current/doc/source/tmp.html#INSTALLATION-WIN
- 11. file://localhost/home/swa/sf/current/doc/source/tmp.html#INSTALLATION-OTHER
- 12. file://localhost/home/swa/sf/current/doc/source/tmp.html#CONFIGURATION
- 13. file://localhost/home/swa/sf/current/doc/source/tmp.html#AEN158
- 14. file://localhost/home/swa/sf/current/doc/source/tmp.html#ACTIONSFILE
- 15. file://localhost/home/swa/sf/current/doc/source/tmp.html#FILTERFILE
- 16. file://localhost/home/swa/sf/current/doc/source/tmp.html#QUICKSTART
- 17. file://localhost/home/swa/sf/current/doc/source/tmp.html#CONTACT
- 18. file://localhost/home/swa/sf/current/doc/source/tmp.html#COPYRIGHT
- 19. file://localhost/home/swa/sf/current/doc/source/tmp.html#AEN1161
- 20. file://localhost/home/swa/sf/current/doc/source/tmp.html#AEN1167
- 21. file://localhost/home/swa/sf/current/doc/source/tmp.html#SEEALSO
- 22. file://localhost/home/swa/sf/current/doc/source/tmp.html#APPENDIX
- 23. file://localhost/home/swa/sf/current/doc/source/tmp.html#REGEX
- 24. http://i.j.b/
- 25. http://sourceforge.net/projects/ijbswa/
- 26. http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/ijbswa/current/
- 27. http://hobbes.nmsu.edu/cgi-bin/h-search?sh=1&button=Search&key=emxrt.zip&stype=all&sort=type&dir=%2Fpub%2Fos2%2Fdev%2Femx%2Fv0.9d
- 28. http://hobbes.nmsu.edu/cgi-bin/h-search?sh=1&key=gnupack&stype=all&sort=type&dir=%2Fpub%2Fos2%2Fapps
- 29. http://www.gnu.org/
- 30. http://i.j.b/
- 31. file://localhost/home/swa/sf/current/doc/source/tmp.html#ACTIONSFILE
- 32. http://i.j.b/
- 33. http://i.j.b/
- 34. http://i.j.b/
- 35. http://i.j.b/show-url-info
- 36. http://i.j.b/
- 37. http://www.perldoc.com/perl5.6/pod/perlre.html
- 38. file://localhost/home/swa/sf/current/doc/source/tmp.html#REGEX
- 39. http://i.j.b/
- 40. http://sourceforge.net/tracker/?atid=361118&group_id=11118&func=browse
- 41. http://sourceforge.net/mail/?group_id=11118
- 42. http://sourceforge.net/tracker/?group_id=11118&atid=111118
- 43. http://www.gnu.org/copyleft/gpl.html
- 44. http://www.junkbusters.com/ht/en/ijbfaq.html
- 45. http://www.waldherr.org/junkbuster/
- 46. http://sourceforge.net/projects/ijbswa/
- 47. http://sourceforge.net/projects/ijbswa
- 48. http://ijbswa.sourceforge.net/
- 49. http://i.j.b/
- 50. http://www.junkbusters.com/ht/en/cookies.html
- 51. http://www.waldherr.org/junkbuster/
- 52. http://privacy.net/analyze/
- 53. http://www.squid-cache.org/
- 54. http://www.perldoc.com/perl5.6/pod/perlre.html
+These may be bookmarked for quick reference. See next.
+
+-------------------------------------------------------------------------------
+
+15.2.1. Bookmarklets
+
+Below are some "bookmarklets" to allow you to easily access a "mini" version of
+some of Privoxy's special pages. They are designed for MS Internet Explorer,
+but should work equally well in Netscape, Mozilla, and other browsers which
+support JavaScript. They are designed to run directly from your bookmarks - not
+by clicking the links below (although that should work for testing).
+
+To save them, right-click the link and choose "Add to Favorites" (IE) or "Add
+Bookmark" (Netscape). You will get a warning that the bookmark "may not be
+safe" - just click OK. Then you can run the Bookmarklet directly from your
+favorites/bookmarks. For even faster access, you can put them on the "Links"
+bar (IE) or the "Personal Toolbar" (Netscape), and run them with a single
+click.
+
+ * Privoxy - Enable
+
+ * Privoxy - Disable
+
+ * Privoxy - Toggle Privoxy (Toggles between enabled and disabled)
+
+ * Privoxy - View Status
+
+ * Privoxy - Submit Filter Feedback
+
+Credit: The site which gave me the general idea for these bookmarklets is
+www.bookmarklets.com. They have more information about bookmarklets.
+
+-------------------------------------------------------------------------------
+
+15.3. Chain of Events
+
+Let's take a quick look at the basic sequence of events when a web page is
+requested by your browser and Privoxy is on duty:
+
+ * First, your web browser requests a web page. The browser knows to send the
+ request to Privoxy, which will, in turn, relay the request to the remote web
+ server after passing the following tests:
+
+ * Privoxy traps any request for its own internal CGI pages (e.g. http://p.p/)
+ and sends the CGI page back to the browser.
+
+ * Next, Privoxy checks to see if the URL matches any "+block" patterns. If
+ so, the URL is then blocked, and the remote web server will not be
+ contacted. "+handle-as-image" is then checked and if it does not match, an
+ HTML "BLOCKED" page is sent back. Otherwise, if it does match, an image is
+ returned. The type of image depends on the setting of "+set-image-blocker"
+ (blank, checkerboard pattern, or an HTTP redirect to an image elsewhere).
+
+ * Untrusted URLs are blocked. If URLs are being added to the trust file, then
+ that is done.
+
+ * If the URL pattern matches the "+fast-redirects" action, it is then
+ processed. Unwanted parts of the requested URL are stripped.
+
+ * Now the rest of the client browser's request headers are processed. If any
+ of these match any of the relevant actions (e.g. "+hide-user-agent", etc.),
+ headers are suppressed or forged as determined by these actions and their
+ parameters.
+
+ * Now the web server starts sending its response back (i.e. typically a web
+ page and related data).
+
+ * First, the server headers are read and processed to determine, among other
+ things, the MIME type (document type) and encoding. The headers are then
+ filtered as determined by the "+prevent-setting-cookies",
+ "+session-cookies-only", and "+downgrade-http-version" actions.
+
+ * If the "+kill-popups" action applies, and it is an HTML or JavaScript
+ document, the popup-code in the response is filtered on-the-fly as it is
+ received.
+
+ * If a "+filter" or "+deanimate-gifs" action applies (and the document type
+ fits the action), the rest of the page is read into memory (up to a
+ configurable limit). Then the filter rules (from default.filter) are
+ processed against the buffered content. Filters are applied in the order
+ they are specified in the default.filter file. Animated GIFs, if present,
+ are reduced to either the first or last frame, depending on the action
+ setting. The entire page, which is now filtered, is then sent by Privoxy
+ back to your browser.
+
+ If neither "+filter" nor "+deanimate-gifs" matches, then Privoxy passes the
+ raw data through to the client browser as it becomes available.
+
+ * As the browser receives the (now probably filtered) page content, it reads
+ and then requests any URLs that may be embedded within the page source,
+ e.g. ad images, stylesheets, JavaScript, other HTML documents (e.g.
+ frames), sounds, etc. For each of these objects, the browser issues a new
+ request. And each such request is in turn processed as above. Note that a
+ complex web page may have many such embedded URLs.
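+
+The blocking step in the list above can be sketched as a small decision
+function. This is a simplified model for illustration only, not Privoxy's
+actual code: the function name, the use of a plain set of action names, and
+the returned labels are all assumptions.
+

```python
def blocked_response(actions, image_blocker="pattern"):
    """Return a label for what the browser gets back for one request.

    actions: set of action names that apply to the URL (hypothetical model).
    image_blocker: "blank", "pattern", or a URL to redirect to, mirroring
    the "+set-image-blocker" settings described in the text.
    """
    if "block" not in actions:
        # Not blocked: the request is relayed to the remote web server.
        return "forward request to remote server"
    if "handle-as-image" not in actions:
        # Blocked, but not treated as an image.
        return "HTML BLOCKED page"
    if image_blocker == "blank":
        return "blank image"
    if image_blocker == "pattern":
        return "checkerboard image"
    # Anything else is treated as the target of an HTTP redirect.
    return f"redirect to {image_blocker}"


print(blocked_response({"block", "handle-as-image"}))
print(blocked_response({"block"}))
print(blocked_response(set()))
```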
+
+-------------------------------------------------------------------------------
+
+15.4. Anatomy of an Action
+
+The way Privoxy applies "actions" and "filters" to any given URL can be
+complex, and it is not always easy to understand what is happening. Sometimes
+we need to be able to see just what Privoxy is doing, especially if something
+Privoxy is doing is inadvertently causing us a problem. It can be a little
+daunting to look at the actions and filters files themselves, since they tend
+to be filled with "regular expressions" whose consequences are not always so
+obvious.
+
+One quick test to see whether Privoxy is causing a problem is to disable it
+temporarily. This should be the first troubleshooting step. See the
+Bookmarklets section for a quick and easy way to do this (be sure to flush
+caches afterward!).
+
+Privoxy also provides the http://config.privoxy.org/show-url-info page that can
+show us very specifically how actions are being applied to any given URL. This
+is a big help for troubleshooting.
+
+First, enter one URL (or partial URL) at the prompt, and then Privoxy will tell
+us how the current configuration will handle it. This will not help with
+filtering effects (i.e. the "+filter" action) from the default.filter file
+since this is handled very differently and is not so easy to trap! It also will
+not tell you about any other URLs that may be embedded within the URL you are
+testing. For instance, images such as ads are expressed as URLs within the raw
+page source of HTML pages. So you will only get info for the actual URL that is
+pasted into the prompt area -- not any sub-URLs. If you want to know about
+embedded URLs like ads, you will have to dig those out of the HTML source. Use
+your browser's "View Page Source" option for this. Or right click on the ad,
+and grab the URL.
+
+Let's try an example, google.com, and look at it one section at a time:
+
+ Matches for http://google.com:
+
+--- File standard ---
+(no matches in this file)
+
+--- File default ---
+
+{ -add-header -block +deanimate-gifs{last} -downgrade-http-version +fast-redirects
+ -filter{popups} -filter{fun} -filter{shockwave-flash} -filter{crude-parental}
+ +filter{html-annoyances} +filter{js-annoyances} +filter{content-cookies}
+ +filter{webbugs} +filter{refresh-tags} +filter{nimda} +filter{banners-by-size}
+ +hide-forwarded-for-headers +hide-from-header{block} +hide-referer{forge}
+ -hide-user-agent -handle-as-image +set-image-blocker{pattern} -limit-connect
+ +prevent-compression +session-cookies-only -prevent-reading-cookies
+ -prevent-setting-cookies -kill-popups -send-vanilla-wafer -send-wafer }
+/
+
+ { -session-cookies-only }
+ .google.com
+
+ { -fast-redirects }
+ .google.com
+
+--- File user ---
+(no matches in this file)
+
+This tells us how we have defined our "actions", and which ones match for our
+example, "google.com". The first listing is any matches for the standard.action
+file. No hits at all here on "standard". Then next is "default", or our
+default.action file. The large, multi-line listing, is how the actions are set
+to match for all URLs, i.e. our default settings. If you look at your "actions"
+file, this would be the section just below the "aliases" section near the top.
+This will apply to all URLs as signified by the single forward slash at the end
+of the listing -- "/".
+
+But we can define additional actions that would be exceptions to these general
+rules, and then list specific URLs (or patterns) that these exceptions would
+apply to. Last match wins. Just below this then are two explicit matches for
+".google.com". The first is negating our previous cookie setting, which was for
+"+session-cookies-only" (i.e. not persistent). So we will allow persistent
+cookies for google. The second turns off any "+fast-redirects" action, allowing
+this to take place unmolested. Note that there is a leading dot here --
+".google.com". This will match any hosts and sub-domains, in the google.com
+domain also, such as "www.google.com". So, apparently, we have these two
+actions defined somewhere in the lower part of our default.action file, and
+"google.com" is referenced somewhere in these latter sections.
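+
+The leading-dot behavior described above can be sketched in a few lines of
+Python. This is a simplified model, not Privoxy's matching code: the function
+name is hypothetical, and treating a pattern without a leading dot as an
+exact host name is an assumption made for the sketch.
+

```python
def host_matches(pattern, host):
    """Model a domain pattern such as ".google.com" against a host name."""
    if pattern.startswith("."):
        domain = pattern[1:]
        # The bare domain matches (as in the google.com listing above),
        # and so does any host or sub-domain inside it.
        return host == domain or host.endswith("." + domain)
    # Assumption for this sketch: no leading dot means an exact match.
    return host == pattern


print(host_matches(".google.com", "google.com"))
print(host_matches(".google.com", "www.google.com"))
print(host_matches(".google.com", "notgoogle.com"))
```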
+
+Then, for our user.action file, we again have no hits.
+
+And finally we pull it all together in the bottom section and summarize how
+Privoxy is applying all its "actions" to "google.com":
+
+ Final results:
+ -add-header -block +deanimate-gifs{last} -downgrade-http-version -fast-redirects
+ -filter{popups} -filter{fun} -filter{shockwave-flash} -filter{crude-parental}
+ +filter{html-annoyances} +filter{js-annoyances} +filter{content-cookies}
+ +filter{webbugs} +filter{refresh-tags} +filter{nimda} +filter{banners-by-size}
+ +hide-forwarded-for-headers +hide-from-header{block} +hide-referer{forge}
+ -hide-user-agent -handle-as-image +set-image-blocker{pattern} -limit-connect
+ +prevent-compression -session-cookies-only -prevent-reading-cookies
+ -prevent-setting-cookies -kill-popups -send-vanilla-wafer -send-wafer
+
+Notice that the only differences from the previous listing are in
+"fast-redirects" and "session-cookies-only", both of which are now negated.
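+
+The "last match wins" rule that produces these final results can be sketched
+as follows. This is an illustration only: the function name is hypothetical,
+the action list is abbreviated to three entries, and action parameters such as
+"{last}" are left out.
+

```python
def apply_sections(sections):
    """Merge matching sections in file order; later settings override."""
    final = {}
    for section in sections:
        for token in section.split():
            sign, name = token[0], token[1:]
            final[name] = (sign == "+")
    return final


# Abbreviated versions of the sections that matched google.com.
sections = [
    "-block +fast-redirects +session-cookies-only",  # defaults for "/"
    "-session-cookies-only",                         # .google.com
    "-fast-redirects",                               # .google.com
]

result = apply_sections(sections)
for name, enabled in sorted(result.items()):
    print(("+" if enabled else "-") + name)
```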
+
+Now another example, "ad.doubleclick.net":
+
+ { +block +handle-as-image }
+ .ad.doubleclick.net
+
+ { +block +handle-as-image }
+ ad*.
+
+ { +block +handle-as-image }
+ .doubleclick.net
+
+We'll just show the interesting part here, the explicit matches. It is matched
+three different times, each time as "+block +handle-as-image", which is the
+expanded form of one of our aliases that had been defined as "+imageblock".
+("Aliases" are defined in the first section of the actions file and are
+typically used to combine more than one action.)
+
+Any one of these would have done the trick and blocked this as an unwanted
+image. This is unnecessarily redundant since the last case effectively would
+also cover the first. No point in taking chances with these guys though ;-)
+Note that if you want an ad or obnoxious URL to be invisible, it should be
+defined as "ad.doubleclick.net" is done here -- as both a "+block" and an
+"+handle-as-image". The custom alias "+imageblock" just simplifies the process
+and makes it more readable.
+
+One last example. Let's try "http://www.rhapsodyk.net/adsl/HOWTO/". This one is
+giving us problems. We are getting a blank page. Hmmm...
+
+ Matches for http://www.rhapsodyk.net/adsl/HOWTO/:
+
+ { -add-header -block +deanimate-gifs -downgrade-http-version +fast-redirects
+ +filter{html-annoyances} +filter{js-annoyances} +filter{kill-popups}
+ +filter{webbugs} +filter{nimda} +filter{banners-by-size} +filter{hal}
+ +filter{fun} +hide-forwarded-for-headers +hide-from-header{block}
+ +hide-referer{forge} -hide-user-agent -handle-as-image +set-image-blocker{blank}
+ +prevent-compression +session-cookies-only -prevent-setting-cookies
+ -prevent-reading-cookies +kill-popups -send-vanilla-wafer -send-wafer }
+ /
+
+ { +block +handle-as-image }
+ /ads
+
+Ooops, the "/adsl/" is matching "/ads"! But we did not want this at all! Now we
+see why we get the blank page. We could now add a new action below this that
+explicitly does not block ("{-block}") paths with "adsl". There are various
+ways to handle such exceptions. Example:
+
+ { -block }
+ /adsl
+
+Now the page displays ;-) Be sure to flush your browser's caches when making
+such changes. Or, try using Shift+Reload.
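+
+Why "/ads" catches "/adsl/", and why the later exception fixes it, can be seen
+in a short Python sketch. It assumes that path patterns behave like regular
+expressions anchored at the start of the path; the rule table and function
+name are made up for illustration and do not reflect Privoxy internals.
+

```python
import re

# (path pattern, settings) in file order; later matches override earlier.
RULES = [
    ("/", {"block": False, "handle-as-image": False}),
    ("/ads", {"block": True, "handle-as-image": True}),
    ("/adsl", {"block": False}),  # the new exception
]


def final_actions(path, rules=RULES):
    """Apply every rule whose pattern matches the start of the path."""
    final = {}
    for pattern, settings in rules:
        if re.match(pattern, path):
            final.update(settings)
    return final


# Without the exception, the HOWTO page is blocked by "/ads".
print(final_actions("/adsl/HOWTO/", RULES[:2]))
# With the exception, "block" is switched off again.
print(final_actions("/adsl/HOWTO/"))
# Real ad paths are still blocked.
print(final_actions("/ads/banner.gif"))
```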
+
+But now what about a situation where we get no explicit matches like we did
+with:
+
+ { +block +handle-as-image }
+ /ads
+
+That actually was very telling and pointed us quickly to where the problem was.
+If you don't get this kind of match, then it means one of the default rules in
+the first section is causing the problem. This would require some guesswork,
+and maybe a little trial and error to isolate the offending rule. One likely
+cause would be one of the "{+filter}" actions. Try adding the URL for the site
+to one of the aliases that turn off "+filter":
+
+ {shop}
+ .quietpc.com
+ .worldpay.com # for quietpc.com
+ .jungle.com
+ .scan.co.uk
+ .forbes.com
+
+"{shop}" is an "alias" that expands to "{ -filter -session-cookies-only }". Or
+you could do your own exception to negate filtering:
+
+ {-filter}
+ .forbes.com
+
+This would probably be most appropriately put in user.action, for local site
+exceptions.
+
+"{fragile}" is an alias that disables most actions. This can be used as a last
+resort for problem sites. Remember to flush caches! If this still does not
+work, you will have to go through the remaining actions one by one to find
+which one(s) is causing the problem.
+