X-Git-Url: http://www.privoxy.org/gitweb/?p=privoxy.git;a=blobdiff_plain;f=doc%2Ftext%2Fuser-manual.txt;h=e119faeee9428a379a5be988dfabc044c7cec380;hp=06b8faf0334ce145260d3ae55cbcc094d557e09b;hb=8a31ca80c7179bcf38ad3de25f2b106844880fee;hpb=64d565448d0ac8f945cdf6a2b0adaa981405ba7a diff --git a/doc/text/user-manual.txt b/doc/text/user-manual.txt index 06b8faf0..e119faee 100644 --- a/doc/text/user-manual.txt +++ b/doc/text/user-manual.txt @@ -1,1494 +1,7069 @@ -Junkbuster User Manual - - By: Junkbuster Developers - - Abstract - - The user manual gives the users information on how to install and - configure Internet Junkbuster. Internet Junkbuster is an application - that provides privacy and security to users of the World Wide Web. - - You can find the latest version of the user manual at - [1]http://ijbswa.sourceforge.net/doc/user-manual/. - - Feel free to send a note to the developers at - - ijbswa-developers@lists.sourceforge.net - - . - ____________________________________________________ - - Table of Contents - [2]Introduction - [3]Installation - [4]Junkbuster Configuration - [5]Quickstart to Using Junkbuster - [6]Contact the Developers - [7]Copyright and History - [8]See also - [9]Appendix - -Introduction - - Internet Junkbuster is a web proxy with advanced filtering - capabilities for protecting privacy, filtering web page content, - managing cookies, controlling access, and removing ads, banners, - pop-ups and other obnoxious Internet Junk. Junkbuster has a very - flexible configuration and can be customized to suit individual needs - and tastes. Internet Junkbuster has application for both stand-alone - systems and multi-user networks. - - This documentation is included with the current development version of - Internet Junkbuster and is incomplete at this point. The most up to - date reference for the time being is still the comments in the source - files and in the individual configuration files. 
Development of - version 3.0 is currently underway, and includes many significant - changes and enhancements over earlier verions. The target release date - for stable v3.0 is December 2001. - - Since this is a development version, some features are in the process - of being implemented. This documentation may be slightly out of sync - as a result. And there are bugs, though hopefully not many! - _________________________________________________________________ - -New Features - - In addition to Junkbuster's traditional features of ad and banner - blocking and cookie management, this is a list of new features - currently under development: - - * Modularized configuration that will allow for system wide - settings, and individual user settings. - * A browser based GUI configuration utility (not finished). - * Blocking of annoying pop-up browser windows (previously available - as a patch). - * Partial support for HTTP/1.1. - * Support for Perl Compatible Regular Expressions in the - configuration files, and generally a more sophisticated - configuration syntax over previous versions. - * Web page content filtering. - * Multi-threaded. - _________________________________________________________________ - -Installation - - Junkbuster is available as raw source code, or pre-compiled binaries. - See the [10]Junkbuster Home Page for current release info. Junkbuster - is also available via [11]CVS. This is the recommended approach at - this time. But please be aware that CVS is constantly changing, and it - may break in mysterious ways. - _________________________________________________________________ - -Source - - For gzipped tar archives, unpack the source: - - tar zxvf ijb_source_2.9* - cd ijb_source_2.9* - - For retrieving the current CVS sources, you'll need the CVS package - installed first. 
To download CVS source: - - cvs -d:pserver:anonymous@cvs.ijbswa.sourceforge.net:/cvsroot/ijbswa login - cvs -z3 -d:pserver:anonymous@cvs.ijbswa.sourceforge.net:/cvsroot/ijbswa co cu -rrent - cd current +Privoxy 3.0.8 User Manual + +[ Copyright 2001 - 2007 by Privoxy Developers ] + +$Id: user-manual.sgml,v 2.53 2008/01/19 15:03:05 hal9 Exp $ + +The Privoxy User Manual gives users information on how to install, configure +and use Privoxy. + +Privoxy is a non-caching web proxy with advanced filtering capabilities for +enhancing privacy, modifying web page data, managing HTTP cookies, controlling +access, and removing ads, banners, pop-ups and other obnoxious Internet junk. +Privoxy has a flexible configuration and can be customized to suit individual +needs and tastes. Privoxy has application for both stand-alone systems and +multi-user networks. + +Privoxy is based on Internet Junkbuster (tm). + +You can find the latest version of the Privoxy User Manual at http:// +www.privoxy.org/user-manual/. Please see the Contact section on how to contact +the developers. + +------------------------------------------------------------------------------- + +Table of Contents +1. Introduction + + 1.1. Features + +2. Installation + + 2.1. Binary Packages + + 2.1.1. Red Hat and Fedora RPMs + 2.1.2. Debian and Ubuntu + 2.1.3. Windows + 2.1.4. Solaris + 2.1.5. OS/2 + 2.1.6. Mac OSX + 2.1.7. AmigaOS + 2.1.8. FreeBSD + 2.1.9. Gentoo + + 2.2. Building from Source + 2.3. Keeping your Installation Up-to-Date + +3. What's New in this Release + + 3.1. Note to Upgraders + +4. Quickstart to Using Privoxy + + 4.1. Quickstart to Ad Blocking + +5. Starting Privoxy + + 5.1. Red Hat and Fedora + 5.2. Debian + 5.3. Windows + 5.4. Solaris, NetBSD, FreeBSD, HP-UX and others + 5.5. OS/2 + 5.6. Mac OSX + 5.7. AmigaOS + 5.8. Gentoo + 5.9. Command Line Options + +6. Privoxy Configuration + + 6.1. Controlling Privoxy with Your Web Browser + 6.2. Configuration Files Overview + +7. 
The Main Configuration File + + 7.1. Local Set-up Documentation + + 7.1.1. user-manual + 7.1.2. trust-info-url + 7.1.3. admin-address + 7.1.4. proxy-info-url + + 7.2. Configuration and Log File Locations + + 7.2.1. confdir + 7.2.2. templdir + 7.2.3. logdir + 7.2.4. actionsfile + 7.2.5. filterfile + 7.2.6. logfile + 7.2.7. jarfile + 7.2.8. trustfile + + 7.3. Debugging + + 7.3.1. debug + 7.3.2. single-threaded + + 7.4. Access Control and Security + + 7.4.1. listen-address + 7.4.2. toggle + 7.4.3. enable-remote-toggle + 7.4.4. enable-remote-http-toggle + 7.4.5. enable-edit-actions + 7.4.6. enforce-blocks + 7.4.7. ACLs: permit-access and deny-access + 7.4.8. buffer-limit + + 7.5. Forwarding + + 7.5.1. forward + 7.5.2. forward-socks4 and forward-socks4a + 7.5.3. Advanced Forwarding Examples + 7.5.4. forwarded-connect-retries + 7.5.5. accept-intercepted-requests + 7.5.6. allow-cgi-request-crunching + 7.5.7. split-large-forms + + 7.6. Windows GUI Options + +8. Actions Files + + 8.1. Finding the Right Mix + 8.2. How to Edit + 8.3. How Actions are Applied to Requests + 8.4. Patterns + + 8.4.1. The Domain Pattern + 8.4.2. The Path Pattern + 8.4.3. The Tag Pattern + + 8.5. Actions + + 8.5.1. add-header + 8.5.2. block + 8.5.3. client-header-filter + 8.5.4. client-header-tagger + 8.5.5. content-type-overwrite + 8.5.6. crunch-client-header + 8.5.7. crunch-if-none-match + 8.5.8. crunch-incoming-cookies + 8.5.9. crunch-server-header + 8.5.10. crunch-outgoing-cookies + 8.5.11. deanimate-gifs + 8.5.12. downgrade-http-version + 8.5.13. fast-redirects + 8.5.14. filter + 8.5.15. force-text-mode + 8.5.16. forward-override + 8.5.17. handle-as-empty-document + 8.5.18. handle-as-image + 8.5.19. hide-accept-language + 8.5.20. hide-content-disposition + 8.5.21. hide-if-modified-since + 8.5.22. hide-forwarded-for-headers + 8.5.23. hide-from-header + 8.5.24. hide-referrer + 8.5.25. hide-user-agent + 8.5.26. inspect-jpegs + 8.5.27. kill-popups + 8.5.28. limit-connect + 8.5.29. 
prevent-compression + 8.5.30. overwrite-last-modified + 8.5.31. redirect + 8.5.32. send-vanilla-wafer + 8.5.33. send-wafer + 8.5.34. server-header-filter + 8.5.35. server-header-tagger + 8.5.36. session-cookies-only + 8.5.37. set-image-blocker + 8.5.38. treat-forbidden-connects-like-blocks + 8.5.39. Summary + + 8.6. Aliases + 8.7. Actions Files Tutorial + + 8.7.1. default.action + 8.7.2. user.action + +9. Filter Files + + 9.1. Filter File Tutorial + 9.2. The Pre-defined Filters + +10. Privoxy's Template Files +11. Contacting the Developers, Bug Reporting and Feature Requests + + 11.1. Get Support + 11.2. Reporting Problems + + 11.2.1. Reporting Ads or Other Configuration Problems + 11.2.2. Reporting Bugs + + 11.3. Request New Features + 11.4. Other + +12. Privoxy Copyright, License and History + + 12.1. License + 12.2. History + 12.3. Authors + +13. See Also +14. Appendix + + 14.1. Regular Expressions + 14.2. Privoxy's Internal Pages + + 14.2.1. Bookmarklets + + 14.3. Chain of Events + 14.4. Troubleshooting: Anatomy of an Action + +1. Introduction + +This documentation is included with the current stable version of Privoxy, +v.3.0.8. + +------------------------------------------------------------------------------- + +1.1. Features + +In addition to the core features of ad blocking and cookie management, Privoxy +provides many supplemental features, that give the end-user more control, more +privacy and more freedom: + + * Can be run as an "intercepting" proxy, which obviates the need to configure + browsers individually. + + * Sophisticated actions and filters for manipulating both server and client + headers. + + * Can be chained with other proxies. + + * Integrated browser based configuration and control utility at http:// + config.privoxy.org/ (shortcut: http://p.p/). Browser-based tracing of rule + and filter effects. Remote toggling. 
+ + * Web page filtering (text replacements, removes banners based on size, + invisible "web-bugs", JavaScript and HTML annoyances, pop-up windows, etc.) + + * Modularized configuration that allows for standard settings and user + settings to reside in separate files, so that installing updated actions + files won't overwrite individual user settings. + + * Support for Perl Compatible Regular Expressions in the configuration files, + and a more sophisticated and flexible configuration syntax. + + * Improved cookie management features (e.g. session based cookies). + + * GIF de-animation. + + * Bypass many click-tracking scripts (avoids script redirection). + + * Multi-threaded (POSIX and native threads). + + * User-customizable HTML templates for all proxy-generated pages (e.g. + "blocked" page). + + * Auto-detection and re-reading of config file changes. + + * Improved signal handling, and a true daemon mode (Unix). + + * Every feature now controllable on a per-site or per-location basis, + configuration more powerful and versatile over-all. + + * Many smaller new features added, limitations and bugs removed, and security + holes fixed. + +------------------------------------------------------------------------------- + +2. Installation + +Privoxy is available both in convenient pre-compiled packages for a wide range +of operating systems, and as raw source code. For most users, we recommend +using the packages, which can be downloaded from our Privoxy Project Page. + +Note: On some platforms, the installer may remove previously installed +versions, if found. (See below for your platform). In any case be sure to +backup your old configuration if it is valuable to you. See the note to +upgraders section below. + +------------------------------------------------------------------------------- + +2.1. Binary Packages - This will create a directory named current/, which will contain the - source tree. 
+How to install the binary packages depends on your operating system: - Then, in either case, to build from source: +------------------------------------------------------------------------------- - ./configure - make - su - make install +2.1.1. Red Hat and Fedora RPMs - For Redhat and SuSE Linux RPM packages, see below. - _________________________________________________________________ +RPMs can be installed with rpm -Uvh privoxy-3.0.8-1.rpm, and will use /etc/ +privoxy for the location of configuration files. -Red Hat +Note that on Red Hat, Privoxy will not be automatically started on system boot. +You will need to enable that using chkconfig, ntsysv, or similar methods. - To build Redhat RPM packages, install source as above. Then: +If you have problems with failed dependencies, try rebuilding the SRC RPM: rpm +--rebuild privoxy-3.0.8-1.src.rpm. This will use your locally installed +libraries and RPM version. - ./configure - make redhat-dist +Also note that if you have a Junkbuster RPM installed on your system, you need +to remove it first, because the packages conflict. Otherwise, RPM will try to +remove Junkbuster automatically if found, before installing Privoxy. - This will create both binary and src RPMs in the usual places. - Example: +------------------------------------------------------------------------------- - /usr/src/redhat/RPMS/i686/junkbuster-2.9.8-1.i686.rpm +2.1.2. Debian and Ubuntu - /usr/src/redhat/SRPMS/junkbuster-2.9.9-1.src.rpm +DEBs can be installed with apt-get install privoxy, and will use /etc/privoxy +for the location of configuration files. - To install, of course: +------------------------------------------------------------------------------- - rpm -Uvv /usr/src/redhat/RPMS/i686/junkbuster-2.9.9-1.i686.rpm +2.1.3. Windows - This will place the Junkbuster configuration files in - /etc/junkbuster/, and log files in /var/log/junkbuster/. 
- _________________________________________________________________ +Just double-click the installer, which will guide you through the installation +process. You will find the configuration files in the same directory as you +installed Privoxy in. -SuSE +Version 3.0.5 beta introduced full Windows service functionality. On Windows +only, the Privoxy program has two new command line arguments to install and +uninstall Privoxy as a service. - To build SuSE RPM packages, install source as above. Then: +Arguments: - ./configure - make suse-dist + --install[:service_name] - This will create both binary and src RPMs in the usual places. - Example: + --uninstall[:service_name] - /usr/src/suse/RPMS/i686/junkbuster-2.9.9-1.i686.rpm +After invoking Privoxy with --install, you will need to bring up the Windows +service console to assign the user you want Privoxy to run under, and whether +or not you want it to run whenever the system starts. You can start the Windows +services console with the following command: services.msc. If you do not take +the manual step of modifying Privoxy's service settings, it will not start. +Note too that you will need to give Privoxy a user account that actually +exists, or it will not be permitted to write to its log and configuration +files. - /usr/src/suse/SRPMS/junkbuster-2.9.9-1.src.rpm +------------------------------------------------------------------------------- - To install, of course: +2.1.4. Solaris - rpm -Uvv /usr/src/suse/RPMS/i686/junkbuster-2.9.9-1.i686.rpm +Create a new directory, cd to it, then unzip and untar the archive. For the +most part, you'll have to figure out where things go. - This will place the Junkbuster configuration files in - /etc/junkbuster/, and log files in /var/log/junkbuster/. - _________________________________________________________________ +------------------------------------------------------------------------------- -OS/2 +2.1.5. 
OS/2 - The OS/2 version of Junkbuster requires the EMX runtime library to be - installed. The EMX runtime library is available on the hobbes OS/2 - archive, among many other locations: - [12]http://hobbes.nmsu.edu/cgi-bin/h-search?sh=1&button=Search&key=emx - rt.zip&stype=all&sort=type&dir=%2Fpub%2Fos2%2Fdev%2Femx%2Fv0.9d +First, make sure that no previous installations of Junkbuster and / or Privoxy +are left on your system. Check that no Junkbuster or Privoxy objects are in +your startup folder. - Junkbuster is packaged in a WarpIN self- installing archive. The - self-installing program will be named depending on the release - version, something like: ijbos123.exe. In order to install it, simply - run this executable or double-click on its icon and follow the WarpIN - installation panels. A shadow of the Junkbuster executable will be - placed in your startup folder so it will start automatically whenever - OS/2 starts. +Then, just double-click the WarpIN self-installing archive, which will guide +you through the installation process. A shadow of the Privoxy executable will +be placed in your startup folder so it will start automatically whenever OS/2 +starts. - The directory you choose to install Junkbuster into will contain all - of the configuration files. +The directory you choose to install Privoxy into will contain all of the +configuration files. - If you would like to build binary images on OS/2 yourself, you will - need a working EMX/GCC environment, plus several Unix-like tools. The - Hobbes OS/2 archive is a good place to start when building such an - environment. A set of Unix-like tools named gnupack is located here: - [13]http://hobbes.nmsu.edu/cgi-bin/h-search?sh=1&key=gnupack&stype=all - &sort=type&dir=%2Fpub%2Fos2%2Fapps +------------------------------------------------------------------------------- - Once you have the source code unpacked as above, you can build the - binaries from the current/ directory: +2.1.6. 
Mac OSX +Unzip the downloaded file (you can either double-click on the file from the +finder, or from the desktop if you downloaded it there). Then, double-click on +the package installer icon named Privoxy.pkg and follow the installation +process. Privoxy will be installed in the folder /Library/Privoxy. It will +start automatically whenever you start up. To prevent it from starting +automatically, remove or rename the folder /Library/StartupItems/Privoxy. + +To start Privoxy by hand, double-click on StartPrivoxy.command in the /Library/ +Privoxy folder. Or, type this command in the Terminal: + + /Library/Privoxy/StartPrivoxy.command + + + +You will be prompted for the administrator password. + +------------------------------------------------------------------------------- + +2.1.7. AmigaOS + +Copy and then unpack the lha archive to a suitable location. All necessary +files will be installed into Privoxy directory, including all configuration and +log files. To uninstall, just remove this directory. + +------------------------------------------------------------------------------- + +2.1.8. FreeBSD + +Privoxy is part of FreeBSD's Ports Collection, you can build and install it +with cd /usr/ports/www/privoxy; make install clean. + +If you don't use the ports, you can fetch and install the package with pkg_add +-r privoxy. + +The port skeleton and the package can also be downloaded from the File Release +Page, but there's no reason to use them unless you're interested in the beta +releases which are only available there. + +------------------------------------------------------------------------------- + +2.1.9. Gentoo + +Gentoo source packages (Ebuilds) for Privoxy are contained in the Gentoo +Portage Tree (they are not on the download page, but there is a Gentoo section, +where you can see when a new Privoxy Version is added to the Portage Tree). + +Before installing Privoxy under Gentoo just do first emerge rsync to get the +latest changes from the Portage tree. 
With emerge privoxy you install the
+latest version.
+
+Configuration files are in /etc/privoxy, the documentation is in /usr/share/doc
+/privoxy-3.0.8 and the Log directory is in /var/log/privoxy.
+
+-------------------------------------------------------------------------------
+
+2.2. Building from Source
+
+The most convenient way to obtain the Privoxy sources is to download the source
+tarball from our project download page.
+
+If you like to live on the bleeding edge and are not afraid of using possibly
+unstable development versions, you can check out the up-to-the-minute version
+directly from the CVS repository.
+
+To build Privoxy from source, autoconf, GNU make (gmake), and, of course, a C
+compiler like gcc are required.
+
+When building from a source tarball, first unpack the source:
+
+ tar xzvf privoxy-3.0.8-src* [.tgz or .tar.gz]
+ cd privoxy-3.0.8
+
+
+For retrieving the current CVS sources, you'll need a CVS client installed.
+Note that sources from CVS are typically development quality, and may not be
+stable, or well tested. To download CVS source, check the Sourceforge
+documentation, which might give commands like:
+
+ cvs -d:pserver:anonymous@ijbswa.cvs.sourceforge.net:/cvsroot/ijbswa login
+ cvs -z3 -d:pserver:anonymous@ijbswa.cvs.sourceforge.net:/cvsroot/ijbswa co current
+ cd current
+
+
+This will create a directory named current/, which will contain the source
+tree.
+
+You can also check out any Privoxy "branch", just exchange the current name
+with the wanted branch name (Example: v_3_0_branch for the 3.0 cvs tree).
+
+It is also strongly recommended to not run Privoxy as root. You should
+configure/install/run Privoxy as an unprivileged user, preferably by creating a
+"privoxy" user and group just for this purpose. See your local documentation
+for the correct command line to add new users and groups (something like
+adduser, but the command syntax may vary from platform to platform).
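As a rough illustration of the account-creation step described above, the sketch below assumes a Linux system with the shadow-utils tools groupadd and useradd; other platforms use different commands and options. The helper names run and create_privoxy_account, the DRY_RUN switch, and the /sbin/nologin shell path are assumptions made for this example and are not part of Privoxy. By default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch only: create a dedicated unprivileged "privoxy" user and group.
# Assumes Linux shadow-utils (groupadd/useradd); syntax varies by platform.
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical helper: a system group plus a system user with no usable
# login shell. The nologin path is an assumption and differs between systems.
create_privoxy_account() {
    run groupadd -r privoxy
    run useradd -r -g privoxy -d /no/home -s /sbin/nologin privoxy
}

create_privoxy_account
```

To actually create the account, the same script would be run as root with DRY_RUN=0, after checking the local documentation for the right options.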
+
+/etc/passwd might then look like:
+
+ privoxy:*:7777:7777:privoxy proxy:/no/home:/no/shell
+
+
+And then /etc/group, like:
+
+ privoxy:*:7777:
+
+
+Some binary packages may do this for you.
+
+Then, to build from either unpacked tarball or CVS source:
+
+ autoheader
  autoconf
- sh configure
- make
- _________________________________________________________________
+ ./configure # (--help to see options)
+ make # (the make from GNU, sometimes called gmake)
+ su # Possibly required
+ make -n install # (to see where all the files will go)
+ make -s install # (to really install, -s to silence output)
-Windows
- Click-click. (I need help on this. Not a clue here. Also for
- configuration section below. HB.)
- _________________________________________________________________
+Using GNU make, you can have the first four steps automatically done for you by
+just typing:
-Other
+ make
- Some quick notes on other Operating Systems.
- For FreeBSD (and other *BSDs?), the build will need gmake instead of
- the included make. gmake is available from [14]http://www.gnu.org. The
- rest should be the same as above for Linux/Unix.
- _________________________________________________________________
+in the freshly downloaded or unpacked source directory.
-Junkbuster Configuration
+To build an executable with security enhanced features so that users cannot
+easily bypass the proxy (e.g. "Go There Anyway"), or alter their own
+configurations, configure like this:
- For Unix, *BSD and Linux, all configuration files are located in
- /etc/junkbuster/ by default. For MS Windows and OS/2, these are all in
- the same directory as the Junkbuster executable. The name and number
- of configuration files has changed from previous versions, and is
- subject to change as development progresses.
+ ./configure --disable-toggle --disable-editor --disable-force
- The installed defaults provide a reasonable starting point.
For the
- time being, there are only three default configuration files (this
- will change in time):
- * The main configuration file is named config on Linux, Unix, BSD,
- and OS/2, and junkbustr.txt on Windows. On Amiga, it is
- AmiTCP:db/junkbuster/config.
- * The actionsfile file is used to define various actions relating to
- images, banners, pop-ups, banners and cookies.
- * The re_filterfile file can be used to rewrite the raw page
- content, including text as well as embedded HTML and JavaScript.
+Then build as above. In Privoxy 3.0.7 and later, all of these options can also
+be disabled through the configuration file.
- actionsfile and re_filterfile can use Perl style regular expressions
- for maximum flexibility. All files use the "#" character to denote a
- comment. Such lines are not processed by Junkbuster. After making any
- changes, restart Junkbuster in order for the changes to take effect.
- _________________________________________________________________
+WARNING: If installing as root, the install will fail unless a non-root user or
+group is specified, or a privoxy user and group already exist on the system. If
+a non-root user is specified, and no group, then the installation will try to
+also use a group of the same name as "user". If a group is specified (and no
+user), then the support files will be installed as writable by that group, and
+owned by the user running the installation.
-The Main Configuration File
+configure accepts --with-user and --with-group options for setting user and
+group ownership of the configuration files (which need to be writable by the
+daemon). The specified user must already exist. When starting Privoxy, it must
+be run as this same user to ensure write access to configuration and log files!
- Again, the main configuration file is named config on Linux/Unix/BSD
- and OS/2, and junkbustr.txt on Windows.
Configuration lines consist of
- an initial keyword followed by a list of values, all separated by
- whitespace (any number of spaces or tabs). For example:
+Alternately, you can specify user and group on the make command line, but be
+sure both already exist:
- blockfile blocklist.ini
+ make -s install USER=privoxy GROUP=privoxy
- Indicates that the blockfile is named "blocklist.ini".
- The "#" indicates a comment. Any part of a line following a "#" is
- ignored, except if the "#" is preceded by a "\".
+The default installation path for make install is /usr/local. This may of
+course be customized with the various ./configure path options. If you are
+doing an install to anywhere besides /usr/local, be sure to set the appropriate
+paths with the correct configure options (./configure --help). Non-privileged
+users must of course have write access permissions to wherever the target
+installation is going.
- Thus, by placing a "#" at the start of an existing configuration line,
- you can make it a comment and it will be treated as if it weren't
- there. This is called "commenting out" an option and can be useful to
- turn off features: If you comment out the "logfile" line, junkbuster
- will not log to a file at all. Watch for the "default:" section in
- each explanation to see what happens if the option is left unset (or
- commented out).
+If you do install to /usr/local, the install will use sysconfdir=$prefix/etc/
+privoxy by default. All other destinations, and the direct usage of
+--sysconfdir flag behave like normal, i.e. will not add the extra privoxy
+directory. This is for a safer install, as there may already exist another
+program that uses a file with the "config" name, and thus makes /usr/local/etc
+cleaner.
- Long lines can be continued on the next line by using a "\" as the
- very last character.
+If installing to /usr/local, the documentation will go by default to $prefix/
+share/doc.
But if this directory doesn't exist, it will then try $prefix/doc
+and install there before creating a new $prefix/share/doc just for Privoxy.
- There are various aspects of Junkbuster behavior that can be adjusted.
- _________________________________________________________________
+Again, if the install goes to /usr/local, the localstatedir (i.e. var/) will
+default to /var instead of $prefix/var so the logs will go to /var/log/privoxy
+/, and the pid file will be created in /var/run/privoxy.pid.
-Defining Other Configuration Files
+make install will attempt to set the correct values in config (main
+configuration file). You should check this to make sure all values are correct.
+If appropriate, an init script will be installed, but it is up to the user to
+determine how and where to start Privoxy. The init script should be checked for
+correct paths and values, if anything other than a default install is done.
- Junkbuster can use a number of other files to tell it what ads to
- block, what cookies to accept, etc. This section of the configuration
- file tells Junkbuster where to find all those other files.
+If install finds previous versions of local configuration files, most of these
+will not be overwritten, and the new ones will be installed with a "new"
+extension. default.action, default.filter, and standard.action will be
+overwritten. You will then need to manually update the other installed
+configuration files as needed. The default template files will be overwritten.
+If you have customized, local templates, these should be stored safely in a
+separate directory and defined in config by the "templdir" directive. It is of
+course wise to always back up any important configuration files "just in case".
+If a previous version of Privoxy is already running, you will have to restart
+it manually.
- On Windows, Junkbuster looks for these files in the same directory as
- the executable.
On Unix and OS/2, Junkbuster looks for these files in - the current working directory. In either case, an absolute path name - can be used to avoid problems. +For more detailed instructions on how to build Redhat RPMs, Windows +self-extracting installers, building on platforms with special requirements +etc, please consult the developer manual. - When development goes modular and multiuser, the blocker, filter, and - per-user config will be stored in subdirectories of "confdir". For - now, only confdir/templates is used for storing HTML templates for CGI - results. +------------------------------------------------------------------------------- - The location of the configuration files: +2.3. Keeping your Installation Up-to-Date - confdir /etc/junkbuster # No trailing /, please. +As user feedback comes in and development continues, we will make updated +versions of both the main actions file (as a separate package) and the software +itself (including the actions file) available for download. - The directory where all logging (i.e. logfile and jarfile) takes - place. No trailing "/", please: +If you wish to receive an email notification whenever we release updates of +Privoxy or the actions file, subscribe to our announce mailing list, +ijbswa-announce@lists.sourceforge.net. - logdir /var/log/junkbuster +In order not to lose your personal changes and adjustments when updating to the +latest default.action file we strongly recommend that you use user.action and +user.filter for your local customizations of Privoxy. See the Chapter on +actions files for details. - Note that all file specifications below are relative to the above two - directories! +------------------------------------------------------------------------------- - The "actionsfile" contains patterns to specify the actions to apply to - requests for each site. Default: Cookies to and from all destinations - are filtered. Popups are disabled for all sites. 
All sites are - filtered if re_filterfile specified. No sites are blocked. An empty - image is displayed for filtered ads and other images (formerly - "tinygif"). The syntax of this file is explained in detail [15]below. +3. What's New in this Release - actionsfile actionsfile +There are many improvements and new features since Privoxy 3.0.6, the last +stable release: - The "re_filterfile" file contains content modification rules. These - rules permit powerful changes on the content of Web pages, e.g., you - could disable your favourite JavaScript annoyances, rewrite the actual - content, or just have some fun replacing "Microsoft" with "MicroSuck" - wherever it appears on a Web page. Default: No content modification, - or whatever the developers are playing with :-/ + * Two new actions server-header-tagger and client-header-tagger that can be + used to create arbitrary "tags" based on client and server headers. These + "tags" can then subsequently be used to control the other actions used for + the current request, greatly increasing Privoxy's flexibility and + selectivity. See tag patterns for more information on tags. - re_filterfile re_filterfile + * Header filtering is done with dedicated header filters now. As a result the + actions "filter-client-headers" and "filter-server-headers" that were + introduced with Privoxy 3.0.5 to apply content filters to the headers have + been removed. See the new actions server-header-filter and + client-header-filter for details. - The logfile is where all logging and error messages are written. The - logfile can be useful for tracking down a problem with Junkbuster - (e.g., it's not blocking an ad you think it should block) but in most - cases you probably will never look at it. + * There are four new options for the main config file: - Your logfile will grow indefinitely, and you will probably want to - periodically remove it. On Unix systems, you can do this with a cron - job (see "man cron"). 
For Redhat, a logrotate script has been - included. - - On SuSE Linux systems, you can place a line like - "/var/log/junkbuster.* +1024k 644 nobody.nogroup" in /etc/logfiles, - with the effect that cron.daily will automatically archive, gzip, and - empty the log, when it exceeds 1M size. - - Default: Log to the a file named logfile. Comment out to disable - logging. + + allow-cgi-request-crunching which allows requests for Privoxy's + internal CGI pages to be blocked, redirected or (un)trusted like + ordinary requests. - logfile logfile + + split-large-forms that will work around a browser bug that caused IE6 + and IE7 to ignore the Submit button on the Privoxy's + edit-actions-for-url CGI page. - The "jarfile" defines where Junkbuster stores the cookies it - intercepts. Note that if you use a "jarfile", it may grow quite large. - Default: Don't store intercepted cookies. + + accept-intercepted-requests which allows to combine Privoxy with any + packet filter to create an intercepting proxy for HTTP/1.1 requests + (and for HTTP/1.0 requests with Host header set). This means clients + can be forced to use Privoxy even if their proxy settings are + configured differently. - #jarfile jarfile + + templdir to designate an alternate location for Privoxy's locally + customized CGI templates so that these are not overwritten during + upgrades. - If you specify a "trustfile", Junkbuster will only allow access to - sites that are named in the trustfile. You can also mark sites as - trusted referrers, with the effect that access to untrusted sites will - be granted, if a link from a trusted referrer was used. The link - target will then be added to the "trustfile". This is a very - restrictive feature that typical users most propably want to leave - disabled. Default: Disabled, don't use the trust mechanism. + * A new command line option --pre-chroot-nslookup hostname to initialize the + resolver library before chroot'ing. 
On some systems this reduces the number + of files that must be copied into the chroot tree. (Patch provided by + Stephen Gildea) - #trustfile trust + * The forward-override action allows changing of the forwarding settings + through the actions files. Combined with tags, this allows to choose the + forwarder based on client headers like the User-Agent, or the request + origin. - If you use the trust mechanism, it is a good idea to write up some - online documentation about your blocking policy and to specify the - URL(s) here. They will appear on the page that your users receive when - they try to access untrusted content. Use multiple times for multiple - URLs. Default: Don't display links on the "untrusted" info page. + * The redirect action can now use regular expression substitutions against + the original URL. - trust-info-url http://www.your-site.com/why_we_block.html - trust-info-url http://www.your-site.com/what_we_allow.html - _________________________________________________________________ + * zlib support is now available as a compile time option to filter compressed + content. Patch provided by Wil Mahan. -Other Configuration Options + * Improve various filters, and add new ones. - This part of the configuration file contains options that control how - Junkbuster operates. + * Include support for RFC 3253 so that Subversion works with Privoxy. Patch + provided by Petr Kadlec. - "Admin-address" should be set to the email address of the proxy - administrator. It is used in many of the proxy-generated pages. - Default: fill@me.in.please. + * Logging can be completely turned off by not specifying a logfile directive. - #admin-address fill@me.in.please + * A number of improvements to Privoxy's internal CGI pages, including the use + of favicons for error and control pages. - "Proxy-info-url" can be set to a URL that contains more info about - this Junkbuster installation, it's configuration and policies. 
It is - used in many of the proxy-generated pages and its use is highly - recommended in multi-user installations, since your users will want to - know why certain content is blocked or modified. Default: Don't show a - link to online documentation. + * Many bugfixes, memory leaks addressed, code improvements, and logging + improvements. - proxy-info-url http://www.your-site.com/proxy.html +For a more detailed list of changes please have a look at the ChangeLog. - "Listen-address" specifies the address and port where Junkbuster will - listen for connections from your Web browser. The default is to listen - on the localhost port 8000, and this is suitable for most users. (In - your web browser, under proxy configuration, list the proxy server as - "localhost" and the port as "8000"). +------------------------------------------------------------------------------- - If you already have another service running on port 8000, or if you - want to serve requests from other machines (e.g. on your local - network) as well, you will need to override the default. The syntax is - "listen-address []:". If you leave out the IP - adress, junkbuster will bind to all interfaces (addresses) on your - machine and may become reachable from the internet. In that case, - consider using access control lists (acl's) (see "aclfile" above). +3.1. Note to Upgraders - For example, suppose you are running Junkbuster on a machine which has - the address 192.168.0.1 on your local private network (192.168.0.0) - and has another outside connection with a different address. You want - it to serve requests from inside only: +A quick list of things to be aware of before upgrading from earlier versions of +Privoxy: - listen-address 192.168.0.1:8000 + * The recommended way to upgrade Privoxy is to backup your old configuration + files, install the new ones, verify that Privoxy is working correctly and + finally merge back your changes using diff and maybe patch. 
- If you want it to listen on all addresses (including the outside - connection): + There are a number of new features in each Privoxy release and most of them + have to be explicitly enabled in the configuration files. Old configuration + files obviously don't do that and due to syntax changes using old + configuration files with a new Privoxy isn't always possible anyway. - listen-address :8000 + * Note that some installers remove earlier versions completely, including + configuration files, therefore you should really save any important + configuration files! - If you do this, consider using ACLs (see "aclfile" above). Note: you - will need to point your browser(s) to the address and port that you - have configured here. Default: localhost:8000 (127.0.0.1:8000). + * On the other hand, other installers don't overwrite existing configuration + files, thinking you will want to do that yourself. - The debug option sets the level of debugging information to log in the - logfile (and to the console in the Windows version). A debug level of - 1 is informative because it will show you each request as it happens. - Higher levels of debug are probably only of interest to developers. + * standard.action now only includes the enabled actions. Not all actions as + before. - debug 1 # GPC = show each GET/POST/CONNECT request - debug 2 # CONN = show each connection status - debug 4 # IO = show I/O status - debug 8 # HDR = show header parsing - debug 16 # LOG = log all data into the logfile - debug 32 # FRC = debug force feature - debug 64 # REF = debug regular expression filter - debug 128 # = debug fast redirects - debug 256 # = debug GIF deanimation - debug 512 # CLF = Common Log Format - debug 1024 # = debug kill popups - debug 4096 # INFO = Startup banner and warnings. - debug 8192 # ERROR = Non-fatal errors + * In the default configuration only fatal errors are logged now. You can + change that in the debug section of the configuration file. 
You may also + want to enable more verbose logging until you verified that the new Privoxy + version is working as expected. - It is highly recommended that you enable ERROR reporting (debug 8192), - at least until the next stable release. + * Three other config file settings are now off by default: + enable-remote-toggle, enable-remote-http-toggle, and enable-edit-actions. + If you use or want these, you will need to explicitly enable them, and be + aware of the security issues involved. - The reporting of FATAL errors (i.e. ones which crash JunkBuster) is - always on and cannot be disabled. + * The "filter-client-headers" and "filter-server-headers" actions that were + introduced with Privoxy 3.0.5 to apply content filters to the headers have + been removed and replaced with new actions. See the What's New section + above. + +------------------------------------------------------------------------------- - If you want to use CLF (Common Log Format), you should set "debug 512" - ONLY, do not enable anything else. +4. Quickstart to Using Privoxy + + * Install Privoxy. See the Installation Section below for platform specific + information. + + * Advanced users and those who want to offer Privoxy service to more than + just their local machine should check the main config file, especially the + security-relevant options. These are off by default. + + * Start Privoxy, if the installation program has not done this already (may + vary according to platform). See the section Starting Privoxy. + + * Set your browser to use Privoxy as HTTP and HTTPS (SSL) proxy by setting + the proxy configuration for address of 127.0.0.1 and port 8118. DO NOT + activate proxying for FTP or any protocols besides HTTP and HTTPS (SSL) + unless you intend to prevent your browser from using these protocols. + + * Flush your browser's disk and memory caches, to remove any cached ad + images. If using Privoxy to manage cookies, you should remove any currently + stored cookies too. 
+ + * A default installation should provide a reasonable starting point for most. + There will undoubtedly be occasions where you will want to adjust the + configuration, but that can be dealt with as the need arises. Little to no + initial configuration is required in most cases, you may want to enable the + web-based action editor though. Be sure to read the warnings first. + + See the Configuration section for more configuration options, and how to + customize your installation. You might also want to look at the next + section for a quick introduction to how Privoxy blocks ads and banners. - Multiple "debug" directives, are OK - they're logical-OR'd together. + * If you experience ads that slip through, innocent images that are blocked, + or otherwise feel the need to fine-tune Privoxy's behavior, take a look at + the actions files. As a quick start, you might find the richly commented + examples helpful. You can also view and edit the actions files through the + web-based user interface. The Appendix "Troubleshooting: Anatomy of an + Action" has hints on how to understand and debug actions that "misbehave". + + * Please see the section Contacting the Developers on how to report bugs, + problems with websites or to get help. + + * Now enjoy surfing with enhanced control, comfort and privacy! + +------------------------------------------------------------------------------- + +4.1. Quickstart to Ad Blocking + +Ad blocking is but one of Privoxy's array of features. Many of these features +are for the technically minded advanced user. But, ad and banner blocking is +surely common ground for everybody. + +This section will provide a quick summary of ad blocking so you can get up to +speed quickly without having to read the more extensive information provided +below, though this is highly recommended. + +First a bit of a warning ... 
blocking ads is much like blocking SPAM: the more +aggressive you are about it, the more likely you are to block things that were +not intended. And the more likely that some things may not work as intended. So +there is a trade off here. If you want extreme ad free browsing, be prepared to +deal with more "problem" sites, and to spend more time adjusting the +configuration to solve these unintended consequences. In short, there is not an +easy way to eliminate all ads. Either take the easy way and settle for most ads +blocked with the default configuration, or jump in and tweak it for your +personal surfing habits and preferences. + +Secondly, a brief explanation of Privoxy's "actions". "Actions" in this +context, are the directives we use to tell Privoxy to perform some task +relating to HTTP transactions (i.e. web browsing). We tell Privoxy to take some +"action". Each action has a unique name and function. While there are many +potential actions in Privoxy's arsenal, only a few are used for ad blocking. +Actions, and action configuration files, are explained in depth below. + +Actions are specified in Privoxy's configuration, followed by one or more URLs +to which the action should apply. URLs can actually be URL type patterns that +use wildcards so they can apply potentially to a range of similar URLs. The +actions, together with the URL patterns are called a section. + +When you connect to a website, the full URL will either match one or more of +the sections as defined in Privoxy's configuration, or not. If so, then Privoxy +will perform the respective actions. If not, then nothing special happens. +Furthermore, web pages may contain embedded, secondary URLs that your web +browser will use to load additional components of the page, as it parses the +original page's HTML content. An ad image for instance, is just an URL embedded +in the page somewhere. The image itself may be on the same server, or a server +somewhere else on the Internet. 
Complex web pages will have many such embedded +URLs. Privoxy can deal with each URL individually, so, for instance, the main +page text is not touched, but images from such-and-such server are blocked. + +The most important actions for basic ad blocking are: block, handle-as-image, +handle-as-empty-document, and set-image-blocker: + + * block - this is perhaps the single most used action, and is particularly + important for ad blocking. This action stops any contact between your + browser and any URL patterns that match this action's configuration. It can + be used for blocking ads, but also anything that is determined to be + unwanted. By itself, it simply stops any communication with the remote + server and sends Privoxy's own built-in BLOCKED page instead to let you know + what has happened (with some exceptions, see below). + + * handle-as-image - tells Privoxy to treat this URL as an image. Privoxy's + default configuration already does this for all common image types (e.g. + GIF), but there are many situations where this is not so easy to determine. + So we'll force it in these cases. This is particularly important for ad + blocking, since only if we know that it's an image of some kind, can we + replace it with an image of our choosing, instead of the Privoxy BLOCKED + page (which would only result in a "broken image" icon). There are some + limitations to this though. For instance, you can't just brute-force an + image substitution for an entire HTML page in most situations. + + * handle-as-empty-document - sends an empty document instead of Privoxy's + normal BLOCKED HTML page. This is useful for file types that are neither + HTML nor images, such as blocking JavaScript files. + + * set-image-blocker - tells Privoxy what to display in place of an ad image + that has hit a block rule. For this to come into play, the URL must match a + block action somewhere in the configuration, and, it must also match a + handle-as-image action.
+ + The configuration options on what to display instead of the ad are: + + pattern - a checkerboard pattern, so that an ad replacement is obvious. + This is the default. + + blank - A very small empty GIF image is displayed. This is the so-called + "invisible" configuration option. - debug 15 # same as setting the first 4 listed above + http:// - A redirect to any image anywhere of the user's choosing + (advanced usage). - Default: +Advanced users will eventually want to explore Privoxy filters as well. Filters +are very different from blocks. A "block" blocks a site, page, or unwanted +content. Filters are a way of filtering or modifying what is actually on the +page. An example filter usage: a text replacement of "no-no" for "nasty-word". +That is a very simple example. This process can be used for ad blocking, but it +is more in the realm of advanced usage and has some pitfalls to be wary of. - debug 1 # URLs - debug 4096 # Info - debug 8192 # Errors - *we highly recommended enabling this* +The quickest way to adjust any of these settings is with your browser through +the special Privoxy editor at http://config.privoxy.org/show-status (shortcut: +http://p.p/show-status). This is an internal page, and does not require +Internet access. - Junkbuster normally uses "multi-threading", a software technique that - permits it to handle many different requests simultaneously. In some - cases you may wish to disable this -- particularly if you're trying to - debug a problem. The "single-threaded" option forces Junkbuster to - handle requests sequentially. Default: Multi-threaded mode. +Note that as of Privoxy 3.0.7 beta the action editor is disabled by default. +Check the enable-edit-actions section in the configuration file to learn why +and in which cases it's safe to enable again. - #single-threaded +If you decide to enable the action editor, select the appropriate "actions" +file, and click "Edit".
It is best to put personal or local preferences in +user.action since this is not meant to be overwritten during upgrades, and will +over-ride the settings in other files. Here you can insert new "actions", and +URLs for ad blocking or other purposes, and make other adjustments to the +configuration. Privoxy will detect these changes automatically. - "toggle" allows you to temporarily disable all Junkbuster's filtering. - Just set "toggle 0". +A quick and simple step by step example: - The Windows version of Junkbuster puts an icon in the system tray, - which allows you to change this option without having to edit this - file. If you right-click on that icon (or select the "Options" menu), - one choice is "Enable". Clicking on enable toggles Junkbuster on and - off. This is useful if you want to temporarily disable Junkbuster, - e.g., to access a site that requires cookies which you normally have - blocked. + * Right click on the ad image to be blocked, then select "Copy Link Location" + from the pop-up menu. - "toggle 1" means Junkbuster runs normally, "toggle 0" means that - Junkbuster becomes a non-anonymizing non-blocking proxy. Default: 1. + * Set your browser to http://config.privoxy.org/show-status - toggle 1 - _________________________________________________________________ + * Find user.action in the top section, and click on "Edit": -Access Control List (ACL) + Figure 1. Actions Files in Use - Access controls are included at the request of some ISPs and systems - administrators, and are not usually needed by individual users. Please - note the warnings in the FAQ that this proxy is not intended to be a - substitute for a firewall or to encourage anyone to defer addressing - basic security weaknesses. + [files-in-u] - If no access settings are specified, the proxy talks to anyone that - connects. If any access settings file are specified, then the proxy - talks only to IP addresses permitted somewhere in this file and not - denied later in this file. 
+ * You should have a section with only block listed under "Actions:". If not, + click a "Insert new section below" button, and in the new section that just + appeared, click the Edit button right under the word "Actions:". This will + bring up a list of all actions. Find block near the top, and click in the + "Enabled" column, then "Submit" just below the list. - Summary -- if using an ACL: + * Now, in the block actions section, click the "Add" button, and paste the + URL the browser got from "Copy Link Location". Remove the http:// at the + beginning of the URL. Then, click "Submit" (or "OK" if in a pop-up window). - Client must have permission to receive service. + * Now go back to the original page, and press SHIFT-Reload (or flush all + browser caches). The image should be gone now. - LAST match in ACL wins. +This is a very crude and simple example. There might be good reasons to use a +wildcard pattern match to include potentially similar images from the same +site. For a more extensive explanation of "patterns", and the entire actions +concept, see the Actions section. - Default behavior is to deny service. +For advanced users who want to hand edit their config files, you might want to +now go to the Actions Files Tutorial. The ideas explained therein also apply to +the web-based editor. - The syntax for an entry in the Access Control List is: +There are also various filters that can be used for ad blocking (filters are a +special subset of actions). These fall into the "advanced" usage category, and +are explained in depth in later sections. - ACTION SRC_ADDR[/SRC_MASKLEN] [ DST_ADDR[/DST_MASKLEN] ] +------------------------------------------------------------------------------- - Where the individual fields are: +5. 
Starting Privoxy - ACTION = "permit-access" or "deny-access" - SRC_ADDR = client hostname or dotted IP address - SRC_MASKLEN = number of bits in the subnet mask for the source - DST_ADDR = server or forwarder hostname or dotted IP address - DST_MASKLEN = number of bits in the subnet mask for the target +Before launching Privoxy for the first time, you will want to configure your +browser(s) to use Privoxy as a HTTP and HTTPS (SSL) proxy. The default is +127.0.0.1 (or localhost) for the proxy address, and port 8118 (earlier versions +used port 8000). This is the one configuration step that must be done! - The field separator (FS) is whitespace (space or tab). +Please note that Privoxy can only proxy HTTP and HTTPS traffic. It will not +work with FTP or other protocols. - IMPORTANT NOTE: If the junkbuster is using a forwarder (see below) or - a gateway for a particular destination URL, the DST_ADDR that is - examined is the address of the forwarder or the gateway and NOT the - address of the ultimate target. This is necessary because it may be - impossible for the local Junkbuster to determine the address of the - ultimate target (that's often what gateways are used for). +Figure 2. 
Proxy Configuration Showing Mozilla/Netscape HTTP and HTTPS (SSL) +Settings - Here are a few examples to show how the ACL features work: +[proxy_setu] - "localhost" is OK -- no DST_ADDR implies that ALL destination - addresses are OK: +With Firefox, this is typically set under: - permit-access localhost + Tools -> Options -> General -> Connection Settings -> Manual Proxy +Configuration + - A silly example to illustrate permitting any host on the class-C - subnet with Junkbuster to go anywhere: +Or optionally on some platforms: - permit-access www.junkbusters.com/24 + Edit -> Preferences -> General -> Connection Settings -> Manual Proxy +Configuration + - Except deny one particular IP address from using it at all: +With Netscape (and Mozilla), this can be set under: - deny-access ident.junkbusters.com + Edit -> Preferences -> Advanced -> Proxies -> HTTP Proxy + - You can also specify an explicit network address and subnet mask. - Explicit addresses do not have to be resolved to be used. +For Internet Explorer v.5-6: - permit-access 207.153.200.0/24 + Tools -> Internet Options -> Connections -> LAN Settings - A subnet mask of 0 matches anything, so the next line permits - everyone. +Then, check "Use Proxy" and fill in the appropriate info (Address: 127.0.0.1, +Port: 8118). Include HTTPS (SSL), if you want HTTPS proxy support too +(sometimes labeled "Secure"). Make sure any checkbox like "Use the same proxy +server for all protocols" is UNCHECKED. You want only HTTP and HTTPS (SSL)! - permit-access 0.0.0.0/0 +Figure 3. Proxy Configuration Showing Internet Explorer HTTP and HTTPS (Secure) +Settings - Note, you cannot say: +[proxy2] - permit-access .org +After doing this, flush your browser's disk and memory caches to force a +re-reading of all pages and to get rid of any ads that may be cached. Remove +any cookies, if you want Privoxy to manage that. You are now ready to start +enjoying the benefits of using Privoxy! - to allow all *.org domains.
Every IP address listed must resolve - fully. +Privoxy itself is typically started by specifying the main configuration file +to be used on the command line. If no configuration file is specified on the +command line, Privoxy will look for a file named config in the current +directory. Except on Win32 where it will try config.txt. - An ISP may want to provide a Junkbuster that is accessible by "the - world" and yet restrict use of some of their private content to hosts - on its internal network (i.e. its own subscribers). Say, for instance - the ISP owns the Class-B IP address block 123.124.0.0 (a 16 bit - netmask). This is how they could do it: +------------------------------------------------------------------------------- - permit-access 0.0.0.0/0 0.0.0.0/0 # other clients can go anywhere - # with the following exceptions - : +5.1. Red Hat and Fedora - deny-access 0.0.0.0/0 123.124.0.0/16 # block all external request - s for - # sites on the ISP's network - permit 0.0.0.0/0 www.my_isp.com # except for the ISP's main - # web site - permit 123.124.0.0/16 0.0.0.0/0 # the ISP's clients can go - # anywhere +A default Red Hat installation may not start Privoxy upon boot. It will use the +file /etc/privoxy/config as its main configuration file. - Note that if some hostnames are listed with multiple IP addresses, the - primary value returned by DNS (via gethostbyname()) is used. Default: - Anyone can access the proxy. - _________________________________________________________________ + # /etc/rc.d/init.d/privoxy start -Forwarding - This feature allows chaining of HTTP requests via multiple proxies. It - can be used to better protect privacy and confidentiality when - accessing specific domains by routing requests to those domains to a - special purpose filtering proxy such as lpwa.com. +Or ... 
- It can also be used in an environment with multiple networks to route - requests via multiple gateways allowing transparent access to multiple - networks without having to modify browser configurations. + # service privoxy start - Also specified here are SOCKS proxies. Junkbuster SOCKS 4 and SOCKS - 4A. The difference is that SOCKS 4A will resolve the target hostname - using DNS on the SOCKS server, not our local DNS client. - The syntax of each line is: +------------------------------------------------------------------------------- - forward target_domain[:port] http_proxy_host[:port] - forward-socks4 target_domain[:port] socks_proxy_host[:port] - http_proxy_host[:port] - forward-socks4a target_domain[:port] socks_proxy_host[:port] - http_proxy_host[:port] +5.2. Debian - If http_proxy_host is ".", then requests are not forwarded to a HTTP - proxy but are made directly to the web servers. +We use a script. Note that Debian typically starts Privoxy upon booting per +default. It will use the file /etc/privoxy/config as its main configuration +file. - Lines are checked in sequence, and the last match wins. + # /etc/init.d/privoxy start - There is an implicit line equivalent to the following, which specifies - that anything not finding a match on the list is to go out without - forwarding or gateway protocol, like so: - forward .* . # implicit +------------------------------------------------------------------------------- - In the following common configuration, everything goes to Lucent's - LPWA, except SSL on port 443 (which it doesn't handle): +5.3. Windows - forward .* lpwa.com:8000 - forward :443 . +Click on the Privoxy Icon to start Privoxy. If no configuration file is +specified on the command line, Privoxy will look for a file named config.txt. +Note that Windows will automatically start Privoxy when the system starts if +you chose that option when installing. - See the FAQ for instructions on how to automate the login procedure - for LPWA. 
Some users have reported difficulties related to LPWA's use - of "." as the last element of the domain, and have said that this can - be fixed with this: +Privoxy can run with full Windows service functionality. On Windows only, the +Privoxy program has two new command line arguments to install and uninstall +Privoxy as a service. See the Windows Installation instructions for details. - forward lpwa. lpwa.com:8000 +------------------------------------------------------------------------------- - (NOTE: the syntax for specifiying target_domain has changed since the - previous paragraph was written -- it will not work now. More - information is welcome.) +5.4. Solaris, NetBSD, FreeBSD, HP-UX and others - In this fictitious example, everything goes via an ISP's caching - proxy, except requests to that ISP: +Example Unix startup command: - forward .* caching.myisp.net:8000 - forward myisp.net . + # /usr/sbin/privoxy /etc/privoxy/config - For the @home network, we're told the forwarding configuration is - this: - forward .* proxy:8080 +------------------------------------------------------------------------------- - Also, we're told they insist on getting cookies and JavaScript, so you - need to add home.com to the cookie file. We consider JavaScript a - security risk. Java need not be enabled. +5.5. OS/2 - In this example direct connections are made to all "internal" domains, - but everything else goes through Lucent's LPWA by way of the company's - SOCKS gateway to the Internet. +During installation, Privoxy is configured to start automatically when the +system restarts. You can start it manually by double-clicking on the Privoxy +icon in the Privoxy folder. - forward_socks4 .* lpwa.com:8000 firewall.my_company.com:1080 - forward my_company.com . +------------------------------------------------------------------------------- - This is how you could set up a site that always uses SOCKS but no - forwarders: +5.6. Mac OSX - forward_socks4a .* . 
firewall.my_company.com:1080 +During installation, Privoxy is configured to start automatically when the +system restarts. To start Privoxy manually, double-click on the +StartPrivoxy.command icon in the /Library/Privoxy folder. Or, type this command +in the Terminal: - An advanced example for network administrators: + /Library/Privoxy/StartPrivoxy.command - If you have links to multiple ISPs that provide various special - content to their subscribers, you can configure forwarding to pass - requests to the specific host that's connected to that ISP so that - everybody can see all of the content on all of the ISPs. - This is a bit tricky, but here's an example: - host-a has a PPP connection to isp-a.com. And host-b has a PPP - connection to isp-b.com. host-a can run a Junkbuster proxy with - forwarding like this: +You will be prompted for the administrator password. - forward .* . - forward isp-b.com host-b:8000 +------------------------------------------------------------------------------- - host-b can run a Junkbuster proxy with forwarding like this: +5.7. AmigaOS - forward .* . - forward isp-a.com host-a:8000 +Start Privoxy (with RUN <>NIL:) in your startnet script (AmiTCP), in +s:user-startup (RoadShow), as startup program in your startup script (Genesis), +or as startup action (Miami and MiamiDx). Privoxy will automatically quit when +you quit your TCP/IP stack (just ignore the harmless warning your TCP/IP stack +may display that Privoxy is still running). - Now, anyone on the Internet (including users on host-a and host-b) can - set their browser's proxy to either host-a or host-b and be able to - browse the content on isp-a or isp-b. +------------------------------------------------------------------------------- - Here's another practical example, for University of Kent at Canterbury - students with a network connection in their room, who need to use the - University's Squid web cache. +5.8. Gentoo - forward *. 
ssbcache.ukc.ac.uk:3128 # Use the proxy, except for: - forward .ukc.ac.uk . # Anything on the same domain as us - forward * . # Host with no domain specified - forward 129.12.*.* . # A dotted IP on our /16 network. - forward 127.*.*.* . # Loopback address - forward localhost.localdomain . # Loopback address - forward www.ukc.mirror.ac.uk . # Specific host +A script is again used. It will use the file /etc/privoxy/config as its main +configuration file. - If you intend to chain Junkbuster and squid locally, then chain as - browser -> squid -> junkbuster is the recommended way. + /etc/init.d/privoxy start - Your squid configuration could then look like this: - # Define junkbuster as parent cache - cache_peer 127.0.0.1 parent 8000 0 no-query +Note that Privoxy is not automatically started at boot time by default. You can +change this with the rc-update command. - # Define ACL for protocol FTP - acl FTP proto FTP - # Do not forward ACL FTP to junkbuster - always_direct allow FTP - # Do not forward ACL CONNECT (https) to junkbuster - always_direct allow CONNECT - # Forward the rest to junkbuster - never_direct allow all - _________________________________________________________________ + rc-update add privoxy default -Windows GUI Options - Junkbuster has a number of options specific to the Windows GUI - interface: - If "activity-animation" is set to 1, the Junkbuster icon will animate - when "Junkbuster" is active. To turn off, set to 0. +------------------------------------------------------------------------------- - activity-animation 1 +5.9. Command Line Options - If "log-messages" is set to 1, Junkbuster will log messages to the - console window: +Privoxy may be invoked with the following command-line options: - log-messages 1 + * --version - If "log-buffer-size" is set to 1, the size of the log buffer, i.e. the - amount of memory used for the log messages displayed in the console - window, will be limited to "log-max-lines" (see below). 
+ Print version info and exit. Unix only. - Warning: Setting this to 0 will result in the buffer to grow - infinitely and eat up all your memory! + * --help - log-buffer-size 1 + Print short usage info and exit. Unix only. - log-max-lines is the maximum number of lines held in the log buffer. - See above. + * --no-daemon - log-max-lines 200 + Don't become a daemon, i.e. don't fork and become process group leader, and + don't detach from controlling tty. Unix only. - If "log-highlight-messages" is set to 1, Junkbuster will highlight - portions of the log messages with a bold-faced font: + * --pidfile FILE - log-highlight-messages 1 + On startup, write the process ID to FILE. Delete the FILE on exit. Failure + to create or delete the FILE is non-fatal. If no FILE option is given, no + PID file will be used. Unix only. - The font used in the console window: + * --user USER[.GROUP] - log-font-name Comic Sans MS + After (optionally) writing the PID file, assume the user ID of USER, and if + included the GID of GROUP. Exit if the privileges are not sufficient to do + so. Unix only. - Font size used in the console window: + * --chroot - log-font-size 8 + Before changing to the user ID given in the --user option, chroot to that + user's home directory, i.e. make the kernel pretend to the Privoxy process + that the directory tree starts there. If set up carefully, this can limit + the impact of possible vulnerabilities in Privoxy to the files contained in + that hierarchy. Unix only. - "show-on-task-bar" controls whether or not Junkbuster will appear as a - button on the Task bar when minimized: + * --pre-chroot-nslookup hostname - show-on-task-bar 0 + Specifies a hostname to look up before doing a chroot. On some systems, + initializing the resolver library involves reading config files from /etc + and/or loading additional shared libraries from /lib. 
On these systems, + doing a hostname lookup before the chroot reduces the number of files that + must be copied into the chroot tree. - If "close-button-minimizes" is set to 1, the Windows close button will - minimize Junkbuster instead of closing the program (close with the - exit option on the File menu). + For fastest startup speed, a good value is a hostname that is not in /etc/ + hosts but that your local name server (listed in /etc/resolv.conf) can + resolve without recursion (that is, without having to ask any other name + servers). The hostname need not exist, but if it doesn't, an error message + (which can be ignored) will be output. - close-button-minimizes 1 + * configfile - The "hide-console" option is specific to the MS-Win console version of - JunkBuster. If this option is used, Junkbuster will disconnect from - and hide the command console. + If no configfile is included on the command line, Privoxy will look for a + file named "config" in the current directory (except on Win32 where it will + look for "config.txt" instead). Specify full path to avoid confusion. If no + config file is found, Privoxy will fail to start. - #hide-console - _________________________________________________________________ +On MS Windows only there are two additional command-line options to allow +Privoxy to install and run as a service. See the Windows Installation section +for details. -The Actions File +------------------------------------------------------------------------------- - The "actionsfile" is used to define what actions Junkbuster takes, and - thus determines how images, cookies and various other aspects of HTTP - content and transactions are handled. Images can be anything you want, - including ads, banners, or just some obnoxious image that you would - rather not see. Cookies can be accepted or rejected. The default file - is in fact named actionsfile. +6.
Privoxy Configuration - To determine which actions apply to a request, the URL of the request - is compared to all patterns in this file. Every time it matches, the - list of applicable actions for the URL is incrementally updated. You - can trace this process by visiting [16]http://i.j.b/show-url-info. +All Privoxy configuration is stored in text files. These files can be edited +with a text editor. Many important aspects of Privoxy can also be controlled +easily with a web browser. - There are four types of lines in this file: comments (begin with a "#" - character), actions, aliases and patterns, all of which are explained - below. - _________________________________________________________________ +------------------------------------------------------------------------------- -URL Domain and Path Syntax +6.1. Controlling Privoxy with Your Web Browser - Generally, a pattern has the form /, where both the - and part are optional. If you only specify a domain - part, the "/" can be left out: +Privoxy's user interface can be reached through the special URL http:// +config.privoxy.org/ (shortcut: http://p.p/), which is a built-in page and works +without Internet access. You will see the following section: - www.example.com - is a domain only pattern and will match any request - to "www.example.com". + Privoxy Menu + ? View & change the current configuration + ? View the source code version numbers + ? View the request headers. + ? Look up which actions apply to a URL and why + ? Toggle Privoxy on or off + ? Documentation - www.example.com/ - means exactly the same. - www.example.com/index.html - matches only the single document - "/index.html" on "www.example.com". +This should be self-explanatory. Note the first item leads to an editor for the +actions files, which is where the ad, banner, cookie, and URL blocking magic is +configured as well as other advanced features of Privoxy. This is an easy way +to adjust various aspects of Privoxy configuration. 
The actions file, and other +configuration files, are explained in detail below. - /index.html - matches the document "/index.html", regardless of the - domain. +"Toggle Privoxy On or Off" is handy for sites that might have problems with +your current actions and filters. You can in fact use it as a test to see +whether it is Privoxy causing the problem or not. Privoxy continues to run as a +proxy in this case, but all manipulation is disabled, i.e. Privoxy acts like a +normal forwarding proxy. There is even a toggle Bookmarklet offered, so that +you can toggle Privoxy with one click from your browser. - index.html - matches nothing, since it would be interpreted as a - domain name and there is no top-level domain called ".html". +Note that several of the features described above are disabled by default in +Privoxy 3.0.7 beta and later. Check the configuration file to learn why and in +which cases it's safe to enable them again. - The matching of the domain part offers some flexible options: if the - domain starts or ends with a dot, it becomes unanchored at that end. - For example: +------------------------------------------------------------------------------- - .example.com - matches any domain that ENDS in ".example.com". +6.2. Configuration Files Overview - www. - matches any domain that STARTS with "www". +For Unix, *BSD and Linux, all configuration files are located in /etc/privoxy/ +by default. For MS Windows, OS/2, and AmigaOS these are all in the same +directory as the Privoxy executable. - Additionally, there are wildcards that you can use in the domain names - themselves. They work pretty similar to shell wildcards: "*" stands - for zero or more arbitrary characters, "?" stands for any single - character. And you can define charachter classes in square brackets - and they can be freely mixed: +The installed defaults provide a reasonable starting point, though some +settings may be aggressive by some standards. 
For the time being, the principal +configuration files are: - ad*.example.com - matches "adserver.example.com", "ads.example.com", - etc but not "sfads.example.com". + * The main configuration file is named config on Linux, Unix, BSD, OS/2, and + AmigaOS and config.txt on Windows. This is a required file. - *ad*.example.com - matches all of the above, and then some. + * default.action (the main actions file) is used to define which "actions" + relating to banner-blocking, images, pop-ups, content modification, cookie + handling etc should be applied by default. It also defines many exceptions + (both positive and negative) from this default set of actions that enable + Privoxy to selectively eliminate the junk, and only the junk, on as many + websites as possible. - .?pix.com - matches "www.ipix.com", "pictures.epix.com", - "a.b.c.d.e.upix.com", etc. + Multiple actions files may be defined in config. These are processed in the + order they are defined. Local customizations and locally preferred + exceptions to the default policies as defined in default.action (which you + will most probably want to define sooner or later) are probably best + applied in user.action, where you can preserve them across upgrades. + standard.action is only for Privoxy's internal use. - www[1-9a-ez].example.com - matches "www1.example.com", - "www4.example.com", "wwwd.example.com", "wwwz.example.com", etc., but - not "wwww.example.com". + There is also a web based editor that can be accessed from http:// + config.privoxy.org/show-status (Shortcut: http://p.p/show-status) for the + various actions files. - If Junkbuster was compiled with "pcre" support (default), Perl - compatible regular expressions can be used. See the pcre/docs/ - direcory or "man perlre" (also available on - [17]http://www.perldoc.com/perl5.6/pod/perlre.html) for details. A - brief discussion of regular expressions is in the [18]Appendix.
For - instance: + * "Filter files" (the filter file) can be used to re-write the raw page + content, including viewable text as well as embedded HTML and JavaScript, + and whatever else lurks on any given web page. The filtering jobs are only + pre-defined here; whether to apply them or not is up to the actions files. + default.filter includes various filters made available for use by the + developers. Some are much more intrusive than others, and all should be + used with caution. You may define additional filter files in config as you + can with actions files. We suggest user.filter for any locally defined + filters or customizations. - /.*/advert[0-9]+\.jpe?g - would match a URL from any domain, with any - path that includes "advert" followed immediately by one or more - digits, then a "." and ending in either "jpeg" or "jpg". So we match - "example.com/ads/advert2.jpg", and - "www.example.com/ads/banners/advert39.jpeg", but not - "www.example.com/ads/banners/advert39.gif" (no gifs in the example - pattern). - - Please note that matching in the path is case INSENSITIVE by default, - but you can switch to case sensitive at any point in the pattern by - using the "(?-i)" switch: - - www.example.com/(?-i)PaTtErN.* - will match only documents whose path - starts with "PaTtErN" in exactly this capitalization. - _________________________________________________________________ - -Actions - - Actions are enabled if preceded with a "+", and disabled if preceded - with a "-". Actions are invoked by enclosing the action name in curly - braces (e.g. {+some_action}), followed by a list of URLs to which the - action applies. There are three classes of actions: - - * Boolean (e.g. "+/-block"): - {+name} # enable this action - {-name} # disable this action - - * Parameterized (e.g. "+/-hide-user-agent"): - {+name{param}} # enable action and set parameter to "param" - {-name} # disable action - - * Multi-value (e.g. 
"{+/-add-header{Name: value}}", - "{+/-wafer{name=value}}"): - {+name{param}} # enable action and add parameter "param" - {-name{param}} # remove the parameter "param" - {-name} # disable this action totally - - If nothing is specified in this file, no "actions" are taken. So in - this case JunkBuster would just be a normal, non-blocking, - non-anonymizing proxy. You must specifically enable the privacy and - blocking features you need (although the provided default actionsfile - file will give a good starting point). - - Later defined actions always over-ride earlier ones. For multi-valued - actions, the actions are applied in the order they are specified. - - The list of valid Junkbuster "actions" are: - - * Add the specified HTTP header, which is not checked for validity. - You may specify this many times to specify many different headers: - +add-header{Name: value} - - * Block this URL totally. - +block - - * De-animate all animated GIF images, i.e. reduce them to their last - frame. This will also shrink the images considerably (in bytes, - not pixels!). If the option "first" is given, the first frame of - the animation is used as the replacement. If "last" is given, the - last frame of the animation is used instead, which propably makes - more sense for most banner animations, but also has the risk of - not showing the entire last frame (if it is only a delta to an - earlier frame). - +deanimate-gifs{last} - +deanimate-gifs{first} - - * "+downgrade" will downgrade HTTP/1.1 client requests to HTTP/1.0 - and downgrade the responses as well. Use this action for servers - that use HTTP/1.1 protocol features that Junkbuster doesn't handle - well yet. HTTP/1.1 is only partially implemented. Default is not - to downgrade requests. - +downgrade - - * Many sites, like yahoo.com, don't just link to other sites. - Instead, they will link to some script on their own server, giving - the destination as a parameter, which will then redirect you to - the final target. 
URLs resulting from this scheme typically look - like: http://some.place/some_script?http://some.where-else. - Sometimes, there are even multiple consecutive redirects encoded - in the URL. These redirections via scripts make your web browing - more traceable, since the server from which you follow such a link - can see where you go to. Apart from that, valuable bandwidth and - time is wasted, while your browser ask the server for one redirect - after the other. Plus, it feeds the advertisers. - The "+fast-redirects" option enables interception of these - requests by Junkbuster, who will cut off all but the last valid - URL in the request and send a local redirect back to your browser - without contacting the remote site. - +fast-redirects - - * Filter the website through the re_filterfile: - +filter{filename} - - * Block any existing X-Forwarded-for header, and do not add a new - one: - +hide-forwarded - - * If the browser sends a "From:" header containing your e-mail - address, this either completely removes the header ("block"), or - changes it to the specified e-mail address. - +hide-from{block} - +hide-from{spam@sittingduck.xqq} - - * Don't send the "Referer:" (sic) header to the web site. You can - block it, forge a URL to the same server as the request (which is - preferred because some sites will not send images otherwise) or - set it to a constant string of your choice. - +hide-referer{block} - +hide-referer{forge} - +hide-referer{http://nowhere.com} - - * Alternative spelling of "+hide-referer". It has the same - parameters, and can be freely mixed with, "+hide-referer". - ("referrer" is the correct English spelling, however the HTTP - specification has a bug - it requires it to be spelled "referer".) - +hide-referrer{...} - - * Change the "User-Agent:" header so web servers can't tell your - browser type. Warning! This breaks many web sites. Specify the - user-agent value you want. 
Example, pretend to be using Netscape - on Linux: - +hide-user-agent{Mozilla (X11; I; Linux 2.0.32 i586)} - - * Treat this URL as an image. This only matters if it's also - "+block"ed, in which case a "blocked" image can be sent rather - than a HTML page. See "+image-blocker{}" below for the control - over what is actually sent. - +image - - * Decides what to do with URLs that end up tagged with "{+block - +image}". There are 4 options. "-image-blocker" will send a HTML - "blocked" page, usually resulting in a "broken image" icon. - "+image-blocker{logo}" will send a "JunkBuster" image. - "+image-blocker{blank}" will send a 1x1 transparent GIF image. And - finally, "+image-blocker{http://xyz.com}" will send a HTTP - temporary redirect to the specified image. This has the advantage - of the icon being being cached by the browser, which will speed up - the display. - +image-blocker{logo} - +image-blocker{blank} - +image-blocker{http://i.j.b/send-banner} - - * By default (i.e. in the absence of a "+limit-connect" action), - Junkbuster will only allow CONNECT requests to port 443, which is - the standard port for https as a precaution. - The CONNECT methods exists in HTTP to allow access to secure - websites (https:// URLs) through proxies. It works very simply: - the proxy connects to the server on the specified port, and then - short-circuits its connections to the client and to the remote - proxy. This can be a big security hole, since CONNECT-enabled - proxies can be abused as TCP relays very easily. - If you want to allow CONNECT for more ports than this, or want to - forbid CONNECT altogether, you can specify a comma separated list - of ports and port ranges (the latter using dashes, with the - minimum defaulting to 0 and max to 65K): - +limit-connect{443} # This is the default and need no be - specified. - +limit-connect{80,443} # Ports 80 and 443 are OK. - +limit-connect{-3, 7, 20-100, 500-} # Port less than 3, 7, 20 to - 100 - #and above 500 are OK. 
- - * "+no-compression" prevents the website from compressing the data. - Some websites do this, which can be a problem for Junkbuster, - since "+filter", "+no-popup" and "+gif-deanimate" will not work on - compressed data. This will slow down connections to those - websites, though. Default is "nocompression" is turned on. - +nocompression - - * Prevent the website from reading cookies: - +no-cookies-read - - * Prevent the website from setting cookies: - +no-cookies-set - - * Filter the website through a built-in filter to disable those - obnoxious JavaScript pop-up windows via window.open(), etc. The - two alternative spellings are equivalent. - +no-popup - +no-popups - - * This action only applies if you are using a jarfile for saving - cookies. It sends a cookie to every site stating that you do not - accept any copyright on cookies sent to you, and asking them not - to track you. Of course, this is a (relatively) unique header they - could use to track you. - +vanilla-wafer - - * This allows you to add an arbitrary cookie. It can be specified - multiple times in order to add as many cookies as you like. - +wafer{name=value} - - The meaning of any of the above is reversed by preceding the action - with a "-", in place of the "+". - - Some examples: - - Turn off cookies by default, then allow a few through for specified - sites: - - # Turn off all cookies - { +no-cookies-read } - { +no-cookies-set } - # Execeptions to the above, sites that need cookies - { -no-cookies-read } - { -no-cookies-set } - .javasoft.com - .sun.com - .yahoo.com - .msdn.microsoft.com - .redhat.com - # Alternative way of saying the same thing - {-no-cookies-set -no-cookies-read} - .sourceforge.net - .sf.net - - Now turn off "fast redirects", and then we allow two exceptions: - - # Turn them off! - {+fast-redirects} - - # Reverse it for these two sites, which don't work right without it. - {-fast-redirects} - www.ukc.ac.uk/cgi-bin/wac\.cgi\? 
- login.yahoo.com - - Turn on page filtering, with one exception for sourceforge: - - # Run everything through the default filter file (re_filterfile): - {+filter} - - # But please don't re_filter code from sourceforge! - {-filter} - .cvs.sourceforge.net - - Now some URLs that we want "blocked", ie we won't see them. Many of - these use regular expressions that will expand to match multiple URLs: - - # Blocklist: - {+block} - /.*/(.*[-_.])?ads?[0-9]?(/|[-_.].*|\.(gif|jpe?g)) - /.*/(.*[-_.])?count(er)?(\.cgi|\.dll|\.exe|[?/]) - /.*/(ng)?adclient\.cgi - /.*/(plain|live|rotate)[-_.]?ads?/ - /.*/(sponsor)s?[0-9]?/ - /.*/_?(plain|live)?ads?(-banners)?/ - /.*/abanners/ - /.*/ad(sdna_image|gifs?)/ - /.*/ad(server|stream|juggler)\.(cgi|pl|dll|exe) - /.*/adbanners/ - /.*/adserver - /.*/adstream\.cgi - /.*/adv((er)?ts?|ertis(ing|ements?))?/ - /.*/banner_?ads/ - /.*/banners?/ - /.*/banners?\.cgi/ - /.*/cgi-bin/centralad/getimage - /.*/images/addver\.gif - /.*/images/marketing/.*\.(gif|jpe?g) - /.*/popupads/ - /.*/siteads/ - /.*/sponsor.*\.gif - /.*/sponsors?[0-9]?/ - /.*/advert[0-9]+\.jpg - /Media/Images/Adds/ - /ad_images/ - /adimages/ - /.*/ads/ - /bannerfarm/ - /grafikk/annonse/ - /graphics/defaultAd/ - /image\.ng/AdType - /image\.ng/transactionID - /images/.*/.*_anim\.gif # alvin brattli - /ip_img/.*\.(gif|jpe?g) - /rotateads/ - /rotations/ - /worldnet/ad\.cgi - /cgi-bin/nph-adclick.exe/ - /.*/Image/BannerAdvertising/ - /.*/ad-bin/ - /.*/adlib/server\.cgi - /autoads/ - _________________________________________________________________ - -Aliases - - Custom "actions", known to Junkbuster as "aliases", can be defined by - combining other "actions". These can in turn be invoked just like the - built-in "actions". Currently, an alias can contain any character - except space, tab, "=", "{" or "}". But please use only "a"- "z", - "0"-"9", "+", and "-". Alias names are not case sensitive, and must be - defined before anything else in actionsfile! 
And there can only be one - set of "aliases" of defined. - - Now let's define a few aliases: - - # Useful customer aliases we can use later. These must come first! - {{alias}} - +no-cookies = +no-cookies-set +no-cookies-read - -no-cookies = -no-cookies-set -no-cookies-read - fragile = -block -no-cookies -filter -fast-redirects -hide-refere - r -no-popups - shop = -no-cookies -filter -fast-redirects - +imageblock = +block +image - #For people who don't like to type too much: ;-) - c0 = +no-cookies - c1 = -no-cookies - c2 = -no-cookies-set +no-cookies-read - c3 = +no-cookies-set -no-cookies-read - #... etc. Customize to your heart's content. - - Some examples using our "shop" and "fragile" aliases from above: - - # These sites are very complex and require - # minimal interference. - {fragile} - .office.microsoft.com - .windowsupdate.microsoft.com - .nytimes.com - # Shopping sites - still want to block ads. - {shop} - .quietpc.com - .worldpay.com # for quietpc.com - .jungle.com - .scan.co.uk - # These shops require pop-ups - {shop -no-popups} - .dabs.com - .overclockers.co.uk - _________________________________________________________________ - -The Filter File - - The filter file defines what filtering of web pages Junkbuster does. - The default filter file is re_filterfile, located in the config - directory. In this file, any document content, whether viewable text - or embedded non-visible content, can be changed. - - This file uses regular expressions to alter or remove any string in - the target page. Some examples from the included default - re_filterfile: - - Stop web pages from displaying annoying messages in the status bar by - deleting such references: - - # The status bar is for displaying link targets, not pointless buzzwo - rds. - # Again, check it out on http://www.airport-cgn.de/. 
- s/status='.*?';*//ig - - Just for kicks, replace any occurrence of "Microsoft" with - "MicroSuck": - - s/microsoft(?!.com)/MicroSuck/ig - - Kill those auto-refresh tags: - - # Kill refresh tags. I like to refresh myself. Manually. - # check it out on http://www.airport-cgn.de/ and go to the arrivals p - age. - # - s/]*http-equiv[^>]*refresh.*URL=([^>]*?)"?>//i - s/]*http-equiv="?page-enter"?[^>]*content=[^>]*>//i - _________________________________________________________________ - -Quickstart to Using Junkbuster - - Install package, then run and enjoy! Junbuster accepts only one - command line option -- the configuration file to be used. Example Unix - startup command: - - - # /usr/sbin/junkbuster /etc/junkbuster/config & +The syntax of the configuration and filter files may change between different +Privoxy versions; unfortunately, some enhancements cost backwards compatibility. + +All files use the "#" character to denote a comment (the rest of the line will +be ignored) and understand line continuation through placing a backslash ("\") +as the very last character in a line. If the # is preceded by a backslash, it +loses its special function. Placing a # in front of an otherwise valid +configuration line to prevent it from being interpreted is called "commenting +out" that line. Blank lines are ignored. + +The actions files and filter files can use Perl style regular expressions for +maximum flexibility. + +After making any changes, there is no need to restart Privoxy in order for the +changes to take effect. Privoxy detects such changes automatically. Note, +however, that it may take one or two additional requests for the change to take +effect. When changing the listening address of Privoxy, these "wake up" +requests must obviously be sent to the old listening address. + +------------------------------------------------------------------------------- + +7.
The Main Configuration File + +Again, the main configuration file is named config on Linux/Unix/BSD and OS/2, +and config.txt on Windows. Configuration lines consist of an initial keyword +followed by a list of values, all separated by whitespace (any number of spaces +or tabs). For example: + + confdir /etc/privoxy + +Assigns the value /etc/privoxy to the option confdir and thus indicates that +the configuration directory is named "/etc/privoxy/". + +All options in the config file except for confdir and logdir are optional. +Watch out in the below description for what happens if you leave them unset. + +The main config file controls all aspects of Privoxy's operation that are not +location dependent (i.e. they apply universally, no matter where you may be +surfing). + +------------------------------------------------------------------------------- + +7.1. Local Set-up Documentation + +If you intend to operate Privoxy for more users than just yourself, it might be +a good idea to let them know how to reach you, what you block and why you do +that, your policies, etc. + +------------------------------------------------------------------------------- + +7.1.1. user-manual + +Specifies: + + Location of the Privoxy User Manual. + +Type of value: + + A fully qualified URI + +Default value: + + Unset + +Effect if unset: + + http://www.privoxy.org/version/user-manual/ will be used, where version is + the Privoxy version. + +Notes: + + The User Manual URI is the single best source of information on Privoxy, + and is used for help links from some of the internal CGI pages. The manual + itself is normally packaged with the binary distributions, so you probably + want to set this to a locally installed copy. 
+ + Examples: + + The best all purpose solution is simply to put the full local PATH to where + the User Manual is located: + + user-manual /usr/share/doc/privoxy/user-manual + + + The User Manual is then available to anyone with access to Privoxy, by + following the built-in URL: http://config.privoxy.org/user-manual/ (or the + shortcut: http://p.p/user-manual/). + + If the documentation is not on the local system, it can be accessed from a + remote server, as: + + user-manual http://example.com/privoxy/user-manual/ + + + +-----------------------------------------------------------------+ + | Warning | + |-----------------------------------------------------------------| + |If set, this option should be the first option in the config | + |file, because it is used while the config file is being read on | + |start-up. | + +-----------------------------------------------------------------+ + +------------------------------------------------------------------------------- + +7.1.2. trust-info-url + +Specifies: + + A URL to be displayed in the error page that users will see if access to an + untrusted page is denied. + +Type of value: + + URL + +Default value: + + Two example URLs are provided + +Effect if unset: + + No links are displayed on the "untrusted" error page. + +Notes: + + The value of this option only matters if the experimental trust mechanism + has been activated. (See trustfile below.) + + If you use the trust mechanism, it is a good idea to write up some on-line + documentation about your trust policy and to specify the URL(s) here. Use + multiple times for multiple URLs. + + The URL(s) should be added to the trustfile as well, so users don't end up + locked out from the information on why they were locked out in the first + place! + +------------------------------------------------------------------------------- + +7.1.3. admin-address + +Specifies: + + An email address to reach the Privoxy administrator. 
+ +Type of value: + + Email address + +Default value: + + Unset + +Effect if unset: + + No email address is displayed on error pages and the CGI user interface. + +Notes: + + If both admin-address and proxy-info-url are unset, the whole "Local + Privoxy Support" box on all generated pages will not be shown. + +------------------------------------------------------------------------------- + +7.1.4. proxy-info-url + +Specifies: + + A URL to documentation about the local Privoxy setup, configuration or + policies. + +Type of value: + + URL + +Default value: + + Unset + +Effect if unset: + + No link to local documentation is displayed on error pages and the CGI user + interface. + +Notes: + + If both admin-address and proxy-info-url are unset, the whole "Local + Privoxy Support" box on all generated pages will not be shown. + + This URL shouldn't be blocked ;-) + +------------------------------------------------------------------------------- + +7.2. Configuration and Log File Locations + +Privoxy can (and normally does) use a number of other files for additional +configuration, help and logging. This section of the configuration file tells +Privoxy where to find those other files. + +The user running Privoxy must have read permission for all configuration +files, and write permission to any files that would be modified, such as log +files and actions files. + +------------------------------------------------------------------------------- + +7.2.1. confdir + +Specifies: + + The directory where the other configuration files are located. + +Type of value: + + Path name + +Default value: + + /etc/privoxy (Unix) or Privoxy installation dir (Windows) + +Effect if unset: + + Mandatory + +Notes: + + No trailing "/", please. + +------------------------------------------------------------------------------- + +7.2.2. templdir + +Specifies: + + An alternative directory where the templates are loaded from.
+ +Type of value: + + Path name + +Default value: + + unset + +Effect if unset: + + The templates are assumed to be located in confdir/template. + +Notes: + + Privoxy's original templates are usually overwritten with each update. Use + this option to relocate customized templates that should be kept. As + template variables might change between updates, you shouldn't expect + templates to work with Privoxy releases other than the one they were part + of, though. + +------------------------------------------------------------------------------- + +7.2.3. logdir + +Specifies: + + The directory where all logging takes place (i.e. where logfile and jarfile + are located). + +Type of value: + + Path name + +Default value: + + /var/log/privoxy (Unix) or Privoxy installation dir (Windows) + +Effect if unset: + + Mandatory + +Notes: + + No trailing "/", please. + +------------------------------------------------------------------------------- + +7.2.4. actionsfile + +Specifies: + + The actions file(s) to use + +Type of value: + + Complete file name, relative to confdir + +Default values: + + standard.action # Internal purposes, no editing recommended + + default.action # Main actions file + + user.action # User customizations + +Effect if unset: + + No actions are taken at all. More or less neutral proxying. + +Notes: + + Multiple actionsfile lines are permitted, and are in fact recommended! + + The default values include standard.action, which is used for internal + purposes and should be loaded, default.action, which is the "main" actions + file maintained by the developers, and user.action, where you can make your + personal additions. + + Actions files contain all the per site and per URL configuration for ad + blocking, cookie management, privacy considerations, etc. There is no point + in using Privoxy without at least one actions file. + + Note that since Privoxy 3.0.7, the complete filename, including the + ".action" extension has to be specified. 
The syntax change was necessary to + be consistent with the other file options and to allow previously forbidden + characters. + +------------------------------------------------------------------------------- + +7.2.5. filterfile + +Specifies: + + The filter file(s) to use + +Type of value: + + File name, relative to confdir + +Default value: + + default.filter (Unix) or default.filter.txt (Windows) + +Effect if unset: + + No textual content filtering takes place, i.e. all +filter{name} actions in + the actions files are turned neutral. + +Notes: + + Multiple filterfile lines are permitted. + + The filter files contain content modification rules that use regular + expressions. These rules permit powerful changes on the content of Web + pages, and optionally the headers as well, e.g., you could try to disable + your favorite JavaScript annoyances, re-write the actual displayed text, or + just have some fun playing buzzword bingo with web pages. + + The +filter{name} actions rely on the relevant filter (name) to be defined + in a filter file! + + A pre-defined filter file called default.filter that contains a number of + useful filters for common problems is included in the distribution. See the + section on the filter action for a list. + + It is recommended to place any locally adapted filters into a separate + file, such as user.filter. + +------------------------------------------------------------------------------- + +7.2.6. logfile + +Specifies: + + The log file to use + +Type of value: + + File name, relative to logdir + +Default value: + + Unset (commented out). When activated: logfile (Unix) or privoxy.log + (Windows). + +Effect if unset: + + No logfile is written. + +Notes: + + The logfile is where all logging and error messages are written. The level + of detail and number of messages are set with the debug option (see below). 
+ The logfile can be useful for tracking down a problem with Privoxy (e.g., + it's not blocking an ad you think it should block) and it can help you to + monitor what your browser is doing. + + Depending on the debug options below, the logfile may be a privacy risk if + third parties can get access to it. As most users will never look at it, + Privoxy 3.0.7 and later only log fatal errors by default. + + For most troubleshooting purposes, you will have to change that, please + refer to the debugging section for details. + + Your logfile will grow indefinitely, and you will probably want to + periodically remove it. On Unix systems, you can do this with a cron job + (see "man cron"). For Red Hat based Linux distributions, a logrotate script + has been included. + + Any log files must be writable by whatever user Privoxy is being run as (on + Unix, default user id is "privoxy"). + +------------------------------------------------------------------------------- + +7.2.7. jarfile + +Specifies: + + The file to store intercepted cookies in + +Type of value: + + File name, relative to logdir + +Default value: + + Unset (commented out). When activated: jarfile (Unix) or privoxy.jar + (Windows). + +Effect if unset: + + Intercepted cookies are not stored in a dedicated log file. + +Notes: + + The jarfile may grow to ridiculous sizes over time. + + If debug 8 (show header parsing) is enabled, cookies are also written to + the logfile with the rest of the headers. Therefore this option isn't very + useful and may be removed in future releases. Please report to the + developers if you are still using it. + +------------------------------------------------------------------------------- + +7.2.8. trustfile + +Specifies: + + The name of the trust file to use + +Type of value: + + File name, relative to confdir + +Default value: + + Unset (commented out). When activated: trust (Unix) or trust.txt (Windows) + +Effect if unset: + + The entire trust mechanism is disabled. 
+ +Notes: + + The trust mechanism is an experimental feature for building white-lists and + should be used with care. It is NOT recommended for the casual user. + + If you specify a trust file, Privoxy will only allow access to sites that + are specified in the trustfile. Sites can be listed in one of two ways: + + Prepending a ~ character limits access to this site only (and any sub-paths + within this site), e.g. ~www.example.com allows access to ~www.example.com/ + features/news.html, etc. + + Or, you can designate sites as trusted referrers, by prepending the name + with a + character. The effect is that access to untrusted sites will be + granted -- but only if a link from this trusted referrer was used to get + there. The link target will then be added to the "trustfile" so that + future, direct accesses will be granted. Sites added via this mechanism do + not become trusted referrers themselves (i.e. they are added with a ~ + designation). There is a limit of 512 such entries, after which new entries + will not be made. + + If you use the + operator in the trust file, it may grow considerably over + time. + + It is recommended that Privoxy be compiled with the --disable-force, + --disable-toggle and --disable-editor options, if this feature is to be + used. + + Possible applications include limiting Internet access for children. + +------------------------------------------------------------------------------- + +7.3. Debugging + +These options are mainly useful when tracing a problem. Note that you might +also want to invoke Privoxy with the --no-daemon command line option when +debugging. + +------------------------------------------------------------------------------- + +7.3.1. debug + +Specifies: + + Key values that determine what information gets logged. + +Type of value: + + Integer values + +Default value: + + 0 (i.e.: only fatal errors (that cause Privoxy to exit) are logged) + +Effect if unset: + + Default value is used (see above). 
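Each debug value listed in the Notes for this option is a distinct power of two, so adding several of them gives a single number from which the individual levels can still be recovered. The Python sketch below is only an illustration of that arithmetic and is not part of Privoxy; the names DEBUG_LEVELS, combine and decode are made up for this example. It shows that the recommended levels 1, 4096 and 8192 add up to 12289.

```python
# Illustrative only: models how the documented debug values combine.
# DEBUG_LEVELS, combine() and decode() are hypothetical names, not Privoxy code.
DEBUG_LEVELS = {
    1: "request destinations",
    2: "connection status",
    4: "I/O status",
    8: "header parsing",
    16: "all data written to the network",
    32: "force feature",
    64: "regular expression filters",
    128: "redirects",
    256: "GIF de-animation",
    512: "Common Log Format",
    1024: "kill pop-ups",
    2048: "CGI user interface",
    4096: "startup banner and warnings",
    8192: "non-fatal errors",
}


def combine(*levels):
    """Add up individual debug values into one combined number."""
    unknown = [lvl for lvl in levels if lvl not in DEBUG_LEVELS]
    if unknown:
        raise ValueError(f"unknown debug level(s): {unknown}")
    total = 0
    for lvl in levels:
        # OR equals addition here because every level is a different bit.
        total |= lvl
    return total


def decode(value):
    """Return the individual debug values contained in a combined number."""
    return [lvl for lvl in sorted(DEBUG_LEVELS) if value & lvl]


if __name__ == "__main__":
    mask = combine(1, 4096, 8192)
    print(f"debug {mask}")
    for lvl in decode(mask):
        print(f"debug {lvl}  # {DEBUG_LEVELS[lvl]}")
```

Writing the combined number on one debug line or writing one debug line per level are the two equivalent forms this option accepts.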
+ +Notes: + + The available debug levels are: + + debug 1 # log each request destination (and the crunch reason if Privoxy intercepted the request) + debug 2 # show each connection status + debug 4 # show I/O status + debug 8 # show header parsing + debug 16 # log all data written to the network into the logfile + debug 32 # debug force feature + debug 64 # debug regular expression filters + debug 128 # debug redirects + debug 256 # debug GIF de-animation + debug 512 # Common Log Format + debug 1024 # debug kill pop-ups + debug 2048 # CGI user interface + debug 4096 # Startup banner and warnings. + debug 8192 # Non-fatal errors + + + To select multiple debug levels, you can either add them or use multiple + debug lines. + + A debug level of 1 is informative because it will show you each request as + it happens. 1, 4096 and 8192 are recommended so that you will notice when + things go wrong. The other levels are probably only of interest if you are + hunting down a specific problem. They can produce a hell of an output + (especially 16). + + Privoxy used to ship with the debug levels recommended above enabled by + default, but due to privacy concerns 3.0.7 and later are configured to only + log fatal errors. + + If you are used to the more verbose settings, simply enable the debug lines + below again. + + If you want to use pure CLF (Common Log Format), you should set "debug 512" + ONLY and not enable anything else. + + Privoxy has a hard-coded limit for the length of log messages. If it's + reached, messages are logged truncated and marked with "... [too long, + truncated]". + + Please don't file any support requests without trying to reproduce the + problem with increased debug level first. Once you read the log messages, + you may even be able to solve the problem on your own. + +------------------------------------------------------------------------------- + +7.3.2. single-threaded + +Specifies: + + Whether to run only one server thread. 
+ +Type of value: + + None + +Default value: + + Unset + +Effect if unset: + + Multi-threaded (or, where unavailable: forked) operation, i.e. the ability + to serve multiple requests simultaneously. + +Notes: + + This option is only there for debugging purposes. It will drastically + reduce performance. + +------------------------------------------------------------------------------- + +7.4. Access Control and Security + +This section of the config file controls the security-relevant aspects of +Privoxy's configuration. + +------------------------------------------------------------------------------- + +7.4.1. listen-address + +Specifies: + + The IP address and TCP port on which Privoxy will listen for client + requests. + +Type of value: + + [IP-Address]:Port + +Default value: + + 127.0.0.1:8118 + +Effect if unset: + + Bind to 127.0.0.1 (localhost), port 8118. This is suitable and recommended + for home users who run Privoxy on the same machine as their browser. + +Notes: + + You will need to configure your browser(s) to this proxy address and port. + + If you already have another service running on port 8118, or if you want to + serve requests from other machines (e.g. on your local network) as well, + you will need to override the default. + + If you leave out the IP address, Privoxy will bind to all interfaces + (addresses) on your machine and may become reachable from the Internet. In + that case, consider using access control lists (ACL's, see below), and/or a + firewall. + + If you open Privoxy to untrusted users, you will also want to make sure + that the following actions are disabled: enable-edit-actions and + enable-remote-toggle + +Example: + + Suppose you are running Privoxy on a machine which has the address + 192.168.0.1 on your local private network (192.168.0.0) and has another + outside connection with a different address. 
You want it to serve requests + from inside only: + + listen-address 192.168.0.1:8118 + + +------------------------------------------------------------------------------- + +7.4.2. toggle + +Specifies: + + Initial state of "toggle" status + +Type of value: + + 1 or 0 + +Default value: + + 1 + +Effect if unset: + + Act as if toggled on + +Notes: + + If set to 0, Privoxy will start in "toggled off" mode, i.e. mostly behave + like a normal, content-neutral proxy with both ad blocking and content + filtering disabled. See enable-remote-toggle below. + + The Windows version will only display the toggle icon in the system tray if + this option is present. + +------------------------------------------------------------------------------- + +7.4.3. enable-remote-toggle + +Specifies: + + Whether or not the web-based toggle feature may be used + +Type of value: + + 0 or 1 + +Default value: + + 0 + +Effect if unset: + + The web-based toggle feature is disabled. + +Notes: + + When toggled off, Privoxy mostly acts like a normal, content-neutral proxy, + i.e. doesn't block ads or filter content. + + Access to the toggle feature cannot be controlled separately by "ACLs" or + HTTP authentication, so that everybody who can access Privoxy (see "ACLs" + and listen-address above) can toggle it for all users. So this option is + not recommended for multi-user environments with untrusted users. + + Note that malicious client side code (e.g. Java) is also capable of using + this option. + + As a lot of Privoxy users don't read documentation, this feature is + disabled by default. + + Note that you must have compiled Privoxy with support for this feature, + otherwise this option has no effect. + +------------------------------------------------------------------------------- + +7.4.4. enable-remote-http-toggle + +Specifies: + + Whether or not Privoxy recognizes special HTTP headers to change its + behaviour.
+ +Type of value: + + 0 or 1 + +Default value: + + 0 + +Effect if unset: + + Privoxy ignores special HTTP headers. + +Notes: + + When toggled on, the client can change Privoxy's behaviour by setting + special HTTP headers. Currently the only supported special header is + "X-Filter: No", to disable filtering for the ongoing request, even if it is + enabled in one of the action files. + + This feature is disabled by default. If you are using Privoxy in an + environment with trusted clients, you may enable this feature at your + discretion. Note that malicious client side code (e.g. Java) is also capable + of using this feature. + + This option will be removed in future releases as it has been obsoleted by + the more general header taggers. + +------------------------------------------------------------------------------- + +7.4.5. enable-edit-actions + +Specifies: + + Whether or not the web-based actions file editor may be used + +Type of value: + + 0 or 1 + +Default value: + + 0 + +Effect if unset: + + The web-based actions file editor is disabled. + +Notes: + + Access to the editor cannot be controlled separately by "ACLs" or HTTP + authentication, so that everybody who can access Privoxy (see "ACLs" and + listen-address above) can modify its configuration for all users. + + This option is not recommended for environments with untrusted users and as + a lot of Privoxy users don't read documentation, this feature is disabled + by default. + + Note that malicious client side code (e.g. Java) is also capable of using + the actions editor and you shouldn't enable this option unless you + understand the consequences and are sure your browser is configured + correctly. + + Note that you must have compiled Privoxy with support for this feature, + otherwise this option has no effect. + +------------------------------------------------------------------------------- + +7.4.6.
enforce-blocks + +Specifies: + + Whether the user is allowed to ignore blocks and can "go there anyway". + +Type of value: + + 0 or 1 + +Default value: + + 0 + +Effect if unset: + + Blocks are not enforced. + +Notes: + + Privoxy is mainly used to block and filter requests as a service to the + user, for example to block ads and other junk that clogs the pipes. + Privoxy's configuration isn't perfect and sometimes innocent pages are + blocked. In this situation it makes sense to allow the user to enforce the + request and have Privoxy ignore the block. + + In the default configuration Privoxy's "Blocked" page contains a "go there + anyway" link that adds a special string (the force prefix) to the request + URL. If that link is used, Privoxy will detect the force prefix, remove it + again and let the request pass. + + Of course Privoxy can also be used to enforce a network policy. In that + case the user obviously should not be able to bypass any blocks, and that's + what the "enforce-blocks" option is for. If it's enabled, Privoxy hides the + "go there anyway" link. If the user adds the force prefix by hand, it will + not be accepted and the circumvention attempt is logged. + +Examples: + + enforce-blocks 1 + +------------------------------------------------------------------------------- + +7.4.7. ACLs: permit-access and deny-access + +Specifies: + + Who can access what. + +Type of value: + + src_addr[/src_masklen] [dst_addr[/dst_masklen]] + + Where src_addr and dst_addr are IP addresses in dotted decimal notation or + valid DNS names, and src_masklen and dst_masklen are subnet masks in CIDR + notation, i.e. integer values from 2 to 30 representing the length (in + bits) of the network address. The masks and the whole destination part are + optional.
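To see what a mask length means in practice, the following Python sketch checks whether an address falls inside a network written as addr/masklen. It is an illustration built on Python's standard ipaddress module, not Privoxy's own matching code, and the helper name matches is hypothetical. It handles literal IPv4 addresses only (Privoxy also accepts DNS names, which it resolves first), and it assumes that a missing mask length means a single host.

```python
import ipaddress


def matches(client, acl_source):
    """Return True if client lies in the network written as addr[/masklen].

    Assumption for this sketch: a missing mask length means a single host.
    """
    # strict=False allows host bits in the address part, e.g. 192.168.45.73/26
    network = ipaddress.ip_network(acl_source, strict=False)
    return ipaddress.ip_address(client) in network


if __name__ == "__main__":
    # A 26-bit mask leaves 6 host bits, so 192.168.45.64/26 spans .64 to .127.
    print(matches("192.168.45.73", "192.168.45.64/26"))
    print(matches("192.168.45.130", "192.168.45.64/26"))
    print(matches("192.168.45.73", "192.168.45.73"))
```

With a 26-bit mask, 192.168.45.73 is inside the network and 192.168.45.130 is not, which is why a narrower rule for one host can be layered on top of a broader rule for its subnet.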
+ +Default value: + + Unset + +Effect if unset: + + Don't restrict access further than implied by listen-address + +Notes: + + Access controls are included at the request of ISPs and systems + administrators, and are not usually needed by individual users. For a + typical home user, it will normally suffice to ensure that Privoxy only + listens on the localhost (127.0.0.1) or internal (home) network address by + means of the listen-address option. + + Please see the warnings in the FAQ that Privoxy is not intended to be a + substitute for a firewall or to encourage anyone to defer addressing basic + security weaknesses. + + Multiple ACL lines are OK. If any ACLs are specified, Privoxy only talks to + IP addresses that match at least one permit-access line and don't match any + subsequent deny-access line. In other words, the last match wins, with the + default being deny-access. + + If Privoxy is using a forwarder (see forward below) for a particular + destination URL, the dst_addr that is examined is the address of the + forwarder and NOT the address of the ultimate target. This is necessary + because it may be impossible for the local Privoxy to determine the IP + address of the ultimate target (that's often what gateways are used for). + + You should prefer using IP addresses over DNS names, because the address + lookups take time. All DNS names must resolve! You can not use domain + patterns like "*.org" or partial domain names. If a DNS name resolves to + multiple IP addresses, only the first one is used. + + Denying access to particular sites by ACL may have undesired side effects + if the site in question is hosted on a machine which also hosts other sites + (most sites are). + +Examples: + + Explicitly define the default behavior if no ACL and listen-address are + set: "localhost" is OK. 
The absence of a dst_addr implies that all + destination addresses are OK: + + permit-access localhost + + + Allow any host on the same class C subnet as www.privoxy.org access to + nothing but www.example.com (or other domains hosted on the same system): + + permit-access www.privoxy.org/24 www.example.com/32 + + + Allow access from any host on the 26-bit subnet 192.168.45.64 to anywhere, + with the exception that 192.168.45.73 may not access the IP address behind + www.dirty-stuff.example.com: + + permit-access 192.168.45.64/26 + deny-access 192.168.45.73 www.dirty-stuff.example.com + + +------------------------------------------------------------------------------- + +7.4.8. buffer-limit + +Specifies: + + Maximum size of the buffer for content filtering. + +Type of value: + + Size in Kbytes + +Default value: + + 4096 + +Effect if unset: + + Use a 4MB (4096 KB) limit. + +Notes: + + For content filtering, i.e. the +filter and +deanimate-gif actions, it is + necessary that Privoxy buffers the entire document body. This can be + potentially dangerous, since a server could just keep sending data + indefinitely and wait for your RAM to exhaust -- with nasty consequences. + Hence this option. + + When a document buffer size reaches the buffer-limit, it is flushed to the + client unfiltered and no further attempt to filter the rest of the document + is made. Remember that there may be multiple threads running, which might + require up to buffer-limit Kbytes each, unless you have enabled + "single-threaded" above. + +------------------------------------------------------------------------------- + +7.5. Forwarding + +This feature allows routing of HTTP requests through a chain of multiple +proxies. + +Forwarding can be used to chain Privoxy with a caching proxy to speed up +browsing. Using a parent proxy may also be necessary if the machine that +Privoxy runs on has no direct Internet access. + +Note that parent proxies can severely decrease your privacy level. 
For example +a parent proxy could add your IP address to the request headers and if it's a +caching proxy it may add the "Etag" header to revalidation requests again, even +though you configured Privoxy to remove it. It may also ignore Privoxy's header +time randomization and use the original values which could be used by the +server as cookie replacement to track your steps between visits. + +Also specified here are SOCKS proxies. Privoxy supports the SOCKS 4 and SOCKS +4A protocols. + +------------------------------------------------------------------------------- + +7.5.1. forward + +Specifies: + + To which parent HTTP proxy specific requests should be routed. + +Type of value: + + target_pattern http_parent[:port] + + where target_pattern is a URL pattern that specifies to which requests + (i.e. URLs) this forward rule shall apply. Use / to denote "all URLs". + http_parent[:port] is the DNS name or IP address of the parent HTTP proxy + through which the requests should be forwarded, optionally followed by its + listening port (default: 8080). Use a single dot (.) to denote "no + forwarding". + +Default value: + + Unset + +Effect if unset: + + Don't use parent HTTP proxies. + +Notes: + + If http_parent is ".", then requests are not forwarded to another HTTP + proxy but are made directly to the web servers. + + Multiple lines are OK, they are checked in sequence, and the last match + wins. + +Examples: + + Everything goes to an example parent proxy, except SSL on port 443 (which + it doesn't handle): + + forward / parent-proxy.example.org:8080 + forward :443 . + + + Everything goes to our example ISP's caching proxy, except for requests to + that ISP's sites: + + forward / caching-proxy.isp.example.net:8000 + forward .isp.example.net . + + +------------------------------------------------------------------------------- + +7.5.2. 
forward-socks4 and forward-socks4a + +Specifies: + + Through which SOCKS proxy (and optionally to which parent HTTP proxy) + specific requests should be routed. + +Type of value: + + target_pattern socks_proxy[:port] http_parent[:port] + + where target_pattern is a URL pattern that specifies to which requests + (i.e. URLs) this forward rule shall apply. Use / to denote "all URLs". + http_parent and socks_proxy are IP addresses in dotted decimal notation or + valid DNS names (http_parent may be "." to denote "no HTTP forwarding"), + and the optional port parameters are TCP ports, i.e. integer values from 1 + to 65535. + +Default value: + + Unset + +Effect if unset: + + Don't use SOCKS proxies. + +Notes: + + Multiple lines are OK, they are checked in sequence, and the last match + wins. + + The difference between forward-socks4 and forward-socks4a is that in the + SOCKS 4A protocol, the DNS resolution of the target hostname happens on the + SOCKS server, while in SOCKS 4 it happens locally. + + If http_parent is ".", then requests are not forwarded to another HTTP + proxy but are made (HTTP-wise) directly to the web servers, albeit through + a SOCKS proxy. + +Examples: + + From the company example.com, direct connections are made to all "internal" + domains, but everything outbound goes through their ISP's proxy by way of + example.com's corporate SOCKS 4A gateway to the Internet. + + forward-socks4a / socks-gw.example.com:1080 www-cache.isp.example.net:8080 + forward .example.com . + + + A rule that uses a SOCKS 4 gateway for all destinations but no HTTP parent + looks like this: + + forward-socks4 / socks-gw.example.com:1080 . + + + To chain Privoxy and Tor, both running on the same system, you would use + something like: + + forward-socks4a / 127.0.0.1:9050 . + + + The public Tor network can't be used to reach your local network, if you + need to access local servers you therefore might want to make some + exceptions: + + forward 192.168.*.*/ .
+ forward 10.*.*.*/ . + forward 127.*.*.*/ . + + + Unencrypted connections to systems in these address ranges will be as (un) + secure as the local network is, but the alternative is that you can't reach + the local network through Privoxy at all. Of course this may actually be + desired and there is no reason to make these exceptions if you aren't sure + you need them. + + If you also want to be able to reach servers in your local network by using + their names, you will need additional exceptions that look like this: + + forward localhost/ . + + +------------------------------------------------------------------------------- + +7.5.3. Advanced Forwarding Examples + +If you have links to multiple ISPs that provide various special content only to +their subscribers, you can configure multiple Privoxies which have connections +to the respective ISPs to act as forwarders to each other, so that your users +can see the internal content of all ISPs. + +Assume that host-a has a PPP connection to isp-a.example.net. And host-b has a +PPP connection to isp-b.example.org. Both run Privoxy. Their forwarding +configuration can look like this: + +host-a: + + forward / . + forward .isp-b.example.org host-b:8118 + + +host-b: + + forward / . + forward .isp-a.example.net host-a:8118 + + +Now, your users can set their browser's proxy to use either host-a or host-b +and be able to browse the internal content of both isp-a and isp-b. + +If you intend to chain Privoxy and squid locally, then chaining as browser -> +squid -> privoxy is the recommended way.
+ +Assuming that Privoxy and squid run on the same box, your squid configuration +could then look like this: + + # Define Privoxy as parent proxy (without ICP) + cache_peer 127.0.0.1 parent 8118 7 no-query + + # Define ACL for protocol FTP + acl ftp proto FTP + + # Do not forward FTP requests to Privoxy + always_direct allow ftp + + # Forward all the rest to Privoxy + never_direct allow all + + +You would then need to change your browser's proxy settings to squid's address +and port. Squid normally uses port 3128. If unsure consult http_port in +squid.conf. + +You could just as well decide to only forward requests you suspect of leading +to Windows executables through a virus-scanning parent proxy, say, on +antivir.example.com, port 8010: + + forward / . + forward /.*\.(exe|com|dll|zip)$ antivir.example.com:8010 + + +------------------------------------------------------------------------------- + +7.5.4. forwarded-connect-retries + +Specifies: + + How often Privoxy retries if a forwarded connection request fails. + +Type of value: + + Number of retries. + +Default value: + + 0 + +Effect if unset: + + Connections forwarded through other proxies are treated like direct + connections and no retry attempts are made. + +Notes: + + forwarded-connect-retries is mainly interesting for socks4a connections, + where Privoxy can't detect why the connections failed. The connection might + have failed because of a DNS timeout in which case a retry makes sense, but + it might also have failed because the server doesn't exist or isn't + reachable. In this case the retry will just delay the appearance of + Privoxy's error message. + + Note that in the context of this option, "forwarded connections" includes + all connections that Privoxy forwards through other proxies. This option is + not limited to the HTTP CONNECT method. + + Only use this option, if you are getting lots of forwarding-related error + messages that go away when you try again manually. 
Start with a small value + and check Privoxy's logfile from time to time, to see how many retries are + usually needed. + +Examples: + + forwarded-connect-retries 1 + +------------------------------------------------------------------------------- + +7.5.5. accept-intercepted-requests + +Specifies: + + Whether intercepted requests should be treated as valid. + +Type of value: + + 0 or 1 + +Default value: + + 0 + +Effect if unset: + + Only proxy requests are accepted, intercepted requests are treated as + invalid. + +Notes: + + If you don't trust your clients and want to force them to use Privoxy, + enable this option and configure your packet filter to redirect outgoing + HTTP connections into Privoxy. + + Make sure that Privoxy's own requests aren't redirected as well. + Additionally take care that Privoxy can't intentionally connect to itself, + otherwise you could run into redirection loops if Privoxy's listening port + is reachable by the outside or an attacker has access to the pages you + visit. + +Examples: + + accept-intercepted-requests 1 + +------------------------------------------------------------------------------- + +7.5.6. allow-cgi-request-crunching + +Specifies: + + Whether requests to Privoxy's CGI pages can be blocked or redirected. + +Type of value: + + 0 or 1 + +Default value: + + 0 + +Effect if unset: + + Privoxy ignores block and redirect actions for its CGI pages. + +Notes: + + By default Privoxy ignores block or redirect actions for its CGI pages. + Intercepting these requests can be useful in multi-user setups to implement + fine-grained access control, but it can also render the complete web + interface useless and make debugging problems painful if done without care. + + Don't enable this option unless you're sure that you really need it. + +Examples: + + allow-cgi-request-crunching 1 + +------------------------------------------------------------------------------- + +7.5.7. 
split-large-forms + +Specifies: + + Whether the CGI interface should stay compatible with broken HTTP clients. + +Type of value: + + 0 or 1 + +Default value: + + 0 + +Effect if unset: + + The CGI forms generate long GET URLs. + +Notes: + + Privoxy's CGI forms can lead to rather long URLs. This isn't a problem as + far as the HTTP standard is concerned, but it can confuse clients with + arbitrary URL length limitations. + + Enabling split-large-forms causes Privoxy to divide big forms into smaller + ones to keep the URL length down. It makes editing a lot less convenient + and you can no longer submit all changes at once, but at least it works + around this browser bug. + + If you don't notice any editing problems, there is no reason to enable this + option, but if one of the submit buttons appears to be broken, you should + give it a try. + +Examples: + + split-large-forms 1 + +------------------------------------------------------------------------------- + +7.6. Windows GUI Options + +Privoxy has a number of options specific to the Windows GUI interface: + +If "activity-animation" is set to 1, the Privoxy icon will animate when +"Privoxy" is active. To turn off, set to 0. + + activity-animation 1 + + +If "log-messages" is set to 1, Privoxy will log messages to the console window: + + log-messages 1 + + +If "log-buffer-size" is set to 1, the size of the log buffer, i.e. the amount +of memory used for the log messages displayed in the console window, will be +limited to "log-max-lines" (see below). + +Warning: Setting this to 0 will cause the buffer to grow without bound and eat +up all your memory! + + log-buffer-size 1 + + +log-max-lines is the maximum number of lines held in the log buffer. See above.
+ + log-max-lines 200 + + +If "log-highlight-messages" is set to 1, Privoxy will highlight portions of the +log messages with a bold-faced font: + + log-highlight-messages 1 + + +The font used in the console window: + + log-font-name Comic Sans MS + + +Font size used in the console window: + + log-font-size 8 + + +"show-on-task-bar" controls whether or not Privoxy will appear as a button on +the Task bar when minimized: + + show-on-task-bar 0 + + +If "close-button-minimizes" is set to 1, the Windows close button will minimize +Privoxy instead of closing the program (close with the exit option on the File +menu). + + close-button-minimizes 1 + + +The "hide-console" option is specific to the MS-Win console version of Privoxy. +If this option is used, Privoxy will disconnect from and hide the command +console. + + #hide-console + + +------------------------------------------------------------------------------- + +8. Actions Files + +The actions files are used to define what actions Privoxy takes for which URLs, +and thus determine how ad images, cookies and various other aspects of HTTP +content and transactions are handled, and on which sites (or even parts +thereof). There are a number of such actions, with a wide range of +functionality. Each action does something a little different. These actions +give us a veritable arsenal of tools with which to exert our control, +preferences and independence. Actions can be combined so that their effects are +aggregated when applied against a given set of URLs. + +There are three actions files included with Privoxy with differing purposes: + + * default.action - is the primary actions file that sets the initial values + for all actions. It is intended to provide a base level of functionality + for Privoxy's array of features. So it is a set of broad rules that should + work reasonably well as-is for most users. This is the file that the + developers are keeping updated, and making available to users. 
It also holds the user's + preferences as set via standard.action, i.e. either Cautious (the default), + Medium, or Advanced (see below). + + * user.action - is intended to be for local site preferences and exceptions. + As an example, if your ISP or your bank has specific requirements and needs + special handling, this kind of thing should go here. This file will not be + upgraded. + + * standard.action - is used only by the web based editor at + http://config.privoxy.org/edit-actions-list?f=default, to set various pre-defined + sets of rules for the default actions section in default.action. + + Edit Set to Cautious Set to Medium Set to Advanced + + These have increasing levels of aggressiveness and have no influence on + your browsing unless you select them explicitly in the editor. A default + installation should be pre-set to Cautious (versions prior to 3.0.5 were + set to Medium). New users should try this for a while before adjusting the + settings to more aggressive levels. The more aggressive the settings, the + more likely problems become, such as sites not working as they + should. + + The Edit button allows you to turn each action on/off individually for + fine-tuning. The Cautious button changes the actions list to low/safe + settings which will activate ad blocking and a minimal set of Privoxy's + features, and subsequently there will be less of a chance for accidental + problems. The Medium button sets the list to a medium level of other + features and a low level set of privacy features. The Advanced button sets + the list to a high level of ad blocking and medium level of privacy. See + the chart below. The latter three buttons override any changes made with + the Edit button. More fine-tuning can be done in the lower sections of this + internal page. + + It is not recommended to edit the standard.action file itself. + + The default profiles, and their associated actions, as pre-defined in + standard.action are: + + Table 1. 
Default Configurations + + +---------------------------------------------------------------+ + | Feature | Cautious | Medium | Advanced | + |--------------------------+-----------+------------+-----------| + |Ad-blocking Aggressiveness|medium |high |high | + |--------------------------+-----------+------------+-----------| + |Ad-filtering by size |no |yes |yes | + |--------------------------+-----------+------------+-----------| + |Ad-filtering by link |no |no |yes | + |--------------------------+-----------+------------+-----------| + |Pop-up killing |blocks only|blocks only |blocks only| + |--------------------------+-----------+------------+-----------| + |Privacy Features |low |medium |medium/high| + |--------------------------+-----------+------------+-----------| + |Cookie handling |none |session-only|kill | + |--------------------------+-----------+------------+-----------| + |Referer forging |no |yes |yes | + |--------------------------+-----------+------------+-----------| + |GIF de-animation |no |yes |yes | + |--------------------------+-----------+------------+-----------| + |Fast redirects |no |no |yes | + |--------------------------+-----------+------------+-----------| + |HTML taming |no |no |yes | + |--------------------------+-----------+------------+-----------| + |JavaScript taming |no |no |yes | + |--------------------------+-----------+------------+-----------| + |Web-bug killing |no |yes |yes | + |--------------------------+-----------+------------+-----------| + |Image tag reordering |no |no |yes | + +---------------------------------------------------------------+ + +The list of actions files to be used is defined in the main configuration +file, and the files are processed in the order in which they are listed (e.g. default.action is +typically processed before user.action). The content of these can all be viewed +and edited from http://config.privoxy.org/show-status. 
The over-riding +principle when applying actions, is that the last action that matches a given +URL wins. The broadest, most general rules go first (defined in +default.action), followed by any exceptions (typically also in default.action), +which are then followed lastly by any local preferences (typically in +user.action). Generally, user.action has the last word. + +An actions file typically has multiple sections. If you want to use "aliases" +in an actions file, you have to place the (optional) alias section at the top +of that file. Then comes the default set of rules which will apply universally +to all sites and pages (be very careful with using such a universal set in +user.action or any other actions file after default.action, because it will +override the result from consulting any previous file). And then below that, +exceptions to the defined universal policies. You can regard user.action as an +appendix to default.action, with the advantage that it is a separate file, +which makes preserving your personal settings across Privoxy upgrades easier. + +Actions can be used to block anything you want, including ads, banners, or just +some obnoxious URL whose content you would rather not see. Cookies can be +accepted or rejected, or accepted only during the current browser session (i.e. +not written to disk), content can be modified, some JavaScripts tamed, +user-tracking fooled, and much more. See below for a complete list of actions. + +------------------------------------------------------------------------------- + +8.1. Finding the Right Mix + +Note that some actions, like cookie suppression or script disabling, may render +some sites unusable that rely on these techniques to work properly. Finding the +right mix of actions is not always easy and certainly a matter of personal +taste. And, things can always change, requiring refinements in the +configuration. 
In general, it can be said that the more "aggressive" your +default settings (in the top section of the actions file) are, the more +exceptions for "trusted" sites you will have to make later. If, for example, +you want to crunch all cookies by default, you'll have to make exceptions from +that rule for sites that you regularly use and that require cookies for +actually useful purposes, like maybe your bank, favorite shop, or newspaper. + +We have tried to provide you with reasonable rules to start from in the +distribution actions files. But there is no general rule of thumb on these +things. There just are too many variables, and sites are constantly changing. +Sooner or later you will want to change the rules (and read this chapter again +:). + +------------------------------------------------------------------------------- + +8.2. How to Edit + +The easiest way to edit the actions files is with a browser by using our +browser-based editor, which can be reached from +http://config.privoxy.org/show-status. Note: the config file option enable-edit-actions must be enabled +for this to work. The editor allows both fine-grained control over every single +feature on a per-URL basis, and easy choosing from wholesale sets of defaults +like "Cautious", "Medium" or "Advanced". Warning: the "Advanced" setting is +more aggressive, and will be more likely to cause problems for some sites. +Experienced users only! + +If you prefer plain text editing to GUIs, you can of course also directly edit +the actions files with your favorite text editor. Look at default.action, +which is richly commented and contains many good examples. + +------------------------------------------------------------------------------- + +8.3. How Actions are Applied to Requests + +Actions files are divided into sections. There are special sections, like the " +alias" sections which will be discussed later. 
For now let's concentrate on +regular sections: They have a heading line (often split up to multiple lines +for readability) which consists of a list of actions, separated by whitespace +and enclosed in curly braces. Below that, there is a list of URL and tag +patterns, each on a separate line. + +To determine which actions apply to a request, the URL of the request is +compared to all URL patterns in each "action file". Every time it matches, the +list of applicable actions for the request is incrementally updated, using the +heading of the section in which the pattern is located. The same is done again +for tags and tag patterns later on. + +If multiple applying sections set the same action differently, the last match +wins. If not, the effects are aggregated. E.g. a URL might match a regular +section with a heading line of { +handle-as-image }, then later another one +with just { +block }, so that both actions apply. And there may well be +cases where you will want to combine actions together. Such a section then +might look like: + + { +handle-as-image +block } + # Block these as if they were images. Send no block page. + banners.example.com + media.example.com/.*banners + .example.com/images/ads/ + + +You can trace this process for URL patterns and any given URL by visiting +http://config.privoxy.org/show-url-info. + +Examples and more detail on this are provided in the Appendix, Troubleshooting: +Anatomy of an Action section. + +------------------------------------------------------------------------------- + +8.4. Patterns + +As mentioned, Privoxy uses "patterns" to determine what actions might apply to +which sites and pages your browser attempts to access. These "patterns" use +wild card type pattern matching to achieve a high degree of flexibility. This +allows one expression to be expanded and potentially match against many similar +patterns. 
+ +Generally, a URL pattern has the form domain/path, where both the domain +and the path are optional. 
(This is why the special / pattern matches all URLs). +Note that the protocol portion of the URL pattern (e.g. http://) should not be +included in the pattern. This is assumed already! + +The pattern matching syntax is different for the domain and path parts of the +URL. The domain part uses a simple globbing type matching technique, while the +path part uses a more flexible "Regular Expressions (PCRE)" based syntax. + +www.example.com/ + + is a domain-only pattern and will match any request to www.example.com, + regardless of which document on that server is requested. So ALL pages in + this domain would be covered by the scope of this action. Note that a + simple example.com is different and would NOT match. + +www.example.com + + means exactly the same. For domain-only patterns, the trailing / may be + omitted. + +www.example.com/index.html + + matches all the documents on www.example.com whose name starts with + /index.html. + +www.example.com/index.html$ + + matches only the single document /index.html on www.example.com. + +/index.html$ + + matches the document /index.html, regardless of the domain, i.e. on any web + server anywhere. + +index.html + + matches nothing, since it would be interpreted as a domain name and there + is no top-level domain called .html. So it's a mistake. + +------------------------------------------------------------------------------- + +8.4.1. The Domain Pattern + +The matching of the domain part offers some flexible options: if the domain +starts or ends with a dot, it becomes unanchored at that end. For example: + +.example.com + + matches any domain with first-level domain com and second-level domain + example. For example www.example.com, example.com and + foo.bar.baz.example.com. Note that it wouldn't match if the second-level + domain was another-example. + +www. + + matches any domain that STARTS with www. (It also matches the domain www + but most of the time that doesn't matter.) + +.example. 
+ + matches any domain that CONTAINS .example.. And, by the way, also included + would be any files or documents that exist within that domain since no path + limitations are specified. (Correctly speaking: It matches any FQDN that + contains example as a domain.) This might be www.example.com, + news.example.de, or www.example.net/cgi/testing.pl for instance. All these + cases are matched. + +Additionally, there are wild-cards that you can use in the domain names +themselves. These work similarly to shell globbing type wild-cards: "*" +represents zero or more arbitrary characters (this is equivalent to the +"Regular Expression" based syntax of ".*"), "?" represents any single character +(this is equivalent to the regular expression syntax of a simple "."), and you +can define "character classes" in square brackets which is similar to the same +regular expression technique. All of this can be freely mixed: + +ad*.example.com + + matches "adserver.example.com", "ads.example.com", etc but not + "sfads.example.com" + +*ad*.example.com + + matches all of the above, and then some. + +.?pix.com + + matches www.ipix.com, pictures.epix.com, a.b.c.d.e.upix.com etc. + +www[1-9a-ez].example.c* + + matches www1.example.com, www4.example.cc, wwwd.example.cy, + wwwz.example.com etc., but not wwww.example.com. + +While flexible, this is not the sophistication of full regular expression based +syntax. + +------------------------------------------------------------------------------- + +8.4.2. The Path Pattern + +Privoxy uses Perl compatible (PCRE) "Regular Expression" based syntax (through +the PCRE library) for matching the path portion (after the slash), and is thus +more flexible. + +There is an Appendix with a brief quick-start into regular expressions, and +full (very technical) documentation on PCRE regex syntax is available on-line +at http://www.pcre.org/man.txt. 
You might also find the Perl man page on +regular expressions (man perlre) useful, which is available on-line at +http://perldoc.perl.org/perlre.html. + +Note that the path pattern is automatically left-anchored at the "/", i.e. it +matches as if it started with a "^" (regular expression speak for the +beginning of a line). + +Please also note that matching in the path is CASE INSENSITIVE by default, but +you can switch to case sensitive at any point in the pattern by using the +"(?-i)" switch: www.example.com/(?-i)PaTtErN.* will match only documents whose +path starts with PaTtErN in exactly this capitalization. + +.example.com/.* + + Is equivalent to just ".example.com", since any documents within that + domain are matched with or without the ".*" regular expression. This is + redundant. + +.example.com/.*/index.html$ + + Will match any page in the domain of "example.com" that is named + "index.html", and that is part of some path. For example, it matches + "www.example.com/testing/index.html" but NOT "www.example.com/index.html" + because the regular expression called for at least two "/'s", thus the path + requirement. It also would match "www.example.com/testing/index_html", + because of the special meta-character ".". + +.example.com/(.*/)?index\.html$ + + This regular expression is conditional, so it will match any page named + "index.html" regardless of path, which in this case may or may not contain + further "/'s". And because of the escaped "\." and the trailing "$", the + name must end in exactly ".html". + +.example.com/(.*/)(ads|banners?|junk) + + This regular expression will match any path of "example.com" that contains + any of the words "ads", "banner", "banners" (because of the "?") or "junk". + The path does not have to end in these words, just contain them. + +.example.com/(.*/)(ads|banners?|junk)/.*\.(jpe?g|gif|png)$ + + This is very much the same as above, except now it must end in either + ".jpg", ".jpeg", ".gif" or ".png". 
So this one is limited to common image + formats. + +There are many, many good examples to be found in default.action, and more +tutorials below in the Appendix on regular expressions. + +------------------------------------------------------------------------------- + +8.4.3. The Tag Pattern + +Tag patterns are used to change the applying actions based on the request's +tags. Tags can be created with either the client-header-tagger or the +server-header-tagger action. + +Tag patterns have to start with "TAG:", so Privoxy can tell them apart from URL +patterns. Everything after the colon, including white space, is interpreted as a +regular expression with path pattern syntax, except that tag patterns aren't +left-anchored automatically (Privoxy doesn't silently add a "^", you have to do +it yourself if you need it). + +To match all requests that are tagged with "foo" your pattern line should be +"TAG:^foo$". "TAG:foo" would work as well, but it would also match requests +whose tags contain "foo" somewhere. "TAG: foo" wouldn't work as it requires +white space. + +Sections can contain URL and tag patterns at the same time, but tag patterns +are checked after the URL patterns and thus always overrule them, even if they +are located before the URL patterns. + +Once a new tag is added, Privoxy checks right away if it's matched by one of +the tag patterns and updates the action settings accordingly. As a result tags +can be used to activate other tagger actions, as long as these other taggers +look for headers that haven't already been parsed. + +For example you could tag client requests which use the POST method, then use +this tag to activate another tagger that adds a tag if cookies are sent, and +then use a block action based on the cookie tag. This allows the outcome of one +action to be input into a subsequent action. 
However if you'd reverse the +position of the described taggers, and activated the method tagger based on the +cookie tagger, no method tags would be created. The method tagger would look +for the request line, but at the time the cookie tag is created, the request +line has already been parsed. + +While this is a limitation you should be aware of, this kind of indirection is +seldom needed anyway and even the example doesn't make too much sense. + +------------------------------------------------------------------------------- + +8.5. Actions + +All actions are disabled by default, until they are explicitly enabled +somewhere in an actions file. Actions are turned on if preceded with a "+", and +turned off if preceded with a "-". So a +action means "do that action", e.g. ++block means "please block URLs that match the following patterns", and -block +means "don't block URLs that match the following patterns, even if +block +previously applied." + +Again, actions are invoked by placing them on a line, enclosed in curly braces +and separated by whitespace, like in {+some-action -some-other-action +{some-parameter}}, followed by a list of URL patterns, one per line, to which +they apply. Together, the actions line and the following pattern lines make up +a section of the actions file. + +Actions fall into three categories: + + * Boolean, i.e the action can only be "enabled" or "disabled". Syntax: + + +name # enable action name + -name # disable action name + + + Example: +block + + * Parameterized, where some value is required in order to enable this type of + action. Syntax: + + +name{param} # enable action and set parameter to param, + # overwriting parameter from previous match if necessary + -name # disable action. The parameter can be omitted + + + Note that if the URL matches multiple positive forms of a parameterized + action, the last match wins, i.e. the params from earlier matches are + simply ignored. 
+ + Example: +hide-user-agent{Mozilla/5.0 (X11; U; FreeBSD i386; en-US; + rv:1.8.1.4) Gecko/20070602 Firefox/2.0.0.4} + + * Multi-value. These look exactly like parameterized actions, but they behave + differently: If the action applies multiple times to the same URL, but with + different parameters, all the parameters from all matches are remembered. + This is used for actions that can be executed for the same request + repeatedly, like adding multiple headers, or filtering through multiple + filters. Syntax: + + +name{param} # enable action and add param to the list of parameters + -name{param} # remove the parameter param from the list of parameters + # If it was the last one left, disable the action. + -name # disable this action completely and remove all parameters from the list + + + Examples: +add-header{X-Fun-Header: Some text} and +filter{html-annoyances} + +If nothing is specified in any actions file, no "actions" are taken. So in this +case Privoxy would just be a normal, non-blocking, non-filtering proxy. You +must specifically enable the privacy and blocking features you need (although +the provided default actions files will give a good starting point). + +Later defined action sections always over-ride earlier ones of the same type. +So exceptions to any rules you make, should come in the latter part of the file +(or in a file that is processed later when using multiple actions files such as +user.action). For multi-valued actions, the actions are applied in the order +they are specified. Actions files are processed in the order they are defined +in config (the default installation has three actions files). It is also quite +possible for any given URL to match more than one "pattern" (because of +wildcards and regular expressions), and thus to trigger more than one set of +actions! Last match wins. + +The valid Privoxy actions are: + +------------------------------------------------------------------------------- + +8.5.1. 
add-header + +Typical use: + + Confuse log analysis, custom applications + +Effect: + + Sends a user defined HTTP header to the web server. + +Type: + + Multi-value. + +Parameter: + + Any string value is possible. Validity of the defined HTTP headers is not + checked. It is recommended that you use the "X-" prefix for custom headers. + +Notes: + + This action may be specified multiple times, in order to define multiple + headers. This is rarely needed for the typical user. If you don't know what + "HTTP headers" are, you definitely don't need to worry about this one. + +Example usage: + + +add-header{X-User-Tracking: sucks} + + +------------------------------------------------------------------------------- + +8.5.2. block + +Typical use: + + Block ads or other unwanted content + +Effect: + + Requests for URLs to which this action applies are blocked, i.e. the + requests are trapped by Privoxy and the requested URL is never retrieved, + but is answered locally with a substitute page or image, as determined by + the handle-as-image, set-image-blocker, and handle-as-empty-document + actions. + +Type: + + Boolean. + +Parameter: + + N/A + +Notes: + + Privoxy sends a special "BLOCKED" page for requests to blocked pages. This + page contains links to find out why the request was blocked, and a + click-through to the blocked content (the latter only if compiled with the + force feature enabled). The "BLOCKED" page adapts to the available screen + space -- it displays full-blown if space allows, or miniaturized and + text-only if loaded into a small frame or window. If you are using Privoxy + right now, you can take a look at the "BLOCKED" page. + + A very important exception occurs if both block and handle-as-image, apply + to the same request: it will then be replaced by an image. If + set-image-blocker (see below) also applies, the type of image will be + determined by its parameter, if not, the standard checkerboard pattern is + sent. 
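+
+    As a minimal sketch of this combination (the host name is a placeholder,
+    and the "blank" parameter is assumed from the set-image-blocker section
+    rather than from the text above), a section like the following should
+    answer matching requests with a blank image instead of the "BLOCKED"
+    page or the checkerboard pattern:
+
+        {+block +handle-as-image +set-image-blocker{blank}}
+        .ads.example.com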
+ + It is important to understand this process, in order to understand how + Privoxy deals with ads and other unwanted content. Blocking is a core + feature, and one upon which various other features depend. + + The filter action can perform a very similar task, by "blocking" banner + images and other content through rewriting the relevant URLs in the + document's HTML source, so they don't get requested in the first place. + Note that this is a totally different technique, and it's easy to confuse + the two. + +Example usage (section): + + {+block} + # Block and replace with "blocked" page + .nasty-stuff.example.com + + {+block +handle-as-image} + # Block and replace with image + .ad.doubleclick.net + .ads.r.us/banners/ + + {+block +handle-as-empty-document} + # Block and then ignore + adserver.exampleclick.net/.*\.js$ + + +------------------------------------------------------------------------------- + +8.5.3. client-header-filter + +Typical use: + + Rewrite or remove single client headers. + +Effect: + + All client headers to which this action applies are filtered on-the-fly + through the specified regular expression based substitutions. + +Type: + + Parameterized. + +Parameter: + + The name of a client-header filter, as defined in one of the filter files. + +Notes: + + Client-header filters are applied to each header on its own, not to all at + once. This makes it easier to diagnose problems, but on the downside you + can't write filters that only change header x if header y's value is z. You + can do that by using tags though. + + Client-header filters are executed after the other header actions have + finished and use their output as input. + + If the request URL gets changed, Privoxy will detect that and use the new + one. This can be used to rewrite the request destination behind the + client's back, for example to specify a Tor exit relay for certain + requests. 
+ + Please refer to the filter file chapter to learn which client-header + filters are available by default, and how to create your own. + +Example usage (section): + + {+client-header-filter{hide-tor-exit-notation}} + .exit/ + + + +------------------------------------------------------------------------------- + +8.5.4. client-header-tagger + +Typical use: + + Block requests based on their headers. + +Effect: + + Client headers to which this action applies are filtered on-the-fly through + the specified regular expression based substitutions, the result is used as + tag. + +Type: + + Parameterized. + +Parameter: + + The name of a client-header tagger, as defined in one of the filter files. + +Notes: + + Client-header taggers are applied to each header on its own, and as the + header isn't modified, each tagger "sees" the original. + + Client-header taggers are the first actions that are executed and their + tags can be used to control every other action. + +Example usage (section): + + # Tag every request with the User-Agent header + {+client-header-tagger{user-agent}} + / + + + +------------------------------------------------------------------------------- + +8.5.5. content-type-overwrite + +Typical use: + + Stop useless download menus from popping up, or change the browser's + rendering mode + +Effect: + + Replaces the "Content-Type:" HTTP server header. + +Type: + + Parameterized. + +Parameter: + + Any string. + +Notes: + + The "Content-Type:" HTTP server header is used by the browser to decide + what to do with the document. The value of this header can cause the + browser to open a download menu instead of displaying the document by + itself, even if the document's format is supported by the browser. + + The declared content type can also affect which rendering mode the browser + chooses. If XHTML is delivered as "text/html", many browsers treat it as + yet another broken HTML document. 
If it is sent as "application/xml", + browsers with XHTML support will only display it if the syntax is correct. + + If you see a web site that proudly uses XHTML buttons, but sets + "Content-Type: text/html", you can use Privoxy to overwrite it with + "application/xml" and validate the web master's claim inside your + XHTML-supporting browser. If the syntax is incorrect, the browser will + complain loudly. + + You can also go the opposite direction: if your browser prints error + messages instead of rendering a document falsely declared as XHTML, you can + overwrite the content type with "text/html" and have it rendered as a broken + HTML document. + + By default content-type-overwrite only replaces "Content-Type:" headers + that look like some kind of text. If you want to overwrite it + unconditionally, you have to combine it with force-text-mode. This + limitation exists for a reason; think twice before circumventing it. + + Most of the time it's easier to replace this action with a custom + server-header filter. It allows you to activate it for every document of a + certain site and it will still only replace the content types you aimed at. + + Of course you can apply content-type-overwrite to a whole site and then + make URL based exceptions, but it's a lot more work to get the same + precision. + +Example usage (sections): + + # Check if www.example.net/ really uses valid XHTML + { +content-type-overwrite{application/xml} } + www.example.net/ + + # but leave the content type unmodified if the URL looks like a style sheet + {-content-type-overwrite} + www.example.net/.*\.css$ + www.example.net/.*style + + +------------------------------------------------------------------------------- + +8.5.6. crunch-client-header + +Typical use: + + Remove a client header Privoxy has no dedicated action for. + +Effect: + + Deletes every header sent by the client that contains the string the user + supplied as parameter. + +Type: + + Parameterized. 
+ +Parameter: + + Any string. + +Notes: + + This action allows you to block client headers for which no dedicated + Privoxy action exists. Privoxy will remove every client header that + contains the string you supplied as parameter. + + Regular expressions are not supported and you can't use this action to + block different headers in the same request, unless they contain the same + string. + + crunch-client-header is only meant for quick tests. If you have to block + several different headers, or only want to modify parts of them, you should + use a client-header filter. + + +-----------------------------------------------------------------+ + | Warning | + |-----------------------------------------------------------------| + |Don't block any header without understanding the consequences. | + +-----------------------------------------------------------------+ +Example usage (section): + + # Block the non-existent "Privacy-Violation:" client header + { +crunch-client-header{Privacy-Violation:} } + / + + + +------------------------------------------------------------------------------- + +8.5.7. crunch-if-none-match + +Typical use: + + Prevent yet another way to track the user's steps between sessions. + +Effect: + + Deletes the "If-None-Match:" HTTP client header. + +Type: + + Boolean. + +Parameter: + + N/A + +Notes: + + Removing the "If-None-Match:" HTTP client header is useful for filter + testing, where you want to force a real reload instead of getting status + code "304" which would cause the browser to use a cached copy of the page. + + It is also useful to make sure the header isn't used as a cookie + replacement (unlikely but possible). + + Blocking the "If-None-Match:" header shouldn't cause any caching problems, + as long as the "If-Modified-Since:" header isn't blocked or missing as + well. + + It is recommended to use this action together with hide-if-modified-since + and overwrite-last-modified. 
+ +Example usage (section): + + # Let the browser revalidate cached documents but don't + # allow the server to use the revalidation headers for user tracking. + {+hide-if-modified-since{-60} \ + +overwrite-last-modified{randomize} \ + +crunch-if-none-match} + / + + +------------------------------------------------------------------------------- + +8.5.8. crunch-incoming-cookies + +Typical use: + + Prevent the web server from setting HTTP cookies on your system + +Effect: + + Deletes any "Set-Cookie:" HTTP headers from server replies. + +Type: + + Boolean. + +Parameter: + + N/A + +Notes: + + This action is only concerned with incoming HTTP cookies. For outgoing HTTP + cookies, use crunch-outgoing-cookies. Use both to disable HTTP cookies + completely. + + It makes no sense at all to use this action in conjunction with the + session-cookies-only action, since it would prevent the session cookies + from being set. See also filter-content-cookies. + +Example usage: + + +crunch-incoming-cookies + + +------------------------------------------------------------------------------- + +8.5.9. crunch-server-header + +Typical use: + + Remove a server header Privoxy has no dedicated action for. + +Effect: + + Deletes every header sent by the server that contains the string the user + supplied as parameter. + +Type: + + Parameterized. + +Parameter: + + Any string. + +Notes: + + This action allows you to block server headers for which no dedicated + Privoxy action exists. Privoxy will remove every server header that + contains the string you supplied as parameter. + + Regular expressions are not supported and you can't use this action to + block different headers in the same request, unless they contain the same + string. + + crunch-server-header is only meant for quick tests. If you have to block + several different headers, or only want to modify parts of them, you should + use a custom server-header filter. 
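As a rough sketch of that alternative, the following assumes a filter file such as user.filter that is loaded through the filterfile option, and a Privoxy version that provides the server-header-filter action. The filter name strip-server-version is made up for this illustration. Unlike crunch-server-header, the filter keeps the header and only rewrites part of it:

```text
# user.filter (hypothetical filter name, for illustration only)
SERVER-HEADER-FILTER: strip-server-version Remove the version number from the "Server:" header
s@^(Server:\s*[^\s/]+)/\S+@$1@i

# user.action: enable the filter for a single site
{ +server-header-filter{strip-server-version} }
www.example.net/
```

With this in place, a header like "Server: Apache/2.2.3" would be passed on as "Server: Apache", while headers that don't match the pattern are left alone.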
+ + +-----------------------------------------------------------------+ + | Warning | + |-----------------------------------------------------------------| + |Don't block any header without understanding the consequences. | + +-----------------------------------------------------------------+ +Example usage (section): + + # Crunch server headers that try to prevent caching + { +crunch-server-header{no-cache} } + / + + +------------------------------------------------------------------------------- + +8.5.10. crunch-outgoing-cookies + +Typical use: + + Prevent the web server from reading any HTTP cookies from your system + +Effect: + + Deletes any "Cookie:" HTTP headers from client requests. + +Type: + + Boolean. + +Parameter: + + N/A + +Notes: + + This action is only concerned with outgoing HTTP cookies. For incoming HTTP + cookies, use crunch-incoming-cookies. Use both to disable HTTP cookies + completely. + + It makes no sense at all to use this action in conjunction with the + session-cookies-only action, since it would prevent the session cookies + from being read. + +Example usage: + + +crunch-outgoing-cookies + + +------------------------------------------------------------------------------- + +8.5.11. deanimate-gifs + +Typical use: + + Stop those annoying, distracting animated GIF images. + +Effect: + + De-animate GIF animations, i.e. reduce them to their first or last image. + +Type: + + Parameterized. + +Parameter: + + "last" or "first" + +Notes: + + This will also shrink the images considerably (in bytes, not pixels!). If + the option "first" is given, the first frame of the animation is used as + the replacement. If "last" is given, the last frame of the animation is + used instead, which probably makes more sense for most banner animations, + but also has the risk of not showing the entire last frame (if it is only a + delta to an earlier frame). 
+ + You can safely use this action with patterns that will also match non-GIF + objects, because no attempt will be made at anything that doesn't look like + a GIF. + +Example usage: + + +deanimate-gifs{last} + + +------------------------------------------------------------------------------- + +8.5.12. downgrade-http-version + +Typical use: + + Work around (very rare) problems with HTTP/1.1 + +Effect: + + Downgrades HTTP/1.1 client requests and server replies to HTTP/1.0. + +Type: + + Boolean. + +Parameter: + + N/A + +Notes: + + This is a left-over from the time when Privoxy didn't support important + HTTP/1.1 features well. It is left here for the unlikely case that you + experience HTTP/1.1 related problems with some server out there. Not all + HTTP/1.1 features and requirements are supported yet, so there is a chance + you might need this action. + +Example usage (section): + + {+downgrade-http-version} + problem-host.example.com + + +------------------------------------------------------------------------------- + +8.5.13. fast-redirects + +Typical use: + + Fool some click-tracking scripts and speed up indirect links. + +Effect: + + Detects redirection URLs and redirects the browser without contacting the + redirection server first. + +Type: + + Parameterized. + +Parameter: + + + "simple-check" to just search for the string "http://" to detect + redirection URLs. + + + "check-decoded-url" to decode URLs (if necessary) before searching for + redirection URLs. + +Notes: + + Many sites, like yahoo.com, don't just link to other sites. Instead, they + will link to some script on their own servers, giving the destination as a + parameter, which will then redirect you to the final target. URLs resulting + from this scheme typically look like: "http://www.example.org/ + click-tracker.cgi?target=http%3a//www.example.net/". + + Sometimes, there are even multiple consecutive redirects encoded in the + URL. 
These redirections via scripts make your web browsing more traceable, + since the server from which you follow such a link can see where you go to. + Apart from that, valuable bandwidth and time are wasted while your browser + asks the server for one redirect after the other. Plus, it feeds the + advertisers. + + This feature is currently not very smart and is scheduled for improvement. + If it is enabled by default, you will have to create some exceptions to + this action. It can lead to failures in several ways: + + Not every URL with other URLs as parameters is evil. Some sites offer a + real service that requires this information to work. For example, a + validation service needs to know which document to validate. + fast-redirects assumes that every URL parameter that looks like another URL + is a redirection target, and will always redirect to the last one. Most of + the time the assumption is correct, but if it isn't, the user gets + redirected anyway. + + Another failure occurs if the URL contains other parameters after the URL + parameter. The URL "http://www.example.org/?redirect=http%3a// + www.example.net/&foo=bar" contains the redirection URL "http:// + www.example.net/", followed by another parameter. fast-redirects doesn't + know that and will cause a redirect to "http://www.example.net/&foo=bar". + Depending on the target server configuration, the parameter will be + silently ignored or lead to a "page not found" error. You can prevent this + problem by first using the redirect action to remove the last part of the + URL, but it requires a little effort. + + To detect a redirection URL, fast-redirects only looks for the string + "http://", either in plain text (invalid but often used) or encoded as + "http%3a//". Some sites use their own URL encoding scheme, encrypt the + address of the target server or replace it with a database id.
In these + cases fast-redirects is fooled and the request reaches the redirection + server, where it probably gets logged. + +Example usage: + + { +fast-redirects{simple-check} } + one.example.com + + { +fast-redirects{check-decoded-url} } + another.example.com/testing + + +------------------------------------------------------------------------------- + +8.5.14. filter + +Typical use: + + Get rid of HTML and JavaScript annoyances, banner advertisements (by size), + do fun text replacements, add personalized effects, etc. + +Effect: + + All instances of text-based type, most notably HTML and JavaScript, to + which this action applies, can be filtered on-the-fly through the specified + regular expression based substitutions. (Note: as of version 3.0.3 plain + text documents are exempted from filtering, because web servers often use + the text/plain MIME type for all files whose type they don't know.) + +Type: + + Parameterized. + +Parameter: + + The name of a content filter, as defined in the filter file. Filters can be + defined in one or more files as specified by the filterfile option in the + config file. default.filter is the collection of filters supplied by the + developers. Locally defined filters should go in their own file, such as + user.filter. + + When used in its negative form, and without parameters, all filtering is + completely disabled. + +Notes: + + For your convenience, there are a number of pre-defined filters available + in the distribution filter file that you can use. See the examples below + for a list. + + Filtering requires buffering the page content, which may appear to slow + down page rendering since nothing is displayed until all content has passed + the filters. (It does not really take longer, but seems that way since the + page is not incrementally displayed.) This effect will be more noticeable + on slower connections. + + "Rolling your own" filters requires a knowledge of "Regular Expressions" + and "HTML".
This is a very powerful feature, and potentially very intrusive. + Filters should be used with caution, and only where an equivalent "action" is + not available. + + The amount of data that can be filtered is limited by the buffer-limit + option in the main config file. The default is 4096 KB (4 MB). Once this + limit is exceeded, the buffered data, and all pending data, is passed + through unfiltered. + + Inappropriate MIME types, such as zipped files, are not filtered at all. + (Again, only text-based types except plain text.) Encrypted SSL data (from + HTTPS servers) cannot be filtered either, since this would violate the + integrity of the secure transaction. In some situations it might be + necessary to protect certain text, like source code, from filtering by + defining appropriate -filter exceptions. + + Compressed content can't be filtered either, unless Privoxy is compiled + with zlib support (requires at least Privoxy 3.0.7), in which case Privoxy + will decompress the content before filtering it. + + If you use a Privoxy version without zlib support, but want filtering to + work on as many documents as possible, even those that would normally be + sent compressed, you must use the prevent-compression action in conjunction + with filter. + + Content filtering can achieve some of the same effects as the block action, + i.e. it can be used to block ads and banners. But the mechanism works quite + differently. One effective use is to block ad banners based on their size + (see below), since many of these seem to be somewhat standardized. + + Feedback with suggestions for new or improved filters is particularly + welcome! + + The list below has only the names and a one-line description of each + predefined filter. There are more verbose explanations of what these + filters do in the filter file chapter. + +Example usage (with filters from the distribution default.filter file).
See the + Predefined Filters section for more explanation on each: + + +filter{js-annoyances} # Get rid of particularly annoying JavaScript abuse + + + +filter{js-events} # Kill all JS event bindings (Radically destructive! Only for extra nasty sites) + + + +filter{html-annoyances} # Get rid of particularly annoying HTML abuse + + + +filter{content-cookies} # Kill cookies that come in the HTML or JS content + + + +filter{refresh-tags} # Kill automatic refresh tags (for dial-on-demand setups) + + + +filter{unsolicited-popups} # Disable only unsolicited pop-up windows. Useful if your browser lacks this ability. + + + +filter{all-popups} # Kill all popups in JavaScript and HTML. Useful if your browser lacks this ability. + + + +filter{img-reorder} # Reorder attributes in tags to make the banners-by-* filters more effective + + + +filter{banners-by-size} # Kill banners by size + + + +filter{banners-by-link} # Kill banners by their links to known clicktrackers + + + +filter{webbugs} # Squish WebBugs (1x1 invisible GIFs used for user tracking) + + + +filter{tiny-textforms} # Extend those tiny textareas up to 40x80 and kill the hard wrap + + + +filter{jumping-windows} # Prevent windows from resizing and moving themselves + + + +filter{frameset-borders} # Give frames a border and make them resizeable + + + +filter{demoronizer} # Fix MS's non-standard use of standard charsets + + + +filter{shockwave-flash} # Kill embedded Shockwave Flash objects + + + +filter{quicktime-kioskmode} # Make Quicktime movies savable + + + +filter{fun} # Text replacements for subversive browsing fun! 
+ + + +filter{crude-parental} # Crude parental filtering (demo only) + + + +filter{ie-exploits} # Disable known Internet Explorer bug exploits + + + +filter{site-specifics} # Custom filters for specific site related problems + + + +filter{google} # Removes text ads and makes other Google-specific improvements + + + +filter{yahoo} # Removes text ads and makes other Yahoo-specific improvements + + + +filter{msn} # Removes text ads and makes other MSN-specific improvements + + + +filter{blogspot} # Cleans up Blogspot blogs + + + +filter{no-ping} # Removes non-standard ping attributes from anchor and area tags + + +------------------------------------------------------------------------------- + +8.5.15. force-text-mode + +Typical use: + + Force Privoxy to treat a document as if it was in some kind of text format. + +Effect: + + Declares a document as text, even if the "Content-Type:" isn't detected as + such. + +Type: + + Boolean. + +Parameter: + + N/A + +Notes: + + As explained above, Privoxy tries to only filter files that are in some + kind of text format. The same restrictions apply to content-type-overwrite. + force-text-mode declares a document as text, without looking at the + "Content-Type:" first. + + +-----------------------------------------------------------------+ + | Warning | + |-----------------------------------------------------------------| + |Think twice before activating this action. Filtering binary data | + |with regular expressions can cause file damage. | + +-----------------------------------------------------------------+ +Example usage: + + +force-text-mode + + + +------------------------------------------------------------------------------- + +8.5.16. forward-override + +Typical use: + + Change the forwarding settings based on User-Agent or request origin + +Effect: + + Overrules the forward directives in the configuration file. + +Type: + + Multi-value. + +Parameter: + + + "forward ." to use a direct connection without any additional proxies.
+ + + "forward 127.0.0.1:8123" to use the HTTP proxy listening at 127.0.0.1 + port 8123. + + + "forward-socks4a 127.0.0.1:9050 ." to use the socks4a proxy listening + at 127.0.0.1 port 9050. Replace "forward-socks4a" with "forward-socks4" + to use a socks4 connection (with local DNS resolution) instead. + + + "forward-socks4a 127.0.0.1:9050 proxy.example.org:8000" to use the + socks4a proxy listening at 127.0.0.1 port 9050 to reach the HTTP proxy + listening at proxy.example.org port 8000. Replace "forward-socks4a" + with "forward-socks4" to use a socks4 connection (with local DNS + resolution) instead. + +Notes: + + This action takes parameters similar to the forward directives in the + configuration file, but without the URL pattern. It can be used as a + replacement, but normally it's only used in cases where matching based on + the request URL isn't sufficient. + + +-----------------------------------------------------------------+ + | Warning | + |-----------------------------------------------------------------| + |Please read the description for the forward directives before | + |using this action. Forwarding to the wrong people will reduce | + |your privacy and increase the chances of man-in-the-middle | + |attacks. | + | | + |If the ports are missing or invalid, default values will be used.| + |This might change in the future and you shouldn't rely on it. | + |Otherwise incorrect syntax causes Privoxy to exit. | + | | + |Use the show-url-info CGI page to verify that your forward | + |settings do what you thought they do. | + +-----------------------------------------------------------------+ +Example usage: + + # Always use direct connections for requests previously tagged as + # "User-Agent: fetch libfetch/2.0" and make sure + # resuming downloads continues to work. + # This way you can continue to use Tor for your normal browsing, + # without overloading the Tor network with your FreeBSD ports updates + # or downloads of bigger files like ISOs.
+ # Note that HTTP headers are easy to fake and therefore their + # values are as (un)trustworthy as your clients and users. + {+forward-override{forward .} \ + -hide-if-modified-since \ + -overwrite-last-modified \ + } + TAG:^User-Agent: fetch libfetch/2\.0$ + + + +------------------------------------------------------------------------------- + +8.5.17. handle-as-empty-document + +Typical use: + + Mark URLs that should be replaced by empty documents if they get blocked + +Effect: + + This action alone doesn't do anything noticeable. It just marks URLs. If + the block action also applies, the presence or absence of this mark decides + whether an HTML "BLOCKED" page, or an empty document will be sent to the + client as a substitute for the blocked content. The empty document isn't + literally empty, but actually contains a single space. + +Type: + + Boolean. + +Parameter: + + N/A + +Notes: + + Some browsers complain about syntax errors if JavaScript documents are + blocked with Privoxy's default HTML page; this option can be used to + silence them. And of course this action can also be used to eliminate the + Privoxy BLOCKED message in frames. + + The content type for the empty document can be specified with + content-type-overwrite{}, but usually this isn't necessary. + +Example usage: + + # Block all documents on example.org that end with ".js", + # but send an empty document instead of the usual HTML message. + {+block +handle-as-empty-document} + example.org/.*\.js$ + + + +------------------------------------------------------------------------------- + +8.5.18. handle-as-image + +Typical use: + + Mark URLs as belonging to images (so they'll be replaced by images if they + do get blocked, rather than HTML pages) + +Effect: + + This action alone doesn't do anything noticeable. It just marks URLs as + images. 
If the block action also applies, the presence or absence of this + mark decides whether an HTML "blocked" page, or a replacement image (as + determined by the set-image-blocker action) will be sent to the client as a + substitute for the blocked content. + +Type: + + Boolean. + +Parameter: + + N/A + +Notes: + + The below generic example section is actually part of default.action. It + marks all URLs with well-known image file name extensions as images and + should be left intact. + + Users will probably only want to use the handle-as-image action in + conjunction with block, to block sources of banners, whose URLs don't + reflect the file type, like in the second example section. + + Note that you cannot treat HTML pages as images in most cases. For + instance, (in-line) ad frames require an HTML page to be sent, or they + won't display properly. Forcing handle-as-image in this situation will not + replace the ad frame with an image, but lead to error messages. + +Example usage (sections): + + # Generic image extensions: + # + {+handle-as-image} + /.*\.(gif|jpg|jpeg|png|bmp|ico)$ + + # These don't look like images, but they're banners and should be + # blocked as images: + # + {+block +handle-as-image} + some.nasty-banner-server.com/junk.cgi\?output=trash + + # Banner source! Who cares if they also have non-image content? + ad.doubleclick.net + + +------------------------------------------------------------------------------- + +8.5.19. hide-accept-language + +Typical use: + + Pretend to use different language settings. + +Effect: + + Deletes or replaces the "Accept-Language:" HTTP header in client requests. + +Type: + + Parameterized. + +Parameter: + + Keyword: "block", or any user defined value. + +Notes: + + Faking the browser's language settings can be useful to make a foreign + User-Agent set with hide-user-agent more believable. + + However some sites with content in different languages check the + "Accept-Language:" to decide which one to take by default. 
Sometimes it + isn't possible to later switch to another language without changing the + "Accept-Language:" header first. + + Therefore it's a good idea to either only change the "Accept-Language:" + header to languages you understand, or to languages that aren't + widespread. + + Before setting the "Accept-Language:" header to a rare language, you should + consider that it helps to make your requests unique and thus easier to + trace. If you don't plan to change this header frequently, you should stick + to a common language. + +Example usage (section): + + # Pretend to use Canadian language settings. + {+hide-accept-language{en-ca} \ + +hide-user-agent{Mozilla/5.0 (X11; U; OpenBSD i386; en-CA; rv:1.8.0.4) Gecko/20060628 Firefox/1.5.0.4} \ + } + / + + +------------------------------------------------------------------------------- + +8.5.20. hide-content-disposition + +Typical use: + + Prevent download menus for content you prefer to view inside the browser. + +Effect: + + Deletes or replaces the "Content-Disposition:" HTTP header set by some + servers. + +Type: + + Parameterized. + +Parameter: + + Keyword: "block", or any user defined value. + +Notes: + + Some servers set the "Content-Disposition:" HTTP header for documents they + assume you want to save locally before viewing them. The + "Content-Disposition:" header contains the file name the browser is + supposed to use by default. + + In most browsers that understand this header, it makes it impossible to + just view the document without downloading it first, even if it's just a + simple text file or an image. + + Removing the "Content-Disposition:" header helps to prevent this annoyance, + but some browsers additionally check the "Content-Type:" header before + they decide if they can display a document without saving it first. In + these cases, you have to change this header as well, before the browser + stops displaying download menus.
+ + It is also possible to change the server's file name suggestion to another + one, but in most cases it isn't worth the time to set it up. + + This action will probably be removed in the future; use server-header + filters instead. + +Example usage: + + # Disarm the download link in Sourceforge's patch tracker + { -filter \ + +content-type-overwrite{text/plain}\ + +hide-content-disposition{block} } + .sourceforge.net/tracker/download\.php + + +------------------------------------------------------------------------------- + +8.5.21. hide-if-modified-since + +Typical use: + + Prevent yet another way to track the user's steps between sessions. + +Effect: + + Deletes the "If-Modified-Since:" HTTP client header or modifies its value. + +Type: + + Parameterized. + +Parameter: + + Keyword: "block", or a user defined value that specifies a range of minutes. + +Notes: + + Removing this header is useful for filter testing, where you want to force + a real reload instead of getting status code "304", which would cause the + browser to use a cached copy of the page. + + Instead of removing the header, hide-if-modified-since can also add or + subtract a random amount of time to/from the header's value. You specify a + range of minutes from which the random offset is chosen and Privoxy + does the rest. A negative value means subtracting, a positive value adding. + + Randomizing the value of the "If-Modified-Since:" header makes it less likely that + the server can use the time as a cookie replacement, but you will run into + caching problems if the random range is too high. + + It is a good idea to only use a small negative value and let + overwrite-last-modified handle the greater changes. + + It is also recommended to use this action together with + crunch-if-none-match; otherwise it's more or less pointless. + +Example usage (section): + + # Let the browser revalidate but make tracking based on the time less likely.
+ {+hide-if-modified-since{-60} \ + +overwrite-last-modified{randomize} \ + +crunch-if-none-match} + / + + +------------------------------------------------------------------------------- + +8.5.22. hide-forwarded-for-headers + +Typical use: + + Improve privacy by not forwarding the source of the request in the HTTP + headers. + +Effect: + + Deletes any existing "X-Forwarded-for:" HTTP header from client requests. + +Type: + + Boolean. + +Parameter: + + N/A + +Notes: + + It is safe and recommended to leave this on. + +Example usage: + + +hide-forwarded-for-headers + + +------------------------------------------------------------------------------- + +8.5.23. hide-from-header + +Typical use: + + Keep your (old and ill) browser from telling web servers your email address + +Effect: + + Deletes any existing "From:" HTTP header, or replaces it with the specified + string. + +Type: + + Parameterized. + +Parameter: + + Keyword: "block", or any user defined value. + +Notes: + + The keyword "block" will completely remove the header (not to be confused + with the block action). + + Alternately, you can specify any value you prefer to be sent to the web + server. If you do, it is a matter of fairness not to use any address that + is actually used by a real person. + + This action is rarely needed, as modern web browsers don't send "From:" + headers anymore. + +Example usage: + + +hide-from-header{block} + + + or + + +hide-from-header{spam-me-senseless@sittingduck.example.com} + + +------------------------------------------------------------------------------- + +8.5.24. hide-referrer + +Typical use: + + Conceal which link you followed to get to a particular site + +Effect: + + Deletes the "Referer:" (sic) HTTP header from the client request, or + replaces it with a forged one. + +Type: + + Parameterized. + +Parameter: + + + "conditional-block" to delete the header completely if the host has + changed. + + + "conditional-forge" to forge the header if the host has changed. 
+ + + "block" to delete the header unconditionally. + + + "forge" to pretend to be coming from the homepage of the server we are + talking to. + + + Any other string to set a user defined referrer. + +Notes: + + conditional-block is the only parameter that isn't easily detected in the + server's log file. If it blocks the referrer, the request will look like + the visitor used a bookmark or typed in the address directly. + + Leaving the referrer unmodified for requests on the same host allows the + server owner to see the visitor's "click path", but in most cases she could + also get that information by comparing other parts of the log file: for + example the User-Agent if it isn't a very common one, or the user's IP + address if it doesn't change between different requests. + + Always blocking the referrer, or using a custom one, can lead to failures + on servers that check the referrer before they answer any requests, in an + attempt to prevent their content from being embedded or linked to + elsewhere. + + Both conditional-block and forge will work with referrer checks, as long as + the content and the valid referring page are on the same host. Most of the time + that's the case. + + hide-referer is an alternate spelling of hide-referrer and the two + can be freely substituted with each other. ("referrer" is the correct + English spelling; however, the HTTP specification has a bug - it requires it + to be spelled as "referer".) + +Example usage: + + +hide-referrer{forge} + + + or + + +hide-referrer{http://www.yahoo.com/} + + +------------------------------------------------------------------------------- + +8.5.25. hide-user-agent + +Typical use: + + Try to conceal your type of browser and client operating system + +Effect: + + Replaces the value of the "User-Agent:" HTTP header in client requests with + the specified value. + +Type: + + Parameterized. + +Parameter: + + Any user-defined string.
+ +Notes: + + +-----------------------------------------------------------------+ + | Warning | + |-----------------------------------------------------------------| + |This can lead to problems on web sites that depend on looking at | + |this header in order to customize their content for different | + |browsers (which, by the way, is NOT the right thing to do: good | + |web sites work browser-independently). | + +-----------------------------------------------------------------+ + + Using this action in multi-user setups or wherever different types of + browsers will access the same Privoxy is not recommended. In single-user, + single-browser setups, you might use it to delete your OS version + information from the headers, because it is an invitation to exploit known + bugs for your OS. It is also occasionally useful to forge this in order to + access sites that won't let you in otherwise (though there may be a good + reason in some cases). Example of this: some MSN sites will not let Mozilla + enter, yet forging to a Netscape 6.1 user-agent works just fine. (Must be + just a silly MS goof, I'm sure :-). + + More information on known user-agent strings can be found at http:// + www.user-agents.org/ and http://en.wikipedia.org/wiki/User_agent. + +Example usage: + + +hide-user-agent{Netscape 6.1 (X11; I; Linux 2.4.18 i686)} + + +------------------------------------------------------------------------------- + +8.5.26. inspect-jpegs + +Typical use: + + Try to protect against a MS buffer over-run in JPEG processing + +Effect: + + Protect against a known exploit + +Type: + + Boolean. + +Parameter: + + N/A + +Notes: + + See Microsoft Security Bulletin MS04-028. JPEG images are one of the most + common image types found across the Internet. 
The exploit as described can + allow execution of code on the target system, giving an attacker access to + the system in question by merely planting an altered JPEG image, which + would have no obvious indications of what lurks inside. This action tries + to prevent this exploit if delivered through unencrypted HTTP. + + Note that the exploit mentioned is several years old and it's unlikely that + your client is still vulnerable against it. This action may be removed in + one of the next releases. + +Example usage: + + +inspect-jpegs + + +------------------------------------------------------------------------------- + +8.5.27. kill-popups + +Typical use: + + Eliminate those annoying pop-up windows (deprecated) + +Effect: + + While loading the document, replace JavaScript code that opens pop-up + windows with (syntactically neutral) dummy code on the fly. + +Type: + + Boolean. + +Parameter: + + N/A + +Notes: + + This action is basically a built-in, hardwired special-purpose filter + action, but there are important differences: For kill-popups, the document + need not be buffered, so it can be incrementally rendered while + downloading. But kill-popups doesn't catch as many pop-ups as filter + {all-popups} does and is not as smart as filter{unsolicited-popups} is. + + Think of it as a fast and efficient replacement for a filter that you can + use if you don't want any filtering at all. Note that it doesn't make sense + to combine it with any filter action, since as soon as one filter applies, + the whole document needs to be buffered anyway, which destroys the + advantage of the kill-popups action over its filter equivalent. + + Killing all pop-ups unconditionally is problematic. Many shops and banks + rely on pop-ups to display forms, shopping carts etc, and the filter + {unsolicited-popups} does a better job of catching only the unwanted ones. 
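A minimal sketch of that approach, written as actions file sections: apply the unsolicited-popups filter everywhere and switch it off again for sites that depend on pop-ups. The shop and bank host names are placeholders for this illustration only:

```text
# Catch unsolicited pop-ups on all sites ...
{ +filter{unsolicited-popups} }
/

# ... but leave pop-ups alone where forms or shopping carts need them.
{ -filter{unsolicited-popups} }
.shop.example.com
.bank.example.org
```

Since the exception names the filter explicitly, any other filters enabled for those hosts remain active.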
+ + If the only kind of pop-ups that you want to kill are exit consoles (those + really nasty windows that appear when you close another one), you might + want to use filter{js-annoyances} instead. + + This action is most appropriate for browsers that don't have any controls + for unwanted pop-ups. Not recommended for general usage. + + This action doesn't work very reliably and may be removed in future + releases. + +Example usage: + + +kill-popups + + +------------------------------------------------------------------------------- + +8.5.28. limit-connect + +Typical use: + + Prevent abuse of Privoxy as a TCP proxy relay or disable SSL for untrusted + sites + +Effect: + + Specifies the ports to which HTTP CONNECT requests are allowed. + +Type: + + Parameterized. + +Parameter: + + A comma-separated list of ports or port ranges (the latter using dashes, + with the minimum defaulting to 0 and the maximum to 65K). + +Notes: + + By default, i.e. if no limit-connect action applies, Privoxy only allows + HTTP CONNECT requests to port 443 (the standard, secure HTTPS port). Use + limit-connect if more fine-grained control is desired for some or all + destinations. + + The CONNECT method exists in HTTP to allow access to secure websites + ("https://" URLs) through proxies. It works very simply: the proxy connects + to the server on the specified port, and then short-circuits its + connections to the client and to the remote server. This means + CONNECT-enabled proxies can be used as TCP relays very easily. + + Privoxy relays HTTPS traffic without seeing the decoded content. Websites + can leverage this limitation to circumvent Privoxy's filters. By specifying + an invalid port range you can disable HTTPS entirely. If you plan to + disable SSL by default, consider enabling + treat-forbidden-connects-like-blocks as well, to be able to quickly create + exceptions. + +Example usages: + + +limit-connect{443} # This is the default and need not be specified.
+ +limit-connect{80,443} # Ports 80 and 443 are OK. + +limit-connect{-3, 7, 20-100, 500-} # Ports less than 3, 7, 20 to 100 and above 500 are OK. + +limit-connect{-} # All ports are OK + +limit-connect{,} # No HTTPS/SSL traffic is allowed + + +------------------------------------------------------------------------------- + +8.5.29. prevent-compression + +Typical use: + + Ensure that servers send the content uncompressed, so it can be passed + through filters. + +Effect: + + Removes the Accept-Encoding header, which can be used to ask for compressed + transfer. + +Type: + + Boolean. + +Parameter: + + N/A + +Notes: + + More and more websites send their content compressed by default, which is + generally a good idea and saves bandwidth. But the filter, deanimate-gifs + and kill-popups actions need access to the uncompressed data. + + When compiled with zlib support (available since Privoxy 3.0.7), content + that should be filtered is decompressed on-the-fly and you don't have to + worry about this action. If you are using an older Privoxy version, or one + that hasn't been compiled with zlib support, this action can be used to + convince the server to send the content uncompressed. + + Most text-based instances compress very well: the size is seldom decreased + by less than 50%, and for markup-heavy instances like news feeds, saving more + than 90% of the original size isn't unusual. + + Not using compression will therefore slow down the transfer, and you should + only enable this action if you really need it. As of Privoxy 3.0.7 it's + disabled in all predefined action settings. + + Note that some (rare) ill-configured sites don't handle requests for + uncompressed documents correctly. Broken PHP applications tend to send an + empty document body, and some IIS versions only send the beginning of the + content. If you enable prevent-compression by default, you might want to + add exceptions for those sites. See the example for how to do that.
+ +Example usage (sections): + + # Selectively turn off compression, and enable a filter + # + { +filter{tiny-textforms} +prevent-compression } + # Match only these sites + .google. + sourceforge.net + sf.net + + # Or instead, we could set a universal default: + # + { +prevent-compression } + / # Match all sites + + # Then maybe make exceptions for broken sites: + # + { -prevent-compression } + .compusa.com/ + + +------------------------------------------------------------------------------- + +8.5.30. overwrite-last-modified + +Typical use: + + Prevent yet another way to track the user's steps between sessions. + +Effect: + + Deletes the "Last-Modified:" HTTP server header or modifies its value. + +Type: + + Parameterized. + +Parameter: + + One of the keywords: "block", "reset-to-request-time" and "randomize" + +Notes: + + Removing the "Last-Modified:" header is useful for filter testing, where + you want to force a real reload instead of getting status code "304", which + would cause the browser to reuse the old version of the page. + + The "randomize" option overwrites the value of the "Last-Modified:" header + with a randomly chosen time between the original value and the current + time. In theory the server could send each document with a different + "Last-Modified:" header to track visits without using cookies. "Randomize" + makes this impossible, and the browser can still revalidate cached documents. + + "reset-to-request-time" overwrites the value of the "Last-Modified:" header + with the current time. You could use this option together with + hide-if-modified-since to further customize your random range. + + The preferred parameter here is "randomize". It is safe to use, as long as + the time settings are more or less correct. If the server sets the + "Last-Modified:" header to the time of the request, the random range + becomes zero and the value stays the same.
Therefore you should later + randomize it a second time with hide-if-modified-since, just to be sure. + + It is also recommended to use this action together with + crunch-if-none-match. + +Example usage: + + # Let the browser revalidate without being tracked across sessions + { +hide-if-modified-since{-60} \ + +overwrite-last-modified{randomize} \ + +crunch-if-none-match} + / + + +------------------------------------------------------------------------------- + +8.5.31. redirect + +Typical use: + + Redirect requests to other sites. + +Effect: + + Convinces the browser that the requested document has been moved to another + location and the browser should get it from there. + +Type: + + Parameterized. + +Parameter: + + An absolute URL or a single pcrs command. + +Notes: + + Requests to which this action applies are answered with an HTTP redirect to + URLs of your choosing. The new URL is either provided as parameter, or + derived by applying a single pcrs command to the original URL. + + This action will be ignored if you use it together with block. It can be + combined with fast-redirects{check-decoded-url} to redirect to a decoded + version of a rewritten URL. + + Use this action carefully: make sure not to create redirection loops, and be + aware that using your own redirects might make it possible to fingerprint + your requests.
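
   How a redirect target is derived from a pcrs command can be pictured
   with a short Python sketch. It is only an illustration of the idea, not
   Privoxy's implementation: the function name apply_pcrs_substitution is
   made up for this example, only the "g" and "i" flags are handled, escaped
   delimiters are not supported, and it is an assumption here that the
   command is applied to the complete URL including the scheme.

```python
import re


def apply_pcrs_substitution(command: str, url: str) -> str:
    """Apply a Perl-style s<delim>pattern<delim>replacement<delim>flags command.

    Simplified sketch: no escaped delimiters, only the 'g' and 'i' flags.
    """
    if len(command) < 4 or command[0] != "s":
        raise ValueError("only substitution commands (s...) are sketched here")
    delimiter = command[1]
    parts = command[2:].split(delimiter)
    if len(parts) != 3:
        raise ValueError("expected pattern, replacement and (possibly empty) flags")
    pattern, replacement, flags = parts

    # Python uses \g<1> where Perl-style commands use $1, so translate them.
    escaped = replacement.replace("\\", "\\\\")
    py_replacement = re.sub(r"\$(\d+)", r"\\g<\1>", escaped)

    count = 0 if "g" in flags else 1          # without 'g', replace once
    re_flags = re.IGNORECASE if "i" in flags else 0
    return re.sub(pattern, py_replacement, url, count=count, flags=re_flags)


# Hypothetical request URL; the command appends a query parameter at the end.
original = "http://undeadly.org/cgi?action=article&sid=1"
target = apply_pcrs_substitution("s@$@&mode=expanded@", original)
print(target)
```

   Because the rewritten URL is itself a new request, the URL pattern of the
   section has to be written so that it no longer matches the rewritten
   form. Otherwise the browser would be redirected again and again.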
+ +Example usages: + + # Replace example.com's style sheet with another one + { +redirect{http://localhost/css-replacements/example.com.css} } + example.com/stylesheet\.css + + # Create a short, easy to remember nickname for a favorite site + # (relies on the browser accepting and forwarding invalid URLs to Privoxy) + { +redirect{http://www.privoxy.org/user-manual/actions-file.html} } + a + + # Always use the expanded view for Undeadly.org articles + # (Note the $ at the end of the URL pattern to make sure + # the request for the rewritten URL isn't redirected as well) + {+redirect{s@$@&mode=expanded@}} + undeadly.org/cgi\?action=article&sid=\d*$ + + +------------------------------------------------------------------------------- + +8.5.32. send-vanilla-wafer + +Typical use: + + Feed log analysis scripts with useless data. + +Effect: + + Sends a cookie with each request stating that you do not accept any + copyright on cookies sent to you, and asking the site operator not to track + you. + +Type: + + Boolean. + +Parameter: + + N/A + +Notes: + + The vanilla wafer is a (relatively) unique header and could conceivably be + used to track you. + + This action is rarely used and not enabled in the default configuration. + +Example usage: + + +send-vanilla-wafer + + +------------------------------------------------------------------------------- + +8.5.33. send-wafer + +Typical use: + + Send custom cookies or feed log analysis scripts with even more useless + data. + +Effect: + + Sends a custom, user-defined cookie with each request. + +Type: + + Multi-value. + +Parameter: + + A string of the form "name=value". + +Notes: + + Being multi-valued, multiple instances of this action can apply to the same + request, resulting in multiple cookies being sent. + + This action is rarely used and not enabled in the default configuration.
+ +Example usage (section): + + {+send-wafer{UsingPrivoxy=true}} + my-internal-testing-server.void + + +------------------------------------------------------------------------------- + +8.5.34. server-header-filter + +Typical use: + + Rewrite or remove single server headers. + +Effect: + + All server headers to which this action applies are filtered on-the-fly + through the specified regular expression based substitutions. + +Type: + + Parameterized. + +Parameter: + + The name of a server-header filter, as defined in one of the filter files. + +Notes: + + Server-header filters are applied to each header on its own, not to all at + once. This makes it easier to diagnose problems, but on the downside you + can't write filters that only change header x if header y's value is z. You + can do that by using tags though. + + Server-header filters are executed after the other header actions have + finished and use their output as input. + + Please refer to the filter file chapter to learn which server-header + filters are available by default, and how to create your own. + +Example usage (section): + + {+server-header-filter{html-to-xml}} + example.org/xml-instance-that-is-delivered-as-html + + {+server-header-filter{xml-to-html}} + example.org/instance-that-is-delivered-as-xml-but-is-not + + + +------------------------------------------------------------------------------- + +8.5.35. server-header-tagger + +Typical use: + + Enable or disable filters based on the Content-Type header. + +Effect: + + Server headers to which this action applies are filtered on-the-fly through + the specified regular expression based substitutions, the result is used as + tag. + +Type: + + Parameterized. + +Parameter: + + The name of a server-header tagger, as defined in one of the filter files. + +Notes: + + Server-header taggers are applied to each header on its own, and as the + header isn't modified, each tagger "sees" the original. 
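
   The per-header behaviour of taggers described above can be sketched in
   Python. This is a loose illustration, not Privoxy's code: collect_tags
   and the tagger shown are hypothetical, and the sketch assumes that a tag
   is only produced when the tagger's pattern actually matches a header.

```python
import re


def collect_tags(headers, taggers):
    """Run every tagger over every header and return the resulting tags.

    headers: list of raw header lines.
    taggers: list of (pattern, replacement) pairs.
    The header lines are never modified, so each tagger sees the original.
    """
    tags = []
    for header in headers:
        for pattern, replacement in taggers:
            # Assumption: no match means no tag for this header.
            if re.search(pattern, header, flags=re.IGNORECASE):
                tags.append(re.sub(pattern, replacement, header,
                                   flags=re.IGNORECASE))
    return tags


headers = [
    "Content-Type: text/html; charset=utf-8",
    "Server: ExampleHTTPd",
]
# Hypothetical tagger: reduce the Content-Type header to its media type.
taggers = [(r"^Content-Type:\s*([^;\s]+).*$", r"content-type: \1")]

tags = collect_tags(headers, taggers)
print(tags)
```

   The list of headers is left untouched. Only the separate list of tags
   grows, which is what allows later sections to key off the tags.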
+ + Server-header taggers are executed before all other header actions that + modify server headers. Their tags can be used to control all of the other + server-header actions, the content filters and the crunch actions (redirect + and block). + + Obviously crunching based on tags created by server-header taggers doesn't + prevent the request from showing up in the server's log file. + +Example usage (section): + + # Tag every request with the content type declared by the server + {+server-header-tagger{content-type}} + / + + + +------------------------------------------------------------------------------- + +8.5.36. session-cookies-only + +Typical use: + + Allow only temporary "session" cookies (for the current browser session + only). + +Effect: + + Deletes the "expires" field from "Set-Cookie:" server headers. Most + browsers will not store such cookies permanently and forget them in between + sessions. + +Type: + + Boolean. + +Parameter: + + N/A + +Notes: + + This is less strict than crunch-incoming-cookies / crunch-outgoing-cookies + and allows you to browse websites that insist or rely on setting cookies, + without compromising your privacy too badly. + + Most browsers will not permanently store cookies that have been processed + by session-cookies-only and will forget about them between sessions. This + makes profiling cookies useless, but won't break sites which require + cookies so that you can log in for transactions. This is generally turned + on for all sites, and is the recommended setting. + + It makes no sense at all to use session-cookies-only together with + crunch-incoming-cookies or crunch-outgoing-cookies. If you do, cookies will + be plainly killed. + + Note that it is up to the browser how it handles such cookies without an + "expires" field. If you use an exotic browser, you might want to try it out + to be sure. + + This setting also has no effect on cookies that may have been stored + previously by the browser before starting Privoxy. 
These would have to be + removed manually. + + Privoxy also uses the content-cookies filter to block some types of + cookies. Content cookies are not affected by session-cookies-only. + +Example usage: + + +session-cookies-only + + +------------------------------------------------------------------------------- + +8.5.37. set-image-blocker + +Typical use: + + Choose the replacement for blocked images + +Effect: + + This action alone doesn't do anything noticeable. If both block and + handle-as-image also apply, i.e. if the request is to be blocked as an + image, then the parameter of this action decides what will be sent as a + replacement. + +Type: + + Parameterized. + +Parameter: + + + "pattern" to send a built-in checkerboard pattern image. The image is + visually decent, scales very well, and makes it obvious where banners + were busted. + + + "blank" to send a built-in transparent image. This makes banners + disappear completely, but makes it hard to detect where Privoxy has + blocked images on a given page and complicates troubleshooting if + Privoxy has blocked innocent images, like navigation icons. + + + "target-url" to send a redirect to target-url. You can redirect to any + image anywhere, even in your local filesystem via a "file:///" URL. (But + note that not all browsers support redirecting to a local file system). + + A good application of redirects is to use special Privoxy-built-in + URLs, which send the built-in images, as target-url. This has the same + visual effect as specifying "blank" or "pattern" in the first place, + but enables your browser to cache the replacement image, instead of + requesting it over and over again. + +Notes: + + The URLs for the built-in images are "http://config.privoxy.org/ + send-banner?type=type", where type is either "blank" or "pattern". + + There is a third (advanced) type, called "auto". It is NOT to be used in + set-image-blocker, but meant for use from filters.
Auto will select the + type of image that would have applied to the referring page, had it been an + image. + +Example usage: + + Built-in pattern: + + +set-image-blocker{pattern} + + + Redirect to the BSD daemon: + + +set-image-blocker{http://www.freebsd.org/gifs/dae_up3.gif} + + + Redirect to the built-in pattern for better caching: + + +set-image-blocker{http://config.privoxy.org/send-banner?type=pattern} + + +------------------------------------------------------------------------------- + +8.5.38. treat-forbidden-connects-like-blocks + +Typical use: + + Block forbidden connects with an easy to find error message. + +Effect: + + If this action is enabled, Privoxy no longer distinguishes between + forbidden connects and ordinary blocks. + +Type: + + Boolean. + +Parameter: + + N/A + +Notes: + + By default Privoxy answers forbidden "Connect" requests with a short error + message inside the headers. If the browser doesn't display headers (most + don't), you just see an empty page. + + With this action enabled, Privoxy displays the message that is used for + ordinary blocks instead. If you decide to make an exception for the page in + question, you can do so by following the "See why" link. + + For "Connect" requests the clients tell Privoxy which host they are + interested in, but not which document they plan to get later. As a result, + the "Go there anyway" link wouldn't work and is therefore suppressed. + +Example usage: + + +treat-forbidden-connects-like-blocks + + +------------------------------------------------------------------------------- + +8.5.39. Summary + +Note that many of these actions have the potential to cause a page to +misbehave, possibly even not to display at all. Site designers can build their +sites in many different ways, and may depend on HTTP header content and other +criteria. There is no way to have hard and fast rules for all +sites. See the Appendix for a brief example on troubleshooting actions.
+ +------------------------------------------------------------------------------- + +8.6. Aliases + +Custom "actions", known to Privoxy as "aliases", can be defined by combining +other actions. These can in turn be invoked just like the built-in actions. +Currently, an alias name can contain any character except space, tab, "=", "{" +and "}", but we strongly recommend that you only use "a" to "z", "0" to "9", +"+", and "-". Alias names are not case sensitive, and are not required to start +with a "+" or "-" sign, since they are merely textually expanded. + +Aliases can be used throughout the actions file, but they must be defined in a +special section at the top of the file! And there can only be one such section +per actions file. Each actions file may have its own alias section, and the +aliases defined in it are only visible within that file. + +There are two main reasons to use aliases: One is to save typing for frequently +used combinations of actions, the other one is a gain in flexibility: If you +decide once how you want to handle shops by defining an alias called "shop", +you can later change your policy on shops in one place, and your changes will +take effect everywhere in the actions file where the "shop" alias is used. +Calling aliases by their purpose also makes your actions files more readable. + +Currently, there is one big drawback to using aliases, though: Privoxy's +built-in web-based action file editor honors aliases when reading the actions +files, but it expands them before writing. So the effects of your aliases are +of course preserved, but the aliases themselves are lost when you edit sections +that use aliases with it. + +Now let's define some aliases... + + # Useful custom aliases we can use later. + # + # Note the (required!) section header line and that this section + # must be at the top of the actions file! + # + {{alias}} + + # These aliases just save typing later: + # (Note that some already use other aliases!) 
+ # + +crunch-all-cookies = +crunch-incoming-cookies +crunch-outgoing-cookies + -crunch-all-cookies = -crunch-incoming-cookies -crunch-outgoing-cookies + +block-as-image = +block +handle-as-image + allow-all-cookies = -crunch-all-cookies -session-cookies-only -filter{content-cookies} + + # These aliases define combinations of actions + # that are useful for certain types of sites: + # + fragile = -block -filter -crunch-all-cookies -fast-redirects -hide-referrer -kill-popups -prevent-compression + + shop = -crunch-all-cookies -filter{all-popups} -kill-popups + + # Short names for other aliases, for really lazy people ;-) + # + c0 = +crunch-all-cookies + c1 = -crunch-all-cookies + + +...and put them to use. These sections would appear in the lower part of an +actions file and define exceptions to the default actions (as specified further +up for the "/" pattern): + + # These sites are either very complex or very keen on + # user data and require minimal interference to work: + # + {fragile} + .office.microsoft.com + .windowsupdate.microsoft.com + # Gmail is really mail.google.com, not gmail.com + mail.google.com + + # Shopping sites: + # Allow cookies (for setting and retrieving your customer data) + # + {shop} + .quietpc.com + .worldpay.com # for quietpc.com + mybank.example.com + + # These shops require pop-ups: + # + {-kill-popups -filter{all-popups} -filter{unsolicited-popups}} + .dabs.com + .overclockers.co.uk + + +Aliases like "shop" and "fragile" are typically used for "problem" sites that +require more than one action to be disabled in order to function properly. + +------------------------------------------------------------------------------- + +8.7. Actions Files Tutorial + +The above chapters have shown which actions files there are and how they are +organized, how actions are specified and applied to URLs, how patterns work, +and how to define and use aliases. 
Now, let's look at an example default.action +and user.action file and see how all these pieces come together: + +------------------------------------------------------------------------------- + +8.7.1. default.action + +Every config file should start with a short comment stating its purpose: + +# Sample default.action file + + +Then, since this is the default.action file, the first section is a special +section for internal use that you needn't change or worry about: + +########################################################################## +# Settings -- Don't change! For internal Privoxy use ONLY. +########################################################################## + +{{settings}} +for-privoxy-version=3.0 + + +After that comes the (optional) alias section. We'll use the example section +from the above chapter on aliases, that also explains why and how aliases are +used: + +########################################################################## +# Aliases +########################################################################## +{{alias}} + + # These aliases just save typing later: + # (Note that some already use other aliases!) + # + +crunch-all-cookies = +crunch-incoming-cookies +crunch-outgoing-cookies + -crunch-all-cookies = -crunch-incoming-cookies -crunch-outgoing-cookies + +block-as-image = +block +handle-as-image + mercy-for-cookies = -crunch-all-cookies -session-cookies-only -filter{content-cookies} + + # These aliases define combinations of actions + # that are useful for certain types of sites: + # + fragile = -block -filter -crunch-all-cookies -fast-redirects -hide-referrer -kill-popups + shop = -crunch-all-cookies -filter{all-popups} -kill-popups + + +Now come the regular sections, i.e. sets of actions, accompanied by URL +patterns to which they apply. Remember all actions are disabled when matching +starts, so we have to explicitly enable the ones we want. + +The first regular section is probably the most important. 
It has only one +pattern, "/", but this pattern matches all URLs. Therefore, the set of actions +used in this "default" section will be applied to all requests as a start. It +can be partly or wholly overridden by later matches further down this file, or +in user.action, but it will still be largely responsible for your overall +browsing experience. + +Again, at the start of matching, all actions are disabled, so there is no need +to disable any actions here. (Remember: a "+" preceding the action name enables +the action, a "-" disables!). Also note how this long line has been made more +readable by splitting it into multiple lines with line continuation. + +########################################################################## +# "Defaults" section: +########################################################################## + { \ + +deanimate-gifs \ + +filter{html-annoyances} \ + +filter{refresh-tags} \ + +filter{webbugs} \ + +filter{ie-exploits} \ + +hide-forwarded-for-headers \ + +hide-from-header{block} \ + +hide-referrer{forge} \ + +prevent-compression \ + +session-cookies-only \ + +set-image-blocker{pattern} \ + } + / # forward slash will match *all* potential URL patterns. + + +The default behavior is now set. + +The first of our specialized sections is concerned with "fragile" sites, i.e. +sites that require minimum interference, because they are either very complex +or very keen on tracking you (and have mechanisms in place that make them +unusable for people who avoid being tracked). We will simply use our +pre-defined fragile alias instead of stating the list of actions explicitly: + +########################################################################## +# Exceptions for sites that'll break under the default action set: +########################################################################## + +# "Fragile" Use a minimum set of actions for these sites (see alias above): +# +{ fragile } +.office.microsoft.com # surprise, surprise! 
+.windowsupdate.microsoft.com +mail.google.com + + +Shopping sites are not as fragile, but they typically require cookies to log +in, and pop-up windows for shopping carts or item details. Again, we'll use a +pre-defined alias: + +# Shopping sites: +# +{ shop } +.quietpc.com +.worldpay.com # for quietpc.com +.jungle.com +.scan.co.uk + + +The fast-redirects action, which we enabled per default above, breaks some +sites. So disable it for popular sites where we know it misbehaves: + +{ -fast-redirects } +login.yahoo.com +edit.*.yahoo.com +.google.com +.altavista.com/.*(like|url|link):http +.altavista.com/trans.*urltext=http +.nytimes.com + + +It is important that Privoxy knows which URLs belong to images, so that if they +are to be blocked, a substitute image can be sent, rather than an HTML page. +Contacting the remote site to find out is not an option, since it would destroy +the loading time advantage of banner blocking, and it would feed the +advertisers (in terms of money and information). We can mark any URL as an +image with the handle-as-image action, and marking all URLs that end in a known +image file extension is a good start: + +########################################################################## +# Images: +########################################################################## + +# Define which file types will be treated as images, in case they get +# blocked further down this file: +# +{ +handle-as-image } +/.*\.(gif|jpe?g|png|bmp|ico)$ + + +And then there are known banner sources. They often use scripts to generate the +banners, so it won't be visible from the URL that the request is for an image. +Hence we block them and mark them as images in one go, with the help of our ++block-as-image alias defined above. (We could of course just as well use + +block +handle-as-image here.) Remember that the type of the replacement image +is chosen by the set-image-blocker action. 
Since all URLs have matched the +default section with its +set-image-blocker{pattern} action before, it still +applies and needn't be repeated: + +# Known ad generators: +# +{ +block-as-image } +ar.atwola.com +.ad.doubleclick.net +.ad.*.doubleclick.net +.a.yimg.com/(?:(?!/i/).)*$ +.a[0-9].yimg.com/(?:(?!/i/).)*$ +bs*.gsanet.com +.qkimg.net + + +One of the most important jobs of Privoxy is to block banners. Many of these +can be "blocked" by the filter{banners-by-size} action, which we enabled above, +and which deletes the references to banner images from the pages while they are +loaded, so the browser doesn't request them anymore, and hence they don't need +to be blocked here. But this naturally doesn't catch all banners, and some +people choose not to use filters, so we need a comprehensive list of patterns +for banner URLs here, and apply the block action to them. + +First come many generic patterns, which do most of the work, by matching +typical domain and path name components of banners. Then comes a list of +individual patterns for specific sites, which is omitted here to keep the +example short: + +########################################################################## +# Block these fine banners: +########################################################################## +{ +block } + +# Generic patterns: +# +ad*. +.*ads. +banner?. +count*. +/.*count(er)?\.(pl|cgi|exe|dll|asp|php[34]?) +/(?:.*/)?(publicite|werbung|rekla(ma|me|am)|annonse|maino(kset|nta|s)?)/ + +# Site-specific patterns (abbreviated): +# +.hitbox.com + + +It's quite remarkable how many advertisers actually call their banner servers +ads.company.com, or call the directory in which the banners are stored simply +"banners". So the above generic patterns are surprisingly effective. + +But being very generic, they necessarily also catch URLs that we don't want to +block. The pattern .*ads. e.g.
catches "nasty-ads.nasty-corp.com" as intended, +but also "downloads.sourceforge.net" or "adsl.some-provider.net." So here come +some well-known exceptions to the +block section above. + +Note that these are exceptions to exceptions from the default! Consider the URL +"downloads.sourceforge.net": Initially, all actions are deactivated, so it +wouldn't get blocked. Then comes the defaults section, which matches the URL, +but leaves the block action deactivated. Then it matches .*ads., an +exception to the general non-blocking policy, and suddenly +block applies. And +now, it'll match .*loads., where -block applies, so (unless it matches again +further down) it ends up with no block action applying. + +########################################################################## +# Save some innocent victims of the above generic block patterns: +########################################################################## + +# By domain: +# +{ -block } +adv[io]*. # (for advogato.org and advice.*) +adsl. # (has nothing to do with ads) +adobe. # (has nothing to do with ads either) +ad[ud]*. # (adult.* and add.*) +.edu # (universities don't host banners (yet!)) +.*loads. # (downloads, uploads etc) + +# By path: +# +/.*loads/ + +# Site-specific: +# +www.globalintersec.com/adv # (adv = advanced) +www.ugu.com/sui/ugu/adv + + +Filtering source code can have nasty side effects, so make an exception for our +friends at sourceforge.net, and all paths with "cvs" in them. Note that -filter +disables all filters in one fell swoop! + +# Don't filter code! +# +{ -filter } +/(.*/)?cvs +bugzilla. +developer. +wiki. +.sourceforge.net + + +The actual default.action is of course much more comprehensive, but we hope +this example made clear how it works. + +------------------------------------------------------------------------------- + +8.7.2.
user.action + +So far we are painting with a broad brush by setting general policies, which +would be a reasonable starting point for many people. Now, you might want to be +more specific and have customized rules that are more suitable to your personal +habits and preferences. These would be for narrowly defined situations like +your ISP or your bank, and should be placed in user.action, which is parsed +after all other actions files and hence has the last word, over-riding any +previously defined actions. user.action is also a safe place for your personal +settings, since default.action is actively maintained by the Privoxy developers +and you'll probably want to install updated versions from time to time. + +So let's look at a few examples of things that one might typically do in +user.action: + +# My user.action file. + + +As aliases are local to the actions file that they are defined in, you can't +use the ones from default.action, unless you repeat them here: + +# Aliases are local to the file they are defined in. +# (Re-)define aliases for this file: +# +{{alias}} +# +# These aliases just save typing later, and the alias names should +# be self explanatory. +# ++crunch-all-cookies = +crunch-incoming-cookies +crunch-outgoing-cookies +-crunch-all-cookies = -crunch-incoming-cookies -crunch-outgoing-cookies + allow-all-cookies = -crunch-all-cookies -session-cookies-only + allow-popups = -filter{all-popups} -kill-popups ++block-as-image = +block +handle-as-image +-block-as-image = -block + +# These aliases define combinations of actions that are useful for +# certain types of sites: +# +fragile = -block -crunch-all-cookies -filter -fast-redirects -hide-referrer -kill-popups +shop = -crunch-all-cookies allow-popups + +# Allow ads for selected useful free sites: +# +allow-ads = -block -filter{banners-by-size} -filter{banners-by-link} + +# Alias for specific file types that are text, but might have conflicting +# MIME types. 
We want the browser to force these to be text documents. +handle-as-text = -filter +-content-type-overwrite{text/plain} +-force-text-mode -hide-content-disposition + + + + +Say you have accounts on some sites that you visit regularly, and you don't +want to have to log in manually each time. So you'd like to allow persistent +cookies for these sites. The allow-all-cookies alias defined above does exactly +that, i.e. it disables crunching of cookies in any direction, and the +processing of cookies to make them only temporary. + +{ allow-all-cookies } + sourceforge.net + .yahoo.com + .msdn.microsoft.com + .redhat.com + + +Your bank is allergic to some filter, but you don't know which, so you disable +them all: + +{ -filter } + .your-home-banking-site.com + + +Some file types you may not want to filter for various reasons: + +# Technical documentation is likely to contain strings that might +# erroneously get altered by the JavaScript-oriented filters: +# +.tldp.org +/(.*/)?selfhtml/ + +# And this stupid host sends streaming video with a wrong MIME type, +# so that Privoxy thinks it is getting HTML and starts filtering: +# +stupid-server.example.com/ + + +Example of a simple block action. Say you've seen an ad on your favourite page +on example.com that you want to get rid of. You have right-clicked the image, +selected "copy image location" and pasted the URL below while removing the +leading http://, into a { +block } section. Note that { +handle-as-image } need +not be specified, since all URLs ending in .gif will be tagged as images by the +general rules as set in default.action anyway: + +{ +block } + www.example.com/nasty-ads/sponsor\.gif + another.example.net/more/junk/here/ + + +The URLs of dynamically generated banners, especially from large banner farms, +often don't use the well-known image file name extensions, which makes it +impossible for Privoxy to guess the file type just by looking at the URL. 
You +can use the +block-as-image alias defined above for these cases. Note that +objects which match this rule but then turn out NOT to be an image are +typically rendered as a "broken image" icon by the browser. Use cautiously. + +{ +block-as-image } + .doubleclick.net + .fastclick.net + /Realmedia/ads/ + ar.atwola.com/ + + +Now you noticed that the default configuration breaks Forbes Magazine, but you +were too lazy to find out which action is the culprit, and you were again too +lazy to give feedback, so you just used the fragile alias on the site, and -- +whoa! -- it worked. The fragile alias disables those actions that are most +likely to break a site. It is also good for testing whether it is Privoxy +that is causing the problem or not. We later find other regular sites that +misbehave, and add those to our personalized list of troublemakers: + +{ fragile } + .forbes.com + webmail.example.com + .mybank.com + + +You like the "fun" text replacements in default.filter, but they are disabled in +the distributed actions file. So you'd like to turn them on in your private, +update-safe config, once and for all: + +{ +filter{fun} } + / # For ALL sites! + + +Note that the above is not really a good idea: There are exceptions to the +filters in default.action for things that really shouldn't be filtered, like +code on CVS->Web interfaces. Since user.action has the last word, these +exceptions won't be valid for the "fun" filtering specified here. + +You might also worry about how your favourite free websites are funded, and +find that they rely on displaying banner advertisements to survive. So you +might want to specifically allow banners for those sites that you feel provide +value to you: + +{ allow-ads } + .sourceforge.net + .slashdot.org + .osdn.net + + +Note that allow-ads has been aliased to -block, -filter{banners-by-size}, and +-filter{banners-by-link} above. 
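
For comparison, here is a minimal sketch of the same section written without
the alias. It assumes the allow-ads definition shown earlier in this file and
simply spells out the three actions that the alias stands for:

    { -block -filter{banners-by-size} -filter{banners-by-link} }
     .sourceforge.net
     .slashdot.org
     .osdn.net

Both forms should behave the same way; the alias only saves typing and keeps
the intent of the section readable.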
+ +Invoke another alias here to force an over-ride of the MIME type +application/x-sh which typically would open a download type dialog. In my case, I want to +look at the shell script, and then I can save it should I choose to. + +{ handle-as-text } + /.*\.sh$ + + +user.action is generally the best place to define exceptions and additions to +the default policies of default.action. Some actions are safe to have their +default policies set here though. So let's set a default policy to have a +"blank" image as opposed to the checkerboard pattern for ALL sites. "/" of +course matches all URL paths and patterns: + +{ +set-image-blocker{blank} } +/ # ALL sites + + +------------------------------------------------------------------------------- + +9. Filter Files + +On-the-fly text substitutions need to be defined in a "filter file". Once +defined, they can then be invoked as an "action". + +Privoxy supports three different filter actions: filter to rewrite the content +that is sent to the client, client-header-filter to rewrite headers that are +sent by the client, and server-header-filter to rewrite headers that are sent +by the server. + +Privoxy also supports two tagger actions: client-header-tagger and +server-header-tagger. Taggers and filters use the same syntax in the filter +files; the difference is that taggers don't modify the text they are filtering, +but use a rewritten version of the filtered text as tag. The tags can then be +used to change the applying actions through sections with tag-patterns. + +Multiple filter files can be defined through the filterfile config directive. +The filters as supplied by the developers are located in default.filter. It is +recommended that any locally defined or modified filters go in a separately +defined file such as user.filter. 
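
As a minimal sketch, assuming both filter files live in Privoxy's
configuration directory and that user.filter is the name chosen for the local
file, the corresponding lines in the main configuration file might read:

    filterfile default.filter
    filterfile user.filter

Keeping local filters in their own file means an updated default.filter can be
installed without overwriting them.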
+ +Common tasks for content filters are to eliminate common annoyances in HTML and +JavaScript, such as pop-up windows, exit consoles, crippled windows without +navigation tools, the infamous <BLINK> tag etc., to suppress images with certain +width and height attributes (standard banner sizes or web-bugs), or just to +have fun. + +Enabled content filters are applied to any content whose "Content-Type" header +is recognised as a sign of text-based content, with the exception of +text/plain. Use the force-text-mode action to also filter other content. + +Substitutions are made at the source level, so if you want to "roll your own" +filters, you should first be familiar with HTML syntax, and, of course, regular +expressions. + +Just like the actions files, the filter file is organized in sections, which +are called filters here. Each filter consists of a heading line that starts +with one of the keywords FILTER:, CLIENT-HEADER-FILTER: or +SERVER-HEADER-FILTER: followed by the filter's name, and a short (one line) +description of what it does. Below that line come the jobs, i.e. lines that +define the actual text substitutions. By convention, the name of a filter +should describe what the filter eliminates. The comment is used in the +web-based user interface. + +Once a filter called name has been defined in the filter file, it can be +invoked by using an action of the form +filter{name} in any actions file. + +Filter definitions start with a header line that contains the filter type, the +filter name and the filter description. A content filter header line for a +filter called "foo" could look like this: + +FILTER: foo Replace all "foo" with "bar" + + +Below that line, and up to the next header line, come the jobs that define what +text replacements the filter executes. They are specified in a syntax that +imitates Perl's s/// operator. 
If you are familiar with Perl, you will find +this to be quite intuitive, and may want to look at the PCRS documentation for +the subtle differences to Perl behaviour. Most notably, the non-standard option +letter U is supported, which turns the default to ungreedy matching. + +If you are new to "Regular Expressions", you might want to take a look at the +Appendix on regular expressions, and see the Perl manual for the s/// +operator's syntax and Perl-style regular expressions in general. The examples +below might also help to get you started. + +------------------------------------------------------------------------------- + +9.1. Filter File Tutorial + +Now, let's complete our "foo" content filter. We have already defined the +heading, but the jobs are still missing. Since all it does is to replace "foo" +with "bar", there is only one (trivial) job needed: + +s/foo/bar/ + + +But wait! Didn't the comment say that all occurrences of "foo" should be +replaced? Our current job will only take care of the first "foo" on each page. +For global substitution, we'll need to add the g option: + +s/foo/bar/g + + +Our complete filter now looks like this: + +FILTER: foo Replace all "foo" with "bar" +s/foo/bar/g + + +Let's look at some real filters for more interesting examples. Here you see a +filter that protects against some common annoyances that arise from JavaScript +abuse. Let's look at its jobs one after the other: + +FILTER: js-annoyances Get rid of particularly annoying JavaScript abuse + +# Get rid of JavaScript referrer tracking. Test page: http://www.randomoddness.com/untitled.htm +# +s|(<script.*)document\.referrer(.*</script>)|$1"Not Your Business!"$2|Usg + + +Following the header line and a comment, you see the job. Note that it uses | +as the delimiter instead of /, because the pattern contains a forward slash, +which would otherwise have to be escaped by a backslash (\). + +Now, let's examine the pattern: it starts with the text <script.* enclosed in +parentheses. Since the dot matches any character, and * means "match an +arbitrary number of the element to its left", this matches "<script" followed +by any text, i.e. it matches the whole page from the start of the first +<script> tag. 
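+ +That's more than we want, but the pattern continues: document\.referrer matches +only the exact string "document.referrer". The dot needed to be escaped, i.e. +preceded by a backslash, to take away its special meaning as a joker, and make +it just a regular dot. So far, the meaning is: Match from the start of the +first <script> tag in the page, up to and including the text +"document.referrer", if both are present in the page (and appear in that order). + +But there's still more pattern to go. The next element, again enclosed in +parentheses, is .*</script>. You already know what .* means, so the whole +pattern translates to: Match from the start of the first <script> tag in a page +to the end of the last <script> tag, provided that the text "document.referrer" +appears somewhere in between. + +This is still not the whole story, since we have ignored the options and the +parentheses: The portions of the page matched by sub-patterns that are enclosed +in parentheses are remembered and made available through the variables $1, $2, +... in the substitute. The U option switches to ungreedy matching, which means +that the first .* in the pattern will only "eat up" the text between "<script" +and the first occurrence of "document.referrer", and that the second .* will +only span the text up to the first "</script>" tag. Furthermore, the s +option says that the match may span multiple lines in the page, and the g +option again means that the substitution is global. + +So, to summarize, the pattern means: Match all scripts that contain the text +"document.referrer". Remember the parts of the script from (and including) the +start tag up to (and excluding) the string "document.referrer" as $1, and the +part following that string, up to and including the closing tag, as $2. + +Now the pattern is deciphered, but wasn't this about substituting things? So +let's look at the substitute: $1"Not Your Business!"$2 is easy to read: The text +remembered as $1, followed by "Not Your Business!" (including the quotation +marks!), followed by the text remembered as $2. This produces an exact copy of +the original string, with the middle part (the "document.referrer") replaced by +"Not Your Business!". + +The whole job now reads: Replace "document.referrer" by "Not Your Business!" +wherever it appears inside a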