X-Git-Url: http://www.privoxy.org/gitweb/?p=privoxy.git;a=blobdiff_plain;f=doc%2Fsource%2Fuser-manual.sgml;h=7fd3f8353d949d61ba563ab0dc5f0f365b7d9021;hp=8f6ab5fbca0bc1b9a276b1b80423cda1aff96f0b;hb=7bbee96637ad3a65a3ef35d37efc7fc059a96e5a;hpb=dcb6d2261f6cc345036a0054be5904667754a2ab diff --git a/doc/source/user-manual.sgml b/doc/source/user-manual.sgml index 8f6ab5fb..7fd3f835 100644 --- a/doc/source/user-manual.sgml +++ b/doc/source/user-manual.sgml @@ -1,57 +1,106 @@ - + + + + + + + + + + + + + + + + + + + + + + + + + +Privoxy"> +]> - -
-Privoxy User Manual -$Id: user-manual.sgml,v 1.61 2002/03/29 01:31:08 hal9 Exp $ +Privoxy &p-version; User Manual + + + + + + Copyright &my-copy; 2001 - 2007 by + Privoxy Developers + + + +$Id: user-manual.sgml,v 2.31 2007/06/02 14:01:37 fabiankeil Exp $ + + - - - - By: Privoxy Developers - - - + + + + This is here to keep vim syntax file from breaking :/ + If I knew enough to fix it, I would. + PLEASE DO NOT REMOVE! HB: hal@foobox.net + + +]]> + - The user manual gives users information on how to install, configure and use - Privoxy. Privoxy is a - web proxy with advanced filtering capabilities for protecting privacy, - filtering web page content, managing cookies, controlling access, and - removing ads, banners, pop-ups and other obnoxious Internet - Junk. Privoxy has a very flexible configuration - and can be customized to suit individual needs and - tastes. Privoxy has application for both - stand-alone systems and multi-user networks. + The Privoxy User Manual gives users information on how to + install, configure and use Privoxy. + + + &p-intro; + + -You can find the latest version of the user manual at http://www.privoxy.org/user-manual/. + You can find the latest version of the Privoxy User Manual at http://www.privoxy.org/user-manual/. + Please see the Contact section on how to + contact the developers. @@ -61,172 +110,41 @@ You can find the latest version of the user manual at Introduction - - Privoxy is a web proxy with advanced filtering - capabilities for protecting privacy, filtering web page content, managing - cookies, controlling access, and removing ads, banners, pop-ups and other - obnoxious Internet junk. Privoxy has a very - flexible configuration and can be customized to suit individual needs and - tastes. Privoxy has application for both - stand-alone systems and multi-user networks. - - - - Privoxy is based on the code of the - Internet Junkbuster. 
- Junkbuster was originally written by JunkBusters - Corporation, and was released as free open-source software under the GNU GPL. - Stefan Waldherr made many improvements, and started the SourceForge project - to continue development. - - - - Privoxy continues the - Junkbuster tradition, but adds many - refinements and enhancements. - - +Introduction - This documentation is included with the current BETA version of - Privoxy and is mostly complete at this - point. The most up to date reference for the time being is still the comments - in the source files and in the individual configuration files. Development - of version 3.0 is currently nearing completion, and includes many significant - changes and enhancements over earlier versions. The target release date for - stable v3.0 is soon ;-) + This documentation is included with the current &p-status; version of + Privoxy, v.&p-version;. + + - Since this is a BETA version, not all new features are well tested. This + Since this is a &p-status; version, not all new features are well tested. This documentation may be slightly out of sync as a result (especially with CVS sources). And there may be bugs, though hopefully not many! - +]]> - -New Features - - In addition to Internet Junkbuster's traditional - feature of ad and banner blocking and cookie management, - Privoxy provides new features, some of them - currently under development: - - - - - - - - - Integrated browser based configuration and control utility (http://p.p). Browser-based tracing of rule - and filter effects. - - - - - - Blocking of annoying pop-up browser windows. - - - - - - HTTP/1.1 compliant (most, but not all 1.1 features are supported). - - - - - - Support for Perl Compatible Regular Expressions in the configuration files, and - generally a more sophisticated and flexible configuration syntax over - previous versions. - - - - - - GIF de-animation. 
- - - - - - Web page content filtering (removes banners based on size, - invisible web-bugs, JavaScript, pop-ups, status bar abuse, - etc.) - - - - - - Bypass many click-tracking scripts (avoids script redirection). - - - - - - - Multi-threaded (POSIX and native threads). - - - - - - Auto-detection and re-reading of config file changes. - - - - - - User-customizable HTML templates (e.g. 404 error page). - - - - - - Improved cookie management features (e.g. session based cookies). - - - - - - Improved signal handling, and a true daemon mode (Unix). - - - - - - Builds from source on most UNIX-like systems. Packages available for: Linux - (RedHat, SuSE, or Debian), Windows, Sun Solaris, Mac OSX, OS/2, HP-UX 11 and AmigaOS. - - - - - - - In addition, the configuration is much more powerful and versatile over-all. - - - - - - +Features + + In addition to the core + features of ad blocking and + cookie management, + Privoxy provides many supplemental + features, + that give the end-user more control, more privacy and more freedom: + + + &newfeatures; + @@ -236,544 +154,860 @@ You can find the latest version of the user manual at Installation - - Privoxy is available as raw source code, or - pre-compiled binaries. See the Privoxy Home Page - for binaries and current release info. Privoxy - is also available via CVS. - This is the recommended approach at this time. But please be aware that CVS - is constantly changing, and it may break in mysterious ways. - - -Source - For gzipped tar archives, unpack the source: + Privoxy is available both in convenient pre-compiled + packages for a wide range of operating systems, and as raw source code. + For most users, we recommend using the packages, which can be downloaded from our + Privoxy Project + Page. - - tar xzvf privoxy-2.9.13-beta-src* [.tgz or .tar.gz] - cd privoxy-2.9.13-beta - + Note: + On some platforms, the installer may remove previously installed versions, if + found. (See below for your platform). 
In any case be sure to backup + your old configuration if it is valuable to you. See the note to upgraders section below. + +Binary Packages - For retrieving the current CVS sources, you'll need the CVS - package installed first. To download CVS source: +How to install the binary packages depends on your operating system: + + + +Red Hat and Fedora RPMs + - - cvs -d:pserver:anonymous@cvs.ijbswa.sourceforge.net:/cvsroot/ijbswa login - cvs -z3 -d:pserver:anonymous@cvs.ijbswa.sourceforge.net:/cvsroot/ijbswa co current - cd current - + RPMs can be installed with rpm -Uvh privoxy-&p-version;-1.rpm, + and will use /etc/privoxy for the location + of configuration files. - This will create a directory named current/, which will - contain the source tree. + Note that on Red Hat, Privoxy will + not be automatically started on system boot. You will + need to enable that using chkconfig, + ntsysv, or similar methods. - Then, in either case, to build from tarball/CVS source: + If you have problems with failed dependencies, try rebuilding the SRC RPM: + rpm --rebuild privoxy-&p-version;-1.src.rpm. This + will use your locally installed libraries and RPM version. - - ./configure (--help to see options) - make (the make from gnu, gmake for *BSD) - su - make -n install (to see where all the files will go) - make install (to really install) - + Also note that if you have a Junkbuster RPM installed + on your system, you need to remove it first, because the packages conflict. + Otherwise, RPM will try to remove Junkbuster + automatically if found, before installing Privoxy. + + +Debian - For Redhat and SuSE Linux RPM packages, see below. + DEBs can be installed with apt-get install privoxy, + and will use /etc/privoxy for the location of + configuration files. + - + +Windows + + + Just double-click the installer, which will guide you through + the installation process. You will find the configuration files + in the same directory as you installed Privoxy in. 
+ + + Version 3.0.4 introduced full Windows service + functionality. On Windows only, the Privoxy + program has two new command line arguments to install and uninstall + Privoxy as a service. + + + + Arguments: + + + --install[:service_name] + + + --uninstall[:service_name] + + + + + + After invoking Privoxy with + --install, you will need to bring up the + Windows service console to assign the user you + want Privoxy to run under, and whether or not you + want it to run whenever the system starts. You can start the + Windows services console with the following + command: services.msc. If you do not take the manual step + of modifying Privoxy's service settings, it will + not start. Note too that you will need to give Privoxy a user account that + actually exists, or it will not be permitted to + write to its log and configuration files. + + -Red Hat +Solaris, NetBSD, HP-UX + - To build Redhat RPM packages, install source as above. Then: + Create a new directory, cd to it, then unzip and + untar the archive. For the most part, you'll have to figure out where + things go. + + + +OS/2 - - autoheader - autoconf - ./configure - make redhat-dist - + First, make sure that no previous installations of + Junkbuster and / or + Privoxy are left on your + system. Check that no Junkbuster + or Privoxy objects are in + your startup folder. + - This will create both binary and src RPMs in the usual places. Example: + Then, just double-click the WarpIN self-installing archive, which will + guide you through the installation process. A shadow of the + Privoxy executable will be placed in your + startup folder so it will start automatically whenever OS/2 starts. -    /usr/src/redhat/RPMS/i686/privoxy-2.9.11-1.i686.rpm + The directory you choose to install Privoxy + into will contain all of the configuration files. 
+ + + +Mac OSX -    /usr/src/redhat/SRPMS/privoxy-2.9.11-1.src.rpm + Unzip the downloaded file (you can either double-click on the file + from the finder, or from the desktop if you downloaded it there). + Then, double-click on the package installer icon named + Privoxy.pkg + and follow the installation process. + Privoxy will be installed in the folder + /Library/Privoxy. + It will start automatically whenever you start up. To prevent it from + starting automatically, remove or rename the folder + /Library/StartupItems/Privoxy. - - To install, of course: + To start Privoxy by hand, double-click on + StartPrivoxy.command in the + /Library/Privoxy folder. + Or, type this command in the Terminal: - - - rpm -Uvv /usr/src/redhat/RPMS/i686/privoxy-2.9.11-1.i686.rpm - + + /Library/Privoxy/StartPrivoxy.command + - - This will place the Privoxy configuration - files in /etc/privoxy/, and log files in - /var/log/privoxy/. + You will be prompted for the administrator password. - - + -SuSE +AmigaOS - To build SuSE RPM packages, install source as above. Then: + Copy and then unpack the lha archive to a suitable location. + All necessary files will be installed into Privoxy + directory, including all configuration and log files. To uninstall, just + remove this directory. + - - - autoheader - autoconf - ./configure - make suse-dist - - + +FreeBSD - This will create both binary and src RPMs in the usual places. Example: + Privoxy is part of FreeBSD's Ports Collection, you can build and install + it with cd /usr/ports/www/privoxy; make install clean. - -    /usr/src/packages/RPMS/i686/privoxy-2.9.11-1.i686.rpm + If you don't use the ports, you can fetch and install + the package with pkg_add -r privoxy. -    /usr/src/packages/SRPMS/privoxy-2.9.11-1.src.rpm + The port skeleton and the package can also be downloaded from the + File Release + Page, but if you're interested in stable releases only you don't + gain anything by using them. 
+ + +Gentoo - To install, of course: + Gentoo source packages (Ebuilds) for Privoxy are + contained in the Gentoo Portage Tree (they are not on the download page, + but there is a Gentoo section, where you can see when a new + Privoxy Version is added to the Portage Tree). - - - rpm -Uvv /usr/src/packages/RPMS/i686/privoxy-2.9.11-1.i686.rpm - + Before installing Privoxy under Gentoo just do + first emerge rsync to get the latest changes from the + Portage tree. With emerge privoxy you install the latest + version. - - This will place the Privoxy configuration - files in /etc/privoxy/, and log files in - /var/log/privoxy/. + Configuration files are in /etc/privoxy, the + documentation is in /usr/share/doc/privoxy-&p-version; + and the Log directory is in /var/log/privoxy. + - -OS/2 - - - - - Privoxy is packaged in a WarpIN self- - installing archive. The self-installing program will be named depending - on the release version, something like: - ijbos2_setup_1.2.3.exe. In order to install it, simply - run this executable or double-click on its icon and follow the WarpIN - installation panels. A shadow of the Privoxy - executable will be placed in your startup folder so it will start - automatically whenever OS/2 starts. - +Building from Source - The directory you choose to install Privoxy - into will contain all of the configuration files. + The most convenient way to obtain the Privoxy sources + is to download the source tarball from our + project download + page. - If you would like to build binary images on OS/2 yourself, you will need - a few Unix-like tools: autoconf, autoheader and sh. These tools will be - used to create the required config.h file, which is not part of the - source distribution because it differs based on platform. You will also - need a compiler. - The distribution has been created using IBM VisualAge compilers, but you - can use any compiler you like. 
GCC/EMX has the disadvantage of needing - to be single-threaded due to a limitation of EMX's implementation of the - select() socket call. + If you like to live on the bleeding edge and are not afraid of using + possibly unstable development versions, you can check out the up-to-the-minute + version directly from the + CVS repository. + - - In addition to needing the source code distribution as outlined earlier, - you will want to extract the os2seutp directory from CVS: - - cvs -d:pserver:anonymous@cvs.ijbswa.sourceforge.net:/cvsroot/ijbswa login - cvs -z3 -d:pserver:anonymous@cvs.ijbswa.sourceforge.net:/cvsroot/ijbswa co os2setup - - This will create a directory named os2setup/, which will contain the - Makefile.vac makefile and os2build.cmd - which is used to completely create the binary distribution. The sequence - of events for building the executable for yourself goes something like this: - - cd current - autoheader - autoconf - sh configure - cd ..\os2setup - nmake -f Makefile.vac - - You will see this sequence laid out in os2build.cmd. - + +&buildsource; + - - - -Windows -Click-click. (I need help on this. Not a clue here. Also for -configuration section below. HB.) + +Keeping your Installation Up-to-Date + + As user feedback comes in and development continues, we will make updated versions + of both the main actions file (as a separate + package) and the software itself (including the actions file) available for + download. - - -Other - Some quick notes on other Operating Systems. + If you wish to receive an email notification whenever we release updates of + Privoxy or the actions file, subscribe + to our announce mailing list, ijbswa-announce@lists.sourceforge.net. - For FreeBSD (and other *BSDs?), the build will require gmake - instead of the included make. gmake is - available from http://www.gnu.org. - The rest should be the same as above for Linux/Unix. 
+ In order not to lose your personal changes and adjustments when updating + to the latest default.action file we strongly + recommend that you use user.action and + user.filter for your local + customizations of Privoxy. See the Chapter on actions files for details. + - - -Quickstart to Using <application>Privoxy</application> + +What's New in this Release - Before launching Privoxy for the first time, you - will want to configure your browser(s) to use Privoxy - and the HTTP and HTTPS proxy. The default is localhost for the proxy address, - and port 8118 (earlier versions used port 800). This is the one required - configuration that must be done! - - - - With Netscape (and - Mozilla), this can be set under Edit - -> Preferences -> Advanced -> Proxies -> HTTP Proxy. - For Internet Explorer: Tools > - Internet Properties -> Connections -> LAN Setting. Then, - check Use Proxy and fill in the appropriate info (Address: - localhost, Port: 8118). Include if HTTPS proxy support too. + There are many improvements and new features since Privoxy 3.0.6, the last stable release: - After doing this, flush your browser's disk and memory caches to force a - re-reading of all pages and get rid of any ads that may be cached. You - are now ready to start enjoying the benefits of using - Privoxy. - + + + + Header filtering can be done with dedicated header filters now. As a result, + the actions filter-client-headers and filter-server-headers + that were introduced with Privoxy 3.0.5 to apply + the content filters to the headers as well, have been removed again. + + + - -For for SuSE: /etc/rc.d/privoxy start + - -For RedHat: /etc/rc.d/init.d/privoxy start - + + +Note to Upgraders - If no configuration file is specified on the command line, - Privoxy will look for a file named - config in the current directory. Except on Win32 where - it will try config.txt. If no file is specified on the - command line and no default configuration file can be found, - Privoxy will fail to start. 
+ A quick list of things to be aware of before upgrading from earlier + versions of Privoxy: - - The included default configuration files should give a reasonable starting - point, though may be somewhat aggressive in blocking junk. Most of the - per site configuration is done in the actions files. These - are where various cookie actions are defined, ad and banner blocking, - and other aspects of Privoxy configuration. There - are several such files included, with varying levels of aggressiveness. - + - - You will probably want to keep an eye out for sites that require persistent - cookies, and add these to default.action as needed. By - default, most of these will be accepted only during the current browser - session, until you add them to the configuration. If you want the browser to - handle this instead, you will need to edit - default.action and disable this feature. If you use more - than one browser, it would make more sense to let - Privoxy handle this. In which case, the browser(s) - should be set to accept all cookies. + + + Some installers may remove earlier versions completely, including + configuration files. Save any important configuration files! + + + + + On the other hand, other installers may not overwrite any existing configuration + files, thinking you will want to do that. You may want to manually check + your saved files against the newer versions to see if the improvements have + merit, or whether there are new options that you may want to consider. + There are a number of new features, but most won't be available unless + these features are incorporated into your configuration somehow. + + + + + See the full documentation on + fast-redirects + which has changed syntax, and will require adjustments to local configs, + such as user.action. You must reference the new + syntax: + + + + { +fast-redirects{check-decoded-url} } + .example.com + mybank.com + .google. 
- - Privoxy is HTTP/1.1 compliant, but not all 1.1 - features are as yet implemented. If browsers that support HTTP/1.1 (like - Mozilla or recent versions of I.E.) experience - problems, you might try to force HTTP/1.0 compatibility. For Mozilla, look - under Edit -> Preferences -> Debug -> Networking. - Or set the +downgrade config option in - default.action. - + + + + The jarfile, + cookie logger, is off by default now. + + - - After running Privoxy for a while, you can - start to fine tune the configuration to suit your personal, or site, - preferences and requirements. There are many, many aspects that can - be customized. Actions (as specified in default.action) - can be adjusted by pointing your browser to - http://p.p/, - and then follow the link to edit the actions list. - (This is an internal page and does not require Internet access.) - + + + What constitutes a default configuration has changed, + and you may want to review which actions are on by + default. This is primarily a matter of emphasis, but some features + you may have been used to, may now be off by default. + There are also a number of new actions and filters you may want to + consider, most of which are not fully incorporated into the default + settings as yet (see above). + + - - In fact, various aspects of Privoxy - configuration can be viewed from this page, including - current configuration parameters, source code version numbers, - the browser's request headers, and actions that apply - to a given URL. In addition to the default.action file - editor mentioned above, Privoxy can also - be turned on and off from this page. - + + + The default actions setting is now Cautious. Previous + releases had a default setting of Medium. Experienced + users may want to adjust this, as it is fairly conservative by &my-app; + standards and past practices. See + http://config.privoxy.org/edit-actions-list?f=default. New users + should try the default settings for a while before turning up the volume. 
+ + - - If you encounter problems, please verify it is a - Privoxy bug, by disabling - Privoxy, and then trying the same page. - Also, try another browser if possible to eliminate browser or site - problems. Before reporting it as a bug, see if there is not a configuration - option that is enabled that is causing the page not to load. You can then add - an exception for that page or site. For instance, try adding it to the - {fragile} section of default.action. - This will turn off most actions for this site. For more on troubleshooting - problem sites, see the Appendix. If a bug, please report it - to the developers (see below). - + + + The default setting has filtering turned off, which + subsequently means that compression is on. Remember + that filtering does not work on compressed pages, so if you use, or want to + use, filtering, you will need to force compression off. Example: + + + + { +filter{google} +prevent-compression } + .google. + + + Or if you use a number of filters, or filter many sites, you may just want + to turn off compression for all sites in + default.action (or + user.action). + + - + + + Also, session-cookies-only is + off by default now. If you've liked this feature in the past, you may want + to turn it back on in user.action now. + + - -Command Line Options - - Privoxy may be invoked with the following - command-line options: + + + + + + Some installers may not automatically start + Privoxy after installation. + + + + + + + + +Quickstart to Using Privoxy - --version - - - Print version info and exit, Unix only. - - + Install Privoxy. See the Installation Section below for platform specific + information. + + + - --help + Advanced users and those who want to offer Privoxy + service to more than just their local machine should check the main config file, especially the security-relevant options. These are + off by default. + + + - Print a short usage info and exit, Unix only. 
+ Start Privoxy, if the installation program has + not done this already (may vary according to platform). See the section + Starting Privoxy. - + + - --no-daemon + Set your browser to use Privoxy as HTTP and + HTTPS (SSL) proxy + by setting the proxy configuration for address of + 127.0.0.1 and port 8118. + DO NOT activate proxying for FTP or + any protocols besides HTTP and HTTPS (SSL)! It won't work! + + + - Don't become a daemon, i.e. don't fork and become process group - leader, don't detach from controlling tty. Unix only. + Flush your browser's disk and memory caches, to remove any cached ad images. + If using Privoxy to manage + cookies, + you should remove any currently stored cookies too. + - --pidfile FILE - + A default installation should provide a reasonable starting point for + most. There will undoubtedly be occasions where you will want to adjust the + configuration, but that can be dealt with as the need arises. Little + to no initial configuration is required in most cases. - On startup, write the process ID to FILE. Delete the - FILE on exit. Failiure to create or delete the - FILE is non-fatal. If no FILE - option is given, no PID file will be used. Unix only. - + See the Configuration section for more + configuration options, and how to customize your installation. + You might also want to look at the next section for a quick + introduction to how Privoxy blocks ads and + banners. + + - --user USER[.GROUP] - + If you experience ads that slip through, innocent images that are + blocked, or otherwise feel the need to fine-tune + Privoxy's behavior, take a look at the actions files. As a quick start, you might + find the richly commented examples + helpful. You can also view and edit the actions files through the web-based user interface. The + Appendix Troubleshooting: Anatomy of an + Action has hints on how to understand and debug actions that + misbehave. 
+ + + - After (optionally) writing the PID file, assume the user ID of - USER, and if included the GID of GROUP. Exit if the - privileges are not sufficient to do so. Unix only. + For easy access to &my-app;'s most important controls, drag the provided + Bookmarklets into your browser's + personal toolbar. + - configfile + Please see the section Contacting the + Developers on how to report bugs, problems with websites or to get + help. + + + - If no configfile is included on the command line, - Privoxy will look for a file named - config in the current directory (except on Win32 - where it will look for config.txt instead). Specify - full path to avoid confusion. + Now enjoy surfing with enhanced control, comfort and privacy! - + - - - - - - - - -<application>Privoxy</application> Configuration - - All Privoxy configuration is kept - in text files. These files can be edited with a text editor. - Many important aspects of Privoxy can - also be controlled easily with a web browser. - - - - -Controlling <application>Privoxy</application> with Your Web Browser - - Privoxy can be reached by the special - URL http://p.p/ (or alternately - http://config.privoxy.org/), - which is an internal page. You will see the following section: - - - + +Quickstart to Ad Blocking + - - -Please choose from the following options: - - * Show information about the current configuration - * Show the source code version numbers - * Show the client's request headers. - * Show which actions apply to a URL and why - * Toggle Privoxy on or off - * Edit the actions list - - + Ad blocking is but one of Privoxy's + array of features. Many of these features are for the technically minded advanced + user. But, ad and banner blocking is surely common ground for everybody. - - - This should be self-explanatory. Note the last item is an editor for the - actions list, which is where much of the ad, banner, cookie, - and URL blocking magic is configured as well as other advanced features of - Privoxy. 
This is an easy way to adjust various - aspects of Privoxy configuration. The actions - file, and other configuration files, are explained in detail below. - Privoxy will automatically detect any changes - to these files. + + This section will provide a quick summary of ad blocking so + you can get up to speed quickly without having to read the more extensive + information provided below, though this is highly recommended. + + + First a bit of a warning ... blocking ads is much like blocking SPAM: the + more aggressive you are about it, the more likely you are to block + things that were not intended. And the more likely that some things + may not work as intended. So there is a trade off here. If you want + extreme ad free browsing, be prepared to deal with more + problem sites, and to spend more time adjusting the + configuration to solve these unintended consequences. In short, there is + not an easy way to eliminate all ads. Either take + the easy way and settle for most ads blocked with the + default configuration, or jump in and tweak it for your personal surfing + habits and preferences. + + + Secondly, a brief explanation of Privoxy's + actions. Actions in this context, are + the directives we use to tell Privoxy to perform + some task relating to WWW transactions (i.e. web browsing). We tell + Privoxy to take some action. Each + action has a unique name and function. While there are many potential + actions in Privoxy's + arsenal, only a few are used for ad blocking. Actions, and action + configuration files, are explained in depth below. + + + Actions are specified in Privoxy's configuration, + followed by one or more URLs to which the action should apply. URLs + can actually be URL type patterns that use + wildcards so they can apply potentially to a range of similar URLs. The + actions, together with the URL patterns are called a section. 
+ + + When you connect to a website, the full URL will either match one or more + of the sections as defined in Privoxy's configuration, + or not. If so, then Privoxy will perform the + respective actions. If not, then nothing special happens. Furthermore, web + pages may contain embedded, secondary URLs that your web browser will + use to load additional components of the page, as it parses the + original page's HTML content. An ad image, for instance, is just a URL + embedded in the page somewhere. The image itself may be on the same server, + or a server somewhere else on the Internet. Complex web pages will have many + such embedded URLs. &my-app; can deal with each URL individually, so, for + instance, the main page text is not touched, but images from such-and-such + server are blocked. + + + + The most important actions for basic ad blocking are: block, handle-as-image, + handle-as-empty-document, and + set-image-blocker: - Toggle Privoxy On or Off is handy for sites that might - have problems with your current actions and filters, or just to test if - a site misbehaves, whether it is Privoxy - causing the problem or not. Privoxy continues - to run as a proxy in this case, but all filtering is disabled. - - - - - - + + + + + block - this is perhaps + the single most used action, and is particularly important for ad blocking. + This action stops any contact between your browser and any URL patterns + that match this action's configuration. It can be used for blocking ads, + but also anything that is determined to be unwanted. By itself, it simply + stops any communication with the remote server and sends + Privoxy's own built-in BLOCKED page instead to + let you know what has happened (with some exceptions, see below). + + + + + handle-as-image - + tells Privoxy to treat this URL as an image. + Privoxy's default configuration already does this + for all common image types (e.g. GIF), but there are many situations where this + is not so easy to determine. 
So we'll force it in these cases. This is particularly + important for ad blocking, since only if we know that it's an image of + some kind, can we replace it with an image of our choosing, instead of the + Privoxy BLOCKED page (which would only result in + a broken image icon). There are some limitations to this + though. For instance, you can't just brute-force an image substitution for + an entire HTML page in most situations. + + + + + handle-as-empty-document - + sends an empty document instead of Privoxy's + normal BLOCKED HTML page. This is useful for blocking file types that are neither + HTML nor images, such as JavaScript files. + + + + + set-image-blocker - tells + Privoxy what to display in place of an ad image that + has hit a block rule. For this to come into play, the URL must match a + block action somewhere in the + configuration, and it must also match a + handle-as-image action. + + + The configuration options on what to display instead of the ad are: + + + +    pattern - a checkerboard pattern, so that an ad + replacement is obvious. This is the default. + + + + +    blank - A very small empty GIF image is displayed. + This is the so-called invisible configuration option. + + + + +    http://<URL> - A redirect to any image anywhere + of the user's choosing (advanced usage). + + + - + + - -Configuration Files Overview - For Unix, *BSD and Linux, all configuration files are located in - /etc/privoxy/ by default. For MS Windows, OS/2, and - AmigaOS these are all in the same directory as the - Privoxy executable. The name and number of - configuration files has changed from previous versions, and is subject to - change as development progresses. + The quickest way to adjust any of these settings is with your browser through + the special Privoxy editor at http://config.privoxy.org/show-status + (shortcut: http://p.p/show-status). This + is an internal page, and does not require Internet access. 
Select the + appropriate actions file, and click + Edit. It is best to put personal or + local preferences in user.action since this is not + meant to be overwritten during upgrades, and will over-ride the settings in + other files. Here you can insert new actions, and URLs for ad + blocking or other purposes, and make other adjustments to the configuration. + Privoxy will detect these changes automatically. - The installed defaults provide a reasonable starting point, though possibly - aggressive by some standards. For the time being, there are only three - default configuration files (this will change in time): + A quick and simple step by step example: @@ -781,2455 +1015,6740 @@ Please choose from the following options: - The main configuration file is named config - on Linux, Unix, BSD, OS/2, and AmigaOS and config.txt - on Windows. + Right click on the ad image to be blocked, then select + Copy Link Location from the + pop-up menu. - - The default.action file is used to define various - actions relating to images, banners, pop-ups, access - restrictions, banners and cookies. There is a CGI based editor for this - file that can be accessed via http://p.p. (Other actions - files are included as well with differing levels of filtering - and blocking, e.g. ijb-basic.action.) + Set your browser to + http://config.privoxy.org/show-status - - The default.filter file can be used to re-write the raw - page content, including viewable text as well as embedded HTML and JavaScript, - and whatever else lurks on any given web page. + Find user.action in the top section, and click + on Edit: - + + +
Actions Files in Use + + + + + + [ Screenshot of Actions Files in Use ] + + +
+
+ + + + + You should have a section with only + block listed under + Actions:. + If not, click an Insert new section below + button, and in the new section that just appeared, click the + Edit button right under the word Actions:. + This will bring up a list of all actions. Find + block near the top, and click + in the Enabled column, then Submit + just below the list. + + + + + Now, in the block actions section, + click the Add button, and paste the URL the + browser got from Copy Link Location. + Remove the http:// at the beginning of the URL. Then, click + Submit (or + OK if in a pop-up window). + + + + + Now go back to the original page, and press SHIFT-Reload + (or flush all browser caches). The image should be gone now. + + +
- default.action and default.filter - can use Perl style regular expressions for maximum flexibility. All files use - the # character to denote a comment. Such - lines are not processed by Privoxy. After - making any changes, there is no need to restart - Privoxy in order for the changes to take - effect. Privoxy should detect such changes - automatically. + This is a very crude and simple example. There might be good reasons to use a + wildcard pattern match to include potentially similar images from the same + site. For a more extensive explanation of patterns, and + the entire actions concept, see the Actions + section. - While under development, the configuration content is subject to change. - The below documentation may not be accurate by the time you read this. - Also, what constitutes a default setting, may change, so - please check all your configuration files on important issues. + For advanced users who want to hand edit their config files, you might want + to now go to the Actions Files Tutorial. + The ideas explained therein also apply to the web-based editor. + + There are also various + filters that can be used for ad blocking + (filters are a special subset of actions). These + fall into the advanced usage category, and are explained in + depth in later sections. + +
+
+ + - - -The Main Configuration File + + +Starting Privoxy - Again, the main configuration file is named config on - Linux/Unix/BSD and OS/2, and config.txt on Windows. - Configuration lines consist of an initial keyword followed by a list of - values, all separated by whitespace (any number of spaces or tabs). For - example: + Before launching Privoxy for the first time, you + will want to configure your browser(s) to use + Privoxy as a HTTP and HTTPS (SSL) + proxy. The default is + 127.0.0.1 (or localhost) for the proxy address, and port 8118 (earlier versions + used port 8000). This is the one configuration step that must be done +! - - - - - blockfile blocklist.ini - - - + Please note that Privoxy can only proxy HTTP and + HTTPS traffic. It will not work with FTP or other protocols. - - Indicates that the blockfile is named blocklist.ini. (A - default installation does not use this.) - + + +
Proxy Configuration Showing + Mozilla/Netscape HTTP and HTTPS (SSL) Settings + + + + + + [ Screenshot of Mozilla Proxy Configuration ] + + +
+
+ - - A # indicates a comment. Any part of a - line following a # is ignored, except if - the # is preceded by a - \. + + With Firefox, this is typically set under: + + + Tools -> Options -> General -> Connection Settings -> Manual Proxy Configuration - - Thus, by placing a # at the start of an - existing configuration line, you can make it a comment and it will be treated - as if it weren't there. This is called commenting out an - option and can be useful to turn off features: If you comment out the - logfile line, Privoxy will not - log to a file at all. Watch for the default: section in each - explanation to see what happens if the option is left unset (or commented - out). - + - - Long lines can be continued on the next line by using a - \ as the very last character. + + Or optionally on some platforms: + + + Edit -> Preferences -> General -> Connection Settings -> Manual Proxy Configuration - - There are various aspects of Privoxy behavior - that can be tuned. + + + + + With Netscape (and + Mozilla), this can be set under: - + + + + Edit -> Preferences -> Advanced -> Proxies -> HTTP Proxy - -Defining Other Configuration Files + - Privoxy can use a number of other files to tell it - what ads to block, what cookies to accept, etc. This section of the - configuration file tells Privoxy where to find - all those other files. + For Internet Explorer v.5-6: + + Tools -> Internet Options -> Connections -> LAN Settings + + - On Windows and AmigaOS, - Privoxy looks for these files in the same - directory as the executable. On Unix and OS/2, - Privoxy looks for these files in the current - working directory. In either case, an absolute path name can be used to - avoid problems. + Then, check Use Proxy and fill in the appropriate info + (Address: 127.0.0.1, Port: 8118). Include HTTPS (SSL), if you want HTTPS + proxy support too (sometimes labeled Secure). Make sure any + checkboxes like Use the same proxy server for all protocols is + UNCHECKED. 
You want only HTTP and HTTPS (SSL)! + + +
Proxy Configuration Showing + Internet Explorer HTTP and HTTPS (Secure) Settings + + + + + + [ Screenshot of IE Proxy Configuration ] + + +
+
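The proxy address and port described above (127.0.0.1, port 8118, used for both HTTP and HTTPS) can also be supplied to a program rather than a browser. The following is only a minimal sketch using Python's standard library. The function names `privoxy_proxies` and `build_opener_via_privoxy` are made up for this illustration and are not part of Privoxy.

```python
import urllib.request

# Defaults taken from the manual text: 127.0.0.1 (localhost), port 8118.
PRIVOXY_HOST = "127.0.0.1"
PRIVOXY_PORT = 8118


def privoxy_proxies(host=PRIVOXY_HOST, port=PRIVOXY_PORT):
    """Return a proxy mapping for HTTP and HTTPS only (hypothetical helper).

    No entry is created for FTP or other protocols, mirroring the advice
    not to use the same proxy for all protocols.
    """
    proxy_url = f"http://{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}


def build_opener_via_privoxy(host=PRIVOXY_HOST, port=PRIVOXY_PORT):
    """Build a urllib opener that sends HTTP/HTTPS requests through the proxy."""
    handler = urllib.request.ProxyHandler(privoxy_proxies(host, port))
    return urllib.request.build_opener(handler)
```

If Privoxy is running with its default listen address, calling `build_opener_via_privoxy().open("http://p.p/")` should return the built-in Privoxy page. The sketch itself makes no network request.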
+ + - When development goes modular and multi-user, the blocker, filter, and - per-user config will be stored in subdirectories of confdir. - For now, only confdir/templates is used for storing HTML - templates for CGI results. + After doing this, flush your browser's disk and memory caches to force a + re-reading of all pages and to get rid of any ads that may be cached. Remove + any cookies, + if you want Privoxy to manage that. You are now + ready to start enjoying the benefits of using + Privoxy! - The location of the configuration files: + Privoxy itself is typically started by specifying the + main configuration file to be used on the command line. If no configuration + file is specified on the command line, Privoxy + will look for a file named config in the current + directory. Except on Win32 where it will try config.txt. + +Red Hat and Fedora - - - - confdir /etc/privoxy # No trailing /, please. - - - + A default Red Hat installation may not start &my-app; upon boot. It will use + the file /etc/privoxy/config as its main configuration + file. - - The directory where all logging (i.e. logfile and - jarfile) takes place. No trailing - /, please: + + # /etc/rc.d/init.d/privoxy start + - - - - - logdir /var/log/privoxy - - - + Or ... - - Note that all file specifications below are relative to - the above two directories! + + # service privoxy start + + + +Debian - The default.action file contains patterns to specify the - actions to apply to requests for each site. Default: Cookies to and from all - destinations are kept only during the current browser session (i.e. they are - not saved to disk). Pop-ups are disabled for all sites. All sites are - filtered through selected sections of default.filter. No sites - are blocked. Privoxy displays a checkboard type - pattern for filtered ads and other images. The syntax of this file is - explained in detail below. Other - actions files are included, and you are free to use any of - them. 
They have varying degrees of aggressiveness. + We use a script. Note that Debian typically starts &my-app; upon booting per + default. It will use the file + /etc/privoxy/config as its main configuration + file. - - - - - actionsfile default.action - - - + + # /etc/init.d/privoxy start + + + + + +Windows - - - - filterfile default.filter - - - +Click on the &my-app; Icon to start Privoxy. If no configuration file is + specified on the command line, Privoxy will look + for a file named config.txt. Note that Windows will + automatically start &my-app; when the system starts if you chose that option + when installing. - - The logfile is where all logging and error messages are written. The logfile - can be useful for tracking down a problem with - Privoxy (e.g., it's not blocking an ad you - think it should block) but in most cases you probably will never look at it. + Privoxy can run with full Windows service functionality. + On Windows only, the &my-app; program has two new command line arguments + to install and uninstall &my-app; as a service. See the + Windows Installation + instructions for details. + + +Solaris, NetBSD, FreeBSD, HP-UX and others - Your logfile will grow indefinitely, and you will probably want to - periodically remove it. On Unix systems, you can do this with a cron job - (see man cron). For Redhat, a logrotate - script has been included. +Example Unix startup command: - - On SuSE Linux systems, you can place a line like /var/log/privoxy.* - +1024k 644 nobody.nogroup in /etc/logfiles, with - the effect that cron.daily will automatically archive, gzip, and empty the - log, when it exceeds 1M size. + + # /usr/sbin/privoxy /etc/privoxy/config + + + +OS/2 - Default: Log to the a file named logfile. - Comment out to disable logging. + During installation, Privoxy is configured to + start automatically when the system restarts. You can start it manually by + double-clicking on the Privoxy icon in the + Privoxy folder. 
+ + +Mac OSX - - - - logfile logfile - - - + During installation, Privoxy is configured to + start automatically when the system restarts. To start &my-app; manually, + double-click on the StartPrivoxy.command icon in the + /Library/Privoxy folder. Or, type this command + in the Terminal: - - The jarfile defines where - Privoxy stores the cookies it intercepts. Note - that if you use a jarfile, it may grow quite large. Default: - Don't store intercepted cookies. + + /Library/Privoxy/StartPrivoxy.command + - - - - - #jarfile jarfile - - - + You will be prompted for the administrator password. + + + +AmigaOS - If you specify a trustfile, - Privoxy will only allow access to sites that - are named in the trustfile. You can also mark sites as trusted referrers, - with the effect that access to untrusted sites will be granted, if a link - from a trusted referrer was used. The link target will then be added to the - trustfile. This is a very restrictive feature that typical - users most probably want to leave disabled. Default: Disabled, don't use the - trust mechanism. + Start Privoxy (with RUN <>NIL:) in your + startnet script (AmiTCP), in + s:user-startup (RoadShow), as startup program in your + startup script (Genesis), or as startup action (Miami and MiamiDx). + Privoxy will automatically quit when you quit your + TCP/IP stack (just ignore the harmless warning your TCP/IP stack may display that + Privoxy is still running). + + +Gentoo - - - - #trustfile trust - - - + A script is again used. It will use the file /etc/privoxy/config + as its main configuration file. - - If you use the trust mechanism, it is a good idea to write up some on-line - documentation about your blocking policy and to specify the URL(s) here. They - will appear on the page that your users receive when they try to access - untrusted content. Use multiple times for multiple URLs. Default: Don't - display links on the untrusted info page. 
+ + /etc/init.d/privoxy start + - - - - - trust-info-url http://www.your-site.com/why_we_block.html - trust-info-url http://www.your-site.com/what_we_allow.html - - - + Note that Privoxy is not automatically started at + boot time by default. You can change this with the rc-update + command. + + + rc-update add privoxy default + + + - - - - - + + + See the section Command line options for + further info. + - -Other Configuration Options +must find a better place for this paragraph - This part of the configuration file contains options that control how - Privoxy operates. + The included default configuration files should give a reasonable starting + point. Most of the per site configuration is done in the + actions files. These are + where various cookie actions are defined, ad and banner blocking, and other + aspects of Privoxy configuration. There are several + such files included, with varying levels of aggressiveness. - Admin-address should be set to the email address of the proxy - administrator. It is used in many of the proxy-generated pages. Default: - fill@me.in.please. + You will probably want to keep an eye out for sites for which you may prefer + persistent cookies, and add these to your actions configuration as needed. By + default, most of these will be accepted only during the current browser + session (aka session cookies), unless you add them to the + configuration. If you want the browser to handle this instead, you will need + to edit user.action (or through the web based interface) + and disable this feature. If you use more than one browser, it would make + more sense to let Privoxy handle this. In which + case, the browser(s) should be set to accept all cookies. 
- - - - #admin-address fill@me.in.please - - - + Another feature where you will probably want to define exceptions for trusted + sites is the popup-killing (through the +kill-popups and + +filter{popups} + actions), because your favorite shopping, banking, or leisure site may need + popups (explained below). - Proxy-info-url can be set to a URL that contains more info - about this Privoxy installation, it's - configuration and policies. It is used in many of the proxy-generated pages - and its use is highly recommended in multi-user installations, since your - users will want to know why certain content is blocked or modified. Default: - Don't show a link to on-line documentation. + Privoxy is HTTP/1.1 compliant, but not all of + the optional 1.1 features are as yet supported. In the unlikely event that + you experience inexplicable problems with browsers that use HTTP/1.1 per default + (like Mozilla or recent versions of I.E.), you might + try to force HTTP/1.0 compatibility. For Mozilla, look under Edit -> + Preferences -> Debug -> Networking. + Alternatively, set the +downgrade-http-version config option in + default.action which will downgrade your browser's HTTP + requests from HTTP/1.1 to HTTP/1.0 before processing them. - - - - proxy-info-url http://www.your-site.com/proxy.html - - - + After running Privoxy for a while, you can + start to fine tune the configuration to suit your personal, or site, + preferences and requirements. There are many, many aspects that can + be customized. Actions + can be adjusted by pointing your browser to + http://config.privoxy.org/ + (shortcut: http://p.p/), + and then follow the link to View & Change the Current Configuration. + (This is an internal page and does not require Internet access.) - Listen-address specifies the address and port where - Privoxy will listen for connections from your - Web browser. The default is to listen on the localhost port 8118, and - this is suitable for most users. 
(In your web browser, under proxy - configuration, list the proxy server as localhost and the - port as 8118). + In fact, various aspects of Privoxy + configuration can be viewed from this page, including + current configuration parameters, source code version numbers, + the browser's request headers, and actions that apply + to a given URL. In addition to the actions file + editor mentioned above, Privoxy can also + be turned on and off (toggled) from this page. - If you already have another service running on port 8118, or if you want to - serve requests from other machines (e.g. on your local network) as well, you - will need to override the default. The syntax is - listen-address [<ip-address>]:<port>. If you leave - out the IP address, Privoxy will bind to all - interfaces (addresses) on your machine and may become reachable from the - Internet. In that case, consider using access control lists (acl's) (see - aclfile above), or a firewall. + If you encounter problems, try loading the page without + Privoxy. If that helps, enter the URL where + you have the problems into the browser + based rule tracing utility. See which rules apply and why, and + then try turning them off for that site one after the other, until the problem + is gone. When you have found the culprit, you might want to turn the rest on + again. - For example, suppose you are running Privoxy on - a machine which has the address 192.168.0.1 on your local private network - (192.168.0.0) and has another outside connection with a different address. - You want it to serve requests from inside only: + If the above paragraph sounds gibberish to you, you might want to read more about the actions concept + or even dive deep into the Appendix + on actions. - - - - listen-address 192.168.0.1:8118 - - - + If you can't get rid of the problem at all, think you've found a bug in + Privoxy, want to propose a new feature or smarter rules, please see the + section Contacting the + Developers below. 
+--> + + + +Command Line Options - If you want it to listen on all addresses (including the outside - connection): + Privoxy may be invoked with the following + command-line options: - - - - listen-address :8118 - - - + + + + + --version + + + Print version info and exit. Unix only. + + + + + --help + + + Print short usage info and exit. Unix only. + + + + + --no-daemon + + + Don't become a daemon, i.e. don't fork and become process group + leader, and don't detach from controlling tty. Unix only. + + + + + --pidfile FILE + + + + On startup, write the process ID to FILE. Delete the + FILE on exit. Failure to create or delete the + FILE is non-fatal. If no FILE + option is given, no PID file will be used. Unix only. + + + + + --user USER[.GROUP] + + + + After (optionally) writing the PID file, assume the user ID of + USER, and if included the GID of GROUP. Exit if the + privileges are not sufficient to do so. Unix only. + + + + + --chroot + + + + Before changing to the user ID given in the --user option, + chroot to that user's home directory, i.e. make the kernel pretend to the &my-app; + process that the directory tree starts there. If set up carefully, this can limit + the impact of possible vulnerabilities in &my-app; to the files contained in that hierarchy. + Unix only. + + + + + configfile + + + If no configfile is included on the command line, + Privoxy will look for a file named + config in the current directory (except on Win32 + where it will look for config.txt instead). Specify + the full path to avoid confusion. If no config file is found, + Privoxy will fail to start. + + + + - If you do this, consider using ACLs (see aclfile above). Note: - you will need to point your browser(s) to the address and port that you have - configured here. Default: localhost:8118 (127.0.0.1:8118). + On MS Windows only there are two additional + command-line options to allow Privoxy to install and + run as a service. See the +Windows Installation section +for details. + + +
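The Unix options listed above can be combined on one command line. As a rough sketch, the helper below assembles such an argument list. `build_privoxy_argv` is a hypothetical name, and the binary path is the one from the example Unix startup command. The check that `--chroot` needs `--user` is this sketch's own reading of the option description, not confirmed behavior. Placing options before the configuration file is likewise an assumption.

```python
def build_privoxy_argv(configfile, binary="/usr/sbin/privoxy", no_daemon=False,
                       pidfile=None, user=None, group=None, chroot=False):
    """Assemble a Privoxy command line from the documented Unix options."""
    if group and not user:
        raise ValueError("GROUP is only meaningful as part of --user USER[.GROUP]")
    if chroot and not user:
        # Assumption: --chroot uses the home directory of the --user account.
        raise ValueError("--chroot is described in terms of the --user option")

    argv = [binary]
    if no_daemon:
        argv.append("--no-daemon")            # stay in the foreground
    if pidfile:
        argv += ["--pidfile", pidfile]        # write the process ID to FILE
    if user:
        argv += ["--user", f"{user}.{group}" if group else user]
    if chroot:
        argv.append("--chroot")
    argv.append(configfile)                   # full path avoids confusion
    return argv
```

For example, `build_privoxy_argv("/etc/privoxy/config", no_daemon=True)` gives a list that could be passed to `subprocess.run` for a foreground test run. Any user name or PID file path you pass in is your own choice. The manual does not prescribe one.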
+ + + + + +Privoxy Configuration + + All Privoxy configuration is stored + in text files. These files can be edited with a text editor. + Many important aspects of Privoxy can + also be controlled easily with a web browser. + + + + + + +Controlling Privoxy with Your Web Browser - The debug option sets the level of debugging information to log in the - logfile (and to the console in the Windows version). A debug level of 1 is - informative because it will show you each request as it happens. Higher - levels of debug are probably only of interest to developers. + Privoxy's user interface can be reached through the special + URL http://config.privoxy.org/ + (shortcut: http://p.p/), + which is a built-in page and works without Internet access. + You will see the following section: + + + + +     Privoxy Menu + + + +         ▪  View & change the current configuration + + +         ▪  View the source code version numbers + + +         ▪  View the request headers. + + +         ▪  Look up which actions apply to a URL and why + + +         ▪  Toggle Privoxy on or off + + +         ▪  Documentation + + + + + + - - - - debug 1 # GPC = show each GET/POST/CONNECT request - debug 2 # CONN = show each connection status - debug 4 # IO = show I/O status - debug 8 # HDR = show header parsing - debug 16 # LOG = log all data into the logfile - debug 32 # FRC = debug force feature - debug 64 # REF = debug regular expression filter - debug 128 # = debug fast redirects - debug 256 # = debug GIF de-animation - debug 512 # CLF = Common Log Format - debug 1024 # = debug kill pop-ups - debug 4096 # INFO = Startup banner and warnings. - debug 8192 # ERROR = Non-fatal errors - - - + This should be self-explanatory. Note the first item leads to an editor for the + actions files, which is where the ad, banner, + cookie, and URL blocking magic is configured as well as other advanced features of + Privoxy. This is an easy way to adjust various + aspects of Privoxy configuration. 
The actions + file, and other configuration files, are explained in detail below. - It is highly recommended that you enable ERROR - reporting (debug 8192), at least until v3.0 is released. + Toggle Privoxy On or Off is handy for sites that might + have problems with your current actions and filters. You can in fact use + it as a test to see whether it is Privoxy + causing the problem or not. Privoxy continues + to run as a proxy in this case, but all manipulation is disabled, i.e. + Privoxy acts like a normal forwarding proxy. There + is even a toggle Bookmarklet offered, so + that you can toggle Privoxy with one click from + your browser. + + + + + + + + + + +Configuration Files Overview - The reporting of FATAL errors (i.e. ones which crash - Privoxy) is always on and cannot be disabled. + For Unix, *BSD and Linux, all configuration files are located in + /etc/privoxy/ by default. For MS Windows, OS/2, and + AmigaOS these are all in the same directory as the + Privoxy executable. - If you want to use CLF (Common Log Format), you should set debug - 512 ONLY, do not enable anything else. + The installed defaults provide a reasonable starting point, though + some settings may be aggressive by some standards. For the time being, the + principal configuration files are: - Multiple debug directives, are OK - they're logical-OR'd - together. + + + + + The main configuration file is named config + on Linux, Unix, BSD, OS/2, and AmigaOS and config.txt + on Windows. This is a required file. + + + + + + default.action (the main actions file) + is used to define which actions relating to banner-blocking, images, pop-ups, + content modification, cookie handling etc. should be applied by default. It also defines many + exceptions (both positive and negative) from this default set of actions that enable + Privoxy to selectively eliminate the junk, and only the junk, on + as many websites as possible. + + + Multiple actions files may be defined in config.
These + are processed in the order they are defined. Local customizations and locally + preferred exceptions to the default policies as defined in + default.action (which you will most probably want + to define sooner or later) are probably best applied in + user.action, where you can preserve them across + upgrades. standard.action is only for + Privoxy's internal use. + + + There is also a web based editor that can be accessed from + http://config.privoxy.org/show-status + (Shortcut: http://p.p/show-status) for the + various actions files. + + + + + + Filter files (the filter + file) can be used to re-write the raw page content, including + viewable text as well as embedded HTML and JavaScript, and whatever else + lurks on any given web page. The filtering jobs are only pre-defined here; + whether to apply them or not is up to the actions files. + default.filter includes various filters made + available for use by the developers. Some are much more intrusive than + others, and all should be used with caution. You may define additional + filter files in config as you can with + actions files. We suggest user.filter for any + locally defined filters or customizations. + + + + - - - - debug 15 # same as setting the first 4 listed above - - - + The syntax of all configuration files has remained the same throughout the + 3.x series. There have been enhancements, but no changes that would preclude + the use of any configuration file from one version to the next. (There is + one exception: +fast-redirects which + has enhanced syntax and will require updating any local configs from earlier + versions.) - Default: + All files use the # character to denote a + comment (the rest of the line will be ignored) and understand line continuation + through placing a backslash ("\") as the very last character + in a line. If the # is preceded by a backslash, it loses + its special function.
Placing a # in front of an otherwise + valid configuration line to prevent it from being interpreted is called "commenting + out" that line. Blank lines are ignored. - - - - debug 1 # URLs - debug 4096 # Info - debug 8192 # Errors - *we highly recommended enabling this* - - - + The actions files and filter files + can use Perl style regular expressions for + maximum flexibility. - Privoxy normally uses - multi-threading, a software technique that permits it to - handle many different requests simultaneously. In some cases you may wish to - disable this -- particularly if you're trying to debug a problem. The - single-threaded option forces - Privoxy to handle requests sequentially. - Default: Multi-threaded mode. + After making any changes, there is no need to restart + Privoxy in order for the changes to take + effect. Privoxy detects such changes + automatically. Note, however, that it may take one or two additional + requests for the change to take effect. When changing the listening address + of Privoxy, these wake up requests + must obviously be sent to the old listening address. + - - - - #single-threaded - - - + While under development, the configuration content is subject to change. + The below documentation may not be accurate by the time you read this. + Also, what constitutes a default setting, may change, so + please check all your configuration files on important issues. +]]> + + + + + + + + + + + + &config; + + + + + + + + + +Actions Files - toggle allows you to temporarily disable all - Privoxy's filtering. Just set toggle - 0. - + The actions files are used to define what actions + Privoxy takes for which URLs, and thus determines + how ad images, cookies and various other aspects of HTTP content and + transactions are handled, and on which sites (or even parts thereof). + There are a number of such actions, with a wide range of functionality. + Each action does something a little different. 
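The comment and continuation rules described above (a `#` starts a comment, `\#` is a literal `#`, a trailing backslash continues the line, blank lines are ignored) can be modeled in a few lines. This is a simplified sketch and not Privoxy's actual parser. The function names are invented for the example, and edge cases may be handled differently by the real program.

```python
def strip_comment(line):
    """Drop everything after an unescaped '#'; turn an escaped one into a plain '#'."""
    out = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line) and line[i + 1] == "#":
            out.append("#")      # escaped: the '#' loses its special function
            i += 2
        elif ch == "#":
            break                # the rest of the line is a comment
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def logical_lines(text):
    """Join continued lines, remove comments, and skip blank lines."""
    result = []
    pending = ""
    for raw in text.splitlines():
        continued = raw.endswith("\\")   # backslash as the very last character
        if continued:
            raw = raw[:-1]
        pending += strip_comment(raw)
        if continued:
            continue
        line = pending.strip()
        pending = ""
        if line:
            result.append(line)
    if pending.strip():                  # file ended in the middle of a continuation
        result.append(pending.strip())
    return result


sample = "\n".join([
    "# a full-line comment",
    "",
    "confdir /etc/privoxy   # trailing comment",
    "actionsfile \\",
    "default.action",
    "example.org/page\\#anchor",
])
print(logical_lines(sample))
```

A "commented out" line is therefore simply a line that reduces to nothing and is skipped like a blank one.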
+ These actions give us a veritable arsenal of tools with which to exert + our control, preferences and independence. Actions can be combined so that + their effects are aggregated when applied against a given set of URLs. + + + There + are three action files included with Privoxy with + differing purposes: + + + + + + + default.action - is the primary action file + that sets the initial values for all actions. It is intended to + provide a base level of functionality for + Privoxy's array of features. So it is + a set of broad rules that should work reasonably well as-is for most users. + This is the file that the developers are keeping updated, and making available to users. + The user's preferences as set in standard.action, + e.g. either Cautious (the default), + Medium, or Advanced (see + below). + + + + + user.action - is intended to be for local site + preferences and exceptions. As an example, if your ISP or your bank + has specific requirements, and need special handling, this kind of + thing should go here. This file will not be upgraded. + + + + + standard.action - is used only by the web based editor + at + http://config.privoxy.org/edit-actions-list?f=default, + to set various pre-defined sets of rules for the default actions section + in default.action. + + + Edit Set to Cautious Set to Medium Set to Advanced + + + These have increasing levels of aggressiveness and have no + influence on your browsing unless you select them explicitly in the + editor. A default installation should be pre-set to + Cautious (versions prior to 3.0.5 were set to + Medium). New users should try this for a while before + adjusting the settings to more aggressive levels. The more aggressive + the settings, then the more likelihood there is of problems such as sites + not working as they should. + + + The Edit button allows you to turn each + action on/off individually for fine-tuning. 
The Cautious + button changes the actions list to low/safe settings which will activate + ad blocking and a minimal set of &my-app;'s features, and subsequently + there will be less of a chance for accidental problems. The + Medium button sets the list to a medium level of + other features and a low level set of privacy features. The + Advanced button sets the list to a high level of + ad blocking and medium level of privacy. See the chart below. The latter + three buttons over-ride any changes made with the + Edit button. More fine-tuning can be done in the + lower sections of this internal page. + + + It is not recommended to edit the standard.action file + itself. + + + The default profiles, and their associated actions, as pre-defined in + standard.action are: + + + Default Configurations +
+ Feature                      Cautious      Medium         Advanced
+ Ad-blocking Aggressiveness   medium        high           high
+ Ad-filtering by size         no            yes            yes
+ Ad-filtering by link         no            no             yes
+ Pop-up killing               blocks only   blocks only    blocks only
+ Privacy Features             low           medium         medium/high
+ Cookie handling              none          session-only   kill
+ Referer forging              no            yes            yes
+ GIF de-animation             no            yes            yes
+ Fast redirects               no            no             yes
+ HTML taming                  no            no             yes
+ JavaScript taming            no            no             yes
+ Web-bug killing              no            yes            yes
+ Image tag reordering         no            no             yes
+ + + + +
+
+ +
+
+
+ + + The list of actions files to be used is defined in the main configuration + file, and the files are processed in the order they are defined (e.g. + default.action is typically processed before + user.action). The content of these can all be viewed and + edited from http://config.privoxy.org/show-status. + The overriding principle when applying actions is that the last action that + matches a given URL wins. The broadest, most general rules go first + (defined in default.action), + followed by any exceptions (typically also in + default.action), which are then followed lastly by any + local preferences (typically in user.action). + Generally, user.action has the last word. + - The Windows version of Privoxy puts an icon in - the system tray, which also allows you to change this option. If you - right-click on that icon (or select the Options menu), one - choice is Enable. Clicking on enable toggles - Privoxy on and off. This is useful if you want - to temporarily disable Privoxy, e.g., to access - a site that requires cookies which you would otherwise have blocked. This can also - be toggled via a web browser at the Privoxy - internal address of http://p.p on - any platform. + An actions file typically has multiple sections. If you want to use + aliases in an actions file, you have to place the (optional) + alias section at the top of that file. + Then comes the default set of rules which will apply universally to all + sites and pages (be very careful with using such a + universal set in user.action or any other actions file after + default.action, because it will override the result + from consulting any previous file). And then below that, + exceptions to the defined universal policies. You can regard + user.action as an appendix to default.action, + with the advantage that it is a separate file, which makes preserving your + personal settings across Privoxy upgrades easier.
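The "last matching action wins" rule and the file ordering described above can be illustrated with a toy model. This is only a sketch under stated assumptions. Shell-style wildcards from Python's `fnmatch` stand in for Privoxy's real URL pattern syntax, the nested data layout is invented for the example, and `effective_actions` is a hypothetical helper. The hostnames are placeholders.

```python
from fnmatch import fnmatchcase


def effective_actions(url, actions_files):
    """Apply every matching section in order; later matches override earlier ones.

    actions_files is an ordered list of (filename, sections) pairs, and each
    section is a (patterns, actions) pair. The URL is given without "http://".
    """
    result = {}
    for _filename, sections in actions_files:
        for patterns, actions in sections:
            if any(fnmatchcase(url, pattern) for pattern in patterns):
                result.update(actions)   # the last matching setting wins
    return result


# Broad defaults first, then exceptions, then local preferences.
ACTIONS_FILES = [
    ("default.action", [
        (["*"], {"block": False, "session-cookies-only": True}),
        (["ads.example.com/*"], {"block": True}),
    ]),
    ("user.action", [
        (["ads.example.com/partner/*"], {"block": False}),
    ]),
]

for url in ("www.example.org/index.html",
            "ads.example.com/banner.gif",
            "ads.example.com/partner/logo.gif"):
    print(url, effective_actions(url, ACTIONS_FILES))
```

In this model the entry in `user.action` has the last word for the partner URL even though the broader `ads.example.com/*` rule in `default.action` also matches it. A universal `*` section placed in `user.action` would likewise override everything before it, which is why the text warns against doing that.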
- - toggle 1 means Privoxy runs - normally, toggle 0 means that - Privoxy becomes a non-anonymizing non-blocking - proxy. Default: 1 (on). + + Actions can be used to block anything you want, including ads, banners, or + just some obnoxious URL that you would rather not see. Cookies can be accepted + or rejected, or accepted only during the current browser session (i.e. not + written to disk), content can be modified, JavaScripts tamed, user-tracking + fooled, and much more. See below for a complete list + of actions. + + +Finding the Right Mix - - - - toggle 1 - - - + Note that some actions, like cookie suppression + or script disabling, may render some sites unusable that rely on these + techniques to work properly. Finding the right mix of actions is not always easy and + certainly a matter of personal taste. And, things can always change, requiring + refinements in the configuration. In general, it can be said that the more + aggressive your default settings (in the top section of the + actions file) are, the more exceptions for trusted sites you + will have to make later. If, for example, you want to crunch all cookies per + default, you'll have to make exceptions from that rule for sites that you + regularly use and that require cookies for actually useful purposes, like maybe + your bank, favorite shop, or newspaper. - For content filtering, i.e. the +filter and - +deanimate-gif actions, it is necessary that - Privoxy buffers the entire document body. - This can be potentially dangerous, since a server could just keep sending - data indefinitely and wait for your RAM to exhaust. With nasty consequences. + We have tried to provide you with reasonable rules to start from in the + distribution actions files. But there is no general rule of thumb on these + things. There just are too many variables, and sites are constantly changing. + Sooner or later you will want to change the rules (and read this chapter again :). 
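As a rough illustration of the "aggressive default plus exceptions" approach described above, a fragment of an actions file might look like the sketch below. The host names are placeholders, and the cookie action names (crunch-incoming-cookies, crunch-outgoing-cookies) are assumptions that do not come from this section, so check them against the actions list of your Privoxy version.

```
# Hypothetical sketch. Action names and hosts are assumptions.
# Default: crunch cookies everywhere ("/" matches all URLs).
{ +crunch-incoming-cookies +crunch-outgoing-cookies }
/

# Exceptions for trusted sites that need cookies to work.
{ -crunch-incoming-cookies -crunch-outgoing-cookies }
.bank.example.com
.shop.example.org
```

Because the last match wins, the exceptions have to come after the universal "/" section. Keep in mind the warning given earlier: a universal section placed in a file that is processed after default.action overrides the results of the earlier files.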
+ + + +How to Edit - The buffer-limit option lets you set the maximum - size in Kbytes that each buffer may use. When the documents buffer exceeds - this size, it is flushed to the client unfiltered and no further attempt to - filter the rest of it is made. Remember that there may multiple threads - running, which might require increasing the buffer-limit - Kbytes each, unless you have enabled - single-threaded above. + The easiest way to edit the actions files is with a browser by + using our browser-based editor, which can be reached from http://config.privoxy.org/show-status. + The editor allows both fine-grained control over every single feature on a + per-URL basis, and easy choosing from wholesale sets of defaults like + Cautious, Medium or Advanced. + Warning: the Advanced setting is more aggressive, and + will be more likely to cause problems for some sites. Experienced users only! - - - - buffer-limit 4069 - - - + If you prefer plain text editing to GUIs, you can of course also directly edit the + actions files with your favorite text editor. Look at + default.action, which is richly commented with many + good examples. + - - To enable the web-based default.action file editor set - enable-edit-actions to 1, or 0 to disable. Note - that you must have compiled Privoxy with - support for this feature, otherwise this option has no effect. This - internal page can be reached at http://p.p. - + +How Actions are Applied to Requests - Security note: If this is enabled, anyone who can use the proxy - can edit the actions file, and their changes will affect all users. - For shared proxies, you probably want to disable this. Default: enabled. + Actions files are divided into sections. There are special sections, + like the alias sections which will + be discussed later. 
For now let's concentrate on regular sections: they have a + heading line (often split up to multiple lines for readability) which consists + of a list of actions, separated by whitespace and enclosed in curly braces. + Below that, there is a list of URL and tag patterns, each on a separate line. - - - - enable-edit-actions 1 - - - + To determine which actions apply to a request, the URL of the request is + compared to all URL patterns in each actions file. + Every time it matches, the list of applicable actions for the request is + incrementally updated, using the heading of the section in which the + pattern is located. The same is done again for tags and tag patterns later on. - Allow Privoxy to be toggled on and off - remotely, using your web browser. Set enable-remote-toggleto - 1 to enable, and 0 to disable. Note that you must have compiled - Privoxy with support for this feature, - otherwise this option has no effect. + If multiple applying sections set the same action differently, + the last match wins. If not, the effects are aggregated. + E.g. a URL might match a regular section with a heading line of { + +handle-as-image }, + then later another one with just { + +block }, with the + result that both actions apply. And there may well be + cases where you will want to combine actions together. Such a section then + might look like: + + + { +handle-as-image +block } + # Block these as if they were images. Send no block page. + banners.example.com + media.example.com/.*banners + .example.com/images/ads/ + + - Security note: If this is enabled, anyone who can use the proxy can toggle - it on or off (see http://p.p), and - their changes will affect all users. For shared proxies, you probably want to - disable this. Default: enabled. + You can trace this process for URL patterns and any given URL by visiting http://config.privoxy.org/show-url-info. 
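The two rules described here (different actions are aggregated, and the last match wins when the same action is set differently) can be sketched with three sections. The host name and paths are placeholders chosen for illustration only:

```
# Hypothetical example; ads.example.com is a placeholder.
{ +handle-as-image }
ads.example.com/

{ +block }
ads.example.com/banners/

{ -block }
ads.example.com/banners/legal/
```

Under the rules above, a request for ads.example.com/banners/top.gif matches the first two sections, so both handle-as-image and block apply. A request for ads.example.com/banners/legal/terms.html matches all three sections: handle-as-image still applies, but block ends up disabled because the last matching section turns it off.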
- - - - enable-remote-toggle 1 - - - + Examples and more detail on this is provided in the Appendix, + Troubleshooting: Anatomy of an Action section. - - - - - + - - -Access Control List (ACL) - - Access controls are included at the request of some ISPs and systems - administrators, and are not usually needed by individual users. Please note - the warnings in the FAQ that this proxy is not intended to be a substitute - for a firewall or to encourage anyone to defer addressing basic security - weaknesses. + +Patterns + + As mentioned, Privoxy uses patterns + to determine what actions might apply to which sites and + pages your browser attempts to access. These patterns use wild + card type pattern matching to achieve a high degree of + flexibility. This allows one expression to be expanded and potentially match + against many similar patterns. - + - If no access settings are specified, the proxy talks to anyone that - connects. If any access settings file are specified, then the proxy - talks only to IP addresses permitted somewhere in this file and not - denied later in this file. + Generally, a URL pattern has the form + <domain>/<path>, where both the + <domain> and <path> are + optional. (This is why the special / pattern matches all + URLs). Note that the protocol portion of the URL pattern (e.g. + http://) should not be included in + the pattern. This is assumed already! - - Summary -- if using an ACL: + The pattern matching syntax is different for the domain and path parts of + the URL. The domain part uses a simple globbing type matching technique, + while the path part uses a more flexible + Regular + Expressions (PCRE) based syntax. - - - Client must have permission to receive service. - - - - - LAST match in ACL wins. - - - - - Default behavior is to deny service. - - + + + www.example.com/ + + + is a domain-only pattern and will match any request to www.example.com, + regardless of which document on that server is requested. 
So ALL pages in + this domain would be covered by the scope of this action. Note that a + simple example.com is different and would NOT match. + + + + + www.example.com + + + means exactly the same. For domain-only patterns, the trailing / may + be omitted. + + + + + www.example.com/index.html + + + matches only the single document /index.html + on www.example.com. + + + + + /index.html + + + matches the document /index.html, regardless of the domain, + i.e. on any web server anywhere. + + + + + index.html + + + matches nothing, since it would be interpreted as a domain name and + there is no top-level domain called .html. So it's + a mistake. + + + + - - The syntax for an entry in the Access Control List is: - - - - - - ACTION SRC_ADDR[/SRC_MASKLEN] [ DST_ADDR[/DST_MASKLEN] ] - - - - + +The Domain Pattern - Where the individual fields are: + The matching of the domain part offers some flexible options: if the + domain starts or ends with a dot, it becomes unanchored at that end. + For example: - - - - - ACTION = permit-access or deny-access + + + .example.com + + + matches any domain that ENDS in + .example.com. + + + + + www. + + + matches any domain that STARTS with + www. + + + + + .example. + + + matches any domain that CONTAINS .example.. + And, by the way, also included would be any files or documents that exist + within that domain since no path limitations are specified. (Correctly + speaking: It matches any FQDN that contains example as + a domain.) This might be www.example.com, + news.example.de, or + www.example.net/cgi/testing.pl for instance. All these + cases are matched. + + + + - SRC_ADDR = client hostname or dotted IP address - SRC_MASKLEN = number of bits in the subnet mask for the source + + Additionally, there are wild-cards that you can use in the domain names + themselves. 
These work similarly to shell globbing type wild-cards: + * represents zero or more arbitrary characters (this is + equivalent to the + Regular + Expression based syntax of .*), + ? represents any single character (this is equivalent to the + regular expression syntax of a simple .), and you can define + character classes in square brackets which is similar to + the same regular expression technique. All of this can be freely mixed: + + + + + ad*.example.com + + + matches adserver.example.com, + ads.example.com, etc but not sfads.example.com + + + + + *ad*.example.com + + + matches all of the above, and then some. + + + + + .?pix.com + + + matches www.ipix.com, + pictures.epix.com, a.b.c.d.e.upix.com etc. + + + + + www[1-9a-ez].example.c* + + + matches www1.example.com, + www4.example.cc, wwwd.example.cy, + wwwz.example.com etc., but not + wwww.example.com. + + + + - DST_ADDR = server or forwarder hostname or dotted IP address - DST_MASKLEN = number of bits in the subnet mask for the target - - - + + While flexible, this is not the sophistication of full regular expression based syntax. + - - The field separator (FS) is whitespace (space or tab). - + - - IMPORTANT NOTE: If Privoxy is using a - forwarder (see below) or a gateway for a particular destination URL, the - DST_ADDR that is examined is the address of the forwarder - or the gateway and NOT the address of the ultimate - target. This is necessary because it may be impossible for the local - Privoxy to determine the address of the - ultimate target (that's often what gateways are used for). - - - Here are a few examples to show how the ACL features work: - + +The Path Pattern - localhost is OK -- no DST_ADDR implies that - ALL destination addresses are OK: + Privoxy uses Perl compatible (PCRE) + Regular + Expression based syntax + (through the PCRE library) for + matching the path portion (after the slash), and is thus more flexible. 
- - - - permit-access localhost - - - + There is an Appendix with a brief quick-start into regular + expressions, and full (very technical) documentation on PCRE regex syntax is available on-line + at http://www.pcre.org/man.txt. + You might also find the Perl man page on regular expressions (man perlre) + useful, which is available on-line at http://perldoc.perl.org/perlre.html. - A silly example to illustrate permitting any host on the class-C subnet with - Privoxy to go anywhere: + Note that the path pattern is automatically left-anchored at the /, + i.e. it matches as if it would start with a ^ (regular expression speak + for the beginning of a line). - - - - permit-access www.privoxy.com/24 - - - + Please also note that matching in the path is CASE INSENSITIVE + by default, but you can switch to case sensitive at any point in the pattern by using the + (?-i) switch: www.example.com/(?-i)PaTtErN.* will match + only documents whose path starts with PaTtErN in + exactly this capitalization. - - Except deny one particular IP address from using it at all: - + + + .example.com/.* + + + Is equivalent to just .example.com, since any documents + within that domain are matched with or without the .* + regular expression. This is redundant + + + + + .example.com/.*/index.html + + + Will match any page in the domain of example.com that is + named index.html, and that is part of some path. For + example, it matches www.example.com/testing/index.html but + NOT www.example.com/index.html because the regular + expression called for at least two /'s, thus the path + requirement. It also would match + www.example.com/testing/index_html, because of the + special meta-character .. + + + + + .example.com/(.*/)?index\.html + + + This regular expression is conditional so it will match any page + named index.html regardless of path which in this case can + have one or more /'s. And this one must contain exactly + .html (but does not have to end with that!). 
+ + + + + .example.com/(.*/)(ads|banners?|junk) + + + This regular expression will match any path of example.com + that contains any of the words ads, banner, + banners (because of the ?) or junk. + The path does not have to end in these words, just contain them. + + + + + .example.com/(.*/)(ads|banners?|junk)/.*\.(jpe?g|gif|png)$ + + + This is very much the same as above, except now it must end in either + .jpg, .jpeg, .gif or .png. So this + one is limited to common image formats. + + + + - - - - deny-access ident.privoxy.com - - - + There are many, many good examples to be found in default.action, + and more tutorials below in Appendix on regular expressions. - - You can also specify an explicit network address and subnet mask. - Explicit addresses do not have to be resolved to be used. - + - - - - - permit-access 207.153.200.0/24 - - - - + - - A subnet mask of 0 matches anything, so the next line permits everyone. - + + +The Tag Pattern - - - - permit-access 0.0.0.0/0 - - - + Tag patterns are used to change the applying actions based on the + request's tags. Tags can be created with either the + client-header-tagger + or the server-header-tagger action. - Note, you cannot say: + Tag patterns have to start with TAG:, so &my-app; + can tell them apart from URL patterns. Everything after the colon + including white space, is interpreted as a regular expression with + path patterns syntax, except that tag patterns aren't left-anchored + automatically (Privoxy doesn't silently add a ^, + you have to do it yourself if you need it). - - - - permit-access .org - - - + To match all requests that are tagged with foo + your pattern line should be TAG:^foo$, + TAG:foo would work as well, but it would also + match requests whose tags contain foo somewhere. - to allow all *.org domains. Every IP address listed must resolve fully. 
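A small sketch of the difference between an anchored and an unanchored tag pattern. The tag name foo is the one used in the text; which tagger creates it is left open and simply assumed here:

```
# Applies only to requests carrying a tag that is exactly "foo".
{ +block }
TAG:^foo$

# Would also apply to tags such as "foobar" or "no-foo",
# because tag patterns are not anchored automatically.
{ +block }
TAG:foo
```

The comments sit on their own lines on purpose: everything after the colon, including white space, is treated as part of the regular expression.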
+ Sections can contain URL and tag patterns at the same time, + but tag patterns are checked after the URL patterns and thus + always overrule them, even if they are located before the URL patterns. - An ISP may want to provide a Privoxy that is - accessible by the world and yet restrict use of some of their - private content to hosts on its internal network (i.e. its own subscribers). - Say, for instance the ISP owns the Class-B IP address block 123.124.0.0 (a 16 - bit netmask). This is how they could do it: + Once a new tag is added, Privoxy checks right away if it's matched by one + of the tag patterns and updates the action settings accordingly. As a result + tags can be used to activate other tagger actions, as long as these other + taggers look for headers that haven't already been parsed. - - - - permit-access 0.0.0.0/0 0.0.0.0/0 # other clients can go anywhere - # with the following exceptions: - - deny-access 0.0.0.0/0 123.124.0.0/16 # block all external requests for - # sites on the ISP's network - - permit 0.0.0.0/0 www.my_isp.com # except for the ISP's main - # web site - - permit 123.124.0.0/16 0.0.0.0/0 # the ISP's clients can go - # anywhere - - - + For example, you could tag client requests which use the POST method, + use this tag to activate another tagger that adds a tag if cookies + are sent, and then block based on the cookie tag. However, if you + reversed the position of the described taggers, and activated the method + tagger based on the cookie tagger, no method tags would be created. + The method tagger would look for the request line, but at the time + the cookie tag is created, the request line has already been parsed. - Note that if some hostnames are listed with multiple IP addresses, - the primary value returned by DNS (via gethostbyname()) is used. Default: - Anyone can access the proxy. 
+ While this is a limitation you should be aware of, this kind of + indirection is seldom needed anyway and even the example doesn't + make too much sense. + + - -Forwarding - - - This feature allows chaining of HTTP requests via multiple proxies. - It can be used to better protect privacy and confidentiality when - accessing specific domains by routing requests to those domains - to a special purpose filtering proxy such as lpwa.com. Or to use - a caching proxy to speed up browsing. - - + +Actions - It can also be used in an environment with multiple networks to route - requests via multiple gateways allowing transparent access to multiple - networks without having to modify browser configurations. - + All actions are disabled by default, until they are explicitly enabled + somewhere in an actions file. Actions are turned on if preceded with a + +, and turned off if preceded with a -. So a + +action means do that action, e.g. + +block means please block URLs that match the + following patterns, and -block means don't + block URLs that match the following patterns, even if +block + previously applied. - - Also specified here are SOCKS proxies. Privoxy - SOCKS 4 and SOCKS 4A. The difference is that SOCKS 4A will resolve the target - hostname using DNS on the SOCKS server, not our local DNS client. - - The syntax of each line is: + + Again, actions are invoked by placing them on a line, enclosed in curly braces and + separated by whitespace, like in + {+some-action -some-other-action{some-parameter}}, + followed by a list of URL patterns, one per line, to which they apply. + Together, the actions line and the following pattern lines make up a section + of the actions file. 
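A minimal sketch of one complete section in the syntax just described. Both actions appear elsewhere in this manual; the patterns are placeholders:

```
# Heading line: actions in curly braces, separated by whitespace.
# "+" switches an action on, "-" switches it off.
{ +block -handle-as-image }
# Pattern lines: one per line; the heading applies to each of them.
.ads.example.net
www.example.org/banners/
```

Here matching URLs are blocked, and handle-as-image is switched off for them even if an earlier section had enabled it.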
- - - - - forward target_domain[:port] http_proxy_host[:port] - forward-socks4 target_domain[:port] socks_proxy_host[:port] http_proxy_host[:port] - forward-socks4a target_domain[:port] socks_proxy_host[:port] http_proxy_host[:port] - - - + + Actions fall into three categories: - If http_proxy_host is ., then requests are not forwarded to a - HTTP proxy but are made directly to the web servers. - + + + + Boolean, i.e the action can only be enabled or + disabled. Syntax: + + + + +name # enable action name + -name # disable action name + + + Example: +block + + - - Lines are checked in sequence, and the last match wins. - - - There is an implicit line equivalent to the following, which specifies that - anything not finding a match on the list is to go out without forwarding - or gateway protocol, like so: - + + + Parameterized, where some value is required in order to enable this type of action. + Syntax: + + + + +name{param} # enable action and set parameter to param, + # overwriting parameter from previous match if necessary + -name # disable action. The parameter can be omitted + + + Note that if the URL matches multiple positive forms of a parameterized action, + the last match wins, i.e. the params from earlier matches are simply ignored. + + + Example: +hide-user-agent{ Mozilla 1.0 } + + + + + + Multi-value. These look exactly like parameterized actions, + but they behave differently: If the action applies multiple times to the + same URL, but with different parameters, all the parameters + from all matches are remembered. This is used for actions + that can be executed for the same request repeatedly, like adding multiple + headers, or filtering through multiple filters. Syntax: + + + + +name{param} # enable action and add param to the list of parameters + -name{param} # remove the parameter param from the list of parameters + # If it was the last one left, disable the action. 
+ -name # disable this action completely and remove all parameters from the list + + + Examples: +add-header{X-Fun-Header: Some text} and + +filter{html-annoyances} + + - - - - - forward .* . # implicit - - - + - In the following common configuration, everything goes to Lucent's LPWA, - except SSL on port 443 (which it doesn't handle): + If nothing is specified in any actions file, no actions are + taken. So in this case Privoxy would just be a + normal, non-blocking, non-anonymizing proxy. You must specifically enable the + privacy and blocking features you need (although the provided default actions + files will give a good starting point). - - - - forward .* lpwa.com:8000 - forward :443 . - - - + Later defined actions always override earlier ones. So exceptions + to any rules you make should come in the latter part of the file (or + in a file that is processed later when using multiple actions files such + as user.action). For multi-valued actions, the actions + are applied in the order they are specified. Actions files are processed in + the order they are defined in config (the default + installation has three actions files). It is also quite possible for any given + URL to match more than one pattern (because of wildcards and + regular expressions), and thus to trigger more than one set of actions! Last + match wins. + - - Some users have reported difficulties related to LPWA's use of - . as the last element of the domain, and have said that this - can be fixed with this: - - - - - - - forward lpwa. lpwa.com:8000 - - - - - - - (NOTE: the syntax for specifying target_domain has changed since the - previous paragraph was written -- it will not work now. More information - is welcome.) + The valid Privoxy actions are: - - In this fictitious example, everything goes via an ISP's caching proxy, - except requests to that ISP: - - - - - - forward .* caching.myisp.net:8000 - forward myisp.net . 
- - - - + + + + + - - For the @home network, we're told the forwarding configuration is this: - + - - - - - forward .* proxy:8080 - - - - + +add-header - - Also, we're told they insist on getting cookies and JavaScript, so you should - allow cookies from home.com. We consider JavaScript a potential security risk. - Java need not be enabled. - + + + Typical use: + + Confuse log analysis, custom applications + + - - In this example direct connections are made to all internal - domains, but everything else goes through Lucent's LPWA by way of the - company's SOCKS gateway to the Internet. - + + Effect: + + + Sends a user defined HTTP header to the web server. + + + - - - - - forward-socks4 .* lpwa.com:8000 firewall.my_company.com:1080 - forward my_company.com . - - - - + + Type: + + + Multi-value. + + + + + Parameter: + + + Any string value is possible. Validity of the defined HTTP headers is not checked. + It is recommended that you use the X- prefix + for custom headers. + + + + + + Notes: + + + This action may be specified multiple times, in order to define multiple + headers. This is rarely needed for the typical user. If you don't know what + HTTP headers are, you definitely don't need to worry about this + one. + + + - - This is how you could set up a site that always uses SOCKS but no forwarders: - + + Example usage: + + + +add-header{X-User-Tracking: sucks} + + + + + - - - - - forward-socks4a .* . firewall.my_company.com:1080 - - - - - - An advanced example for network administrators: - + + +block - - If you have links to multiple ISPs that provide various special content to - their subscribers, you can configure forwarding to pass requests to the - specific host that's connected to that ISP so that everybody can see all - of the content on all of the ISPs. - + + + Typical use: + + Block ads or other unwanted content + + - - This is a bit tricky, but here's an example: - + + Effect: + + + Requests for URLs to which this action applies are blocked, i.e. 
the + requests are trapped by &my-app; and the requested URL is never retrieved, + but is answered locally with a substitute page or image, as determined by + the handle-as-image, + set-image-blocker, and + handle-as-empty-document actions. + + + + + + Type: + + + Boolean. + + - - host-a has a PPP connection to isp-a.com. And host-b has a PPP connection to - isp-b.com. host-a can run a Privoxy proxy with - forwarding like this: - + + Parameter: + + N/A + + + + + Notes: + + + Privoxy sends a special BLOCKED page + for requests to blocked pages. This page contains links to find out why the request + was blocked, and a click-through to the blocked content (the latter only if compiled with the + force feature enabled). The BLOCKED page adapts to the available + screen space -- it displays full-blown if space allows, or miniaturized and text-only + if loaded into a small frame or window. If you are using Privoxy + right now, you can take a look at the + BLOCKED + page. + + + A very important exception occurs if both + block and handle-as-image, + apply to the same request: it will then be replaced by an image. If + set-image-blocker + (see below) also applies, the type of image will be determined by its parameter, + if not, the standard checkerboard pattern is sent. + + + It is important to understand this process, in order + to understand how Privoxy deals with + ads and other unwanted content. Blocking is a core feature, and one + upon which various other features depend. + + + The filter + action can perform a very similar task, by blocking + banner images and other content through rewriting the relevant URLs in the + document's HTML source, so they don't get requested in the first place. + Note that this is a totally different technique, and it's easy to confuse the two. + + + - - - - - forward .* . 
- forward isp-b.com host-b:8118 - - - - + + Example usage (section): + + + {+block} +# Block and replace with "blocked" page + .nasty-stuff.example.com - - host-b can run a Privoxy proxy with forwarding - like this: - +{+block +handle-as-image} +# Block and replace with image + .ad.doubleclick.net + .ads.r.us/banners/ - - - - - forward .* . - forward isp-a.com host-a:8118 - - - - +{+block +handle-as-empty-document} +# Block and then ignore + adserver.exampleclick.net/.*\.js$ + + + - - Now, anyone on the Internet (including users on host-a - and host-b) can set their browser's proxy to either - host-a or host-b and be able to browse the content on isp-a or isp-b. - - - Here's another practical example, for University of Kent at - Canterbury students with a network connection in their room, who - need to use the University's Squid web cache. - + + - - - - - forward *. ssbcache.ukc.ac.uk:3128 # Use the proxy, except for: - forward .ukc.ac.uk . # Anything on the same domain as us - forward * . # Host with no domain specified - forward 129.12.*.* . # A dotted IP on our /16 network. - forward 127.*.*.* . # Loopback address - forward localhost.localdomain . # Loopback address - forward www.ukc.mirror.ac.uk . # Specific host - - - - - - If you intend to chain Privoxy and - squid locally, then chain as - browser -> squid -> privoxy is the recommended way. - + + +client-header-filter - - Your squid configuration could then look like this: - + + + Typical use: + + + Rewrite or remove single client headers. + + + - - - - - # Define Privoxy as parent cache - - cache_peer 127.0.0.1 parent 8118 0 no-query - - # Define ACL for protocol FTP - acl FTP proto FTP + + Effect: + + + All client headers to which this action applies are filtered on-the-fly through + the specified regular expression based substitutions. + + + - # Do not forward ACL FTP to privoxy - always_direct allow FTP + + Type: + + + Parameterized. 
+ + - # Do not forward ACL CONNECT (https) to privoxy - always_direct allow CONNECT + + Parameter: + + + The name of a client-header filter, as defined in one of the + filter files. + + + + + + Notes: + + + Client-header filters are applied to each header on its own, not to + all at once. This makes it easier to diagnose problems, but on the downside + you can't write filters that only change header x if header y's value is z. + You can do that by using tags though. + + + Client-header filters are executed after the other header actions have finished + and use their output as input. + + + Please refer to the filter file chapter + to learn which client-header filters are available by default, and how to + create your own. + - # Forward the rest to privoxy - never_direct allow all - - - - + + + Example usage (section): + + + +{+client-header-filter{hide-tor-exit-notation}} +.exit/ + + + + + + - - + +client-header-tagger - -Windows GUI Options - - - Privoxy has a number of options specific to the - Windows GUI interface: - + + + Typical use: + + + Block requests based on their headers. + + + - - If activity-animation is set to 1, the - Privoxy icon will animate when - Privoxy is active. To turn off, set to 0. - + + Effect: + + + Client headers to which this action applies are filtered on-the-fly through + the specified regular expression based substitutions, the result is used as + tag. + + + - - - - - activity-animation 1 - - - - + + Type: + + + Parameterized. + + - - If log-messages is set to 1, - Privoxy will log messages to the console - window: - + + Parameter: + + + The name of a client-header tagger, as defined in one of the + filter files. + + + + + + Notes: + + + Client-header taggers are applied to each header on its own, + and as the header isn't modified, each tagger sees + the original. + + + Client-header taggers are the first actions that are executed + and their tags can be used to control every other action. 
+ - - - - - log-messages 1 - - - - + - - If log-buffer-size is set to 1, the size of the log buffer, - i.e. the amount of memory used for the log messages displayed in the - console window, will be limited to log-max-lines (see below). - + + Example usage (section): + + + +# Tag every request with the User-Agent header +{+client-header-tagger{user-agent}} +/ + + + + + + + - - Warning: Setting this to 0 will result in the buffer to grow infinitely and - eat up all your memory! - - - - - - log-buffer-size 1 - - - - + + + +content-type-overwrite - - log-max-lines is the maximum number of lines held - in the log buffer. See above. - + + + Typical use: + + Stop useless download menus from popping up, or change the browser's rendering mode + + - - - - - log-max-lines 200 - - - - + + Effect: + + + Replaces the Content-Type: HTTP server header. + + + - - If log-highlight-messages is set to 1, - Privoxy will highlight portions of the log - messages with a bold-faced font: - + + Type: + + + Parameterized. + + - - - - - log-highlight-messages 1 - - - - + + Parameter: + + + Any string. + + + + + + Notes: + + + The Content-Type: HTTP server header is used by the + browser to decide what to do with the document. The value of this + header can cause the browser to open a download menu instead of + displaying the document by itself, even if the document's format is + supported by the browser. + + + The declared content type can also affect which rendering mode + the browser chooses. If XHTML is delivered as text/html, + many browsers treat it as yet another broken HTML document. + If it is sent as application/xml, browsers with + XHTML support will only display it if the syntax is correct. + + + If you see a web site that proudly uses XHTML buttons, but sets + Content-Type: text/html, you can use &my-app; + to overwrite it with application/xml and validate + the web master's claim inside your XHTML-supporting browser. 
+ If the syntax is incorrect, the browser will complain loudly. + + + You can also go the opposite direction: if your browser prints + error messages instead of rendering a document falsely declared + as XHTML, you can overwrite the content type with + text/html and have it rendered as a broken HTML document. + + + By default content-type-overwrite only replaces + Content-Type: headers that look like some kind of text. + If you want to overwrite it unconditionally, you have to combine it with + force-text-mode. + This limitation exists for a reason; think twice before circumventing it. + + + Most of the time it's easier to replace this action with a custom + server-header filter. + It allows you to activate it for every document of a certain site and it will still + only replace the content types you aimed at. + + + Of course you can apply content-type-overwrite + to a whole site and then make URL based exceptions, but it's a lot + more work to get the same precision. + + + - - The font used in the console window: - + + Example usage (sections): + + + # Check if www.example.net/ really uses valid XHTML +{ +content-type-overwrite{application/xml} } +www.example.net/ + +# but leave the content type unmodified if the URL looks like a style sheet +{-content-type-overwrite} +www.example.net/.*\.css$ +www.example.net/.*style + + + + + + - - - - - log-font-name Comic Sans MS - - - - - - Font size used in the console window: - + + + +crunch-client-header - - - - - log-font-size 8 - - - - + + + Typical use: + + Remove a client header Privoxy has no dedicated action for. + + - - show-on-task-bar controls whether or not - Privoxy will appear as a button on the Task bar - when minimized: - + + Effect: + + + Deletes every header sent by the client that contains the string the user supplied as parameter. + + + - - - - - show-on-task-bar 0 - - - - + + Type: + + + Parameterized.
+ + - - If close-button-minimizes is set to 1, the Windows close - button will minimize Privoxy instead of closing - the program (close with the exit option on the File menu). - + + Parameter: + + + Any string. + + + + + + Notes: + + + This action allows you to block client headers for which no dedicated + Privoxy action exists. + Privoxy will remove every client header that + contains the string you supplied as parameter. + + + Regular expressions are not supported and you can't + use this action to block different headers in the same request, unless + they contain the same string. + + + crunch-client-header is only meant for quick tests. + If you have to block several different headers, or only want to modify + parts of them, you should use a + client-header filter. + + + + Don't block any header without understanding the consequences. + + + + + + + Example usage (section): + + + # Block the non-existent "Privacy-Violation:" client header +{ +crunch-client-header{Privacy-Violation:} } +/ + + + + + + - - - - - close-button-minimizes 1 - - - - - - The hide-console option is specific to the MS-Win console - version of Privoxy. If this option is used, - Privoxy will disconnect from and hide the - command console. - + + +crunch-if-none-match + + + + Typical use: + + Prevent yet another way to track the user's steps between sessions. + + - - - - - #hide-console - - - - + + Effect: + + + Deletes the If-None-Match: HTTP client header. + + + - - + + Type: + + + Boolean. + + - + + Parameter: + + + N/A + + + + + + Notes: + + + Removing the If-None-Match: HTTP client header + is useful for filter testing, where you want to force a real + reload instead of getting status code 304 which + would cause the browser to use a cached copy of the page. + + + It is also useful to make sure the header isn't used as a cookie + replacement. + + + Blocking the If-None-Match: header shouldn't cause any + caching problems, as long as the If-Modified-Since: header + isn't blocked as well. 
+ + + It is recommended to use this action together with + hide-if-modified-since + and + overwrite-last-modified. + + + + + Example usage (section): + + + # Let the browser revalidate cached documents without being tracked across sessions +{ +hide-if-modified-since{-60} \ + +overwrite-last-modified{randomize} \ + +crunch-if-none-match} +/ + + + + + - - -The Actions File - - The default.action file (formerly - actionsfile or ijb.action) is used to define what actions - Privoxy takes, and thus determines how images, - cookies and various other aspects of HTTP content and transactions are - handled. Images can be anything you want, including ads, banners, or just - some obnoxious URL that you would rather not see. Cookies can be accepted - or rejected, or accepted only during the current browser session (i.e. - not written to disk). Changes to default.action should - be immediately visible to Privoxy without - the need to restart. - + + +crunch-incoming-cookies - - The easiest way to edit actions file is with a browser by - loading http://p.p/, and then select - Edit Actions List. A text editor can also be used. - + + + Typical use: + + + Prevent the web server from setting any cookies on your system + + + - - To determine which actions apply to a request, the URL of the request is - compared to all patterns in this file. Every time it matches, the list of - applicable actions for the URL is incrementally updated. You can trace - this process by visiting http://p.p/show-url-info. - + + Effect: + + + Deletes any Set-Cookie: HTTP headers from server replies. + + + + + Type: + + + Boolean. + + - - There are four types of lines in this file: comments (begin with a - # character), actions, aliases and patterns, all of which are - explained below, as well as the configuration file syntax that - Privoxy understands. + + Parameter: + + + N/A + + + + + + Notes: + + + This action is only concerned with incoming cookies. For + outgoing cookies, use + crunch-outgoing-cookies. 
+ Use both to disable cookies completely. + + + It makes no sense at all to use this action in conjunction + with the session-cookies-only action, + since it would prevent the session cookies from being set. See also + filter-content-cookies. + + + - + + Example usage: + + + +crunch-incoming-cookies + + + + + - -URL Domain and Path Syntax - - Generally, a pattern has the form <domain>/<path>, where both the - <domain> and <path> part are optional. If you only specify a - domain part, the / can be left out: - + +crunch-server-header + + + + Typical use: + + Remove a server header Privoxy has no dedicated action for. + + - - www.example.com - is a domain only pattern and will match any request to - www.example.com. - + + Effect: + + + Deletes every header sent by the server that contains the string the user supplied as parameter. + + + - - www.example.com/ - means exactly the same. - + + Type: + + + Parameterized. + + - - www.example.com/index.html - matches only the single - document /index.html on www.example.com. - + + Parameter: + + + Any string. + + + + + + Notes: + + + This action allows you to block server headers for which no dedicated + Privoxy action exists. Privoxy + will remove every server header that contains the string you supplied as parameter. + + + Regular expressions are not supported and you can't + use this action to block different headers in the same request, unless + they contain the same string. + + + crunch-server-header is only meant for quick tests. + If you have to block several different headers, or only want to modify + parts of them, you should use a custom + server-header filter. + + + + Don't block any header without understanding the consequences. + + + + + + + Example usage (section): + + + # Crunch server headers that try to prevent caching +{ +crunch-server-header{no-cache} } +/ + + + + + - - /index.html - matches the document /index.html, regardless of - the domain. 
- - - index.html - matches nothing, since it would be - interpreted as a domain name and there is no top-level domain called - .html. - + + +crunch-outgoing-cookies - - The matching of the domain part offers some flexible options: if the - domain starts or ends with a dot, it becomes unanchored at that end. - For example: - + + + Typical use: + + + Prevent the web server from reading any cookies from your system + + + - - .example.com - matches any domain that ENDS in - .example.com. - + + Effect: + + + Deletes any Cookie: HTTP headers from client requests. + + + - - www. - matches any domain that STARTS with - www. - + + Type: + + + Boolean. + + - - Additionally, there are wild-cards that you can use in the domain names - themselves. They work pretty similar to shell wild-cards: * - stands for zero or more arbitrary characters, ? stands for - any single character. And you can define character classes in square - brackets and they can be freely mixed: - + + Parameter: + + + N/A + + + + + + Notes: + + + This action is only concerned with outgoing cookies. For + incoming cookies, use + crunch-incoming-cookies. + Use both to disable cookies completely. + + + It makes no sense at all to use this action in conjunction + with the session-cookies-only action, + since it would prevent the session cookies from being read. + + + - - ad*.example.com - matches adserver.example.com, - ads.example.com, etc but not sfads.example.com. - + + Example usage: + + + +crunch-outgoing-cookies + + + - - *ad*.example.com - matches all of the above, and then some. - + + - - .?pix.com - matches www.ipix.com, - pictures.epix.com, a.b.c.d.e.upix.com, etc. - - - www[1-9a-ez].example.com - matches www1.example.com, - www4.example.com, wwwd.example.com, - wwwz.example.com, etc., but not - wwww.example.com. - + + +deanimate-gifs - - If Privoxy was compiled with - pcre support (default), Perl compatible regular expressions - can be used. 
See the pcre/docs/ directory or man - perlre (also available on http://www.perldoc.com/perl5.6/pod/perlre.html) - for details. A brief discussion of regular expressions is in the - Appendix. For instance: - + + + Typical use: + + Stop those annoying, distracting animated GIF images. + + - - /.*/advert[0-9]+\.jpe?g - would match a URL from any - domain, with any path that includes advert followed - immediately by one or more digits, then a . and ending in - either jpeg or jpg. So we match - example.com/ads/advert2.jpg, and - www.example.com/ads/banners/advert39.jpeg, but not - www.example.com/ads/banners/advert39.gif (no gifs in the - example pattern). - + + Effect: + + + De-animate GIF animations, i.e. reduce them to their first or last image. + + + + + + Type: + + + Parameterized. + + + + + Parameter: + + + last or first + + + + + + Notes: + + + This will also shrink the images considerably (in bytes, not pixels!). If + the option first is given, the first frame of the animation + is used as the replacement. If last is given, the last + frame of the animation is used instead, which probably makes more sense for + most banner animations, but also has the risk of not showing the entire + last frame (if it is only a delta to an earlier frame). + + + You can safely use this action with patterns that will also match non-GIF + objects, because no attempt will be made at anything that doesn't look like + a GIF. + + + + + + Example usage: + + + +deanimate-gifs{last} + + + + + + + + +downgrade-http-version + + + + Typical use: + + Work around (very rare) problems with HTTP/1.1 + + + + + Effect: + + + Downgrades HTTP/1.1 client requests and server replies to HTTP/1.0. + + + + + + Type: + + + Boolean. + + + + + Parameter: + + + N/A + + + + + + Notes: + + + This is a left-over from the time when Privoxy + didn't support important HTTP/1.1 features well. It is left here for the + unlikely case that you experience HTTP/1.1 related problems with some server + out there. 
Not all (optional) HTTP/1.1 features are supported yet, so there + is a chance you might need this action. + + + + + + Example usage (section): + + + {+downgrade-http-version} +problem-host.example.com + + + + + + + + + +fast-redirects + + + + Typical use: + + Fool some click-tracking scripts and speed up indirect links. + + + + + Effect: + + + Detects redirection URLs and redirects the browser without contacting + the redirection server first. + + + + + + Type: + + + Parameterized. + + + + + Parameter: + + + + + simple-check to just search for the string http:// + to detect redirection URLs. + + + + + check-decoded-url to decode URLs (if necessary) before searching + for redirection URLs. + + + + + + + + Notes: + + + Many sites, like yahoo.com, don't just link to other sites. Instead, they + will link to some script on their own servers, giving the destination as a + parameter, which will then redirect you to the final target. URLs + resulting from this scheme typically look like: + http://www.example.org/click-tracker.cgi?target=http%3a//www.example.net/. + + + Sometimes, there are even multiple consecutive redirects encoded in the + URL. These redirections via scripts make your web browsing more traceable, + since the server from which you follow such a link can see where you go + to. Apart from that, valuable bandwidth and time is wasted, while your + browser asks the server for one redirect after the other. Plus, it feeds + the advertisers. + + + This feature is currently not very smart and is scheduled for improvement. + If it is enabled by default, you will have to create some exceptions to + this action. It can lead to failures in several ways: + + + Not every URL with other URLs as parameters is evil. + Some sites offer a real service that requires this information to work. + For example, a validation service needs to know which document to validate.
+ fast-redirects assumes that every URL parameter that + looks like another URL is a redirection target, and will always redirect to + the last one. Most of the time the assumption is correct, but if it isn't, + the user gets redirected anyway. + + + Another failure occurs if the URL contains other parameters after the URL parameter. + The URL: + http://www.example.org/?redirect=http%3a//www.example.net/&foo=bar + contains the redirection URL http://www.example.net/, + followed by another parameter. fast-redirects doesn't know that + and will cause a redirect to http://www.example.net/&foo=bar. + Depending on the target server configuration, the parameter will be silently ignored + or lead to a page not found error. You can prevent this problem by + first using the redirect action + to remove the last part of the URL, but it requires a little effort. + + + To detect a redirection URL, fast-redirects only + looks for the string http://, either in plain text + (invalid but often used) or encoded as http%3a//. + Some sites use their own URL encoding scheme, encrypt the address + of the target server or replace it with a database id. In these cases + fast-redirects is fooled and the request reaches the + redirection server where it probably gets logged. + + + + + + Example usage: + + + + { +fast-redirects{simple-check} } + .example.com + + { +fast-redirects{check-decoded-url} } + another.example.com/testing + + + + + + + + + + +filter + + + + Typical use: + + Get rid of HTML and JavaScript annoyances, banner advertisements (by size), + do fun text replacements, add personalized effects, etc. + + + + + Effect: + + + All instances of text-based type, most notably HTML and JavaScript, to which + this action applies, can be filtered on-the-fly through the specified regular + expression based substitutions.
(Note: as of version 3.0.3 plain text documents + are exempted from filtering, because web servers often use the + text/plain MIME type for all files whose type they don't know.) + + + + + + Type: + + + Parameterized. + + + + + Parameter: + + + The name of a content filter, as defined in the filter file. + Filters can be defined in one or more files as defined by the + filterfile + option in the config file. + default.filter is the collection of filters + supplied by the developers. Locally defined filters should go + in their own file, such as user.filter. + + + When used in its negative form, + and without parameters, all filtering is completely disabled. + + + + + + Notes: + + + For your convenience, there are a number of pre-defined filters available + in the distribution filter file that you can use. See the examples below for + a list. + + + Filtering requires buffering the page content, which may appear to + slow down page rendering since nothing is displayed until all content has + passed the filters. (It does not really take longer, but seems that way + since the page is not incrementally displayed.) This effect will be more + noticeable on slower connections. + + + Rolling your own + filters requires a knowledge of + Regular + Expressions and + HTML. + This is a very powerful feature, and potentially very intrusive. + Filters should be used with caution, and only where an equivalent + action is not available. + + + The amount of data that can be filtered is limited by the + buffer-limit + option in the main config file. The + default is 4096 KB (4 Megs). Once this limit is exceeded, the buffered + data, and all pending data, is passed through unfiltered. + + + Inappropriate MIME types, such as zipped files, are not filtered at all. + (Again, only text-based types except plain text). Encrypted SSL data + (from HTTPS servers) cannot be filtered either, since this would violate + the integrity of the secure transaction.
In some situations it might + be necessary to protect certain text, like source code, from filtering + by defining appropriate -filter exceptions. + + + Compressed content can't be filtered either, unless &my-app; + is compiled with zlib support (requires at least &my-app; 3.0.7), + in which case &my-app; will decompress the content before filtering + it. + + + If you use a &my-app; version without zlib support, but want filtering to work on + as many documents as possible, even those that would normally be sent compressed, + you must use the prevent-compression + action in conjunction with filter. + + + Content filtering can achieve some of the same effects as the + block + action, i.e. it can be used to block ads and banners. But the mechanism + works quite differently. One effective use is to block ad banners + based on their size (see below), since many of these seem to be somewhat + standardized. + + + Feedback with suggestions for new or + improved filters is particularly welcome! + + + The below list has only the names and a one-line description of each + predefined filter. There are more + verbose explanations of what these filters do in the filter file chapter. + + + + + + Example usage (with filters from the distribution default.filter file). + See the Predefined Filters section for + more explanation on each: + + + + +filter{js-annoyances} # Get rid of particularly annoying JavaScript abuse + + + + +filter{js-events} # Kill all JS event bindings (Radically destructive! Only for extra nasty sites) + + + + +filter{html-annoyances} # Get rid of particularly annoying HTML abuse + + + + +filter{content-cookies} # Kill cookies that come in the HTML or JS content + + + + +filter{refresh-tags} # Kill automatic refresh tags (for dial-on-demand setups) + + + + +filter{unsolicited-popups} # Disable only unsolicited pop-up windows. Useful if your browser lacks this ability. + + + + +filter{all-popups} # Kill all popups in JavaScript and HTML.
Useful if your browser lacks this ability. + + + + +filter{img-reorder} # Reorder attributes in <img> tags to make the banners-by-* filters more effective + + + + +filter{banners-by-size} # Kill banners by size + + + + +filter{banners-by-link} # Kill banners by their links to known clicktrackers + + + + +filter{webbugs} # Squish WebBugs (1x1 invisible GIFs used for user tracking) + + + + +filter{tiny-textforms} # Extend those tiny textareas up to 40x80 and kill the hard wrap + + + + +filter{jumping-windows} # Prevent windows from resizing and moving themselves + + + + +filter{frameset-borders} # Give frames a border and make them resizeable + + + + +filter{demoronizer} # Fix MS's non-standard use of standard charsets + + + + +filter{shockwave-flash} # Kill embedded Shockwave Flash objects + + + + +filter{quicktime-kioskmode} # Make Quicktime movies savable + + + + +filter{fun} # Text replacements for subversive browsing fun! + + + + +filter{crude-parental} # Crude parental filtering (demo only) + + + + +filter{ie-exploits} # Disable some known Internet Explorer bug exploits + + + + +filter{site-specifics} # Custom filters for specific site related problems + + + + +filter{google} # Removes text ads and other Google specific improvements + + + + +filter{yahoo} # Removes text ads and other Yahoo specific improvements + + + + +filter{msn} # Removes text ads and other MSN specific improvements + + + + +filter{blogspot} # Cleans up Blogspot blogs + + + + +filter{no-ping} # Removes non-standard ping attributes from anchor and area tags + + + + + + + + + +force-text-mode + + + + Typical use: + + Force Privoxy to treat a document as if it was in some kind of text format. + + + + + Effect: + + + Declares a document as text, even if the Content-Type: isn't detected as such. + + + + + + Type: + + + Boolean. + + + + + Parameter: + + + N/A + + + + + + Notes: + + + As explained above, + Privoxy tries to only filter files that are + in some kind of text format. 
The same restrictions apply to + content-type-overwrite. + force-text-mode declares a document as text, + without looking at the Content-Type: first. + + + + Think twice before activating this action. Filtering binary data + with regular expressions can cause file damage. + + + + + + + Example usage: + + + ++force-text-mode + + + + + + + + + + +forward-override + + + + Typical use: + + Change the forwarding settings based on User-Agent or request origin + + + + + Effect: + + + Overrules the forward directives in the configuration files. + + + + + + Type: + + + Multi-value. + + + + + Parameter: + + + + forward . to use a direct connection without any additional proxies. + + + + forward 127.0.0.1:8123 to use the HTTP proxy listening at 127.0.0.1 port 8123. + + + + + forward-socks4a 127.0.0.1:9050 . to use the socks4a proxy listening at 127.0.0.1 port 9050. + Replace forward-socks4a with forward-socks4 to use a socks4 connection (with local DNS + resolution) instead. + + + + + forward-socks4a 127.0.0.1:9050 proxy.example.org:8000 to use the socks4a proxy + listening at 127.0.0.1 port 9050 to reach the HTTP proxy listening at proxy.example.org port 8000. + Replace forward-socks4a with forward-socks4 to use a socks4 connection (with local DNS + resolution) instead. + + + + + + + + Notes: + + + This action takes parameters similar to the + forward directives in the configuration + file, but without the URL pattern. It can be used as replacement, but normally it's only + used in cases where matching based on the request URL isn't sufficient. + + + + Please read the description for the forward directives before + using this action. Forwarding to the wrong people will reduce your privacy and increase the + chances of man-in-the-middle attacks. + + + If the ports are missing or invalid, default values will be used. This might change + in the future and you shouldn't rely on it. Otherwise incorrect syntax causes Privoxy + to exit. 
+ + + Use the show-url-info CGI page + to verify that your forward settings do what you think they do. + + + + + + + Example usage: + + + +# Always use direct connections for requests previously tagged as +# User-Agent: fetch libfetch/2.0 and make sure +# resuming downloads continues to work. +# This way you can continue to use Tor for your normal browsing, +# without overloading the Tor network with your FreeBSD ports updates +# or downloads of bigger files like ISOs. +{+forward-override{forward .} \ + -hide-if-modified-since \ + -overwrite-last-modified \ +} +TAG:^User-Agent: fetch libfetch/2.0$ + + + + + + + + + + +handle-as-empty-document + + + + Typical use: + + Mark URLs that should be replaced by empty documents if they get blocked + + + + + Effect: + + + This action alone doesn't do anything noticeable. It just marks URLs. + If the block action also applies, + the presence or absence of this mark decides whether an HTML BLOCKED + page, or an empty document will be sent to the client as a substitute for the blocked content. + The empty document isn't literally empty, but actually contains a single space. + + + + + + Type: + + + Boolean. + + + + + Parameter: + + + N/A + + + + + + Notes: + + + Some browsers complain about syntax errors if JavaScript documents + are blocked with Privoxy's + default HTML page; this option can be used to silence them. + And of course this action can also be used to eliminate the &my-app; + BLOCKED message in frames. + + + The content type for the empty document can be specified with + content-type-overwrite{}, + but usually this isn't necessary. + + + + + + Example usage: + + + # Block all documents on example.org that end with ".js", +# but send an empty document instead of the usual HTML message.
+{+block +handle-as-empty-document} +example.org/.*\.js$ + + + + + + + + + + +handle-as-image + + + + Typical use: + + Mark URLs as belonging to images (so they'll be replaced by images if they do get blocked, rather than HTML pages) + + + + + Effect: + + + This action alone doesn't do anything noticeable. It just marks URLs as images. + If the block action also applies, + the presence or absence of this mark decides whether an HTML blocked + page, or a replacement image (as determined by the set-image-blocker action) will be sent to the + client as a substitute for the blocked content. + + + + + + Type: + + + Boolean. + + + + + Parameter: + + + N/A + + + + + + Notes: + + + The below generic example section is actually part of default.action. + It marks all URLs with well-known image file name extensions as images and should + be left intact. + + + Users will probably only want to use the handle-as-image action in conjunction with + block, to block sources of banners, whose URLs don't + reflect the file type, like in the second example section. + + + Note that you cannot treat HTML pages as images in most cases. For instance, (in-line) ad + frames require an HTML page to be sent, or they won't display properly. + Forcing handle-as-image in this situation will not replace the + ad frame with an image, but lead to error messages. + + + + + + Example usage (sections): + + + # Generic image extensions: +# +{+handle-as-image} +/.*\.(gif|jpg|jpeg|png|bmp|ico)$ + +# These don't look like images, but they're banners and should be +# blocked as images: +# +{+block +handle-as-image} +some.nasty-banner-server.com/junk.cgi?output=trash + +# Banner source! Who cares if they also have non-image content? +ad.doubleclick.net + + + + + + + + + + +hide-accept-language + + + + Typical use: + + Pretend to use different language settings. + + + + + Effect: + + + Deletes or replaces the Accept-Language: HTTP header in client requests. + + + + + + Type: + + + Parameterized. 
+ + + + + Parameter: + + + Keyword: block, or any user defined value. + + + + + + Notes: + + + Faking the browser's language settings can be useful to make a + foreign User-Agent set with + hide-user-agent + more believable. + + + However, some sites with content in different languages check the + Accept-Language: header to decide which one to take by default. + Sometimes it isn't possible to later switch to another language without + changing the Accept-Language: header first. + + + Therefore it's a good idea to either only change the + Accept-Language: header to languages you understand, + or to languages that aren't widespread. + + + Before setting the Accept-Language: header + to a rare language, you should consider that it helps to + make your requests unique and thus easier to trace. + If you don't plan to change this header frequently, + you should stick to a common language. + + + + + + Example usage (section): + + + # Pretend to use Canadian language settings. +{+hide-accept-language{en-ca} \ ++hide-user-agent{Mozilla/5.0 (X11; U; OpenBSD i386; en-CA; rv:1.8.0.4) Gecko/20060628 Firefox/1.5.0.4} \ +} +/ + + + + + + + + + +hide-content-disposition + + + + Typical use: + + Prevent download menus for content you prefer to view inside the browser. + + + + + Effect: + + + Deletes or replaces the Content-Disposition: HTTP header set by some servers. + + + + + + Type: + + + Parameterized. + + + + + Parameter: + + + Keyword: block, or any user defined value. + + + + + + Notes: + + + Some servers set the Content-Disposition: HTTP header for + documents they assume you want to save locally before viewing them. + The Content-Disposition: header contains the file name + the browser is supposed to use by default. + + + In most browsers that understand this header, it makes it impossible to + just view the document, without downloading it first, + even if it's just a simple text file or an image.
+ + + Removing the Content-Disposition: header helps + to prevent this annoyance, but some browsers additionally check the + Content-Type: header, before they decide if they can + display a document without saving it first. In these cases, you have + to change this header as well, before the browser stops displaying + download menus. + + + It is also possible to change the server's file name suggestion + to another one, but in most cases it isn't worth the time to set + it up. + + + + + + Example usage: + + + # Disarm the download link in Sourceforge's patch tracker +{ -filter \ + +content-type-overwrite{text/plain}\ + +hide-content-disposition{block} } + .sourceforge.net/tracker/download\.php + + + + + + + + + +hide-if-modified-since + + + + Typical use: + + Prevent yet another way to track the user's steps between sessions. + + + + + Effect: + + + Deletes the If-Modified-Since: HTTP client header or modifies its value. + + + + + + Type: + + + Parameterized. + + + + + Parameter: + + + Keyword: block, or a user defined value that specifies a range of minutes. + + + + + + Notes: + + + Removing this header is useful for filter testing, where you want to force a real + reload instead of getting status code 304, which would cause the + browser to use a cached copy of the page. + + + Instead of removing the header, hide-if-modified-since can + also add or subtract a random amount of time to/from the header's value. + You specify a range of minutes where the random factor should be chosen from and + Privoxy does the rest. A negative value means + subtracting, a positive value adding. + + + Randomizing the value of the If-Modified-Since: header makes + sure it isn't used as a cookie replacement, but you will run into + caching problems if the random range is too high. + + + It is a good idea to only use a small negative value and let + overwrite-last-modified + handle the greater changes. + + + It is also recommended to use this action together with + crunch-if-none-match.
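The randomization described in the hide-if-modified-since notes can be sketched in Python as follows. This is an illustration under stated assumptions, not Privoxy's code: the function name `randomize_if_modified_since` is hypothetical, the range is taken to be in minutes as the notes say, and the random offset is assumed to be drawn uniformly between zero and the given range.

```python
import random
from datetime import timedelta
from email.utils import format_datetime, parsedate_to_datetime

def randomize_if_modified_since(value, range_minutes, rng=random):
    """Shift an If-Modified-Since date by a random offset within the range.

    A negative range subtracts time, a positive range adds time.
    """
    original = parsedate_to_datetime(value)
    # Assumption: offset is uniform between 0 and the range, in seconds.
    offset_seconds = rng.randint(0, abs(range_minutes) * 60)
    if range_minutes < 0:
        offset_seconds = -offset_seconds
    shifted = original + timedelta(seconds=offset_seconds)
    # Emit the usual HTTP date form ending in "GMT".
    return format_datetime(shifted, usegmt=True)

example = randomize_if_modified_since("Sat, 02 Jun 2007 14:01:37 GMT", -60)
```

With a small negative range such as -60, the forged date is never later than the real one, so the server can only answer with more data than strictly needed, never with a stale 304.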
+ + + + + + Example usage (section): + + + # Let the browser revalidate without being tracked across sessions +{ +hide-if-modified-since{-60} \ + +overwrite-last-modified{randomize} \ + +crunch-if-none-match} +/ + + + + + + + + + +hide-forwarded-for-headers + + + + Typical use: + + Improve privacy by hiding the true source of the request + + + + + Effect: + + + Deletes any existing X-Forwarded-for: HTTP header from client requests, + and prevents adding a new one. + + + + + + Type: + + + Boolean. + + + + + Parameter: + + + N/A + + + + + + Notes: + + + It is fairly safe to leave this on. + + + This action is scheduled for improvement: It should be able to generate forged + X-Forwarded-for: headers using random IP addresses from a specified network, + to make successive requests from the same client look like requests from a pool of different + users sharing the same proxy. + + + + + + Example usage: + + + +hide-forwarded-for-headers + + + + + + + + + +hide-from-header + + + + Typical use: + + Keep your (old and ill) browser from telling web servers your email address + + + + + Effect: + + + Deletes any existing From: HTTP header, or replaces it with the + specified string. + + + + + + Type: + + + Parameterized. + + + + + Parameter: + + + Keyword: block, or any user defined value. + + + + + + Notes: + + + The keyword block will completely remove the header + (not to be confused with the block + action). + + + Alternately, you can specify any value you prefer to be sent to the web + server. If you do, it is a matter of fairness not to use any address that + is actually used by a real person. + + + This action is rarely needed, as modern web browsers don't send + From: headers anymore. 
+ + + + + + Example usage: + + + +hide-from-header{block} or + +hide-from-header{spam-me-senseless@sittingduck.example.com} + + + + + + + + + +hide-referrer + + + + Typical use: + + Conceal which link you followed to get to a particular site + + + + + Effect: + + + Deletes the Referer: (sic) HTTP header from the client request, + or replaces it with a forged one. + + + + + + Type: + + + Parameterized. + + + + + Parameter: + + + + conditional-block to delete the header completely if the host has changed. + + + block to delete the header unconditionally. + + + forge to pretend to be coming from the homepage of the server we are talking to. + + + Any other string to set a user defined referrer. + + + + + + + Notes: + + + conditional-block is the only parameter + that isn't easily detected in the server's log file. If it blocks the + referrer, the request will look like the visitor used a bookmark or + typed in the address directly. + + + Leaving the referrer unmodified for requests on the same host + allows the server owner to see the visitor's click path, + but in most cases she could also get that information by comparing + other parts of the log file: for example the User-Agent if it isn't + a very common one, or the user's IP address if it doesn't change between + different requests. + + + Always blocking the referrer, or using a custom one, can lead to + failures on servers that check the referrer before they answer any + requests, in an attempt to prevent their valuable content from being + embedded or linked to elsewhere. + + + Both conditional-block and forge + will work with referrer checks, as long as content and valid referring page + are on the same host. Most of the time that's the case. + + + hide-referer is an alternate spelling of + hide-referrer and the two can be freely + substituted with each other. (referrer is the + correct English spelling, however the HTTP specification has a bug - it + requires it to be spelled as referer.)
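As a sketch of how these parameters might be combined, the sections below use conditional-block as the general setting and switch to forge for one site that checks referrers. The host name is a hypothetical placeholder and is not taken from any shipped actions file.

```
# General policy: drop the referrer whenever the host changes
{ +hide-referrer{conditional-block} }
/

# Hypothetical site that refuses requests without a matching referrer
{ +hide-referrer{forge} }
.referrer-checking-site.example.com
```

Because a later matching section overrides an earlier one, the second section changes the parameter for that host only.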
+ + + + + + Example usage: + + + +hide-referrer{forge} or + +hide-referrer{http://www.yahoo.com/} + + + + + + + + + +hide-user-agent + + + + Typical use: + + Conceal your type of browser and client operating system + + + + + Effect: + + + Replaces the value of the User-Agent: HTTP header + in client requests with the specified value. + + + + + + Type: + + + Parameterized. + + + + + Parameter: + + + Any user-defined string. + + + + + + Notes: + + + + This can lead to problems on web sites that depend on looking at this header in + order to customize their content for different browsers (which, by the + way, is NOT the right thing to do: good web sites + work browser-independently). + + + + + Using this action in multi-user setups or wherever different types of + browsers will access the same Privoxy is + not recommended. In single-user, single-browser + setups, you might use it to delete your OS version information from + the headers, because it is an invitation to exploit known bugs for your + OS. It is also occasionally useful to forge this in order to access + sites that won't let you in otherwise (though there may be a good + reason in some cases). Example of this: some MSN sites will not + let Mozilla enter, yet forging to a + Netscape 6.1 user-agent works just fine. + (Must be just a silly MS goof, I'm sure :-). + + + This action is scheduled for improvement. + + + + + + Example usage: + + + +hide-user-agent{Netscape 6.1 (X11; I; Linux 2.4.18 i686)} + + + + + + + + + +inspect-jpegs + + + + Typical use: + + To protect against the MS buffer over-run in JPEG processing + + + + + Effect: + + + Protect against a known exploit + + + + + + Type: + + + Boolean. + + + + + Parameter: + + + N/A + + + + + + Notes: + + + See Microsoft Security Bulletin MS04-028. JPEG images are one of the most + common image types found across the Internet. 
The exploit as described can + allow execution of code on the target system, giving an attacker access + to the system in question by merely planting an altered JPEG image, which + would have no obvious indications of what lurks inside. This action + prevents unwanted intrusion. + + + + + + + Example usage: + + +inspect-jpegs + + + + + + + + + + +kill-popups<anchor id="kill-popup"> + + + + Typical use: + + Eliminate those annoying pop-up windows (deprecated) + + + + + Effect: + + + While loading the document, replace JavaScript code that opens + pop-up windows with (syntactically neutral) dummy code on the fly. + + + + + + Type: + + + Boolean. + + + + + Parameter: + + + N/A + + + + + + Notes: + + + This action is basically a built-in, hardwired special-purpose filter + action, but there are important differences: For kill-popups, + the document need not be buffered, so it can be incrementally rendered while + downloading. But kill-popups doesn't catch as many pop-ups as + filter{all-popups} + does and is not as smart as filter{unsolicited-popups} + is. + + + Think of it as a fast and efficient replacement for a filter that you + can use if you don't want any filtering at all. Note that it doesn't make + sense to combine it with any filter action, + since as soon as one filter applies, + the whole document needs to be buffered anyway, which destroys the advantage of + the kill-popups action over its filter equivalent. + + + Killing all pop-ups unconditionally is problematic. Many shops and banks rely on + pop-ups to display forms, shopping carts etc, and the filter{unsolicited-popups} + does a better job of catching only the unwanted ones. + + + If the only kind of pop-ups that you want to kill are exit consoles (those + really nasty windows that appear when you close another + one), you might want to use + filter{js-annoyances} + instead. + + + This action is most appropriate for browsers that don't have any controls + for unwanted pop-ups.
Not recommended for general usage. + + + + + + + + Example usage: + + +kill-popups + + + + + + + + +limit-connect + + + + Typical use: + + Prevent abuse of Privoxy as a TCP proxy relay or disable SSL for untrusted sites + + + + + Effect: + + + Specifies to which ports HTTP CONNECT requests are allowable. + + + + + + Type: + + + Parameterized. + + + + + Parameter: + + + A comma-separated list of ports or port ranges (the latter using dashes, with the minimum + defaulting to 0 and the maximum to 65K). + + + + + + Notes: + + + By default, i.e. if no limit-connect action applies, + Privoxy only allows HTTP CONNECT + requests to port 443 (the standard, secure HTTPS port). Use + limit-connect if more fine-grained control is desired + for some or all destinations. + + + The CONNECT method exists in HTTP to allow access to secure websites + (https:// URLs) through proxies. It works very simply: + the proxy connects to the server on the specified port, and then + short-circuits its connections to the client and to the remote server. + This can be a big security hole, since CONNECT-enabled proxies can be + abused as TCP relays very easily. + + + Privoxy relays HTTPS traffic without seeing + the decoded content. Websites can leverage this limitation to circumvent &my-app;'s + filters. By specifying an invalid port range you can disable HTTPS entirely. + If you plan to disable SSL by default, consider enabling + treat-forbidden-connects-like-blocks + as well, to be able to quickly create exceptions. + + + + + + Example usages: + + + + + + +limit-connect{443} # This is the default and need not be specified. ++limit-connect{80,443} # Ports 80 and 443 are OK. ++limit-connect{-3, 7, 20-100, 500-} # Ports less than 3, 7, 20 to 100 and above 500 are OK.
++limit-connect{-} # All ports are OK ++limit-connect{,} # No HTTPS/SSL traffic is allowed + + + + + + + + +prevent-compression + + + + Typical use: + + + Ensure that servers send the content uncompressed, so it can be + passed through filters. + + + + + + Effect: + + + Removes the Accept-Encoding header which can be used to ask for compressed transfer. + + + + + + Type: + + + Boolean. + + + + + Parameter: + + + N/A + + + + + + Notes: + + + More and more websites send their content compressed by default, which + is generally a good idea and saves bandwidth. But the filter, deanimate-gifs + and kill-popups actions need + access to the uncompressed data. + + + When compiled with zlib support (available since &my-app; 3.0.7), content that should be + filtered is decompressed on-the-fly and you don't have to worry about this action. + If you are using an older &my-app; version, or one that hasn't been compiled with zlib + support, this action can be used to convince the server to send the content uncompressed. + + + Most text-based instances compress very well, the size is seldom decreased by less than 50%, + for markup-heavy instances like news feeds saving more than 90% of the original size isn't + unusual. + + + Not using compression will therefore slow down the transfer, and you should only + enable this action if you really need it. As of &my-app; 3.0.7 it's disabled in all + predefined action settings. + + + Note that some (rare) ill-configured sites don't handle requests for uncompressed + documents correctly. Broken PHP applications tend to send an empty document body, + some IIS versions only send the beginning of the content. If you enable + prevent-compression per default, you might want to add + exceptions for those sites. See the example for how to do that. + + + + + + Example usage (sections): + + + +# Selectively turn off compression, and enable a filter +# +{ +filter{tiny-textforms} +prevent-compression } +# Match only these sites + .google. 
+ sourceforge.net + sf.net + +# Or instead, we could set a universal default: +# +{ +prevent-compression } + / # Match all sites + +# Then maybe make exceptions for broken sites: +# +{ -prevent-compression } +.compusa.com/ + + + + + + + + + + +overwrite-last-modified + + + + Typical use: + + Prevent yet another way to track the user's steps between sessions. + + + + + Effect: + + + Deletes the Last-Modified: HTTP server header or modifies its value. + + + + + + Type: + + + Parameterized. + + + + + Parameter: + + + One of the keywords: block, reset-to-request-time + and randomize + + + + + + Notes: + + + Removing the Last-Modified: header is useful for filter + testing, where you want to force a real reload instead of getting status + code 304, which would cause the browser to reuse the old + version of the page. + + + The randomize option overwrites the value of the + Last-Modified: header with a randomly chosen time + between the original value and the current time. In theory the server + could send each document with a different Last-Modified: + header to track visits without using cookies. Randomize + makes this impossible and the browser can still revalidate cached documents. + + + reset-to-request-time overwrites the value of the + Last-Modified: header with the current time. You could use + this option together with + hide-if-modified-since + to further customize your random range. + + + The preferred parameter here is randomize. It is safe + to use, as long as the time settings are more or less correct. + If the server sets the Last-Modified: header to the time + of the request, the random range becomes zero and the value stays the same. + Therefore you should later randomize it a second time with + hide-if-modified-since, + just to be sure. + + + It is also recommended to use this action together with + crunch-if-none-match.
+ + + + + + Example usage: + + + # Let the browser revalidate without being tracked across sessions +{ +hide-if-modified-since{-60} \ + +overwrite-last-modified{randomize} \ + +crunch-if-none-match} +/ + + + + + + + + + +redirect + + + + Typical use: + + + Redirect requests to other sites. + + + + + + Effect: + + + Convinces the browser that the requested document has been moved + to another location and the browser should get it from there. + + + + + + Type: + + + Parameterized + + + + + Parameter: + + + An absolute URL or a single pcrs command. + + + + + + Notes: + + + Requests to which this action applies are answered with an + HTTP redirect to URLs of your choosing. The new URL is + either provided as parameter, or derived by applying a + single pcrs command to the original URL. + + + This action will be ignored if you use it together with + block. + It can be combined with + fast-redirects{check-decoded-url} + to redirect to a decoded version of a rewritten URL. + + + Use this action carefully, make sure not to create redirection loops + and be aware that using your own redirects might make it + possible to fingerprint your requests. + + + + + + Example usages: + + + # Replace example.com's style sheet with another one +{ +redirect{http://localhost/css-replacements/example.com.css} } + example.com/stylesheet\.css + +# Create a short, easy to remember nickname for a favorite site +# (relies on the browser to accept and forward invalid URLs to &my-app;) +{ +redirect{http://www.privoxy.org/user-manual/actions-file.html} } + a + +# Always use the expanded view for Undeadly.org articles +# (Note the $ at the end of the URL pattern to make sure +# the request for the rewritten URL isn't redirected as well) +{+redirect{s@$@&mode=expanded@}} +undeadly.org/cgi\?action=article&sid=\d*$ + + + + + + + + + + +send-vanilla-wafer + + + + Typical use: + + + Feed log analysis scripts with useless data.
+ + + + + + Effect: + + + Sends a cookie with each request stating that you do not accept any copyright + on cookies sent to you, and asking the site operator not to track you. + + + + + + Type: + + + Boolean. + + + + + Parameter: + + + N/A + + + + + + Notes: + + + The vanilla wafer is a (relatively) unique header and could conceivably be used to track you. + + + This action is rarely used and not enabled in the default configuration. + + + + + + Example usage: + + + +send-vanilla-wafer + + + + + + + + + + +send-wafer + + + + Typical use: + + + Send custom cookies or feed log analysis scripts with even more useless data. + + + + + + Effect: + + + Sends a custom, user-defined cookie with each request. + + + + + + Type: + + + Multi-value. + + + + + Parameter: + + + A string of the form name=value. + + + + + + Notes: + + + Being multi-valued, multiple instances of this action can apply to the same request, + resulting in multiple cookies being sent. + + + This action is rarely used and not enabled in the default configuration. + + + + + Example usage (section): + + + {+send-wafer{UsingPrivoxy=true}} +my-internal-testing-server.void + + + + + + + + + +server-header-filter + + + + Typical use: + + + Rewrite or remove single server headers. + + + + + + Effect: + + + All server headers to which this action applies are filtered on-the-fly + through the specified regular expression based substitutions. + + + + + + Type: + + + Parameterized. + + + + + Parameter: + + + The name of a server-header filter, as defined in one of the + filter files. + + + + + + Notes: + + + Server-header filters are applied to each header on its own, not to + all at once. This makes it easier to diagnose problems, but on the downside + you can't write filters that only change header x if header y's value is z. + You can do that by using tags though. + + + Server-header filters are executed after the other header actions have finished + and use their output as input. 
+ + + Please refer to the filter file chapter + to learn which server-header filters are available by default, and how to + create your own. + + + + + Example usage (section): + + + +{+server-header-filter{html-to-xml}} +example.org/xml-instance-that-is-delivered-as-html + +{+server-header-filter{xml-to-html}} +example.org/instance-that-is-delivered-as-xml-but-is-not + + + + + + + + + + + +server-header-tagger + + + + Typical use: + + + Enable or disable filters based on the Content-Type header. + + + + + + Effect: + + + Server headers to which this action applies are filtered on-the-fly through + the specified regular expression based substitutions; the result is used as a + tag. + + + + + + Type: + + + Parameterized. + + + + + Parameter: + + + The name of a server-header tagger, as defined in one of the + filter files. + + + + + + Notes: + + + Server-header taggers are applied to each header on its own, + and as the header isn't modified, each tagger sees + the original. + + + Server-header taggers are executed before all other header actions + that modify server headers. Their tags can be used to control + all of the other server-header actions, the content filters + and the crunch actions (redirect + and block). + + + Obviously crunching based on tags created by server-header taggers + doesn't prevent the request from showing up in the server's log file. + + + + + + Example usage (section): + + + +# Tag every request with the declared content type +{+server-header-tagger{content-type}} +/ + + + + + + + + + + + +session-cookies-only + + + + Typical use: + + + Allow only temporary session cookies (for the current + browser session only). + + + + + + Effect: + + + Deletes the expires field from Set-Cookie: + server headers. Most browsers will not store such cookies permanently and + forget them in between sessions. + + + + + + Type: + + + Boolean.
+ + + + + Parameter: + + + N/A + + + + + + Notes: + + + This is less strict than crunch-incoming-cookies / + crunch-outgoing-cookies and allows you to browse + websites that insist or rely on setting cookies, without compromising your privacy too badly. + + + Most browsers will not permanently store cookies that have been processed by + session-cookies-only and will forget about them between sessions. + This makes profiling cookies useless, but won't break sites which require cookies so + that you can log in for transactions. This is generally turned on for all + sites, and is the recommended setting. + + + It makes no sense at all to use session-cookies-only + together with crunch-incoming-cookies or + crunch-outgoing-cookies. If you do, cookies + will be plainly killed. + + + Note that it is up to the browser how it handles such cookies without an expires + field. If you use an exotic browser, you might want to try it out to be sure. + + + This setting also has no effect on cookies that may have been stored + previously by the browser before starting Privoxy. + These would have to be removed manually. + + + Privoxy also uses + the content-cookies filter + to block some types of cookies. Content cookies are not affected by + session-cookies-only. + + + + + + Example usage: + + + +session-cookies-only + + + + + + + + + +set-image-blocker + + + + Typical use: + + Choose the replacement for blocked images + + + + + Effect: + + + This action alone doesn't do anything noticeable. If both + block and handle-as-image also + apply, i.e. if the request is to be blocked as an image, + then the parameter of this action decides what will be + sent as a replacement. + + + + + + Type: + + + Parameterized. + + + + + Parameter: + + + + + pattern to send a built-in checkerboard pattern image. The image is visually + decent, scales very well, and makes it obvious where banners were busted. + + + + + blank to send a built-in transparent image.
This makes banners disappear + completely, but makes it hard to detect where Privoxy has blocked + images on a given page and complicates troubleshooting if Privoxy + has blocked innocent images, like navigation icons. + + + + + target-url to + send a redirect to target-url. You can redirect + to any image anywhere, even in your local filesystem via file:/// URL. + (But note that not all browsers support redirecting to a local file system). + + + A good application of redirects is to use special Privoxy-built-in + URLs, which send the built-in images, as target-url. + This has the same visual effect as specifying blank or pattern in + the first place, but enables your browser to cache the replacement image, instead of requesting + it over and over again. + + + + + + + + Notes: + + + The URLs for the built-in images are http://config.privoxy.org/send-banner?type=type, where type is + either blank or pattern. + + + There is a third (advanced) type, called auto. It is NOT to be + used in set-image-blocker, but meant for use from filters. + Auto will select the type of image that would have applied to the referring page, had it been an image. + + + + + + Example usage: + + + Built-in pattern: + + + +set-image-blocker{pattern} + + + Redirect to the BSD daemon: + + + +set-image-blocker{http://www.freebsd.org/gifs/dae_up3.gif} + + + Redirect to the built-in pattern for better caching: + + + +set-image-blocker{http://config.privoxy.org/send-banner?type=pattern} + + + + + + + + + +treat-forbidden-connects-like-blocks + + + + Typical use: + + Block forbidden connects with an easy to find error message. + + + + + Effect: + + + If this action is enabled, Privoxy no longer + makes a difference between forbidden connects and ordinary blocks. + + + + + + Type: + + + Boolean + + + + + Parameter: + + N/A + + + + + Notes: + + + By default Privoxy answers + forbidden Connect requests + with a short error message inside the headers. 
If the browser doesn't display + headers (most don't), you just see an empty page. + + + With this action enabled, Privoxy displays + the message that is used for ordinary blocks instead. If you decide + to make an exception for the page in question, you can do so by + following the See why link. + + + For Connect requests the clients tell + Privoxy which host they are interested + in, but not which document they plan to get later. As a result, the + Go there anyway wouldn't work and is therefore suppressed. + + + + + + Example usage: + + + +treat-forbidden-connects-like-blocks + + + + + + + + + +Summary + + Note that many of these actions have the potential to cause a page to + misbehave, possibly even not to display at all. There are many ways + a site designer may choose to design his site, and what HTTP header + content, and other criteria, he may depend on. There is no way to have hard + and fast rules for all sites. See the Appendix for a brief example on troubleshooting + actions. + + + + + + +Aliases + + Custom actions, known to Privoxy + as aliases, can be defined by combining other actions. + These can in turn be invoked just like the built-in actions. + Currently, an alias name can contain any character except space, tab, + =, + { and }, but we strongly + recommend that you only use a to z, + 0 to 9, +, and -. + Alias names are not case sensitive, and are not required to start with a + + or - sign, since they are merely textually + expanded. + + + Aliases can be used throughout the actions file, but they must be + defined in a special section at the top of the file! + And there can only be one such section per actions file. Each actions file may + have its own alias section, and the aliases defined in it are only visible + within that file. 
+ + + There are two main reasons to use aliases: One is to save typing for frequently + used combinations of actions, the other one is a gain in flexibility: If you + decide once how you want to handle shops by defining an alias called + shop, you can later change your policy on shops in + one place, and your changes will take effect everywhere + in the actions file where the shop alias is used. Calling aliases + by their purpose also makes your actions files more readable. + + + Currently, there is one big drawback to using aliases, though: + Privoxy's built-in web-based action file + editor honors aliases when reading the actions files, but it expands + them before writing. So the effects of your aliases are of course preserved, + but the aliases themselves are lost when you edit sections that use aliases + with it. + + + + Now let's define some aliases... + + + + + # Useful custom aliases we can use later. + # + # Note the (required!) section header line and that this section + # must be at the top of the actions file! + # + {{alias}} + + # These aliases just save typing later: + # (Note that some already use other aliases!) + # + +crunch-all-cookies = +crunch-incoming-cookies +crunch-outgoing-cookies + -crunch-all-cookies = -crunch-incoming-cookies -crunch-outgoing-cookies + +block-as-image = +block +handle-as-image + allow-all-cookies = -crunch-all-cookies -session-cookies-only -filter{content-cookies} + + # These aliases define combinations of actions + # that are useful for certain types of sites: + # + fragile = -block -filter -crunch-all-cookies -fast-redirects -hide-referrer -kill-popups -prevent-compression + + shop = -crunch-all-cookies -filter{all-popups} -kill-popups + + # Short names for other aliases, for really lazy people ;-) + # + c0 = +crunch-all-cookies + c1 = -crunch-all-cookies + + + + ...and put them to use. 
These sections would appear in the lower part of an + actions file and define exceptions to the default actions (as specified further + up for the / pattern): + + + + + # These sites are either very complex or very keen on + # user data and require minimal interference to work: + # + {fragile} + .office.microsoft.com + .windowsupdate.microsoft.com + # Gmail is really mail.google.com, not gmail.com + mail.google.com + + # Shopping sites: + # Allow cookies (for setting and retrieving your customer data) + # + {shop} + .quietpc.com + .worldpay.com # for quietpc.com + mybank.example.com + + # These shops require pop-ups: + # + {-kill-popups -filter{all-popups} -filter{unsolicited-popups}} + .dabs.com + .overclockers.co.uk + + + + Aliases like shop and fragile are typically used for + problem sites that require more than one action to be disabled + in order to function properly. + + + + + +Actions Files Tutorial + + The above chapters have shown which actions files + there are and how they are organized, how actions are specified and applied + to URLs, how patterns work, and how to + define and use aliases. Now, let's look at an + example default.action and user.action + file and see how all these pieces come together: + + +default.action + + +Every config file should start with a short comment stating its purpose: + + + + # Sample default.action file <ijbswa-developers@lists.sourceforge.net> + + + +Then, since this is the default.action file, the +first section is a special section for internal use that you needn't +change or worry about: + + + + +########################################################################## +# Settings -- Don't change! For internal Privoxy use ONLY. +########################################################################## + +{{settings}} +for-privoxy-version=3.0 + + + +After that comes the (optional) alias section. 
We'll use the example +section from the above chapter on aliases, +that also explains why and how aliases are used: + + + + +########################################################################## +# Aliases +########################################################################## +{{alias}} + + # These aliases just save typing later: + # (Note that some already use other aliases!) + # + +crunch-all-cookies = +crunch-incoming-cookies +crunch-outgoing-cookies + -crunch-all-cookies = -crunch-incoming-cookies -crunch-outgoing-cookies + +block-as-image = +block +handle-as-image + mercy-for-cookies = -crunch-all-cookies -session-cookies-only -filter{content-cookies} + + # These aliases define combinations of actions + # that are useful for certain types of sites: + # + fragile = -block -filter -crunch-all-cookies -fast-redirects -hide-referrer -kill-popups + shop = -crunch-all-cookies -filter{all-popups} -kill-popups + + + + Now come the regular sections, i.e. sets of actions, accompanied + by URL patterns to which they apply. Remember all actions + are disabled when matching starts, so we have to explicitly + enable the ones we want. + + + + The first regular section is probably the most important. It has only + one pattern, /, but this pattern + matches all URLs. Therefore, the + set of actions used in this default section will + be applied to all requests as a start. It can be partly or + wholly overridden by later matches further down this file, or in user.action, + but it will still be largely responsible for your overall browsing + experience. + + + + Again, at the start of matching, all actions are disabled, so there is + no real need to disable any actions here, but we will do that nonetheless, + to have a complete listing for your reference. (Remember: a + + preceding the action name enables the action, a - disables!). + Also note how this long line has been made more readable by splitting it into + multiple lines with line continuation. 
+ + + + +########################################################################## +# "Defaults" section: +########################################################################## + { \ + -add-header \ + -client-header-filter{hide-tor-exit-notation} \ + -block \ + -content-type-overwrite \ + -crunch-client-header \ + -crunch-if-none-match \ + -crunch-incoming-cookies \ + -crunch-server-header \ + -crunch-outgoing-cookies \ + +deanimate-gifs \ + -downgrade-http-version \ + -fast-redirects{check-decoded-url} \ + -filter{js-annoyances} \ + -filter{js-events} \ + +filter{html-annoyances} \ + -filter{content-cookies} \ + +filter{refresh-tags} \ + -filter{unsolicited-popups} \ + -filter{all-popups} \ + -filter{img-reorder} \ + -filter{banners-by-size} \ + -filter{banners-by-link} \ + +filter{webbugs} \ + -filter{tiny-textforms} \ + -filter{jumping-windows} \ + -filter{frameset-borders} \ + -filter{demoronizer} \ + -filter{shockwave-flash} \ + -filter{quicktime-kioskmode} \ + -filter{fun} \ + -filter{crude-parental} \ + +filter{ie-exploits} \ + -filter{google} \ + -filter{yahoo} \ + -filter{msn} \ + -filter{blogspot} \ + -filter{no-ping} \ + -force-text-mode \ + -handle-as-empty-document \ + -handle-as-image \ + -hide-accept-language \ + -hide-content-disposition \ + -hide-if-modified-since \ + +hide-forwarded-for-headers \ + +hide-from-header{block} \ + +hide-referrer{forge} \ + -hide-user-agent \ + -inspect-jpegs \ + -kill-popups \ + -limit-connect \ + +prevent-compression \ + -overwrite-last-modified \ + -redirect \ + -send-vanilla-wafer \ + -send-wafer \ + -server-header-filter{xml-to-html} \ + -server-header-filter{html-to-xml} \ + +session-cookies-only \ + +set-image-blocker{pattern} \ + -treat-forbidden-connects-like-blocks \ + } + / # forward slash will match *all* potential URL patterns. + + + + The default behavior is now set. 
Note that some actions, like not hiding + the user agent, are part of a general policy that applies + universally and won't get any exceptions defined later. Other choices, + like not blocking (which is understandably the + default!) need exceptions, i.e. we need to specify explicitly what we + want to block in later sections. + + + + The first of our specialized sections is concerned with fragile + sites, i.e. sites that require minimum interference, because they are either + very complex or very keen on tracking you (and have mechanisms in place that + make them unusable for people who avoid being tracked). We will simply use + our pre-defined fragile alias instead of stating the list + of actions explicitly: + + + + +########################################################################## +# Exceptions for sites that'll break under the default action set: +########################################################################## + +# "Fragile" Use a minimum set of actions for these sites (see alias above): +# +{ fragile } +.office.microsoft.com # surprise, surprise! +.windowsupdate.microsoft.com +mail.google.com + + + + Shopping sites are not as fragile, but they typically + require cookies to log in, and pop-up windows for shopping + carts or item details. Again, we'll use a pre-defined alias: + + + + +# Shopping sites: +# +{ shop } +.quietpc.com +.worldpay.com # for quietpc.com +.jungle.com +.scan.co.uk + + + + + + The fast-redirects + action, which we enabled per default above, breaks some sites. So disable + it for popular sites where we know it misbehaves: + + + + +{ -fast-redirects } +login.yahoo.com +edit.*.yahoo.com +.google.com +.altavista.com/.*(like|url|link):http +.altavista.com/trans.*urltext=http +.nytimes.com + + + + It is important that Privoxy knows which + URLs belong to images, so that if they are to + be blocked, a substitute image can be sent, rather than an HTML page. 
+ Contacting the remote site to find out is not an option, since it + would destroy the loading time advantage of banner blocking, and it + would feed the advertisers (in terms of money and + information). We can mark any URL as an image with the handle-as-image action, + and marking all URLs that end in a known image file extension is a + good start: + + + + +########################################################################## +# Images: +########################################################################## + +# Define which file types will be treated as images, in case they get +# blocked further down this file: +# +{ +handle-as-image } +/.*\.(gif|jpe?g|png|bmp|ico)$ + + + + And then there are known banner sources. They often use scripts to + generate the banners, so it won't be visible from the URL that the + request is for an image. Hence we block them and + mark them as images in one go, with the help of our + +block-as-image alias defined above. (We could of + course just as well use +block + +handle-as-image here.) + Remember that the type of the replacement image is chosen by the + set-image-blocker + action. Since all URLs have matched the default section with its + +set-image-blocker{pattern} + action before, it still applies and needn't be repeated: + + + + +# Known ad generators: +# +{ +block-as-image } +ar.atwola.com +.ad.doubleclick.net +.ad.*.doubleclick.net +.a.yimg.com/(?:(?!/i/).)*$ +.a[0-9].yimg.com/(?:(?!/i/).)*$ +bs*.gsanet.com +.qkimg.net + + + + One of the most important jobs of Privoxy + is to block banners. Many of these can be blocked + by the filter{banners-by-size} + action, which we enabled above, and which deletes the references to banner + images from the pages while they are loaded, so the browser doesn't request + them anymore, and hence they don't need to be blocked here. 
But this naturally + doesn't catch all banners, and some people choose not to use filters, so we + need a comprehensive list of patterns for banner URLs here, and apply the + block action to them. + + + First comes many generic patterns, which do most of the work, by + matching typical domain and path name components of banners. Then comes + a list of individual patterns for specific sites, which is omitted here + to keep the example short: + - Please note that matching in the path is case - INSENSITIVE by default, but you can switch to case - sensitive at any point in the pattern by using the - (?-i) switch: + +########################################################################## +# Block these fine banners: +########################################################################## +{ +block } + +# Generic patterns: +# +ad*. +.*ads. +banner?. +count*. +/.*count(er)?\.(pl|cgi|exe|dll|asp|php[34]?) +/(?:.*/)?(publicite|werbung|rekla(ma|me|am)|annonse|maino(kset|nta|s)?)/ + +# Site-specific patterns (abbreviated): +# +.hitbox.com - www.example.com/(?-i)PaTtErN.* - will match only - documents whose path starts with PaTtErN in - exactly this capitalization. + It's quite remarkable how many advertisers actually call their banner + servers ads.company.com, or call the directory + in which the banners are stored simply banners. So the above + generic patterns are surprisingly effective. + + + But being very generic, they necessarily also catch URLs that we don't want + to block. The pattern .*ads. e.g. catches + nasty-ads.nasty-corp.com as intended, + but also downloads.sourcefroge.net or + adsl.some-provider.net. So here come some + well-known exceptions to the +block + section above. + + + Note that these are exceptions to exceptions from the default! Consider the URL + downloads.sourcefroge.net: Initially, all actions are deactivated, + so it wouldn't get blocked. 
Then comes the defaults section, which matches the + URL, but just deactivates the block + action once again. Then it matches .*ads., an exception to the + general non-blocking policy, and suddenly + +block applies. And now, it'll match + .*loads., where -block + applies, so (unless it matches again further down) it ends up + with no block action applying. + + + + +########################################################################## +# Save some innocent victims of the above generic block patterns: +########################################################################## + +# By domain: +# +{ -block } +adv[io]*. # (for advogato.org and advice.*) +adsl. # (has nothing to do with ads) +adobe. # (has nothing to do with ads either) +ad[ud]*. # (adult.* and add.*) +.edu # (universities don't host banners (yet!)) +.*loads. # (downloads, uploads etc) + +# By path: +# +/.*loads/ + +# Site-specific: +# +www.globalintersec.com/adv # (adv = advanced) +www.ugu.com/sui/ugu/adv + + + + Filtering source code can have nasty side effects, + so make an exception for our friends at sourceforge.net, + and all paths with cvs in them. Note that + -filter + disables all filters in one fell swoop! + + + + +# Don't filter code! +# +{ -filter } +/(.*/)?cvs +bugzilla. +developer. +wiki. +.sourceforge.net + + + + The actual default.action is of course much more + comprehensive, but we hope this example made clear how it works. + + + + +user.action + + + So far we are painting with a broad brush by setting general policies, + which would be a reasonable starting point for many people. Now, + you might want to be more specific and have customized rules that + are more suitable to your personal habits and preferences. These would + be for narrowly defined situations like your ISP or your bank, and should + be placed in user.action, which is parsed after all other + actions files and hence has the last word, over-riding any previously + defined actions. 
user.action is also a + safe place for your personal settings, since + default.action is actively maintained by the + Privoxy developers and you'll probably want + to install updated versions from time to time. + + + + So let's look at a few examples of things that one might typically do in + user.action: + + + + + + + +# My user.action file. <fred@foobar.com> + + + + As aliases are local to the actions + file that they are defined in, you can't use the ones from + default.action, unless you repeat them here: + + + + +# Aliases are local to the file they are defined in. +# (Re-)define aliases for this file: +# +{{alias}} +# +# These aliases just save typing later, and the alias names should +# be self explanatory. +# ++crunch-all-cookies = +crunch-incoming-cookies +crunch-outgoing-cookies +-crunch-all-cookies = -crunch-incoming-cookies -crunch-outgoing-cookies + allow-all-cookies = -crunch-all-cookies -session-cookies-only + allow-popups = -filter{all-popups} -kill-popups ++block-as-image = +block +handle-as-image +-block-as-image = -block + +# These aliases define combinations of actions that are useful for +# certain types of sites: +# +fragile = -block -crunch-all-cookies -filter -fast-redirects -hide-referrer -kill-popups +shop = -crunch-all-cookies allow-popups + +# Allow ads for selected useful free sites: +# +allow-ads = -block -filter{banners-by-size} -filter{banners-by-link} + +# Alias for specific file types that are text, but might have conflicting +# MIME types. We want the browser to force these to be text documents. +handle-as-text = -filter +-content-type-overwrite{text/plain} +-force-text-mode -hide-content-disposition + + + + + Say you have accounts on some sites that you visit regularly, and + you don't want to have to log in manually each time. So you'd like + to allow persistent cookies for these sites. The + allow-all-cookies alias defined above does exactly + that, i.e. 
it disables crunching of cookies in any direction, and the + processing of cookies to make them only temporary. + + + + +{ allow-all-cookies } + sourceforge.net + .yahoo.com + .msdn.microsoft.com + .redhat.com + + + + Your bank is allergic to some filter, but you don't know which, so you disable them all: + + + + +{ -filter } + .your-home-banking-site.com + + + + Some file types you may not want to filter for various reasons: + + + + +# Technical documentation is likely to contain strings that might +# erroneously get altered by the JavaScript-oriented filters: +# +.tldp.org +/(.*/)?selfhtml/ + +# And this stupid host sends streaming video with a wrong MIME type, +# so that Privoxy thinks it is getting HTML and starts filtering: +# +stupid-server.example.com/ + + + + Example of a simple block action. Say you've + seen an ad on your favourite page on example.com that you want to get rid of. + You have right-clicked the image, selected copy image location + and pasted the URL below while removing the leading http://, into a + { +block } section. Note that { +handle-as-image + } need not be specified, since all URLs ending in + .gif will be tagged as images by the general rules as set + in default.action anyway: + + + + +{ +block } + www.example.com/nasty-ads/sponsor\.gif + another.popular.site.net/more/junk/here/ + + + + The URLs of dynamically generated banners, especially from large banner + farms, often don't use the well-known image file name extensions, which + makes it impossible for Privoxy to guess + the file type just by looking at the URL. + You can use the +block-as-image alias defined above for + these cases. + Note that objects which match this rule but then turn out NOT to be an + image are typically rendered as a broken image icon by the + browser. Use cautiously. 
+ + + + +{ +block-as-image } + .doubleclick.net + .fastclick.net + /Realmedia/ads/ + ar.atwola.com/ + + + + Now you noticed that the default configuration breaks Forbes Magazine, + but you were too lazy to find out which action is the culprit, and you + were again too lazy to give feedback, so + you just used the fragile alias on the site, and + -- whoa! -- it worked. The fragile + alias disables those actions that are most likely to break a site. It is + also good for testing whether it is Privoxy + that is causing the problem. We later find other regular sites + that misbehave, and add those to our personalized list of troublemakers: + + + + +{ fragile } + .forbes.com + webmail.example.com + .mybank.com + + + + You like the fun text replacements in default.filter, + but the filter is disabled in the distributed actions file. (My colleagues on the team just + don't have a sense of humour, that's why! ;-). So you'd like to turn it on in your private, + update-safe config, once and for all: + + + + +{ +filter{fun} } + / # For ALL sites! + + + + Note that the above is not really a good idea: There are exceptions + to the filters in default.action for things that + really shouldn't be filtered, like code on CVS->Web interfaces. Since + user.action has the last word, these exceptions + won't be valid for the fun filtering specified here. + + + + You might also worry about how your favourite free websites are + funded, and find that they rely on displaying banner advertisements + to survive. So you might want to specifically allow banners for those + sites that you feel provide value to you: + + + + +{ allow-ads } + .sourceforge.net + .slashdot.org + .osdn.net + + + + Note that allow-ads has been aliased to + -block, + -filter{banners-by-size}, and + -filter{banners-by-link} above. + + + + Invoke another alias here to force an over-ride of the MIME type + application/x-sh which typically would open a download type + dialog. 
In my case, I want to look at the shell script, and then I can save + it should I choose to. + + + + +{ handle-as-text } + /.*\.sh$ + + + + user.action is generally the best place to define + exceptions and additions to the default policies of + default.action. Some actions are safe to have their + default policies set here though. So let's set a default policy to have a + blank image as opposed to the checkerboard pattern for + ALL sites. / of course matches all URL + paths and patterns: + + + + +{ +set-image-blocker{blank} } +/ # ALL sites + +
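The override order used throughout this example (every section whose pattern matches a URL is applied in file order, and a later setting replaces an earlier one) can be sketched in a few lines of Python. This is only an illustrative model and not Privoxy's actual matcher: the names has_label, SECTIONS and resolve are made up for this sketch, and domain patterns are approximated by shell-style globs tested against each dot-separated label of the host name, which may differ from what Privoxy really does in corner cases.

```python
from fnmatch import fnmatchcase


def has_label(host, glob):
    """Return True if any dot-separated label of host matches the glob."""
    return any(fnmatchcase(label, glob) for label in host.lower().split("."))


# (settings, predicate) pairs in file order; a simplified stand-in for the
# defaults section, the generic +block patterns and the -block exceptions.
SECTIONS = [
    ({"block": False}, lambda host: True),                       # "/"
    ({"block": True}, lambda host: has_label(host, "*ads")),     # ".*ads."
    ({"block": False}, lambda host: has_label(host, "*loads")),  # ".*loads."
]


def resolve(host):
    """Apply every matching section in order; later settings win."""
    actions = {}
    for settings, matches in SECTIONS:
        if matches(host):
            actions.update(settings)
    return actions
```

With this model, resolve("downloads.sourcefroge.net") is first unblocked by the defaults, then blocked by the generic ads pattern, and finally unblocked again by the loads exception, which mirrors the walk-through of that URL given earlier. Appending another pair to SECTIONS plays the role of user.action: it is evaluated last and therefore has the last word.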
+ - + + + +Filter Files - -Actions - Actions are enabled if preceded with a +, and disabled if - preceded with a -. Actions are invoked by enclosing the - action name in curly braces (e.g. {+some_action}), followed by a list of - URLs to which the action applies. There are three classes of actions: + On-the-fly text substitutions need + to be defined in a filter file. Once defined, they + can then be invoked as an action. - + &my-app; supports three different filter actions: + filter to + rewrite the content that is sent to the client, + client-header-filter + to rewrite headers that are sent by the client, and + server-header-filter + to rewrite headers that are sent by the server. + - - - Boolean (e.g. +/-block): - - - - - - {+name} # enable this action - {-name} # disable this action - - - - - + + &my-app; also supports two tagger actions: + client-header-tagger + and + server-header-tagger. + Taggers and filters use the same syntax in the filter files; the difference + is that taggers don't modify the text they are filtering, but use a rewritten + version of the filtered text as a tag. The tags can then be used to change the + applying actions through sections with tag-patterns. + - - - parameterized (e.g. +/-hide-user-agent): - - - - - - {+name{param}} # enable action and set parameter to param - {-name} # disable action - - - - - + + Multiple filter files can be defined through the filterfile config directive. The filters + as supplied by the developers will be found in + default.filter. It is recommended that any locally + defined or modified filters go in a separately defined file such as + user.filter. - - - Multi-value (e.g. 
{+/-add-header{Name: value}}, {+/-wafer{name=value}}): - - - - - - {+name{param}} # enable action and add parameter param - {-name{param}} # remove the parameter param - {-name} # disable this action totally - - - - - + - + + Common tasks for content filters are to eliminate common annoyances in + HTML and JavaScript, such as pop-up windows, + exit consoles, crippled windows without navigation tools, the + infamous <BLINK> tag etc., to suppress images with certain + width and height attributes (standard banner sizes or web-bugs), + or just to have fun. + + + + Content filtering works on any text-based document type, including + HTML, JavaScript, CSS etc. (all text/* + MIME types, except text/plain). + Substitutions are made at the source level, so if you want to roll + your own filters, you should first be familiar with HTML syntax, + and, of course, regular expressions. + + + + Just like the actions files, the + filter file is organized in sections, which are called filters + here. Each filter consists of a heading line that starts with one of the + keywords FILTER:, + CLIENT-HEADER-FILTER: or SERVER-HEADER-FILTER: + followed by the filter's name, and a short (one line) + description of what it does. Below that line + come the jobs, i.e. lines that define the actual + text substitutions. By convention, the name of a filter + should describe what the filter eliminates. The + comment is used in the web-based + user interface. + + + + Once a filter called name has been defined + in the filter file, it can be invoked by using an action of the form + +filter{name} + in any actions file. + + + + Filter definitions start with a header line that contains the filter + type, the filter name and the filter description. + A content filter header line for a filter called foo could look + like this: + + + + FILTER: foo Replace all "foo" with "bar" + + + + Below that line, and up to the next header line, come the jobs that + define what text replacements the filter executes. 
They are specified + in a syntax that imitates Perl's + s/// operator. If you are familiar with Perl, you + will find this to be quite intuitive, and may want to look at the + PCRS documentation for the subtle differences to Perl behaviour. Most + notably, the non-standard option letter U is supported, + which turns the default to ungreedy matching. + + + + If you are new to + Regular + Expressions, you might want to take a look at + the Appendix on regular expressions, and + see the Perl + manual for + the + s/// operator's syntax and Perl-style regular + expressions in general. + The below examples might also help to get you started. + + + + + +Filter File Tutorial + + Now, let's complete our foo content filter. We have already defined + the heading, but the jobs are still missing. Since all it does is to replace + foo with bar, there is only one (trivial) job + needed: + + + + s/foo/bar/ + + + + But wait! Didn't the comment say that all occurrences + of foo should be replaced? Our current job will only take + care of the first foo on each page. For global substitution, + we'll need to add the g option: + + + + s/foo/bar/g + + + + Our complete filter now looks like this: + + + FILTER: foo Replace all "foo" with "bar" +s/foo/bar/g + + + + Let's look at some real filters for more interesting examples. Here you see + a filter that protects against some common annoyances that arise from JavaScript + abuse. Let's look at its jobs one after the other: + + + + + +FILTER: js-annoyances Get rid of particularly annoying JavaScript abuse + +# Get rid of JavaScript referrer tracking. Test page: http://www.randomoddness.com/untitled.htm +# +s|(<script.*)document\.referrer(.*</script>)|$1"Not Your Business!"$2|Usg + + + + Following the header line and a comment, you see the job. Note that it uses + | as the delimiter instead of /, because + the pattern contains a forward slash, which would otherwise have to be escaped + by a backslash (\). 
+ + + + Now, let's examine the pattern: it starts with the text <script.* + enclosed in parentheses. Since the dot matches any character, and * + means: Match an arbitrary number of the element left of myself, this + matches <script, followed by any text, i.e. + it matches the whole page, from the start of the first <script> tag. + + + + That's more than we want, but the pattern continues: document\.referrer + matches only the exact string document.referrer. The dot needed to + be escaped, i.e. preceded by a backslash, to take away its + special meaning as a joker, and make it just a regular dot. So far, the meaning is: + Match from the start of the first <script> tag in the page, up to, and including, + the text document.referrer, if both are present + in the page (and appear in that order). + + + + But there's still more pattern to go. The next element, again enclosed in parentheses, + is .*</script>. You already know what .* + means, so the whole pattern translates to: Match from the start of the first <script> + tag in a page to the end of the last </script> tag, provided that the text + document.referrer appears somewhere in between. + + + + This is still not the whole story, since we have ignored the options and the parentheses: + The portions of the page matched by sub-patterns that are enclosed in parentheses will be + remembered and be available through the variables $1, $2, ... in + the substitute. The U option switches to ungreedy matching, which means + that the first .* in the pattern will only eat up the + text in between <script and the first occurrence + of document.referrer, and that the second .* will + only span the text up to the first </script> + tag. Furthermore, the s option says that the match may span + multiple lines in the page, and the g option again means that the + substitution is global. + + + + So, to summarize, the pattern means: Match all scripts that contain the text + document.referrer. 
Remember the parts of the script from + (and including) the start tag up to (and excluding) the string + document.referrer as $1, and the part following + that string, up to and including the closing tag, as $2. + + + + Now the pattern is deciphered, but wasn't this about substituting things? So + let's look at the substitute: $1"Not Your Business!"$2 is + easy to read: The text remembered as $1, followed by + "Not Your Business!" (including + the quotation marks!), followed by the text remembered as $2. + This produces an exact copy of the original string, with the middle part + (the document.referrer) replaced by "Not Your + Business!". + + + + The whole job now reads: Replace document.referrer by + "Not Your Business!" wherever it appears inside a + <script> tag. Note that this job won't break JavaScript syntax, + since both the original and the replacement are syntactically valid + string objects. The script just won't have access to the referrer + information anymore. + + + + We'll show you two other jobs from the JavaScript taming department, but + this time only point out the constructs of special interest: + + + + +# The status bar is for displaying link targets, not pointless blahblah +# +s/window\.status\s*=\s*(['"]).*?\1/dUmMy=1/ig + + + + \s stands for whitespace characters (space, tab, newline, + carriage return, form feed), so that \s* means: zero + or more whitespace. The ? in .*? + makes this matching of arbitrary text ungreedy. (Note that the U + option is not set). The ['"] construct means: a single + or a double quote. Finally, \1 is + a back-reference to the first parenthesis just like $1 above, + with the difference that in the pattern, a backslash indicates + a back-reference, whereas in the substitute, it's the dollar. + + + + So what does this job do? 
It replaces assignments of single- or double-quoted + strings to the window.status object with a dummy assignment + (using a variable name that is hopefully odd enough not to conflict with + real variables in scripts). Thus, it catches many cases where e.g. pointless + descriptions are displayed in the status bar instead of the link target when + you move your mouse over links. + + + + +# Kill OnUnload popups. Yummy. Test: http://www.zdnet.com/zdsubs/yahoo/tree/yfs.html +# +s/(<body [^>]*)onunload(.*>)/$1never$2/iU + + + + Including the + OnUnload + event binding in the HTML DOM was a CRIME. + When I close a browser window, I want it to close and die. Basta. + This job replaces the onunload attribute in + <body> tags with the dummy word never. + Note that the i option makes the pattern matching + case-insensitive. Also note that ungreedy matching alone doesn't always guarantee + a minimal match: In the first parenthesis, we had to use [^>]* + instead of .* to prevent the match from exceeding the + <body> tag if it doesn't contain OnUnload, but the page's + content does. + + + + The last example is from the fun department: - If nothing is specified in this file, no actions are taken. - So in this case Privoxy would just be a - normal, non-blocking, non-anonymizing proxy. You must specifically - enable the privacy and blocking features you need (although the - provided default default.action file will - give a good starting point). + +FILTER: fun Fun text replacements + +# Spice the daily news: +# +s/microsoft(?!\.com)/MicroSuck/ig - Later defined actions always over-ride earlier ones. For multi-valued - actions, the actions are applied in the order they are specified. + Note the (?!\.com) part (a so-called negative lookahead) + in the job's pattern, which means: Don't match, if the string + .com appears directly following microsoft + in the page. This prevents links to microsoft.com from being trashed, while + still replacing the word everywhere else. 
- The list of valid Privoxy actions are: + +# Buzzword Bingo (example for extended regex syntax) +# +s* industry[ -]leading \ +| cutting[ -]edge \ +| customer[ -]focused \ +| market[ -]driven \ +| award[ -]winning # Comments are OK, too! \ +| high[ -]performance \ +| solutions[ -]based \ +| unmatched \ +| unparalleled \ +| unrivalled \ +*<font color="red"><b>BINGO!</b></font> \ +*igx - - - - - Add the specified HTTP header, which is not checked for validity. - You may specify this many times to specify many different headers: - - - - - - +add-header{Name: value} - - - - - - - - - - Block this URL totally. In a default installation, a blocked - URL will result in bright red banner that says BLOCKED, - with a reason why it is being blocked. - - - - - - +block - - - - - - - - - - De-animate all animated GIF images, i.e. reduce them to their last frame. - This will also shrink the images considerably (in bytes, not pixels!). If - the option first is given, the first frame of the animation - is used as the replacement. If last is given, the last frame - of the animation is used instead, which probably makes more sense for most - banner animations, but also has the risk of not showing the entire last - frame (if it is only a delta to an earlier frame). - - - - - - +deanimate-gifs{last} - +deanimate-gifs{first} - - - - - - - - - +downgrade will downgrade HTTP/1.1 client requests to - HTTP/1.0 and downgrade the responses as well. Use this action for servers - that use HTTP/1.1 protocol features that - Privoxy doesn't handle well yet. HTTP/1.1 - is only partially implemented. Default is not to downgrade requests. - - - - - - +downgrade - - - - - - - - - Many sites, like yahoo.com, don't just link to other sites. Instead, they - will link to some script on their own server, giving the destination as a - parameter, which will then redirect you to the final target. URLs resulting - from this scheme typically look like: - http://some.place/some_script?http://some.where-else. 
- - - Sometimes, there are even multiple consecutive redirects encoded in the - URL. These redirections via scripts make your web browsing more traceable, - since the server from which you follow such a link can see where you go to. - Apart from that, valuable bandwidth and time is wasted, while your browser - ask the server for one redirect after the other. Plus, it feeds the - advertisers. - - - The +fast-redirects option enables interception of these - types of requests by Privoxy, who will cut off - all but the last valid URL in the request and send a local redirect back to - your browser without contacting the intermediate site(s). - - - - - - +fast-redirects - - - - - - - - - Apply the filters in the section_header - section of the default.filter file to the site(s). - default.filter sections are grouped according to like - functionality. - - - - - - - +filter{section_header} - - - - + The x option in this job turns on extended syntax, and allows for + e.g. the liberal use of (non-interpreted!) whitespace for nicer formatting. + - - Filter sections that are pre-defined in the supplied - default.filter include: - + + You get the idea? + + -
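For readers who want to experiment with the jobs from this tutorial outside of Privoxy, here is a rough Python translation. It is a sketch under stated assumptions and not a description of how PCRS works internally: Python's re module has no U option, so the ungreedy jobs are rewritten with lazy quantifiers (.*? and [^>]*?), the s option is mapped to re.DOTALL, the i option to re.IGNORECASE, and a job without the g option is limited to one replacement with count=1. The function names are invented for this example.

```python
import re

# s|(<script.*)document\.referrer(.*</script>)|$1"Not Your Business!"$2|Usg
# U is emulated with lazy quantifiers, s with re.DOTALL; re.sub is global.
REFERRER = re.compile(r'(<script.*?)document\.referrer(.*?</script>)', re.DOTALL)

# s/window\.status\s*=\s*(['"]).*?\1/dUmMy=1/ig
# \1 inside the pattern refers back to whichever quote character was matched.
STATUS = re.compile(r'''window\.status\s*=\s*(['"]).*?\1''', re.IGNORECASE)

# s/(<body [^>]*)onunload(.*>)/$1never$2/iU
# [^>] keeps the first group from running past the end of the <body> tag.
ONUNLOAD = re.compile(r'(<body [^>]*?)onunload(.*?>)', re.IGNORECASE)

# s/microsoft(?!\.com)/MicroSuck/ig
# The negative lookahead rejects matches that are directly followed by ".com".
FUN = re.compile(r'microsoft(?!\.com)', re.IGNORECASE)


def hide_referrer(page):
    """Replace document.referrer inside script blocks with a string literal."""
    return REFERRER.sub(r'\1"Not Your Business!"\2', page)


def kill_status_text(page):
    """Replace quoted assignments to window.status with a dummy assignment."""
    return STATUS.sub('dUmMy=1', page)


def kill_onunload(page):
    """Rename the first onunload attribute found in a <body> tag."""
    return ONUNLOAD.sub(r'\1never\2', page, count=1)


def fun_filter(page):
    """Replace the word everywhere except directly in front of ".com"."""
    return FUN.sub('MicroSuck', page)
```

For instance, fun_filter("Microsoft bought it; see www.microsoft.com") leaves the host name alone and only changes the first word. Note that in Python the substitute uses \1 and \2 where the filter file uses $1 and $2.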
- - - html-annoyances: Get rid of particularly annoying HTML abuse. - - - - - js-annoyances: Get rid of particularly annoying JavaScript abuse - - - - - no-poups: Kill all popups in JS and HTML - - - - - frameset-borders: Give frames a border - - - - - webbugs: Squish WebBugs (1x1 invisible GIFs used for user tracking) - - - - - no-refresh: Automatic refresh sucks on auto-dialup lines - - - - - fun: Text replacements for subversive browsing fun! - - - - - nimda: Remove (virus) Nimda code. - - - - - banners-by-size: Kill banners by size - - - - - crude-parental: Kill all web pages that contain the words "sex" or "warez" - - -
+ - +The Pre-defined Filters - - - Block any existing X-Forwarded-for header, and do not add a new one: - - - - - - +hide-forwarded - - - - - + - - - - Treat this URL as an image. This only matters if it's also +blocked, - in which case a blocked image can be sent rather than a HTML page. - See +image-blocker{} below for the control over what is actually sent. - If you want invisible ads, they should be defined as - images and blocked. And also, - image-blocker should be set to blank. - - - - - - +image - - - - - - - - Decides what to do with URLs that end up tagged with {+block - +image}, e.g an advertizement. There are five options. - -image-blocker will send a HTML blocked page, - usually resulting in a broken image icon. - - - -+image-blocker{blank} will send a 1x1 transparent GIF -image. And finally, +image-blocker{http://xyz.com} will send a -HTTP temporary redirect to the specified image. This has the advantage of the -icon being being cached by the browser, which will speed up the display. -+image-blocker{pattern} will send a checkboard type pattern - - - - - - - - - - +image-blocker{blank} - +image-blocker{pattern} - +image-blocker{http://p.p/send-banner} - - - - - - - - - By default (i.e. in the absence of a +limit-connect - action), Privoxy will only allow CONNECT - requests to port 443, which is the standard port for https as a - precaution. - + +The distribution default.filter file contains a selection of +pre-defined filters for your convenience: + + + + + js-annoyances + + + The purpose of this filter is to get rid of particularly annoying JavaScript abuse. + To that end, it + + + + replaces JavaScript references to the browser's referrer information + with the string "Not Your Business!". This compliments the hide-referrer action on the content level. + + + + + removes the bindings to the DOM's + unload + event which we feel has no right to exist and is responsible for most exit consoles, i.e. + nasty windows that pop up when you close another one. 
+ + + + + removes code that causes new windows to be opened with undesired properties, such as being + full-screen, non-resizeable, without location, status or menu bar etc. + + + + + + Use with caution. This is an aggressive filter, and can break sites that + rely heavily on JavaScript. + + + - - The CONNECT methods exists in HTTP to allow access to secure websites - (https:// URLs) through proxies. It works very simply: the proxy - connects to the server on the specified port, and then short-circuits - its connections to the client and to the remote proxy. - This can be a big security hole, since CONNECT-enabled proxies can - be abused as TCP relays very easily. - - - - If you want to allow CONNECT for more ports than this, or want to forbid - CONNECT altogether, you can specify a comma separated list of ports and - port ranges (the latter using dashes, with the minimum defaulting to 0 and - max to 65K): - + + js-events + + + This is a very radical measure. It removes virtually all JavaScript event bindings, which + means that scripts can not react to user actions such as mouse movements or clicks, window + resizing etc, anymore. Use with caution! + + + We strongly discourage using this filter as a default since it breaks + many legitimate scripts. It is meant for use only on extra-nasty sites (should you really + need to go there). + + + - - - - - +limit-connect{443} # This is the default and need no be specified. - +limit-connect{80,443} # Ports 80 and 443 are OK. - +limit-connect{-3, 7, 20-100, 500-} # Port less than 3, 7, 20 to 100 - #and above 500 are OK. - - - - + + html-annoyances + + + This filter will undo many common instances of HTML based abuse. + + + The BLINK and MARQUEE tags + are neutralized (yeah baby!), and browser windows will be created as + resizeable (as of course they should be!), and will have location, + scroll and menu bars -- even if specified otherwise. + + + - - - - - +no-compression prevents the website from compressing the - data. 
Some websites do this, which can be a problem for - Privoxy, since +filter, - +no-popup and +gif-deanimate will not work on - compressed data. This will slow down connections to those websites, - though. Default is nocompression is turned on. - + + content-cookies + + + Most cookies are set in the HTTP dialog, where they can be intercepted + by the + crunch-incoming-cookies + and crunch-outgoing-cookies + actions. But web sites increasingly make use of HTML meta tags and JavaScript + to sneak cookies to the browser on the content level. + + + This filter disables most HTML and JavaScript code that reads or sets + cookies. It cannot detect all clever uses of these types of code, so it + should not be relied on as an absolute fix. Use it wherever you would also + use the cookie crunch actions. + + + - - - - - +nocompression - - - - - - - - - If the website sets cookies, no-cookies-keep will make sure - they are erased when you exit and restart your web browser. This makes - profiling cookies useless, but won't break sites which require cookies so - that you can log in for transactions. Default: on. - - - - - - +no-cookies-keep - - - - - - - - - Prevent the website from reading cookies: - - - - - - +no-cookies-read - - - - - - - - - Prevent the website from setting cookies: - - - - - - +no-cookies-set - - - - - - - - - Filter the website through a built-in filter to disable those obnoxious - JavaScript pop-up windows via window.open(), etc. The two alternative - spellings are equivalent. - - - - - - +no-popup - +no-popups - - - - - - - - - This action only applies if you are using a jarfile - for saving cookies. It sends a cookie to every site stating that you do not - accept any copyright on cookies sent to you, and asking them not to track - you. Of course, this is a (relatively) unique header they could use to - track you. - - - - - - +vanilla-wafer - - - - - - - - - This allows you to add an arbitrary cookie. 
It can be specified multiple - times in order to add as many cookies as you like. - - - - - - +wafer{name=value} - - - - - + + refresh tags + + + Disable any refresh tags if the interval is greater than nine seconds (so + that redirections done via refresh tags are not destroyed). This is useful + for dial-on-demand setups, or for those who find this HTML feature + annoying. + + + - - + + unsolicited-popups + + + This filter attempts to prevent only unsolicited pop-up + windows from opening, yet still allow pop-up windows that the user + has explicitly chosen to open. It was added in version 3.0.1, + as an improvement over earlier such filters. + + + Technical note: The filter works by redefining the window.open JavaScript + function to a dummy function, PrivoxyWindowOpen(), + during the loading and rendering phase of each HTML page access, and + restoring the function afterward. + + + This is recommended only for browsers that cannot perform this function + reliably themselves. And be aware that some sites require such windows + in order to function normally. Use with caution. + + + - - The meaning of any of the above is reversed by preceding the action with a - -, in place of the +. - + + all-popups + + + Attempt to prevent all pop-up windows from opening. + Note this should be used with even more discretion than the above, since + it is more likely to break some sites that require pop-ups for normal + usage. Use with caution. + + + - - Some examples: - + + img-reorder + + + This is a helper filter that has no value if used alone. It makes the + banners-by-size and banners-by-link + (see below) filters more effective and should be enabled together with them. + + + + + + banners-by-size + + + This filter removes image tags purely based on what size they are. Fortunately + for us, many ads and banner images tend to conform to certain standardized + sizes, which makes this filter quite effective for ad stripping purposes. 
+ + + Occasionally this filter will cause false positives on images that are not ads, + but just happen to be of one of the standard banner sizes. + + + Recommended only for those who require extreme ad blocking. The default + block rules should catch 95+% of all ads without this filter enabled. + + + - - Turn off cookies by default, then allow a few through for specified sites: - - - - - - - # Turn off all persistent cookies - { +no-cookies-read } - { +no-cookies-set } - # Allow cookies for this browser session ONLY - { +no-cookies-keep } - - # Exceptions to the above, sites that benefit from persistent cookies - { -no-cookies-read } - { -no-cookies-set } - { -no-cookies-keep } - .javasoft.com - .sun.com - .yahoo.com - .msdn.microsoft.com - .redhat.com + + banners-by-link + + + This is an experimental filter that attempts to kill any banners if + their URLs seem to point to known or suspected click trackers. It is currently + not of much value and is not recommended for use by default. + + + - # Alternative way of saying the same thing - {-no-cookies-set -no-cookies-read -no-cookies-keep} - .sourceforge.net - .sf.net - - - - + + webbugs + + + Webbugs are small, invisible images (technically 1X1 GIF images), that + are used to track users across websites, and collect information on them. + As an HTML page is loaded by the browser, an embedded image tag causes the + browser to contact a third-party site, disclosing the tracking information + through the requested URL and/or cookies for that third-party domain, without + the user ever becoming aware of the interaction with the third-party site. + HTML-ized spam also uses a similar technique to verify email addresses. + + + This filter removes the HTML code that loads such webbugs. 
+ + + - - Now turn off fast redirects, and then we allow two exceptions: - + + tiny-textforms + + + A rather special-purpose filter that can be used to enlarge textareas (those + multi-line text boxes in web forms) and turn off hard word wrap in them. + It was written for the sourceforge.net tracker system where such boxes are + a nuisance, but it can be handy on other sites, too. + + + It is not recommended to use this filter as a default. + + + - - - - - # Turn them off! - {+fast-redirects} - - # Reverse it for these two sites, which don't work right without it. - {-fast-redirects} - www.ukc.ac.uk/cgi-bin/wac\.cgi\? - login.yahoo.com - - - - + + jumping-windows + + + Many consider windows that move, or resize themselves to be abusive. This filter + neutralizes the related JavaScript code. Note that some sites might not display + or behave as intended when using this filter. Use with caution. + + + - - Turn on page filtering according to rules in the defined sections - of refilterfile, and make one exception for - sourceforge: - + + frameset-borders + + + Some web designers seem to assume that everyone in the world will view their + web sites using the same browser brand and version, screen resolution etc, + because only that assumption could explain why they'd use static frame sizes, + yet prevent their frames from being resized by the user, should they be too + small to show their whole content. + + + This filter removes the related HTML code. It should only be applied to sites + which need it. + + + - - - - - # Run everything through the filter file, using only the - # specified sections: - +filter{html-annoyances} +filter{js-annoyances} +filter{no-popups}\ - +filter{webbugs} +filter{nimda} +filter{banners-by-size} - - # Then disable filtering of code from sourceforge! - {-filter} - .cvs.sourceforge.net - - - - - - - Now some URLs that we want blocked, ie we won't see them. 
- Many of these use regular expressions that will expand to match multiple - URLs: - - - - - - - # Blocklist: - {+block} - /.*/(.*[-_.])?ads?[0-9]?(/|[-_.].*|\.(gif|jpe?g)) - /.*/(.*[-_.])?count(er)?(\.cgi|\.dll|\.exe|[?/]) - /.*/(ng)?adclient\.cgi - /.*/(plain|live|rotate)[-_.]?ads?/ - /.*/(sponsor)s?[0-9]?/ - /.*/_?(plain|live)?ads?(-banners)?/ - /.*/abanners/ - /.*/ad(sdna_image|gifs?)/ - /.*/ad(server|stream|juggler)\.(cgi|pl|dll|exe) - /.*/adbanners/ - /.*/adserver - /.*/adstream\.cgi - /.*/adv((er)?ts?|ertis(ing|ements?))?/ - /.*/banner_?ads/ - /.*/banners?/ - /.*/banners?\.cgi/ - /.*/cgi-bin/centralad/getimage - /.*/images/addver\.gif - /.*/images/marketing/.*\.(gif|jpe?g) - /.*/popupads/ - /.*/siteads/ - /.*/sponsor.*\.gif - /.*/sponsors?[0-9]?/ - /.*/advert[0-9]+\.jpg - /Media/Images/Adds/ - /ad_images/ - /adimages/ - /.*/ads/ - /bannerfarm/ - /grafikk/annonse/ - /graphics/defaultAd/ - /image\.ng/AdType - /image\.ng/transactionID - /images/.*/.*_anim\.gif # alvin brattli - /ip_img/.*\.(gif|jpe?g) - /rotateads/ - /rotations/ - /worldnet/ad\.cgi - /cgi-bin/nph-adclick.exe/ - /.*/Image/BannerAdvertising/ - /.*/ad-bin/ - /.*/adlib/server\.cgi - /autoads/ - - - - + + demoronizer + + + Many Microsoft products that generate HTML use non-standard extensions (read: + violations) of the ISO 8859-1 aka Latin-1 character set. This can cause those + HTML documents to display with errors on standard-compliant platforms. + + + This filter translates the MS-only characters into Latin-1 equivalents. + It is not necessary when using MS products, and will cause corruption of + all documents that use 8-bit character sets other than Latin-1. It's mostly + worthwhile for Europeans on non-MS platforms, if weird garbage characters + sometimes appear on some pages, or user agents that don't correct for this on + the fly. + + + + - - Note that many of these actions have the potential to cause a page to - misbehave, possibly even not to display at all. 
There are many ways - a site designer may choose to design his site, and what HTTP header - content he may depend on. There is no way to have hard and fast rules - for all sites. See the Appendix - for a brief example on troubleshooting actions. + + shockwave-flash + + + A filter for shockwave haters. As the name suggests, this filter strips code + out of web pages that is used to embed shockwave flash objects. + + + + + - + + quicktime-kioskmode + + + Change HTML code that embeds Quicktime objects so that kioskmode, which + prevents saving, is disabled. + + + -
+ + fun + + + Text replacements for subversive browsing fun. Make fun of your favorite + Monopolist or play buzzword bingo. + + + - + + crude-parental + + + A demonstration-only filter that shows how Privoxy + can be used to delete web content on a keyword basis. + + + + + ie-exploits + + + An experimental collection of text replacements to disable malicious HTML and JavaScript + code that exploits known security holes in Internet Explorer. + + + Presently, it only protects against Nimda and a cross-site scripting bug, and + would need active maintenance to provide more substantial protection. + + + - - -Aliases - - Custom actions, known to Privoxy - as aliases, can be defined by combining other actions. - These can in turn be invoked just like the built-in actions. - Currently, an alias can contain any character except space, tab, =, - { or }. But please use only a- - z, 0-9, +, and - -. Alias names are not case sensitive, and - must be defined before anything else in the - default.actionfile ! And there can only be one set of - aliases defined. - + + site-specifics + + + Some web sites have very specific problems, the cure for which doesn't apply + anywhere else, or could even cause damage on other sites. + + + This is a collection of such site-specific cures which should only be applied + to the sites they were intended for, which is what the supplied + default.action file does. Users shouldn't need to change + anything regarding this filter. + + + - - Now let's define a few aliases: - + + google + + + A CSS based block for Google text ads. Also removes a width limitation + and the toolbar advertisement. + + + + + + yahoo + + + Another CSS based block, this time for Yahoo text ads. And removes + a width limitation as well. + + + - - - - - # Useful customer aliases we can use later. These must come first! 
- {{alias}} - +no-cookies = +no-cookies-set +no-cookies-read - -no-cookies = -no-cookies-set -no-cookies-read - fragile = -block -no-cookies -filter -fast-redirects -hide-referer -no-popups - shop = -no-cookies -filter -fast-redirects - +imageblock = +block +image + + msn + + + Another CSS based block, this time for MSN text ads. And removes + tracking URLs, as well as a width limitation. + + + - #For people who don't like to type too much: ;-) - c0 = +no-cookies - c1 = -no-cookies - c2 = -no-cookies-set +no-cookies-read - c3 = +no-cookies-set -no-cookies-read - #... etc. Customize to your heart's content. - - - - + + blogspot + + + Cleans up some Blogspot blogs. Read the fine print before using this one! + + + This filter also intentionally removes some navigation stuff and sets the + page width to 100%. As a result, some rounded corners would + appear too early or not at all and as fixing this would require a browser + that understands background-size (CSS3), they are removed instead. + + + - - Some examples using our shop and fragile - aliases from above: - + + xml-to-html + + + Server-header filter to change the Content-Type from xml to html. + + + + + + html-to-xml + + + Server-header filter to change the Content-Type from html to xml. + + + - - - - - # These sites are very complex and require - # minimal interference. - {fragile} - .office.microsoft.com - .windowsupdate.microsoft.com - .nytimes.com + + no-ping + + + Removes the non-standard ping attribute from + anchor and area HTML tags. + + + - # Shopping sites - still want to block ads. - {shop} - .quietpc.com - .worldpay.com # for quietpc.com - .jungle.com - .scan.co.uk + + hide-tor-exit-notation + + + Client-header filter to remove the Tor exit node notation + found in Host and Referer headers. + + + If &my-app; and Tor are chained and &my-app; + is configured to use socks4a, one can use http://www.example.org.foobar.exit/ + to access the host www.example.org through the + Tor exit node foobar.
+ + + As the HTTP client isn't aware of this notation, it treats the + whole string www.example.org.foobar.exit as host and uses it + for the Host and Referer headers. From the + server's point of view the resulting headers are invalid and can cause problems. + + + An invalid Referer header can trigger hot-linking + protections, an invalid Host header will make it impossible for + the server to find the right vhost (several domains hosted on the same IP address). + + + This client-header filter removes the foo.exit part in those headers + to prevent the mentioned problems. Note that it only modifies + the HTTP headers, it doesn't make it impossible for the server + to detect your Tor exit node based on the IP address + the request is coming from. + + + - # These shops require pop-ups - {shop -no-popups} - .dabs.com - .overclockers.co.uk - - - - + + -
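The header rewrite described for hide-tor-exit-notation can be illustrated outside of Privoxy. The following is only a sketch in Python, not the rule shipped in the filter file: the pattern `EXIT_NOTATION` and the helper `strip_exit_notation` are hypothetical names, and the pattern assumes the notation always has the form `.<nodename>.exit` at the end of the host part.

```python
import re

# Assumption: the exit notation is a trailing ".<nodename>.exit" on the host,
# followed by the end of the value, a port colon, or a path slash.
EXIT_NOTATION = re.compile(r"\.[^./:]+\.exit(?=[:/]|$)", re.IGNORECASE)


def strip_exit_notation(header_value: str) -> str:
    """Remove a ".<nodename>.exit" suffix from a Host or Referer value."""
    return EXIT_NOTATION.sub("", header_value)


if __name__ == "__main__":
    print(strip_exit_notation("Host: www.example.org.foobar.exit"))
    print(strip_exit_notation("Referer: http://www.example.org.foobar.exit/page"))
```

As in the text above, this only cleans the headers; it does nothing to hide the exit node's IP address from the server.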
+ + - -The Filter File - - Any web page can be dynamically modified with the filter file. This - modification can be removal, or re-writing, of any web page content, - including tags and non-visible content. The default filter file is - default.filter, located in the config directory. - + +Privoxy's Template Files - The included example file is divided into sections. Each section begins - with the FILTER keyword, followed by the identifier - for that section, e.g. FILTER: webbugs. Each section performs - a similar type of filtering, such as html-annoyances. - + All Privoxy built-in pages, i.e. error pages such as the + 404 - No Such Domain + error page, the BLOCKED + page + and all pages of its web-based + user interface, are generated from templates. + (Privoxy must be running for the above links to work as + intended.) - This file uses regular expressions to alter or remove any string in the - target page. The expressions can only operate on one line at a time. Some - examples from the included default default.filter: + These templates are stored in a subdirectory of the configuration + directory called templates. On Unixish platforms, + this is typically + /etc/privoxy/templates/. - Stop web pages from displaying annoying messages in the status bar by - deleting such references: + The templates are basically normal HTML files, but with place-holders (called symbols + or exports), which Privoxy fills at run time. You can + edit the templates with a normal text editor, should you want to customize them. + (Not recommended for the casual user). Note that + just like in configuration files, lines starting with # are + ignored when the templates are filled in. - - - - FILTER: html-annoyances - - # New browser windows should be resizeable and have a location and status - # bar. Make it so. 
- # - s/resizable="?(no|0)"?/resizable=1/ig s/noresize/yesresize/ig - s/location="?(no|0)"?/location=1/ig s/status="?(no|0)"?/status=1/ig - s/scrolling="?(no|0|Auto)"?/scrolling=1/ig - s/menubar="?(no|0)"?/menubar=1/ig - - # The <BLINK> tag was a crime! - # - s*<blink>|</blink>**ig - - # Is this evil? - # - #s/framespacing="?(no|0)"?//ig - #s/margin(height|width)=[0-9]*//gi - - - + The place-holders are of the form @name@, and you will + find a list of available symbols, which vary from template to template, + in the comments at the start of each file. Note that these comments are not + always accurate, and that it's probably best to look at the existing HTML + code to find out which symbols are supported and what they are filled in with. - Just for kicks, replace any occurrence of Microsoft with - MicroSuck, and have a little fun with topical buzzwords: + A special application of this substitution mechanism is to make whole + blocks of HTML code disappear when a specific symbol is set. We use this + for many purposes, one of them being to include the beta warning in all + our user interface (CGI) pages when Privoxy + is in an alpha or beta development stage: - - - - FILTER: fun + +<!-- @if-unstable-start --> - s/microsoft(?!.com)/MicroSuck/ig + ... beta warning HTML code goes here ... 
- # Buzzword Bingo: - # - s/industry-leading|cutting-edge|award-winning/<font color=red><b>BINGO!</b></font>/ig - - - +<!-- if-unstable-end@ --> - Kill those pesky little web-bugs: + If the "unstable" symbol is set, everything in between and including + @if-unstable-start and if-unstable-end@ + will disappear, leaving nothing but an empty comment: - - - - # webbugs: Squish WebBugs (1x1 invisible GIFs used for user tracking) - FILTER: webbugs - - s/<img\s+[^>]*?(width|height)\s*=\s*['"]?1\D[^>]*?(width|height)\s*=\s*['"]?1(\D[^>]*?)?>/<!-- Squished WebBug -->/sig - - - + <!-- --> - - - - - - - - - -Templates - When Privoxy displays one of its internal - pages, such as a 404 Not Found error page, it uses the appropriate template. - On Linux, BSD, and Unix, these are located in - /etc/privoxy/templates by default. These may be - customized, if desired. + There's also an if-then-else construct and an #include + mechanism, but you'll sure find out if you are inclined to edit the + templates ;-) + + + All templates refer to a style located at + http://config.privoxy.org/send-stylesheet. + This is, of course, locally served by Privoxy + and the source for it can be found and edited in the + cgi-style.css template. - @@ -3241,144 +7760,56 @@ icon being being cached by the browser, which will speed up the display. Contacting the Developers, Bug Reporting and Feature Requests - -We value your feedback. However, to provide you with the best support, -please note: - - - - Use the Sourceforge support forum to get - help. - - Submit bugs only thru our Sourceforge bug - forum. -Make sure that the bug has not already been submitted. Please try to -verify that it is a Privoxy bug, and not -a browser or site bug first. If you are using your own custom configuration, -please try the stock configs to see if the problem is a configuration -related bug. And if not using the latest development snapshot, please -try the latest one. Or even better, CVS sources. 
- - - - Submit feature requests only thru our Sourceforge feature request forum. - - - - - - -For any other issues, feel free to use the mailing lists. - - - - Anyone interested in actively participating in development and related - discussions can join the appropriate mailing list - here. - Archives are available here too. - + + &contacting; + + + -Copyright and History - - -License - - Privoxy is free software; you can - redistribute it and/or modify it under the terms of the GNU General Public - License as published by the Free Software Foundation; either version 2 of the - License, or (at your option) any later version. - +Privoxy Copyright, License and History - - This program is distributed in the hope that it will be useful, but WITHOUT - ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS - FOR A PARTICULAR PURPOSE. See the GNU General Public License for more - details, which is available from the Free Software Foundation, - Inc, 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. - + + ©right; + + +License + + &license; + - - -History - - Privoxy is evolved, and derived from, - the Internet Junkbuster, with many - improvments and enhancements over the original. - - - - Junkbuster was originally written by Anonymous - Coders and Junkbuster's - Corporation, and was released as free open-source software under the - GNU GPL. Stefan - Waldherr made many improvements, and started the SourceForge project - Privoxy to rekindle development. There are now several active - developers contributing. The last stable release of - Junkbuster was v2.0.2, which has now - grown whiskers ;-). - +History + + &history; + + +Authors + + &p-authors; + - -See also - - - - -   http://sourceforge.net/projects/ijbswa, - the Project Page for Privoxy. 
- - - - -   http://www.privoxy.org/ - - - - -   http://p.p/ - - - - -   http://www.junkbusters.com/ht/en/cookies.html - - - - -   http://www.waldherr.org/junkbuster/ - - - - -   http://privacy.net/analyze/ - - - - -  http://www.squid-cache.org/ - - + - + + +See Also + + &seealso; + @@ -3391,30 +7822,33 @@ For any other issues, feel free to use the Regular Expressions - Privoxy can use regular expressions - in various config files. Assuming support for pcre (Perl - Compatible Regular Expressions) is compiled in, which is the default. Such - configuration directives do not require regular expressions, but they can be - used to increase flexibility by matching a pattern with wild-cards against - URLs. + Privoxy uses Perl-style regular + expressions in its actions + files and filter file, + through the PCRE and + + PCRS libraries. If you are reading this, you probably don't understand what regular expressions are, or what they can do. So this will be a very brief - introduction only. A full explanation would require a book ;-) + introduction only. A full explanation would require a book ;-) - Regular expressions is a way of matching one character - expression against another to see if it matches or not. One of the - expressions is a literal string of readable characters - (letter, numbers, etc), and the other is a complex string of literal - characters combined with wild-cards, and other special characters, called - meta-characters. The meta-characters have special meanings and - are used to build the complex pattern to be matched against. Perl Compatible - Regular Expressions is an enhanced form of the regular expression language - with backward compatibility. + Regular expressions provide a language to describe patterns that can be + run against strings of characters (letter, numbers, etc), to see if they + match the string or not. 
The patterns are themselves (sometimes complex) + strings of literal characters, combined with wild-cards, and other special + characters, called meta-characters. The meta-characters have + special meanings and are used to build complex patterns to be matched against. + Perl Compatible Regular Expressions are an especially convenient + dialect of the regular expression language. @@ -3435,72 +7869,71 @@ For any other issues, feel free to use the - A now something a little more complex: + And now something a little more complex: @@ -3566,7 +7999,7 @@ For any other issues, feel free to use the - - s/microsoft(?!.com)/MicroSuck/i - This is - a substitution. MicroSuck will replace any occurrence of - microsoft. The i at the end of the expression - means ignore case. The (?!.com) means - the match should fail if microsoft is followed by - .com. In other words, this acts like a NOT - modifier. In case this is a hyperlink, we don't want to break it ;-). - - We are barely scratching the surface of regular expressions here so that you can understand the default Privoxy @@ -3612,9 +8035,14 @@ For any other issues, feel free to use the http://www.perldoc.com/perl5.6/pod/perlre.html + http://perldoc.perl.org/perlre.html + + For information on regular expression based substitutions and their applications + in filters, please see the filter file tutorial + in this manual. + @@ -3622,7 +8050,7 @@ For any other issues, feel free to use the - Alternately, this may be reached at http://p.p/, but this - variation may not work as reliably as the above in some configurations. + There is a shortcut: http://p.p/ (But it + doesn't provide a fall-back to a real page, in case the request is not + sent through Privoxy) - Show information about the current configuration: + Show information about the current configuration, including viewing and + editing of actions files:
@@ -3687,7 +8116,7 @@ For any other issues, feel free to use the
- - - - Edit the actions list file: - -
- - http://config.privoxy.org/edit-actions - -
-
- These may be bookmarked for quick reference. + These may be bookmarked for quick reference. See next. Bookmarklets - Here are some bookmarklets to allow you to easily access a - mini version of this page. They are designed for MS Internet - Explorer, but should work equally well in Netscape, Mozilla, and other - browsers which support JavaScript. They are designed to run directly from - your bookmarks - not by clicking the links below (although that will work for - testing). + Below are some bookmarklets to allow you to easily access a + mini version of some of Privoxy's + special pages. They are designed for MS Internet Explorer, but should work + equally well in Netscape, Mozilla, and other browsers which support + JavaScript. They are designed to run directly from your bookmarks - not by + clicking the links below (although that should work for testing). To save them, right-click the link and choose Add to Favorites (IE) or Add Bookmark (Netscape). You will get a warning that the bookmark may not be safe - just click OK. Then you can run the - Bookmarklet directly from your favourites/bookmarks. For even faster access, + Bookmarklet directly from your favorites/bookmarks. For even faster access, you can put them on the Links bar (IE) or the Personal Toolbar (Netscape), and run them with a single click. @@ -3775,34 +8193,49 @@ For any other issues, feel free to use the Enable Privoxy + Privoxy - Enable - Disable Privoxy + Privoxy - Disable - Toggle Privoxy (Toggles between enabled and disabled) + Privoxy - Toggle Privoxy (Toggles between enabled and disabled) - View Privoxy Status + Privoxy - View Status + + + + + + Privoxy - Why? - - Credit: The site which gave me the general idea for these bookmarklets is - www.bookmarklets.com. They + Credit: The site which gave us the general idea for these bookmarklets is + www.bookmarklets.com. They have more information about bookmarklets.
@@ -3812,116 +8245,336 @@ For any other issues, feel free to use the +Chain of Events + + Let's take a quick look at how some of Privoxy's + core features are triggered, and the ensuing sequence of events when a web + page is requested by your browser: + + + + + + + First, your web browser requests a web page. The browser knows to send + the request to Privoxy, which will in turn, + relay the request to the remote web server after passing the following + tests: + + + + + Privoxy traps any request for its own internal CGI + pages (e.g http://p.p/) and sends the CGI page back to the browser. + + + + + Next, Privoxy checks to see if the URL + matches any +block patterns. If + so, the URL is then blocked, and the remote web server will not be contacted. + +handle-as-image + and + +handle-as-empty-document + are then checked, and if there is no match, an + HTML BLOCKED page is sent back to the browser. Otherwise, if + it does match, an image is returned for the former, and an empty text + document for the latter. The type of image would depend on the setting of + +set-image-blocker + (blank, checkerboard pattern, or an HTTP redirect to an image elsewhere). + + + + + Untrusted URLs are blocked. If URLs are being added to the + trust file, then that is done. + + + + + If the URL pattern matches the +fast-redirects action, + it is then processed. Unwanted parts of the requested URL are stripped. + + + + + Now the rest of the client browser's request headers are processed. If any + of these match any of the relevant actions (e.g. +hide-user-agent, + etc.), headers are suppressed or forged as determined by these actions and + their parameters. + + + + + Now the web server starts sending its response back (i.e. typically a web + page). + + + + + First, the server headers are read and processed to determine, among other + things, the MIME type (document type) and encoding. 
The headers are then + filtered as determined by the + +crunch-incoming-cookies, + +session-cookies-only, + and +downgrade-http-version + actions. + + + + + If the +kill-popups + action applies, and it is an HTML or JavaScript document, the popup-code in the + response is filtered on-the-fly as it is received. + + + + + If any +filter action + or +deanimate-gifs + action applies (and the document type fits the action), the rest of the page is + read into memory (up to a configurable limit). Then the filter rules (from + default.filter and any other filter files) are + processed against the buffered content. Filters are applied in the order + they are specified in one of the filter files. Animated GIFs, if present, + are reduced to either the first or last frame, depending on the action + setting. The entire page, which is now filtered, is then sent by + Privoxy back to your browser. + + + If neither a +filter action + nor +deanimate-gifs + matches, then Privoxy passes the raw data through + to the client browser as it becomes available. + + + + + As the browser receives the (possibly filtered) page content, it + reads and then requests any URLs that may be embedded within the page + source, e.g. ad images, stylesheets, JavaScript, other HTML documents (e.g. + frames), sounds, etc. For each of these objects, the browser issues a + separate request (this is easily viewable in Privoxy's + logs). And each such request is in turn processed just as above. Note that a + complex web page will have many, many such embedded URLs. If these + secondary requests are to a different server, then quite possibly a very + different set of actions is triggered. + + + + + + + NOTE: This is a somewhat simplistic overview of what happens with each URL + request. For the sake of brevity and simplicity, we have focused on + Privoxy's core features only. + +
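The sequence above can be condensed into a small decision function. This is a rough sketch in Python and not Privoxy's implementation: `handle_request`, the set-of-names representation of matching actions, and the returned labels are all hypothetical. It leaves out trust checks, fast-redirects and header processing, and it assumes that handle-as-image takes precedence when both handle-as-image and handle-as-empty-document match.

```python
def handle_request(url: str, actions: set) -> str:
    """Return a label for how a request would be answered.

    `actions` holds the names of the actions that match `url`
    (a hypothetical representation, for illustration only).
    """
    host = url.split("://", 1)[-1].split("/", 1)[0]

    # Requests for the internal CGI pages are answered by the proxy itself.
    if host in ("p.p", "config.privoxy.org"):
        return "cgi-page"

    # Blocked URLs never reach the remote server; the kind of reply
    # depends on the handle-as-* actions.
    if "block" in actions:
        if "handle-as-image" in actions:
            return "blocker-image"
        if "handle-as-empty-document" in actions:
            return "empty-document"
        return "blocked-page"

    # Content filters need the whole body buffered before it is sent on.
    if "filter" in actions or "deanimate-gifs" in actions:
        return "buffered-and-filtered"

    # Otherwise the data is relayed as it arrives.
    return "passed-through"


if __name__ == "__main__":
    print(handle_request("http://p.p/", set()))
    print(handle_request("http://ads.example.com/banner.gif",
                         {"block", "handle-as-image"}))
    print(handle_request("http://www.example.org/", {"filter"}))
```

Every embedded URL that the browser subsequently requests would go through the same function again, possibly with a different set of matching actions.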
+ + -Anatomy of an Action - - - The way Privoxy applies actions - to any given URL can be complex, and not always so easy to understand what - is happening. And sometimes we need to be able to see - just what Privoxy is doing. Especially, - if something Privoxy is doing is causing - us a problem inadvertantly. It can be a little daunting to look at - the actions files themselves, since they tend to be filled with - regular expressions whose consequences are not always - so obvious. Privoxy provides the - http://config.privoxy.org/show-url-info - page that can show us very specifically how actions - are being applied to any given URL. This is a big help for troubleshooting. - +Troubleshooting: Anatomy of an Action - First, enter one URL (or partial URL) at the prompt, and then - Privoxy will tell us - how the current configuration will handle it. This will not - help with filtering effects from the default.filter file! It - also will not tell you about any other URLs that may be embedded within the - URL you are testing. For instance, images such as ads are expressed as URLs - within the raw page source of HTML pages. So you will only get info for the - actual URL that is pasted into the prompt area -- not any sub-URLs. If you - want to know about embedded URLs like ads, you will have to dig those out of - the HTML source. Use your browser's View Page Source option - for this. + The way Privoxy applies + actions and filters + to any given URL can be complex, and not always so + easy to understand what is happening. And sometimes we need to be able to + see just what Privoxy is + doing. Especially, if something Privoxy is doing + is causing us a problem inadvertently. It can be a little daunting to look at + the actions and filters files themselves, since they tend to be filled with + regular expressions whose consequences are not + always so obvious. 
- Let's look at an example, google.com, - one section at a time: + One quick test to see if Privoxy is causing a problem + or not, is to disable it temporarily. This should be the first troubleshooting + step. See the Bookmarklets section on a quick + and easy way to do this (be sure to flush caches afterward!). Looking at the + logs is a good idea too. + + + Another easy troubleshooting step to try is if you have done any + customization of your installation, revert back to the installed + defaults and see if that helps. There are times the developers get complaints + about one thing or another, and the problem is more related to a customized + configuration issue. - - System default actions: + Privoxy also provides the + http://config.privoxy.org/show-url-info + page that can show us very specifically how actions + are being applied to any given URL. This is a big help for troubleshooting. + - { -add-header -block -deanimate-gifs -downgrade -fast-redirects -filter - -hide-forwarded -hide-from -hide-referer -hide-user-agent -image - -image-blocker -limit-connect -no-compression -no-cookies-keep - -no-cookies-read -no-cookies-set -no-popups -vanilla-wafer -wafer } - - + + First, enter one URL (or partial URL) at the prompt, and then + Privoxy will tell us + how the current configuration will handle it. This will not + help with filtering effects (i.e. the +filter action) from + one of the filter files since this is handled very + differently and not so easy to trap! It also will not tell you about any other + URLs that may be embedded within the URL you are testing. For instance, images + such as ads are expressed as URLs within the raw page source of HTML pages. So + you will only get info for the actual URL that is pasted into the prompt area + -- not any sub-URLs. If you want to know about embedded URLs like ads, you + will have to dig those out of the HTML source. Use your browser's View + Page Source option for this. 
Or right click on the ad, and grab the + URL. - This is the top section, and only tells us of the compiled in defaults. This - is basically what Privoxy would do if there - were not any actions defined, i.e. it does nothing. Every action - is disabled. This is not particularly informative for our purposes here. OK, - next section: + Let's try an example, google.com, + and look at it one section at a time in a sample configuration (your real + configuration may vary): - Matches for http://google.com: - { -add-header -block +deanimate-gifs -downgrade +fast-redirects - +filter{html-annoyances} +filter{js-annoyances} +filter{no-popups} - +filter{webbugs} +filter{nimda} +filter{banners-by-size} +filter{hal} - +filter{fun} +hide-forwarded +hide-from{block} +hide-referer{forge} - -hide-user-agent -image +image-blocker{blank} +no-compression - +no-cookies-keep -no-cookies-read -no-cookies-set +no-popups - -vanilla-wafer -wafer } - / - - { -no-cookies-keep -no-cookies-read -no-cookies-set } - .google.com + In file: default.action [ View ] [ Edit ] + + {-add-header + -block + -client-header-filter{hide-tor-exit-notation} + -content-type-overwrite + -crunch-client-header + -crunch-if-none-match + -crunch-incoming-cookies + -crunch-outgoing-cookies + -crunch-server-header + +deanimate-gifs {last} + -downgrade-http-version + +fast-redirects {check-decoded-url} + -filter {js-events} + -filter {content-cookies} + -filter {all-popups} + -filter {banners-by-link} + -filter {tiny-textforms} + -filter {frameset-borders} + -filter {demoronizer} + -filter {shockwave-flash} + -filter {quicktime-kioskmode} + -filter {fun} + -filter {crude-parental} + -filter {site-specifics} + -filter {js-annoyances} + -filter {html-annoyances} + +filter {refresh-tags} + -filter {unsolicited-popups} + +filter {img-reorder} + +filter {banners-by-size} + +filter {webbugs} + +filter {jumping-windows} + +filter {ie-exploits} + -filter {google} + -filter {yahoo} + -filter {msn} + -filter {blogspot} + -filter 
{no-ping} + -force-text-mode + -handle-as-empty-document + -handle-as-image + -hide-accept-language + -hide-content-disposition + +hide-forwarded-for-headers + +hide-from-header {block} + -hide-if-modified-since + +hide-referrer {forge} + -hide-user-agent + -inspect-jpegs + -kill-popups + -limit-connect + -overwrite-last-modified + +prevent-compression + -redirect + -send-vanilla-wafer + -send-wafer + -server-header-filter{xml-to-html} + -server-header-filter{html-to-xml} + +session-cookies-only + +set-image-blocker {pattern} + -treat-forbidden-connects-like-blocks } +/ + + { -session-cookies-only } + .google.com { -fast-redirects } - .google.com + .google.com - +In file: user.action [ View ] [ Edit ] +(no matches in this file) + - This is much more informative, and tells us how we have defined our - actions, and which ones match for our example, - google.com. The first grouping shows our default - settings, which would apply to all URLs. If you look at your actions - file, this would be the section just below the aliases section - near the top. This applies to all URLs as signified by the single forward - slash -- /. - + This is telling us how we have defined our + actions, and + which ones match for our test case, google.com. + Displayed are all the actions that are available to us. Remember, + the + sign denotes on. - + denotes off. So some are on here, but many + are off. Each example we try may provide a slightly different + end result, depending on our configuration directives. + + + The first listing + is for our default.action file. The large, multi-line + listing shows how the actions are set to match for all URLs, i.e. our default + settings. If you look at your actions file, this would be the + section just below the aliases section near the top. This + will apply to all URLs as signified by the single forward slash at the end + of the listing -- / . - These are the default actions we have enabled. But we can define additional
But we can define additional - actions that would be exceptions to these general rules, and then list - specific URLs that these exceptions would apply to. Last match wins. - Just below this then are two explict matches for .google.com. - The first is negating our various cookie blocking actions (i.e. we will allow - cookies here). The second is allowing fast-redirects. Note - that there is a leading dot here -- .google.com. This will - match any hosts and sub-domains, in the google.com domain also, such as - www.google.com. So, apparently, we have these actions defined - somewhere in the lower part of our actions file, and - google.com is referenced in these sections. + But we have defined additional actions that would be exceptions to these general + rules, and then we list specific URLs (or patterns) that these exceptions + would apply to. Last match wins. Just below this then are two explicit + matches for .google.com. The first is negating our previous + cookie setting, which was for +session-cookies-only + (i.e. not persistent). So we will allow persistent cookies for google, at + least that is how it is in this example. The second turns + off any +fast-redirects + action, allowing this to take place unmolested. Note that there is a leading + dot here -- .google.com. This will match any hosts and + sub-domains, in the google.com domain also, such as + www.google.com or mail.google.com. But it would not + match www.google.de! So, apparently, we have these two actions + defined as exceptions to the general rules at the top somewhere in the lower + part of our default.action file, and + google.com is referenced somewhere in these latter sections. + + + Then, for our user.action file, we again have no hits. + So there is nothing google-specific that we might have added to our own, local + configuration. If there was, those actions would over-rule any actions from + previously processed files, such as default.action. + user.action typically has the last word. 
This is the + best place to put hard and fast exceptions, - And now we pull it altogether in the bottom section and summarize how - Privoxy is appying all its actions + And finally we pull it all together in the bottom section and summarize how + Privoxy is applying all its actions to google.com: @@ -3930,16 +8583,75 @@ For any other issues, feel free to use the - { +block +image } - .ad.doubleclick.net - - { +block +image } + { +block } ad*. - { +block +image } - .doubleclick.net + { +block } + .ad. - + { +block +handle-as-image } + .[a-vx-z]*.doubleclick.net + - We'll just show the interesting part here, the explicit matches. It is - matched three different times. Each as an +block +image, + We'll just show the interesting part here - the explicit matches. It is + matched three different times. Two +block sections, + and a +block +handle-as-image, which is the expanded form of one of our aliases that had been defined as: - +imageblock. (Aliases are defined in the - first section of the actions file and typically used to combine more + +block-as-image. (Aliases are defined in + the first section of the actions file and typically used to combine more than one action.) @@ -3976,42 +8689,104 @@ For any other issues, feel free to use the +block + and an + +handle-as-image. + The custom alias +block-as-image just + simplifies the process and makes it more readable. - One last example. Let's try http://www.rhapsodyk.net/adsl/HOWTO/. - This one is giving us problems. We are getting a blank page. Hmmm... + One last example. Let's try http://www.example.net/adsl/HOWTO/. + This one is giving us problems. We are getting a blank page. Hmmm ...
- Matches for http://www.rhapsodyk.net/adsl/HOWTO/: - - { -add-header -block +deanimate-gifs -downgrade +fast-redirects - +filter{html-annoyances} +filter{js-annoyances} +filter{no-popups} - +filter{webbugs} +filter{nimda} +filter{banners-by-size} +filter{hal} - +filter{fun} +hide-forwarded +hide-from{block} +hide-referer{forge} - -hide-user-agent -image +image-blocker{blank} +no-compression - +no-cookies-keep -no-cookies-read -no-cookies-set +no-popups - -vanilla-wafer -wafer } + Matches for http://www.example.net/adsl/HOWTO/: + + In file: default.action [ View ] [ Edit ] + + {-add-header + -block + -client-header-filter{hide-tor-exit-notation} + -content-type-overwrite + -crunch-client-header + -crunch-if-none-match + -crunch-incoming-cookies + -crunch-outgoing-cookies + -crunch-server-header + +deanimate-gifs + -downgrade-http-version + +fast-redirects {check-decoded-url} + -filter {js-events} + -filter {content-cookies} + -filter {all-popups} + -filter {banners-by-link} + -filter {tiny-textforms} + -filter {frameset-borders} + -filter {demoronizer} + -filter {shockwave-flash} + -filter {quicktime-kioskmode} + -filter {fun} + -filter {crude-parental} + -filter {site-specifics} + -filter {js-annoyances} + -filter {html-annoyances} + +filter {refresh-tags} + -filter {unsolicited-popups} + +filter {img-reorder} + +filter {banners-by-size} + +filter {webbugs} + +filter {jumping-windows} + +filter {ie-exploits} + -filter {google} + -filter {yahoo} + -filter {msn} + -filter {blogspot} + -filter {no-ping} + -force-text-mode + -handle-as-empty-document + -handle-as-image + -hide-accept-language + -hide-content-disposition + +hide-forwarded-for-headers + +hide-from-header{block} + +hide-referer{forge} + -hide-user-agent + -inspect-jpegs + -kill-popups + -overwrite-last-modified + +prevent-compression + -redirect + -send-vanilla-wafer + -send-wafer + -server-header-filter{xml-to-html} + -server-header-filter{html-to-xml} + +session-cookies-only + +set-image-blocker{blank} 
+ -treat-forbidden-connects-like-blocks } / - { +block +image } + { +block +handle-as-image } /ads - - + - Ooops, the /adsl/ is matching /ads! But - we did not want this at all! Now we see why we get the blank page. We could - now add a new action below this that explictly does not - block (-block) pages with adsl. There are various ways to - handle such exceptions. Example: + Ooops, the /adsl/ is matching /ads in our + configuration! But we did not want this at all! Now we see why we get the + blank page. It is actually triggering two different actions here, and + the effects are aggregated so that the URL is blocked, and &my-app; is told + to treat the block as if it were an image. But this is, of course, all wrong. + We could now add a new action below this (or better in our own + user.action file) that explicitly + un blocks ( + {-block}) paths with + adsl in them (remember, last match in the configuration + wins). There are various ways to handle such exceptions. Example: @@ -4019,66 +8794,112 @@ For any other issues, feel free to use the +filter actions. + These tend to be harder to troubleshoot. + Try adding the URL for the site to one of the aliases that turn off + +filter: - {shop} + { shop } .quietpc.com .worldpay.com # for quietpc.com .jungle.com .scan.co.uk .forbes.com - - + - {shop} is an alias that expands to - { -filter -no-cookies -no-cookies-keep }. Or you could do - your own exception to negate filtering: + { shop } is an alias that expands to + { -filter -session-cookies-only }. + Or you could do your own exception to negate filtering: - {-filter} + { -filter } + # Disable ALL filter actions for sites in this section .forbes.com + developer.ibm.com + localhost + + + + + This would turn off all filtering for these sites. This is best + put in user.action, for local site + exceptions.
Note that when a simple domain pattern is used by itself (without + the subsequent path portion), all sub-pages within that domain are included + automatically in the scope of the action. + + + + Images that are inexplicably being blocked may well be hitting the ++filter{banners-by-size} + rule, which assumes + that images of certain sizes are ad banners (works well + most of the time since these tend to be standardized). + + + + { fragile } is an alias that disables most + actions that are the most likely to cause trouble. This can be used as a + last resort for problem sites. + + + + + { fragile } + # Handle with care: easy to break + mail.google. + mybank.example.com + - + + + Remember to flush caches! Note that the + mail.google reference lacks the TLD portion (e.g. + .com). This will effectively match any TLD with + google in it, such as mail.google.de, + just as an example. + + + If this still does not work, you will have to go through the remaining + actions one by one to find which one(s) is causing the problem. @@ -4102,10 +8923,454 @@ For any other issues, feel free to use the style. + - Small fixes in the actions chapter + - Small clarifications in the quickstart to ad blocking + - Removed from s since the new doc CSS + renders them red (bad in TOC). + + Revision 1.120 2002/05/23 19:16:43 roro + Correct Debian specials (installation and startup). + + Revision 1.119 2002/05/22 17:17:05 oes + Added Security hint + + Revision 1.118 2002/05/21 04:54:55 hal9 + -New Section: Quickstart to Ad Blocking + -Reformat Actions Anatomy to match new CGI layout + + Revision 1.117 2002/05/17 13:56:16 oes + - Reworked & extended Templates chapter + - Small changes to Regex appendix + - #included authors.sgml into (C) and hist chapter + + Revision 1.116 2002/05/17 03:23:46 hal9 + Fixing merge conflict in Quickstart section.
+ + Revision 1.115 2002/05/16 16:25:00 oes + Extended the Filter File chapter & minor fixes + + Revision 1.114 2002/05/16 09:42:50 oes + More ulink->link, added some hints to Quickstart section + + Revision 1.113 2002/05/15 21:07:25 oes + Extended and further commented the example actions files + + Revision 1.112 2002/05/15 03:57:14 hal9 + Spell check. A few minor edits here and there for better syntax and + clarification. + + Revision 1.111 2002/05/14 23:01:36 oes + Fixing the fixes + + Revision 1.110 2002/05/14 19:10:45 oes + Restored alphabetical order of actions + + Revision 1.109 2002/05/14 17:23:11 oes + Renamed the prevent-*-cookies actions, extended aliases section and moved it before the example AFs + + Revision 1.108 2002/05/14 15:29:12 oes + Completed proofreading the actions chapter + + Revision 1.107 2002/05/12 03:20:41 hal9 + Small clarifications for 127.0.0.1 vs localhost for listen-address since this + apparently an important distinction for some OS's. + + Revision 1.106 2002/05/10 01:48:20 hal9 + This is mostly proposed copyright/licensing additions and changes. Docs + are still GPL, but licensing and copyright are more visible. Also, copyright + changed in doc header comments (eliminate references to JB except FAQ). + + Revision 1.105 2002/05/05 20:26:02 hal9 + Sorting out license vs copyright in these docs. + + Revision 1.104 2002/05/04 08:44:45 swa + bumped version + + Revision 1.103 2002/05/04 00:40:53 hal9 + -Remove the TOC first page kludge. It's fixed proper now in ldp.dsl.in. + -Some minor additions to Quickstart. + + Revision 1.102 2002/05/03 17:46:00 oes + Further proofread & reactivated short build instructions + + Revision 1.101 2002/05/03 03:58:30 hal9 + Move the user-manual config directive to top of section. Add note about + Privoxy needing read permissions for configs, and write for logs. + + Revision 1.100 2002/04/29 03:05:55 hal9 + Add clarification on differences of new actions files. 
+ + Revision 1.99 2002/04/28 16:59:05 swa + more structure in starting section + + Revision 1.98 2002/04/28 05:43:59 hal9 + This is the break up of configuration.html into multiple files. This + will probably break links elsewhere :( + + Revision 1.97 2002/04/27 21:04:42 hal9 + -Rewrite of Actions File example. + -Add section for user-manual directive in config. + + Revision 1.96 2002/04/27 05:32:00 hal9 + -Add short section to Filter Files to tie in with +filter action. + -Start rewrite of examples in Actions Examples (not finished). + + Revision 1.95 2002/04/26 17:23:29 swa + bookmarks cleaned, changed structure of user manual, screen and programlisting cleanups, and numerous other changes that I forgot + + Revision 1.94 2002/04/26 05:24:36 hal9 + -Add most of Andreas suggestions to Chain of Events section. + -A few other minor corrections and touch up. + + Revision 1.92 2002/04/25 18:55:13 hal9 + More catchups on new actions files, and new actions names. + Other assorted cleanups, and minor modifications. + + Revision 1.91 2002/04/24 02:39:31 hal9 + Add 'Chain of Events' section. + + Revision 1.90 2002/04/23 21:41:25 hal9 + Linuxconf is deprecated on RH, substitute chkconfig. + + Revision 1.89 2002/04/23 21:05:28 oes + Added hint for startup on Red Hat + + Revision 1.88 2002/04/23 05:37:54 hal9 + Add AmigaOS install stuff. + + Revision 1.87 2002/04/23 02:53:15 david__schmidt + Updated OSX installation section + Added a few English tweaks here an there + + Revision 1.86 2002/04/21 01:46:32 hal9 + Re-write actions section. + + Revision 1.85 2002/04/18 21:23:23 hal9 + Fix ugly typo (mine). + + Revision 1.84 2002/04/18 21:17:13 hal9 + Spell Redhat correctly (ie Red Hat). A few minor grammar corrections. 
+ + Revision 1.83 2002/04/18 18:21:12 oes + Added RPM install detail + + Revision 1.82 2002/04/18 12:04:50 oes + Cosmetics + + Revision 1.81 2002/04/18 11:50:24 oes + Extended Install section - needs fixing by packagers + + Revision 1.80 2002/04/18 10:45:19 oes + Moved text to buildsource.sgml, renamed some filters, details + + Revision 1.79 2002/04/18 03:18:06 hal9 + Spellcheck, and minor touchups. + + Revision 1.78 2002/04/17 18:04:16 oes + Proofreading part 2 + + Revision 1.77 2002/04/17 13:51:23 oes + Proofreading, part one + + Revision 1.76 2002/04/16 04:25:51 hal9 + -Added 'Note to Upgraders' and re-ordered the 'Quickstart' section. + -Note about proxy may need requests to re-read config files. + + Revision 1.75 2002/04/12 02:08:48 david__schmidt + Remove OS/2 building info... it is already in the developer-manual + + Revision 1.74 2002/04/11 00:54:38 hal9 + Add small section on submitting actions. + + Revision 1.73 2002/04/10 18:45:15 swa + generated + + Revision 1.72 2002/04/10 04:06:19 hal9 + Added actions feedback to Bookmarklets section + + Revision 1.71 2002/04/08 22:59:26 hal9 + Version update. Spell chkconfig correctly :) + + Revision 1.70 2002/04/08 20:53:56 swa + ? + + Revision 1.69 2002/04/06 05:07:29 hal9 + -Add privoxy-man-page.sgml, for man page. + -Add authors.sgml for AUTHORS (and p-authors.sgml) + -Reworked various aspects of various docs. + -Added additional comments to sub-docs. + + Revision 1.68 2002/04/04 18:46:47 swa + consistent look. reuse of copyright, history et. al. + + Revision 1.67 2002/04/04 17:27:57 swa + more single file to be included at multiple points. make maintaining easier + + Revision 1.66 2002/04/04 06:48:37 hal9 + Structural changes to allow for conditional inclusion/exclusion of content + based on entity toggles, e.g. 'entity % p-not-stable "INCLUDE"'. And + definition of internal entities, e.g. 'entity p-version "2.9.13"' that will + eventually be set by Makefile. + More boilerplate text for use across multiple docs. 
+ + Revision 1.65 2002/04/03 19:52:07 swa + enhance squid section due to user suggestion + + Revision 1.64 2002/04/03 03:53:43 hal9 + A few minor bug fixes, and touch ups. Ready for review. + + Revision 1.63 2002/04/01 16:24:49 hal9 + Define entities to include boilerplate text. See doc/source/*. + + Revision 1.62 2002/03/30 04:15:53 hal9 + - Fix privoxy.org/config links. + - Paste in Bookmarklets from Toggle page. + - Move Quickstart nearer top, and minor rework. + Revision 1.61 2002/03/29 01:31:08 hal9 Minor update.