X-Git-Url: http://www.privoxy.org/gitweb/?p=privoxy.git;a=blobdiff_plain;f=doc%2Fsource%2Fuser-manual.sgml;h=15ac7f60602025b5e1635b9bd62ac3f64662825a;hp=5b62f379c9e97174f3dced254781d7bd5242f127;hb=ed70e742f22c7a2eff07f2509e74080190271796;hpb=0c0171d3f0339ee3075ae384a58613ca88334460 diff --git a/doc/source/user-manual.sgml b/doc/source/user-manual.sgml index 5b62f379..15ac7f60 100644 --- a/doc/source/user-manual.sgml +++ b/doc/source/user-manual.sgml @@ -1,4 +1,23 @@ - + + + + + + + + + + + + + + + + + + +]> -
-Junkbuster User Manual +Privoxy User Manual -$Id: user-manual.sgml,v 1.46 2002/03/10 00:51:08 hal9 Exp $ +$Id: user-manual.sgml,v 1.95 2002/04/26 17:23:29 swa Exp $ - By: Junkbuster Developers + By: Privoxy Developers + - The user manual gives users information on how to install, configure and use - Internet Junkbuster. Internet - Junkbuster is a web proxy with advanced filtering capabilities - for protecting privacy, filtering web page content, managing cookies, - controlling access, and removing ads, banners, pop-ups and other obnoxious - Internet Junk. Junkbuster has a very flexible configuration and can be - customized to suit individual needs and tastes. Internet - Junkbuster has application for both stand-alone systems and - multi-user networks. + + This is here to keep vim syntax file from breaking :/ + If I knew enough to fix it, I would. + PLEASE DO NOT REMOVE! HB: hal@foobox.net + +]]> + -You can find the latest version of the user manual at http://ijbswa.sourceforge.net/user-manual/. - + The user manual gives users information on how to install, configure and use + Privoxy. + + + + &p-intro; + + + + You can find the latest version of the user manual at http://www.privoxy.org/user-manual/. + Please see the Contact section on how to + contact the developers. + @@ -61,148 +91,47 @@ You can find the latest version of the user manual at + + + + -Introduction - - Internet Junkbuster is a web proxy with advanced - filtering capabilities for protecting privacy, filtering and modifying web - page content, managing cookies, controlling access, and removing ads, - banners, pop-ups and other obnoxious Internet Junk. - Junkbuster has a very flexible configuration and - can be customized to suit individual needs and tastes. Internet - Junkbuster has application for both stand-alone systems and - multi-user networks. - - + +Introduction - This documentation is included with the current BETA version of - Internet Junkbuster and is mostly complete at this - point. 
The most up to date reference for the time being is still the comments - in the source files and in the individual configuration files. Development - of version 3.0 is currently nearing completion, and includes many significant - changes and enhancements over earlier versions. The target release date for - stable v3.0 is soon ;-) + This documentation is included with the current &p-status; version of + Privoxy, v.&p-version;soon ;-)]]>. + + - Since this is a BETA version, not all new features are well tested. This + Since this is a &p-status; version, not all new features are well tested. This documentation may be slightly out of sync as a result (especially with CVS sources). And there may be bugs, though hopefully not many! - +]]> - -New Features - - In addition to Junkbuster's traditional features - of ad and banner blocking and cookie management, this is a list of new - features currently under development: - - +Features - - - - - Integrated browser based configuration and control utility (http://i.j.b). Browser-based tracing of rule - and filter effects. - - - - - - Modularized configuration that will allow for system wide settings, and - individual user settings. (not implemented yet, probably a 3.1 feature) - - - - - - Blocking of annoying pop-up browser windows. - - - - - - HTTP/1.1 compliant (most, but not all 1.1 features are supported). - - - - - - Support for Perl Compatible Regular Expressions in the configuration files, and - generally a more sophisticated and flexible configuration syntax over - previous versions. - - - - - - GIF de-animation. - - - - - - Web page content filtering (removes banners based on size, - invisible web-bugs, JavaScript, pop-ups, status bar abuse, - etc.) - - - - - - Bypass many click-tracking scripts (avoids script redirection). - - - - - - - Multi-threaded (POSIX and native threads). - - - - - - Auto-detection and re-reading of config file changes. - - - - - - User-customizable HTML templates (e.g. 404 error page). 
- - - - - - Improved cookie management features (e.g. session based cookies). - - - - - - Builds from source on most UNIX-like systems. Packages available for: Linux - (RedHat, SuSE, or Debian), Windows, Sun Solaris, Mac OSX, OS/2, HP-UX 11 and AmigaOS. - - - - - - - In addition, the configuration is much more powerful and versatile over-all. - - - - + In addition to Internet Junkbuster's traditional + features of ad and banner blocking and cookie management, + Privoxy provides new features: - + + &newfeatures; + @@ -212,327 +141,612 @@ You can find the latest version of the user manual at Installation - - Junkbuster is available as raw source code, or - pre-compiled binaries. See the Junkbuster Home Page - for binaries and current release info. Junkbuster - is also available via CVS. - This is the recommended approach at this time. But please be aware that CVS - is constantly changing, and it may break in mysterious ways. - - -Source - For gzipped tar archives, unpack the source: + Privoxy is available both in convenient pre-compiled + packages for a wide range of operating systems, and as raw source code. + For most users, we recommend using the packages, which can be downloaded from our + Privoxy Project + Page. For installing and compiling the source code, please look + into our Developer Manual. - - tar xzvf ijb_source_* [.tgz or .tar.gz] - cd ijb_source_2.9.11_beta - + If you like to live on the bleeding edge and are not afraid of using + possibly unstable development versions, you can check out the up-to-the-minute + version directly from the + CVS repository or simply download the nightly CVS + tarball. Again, we refer you to the Developer Manual. + + &supported; + + - For retrieving the current CVS sources, you'll need the CVS - package installed first. To download CVS source: + Note: If you have a previous Junkbuster or + Privoxy installation on your system, you + will need to remove it. 
Some platforms do this for you as part + of their installation procedure. (See below for your platform). - - cvs -d:pserver:anonymous@cvs.ijbswa.sourceforge.net:/cvsroot/ijbswa login - cvs -z3 -d:pserver:anonymous@cvs.ijbswa.sourceforge.net:/cvsroot/ijbswa co current - cd current - + In any case, be sure to back up your old configuration + if it is valuable to you. See the + note to upgraders section + below. + +Red Hat and SuSE RPMs + - This will create a directory named current/, which will - contain the source tree. + RPMs can be installed with rpm -Uvh privoxy-&p-version;-1.rpm, + and will use /etc/privoxy for the location + of configuration files. - Then, in either case, to build from tarball/CVS source: + Note that on Red Hat, Privoxy will + not be automatically started on system boot. You will + need to enable that using chkconfig, + ntsysv, or similar methods. Note that SuSE will +automatically start Privoxy in the boot process. - - ./configure (--help to see options) - make (the make from gnu, gmake for *BSD) - su - make -n install (to see where all the files will go) - make install (to really install) - + If you have problems with failed dependencies, try rebuilding the SRC RPM: + rpm --rebuild privoxy-&p-version;-1.src.rpm. This + will use your locally installed libraries and RPM version. - For Redhat and SuSE Linux RPM packages, see below. + Also note that if you have a Junkbuster RPM installed + on your system, you need to remove it first, because the packages conflict. + Otherwise, RPM will try to remove Junkbuster + automatically, before installing Privoxy. - - -Red Hat +Debian - To build Redhat RPM packages, install source as above. Then: + FIXME. + - - - autoheader [suggested for CVS source] - autoconf [suggested for CVS source] - ./configure - make redhat-dist - - + +Windows - This will create both binary and src RPMs in the usual places. Example: + Just double-click the installer, which will guide you through + the installation process. 
You will find the configuration files + in the same directory as you installed Privoxy in. We do not + use the registry of Windows. + + + +Solaris, NetBSD, FreeBSD, HP-UX -    /usr/src/redhat/RPMS/i686/junkbuster-2.9.11-1.i686.rpm - - -    /usr/src/redhat/SRPMS/junkbuster-2.9.11-1.src.rpm + Create a new directory, cd to it, then unzip and + untar the archive. For the most part, you'll have to figure out where + things go. FIXME. + + + +OS/2 - To install, of course: + First, make sure that no previous installations of + Junkbuster and / or + Privoxy are left on your + system. You can do this by - - rpm -Uvv /usr/src/redhat/RPMS/i686/junkbuster-2.9.11-1.i686.rpm - + Then, just double-click the WarpIN self-installing archive, which will + guide you through the installation process. A shadow of the + Privoxy executable will be placed in your + startup folder so it will start automatically whenever OS/2 starts. - This will place the Junkbuster configuration - files in /etc/junkbuster/, and log files in - /var/log/junkbuster/. + The directory you choose to install Privoxy + into will contain all of the configuration files. + + +Mac OSX + + Unzip the downloaded package (you can either double-click on the file + in the finder, or on the desktop if you downloaded it there). Then, + double-click on the package installer icon and follow the installation + process. + Privoxy will be installed in the subdirectory + /Applications/Privoxy.app. + Privoxy will set itself up to start + automatically on system bring-up via + /System/Library/StartupItems/Privoxy. + -SuSE +AmigaOS - To build SuSE RPM packages, install source as above. Then: + Copy and then unpack the lha archive to a suitable location. + All necessary files will be installed into the Privoxy + directory, including all configuration and log files. To uninstall, just + remove this directory. 
- - - autoheader [suggested for CVS source] - autoconf [suggested for CVS source] - ./configure - make suse-dist - + Start Privoxy (with RUN <>NIL:) in your + startnet script (AmiTCP), in + s:user-startup (RoadShow), as startup program in your + startup script (Genesis), or as startup action (Miami and MiamiDx). + Privoxy will automatically quit when you quit your + TCP/IP stack (just ignore the harmless warning your TCP/IP stack may display that + Privoxy is still running). + + + + + + +Note to Upgraders - This will create both binary and src RPMs in the usual places. Example: + There are very significant changes from older versions of + Junkbuster to the current + Privoxy. Configuration is substantially + changed. Junkbuster 2.0.x and earlier + configuration files will not migrate. The functionality of the old + blockfile, cookiefile and + imagelist, are now combined into the + actions files. default.action, + is the main actions file. Local exceptions should best be put into + user.action. - -    /usr/src/packages/RPMS/i686/junkbuster-2.9.11-1.i686.rpm + A filter file (typically default.filter) + is new as of Privoxy 2.9.x, and provides some + of the new sophistication (explained below). config is + much the same as before. -    /usr/src/packages/SRPMS/junkbuster-2.9.11-1.src.rpm + If upgrading from a 2.0.x version, you will have to use the new config + files, and possibly adapt any personal rules from your older files. + When porting personal rules over from the old blockfile + to the new actions files, please note that even the pattern syntax has + changed. If upgrading from 2.9.x development versions, it is still + recommended to use the new configuration files. - - To install, of course: + A quick list of things to be aware of before upgrading: - - rpm -Uvv /usr/src/packages/RPMS/i686/junkbuster-2.9.11-1.i686.rpm - + + + + + The default listening port is now 8118 due to a conflict with another + service (NAS). 
+ + + + + Some installers may remove earlier versions completely. Save any + important configuration files! + + + + + Privoxy is controllable with a web browser + at the special URL: http://config.privoxy.org/ + (Shortcut: http://p.p/). Many + aspects of configuration can be done here, including temporarily disabling + Privoxy. + + + + + The primary configuration file for cookie management, ad and banner + blocking, and many other aspects of Privoxy + configuration is in the actions files. It is strongly + recommended to become familiar with the new actions concept below, + before modifying these files. Locally defined rules + should go into user.action. + + + + + + + Some installers may not automatically start + Privoxy after installation. + + + + + + +Quickstart to Using <application>Privoxy</application> - This will place the Junkbuster configuration - files in /etc/junkbuster/, and log files in - /var/log/junkbuster/. - + - + + + Install Privoxy. See the section Installing. + + + + + + Start Privoxy. See the section Starting Privoxy. + + + + + Change your browser's configuration to use the proxy localhost on port + 8118. See the section Starting Privoxy. + + - -OS/2 + + + Enjoy surfing with enhanced comfort and privacy. Please see the section + Contacting the Developers on how to report + bugs or problems with websites or to get help. You may want to change the + file user.action to further tweak your new browsing + experience. + + - + + + + + + +Starting <application>Privoxy</application> - Junkbuster is packaged in a WarpIN self- - installing archive. The self-installing program will be named depending - on the release version, something like: - ijbos2_setup_1.2.3.exe. In order to install it, simply - run this executable or double-click on its icon and follow the WarpIN - installation panels. A shadow of the Junkbuster - executable will be placed in your startup folder so it will start - automatically whenever OS/2 starts. 
+ Before launching Privoxy for the first time, you + will want to configure your browser(s) to use Privoxy + as a HTTP and HTTPS proxy. The default is localhost for the proxy address, + and port 8118 (earlier versions used port 8000). This is the one + configuration step that must be done! + + + + With Netscape (and + Mozilla), this can be set under Edit + -> Preferences -> Advanced -> Proxies -> HTTP Proxy. + For Internet Explorer: Tools -> + Internet Properties -> Connections -> LAN Setting. Then, + check Use Proxy and fill in the appropriate info (Address: + localhost, Port: 8118). Include if HTTPS proxy support too. - The directory you choose to install Junkbuster - into will contain all of the configuration files. + After doing this, flush your browser's disk and memory caches to force a + re-reading of all pages and to get rid of any ads that may be cached. You + are now ready to start enjoying the benefits of using + Privoxy! + - If you would like to build binary images on OS/2 yourself, you will need - a few Unix-like tools: autoconf, autoheader and sh. These tools will be - used to create the required config.h file, which is not part of the - source distribution because it differs based on platform. You will also - need a compiler. - The distribution has been created using IBM VisualAge compilers, but you - can use any compiler you like. GCC/EMX has the disadvantage of needing - to be single-threaded due to a limitation of EMX's implementation of the - select() socket call. + Privoxy is typically started by specifying the + main configuration file to be used on the command line. 
Example Unix startup + command: - In addition to needing the source code distribution as outlined earlier, - you will want to extract the os2seutp directory from CVS: - - cvs -d:pserver:anonymous@cvs.ijbswa.sourceforge.net:/cvsroot/ijbswa login - cvs -z3 -d:pserver:anonymous@cvs.ijbswa.sourceforge.net:/cvsroot/ijbswa co os2setup - - This will create a directory named os2setup/, which will contain the - Makefile.vac makefile and os2build.cmd - which is used to completely create the binary distribution. The sequence - of events for building the executable for yourself goes something like this: - cd current - autoheader - autoconf - sh configure - cd ..\os2setup - nmake -f Makefile.vac - - You will see this sequence laid out in os2build.cmd. + # /usr/sbin/privoxy /etc/privoxy/config + - - - - -Windows -Click-click. (I need help on this. Not a clue here. Also for -configuration section below. HB.) + + See below for other command line options. - - -Other - Some quick notes on other Operating Systems. + An init script is provided for SuSE and Red Hat. - For FreeBSD (and other *BSDs?), the build will require gmake - instead of the included make. gmake is - available from http://www.gnu.org. - The rest should be the same as above for Linux/Unix. + For SuSE: rcprivoxy start - - - - - - - - -JunkBuster Configuration - - All JunkBuster configuration is kept - in text files. These files can be edited with a text editor. - Many important aspects of JunkBuster can - also be controlled easily with a web browser. + + For Red Hat and Debian: /etc/rc.d/init.d/privoxy start + - + + If no configuration file is specified on the command line, + Privoxy will look for a file named + config in the current directory, except on Win32, where + it will try config.txt. If no file is specified on the + command line and no default configuration file can be found, + Privoxy will fail to start. 
+ - - -Controlling Junkbuster with Your Web Browser - JunkBuster can be reached by the special - URL http://i.j.b/ (or alternately - http://ijbswa.sourceforge.net/config/, - which is an internal page. You will see the following section: - + The included default configuration files should give a reasonable starting + point. Most of the per site configuration is done in the + actions files. These are where various cookie actions are + defined, ad and banner blocking, and other aspects of + Privoxy configuration. There are several such + files included, with varying levels of aggressiveness. - + You will probably want to keep an eye out for sites for which you may prefer + persistent cookies, and add these to your actions configuration as needed. By + default, most of these will be accepted only during the current browser + session (aka session cookies), unless you add them to the + configuration. If you want the browser to handle this instead, you will need + to edit user.action (or through the web based interface) + and disable this feature. If you use more than one browser, it would make + more sense to let Privoxy handle this. In which + case, the browser(s) should be set to accept all cookies. + -Please choose from the following options: + + Another feature where you will probably want to define exceptions for trusted + sites is the popup-killing (through the +popup and + +filter{popups} actions), because your favorite shopping, + banking, or leisure site may need popups (explained below). + - * Show information about the current configuration - * Show the source code version numbers - * Show the client's request headers. - * Show which actions apply to a URL and why - * Toggle JunkBuster on or off - * Edit the actions list + + Privoxy is HTTP/1.1 compliant, but not all of + the optional 1.1 features are as yet supported. 
In the unlikely event that + you experience inexplicable problems with browsers that use HTTP/1.1 per default + (like Mozilla or recent versions of I.E.), you might + try to force HTTP/1.0 compatibility. For Mozilla, look under Edit -> + Preferences -> Debug -> Networking. + Alternatively, set the +downgrade-http-version config option in + default.action which will downgrade your browser's HTTP + requests from HTTP/1.1 to HTTP/1.0 before processing them. + - + + After running Privoxy for a while, you can + start to fine tune the configuration to suit your personal, or site, + preferences and requirements. There are many, many aspects that can + be customized. Actions + can be adjusted by pointing your browser to + http://config.privoxy.org/ + (shortcut: http://p.p/), + and then follow the link to View & Change the Current Configuration. + (This is an internal page and does not require Internet access.) - This should be self-explanatory. Note the last item is an editor for the - actions list, which is where much of the ad, banner, cookie, - and URL blocking magic is configured as well as other advanced features of - Junkbuster. This is an easy way to adjust various - aspects of Junkbuster configuration. The actions - file, and other configuration files, are explained in detail below. - Junkbuster will automatically detect any changes - to these files. + In fact, various aspects of Privoxy + configuration can be viewed from this page, including + current configuration parameters, source code version numbers, + the browser's request headers, and actions that apply + to a given URL. In addition to the actions file + editor mentioned above, Privoxy can also + be turned on and off (toggled) from this page. - Toggle JunkBuster On or Off is handy for sites that might - have problems with your current actions and filters, or just to test if - a site misbehaves, whether it is JunkBuster - causing the problem or not. 
Junkbuster continues - to run as a proxy in this case, but all filtering is disabled. + If you encounter problems, try loading the page without + Privoxy. If that helps, enter the URL where + you have the problems into the browser + based rule tracing utility. See which rules apply and why, and + then try turning them off for that site one after the other, until the problem + is gone. When you have found the culprit, you might want to turn the rest on + again. + + + If the above paragraph sounds gibberish to you, you might want to read more about the actions concept + or even dive deep into the Appendix + on actions. + + + + If you can't get rid of the problem at all, think you've found a bug in + Privoxy, want to propose a new feature or smarter rules, please see the + section Contacting the + Developers below. + + + + +Command Line Options + + Privoxy may be invoked with the following + command-line options: + + + + + + + + --version + + + Print version info and exit. Unix only. + + + + + --help + + + Print short usage info and exit. Unix only. + + + + + --no-daemon + + + Don't become a daemon, i.e. don't fork and become process group + leader, and don't detach from controlling tty. Unix only. + + + + + --pidfile FILE + + + + On startup, write the process ID to FILE. Delete the + FILE on exit. Failure to create or delete the + FILE is non-fatal. If no FILE + option is given, no PID file will be used. Unix only. + + + + + --user USER[.GROUP] + + + + After (optionally) writing the PID file, assume the user ID of + USER, and if included the GID of GROUP. Exit if the + privileges are not sufficient to do so. Unix only. + + + + + configfile + + + If no configfile is included on the command line, + Privoxy will look for a file named + config in the current directory (except on Win32 + where it will look for config.txt instead). Specify + full path to avoid confusion. If no config file is found, + Privoxy will fail to start. 
+ + + + + + + +<application>Privoxy</application> Configuration + + All Privoxy configuration is stored + in text files. These files can be edited with a text editor. + Many important aspects of Privoxy can + also be controlled easily with a web browser. + +Controlling <application>Privoxy</application> with Your Web Browser + + Privoxy's user interface can be reached through the special + URL http://config.privoxy.org/ + (shortcut: http://p.p/), + which is a built-in page and works without Internet access. + You will see the following section: + + + + + + + Privoxy Menu + + + +         ▪  View & change the current configuration + + +         ▪  View the source code version numbers + + +         ▪  View the request headers. + + +         ▪  Look up which actions apply to a URL and why + + +         ▪  Toggle Privoxy on or off + + + + + + + + This should be self-explanatory. Note the first item leads to an editor for the + actions list, which is where the ad, banner, cookie, + and URL blocking magic is configured as well as other advanced features of + Privoxy. This is an easy way to adjust various + aspects of Privoxy configuration. The actions + file, and other configuration files, are explained in detail below. + + + + Toggle Privoxy On or Off is handy for sites that might + have problems with your current actions and filters. You can in fact use + it as a test to see whether it is Privoxy + causing the problem or not. Privoxy continues + to run as a proxy in this case, but all filtering is disabled. There + is even a toggle Bookmarklet offered, so + that you can toggle Privoxy with one click from + your browser. + + + + + + + + + + + + Configuration Files Overview For Unix, *BSD and Linux, all configuration files are located in - /etc/junkbuster/ by default. For MS Windows, OS/2, and + /etc/privoxy/ by default. For MS Windows, OS/2, and AmigaOS these are all in the same directory as the - Junkbuster executable. 
The name and number of - configuration files has changed from previous versions, and is subject to - change as development progresses. + Privoxy executable. - The installed defaults provide a reasonable starting point, though possibly - aggressive by some standards. For the time being, there are only three - default configuration files (this will change in time): + The installed defaults provide a reasonable starting point, though + some settings may be aggressive by some standards. For the time being, the + principal configuration files are: @@ -540,28 +754,44 @@ Please choose from the following options: - The main configuration file is named config + The main configuration file is named config on Linux, Unix, BSD, OS/2, and AmigaOS and config.txt - on Windows. + on Windows. This is a required file. - The ijb.action file is used to define various - actions relating to images, banners, pop-ups, access - restrictions, banners and cookies. There is a CGI based editor for this - file that can be accessed via http://i.j.b. (Other actions - files are included as well with differing levels of filtering - and blocking, e.g. ijb-basic.action.) + default.action (the main actions file) is used to define + the default settings for various actions relating to images, banners, + pop-ups, access restrictions and cookies. + + + Multiple actions files may be defined in config. These + are processed in the order they are defined. Local customizations and locally + preferred exceptions to the default policies as defined in + default.action are probably best applied in + user.action, which should be preserved across + upgrades. standard.action is also included. This is mostly + for Privoxy's internal use. + + + There is also a web based editor that can be accessed from + http://config.privoxy.org/show-status/ + (Shortcut: http://p.p/show-status/) for the + various actions files. 
- The re_filterfile file can be used to rewrite the raw - page content, including text as well as embedded HTML and JavaScript. + default.filter (the filter + file) can be used to re-write the raw page content, including + viewable text as well as embedded HTML and JavaScript, and whatever else + lurks on any given web page. The filtering jobs are only pre-defined here; + whether to apply them or not is up to the actions files. @@ -569,28 +799,45 @@ Please choose from the following options: - ijb.action and re_filterfile - can use Perl style regular expressions for maximum flexibility. All files use - the # character to denote a comment. Such - lines are not processed by Junkbuster. After - making any changes, there is no need to restart - Junkbuster in order for the changes to take - effect. Junkbuster should detect such changes - automatically. + All files use the # character to denote a + comment (the rest of the line will be ignored) and understand line continuation + through placing a backslash ("\") as the very last character + in a line. If the # is preceded by a backslash, it loses + its special function. Placing a # in front of an otherwise + valid configuration line to prevent it from being interpreted is called "commenting + out" that line. + + + + The actions files and default.filter + can use Perl style regular expressions for + maximum flexibility. + + + + After making any changes, there is no need to restart + Privoxy in order for the changes to take + effect. Privoxy detects such changes + automatically. Note, however, that it may take one or two additional + requests for the change to take effect. When changing the listening address + of Privoxy, these wake up requests + must obviously be sent to the old listening address. + While under development, the configuration content is subject to change. The below documentation may not be accurate by the time you read this. 
Also, what constitutes a default setting, may change, so please check all your configuration files on important issues. +]]> - + The Main Configuration File Again, the main configuration file is named config on @@ -604,568 +851,1402 @@ Please choose from the following options: - blockfile blocklist.ini + confdir /etc/privoxy - - - - - - Indicates that the blockfile is named blocklist.ini. - - - - A # indicates a comment. Any part of a - line following a # is ignored, except if - the # is preceded by a - \. + + - Thus, by placing a # at the start of an - existing configuration line, you can make it a comment and it will be treated - as if it weren't there. This is called commenting out an - option and can be useful to turn off features: If you comment out the - logfile line, junkbuster will not - log to a file at all. Watch for the default: section in each - explanation to see what happens if the option is left unset (or commented - out). + Assigns the value /etc/privoxy to the option + confdir and thus indicates that the configuration + directory is named /etc/privoxy/. - Long lines can be continued on the next line by using a - \ as the very last character. + All options in the config file except for confdir and + logdir are optional. Watch out in the below description + for what happens if you leave them unset. - There are various aspects of Junkbuster behavior - that can be tuned. + The main config file controls all aspects of Privoxy's + operation that are not location dependent (i.e. they apply universally, no matter + where you may be surfing). - -Defining Other Configuration Files - - - Junkbuster can use a number of other files to tell it - what ads to block, what cookies to accept, etc. This section of the - configuration file tells Junkbuster where to find - all those other files. - - - - On Windows and AmigaOS, - Junkbuster looks for these files in the same - directory as the executable. 
On Unix and OS/2, - Junkbuster looks for these files in the current - working directory. In either case, an absolute path name can be used to - avoid problems. - + +Configuration and Log File Locations - When development goes modular and multi-user, the blocker, filter, and - per-user config will be stored in subdirectories of confdir. - For now, only confdir/templates is used for storing HTML - templates for CGI results. + Privoxy can (and normally does) use a number of + other files for additional configuration and logging. + This section of the configuration file tells Privoxy + where to find those other files. - - The location of the configuration files: - - - - - - confdir /etc/junkbuster # No trailing /, please. - - - - +confdir - - The directory where all logging (i.e. logfile and - jarfile) takes place. No trailing - /, please: - + + + Specifies: + + The directory where the other configuration files are located + + + + Type of value: + + Path name + + + + Default value: + + /etc/privoxy (Unix) or Privoxy installation dir (Windows) + + + + Effect if unset: + + Mandatory + + + + Notes: + + + No trailing /, please + + + When development goes modular and multi-user, the blocker, filter, and + per-user config will be stored in subdirectories of confdir. + For now, the configuration directory structure is flat, except for + confdir/templates, where the HTML templates for CGI + output reside (e.g. Privoxy's 404 error page). + + + + + - - - - - logdir /var/log/junkbuster - - - - - - Note that all file specifications below are relative to - the above two directories! - +logdir - - The ijb.action file contains patterns to specify the actions to - apply to requests for each site. Default: Cookies to and from all - destinations are kept only during the current browser session (i.e. they - are not saved to disk). Pop-ups are disabled for all sites. All sites are - filtered if re_filterfile specified according to the - contents of re_filterfile. No sites are blocked. 
The - JunkBuster logo is displayed for filtered ads and other images . The syntax - of this file is explained in detail below. - + + + Specifies: + + + The directory where all logging takes place (i.e. where logfile and + jarfile are located) + + + + + Type of value: + + Path name + + + + Default value: + + /var/log/privoxy (Unix) or Privoxy installation dir (Windows) + + + + Effect if unset: + + Mandatory + + + + Notes: + + + No trailing /, please + + + + + + + +actionsfile + + + + + + + + Specifies: + + + The actions file(s) to use + + + + + Type of value: + + File name, relative to confdir + + + + Default value: + + + + standard # Internal purposes, recommended not editing + + + default # Main actions file + + + user # User customizations + + + + + + Effect if unset: + + + No actions are taken at all. Simple neutral proxying. + + + + + Notes: + + + Multiple actionsfile lines are OK and are in fact recommended! + + + The default values include standard.action, which is used for internal + purposes and should be loaded, default.action, which is the + main actions file maintained by the developers, and + user.action, where you can make your personal additions. + + + There is no point in using Privoxy without an actions file. + + + + + + +filterfile + + + + Specifies: + + + The filter file to use + + + + + Type of value: + + File name, relative to confdir + + + + Default value: + + default.filter (Unix) or default.filter.txt (Windows) + + + + Effect if unset: + + + No textual content filtering takes place, i.e. all + +filter{name} + actions in the actions files are turned off + + + + + Notes: + + + The default.filter file contains content modification rules + that use regular expressions. These rules permit powerful + changes on the content of Web pages, e.g., you could disable your favorite + JavaScript annoyances, re-write the actual displayed text, or just have some + fun replacing Microsoft with MicroSuck wherever + it appears on a Web page. 
+ + + + + - - - - - actionsfile ijb.action - - - - +logfile - - The re_filterfile file contains content modification rules. - These rules permit powerful changes on the content of Web pages, e.g., you - could disable your favorite JavaScript annoyances, rewrite the actual - content, or just have some fun replacing Microsoft with - MicroSuck wherever it appears on a Web page. Default: No - content modification, or whatever the developers are playing with :-/ - + + + Specifies: + + + The log file to use + + + + + Type of value: + + File name, relative to logdir + + + + Default value: + + logfile (Unix) or privoxy.log (Windows) + + + + Effect if unset: + + + No log file is used, all log messages go to the console (stderr). + + + + + Notes: + + + The windows version will additionally log to the console. + + + The logfile is where all logging and error messages are written. The level + of detail and number of messages are set with the debug + option (see below). The logfile can be useful for tracking down a problem with + Privoxy (e.g., it's not blocking an ad you + think it should block) but in most cases you probably will never look at it. + + + Your logfile will grow indefinitely, and you will probably want to + periodically remove it. On Unix systems, you can do this with a cron job + (see man cron). For Red Hat, a logrotate + script has been included. + + + On SuSE Linux systems, you can place a line like /var/log/privoxy.* + +1024k 644 nobody.nogroup in /etc/logfiles, with + the effect that cron.daily will automatically archive, gzip, and empty the + log, when it exceeds 1M size. + + + + + - - Filtering requires buffering the page content, which may appear to slow down - page rendering since nothing is displayed until all content has passed - the filters. (It does not really take longer, but seems that way since - the page is not incrementally displayed.) This effect will be more noticeable - on slower connections. 
+jarfile - + + + Specifies: + + + The file to store intercepted cookies in + + + + + Type of value: + + File name, relative to logdir + + + + Default value: + + jarfile (Unix) or privoxy.jar (Windows) + + + + Effect if unset: + + + Intercepted cookies are not stored at all. + + + + + Notes: + + + The jarfile may grow to ridiculous sizes over time. + + + + + - - - - - re_filterfile re_filterfile - - - - +trustfile - - The logfile is where all logging and error messages are written. The logfile - can be useful for tracking down a problem with - Junkbuster (e.g., it's not blocking an ad you - think it should block) but in most cases you probably will never look at it. - - - - Your logfile will grow indefinitely, and you will probably want to - periodically remove it. On Unix systems, you can do this with a cron job - (see man cron). For Redhat, a logrotate - script has been included. - + + + Specifies: + + + The trust file to use + + + + + Type of value: + + File name, relative to confdir + + + + Default value: + + Unset (commented out). When activated: trust (Unix) or trust.txt (Windows) + + + + Effect if unset: + + + The whole trust mechanism is turned off. + + + + + Notes: + + + The trust mechanism is an experimental feature for building white-lists and should + be used with care. It is NOT recommended for the casual user. + + + If you specify a trust file, Privoxy will only allow + access to sites that are named in the trustfile. + You can also mark sites as trusted referrers (with +), with + the effect that access to untrusted sites will be granted, if a link from a + trusted referrer was used. + The link target will then be added to the trustfile. + Possible applications include limiting Internet access for children. + + + If you use + operator in the trust file, it may grow considerably over time. 
+ + + + + - - On SuSE Linux systems, you can place a line like /var/log/junkbuster.* - +1024k 644 nobody.nogroup in /etc/logfiles, with - the effect that cron.daily will automatically archive, gzip, and empty the - log, when it exceeds 1M size. - + - - Default: Log to the a file named logfile. - Comment out to disable logging. - + - - - - - logfile logfile - - - - - - The jarfile defines where - Junkbuster stores the cookies it intercepts. Note - that if you use a jarfile, it may grow quite large. Default: - Don't store intercepted cookies. - - - - - - #jarfile jarfile - - - - + - - If you specify a trustfile, - Junkbuster will only allow access to sites that - are named in the trustfile. You can also mark sites as trusted referrers, - with the effect that access to untrusted sites will be granted, if a link - from a trusted referrer was used. The link target will then be added to the - trustfile. This is a very restrictive feature that typical - users most probably want to leave disabled. Default: Disabled, don't use the - trust mechanism. - + +Local Set-up Documentation - - - - - #trustfile trust - - - - - - - If you use the trust mechanism, it is a good idea to write up some on-line - documentation about your blocking policy and to specify the URL(s) here. They - will appear on the page that your users receive when they try to access - untrusted content. Use multiple times for multiple URLs. Default: Don't - display links on the untrusted info page. - + + If you intend to operate Privoxy for more users + than just yourself, it might be a good idea to let them know how to reach + you, what you block and why you do that, your policies etc. + - - - - - trust-info-url http://www.your-site.com/why_we_block.html - trust-info-url http://www.your-site.com/what_we_allow.html - - - - +trust-info-url - + + + Specifies: + + + A URL to be displayed in the error page that users will see if access to an untrusted page is denied. 
+ + + + + Type of value: + + URL + + + + Default value: + + Two example URL are provided + + + + Effect if unset: + + + No links are displayed on the "untrusted" error page. + + + + + Notes: + + + The value of this option only matters if the experimental trust mechanism has been + activated. (See trustfile above.) + + + If you use the trust mechanism, it is a good idea to write up some on-line + documentation about your trust policy and to specify the URL(s) here. + Use multiple times for multiple URLs. + + + The URL(s) should be added to the trustfile as well, so users don't end up + locked out from the information on why they were locked out in the first place! + + + + + - +admin-address + + + Specifies: + + + An email address to reach the proxy administrator. + + + + + Type of value: + + Email address + + + + Default value: + + Unset + + + + Effect if unset: + + + No email address is displayed on error pages and the CGI user interface. + + + + + Notes: + + + If both admin-address and proxy-info-url + are unset, the whole "Local Privoxy Support" box on all generated pages will + not be shown. + + + + + + +proxy-info-url + + + + Specifies: + + + A URL to documentation about the local Privoxy setup, + configuration or policies. + + + + + Type of value: + + URL + + + + Default value: + + Unset + + + + Effect if unset: + + + No link to local documentation is displayed on error pages and the CGI user interface. + + + + + Notes: + + + If both admin-address and proxy-info-url + are unset, the whole "Local Privoxy Support" box on all generated pages will + not be shown. + + + This URL shouldn't be blocked ;-) + + + + + + + - -Other Configuration Options + +Debugging - - This part of the configuration file contains options that control how - Junkbuster operates. - + + These options are mainly useful when tracing a problem. + Note that you might also want to invoke + Privoxy with the --no-daemon + command line option when debugging. 
+ - - Admin-address should be set to the email address of the proxy - administrator. It is used in many of the proxy-generated pages. Default: - fill@me.in.please. - +debug - - - - - #admin-address fill@me.in.please - - - - + + + Specifies: + + + Key values that determine what information gets logged. + + + + + Type of value: + + Integer values + + + + Default value: + + 12289 (i.e.: URLs plus informational and warning messages) + + + + Effect if unset: + + + Nothing gets logged. + + + + + Notes: + + + The available debug levels are: + + + + debug 1 # show each GET/POST/CONNECT request + debug 2 # show each connection status + debug 4 # show I/O status + debug 8 # show header parsing + debug 16 # log all data into the logfile + debug 32 # debug force feature + debug 64 # debug regular expression filter + debug 128 # debug fast redirects + debug 256 # debug GIF de-animation + debug 512 # Common Log Format + debug 1024 # debug kill pop-ups + debug 4096 # Startup banner and warnings. + debug 8192 # Non-fatal errors + + + + To select multiple debug levels, you can either add them or use + multiple debug lines. + + + A debug level of 1 is informative because it will show you each request + as it happens. 1, 4096 and 8192 are highly recommended + so that you will notice when things go wrong. The other levels are probably + only of interest if you are hunting down a specific problem. They can produce + a hell of an output (especially 16). + + + + The reporting of fatal errors (i.e. ones which crash + Privoxy) is always on and cannot be disabled. + + + If you want to use CLF (Common Log Format), you should set debug + 512 ONLY and not enable anything else. + + + + + - - Proxy-info-url can be set to a URL that contains more info - about this Junkbuster installation, it's - configuration and policies. 
It is used in many of the proxy-generated pages - and its use is highly recommended in multi-user installations, since your - users will want to know why certain content is blocked or modified. Default: - Don't show a link to on-line documentation. - +single-threaded - - - - - proxy-info-url http://www.your-site.com/proxy.html - - - - + + + Specifies: + + + Whether to run only one server thread + + + + + Type of value: + + None + + + + Default value: + + Unset + + + + Effect if unset: + + + Multi-threaded (or, where unavailable: forked) operation, i.e. the ability to + serve multiple requests simultaneously. + + + + + Notes: + + + This option is only there for debug purposes and you should never + need to use it. It will drastically reduce performance. + + + + + - - Listen-address specifies the address and port where - Junkbuster will listen for connections from your - Web browser. The default is to listen on the localhost port 8118, and - this is suitable for most users. (In your web browser, under proxy - configuration, list the proxy server as localhost and the - port as 8118). - + - - If you already have another service running on port 8118, or if you want to - serve requests from other machines (e.g. on your local network) as well, you - will need to override the default. The syntax is - listen-address [<ip-address>]:<port>. If you leave - out the IP address, junkbuster will bind to all - interfaces (addresses) on your machine and may become reachable from the - Internet. In that case, consider using access control lists (acl's) (see - aclfile above), or a firewall. - + - - For example, suppose you are running Junkbuster on - a machine which has the address 192.168.0.1 on your local private network - (192.168.0.0) and has another outside connection with a different address. 
- You want it to serve requests from inside only: - + +Access Control and Security - - - - - listen-address 192.168.0.1:8118 - - - - + + This section of the config file controls the security-relevant aspects + of Privoxy's configuration. + - - If you want it to listen on all addresses (including the outside - connection): - +listen-address - - - - - listen-address :8118 - - - - + + + Specifies: + + + The IP address and TCP port on which Privoxy will + listen for client requests. + + + + + Type of value: + + [IP-Address]:Port + + + + Default value: + + localhost:8118 + + + + Effect if unset: + + + Bind to localhost (127.0.0.1), port 8118. This is suitable and recommended for + home users who run Privoxy on the same machine as + their browser. + + + + + Notes: + + + You will need to configure your browser(s) to this proxy address and port. + + + If you already have another service running on port 8118, or if you want to + serve requests from other machines (e.g. on your local network) as well, you + will need to override the default. + + + If you leave out the IP address, Privoxy will + bind to all interfaces (addresses) on your machine and may become reachable + from the Internet. In that case, consider using access control lists (ACL's) + (see ACLs below), or a firewall. + + + + + Example: + + + Suppose you are running Privoxy on + a machine which has the address 192.168.0.1 on your local private network + (192.168.0.0) and has another outside connection with a different address. + You want it to serve requests from inside only: + + + + listen-address 192.168.0.1:8118 + + + + + + - - If you do this, consider using ACLs (see aclfile above). Note: - you will need to point your browser(s) to the address and port that you have - configured here. Default: localhost:8118 (127.0.0.1:8118). - +toggle - - The debug option sets the level of debugging information to log in the - logfile (and to the console in the Windows version). 
A debug level of 1 is - informative because it will show you each request as it happens. Higher - levels of debug are probably only of interest to developers. - + + + Specifies: + + + Initial state of "toggle" status + + + + + Type of value: + + 1 or 0 + + + + Default value: + + 1 + + + + Effect if unset: + + + Act as if toggled on + + + + + Notes: + + + If set to 0, Privoxy will start in + toggled off mode, i.e. behave like a normal, content-neutral + proxy. See enable-remote-toggle + below. This is not really useful anymore, since toggling is much easier + via the web + interface than via editing the conf file. + + + The Windows version will only display the toggle icon in the system tray + if this option is present. + + + + + - - - - - debug 1 # GPC = show each GET/POST/CONNECT request - debug 2 # CONN = show each connection status - debug 4 # IO = show I/O status - debug 8 # HDR = show header parsing - debug 16 # LOG = log all data into the logfile - debug 32 # FRC = debug force feature - debug 64 # REF = debug regular expression filter - debug 128 # = debug fast redirects - debug 256 # = debug GIF de-animation - debug 512 # CLF = Common Log Format - debug 1024 # = debug kill pop-ups - debug 4096 # INFO = Startup banner and warnings. - debug 8192 # ERROR = Non-fatal errors - - - - - - It is highly recommended that you enable ERROR - reporting (debug 8192), at least until the next stable release. - +enable-remote-toggle + + + Specifies: + + + Whether or not the web-based toggle + feature may be used + + + + + Type of value: + + 0 or 1 + + + + Default value: + + 1 + + + + Effect if unset: + + + The web-based toggle feature is disabled. + + + + + Notes: + + + When toggled off, Privoxy acts like a normal, + content-neutral proxy, i.e. it acts as if none of the actions applied to + any URL. 
+ + + For the time being, access to the toggle feature can not be + controlled separately by ACLs or HTTP authentication, + so that everybody who can access Privoxy (see + ACLs and listen-address above) can + toggle it for all users. So this option is not recommended + for multi-user environments with untrusted users. + + + Note that you must have compiled Privoxy with + support for this feature, otherwise this option has no effect. + + + + + - - The reporting of FATAL errors (i.e. ones which crash - JunkBuster) is always on and cannot be disabled. - - - If you want to use CLF (Common Log Format), you should set debug - 512 ONLY, do not enable anything else. - +enable-edit-actions + + + Specifies: + + + Whether or not the web-based actions + file editor may be used + + + + + Type of value: + + 0 or 1 + + + + Default value: + + 1 + + + + Effect if unset: + + + The web-based actions file editor is disabled. + + + + + Notes: + + + For the time being, access to the editor can not be + controlled separately by ACLs or HTTP authentication, + so that everybody who can access Privoxy (see + ACLs and listen-address above) can + modify its configuration for all users. So this option is not + recommended for multi-user environments with untrusted users. + + + Note that you must have compiled Privoxy with + support for this feature, otherwise this option has no effect. + + + + + + + +ACLs: permit-access and deny-access + + + + + + Specifies: + + + Who can access what. + + + + + Type of value: + + + src_addr[/src_masklen] + [dst_addr[/dst_masklen]] + + + Where src_addr and + dst_addr are IP addresses in dotted decimal notation or valid + DNS names, and src_masklen and + dst_masklen are subnet masks in CIDR notation, i.e. integer + values from 2 to 30 representing the length (in bits) of the network address. The masks and the whole + destination part are optional. 
+ + + + + Default value: + + Unset + + + + Effect if unset: + + + Don't restrict access further than implied by listen-address + + + + + Notes: + + + Access controls are included at the request of ISPs and systems + administrators, and are not usually needed by individual users. + For a typical home user, it will normally suffice to ensure that + Privoxy only listens on the localhost or internal (home) + network address by means of the listen-address option. + + + Please see the warnings in the FAQ that this proxy is not intended to be a substitute + for a firewall or to encourage anyone to defer addressing basic security + weaknesses. + + + Multiple ACL lines are OK. + If any ACLs are specified, then the Privoxy + talks only to IP addresses that match at least one permit-access line + and don't match any subsequent deny-access line. In other words, the + last match wins, with the default being deny-access. + + + If Privoxy is using a forwarder (see forward below) + for a particular destination URL, the dst_addr + that is examined is the address of the forwarder and NOT the address + of the ultimate target. This is necessary because it may be impossible for the local + Privoxy to determine the IP address of the + ultimate target (that's often what gateways are used for). + + + You should prefer using IP addresses over DNS names, because the address lookups take + time. All DNS names must resolve! You can not use domain patterns + like *.org or partial domain names. If a DNS name resolves to multiple + IP addresses, only the first one is used. + + + Denying access to particular sites by ACL may have undesired side effects + if the site in question is hosted on a machine which also hosts other sites. + + + + + Examples: + + + Explicitly define the default behavior if no ACL and + listen-address are set: localhost + is OK. 
The absence of a dst_addr implies that + all destination addresses are OK: + + + + permit-access localhost + + + + Allow any host on the same class C subnet as www.privoxy.org access to + nothing but www.example.com: + + + + permit-access www.privoxy.org/24 www.example.com/32 + + + + Allow access from any host on the 26-bit subnet 192.168.45.64 to anywhere, + with the exception that 192.168.45.73 may not access www.dirty-stuff.example.com: + + + + permit-access 192.168.45.64/26 + deny-access 192.168.45.73 www.dirty-stuff.example.com + + + + + + - - Multiple debug directives, are OK - they're logical-OR'd - together. - +buffer-limit - - - - - debug 15 # same as setting the first 4 listed above - - - - + + + Specifies: + + + Maximum size of the buffer for content filtering. + + + + + Type of value: + + Size in Kbytes + + + + Default value: + + 4096 + + + + Effect if unset: + + + Use a 4MB (4096 KB) limit. + + + + + Notes: + + + For content filtering, i.e. the +filter and + +deanimate-gif actions, it is necessary that + Privoxy buffers the entire document body. + This can be potentially dangerous, since a server could just keep sending + data indefinitely and wait for your RAM to exhaust -- with nasty consequences. + Hence this option. + + + When a document buffer size reaches the buffer-limit, it is + flushed to the client unfiltered and no further attempt to + filter the rest of the document is made. Remember that there may be multiple threads + running, which might require up to buffer-limit Kbytes + each, unless you have enabled single-threaded + above. + + + + + + + + + + + + + + +Forwarding - Default: + This feature allows routing of HTTP requests through a chain of + multiple proxies. + It can be used to better protect privacy and confidentiality when + accessing specific domains by routing requests to those domains + through an anonymous public proxy (see e.g. http://www.multiproxy.org/anon_list.htm) + Or to use a caching proxy to speed up browsing. 
Or chaining to a parent + proxy may be necessary because the machine that Privoxy + runs on has no direct Internet access. - - - - debug 1 # URLs - debug 4096 # Info - debug 8192 # Errors - *we highly recommended enabling this* - - - + Also specified here are SOCKS proxies. Privoxy + supports the SOCKS 4 and SOCKS 4A protocols. +forward + + + Specifies: + + + To which parent HTTP proxy specific requests should be routed. + + + + + Type of value: + + + target_domain[:port] + http_parent[:port] + + + Where target_domain is a domain name pattern (see the + chapter on domain matching in the default.action file), + http_parent is the address of the parent HTTP proxy + as an IP address in dotted decimal notation or as a valid DNS name (or . to denote + no forwarding), and the optional + port parameters are TCP ports, i.e. integer + values from 1 to 65535 + + + + + Default value: + + Unset + + + + Effect if unset: + + + Don't use parent HTTP proxies. + + + + + Notes: + + + If http_parent is ., then requests are not + forwarded to another HTTP proxy but are made directly to the web servers. + + + Multiple lines are OK, they are checked in sequence, and the last match wins. + + + + + Examples: + + + Everything goes to an example anonymizing proxy, except SSL on port 443 (which it doesn't handle): + + + + forward .* anon-proxy.example.org:8080 + forward :443 . + + + + Everything goes to our example ISP's caching proxy, except for requests + to that ISP's sites: + + + + forward .*. caching-proxy.example-isp.net:8000 + forward .example-isp.net . + + + + + + + + +forward-socks4 and forward-socks4a + + + + + + Specifies: + + + Through which SOCKS proxy (and to which parent HTTP proxy) specific requests should be routed. 
+ + + + + Type of value: + + + target_domain[:port] + socks_proxy[:port] + http_parent[:port] + + + Where target_domain is a domain name pattern (see the + chapter on domain matching in the default.action file), + http_parent and socks_proxy + are IP addresses in dotted decimal notation or valid DNS names (http_parent + may be . to denote no HTTP forwarding), and the optional + port parameters are TCP ports, i.e. integer values from 1 to 65535 + + + + + Default value: + + Unset + + + + Effect if unset: + + + Don't use SOCKS proxies. + + + + + Notes: + + + Multiple lines are OK, they are checked in sequence, and the last match wins. + + + The difference between forward-socks4 and forward-socks4a + is that in the SOCKS 4A protocol, the DNS resolution of the target hostname happens on the SOCKS + server, while in SOCKS 4 it happens locally. + + + If http_parent is ., then requests are not + forwarded to another HTTP proxy but are made (HTTP-wise) directly to the web servers, albeit through + a SOCKS proxy. + + + + + Examples: + + + From the company example.com, direct connections are made to all + internal domains, but everything outbound goes through + their ISP's proxy by way of example.com's corporate SOCKS 4A gateway to + the Internet. + + + + forward-socks4a .*. socks-gw.example.com:1080 www-cache.example-isp.net:8080 + forward .example.com . + + + + A rule that uses a SOCKS 4 gateway for all destinations but no HTTP parent looks like this: + + + + forward-socks4 .*. socks-gw.example.com:1080 . + + + + + + + +Advanced Forwarding Examples + - Junkbuster normally uses - multi-threading, a software technique that permits it to - handle many different requests simultaneously. In some cases you may wish to - disable this -- particularly if you're trying to debug a problem. The - single-threaded option forces - Junkbuster to handle requests sequentially. - Default: Multi-threaded mode. 
+ If you have links to multiple ISPs that provide various special content + only to their subscribers, you can configure multiple Privoxies + which have connections to the respective ISPs to act as forwarders to each other, so that + your users can see the internal content of all ISPs. - - - - #single-threaded - - - + Assume that host-a has a PPP connection to isp-a.net. And host-b has a PPP connection to + isp-b.net. Both run Privoxy. Their forwarding + configuration can look like this: - toggle allows you to temporarily disable all - Junkbuster's filtering. Just set toggle - 0. + host-a: - The Windows version of Junkbuster puts an icon in - the system tray, which also allows you to change this option. If you - right-click on that icon (or select the Options menu), one - choice is Enable. Clicking on enable toggles - Junkbuster on and off. This is useful if you want - to temporarily disable Junkbuster, e.g., to access - a site that requires cookies which you would otherwise have blocked. This can also - be toggled via a web browser at the Junkbuster - internal address of http://i.j.b on - any platform. + + forward .*. . + forward .isp-b.net host-b:8118 + - toggle 1 means Junkbuster runs - normally, toggle 0 means that - Junkbuster becomes a non-anonymizing non-blocking - proxy. Default: 1 (on). + host-b: - - - - toggle 1 - - - + + forward .*. . + forward .isp-a.net host-a:8118 + - For content filtering, i.e. the +filter and - +deanimate-gif actions, it is necessary that - Junkbuster buffers the entire document body. - This can be potentially dangerous, since a server could just keep sending - data indefinitely and wait for your RAM to exhaust. With nasty consequences. + Now, your users can set their browser's proxy to use either + host-a or host-b and be able to browse the internal content + of both isp-a and isp-b. - The buffer-limit option lets you set the maximum - size in Kbytes that each buffer may use. 
When the documents buffer exceeds - this size, it is flushed to the client unfiltered and no further attempt to - filter the rest of it is made. Remember that there may multiple threads - running, which might require increasing the buffer-limit - Kbytes each, unless you have enabled - single-threaded above. + If you intend to chain Privoxy and + squid locally, then chain as + browser -> squid -> privoxy is the recommended way. - - - - buffer-limit 4069 - - - + Assuming that Privoxy and squid + run on the same box, your squid configuration could then look like this: - To enable the web-based ijb.action file editor set - enable-edit-actions to 1, or 0 to disable. Note - that you must have compiled JunkBuster with - support for this feature, otherwise this option has no effect. This - internal page can be reached at http://i.j.b. - + + # Define Privoxy as parent proxy (without ICP) + cache_peer 127.0.0.1 parent 8118 7 no-query - - Security note: If this is enabled, anyone who can use the proxy - can edit the actions file, and their changes will affect all users. - For shared proxies, you probably want to disable this. Default: enabled. - + # Define ACL for protocol FTP + acl ftp proto FTP - - - - - enable-edit-actions 1 - - - - + # Do not forward FTP requests to Privoxy + always_direct allow ftp - - Allow JunkBuster to be toggled on and off - remotely, using your web browser. Set enable-remote-toggleto - 1 to enable, and 0 to disable. Note that you must have compiled - JunkBuster with support for this feature, - otherwise this option has no effect. + # Forward all the rest to Privoxy + never_direct allow all - Security note: If this is enabled, anyone who can use the proxy can toggle - it on or off (see http://i.j.b), and - their changes will affect all users. For shared proxies, you probably want to - disable this. Default: enabled. + You would then need to change your browser's proxy settings to squid's address and port. + Squid normally uses port 3128. 
If unsure consult http_port in squid.conf. - - - - - enable-remote-toggle 1 - - - - + @@ -1174,2407 +2255,3424 @@ Please choose from the following options: - -Access Control List (ACL) - - Access controls are included at the request of some ISPs and systems - administrators, and are not usually needed by individual users. Please note - the warnings in the FAQ that this proxy is not intended to be a substitute - for a firewall or to encourage anyone to defer addressing basic security - weaknesses. - - - - If no access settings are specified, the proxy talks to anyone that - connects. If any access settings file are specified, then the proxy - talks only to IP addresses permitted somewhere in this file and not - denied later in this file. - - + +Windows GUI Options - Summary -- if using an ACL: + Privoxy has a number of options specific to the + Windows GUI interface: - - - Client must have permission to receive service. - - - - - LAST match in ACL wins. - - - - - Default behavior is to deny service. - - - + - The syntax for an entry in the Access Control List is: + If activity-animation is set to 1, the + Privoxy icon will animate when + Privoxy is active. To turn off, set to 0. - ACTION SRC_ADDR[/SRC_MASKLEN] [ DST_ADDR[/DST_MASKLEN] ] + activity-animation 1 + - Where the individual fields are: + If log-messages is set to 1, + Privoxy will log messages to the console + window: - ACTION = permit-access or deny-access - - SRC_ADDR = client hostname or dotted IP address - SRC_MASKLEN = number of bits in the subnet mask for the source - - DST_ADDR = server or forwarder hostname or dotted IP address - DST_MASKLEN = number of bits in the subnet mask for the target + log-messages 1 - + - The field separator (FS) is whitespace (space or tab). + If log-buffer-size is set to 1, the size of the log buffer, + i.e. the amount of memory used for the log messages displayed in the + console window, will be limited to log-max-lines (see below). 
- IMPORTANT NOTE: If the junkbuster is using a - forwarder (see below) or a gateway for a particular destination URL, the - DST_ADDR that is examined is the address of the forwarder - or the gateway and NOT the address of the ultimate - target. This is necessary because it may be impossible for the local - Junkbuster to determine the address of the - ultimate target (that's often what gateways are used for). + Warning: Setting this to 0 will result in the buffer to grow infinitely and + eat up all your memory! - Here are a few examples to show how the ACL features work: + + + + log-buffer-size 1 + + + + - localhost is OK -- no DST_ADDR implies that - ALL destination addresses are OK: + log-max-lines is the maximum number of lines held + in the log buffer. See above. - permit-access localhost + log-max-lines 200 + - A silly example to illustrate permitting any host on the class-C subnet with - Junkbuster to go anywhere: + If log-highlight-messages is set to 1, + Privoxy will highlight portions of the log + messages with a bold-faced font: - permit-access www.junkbusters.com/24 + log-highlight-messages 1 + - Except deny one particular IP address from using it at all: + The font used in the console window: - deny-access ident.junkbusters.com + log-font-name Comic Sans MS + - You can also specify an explicit network address and subnet mask. - Explicit addresses do not have to be resolved to be used. + Font size used in the console window: - permit-access 207.153.200.0/24 + log-font-size 8 - - A subnet mask of 0 matches anything, so the next line permits everyone. + + + show-on-task-bar controls whether or not + Privoxy will appear as a button on the Task bar + when minimized: - permit-access 0.0.0.0/0 + show-on-task-bar 0 + - Note, you cannot say: + If close-button-minimizes is set to 1, the Windows close + button will minimize Privoxy instead of closing + the program (close with the exit option on the File menu). 
- permit-access .org + close-button-minimizes 1 + - to allow all *.org domains. Every IP address listed must resolve fully. - - - - An ISP may want to provide a Junkbuster that is - accessible by the world and yet restrict use of some of their - private content to hosts on its internal network (i.e. its own subscribers). - Say, for instance the ISP owns the Class-B IP address block 123.124.0.0 (a 16 - bit netmask). This is how they could do it: + The hide-console option is specific to the MS-Win console + version of Privoxy. If this option is used, + Privoxy will disconnect from and hide the + command console. - permit-access 0.0.0.0/0 0.0.0.0/0 # other clients can go anywhere - # with the following exceptions: - - deny-access 0.0.0.0/0 123.124.0.0/16 # block all external requests for - # sites on the ISP's network - - permit 0.0.0.0/0 www.my_isp.com # except for the ISP's main - # web site - - permit 123.124.0.0/16 0.0.0.0/0 # the ISP's clients can go - # anywhere + #hide-console - - Note that if some hostnames are listed with multiple IP addresses, - the primary value returned by DNS (via gethostbyname()) is used. Default: - Anyone can access the proxy. - - + +Actions Files + + + The actions files are used to define what actions + Privoxy takes for which URLs, and thus determines + how ad images, cookies and various other aspects of HTTP content and + transactions are handled, and on which sites (or even parts thereof). There + are three such files included with Privoxy, + with slightly different purposes. default.action sets + the default policies. standard.action is used by + Privoxy and the web based editor to set + pre-defined values (and normally should not be edited). Local exceptions + are best done in user.action. The content of these + can all be viewed and edited from http://config.privoxy.org/show-status. + - -Forwarding - - - This feature allows chaining of HTTP requests via multiple proxies. 
- It can be used to better protect privacy and confidentiality when - accessing specific domains by routing requests to those domains - to a special purpose filtering proxy such as lpwa.com. Or to use - a caching proxy to speed up browsing. + + Anything you want can be blocked, including ads, banners, or just some obnoxious + URL that you would rather not see is done here. Cookies can be accepted or rejected, or + accepted only during the current browser session (i.e. not written to disk), + content can be modified, JavaScripts tamed, user-tracking fooled, and much more. + See below for a complete list of available actions. - It can also be used in an environment with multiple networks to route - requests via multiple gateways allowing transparent access to multiple - networks without having to modify browser configurations. + An actions file typically has sections. Near the top, aliases are + optionally defined (discussed below), then the default set of rules + which will apply universally to all sites and pages. And then below that, + exceptions to the defined universal policies. + + +Finding the Right Mix - Also specified here are SOCKS proxies. Junkbuster - SOCKS 4 and SOCKS 4A. The difference is that SOCKS 4A will resolve the target - hostname using DNS on the SOCKS server, not our local DNS client. + Note that some actions like cookie suppression + or script disabling may render some sites unusable, which rely on these + techniques to work properly. Finding the right mix of actions is not easy and + certainly a matter of personal taste. In general, it can be said that the more + aggressive your default settings (in the top section of the + actions file) are, the more exceptions for trusted sites you + will have to make later. 
If, for example, you want to kill popup windows per + default, you'll have to make exceptions from that rule for sites that you + regularly use and that require popups for actually useful content, like maybe + your bank, favorite shop, or newspaper. - The syntax of each line is: + We have tried to provide you with reasonable rules to start from in the + distribution actions files. But there is no general rule of thumb on these + things. There just are too many variables, and sites are constantly changing. + Sooner or later you will want to change the rules (and read this chapter again :). + + + +How to Edit - - - - forward target_domain[:port] http_proxy_host[:port] - forward-socks4 target_domain[:port] socks_proxy_host[:port] http_proxy_host[:port] - forward-socks4a target_domain[:port] socks_proxy_host[:port] http_proxy_host[:port] - - - + The easiest way to edit the actions files is with a browser by + using our browser-based editor, which can be reached from http://config.privoxy.org/show-status. - If http_proxy_host is ., then requests are not forwarded to a - HTTP proxy but are made directly to the web servers. + If you prefer plain text editing to GUIs, you can of course also directly edit the + the actions files. + - - Lines are checked in sequence, and the last match wins. - + +How Actions are Applied to URLs - There is an implicit line equivalent to the following, which specifies that - anything not finding a match on the list is to go out without forwarding - or gateway protocol, like so: + Actions files are divided into sections. There are special sections, + like the alias sections which will be discussed later. For now + let's concentrate on regular sections: They have a heading line (often split + up to multiple lines for readability) which consist of a list of actions, + separated by whitespace and enclosed in curly braces. Below that, there + is a list of URL patterns, each on a separate line. - - - - forward .* . 
# implicit - - - + To determine which actions apply to a request, the URL of the request is + compared to all patterns in this file. Every time it matches, the list of + applicable actions for the URL is incrementally updated, using the heading + of the section in which the pattern is located. If multiple matches for + the same URL set the same action differently, the last match wins. If not, + the effects are aggregated (e.g. a URL might match both the + +handle-as-image + and +block actions). + - In the following common configuration, everything goes to Lucent's LPWA, - except SSL on port 443 (which it doesn't handle): + You can trace this process by visiting http://config.privoxy.org/show-url-info. - - - - forward .* lpwa.com:8000 - forward :443 . - - - + More detail on this is provided in the Appendix, + Anatomy of an Action. + + + +Patterns - See the FAQ for instructions on how to automate the login procedure for LPWA. - Some users have reported difficulties related to LPWA's use of - . as the last element of the domain, and have said that this - can be fixed with this: - - - - - - - forward lpwa. lpwa.com:8000 - - - - - - - (NOTE: the syntax for specifying target_domain has changed since the - previous paragraph was written -- it will not work now. More information - is welcome.) + Generally, a pattern has the form <domain>/<path>, + where both the <domain> and <path> + are optional. (This is why the pattern / matches all URLs). - - In this fictitious example, everything goes via an ISP's caching proxy, - except requests to that ISP: - + + + www.example.com/ + + + is a domain-only pattern and will match any request to www.example.com, + regardless of which document on that server is requested. + + + + + www.example.com + + + means exactly the same. For domain-only patterns, the trailing / may + be omitted. + + + + + www.example.com/index.html + + + matches only the single document /index.html + on www.example.com. 
+ + + + + /index.html + + + matches the document /index.html, regardless of the domain, + i.e. on any web server. + + + + + index.html + + + matches nothing, since it would be interpreted as a domain name and + there is no top-level domain called .html. + + + + - - - - - forward .* caching.myisp.net:8000 - forward myisp.net . - - - - +The Domain Pattern - For the @home network, we're told the forwarding configuration is this: + The matching of the domain part offers some flexible options: if the + domain starts or ends with a dot, it becomes unanchored at that end. + For example: + + + .example.com + + + matches any domain that ENDS in + .example.com + + + + + www. + + + matches any domain that STARTS with + www. + + + + + .example. + + + matches any domain that CONTAINS .example. + (Correctly speaking: It matches any FQDN that contains example as a domain.) + + + + - - - - forward .* proxy:8080 - - - + Additionally, there are wild-cards that you can use in the domain names + themselves. They work pretty similar to shell wild-cards: * + stands for zero or more arbitrary characters, ? stands for + any single character, you can define character classes in square + brackets and all of that can be freely mixed: - - Also, we're told they insist on getting cookies and JavaScript, so you should - add home.com to the cookie file. We consider JavaScript a security risk. - Java need not be enabled. - + + + ad*.example.com + + + matches adserver.example.com, + ads.example.com, etc but not sfads.example.com + + + + + *ad*.example.com + + + matches all of the above, and then some. + + + + + .?pix.com + + + matches www.ipix.com, + pictures.epix.com, a.b.c.d.e.upix.com etc. + + + + + www[1-9a-ez].example.c* + + + matches www1.example.com, + www4.example.cc, wwwd.example.cy, + wwwz.example.com etc., but not + wwww.example.com. 
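The domain-pattern rules above can be approximated in a few lines of Python. This is only an illustrative sketch, not Privoxy's actual matching code: the function name `domain_matches` is hypothetical, and it assumes that shell-style globbing over the whole host name (via the standard `fnmatch` module) is a close enough stand-in for the wild-card behaviour described. In particular, `*` in this sketch may span dots, which may differ from the real implementation.

```python
from fnmatch import fnmatchcase


def domain_matches(pattern: str, host: str) -> bool:
    """Approximate the documented domain-pattern rules (hypothetical helper).

    A leading or trailing dot leaves that end of the pattern unanchored;
    *, ? and [...] behave like shell wild-cards.
    """
    left_open = pattern.startswith(".")
    right_open = pattern.endswith(".")
    core = pattern.strip(".").lower()
    # Wrap both pattern and host in dots so that anchoring happens on
    # whole domain labels rather than on arbitrary substrings.
    glob = ("*." if left_open else ".") + core + (".*" if right_open else ".")
    return fnmatchcase("." + host.lower().strip(".") + ".", glob)


if __name__ == "__main__":
    for pat, host in [
        (".example.com", "www.example.com"),
        ("www.", "www.example.org"),
        ("ad*.example.com", "sfads.example.com"),
        ("www[1-9a-ez].example.c*", "wwww.example.com"),
    ]:
        print(pat, host, domain_matches(pat, host))
```

Running the script prints one line per pattern/host pair, which makes it easy to compare the sketch against the examples listed in the text.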
+ + + + - - In this example direct connections are made to all internal - domains, but everything else goes through Lucent's LPWA by way of the - company's SOCKS gateway to the Internet. - + - - - - - forward-socks4 .* lpwa.com:8000 firewall.my_company.com:1080 - forward my_company.com . - - - - +The Path Pattern - This is how you could set up a site that always uses SOCKS but no forwarders: + Privoxy uses Perl compatible regular expressions + (through the PCRE library) for + matching the path. - - - - forward-socks4a .* . firewall.my_company.com:1080 - - - + There is an Appendix with a brief quick-start into regular + expressions, and full (very technical) documentation on PCRE regex syntax is available on-line + at http://www.pcre.org/man.txt. + You might also find the Perl man page on regular expressions (man perlre) + useful, which is available on-line at http://www.perldoc.com/perl5.6/pod/perlre.html. - An advanced example for network administrators: + Note that the path pattern is automatically left-anchored at the /, + i.e. it matches as if it would start with a ^ (regular expression speak + for the beginning of a line). - If you have links to multiple ISPs that provide various special content to - their subscribers, you can configure forwarding to pass requests to the - specific host that's connected to that ISP so that everybody can see all - of the content on all of the ISPs. + Please also note that matching in the path is case + INSENSITIVE by default, but you can switch to case + sensitive at any point in the pattern by using the + (?-i) switch: + www.example.com/(?-i)PaTtErN.* will match only + documents whose path starts with PaTtErN in + exactly this capitalization. + - - This is a bit tricky, but here's an example: - + + - - host-a has a PPP connection to isp-a.com. And host-b has a PPP connection to - isp-b.com. host-a can run a Junkbuster proxy with - forwarding like this: - - - - - - forward .* . 
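The path rules (implicit left anchor, case-insensitive by default, optional switch to case-sensitive) can be tried out with Python's `re` module. This is a hedged sketch rather than a reproduction of Privoxy's PCRE handling: `path_matches` is a hypothetical helper, and because Python's `re` does not accept a bare `(?-i)` switch in the middle of a pattern, the sketch uses the scoped form `(?-i:...)` instead.

```python
import re


def path_matches(path_regex: str, path: str) -> bool:
    """Match a path pattern the way the text describes (hypothetical helper).

    re.match only matches at the start of the string, which mirrors the
    implicit left anchor; re.IGNORECASE mirrors the case-insensitive default.
    """
    return re.match(path_regex, path, re.IGNORECASE) is not None


if __name__ == "__main__":
    # Case-insensitive by default.
    print(path_matches(r"/index\.html", "/INDEX.HTML"))
    # Left-anchored: the pattern must match from the leading slash.
    print(path_matches(r"/index\.html", "/docs/index.html"))
    # Scoped (?-i:...) stands in for the (?-i) switch from the text.
    print(path_matches(r"/(?-i:PaTtErN).*", "/PaTtErN/page"))
    print(path_matches(r"/(?-i:PaTtErN).*", "/pattern/page"))
```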
- forward isp-b.com host-b:8118 - - - - + + +Actions - host-b can run a Junkbuster proxy with forwarding - like this: + All actions are disabled by default, until they are explicitly enabled + somewhere in an actions file. Actions are turned on if preceded with a + +, and turned off if preceded with a -. So a + +action means do that action, e.g. + +block means please block the following URL + patterns. - - - - - forward .* . - forward isp-a.com host-a:8118 - - - + + Actions are invoked by enclosing the action name in curly braces (e.g. + {+some_action}), followed by a list of URLs (or patterns that match URLs) to + which the action applies. There are three classes of actions: - Now, anyone on the Internet (including users on host-a - and host-b) can set their browser's proxy to either - host-a or host-b and be able to browse the content on isp-a or isp-b. - + - - Here's another practical example, for University of Kent at - Canterbury students with a network connection in their room, who - need to use the University's Squid web cache. + + + Boolean, i.e the action can only be on or + off. Examples: + + + + + + {+name} # enable this action + {-name} # disable this action + + + + + + + + + + Parameterized, e.g. +/-hide-user-agent{ Mozilla 1.0 }, + where some value is required in order to enable this type of action. + Examples: + + + + + + {+name{param}} # enable action and set parameter to param + {-name} # disable action (parameter) can be omitted + + + + + + + + + + Multi-value, e.g. {+/-add-header{Name: value}} or + {+/-send-wafer{name=value}}), where some value needs to be defined + in addition to simply enabling the action. Examples: + + + + + + {+name{param=value}} # enable action and set param to value + {-name{param=value}} # remove the parameter param completely + {-name} # disable this action totally and remove param too + + + + + + + - - - - forward *. ssbcache.ukc.ac.uk:3128 # Use the proxy, except for: - forward .ukc.ac.uk . 
# Anything on the same domain as us - forward * . # Host with no domain specified - forward 129.12.*.* . # A dotted IP on our /16 network. - forward 127.*.*.* . # Loopback address - forward localhost.localdomain . # Loopback address - forward www.ukc.mirror.ac.uk . # Specific host - - - + If nothing is specified in any actions file, no actions are + taken. So in this case Privoxy would just be a + normal, non-blocking, non-anonymizing proxy. You must specifically enable the + privacy and blocking features you need (although the provided default actions + files will give a good starting point). - If you intend to chain Junkbuster and - squid locally, then chain as - browser -> squid -> junkbuster is the recommended way. + Later defined actions always over-ride earlier ones. So exceptions + to any rules you make, should come in the latter part of the file (or + in a file that is processed later when using multiple actions files). For + multi-valued actions, the actions are applied in the order they are specified. + Actions files are processed in the order they are defined in + config (the default installation has three actions + files). It also quite possible for any given URL pattern to match more than + one action! + - Your squid configuration could then look like this: + The list of valid Privoxy actions are: - - - - - # Define junkbuster as parent cache - - cache_peer 127.0.0.1 parent 8118 0 no-query - - # Define ACL for protocol FTP - acl FTP proto FTP - # Do not forward ACL FTP to junkbuster - always_direct allow FTP + + + + + - # Do not forward ACL CONNECT (https) to junkbuster - always_direct allow CONNECT - # Forward the rest to junkbuster - never_direct allow all - - - - + - + +<emphasis>+add-header</emphasis> - + + + Type: + + + Multi-value. + + + + + Typical uses: + + + Send a user defined HTTP header to the web server. + + + + + + Possible values: + + + Any value is possible. Validity of the defined HTTP headers is not checked. 
+ + + + + + Example usage: + + + {+add-header{X-User-Tracking: sucks}} + .example.com + + + + + + Notes: + + + This action may be specified multiple times, in order to define multiple + headers. This is rarely needed for the typical user. If you don't know what + HTTP headers are, you definitely don't need to worry about this + one. + + + + + + +<emphasis>+block</emphasis> - -Windows GUI Options - - - Junkbuster has a number of options specific to the - Windows GUI interface: - + + + Type: + + + Boolean. + + - - If activity-animation is set to 1, the - Junkbuster icon will animate when - Junkbuster is active. To turn off, set to 0. - + + Typical uses: + + + Used to block a URL from reaching your browser. The URL may be + anything, but is typically used to block ads or other obnoxious + content. + + + - - - - - activity-animation 1 - - - - + + Possible values: + + N/A + + + + + Example usage: + + + {+block} + .banners.example.com + .ads.r.us + + + - - If log-messages is set to 1, - Junkbuster will log messages to the console - window: - + + Notes: + + + If a URL matches one of the blocked patterns, Privoxy + will intercept the URL and display its special BLOCKED page + instead. If there is sufficient space, a large red banner will appear with + a friendly message about why the page was blocked, and a way to go there + anyway. If there is insufficient space a smaller BLOCKED + page will appear without the red banner. + Click here + to view the default blocked HTML page (Privoxy must be running + for this to work as intended!). + - - - - - log-messages 1 - - - - + + A very important exception is if the URL matches both + +block and +handle-as-image, + then it will be handled by + +set-image-blocker + (see below). It is important to understand this process, in order + to understand how Privoxy is able to deal with + ads and other objectionable content. 
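The interplay between section order, +block and +handle-as-image can be sketched as follows. This is a simplified model under stated assumptions, not Privoxy's implementation: the names `resolve_actions`, `blocked_response` and the `url_matches` callback are hypothetical, actions are reduced to on/off flags, and the returned strings are placeholders for the real responses.

```python
def resolve_actions(sections, url_matches):
    """Aggregate actions over all matching sections, in file order.

    sections: list of (actions, patterns) pairs, where actions maps an
    action name to True (+) or False (-).
    url_matches: hypothetical callback that says whether a pattern
    matches the URL being checked.
    """
    active = {}
    for actions, patterns in sections:
        if any(url_matches(p) for p in patterns):
            # Later sections override earlier ones: the last match wins.
            active.update(actions)
    return active


def blocked_response(active):
    """Decide what the browser gets, based on the aggregated actions."""
    if not active.get("block", False):
        return "forward request"
    if active.get("handle-as-image", False):
        # Blocked image: handled by the image blocker, not the HTML page.
        return "image blocker"
    return "BLOCKED page"


# Assumed example sections, loosely modelled on the examples in the text.
sections = [
    ({"block": True}, [".ads.example.com"]),
    ({"handle-as-image": True}, [r"/.*\.gif"]),
]

if __name__ == "__main__":
    # Pretend the URL matches every pattern above.
    print(blocked_response(resolve_actions(sections, lambda p: True)))
```

A URL that matches only the first section would get the BLOCKED page instead, since +handle-as-image was never switched on for it.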
+ + + The +filter + action can also perform some of the + same functionality as +block, but by virtue of very + different programming techniques, and is most often used for different + reasons. + + + - - If log-buffer-size is set to 1, the size of the log buffer, - i.e. the amount of memory used for the log messages displayed in the - console window, will be limited to log-max-lines (see below). - + + - - Warning: Setting this to 0 will result in the buffer to grow infinitely and - eat up all your memory! - - - - - - log-buffer-size 1 - - - - + + +<emphasis>+deanimate-gifs</emphasis> - - log-max-lines is the maximum number of lines held - in the log buffer. See above. - + + + Type: + + + Parameterized. + + - - - - - log-max-lines 200 - - - - + + Typical uses: + + + To stop those annoying, distracting animated GIF images. + + + - - If log-highlight-messages is set to 1, - Junkbuster will highlight portions of the log - messages with a bold-faced font: - + + Possible values: + + + last or first + + + + + + Example usage: + + + {+deanimate-gifs{last}} + .example.com + + + - - - - - log-highlight-messages 1 - - - - + + Notes: + + + De-animate all animated GIF images, i.e. reduce them to their last frame. + This will also shrink the images considerably (in bytes, not pixels!). If + the option first is given, the first frame of the animation + is used as the replacement. If last is given, the last + frame of the animation is used instead, which probably makes more sense for + most banner animations, but also has the risk of not showing the entire + last frame (if it is only a delta to an earlier frame). + + + - - The font used in the console window: - + + - - - - - log-font-name Comic Sans MS - - - - + + +<emphasis>+downgrade-http-version</emphasis> - - Font size used in the console window: - + + + Type: + + + Boolean. 
+ + - - - - - log-font-size 8 - - - - + + Typical uses: + + + +downgrade-http-version will downgrade HTTP/1.1 client requests to + HTTP/1.0 and downgrade the responses as well. + + + - - show-on-task-bar controls whether or not - Junkbuster will appear as a button on the Task bar - when minimized: - + + Possible values: + + + N/A + + + + + + Example usage: + + + {+downgrade-http-version} + .example.com + + + - - - - - show-on-task-bar 0 - - - - + + Notes: + + + Use this action for servers that use HTTP/1.1 protocol features that + Privoxy doesn't handle well yet. HTTP/1.1 is + only partially implemented. Default is not to downgrade requests. This is + an infrequently needed action, and is used to help with rare problem sites only. + + + - - If close-button-minimizes is set to 1, the Windows close - button will minimize Junkbuster instead of closing - the program (close with the exit option on the File menu). - + + - - - - - close-button-minimizes 1 - - - - + + +<emphasis>+fast-redirects</emphasis> - - The hide-console option is specific to the MS-Win console - version of JunkBuster. If this option is used, - Junkbuster will disconnect from and hide the - command console. - + + + Type: + + + Boolean. + + - - - - - #hide-console - - - - + + Typical uses: + + + The +fast-redirects action enables interception of + redirect requests from one server to another, which + are used to track users. Privoxy can cut off + all but the last valid URL in a redirect request and send a local redirect + back to your browser without contacting the intermediate site(s). + + + - - + + Possible values: + + + N/A + + + + + + Example usage: + + + {+fast-redirects} + .example.com + + + - + + Notes: + + + Many sites, like yahoo.com, don't just link to other sites. Instead, they + will link to some script on their own server, giving the destination as a + parameter, which will then redirect you to the final target. 
URLs + resulting from this scheme typically look like: + http://some.place/some_script?http://some.where-else. + + + Sometimes, there are even multiple consecutive redirects encoded in the + URL. These redirections via scripts make your web browsing more traceable, + since the server from which you follow such a link can see where you go + to. Apart from that, valuable bandwidth and time are wasted, while your + browser asks the server for one redirect after the other. Plus, it feeds + the advertisers. + + + This feature is normally on, and often requires exceptions + for sites that are sensitive to defeating this mechanism. + + + + + - - -The Actions File - - The ijb.action file (formerly - actionsfile) is used to define what actions - Junkbuster takes, and thus determines how images, - cookies and various other aspects of HTTP content and transactions are - handled. Images can be anything you want, including ads, banners, or just - some obnoxious image that you would rather not see. Cookies can be accepted - or rejected, or accepted only during the current browser session (i.e. - not written to disk). Changes to ijb.action should - be immediately visible to Junkbuster without - the need to restart. - + + +<emphasis>+filter</emphasis> - - To determine which actions apply to a request, the URL of the request is - compared to all patterns in this file. Every time it matches, the list of - applicable actions for the URL is incrementally updated. You can trace - this process by visiting http://i.j.b/show-url-info. - + + + Type: + + + Parameterized. + + - - The actions file can be edited with a browser by loading - http://i.j.b/, and then select - Edit Actions. - + + Typical uses: + + + Apply page filtering as defined by named sections of the + default.filter file to the specified site(s). + Filtering can be any modification of the raw + page content, including re-writing or deletion of content. 
+ + + - - There are four types of lines in this file: comments (begin with a - # character), actions, aliases and patterns, all of which are - explained below, as well as the configuration file syntax that - Junkbuster understands. + + Possible values: + + + +filter must include the name of one of the section identifiers + from default.filter (or whatever + filterfile is specified in config). + + + + + + Example usage (from the current default.filter): + + + + +filter{html-annoyances}: Get rid of particularly annoying HTML abuse. + + + + + +filter{js-annoyances}: Get rid of particularly annoying JavaScript abuse + + + + + +filter{content-cookies}: Kill cookies that come in the HTML or JS content + + + + + +filter{popups}: Kill all popups in JS and HTML + + + + + +filter{frameset-borders}: Give frames a border and make them resizable + + + + + +filter{webbugs}: Squish WebBugs (1x1 invisible GIFs used for user tracking) + + + + + +filter{refresh-tags}: Kill automatic refresh tags (for dial-on-demand setups) + + + + + +filter{fun}: Text replacements for subversive browsing fun! + + + + + +filter{nimda}: Remove Nimda (virus) code. + + + + + +filter{banners-by-size}: Kill banners by size (very efficient!) + + + + + +filter{shockwave-flash}: Kill embedded Shockwave Flash objects + + + + + +filter{crude-parental}: Kill all web pages that contain the words "sex" or "warez" + + + + + + + Notes: + + + This is potentially a very powerful feature! And requires a knowledge + of regular expressions if you want to roll your own. + Filtering operates on a line by line basis throughout the entire page. + + + Filtering requires buffering the page content, which may appear to + slow down page rendering since nothing is displayed until all content has + passed the filters. (It does not really take longer, but seems that way + since the page is not incrementally displayed.) This effect will be more + noticeable on slower connections. 
+ + + Filtering can achieve some of the same effects as the + +block + action, i.e. it can be used to block ads and banners. In the overall + scheme of things, filtering is one of the first things Privoxy + does with a web page. So most other actions are applied to the + already filtered page. + + + - + + - -URL Domain and Path Syntax - - Generally, a pattern has the form <domain>/<path>, where both the - <domain> and <path> part are optional. If you only specify a - domain part, the / can be left out: - + +<emphasis>+hide-forwarded-for-headers</emphasis> - - www.example.com - is a domain only pattern and will match any request to - www.example.com. - + + + Type: + + + Boolean. + + - - www.example.com/ - means exactly the same. - + + Typical uses: + + + Block any existing X-Forwarded-for HTTP header, and do not add a new one. + + + - - www.example.com/index.html - matches only the single - document /index.html on www.example.com. - + + Possible values: + + + N/A + + + + + + Example usage: + + + {+hide-forwarded-for-headers} + .example.com + + + - - /index.html - matches the document /index.html, regardless of - the domain. - + + Notes: + + + It is fairly safe to leave this on. It does not seem to break many sites. + + + - - index.html - matches nothing, since it would be - interpreted as a domain name and there is no top-level domain called - .html. - + + - - The matching of the domain part offers some flexible options: if the - domain starts or ends with a dot, it becomes unanchored at that end. - For example: - - - .example.com - matches any domain that ENDS in - .example.com. - + + +<emphasis>+hide-from-header</emphasis> - - www. - matches any domain that STARTS with - www. - + + + Type: + + + Parameterized. + + - - Additionally, there are wild-cards that you can use in the domain names - themselves. They work pretty similar to shell wild-cards: * - stands for zero or more arbitrary characters, ? stands for - any single character. 
And you can define character classes in square - brackets and they can be freely mixed: - + + Typical uses: + + + To block the browser from sending your email address in a From: + header. + + + - - ad*.example.com - matches adserver.example.com, - ads.example.com, etc but not sfads.example.com. - + + Possible values: + + + Keyword: block, or any user defined value. + + + + + + Example usage: + + + {+hide-from-header{block}} + .example.com + + + - - *ad*.example.com - matches all of the above, and then some. - + + Notes: + + + The keyword block will completely remove the header + (not to be confused with the +block action). + Alternately, you can specify any value you prefer to send to the web + server. + + + - - .?pix.com - matches www.ipix.com, - pictures.epix.com, a.b.c.d.e.upix.com, etc. - + + - - www[1-9a-ez].example.com - matches www1.example.com, - www4.example.com, wwwd.example.com, - wwwz.example.com, etc., but not - wwww.example.com. - - - If Junkbuster was compiled with - pcre support (default), Perl compatible regular expressions - can be used. See the pcre/docs/ directory or man - perlre (also available on http://www.perldoc.com/perl5.6/pod/perlre.html) - for details. A brief discussion of regular expressions is in the - Appendix. For instance: - + + +<emphasis>+hide-referer</emphasis> + + + + Type: + + + Parameterized. + + - - /.*/advert[0-9]+\.jpe?g - would match a URL from any - domain, with any path that includes advert followed - immediately by one or more digits, then a . and ending in - either jpeg or jpg. So we match - example.com/ads/advert2.jpg, and - www.example.com/ads/banners/advert39.jpeg, but not - www.example.com/ads/banners/advert39.gif (no gifs in the - example pattern). - + + Typical uses: + + + Don't send the Referer: (sic) HTTP header to the web site. + Or, alternately send a forged header instead. 
+ + + - - Please note that matching in the path is case - INSENSITIVE by default, but you can switch to case - sensitive at any point in the pattern by using the - (?-i) switch: - + + Possible values: + + + Prevent the header from being sent with the keyword, block. + Or, forge a URL to one from the same server as the request. + Or, set to user defined value of your choice. + + + + + + Example usage: + + + {+hide-referer{forge}} + .example.com + + + - - www.example.com/(?-i)PaTtErN.* - will match only - documents whose path starts with PaTtErN in - exactly this capitalization. - + + Notes: + + + forge is the preferred option here, since some servers will + not send images back otherwise. + + + +hide-referrer is an alternate spelling of + +hide-referer. It has the exact same parameters, and can be freely + mixed with, +hide-referer. (referrer is the + correct English spelling, however the HTTP specification has a bug - it + requires it to be spelled as referer.) + + + - + + - + + +<emphasis>+hide-user-agent</emphasis> + + + Type: + + + Parameterized. + + - + + Typical uses: + + + To change the User-Agent: header so web servers can't tell + your browser type. Who's business is it anyway? + + + - -Actions - - Actions are enabled if preceded with a +, and disabled if - preceded with a -. Actions are invoked by enclosing the - action name in curly braces (e.g. {+some_action}), followed by a list of - URLs to which the action applies. There are three classes of actions: - + + Possible values: + + + Any user defined string. + + + + + + Example usage: + + + {+hide-user-agent{Netscape 6.1 (X11; I; Linux 2.4.18 i686)}} + .msn.com + + + - - + + Notes: + + + Warning! This breaks many web sites that depend on this in order + to determine how the target browser will respond to various + requests. Use with caution. + + + - - - Boolean (e.g. 
+/-block): - - - - - - {+name} # enable this action - {-name} # disable this action - - - - - + + + + +<emphasis>+handle-as-image</emphasis> - - - parameterized (e.g. +/-hide-user-agent): - - - - - - {+name{param}} # enable action and set parameter to param - {-name} # disable action - - - - - + + + Type: + + + Boolean. + + + + + Typical uses: + + + To define what Privoxy should treat + automatically as an image, and is an important ingredient of how + ads are handled. + + + + + + Possible values: + + + N/A + + + - - - Multi-value (e.g. {+/-add-header{Name: value}}, {+/-wafer{name=value}}): - - - - - - {+name{param}} # enable action and add parameter param - {-name{param}} # remove the parameter param - {-name} # disable this action totally - - - - - + + Example usage: + + + {+handle-as-image} + /.*\.(gif|jpg|jpeg|png|bmp|ico) + + + - - + + Notes: + + + This only has meaning if the URL (or pattern) also is + +blocked, in which case a user definable image can + be sent rather than a HTML page. This is integral to the whole concept of + ad blocking: the URL must match both a +block rule, + and +handle-as-image. + (See +set-image-blocker + below for control over what will actually be displayed by the browser.) + + + There is little reason to change the default definition for this action. + + + - - If nothing is specified in this file, no actions are taken. - So in this case JunkBuster would just be a - normal, non-blocking, non-anonymizing proxy. You must specifically - enable the privacy and blocking features you need (although the - provided default ijb.action file will - give a good starting point). - + + - - Later defined actions always over-ride earlier ones. For multi-valued - actions, the actions are applied in the order they are specified. - - - The list of valid Junkbuster actions are: - + + +<emphasis>+set-image-blocker</emphasis> - - - - - - Add the specified HTTP header, which is not checked for validity. 
- You may specify this many times to specify many different headers: - - - - - - +add-header{Name: value} - - - - - - - - - - Block this URL totally. - - - - - - +block - - - - - - - - - - De-animate all animated GIF images, i.e. reduce them to their last frame. - This will also shrink the images considerably (in bytes, not pixels!). If - the option first is given, the first frame of the animation - is used as the replacement. If last is given, the last frame - of the animation is used instead, which probably makes more sense for most - banner animations, but also has the risk of not showing the entire last - frame (if it is only a delta to an earlier frame). - - - - - - +deanimate-gifs{last} - +deanimate-gifs{first} - - - - - + + + Type: + + + Parameterized. + + + + + Typical uses: + + + Decide what to do with URLs that end up tagged with both + +block + and +handle-as-image, + e.g. an advertisement. + + + + + + Possible values: + + + There are four available options: -set-image-blocker will send an HTML + blocked page, usually resulting in a broken + image icon. + +set-image-blocker{blank} will send a + 1x1 transparent GIF image. + +set-image-blocker{pattern} will send a + checkerboard type pattern (the default). And finally, + +set-image-blocker{http://xyz.com} will + send an HTTP temporary redirect to the specified image. This has the + advantage of the icon being cached by the browser, which will speed + up the display. + + + - - - +downgrade will downgrade HTTP/1.1 client requests to - HTTP/1.0 and downgrade the responses as well. Use this action for servers - that use HTTP/1.1 protocol features that - Junkbuster doesn't handle well yet. HTTP/1.1 - is only partially implemented. Default is not to downgrade requests. - - - - - - +downgrade - - - - - + + Example usage: + + + {+set-image-blocker{blank}} + .example.com + + + + + + Notes: + + + If you want invisible ads, they need to meet + criteria as matching both images and blocked + actions.
And then, image-blocker should be set to + blank for invisibility. Note you cannot treat HTML pages as + images in most cases. For instance, frames require an HTML page to + display. So a frame that is an ad typically cannot be treated as an image. + Forcing an image in this situation just will not work + reliably. + + + + + + + + + +<emphasis>+limit-connect</emphasis> + + + + Type: + + + Parameterized. + + + + + Typical uses: + + + By default, Privoxy only allows HTTP CONNECT + requests to port 443 (the standard, secure HTTPS port). Use + +limit-connect to disable this altogether, or to allow + more ports. + + + + + + Possible values: + + + Any valid port number, or port number range. + + + - - - Many sites, like yahoo.com, don't just link to other sites. Instead, they - will link to some script on their own server, giving the destination as a - parameter, which will then redirect you to the final target. URLs resulting - from this scheme typically look like: - http://some.place/some_script?http://some.where-else. - - - Sometimes, there are even multiple consecutive redirects encoded in the - URL. These redirections via scripts make your web browsing more traceable, - since the server from which you follow such a link can see where you go to. - Apart from that, valuable bandwidth and time is wasted, while your browser - ask the server for one redirect after the other. Plus, it feeds the - advertisers. + + Example usages: + + + + + + +limit-connect{443} # This is the default and need not be specified. + +limit-connect{80,443} # Ports 80 and 443 are OK. + +limit-connect{-3, 7, 20-100, 500-} # Ports less than 3, 7, 20 to 100 and above 500 are OK. + + + + + + Notes: + + + The CONNECT method exists in HTTP to allow access to secure websites + (https:// URLs) through proxies. It works very simply: the proxy connects + to the server on the specified port, and then short-circuits its + connections to the client and to the remote proxy.
+ This can be a big security hole, since CONNECT-enabled proxies can be + abused as TCP relays very easily. - - The +fast-redirects option enables interception of these - requests by Junkbuster, who will cut off all but - the last valid URL in the request and send a local redirect back to your - browser without contacting the remote site. + + If you want to allow CONNECT for more ports than this, or want to forbid + CONNECT altogether, you can specify a comma separated list of ports and + port ranges (the latter using dashes, with the minimum defaulting to 0 and + max to 65K). - - - - +fast-redirects - - - + If you don't know what any of this means, there probably is no reason to + change this one. - + + - - - Filter the website through the re_filterfile: - - - - - - +filter{filename} - - - - - + + - - - Block any existing X-Forwarded-for header, and do not add a new one: - - - - - - +hide-forwarded - - - - - + + +<emphasis>+prevent-compression</emphasis> - - - If the browser sends a From: header containing your e-mail - address, this either completely removes the header (block), or - changes it to the specified e-mail address. - - - - - - +hide-from{block} - +hide-from{spam@sittingduck.xqq} - - - - - - - - - Don't send the Referer: (sic) header to the web site. You - can block it, forge a URL to the same server as the request (which is - preferred because some sites will not send images otherwise) or set it to a - constant string of your choice. - - - - - - +hide-referer{block} - +hide-referer{forge} - +hide-referer{http://nowhere.com} - - - - - - - - - Alternative spelling of +hide-referer. It has the same - parameters, and can be freely mixed with, +hide-referer. - (referrer is the correct English spelling, however the HTTP - specification has a bug - it requires it to be spelled referer.) - - - - - - +hide-referrer{...} - - - - - + + + Type: + + + Boolean. + + - - - Change the User-Agent: header so web servers can't tell your - browser type. Warning! 
This breaks many web sites. Specify the - user-agent value you want. Example, pretend to be using Netscape on - Linux: - - - - - - +hide-user-agent{Mozilla (X11; I; Linux 2.0.32 i586)} - - - - - - + + Typical uses: + + + Prevent the specified websites from compressing HTTP data. + + + - - - Treat this URL as an image. This only matters if it's also +blocked, - in which case a blocked image can be sent rather than a HTML page. - See +image-blocker{} below for the control over what is actually sent. - - - - - - +image - - - - - - - - - Decides what to do with URLs that end up tagged with {+block - +image}, e.g an advertizement. There are five options. - -image-blocker will send a HTML blocked page, - usually resulting in a broken image icon. - +image-blocker{logo} will send a JunkBuster - logo image. +image-blocker{blank} will send a 1x1 - transparent GIF image. And finally, - +image-blocker{http://xyz.com} will send a HTTP temporary - redirect to the specified image. This has the advantage of the icon being - being cached by the browser, which will speed up the display. - +image-blocker{pattern} will send a checkboard type pattern, - which scales better than the logo (which can get blocky if the browser - enlarges it too much). - - - - - - +image-blocker{logo} - +image-blocker{blank} - +image-blocker{http://i.j.b/send-banner} - - - - - - - - - By default (i.e. in the absence of a +limit-connect - action), Junkbuster will only allow CONNECT - requests to port 443, which is the standard port for https as a - precaution. - + + Possible values: + + + N/A + + + - - The CONNECT methods exists in HTTP to allow access to secure websites - (https:// URLs) through proxies. It works very simply: the proxy - connects to the server on the specified port, and then short-circuits - its connections to the client and to the remote proxy. - This can be a big security hole, since CONNECT-enabled proxies can - be abused as TCP relays very easily. 
- - - - If you want to allow CONNECT for more ports than this, or want to forbid - CONNECT altogether, you can specify a comma separated list of ports and - port ranges (the latter using dashes, with the minimum defaulting to 0 and - max to 65K): - + + Example usage: + + + {+prevent-compression} + .example.com + + + - - - - - +limit-connect{443} # This is the default and need no be specified. - +limit-connect{80,443} # Ports 80 and 443 are OK. - +limit-connect{-3, 7, 20-100, 500-} # Port less than 3, 7, 20 to 100 - #and above 500 are OK. - - - - + + Notes: + + + Some websites do this, which can be a problem for + Privoxy, since + +filter, + +kill-popups + and +gif-deanimate + will not work on compressed data. This will slow down connections to those + websites, though. Default typically is to turn + prevent-compression on. + + + - + + + + + +<emphasis>+session-cookies-only</emphasis> + + + + Type: + + + Boolean. + + + + + Typical uses: + + + Allow cookies for the current browser session only. + + + + + + Possible values: + + + N/A + + + - - - +no-compression prevents the website from compressing the - data. Some websites do this, which can be a problem for - Junkbuster, since +filter, - +no-popup and +gif-deanimate will not work on - compressed data. This will slow down connections to those websites, - though. Default is nocompression is turned on. - + + Example usage (disabling): + + + {-session-cookies-only} + .example.com + + + - - - - - +nocompression - - - - - + + Notes: + + + If websites set cookies, +session-cookies-only will make sure + they are erased when you exit and restart your web browser. This makes + profiling cookies useless, but won't break sites which require cookies so + that you can log in for transactions. This is generally turned on for all + sites, and is the recommended setting. + + + +prevent-*-cookies actions should be turned off as well (see + below), for +session-cookies-only to work. Or, else no cookies + will get through at all. 
For, persistent cookies that survive + across browser sessions, see below as well. + + + + + + + + + + +<emphasis>+prevent-reading-cookies</emphasis> + + + + Type: + + + Boolean. + + + + + Typical uses: + + + Explicitly prevent the web server from reading any cookies on your + system. + + + + + + Possible values: + + + N/A + + + - - - If the website sets cookies, no-cookies-keep will make sure - they are erased when you exit and restart your web browser. This makes - profiling cookies useless, but won't break sites which require cookies so - that you can log in for transactions. Default: on. - - - - - - +no-cookies-keep - - - - - + + Example usage: + + + {+prevent-reading-cookies} + .example.com + + + + + + Notes: + + + Often used in conjunction with +prevent-setting-cookies to + disable cookies completely. Note that + +session-cookies-only + requires these to both be disabled (or else it never gets any cookies to cache). + + + For persistent cookies to work (i.e. they survive across browser + sessions and reboots), all three cookie settings should be off + for the specified sites. + + + + + + + + + + +<emphasis>+prevent-setting-cookies</emphasis> + + + + Type: + + + Boolean. + + + + + Typical uses: + + + Explicitly block the web server from storing cookies on your + system. + + + + + + Possible values: + + + N/A + + + - - - Prevent the website from reading cookies: - - - - - - +no-cookies-read - - - - - + + Example usage: + + + {+prevent-setting-cookies} + .example.com + + + + + + Notes: + + + Often used in conjunction with +prevent-reading-cookies to + disable cookies completely (see above). + + + + + + + + + + +<emphasis>+kill-popups<anchor id="kill-popups"></emphasis> + + + Type: + + + Boolean. + + + + + Typical uses: + + + Stop those annoying JavaScript pop-up windows! 
+ + + + + + Possible values: + + + N/A + + + - - - Prevent the website from setting cookies: - - - - - - +no-cookies-set - - - - - + + Example usage: + + + {+kill-popups} + .example.com + + + + + + Notes: + + + +kill-popups uses a built in filter to disable pop-ups + that use the window.open() function, etc. This is + one of the first actions processed by Privoxy + as it contacts the remote web server. This action is not always 100% reliable, + and is supplemented by +filter{popups}. + + + + + + + + + + + +<emphasis>+send-vanilla-wafer</emphasis> + + + + Type: + + + Boolean. + + + + + Typical uses: + + + Sends a cookie for every site stating that you do not accept any copyright + on cookies sent to you, and asking them not to track you. + + + + + + Possible values: + + + N/A + + + - - - Filter the website through a built-in filter to disable those obnoxious - JavaScript pop-up windows via window.open(), etc. The two alternative - spellings are equivalent. - - - - - - +no-popup - +no-popups - - - - - + + Example usage: + + + {+send-vanilla-wafer} + .example.com + + + + + + Notes: + + + This action only applies if you are using a jarfile + for saving cookies. Of course, this is a (relatively) unique header and + could conceivably be used to track you. + + + + + + + + + + +<emphasis>+send-wafer</emphasis> + + + + Type: + + + Multi-value. + + + + + Typical uses: + + + This allows you to send an arbitrary, user definable cookie. + + + + + + Possible values: + + + User specified cookie name and corresponding value. + + + - - - This action only applies if you are using a jarfile - for saving cookies. It sends a cookie to every site stating that you do not - accept any copyright on cookies sent to you, and asking them not to track - you. Of course, this is a (relatively) unique header they could use to - track you. 
- - - - - - +vanilla-wafer - - - - - + + Example usage: + + + {+send-wafer{name=value}} + .example.com + + + + + + Notes: + + + This can be specified multiple times in order to add as many cookies as you + like. + + + + + + + + + + +Actions Examples + + Note that the meaning of any of the above examples is reversed by preceding + the action with a -, in place of the +. Also, + that some actions are turned on in the default section of the actions file, + and require little to no additional configuration. These are just on. + But, other actions that are turned on the default section do + typically require exceptions to be listed in the latter sections of + one of our actions file. For instance, by default no URLs are + blocked (i.e. in the default definitions of + default.action). We need exceptions to this in order to + enable ad blocking in the lower sections. But we need to be very selective + about what we do block. + + + + Below is a liberally commented default.action file to + demonstrate how all the pieces come together. And to show how exceptions to + the default policies can be handled. This is followed by a + user.action with similar examples. + + + + + + + +# Settings -- Don't change! For internal Privoxy use ONLY. +{{settings}} +for-privoxy-version=3.0 + + +########################################################################## +# Aliases must be defined *before* they are used. These are +# easier to remember, and combine several actions into one. Once defined +# they can be used just like any built-in action. +########################################################################## + +# Some useful aliases. 
+ -prevent-cookies = -prevent-setting-cookies -prevent-reading-cookies + +imageblock = +block +handle-as-image + +# Fragile sites should have the minimum changes: + fragile = -block -deanimate-gifs -fast-redirects -filter -hide-referer \ + -prevent-cookies -kill-popups + +# Shops should be allowed to set persistent cookies + shop = -filter -prevent-cookies -session-cookies-only + + +########################################################################## +# Begin default action settings. Anything in this section will match +# all URLs -- UNLESS we have exceptions that match defined below this +# section. We will show all potential actions here whether they are on +# or off. We could omit any disabled action if we wanted, since all +# actions are 'off' by default anyway. Shown for completeness only. +# Actions are enabled if preceded by a '+', otherwise they are disabled. +########################################################################## + { \ + -add-header \ + -block \ + -deanimate-gifs \ + -downgrade-http-version \ + +fast-redirects \ + +filter{html-annoyances} \ + +filter{js-annoyances} \ + -filter{content-cookies} \ + -filter{popups} \ + +filter{webbugs} \ + -filter{refresh-tags} \ + -filter{fun} \ + +filter{nimda} \ + +filter{banners-by-size} \ + -filter{shockwave-flash} \ + -filter{crude-parental} \ + +hide-forwarded-for-headers \ + +hide-from-header{block} \ + -hide-referrer \ + -hide-user-agent \ + -handle-as-image \ + +set-image-blocker{pattern} \ + -limit-connect \ + +prevent-compression \ + -session-cookies-only \ + -prevent-reading-cookies \ + -prevent-setting-cookies \ + -kill-popups \ + -send-vanilla-wafer \ + -send-wafer \ + } + / # forward slash will match *all* potential URL patterns. + +########################################################################## +# Default behavior is now set. Time for some exceptions to our +# default actions.
+########################################################################## + +# These sites are very complex and require very minimal interference. +# We'll disable most actions with our 'fragile' alias. + {fragile} + .office.microsoft.com # surprise, surprise! + .windowsupdate.microsoft.com + + +# Shopping sites - not as fragile but require some special +# handling. We still want to block ads, and we will allow +# persistent cookies via the 'shop' alias. + {shop} + .quietpc.com + .worldpay.com # for quietpc.com + .jungle.com + .scan.co.uk + + +# These sites require pop-ups too :( We'll combine our 'shop' +# alias with two other actions into one rule to allow all popups. + {shop -kill-popups -filter{popups}} + .dabs.com + .overclockers.co.uk + + +# The 'Fast-redirects' action breaks some sites. Disable this action +# for these known sensitive sites. + {-fast-redirects} + www.ukc.ac.uk/cgi-bin/wac\.cgi\? + login.yahoo.com + edit.europe.yahoo.com + .google.com + .altavista.com/.*(like|url|link):http + .altavista.com/trans.*urltext=http + .nytimes.com + + +# Define which file types will be treated as images. Important +# for ad blocking. + {+handle-as-image} + /.*\.(gif|jpe?g|png|bmp|ico) + + +# Now let's list some domains that are known ad generators. And +# our alias here will block these as well as force them to be +# treated as images. This combination of actions is important +# for ad blocking. What the browser will show instead is +# determined by the setting of +set-image-blocker + {+imageblock} + ar.atwola.com + .ad.doubleclick.net + .a.yimg.com/(?:(?!/i/).)*$ + .a[0-9].yimg.com/(?:(?!/i/).)*$ + bs*.gsanet.com + bs*.einets.com + .qkimg.net + ad.*.doubleclick.net + + +# These will just simply be blocked. They will generate the BLOCKED +# banner page, if matched. Heavy use of wildcards and regular +# expressions in this example. + {+block} + ad*. + .*ads. + banner?. + count*. + /.*count(er)?\.(pl|cgi|exe|dll|asp|php[34]?)
+ /(?:.*/)?(publicite|werbung|rekla(ma|me|am)|annonse|maino(kset|nta|s)?)/ + .hitbox.com + + +# The above block section will catch some sites we DO NOT want +# blocked via the wildcards and regular expressions. Now let's set +# exceptions to the exceptions so the good guys get better treatment. + {-block} + advogato.org + adsl. + ad[ud]*. + advice. +# Let's just trust all .edu top level domains. + .edu + www.ugu.com/sui/ugu/adv +# We'll need to access to path names containing 'download' + .*downloads. + /downloads/ +# 'adv' is for globalintersec and means advanced, not advertisement + www.globalintersec.com/adv + + +# Don't filter *anything* from our friends at sourceforge. +# Notice we don't have to name the individual filter +# identifiers -- we just turn them all off in one fell swoop. + {-filter} + .sourceforge.net + + + + + + + + So far we are painting with a broad brush by setting general policies. + The above would be a reasonable starting point for many situations. Now, + we want to be more specific and have customized rules that are more suitable + to our personal habits and preferences. These should be placed in + user.action, which is parsed after all other + actions files. So any settings here, will have the last word. + + + + Now an example of a few things that one might do with a user.action + file. This is where user preferences are defined. + + + + + + + + + Some examples: + + + + Turn off cookies by default, then allow a few through for specified sites + (showing an excerpt from the default section of an actions + file ONLY): + + + + + + + # Excerpt only: + # Allow cookies to and from the server, but + # for this browser session ONLY + { + # other actions normally listed here... + -prevent-setting-cookies \ + -prevent-reading-cookies \ + +session-cookies-only \ + } + / # match all URLs + + # Exceptions to the above, sites that benefit from persistent cookies + # that are saved from one browser session to the next. 
+ { -session-cookies-only } + .javasoft.com + .sun.com + .yahoo.com + .msdn.microsoft.com + .redhat.com + + + + + + + + Now turn off fast redirects, and then we allow two exceptions: + + + + + + + # Turn them off (excerpt only)! + { + # other actions normally listed here... + +fast-redirects + } + / # match all URLs - - - This allows you to add an arbitrary cookie. It can be specified multiple - times in order to add as many cookies as you like. - - - - - - +wafer{name=value} - - - - - + # Reverse it for these two sites, which don't work right without it. + {-fast-redirects} + www.ukc.ac.uk/cgi-bin/wac\.cgi\? + login.yahoo.com + + + + + + + Turn on page filtering according to rules in the defined sections + of default.filter, and make one exception for + Sourceforge: + + + + + + + # Run everything through the filter file, using only certain + # specified sections: + { + # other actions normally listed here... + +filter{html-annoyances} +filter{js-annoyances} +filter{popups} \ + +filter{webbugs} +filter{nimda} +filter{banners-by-size} + } + / #match all URLs + + # Then disable filtering of code from all sourceforge domains! + {-filter} + .sourceforge.net + + + + + + + Now some URLs that we want blocked (normally generates + the blocked banner). Typically, the block + action is off by default in the upper section of an actions file, then enabled + against certain URLs and patterns in the lower part of the file. Many of these use regular expressions that will expand to match multiple + URLs: + + + + + + # Blocklist: + {+block} + ad*. + .*ads. + banner?. + count*. + /.*count(er)?\.(pl|cgi|exe|dll|asp|php[34]?) + /(?:.*/)?(publicite|werbung|rekla(ma|me|am)|annonse|maino(kset|nta|s)?)/ + .hitbox.com + /.*/(ng)?adclient\.cgi + /.*/(plain|live|rotate)[-_.]?ads?/ + /.*/abanners/ + /autoads/ + + + + + + + Note that many of these actions have the potential to cause a page to + misbehave, possibly even not to display at all.
There are many ways + a site designer may choose to design his site, and what HTTP header + content, and other criteria, he may depend on. There is no way to have hard + and fast rules for all sites. See the Appendix for a brief example on troubleshooting + actions. + + + + + + + + + + +Aliases + + Custom actions, known to Privoxy + as aliases, can be defined by combining other actions. + These can in turn be invoked just like the built-in actions. + Currently, an alias can contain any character except space, tab, =, + { or }. But please use only a- + z, 0-9, +, and + -. Alias names are not case sensitive, and + must be defined before other actions in the + actions file! And there can only be one set of aliases + defined per file. Each actions file may have its own aliases, but they are + only visible within that file. + - + + Now let's define a few aliases: - The meaning of any of the above is reversed by preceding the action with a - -, in place of the +. + + + + # Useful custom aliases we can use later. These must come first! + {{alias}} + +prevent-cookies = +prevent-setting-cookies +prevent-reading-cookies + -prevent-cookies = -prevent-setting-cookies -prevent-reading-cookies + fragile = -block -prevent-cookies -filter -fast-redirects -hide-referer -kill-popups + shop = -prevent-cookies -filter -fast-redirects + +imageblock = +block +handle-as-image + + # Aliases defined from other aliases, for people who don't like to type + # too much: ;-) + c0 = +prevent-cookies + c1 = -prevent-cookies + #... etc. Customize to your heart's content. + + + - Some examples: + Some examples using our shop and fragile + aliases from above. These would appear in the lower sections of an + actions file as exceptions to the default actions (as defined in the + upper section): - Turn off cookies by default, then allow a few through for specified sites: + + + + # These sites are very complex and require + # minimal interference. 
+ {fragile} + .office.microsoft.com + .windowsupdate.microsoft.com + .nytimes.com + + # Shopping sites - but we still want to block ads. + {shop} + .quietpc.com + .worldpay.com # for quietpc.com + .scan.co.uk + + # These shops require pop-ups also + {shop -kill-popups} + .dabs.com + .overclockers.co.uk + + + - + + + The shop and fragile aliases are often used for + problem sites that require most actions to be disabled + in order to function properly. + + + + + + + + + + + +The Filter File + + Any web page can be dynamically modified with the filter file. This + modification can be removal, or re-writing, of any web page content, + including tags and non-visible content. The default filter file is + default.filter, located in the config directory. + + + + This is potentially a very powerful feature, and requires knowledge of both + regular expressions and HTML in order to create custom + filters. But, there are a number of useful filters included with + Privoxy for many common situations. + + + + The included example file is divided into sections. Each section begins + with the FILTER keyword, followed by the identifier + for that section, e.g. FILTER: webbugs. Each section performs + a similar type of filtering, such as html-annoyances. + + + + This file uses regular expressions to alter or remove any string in the + target page. The expressions can only operate on one line at a time.
Some + examples from the included default default.filter: + + + + Stop web pages from displaying annoying messages in the status bar by + deleting such references: + + - # Turn off all persistent cookies - { +no-cookies-read } - { +no-cookies-set } - # Allow cookies for this browser session ONLY - { +no-cookies-keep } + FILTER: html-annoyances - # Exceptions to the above, sites that benefit from persistent cookies - { -no-cookies-read } - { -no-cookies-set } - { -no-cookies-keep } - .javasoft.com - .sun.com - .yahoo.com - .msdn.microsoft.com - .redhat.com - - # Alternative way of saying the same thing - {-no-cookies-set -no-cookies-read -no-cookies-keep} - .sourceforge.net - .sf.net + # New browser windows should be resizeable and have a location and status + # bar. Make it so. + # + s/resizable="?(no|0)"?/resizable=1/ig s/noresize/yesresize/ig + s/location="?(no|0)"?/location=1/ig s/status="?(no|0)"?/status=1/ig + s/scrolling="?(no|0|Auto)"?/scrolling=1/ig + s/menubar="?(no|0)"?/menubar=1/ig + + # The <BLINK> tag was a crime! + # + s*<blink>|</blink>**ig + + # Is this evil? + # + #s/framespacing="?(no|0)"?//ig + #s/margin(height|width)=[0-9]*//gi - Now turn off fast redirects, and then we allow two exceptions: + Just for kicks, replace any occurrence of Microsoft with + MicroSuck, and have a little fun with topical buzzwords: - # Turn them off! - {+fast-redirects} - - # Reverse it for these two sites, which don't work right without it. - {-fast-redirects} - www.ukc.ac.uk/cgi-bin/wac\.cgi\? - login.yahoo.com + FILTER: fun + + s/microsoft(?!.com)/MicroSuck/ig + + # Buzzword Bingo: + # + s/industry-leading|cutting-edge|award-winning/<font color=red><b>BINGO!</b></font>/ig - Turn on page filtering, with one exception for sourceforge: + Kill those pesky little web-bugs: - # Run everything through the default filter file (re_filterfile): - {+filter} - - # But please don't re_filter code from sourceforge! 
- {-filter} - .cvs.sourceforge.net + # webbugs: Squish WebBugs (1x1 invisible GIFs used for user tracking) + FILTER: webbugs + + s/<img\s+[^>]*?(width|height)\s*=\s*['"]?1\D[^>]*?(width|height)\s*=\s*['"]?1(\D[^>]*?)?>/<!-- Squished WebBug -->/sig + + + +The +filter Action + + Filters are enabled with the +filter action from within + one of the actions files. +filter requires one parameter, which + should match one of the section identifiers in the filter file itself. Example: + + + + + +filter{html-annoyances} + + + + + This would activate that particular filter. Similarly, +filter + can be turned off for selected sites as: + -filter{html-annoyances}. Remember, all actions are off by + default, unless they are explicitly enabled in one of the actions files. + + + + + + + + + + + + + +Templates + + When Privoxy displays one of its internal + pages, such as a 404 Not Found error page, it uses the appropriate template. + On Linux, BSD, and Unix, these are located in + /etc/privoxy/templates by default. These may be + customized, if desired. cgi-style.css is + used to control the HTML attributes (fonts, etc). + + + The default Blocked banner page with the bright red top + banner, is called just blocked. This + may be customized or replaced with something else if desired. + + + + + + + + + + + + +Contacting the Developers, Bug Reporting and Feature +Requests + + + &contacting; + + + + + + +Copyright and History + +Copyright + + &copyright; + + + + + + + + +History + + &history; + + + + + +See Also + + &seealso; + + + + + + +Appendix + + + + +Regular Expressions + + Privoxy can use regular expressions + in various config files, assuming support for pcre (Perl + Compatible Regular Expressions) is compiled in, which is the default. Such + configuration directives do not require regular expressions, but they can be + used to increase flexibility by matching a pattern with wild-cards against + URLs.
+ + + + If you are reading this, you probably don't understand what regular + expressions are, or what they can do. So this will be a very brief + introduction only. A full explanation would require a book ;-) + + + + Regular expressions are a way of matching one character + expression against another to see if it matches or not. One of the + expressions is a literal string of readable characters + (letters, numbers, etc.), and the other is a complex string of literal + characters combined with wild-cards, and other special characters, called + meta-characters. The meta-characters have special meanings and + are used to build the complex pattern to be matched against. Perl Compatible + Regular Expressions is an enhanced form of the regular expression language + with backward compatibility. + + + + To make a simple analogy, we do something similar when we use wild-card + characters when listing files with the dir command in DOS. + *.* matches all filenames. The special + character here is the asterisk which matches any and all characters. We can be + more specific and use ? to match just individual + characters. So dir file?.txt would match + file1.txt, file2.txt, etc. We are pattern + matching, using a similar technique to regular expressions! + + + + Regular expressions do essentially the same thing, but are much, much more + powerful. There are many more special characters and ways of + building complex patterns, however. Let's look at a few of the common ones, + and then some examples: + + + + + . - Matches any single character, e.g. a, + A, 4, :, or @. + + + + + + ? - The preceding character or expression is matched ZERO or ONE + times. Either/or. + + + + + + + - The preceding character or expression is matched ONE or MORE + times. + + + + + + * - The preceding character or expression is matched ZERO or MORE + times. + + + + + + \ - The escape character denotes that + the following character should be taken literally. 
This is used where one of the + special characters (e.g. .) needs to be taken literally and + not as a special meta-character. Example: example\.com makes + sure the period is recognized only as a period (and not expanded to its + meta-character meaning of any single character). + + + + + + [] - Characters enclosed in brackets will be matched if + any of the enclosed characters are encountered. For instance, [0-9] + matches any numeric digit (zero through nine). As an example, we can combine + this with + to match any digit one or more times: [0-9]+. + + + + + + () - Parentheses are used to group a sub-expression, + or multiple sub-expressions. + + + + + + | - The bar character works like an + or conditional statement. A match is successful if the + sub-expression on either side of | matches. As an example: + /(this|that) example/ uses grouping and the bar character + and would match either this example or that + example, and nothing else. + + + + + + s/string1/string2/g - This is used to rewrite strings of text. + string1 is replaced by string2 in this + example. There must of course be a match on string1 first. + + + + + These are just some of the ones you are likely to use when matching URLs with + Privoxy, and are a long way from a definitive + list. This is enough to get us started with a few simple examples which may + be more illuminating: + + - Now some URLs that we want blocked, ie we won't see them. - Many of these use regular expressions that will expand to match multiple - URLs: + /.*/banners/.* - A simple example + that uses the common combination of . and * to + denote any character, zero or more times. In other words, any string at all. + So we start with a literal forward slash, then our regular expression pattern + (.*), another literal forward slash, the string + banners, another forward slash, and lastly another + .*. We are building + a directory path here. This will match any file with the path that has a + directory named banners in it. 
The .* matches + any characters, and this could conceivably be more forward slashes, so it + might expand into a much longer looking path. For example, this could match: + /eye/hate/spammers/banners/annoy_me_please.gif, or just + /banners/annoying.html, or almost an infinite number of other + possible combinations, just so it has banners in the path + somewhere. - - - - # Blocklist: - {+block} - /.*/(.*[-_.])?ads?[0-9]?(/|[-_.].*|\.(gif|jpe?g)) - /.*/(.*[-_.])?count(er)?(\.cgi|\.dll|\.exe|[?/]) - /.*/(ng)?adclient\.cgi - /.*/(plain|live|rotate)[-_.]?ads?/ - /.*/(sponsor)s?[0-9]?/ - /.*/_?(plain|live)?ads?(-banners)?/ - /.*/abanners/ - /.*/ad(sdna_image|gifs?)/ - /.*/ad(server|stream|juggler)\.(cgi|pl|dll|exe) - /.*/adbanners/ - /.*/adserver - /.*/adstream\.cgi - /.*/adv((er)?ts?|ertis(ing|ements?))?/ - /.*/banner_?ads/ - /.*/banners?/ - /.*/banners?\.cgi/ - /.*/cgi-bin/centralad/getimage - /.*/images/addver\.gif - /.*/images/marketing/.*\.(gif|jpe?g) - /.*/popupads/ - /.*/siteads/ - /.*/sponsor.*\.gif - /.*/sponsors?[0-9]?/ - /.*/advert[0-9]+\.jpg - /Media/Images/Adds/ - /ad_images/ - /adimages/ - /.*/ads/ - /bannerfarm/ - /grafikk/annonse/ - /graphics/defaultAd/ - /image\.ng/AdType - /image\.ng/transactionID - /images/.*/.*_anim\.gif # alvin brattli - /ip_img/.*\.(gif|jpe?g) - /rotateads/ - /rotations/ - /worldnet/ad\.cgi - /cgi-bin/nph-adclick.exe/ - /.*/Image/BannerAdvertising/ - /.*/ad-bin/ - /.*/adlib/server\.cgi - /autoads/ - - - + And now something a little more complex: - - - - - - - -Aliases - Custom actions, known to Junkbuster - as aliases, can be defined by combining other actions. - These can in turn be invoked just like the built-in actions. - Currently, an alias can contain any character except space, tab, =, - { or }. But please use only a- - z, 0-9, +, and - -. Alias names are not case sensitive, and - must be defined before anything else in the - ijb.actionfile ! And there can only be one set of - aliases defined. 
+ /.*/adv((er)?ts?|ertis(ing|ements?))?/ - + We have several literal forward slashes again (/), so we are + building another expression that is a file path statement. We have another + .*, so we are matching against any conceivable sub-path, just so + it matches our expression. The only true literal that must + match our pattern is adv, together with + the forward slashes. What comes after the adv string is the + interesting part. - Now let's define a few aliases: + Remember the ? means the preceding expression (either a + literal character or anything grouped with (...) in this case) + can exist or not, since this means either zero or one match. So + ((er)?ts?|ertis(ing|ements?)) is optional, as are the + individual sub-expressions: (er), + (ing|ements?), and the s. The | + means or. We have two of those. For instance, + (ing|ements?), can expand to match either ing + OR ements?. What is being done here, is an + attempt at matching as many variations of advertisement, and + similar, as possible. So this would expand to match just adv, + or advert, or adverts, or + advertising, or advertisement, or + advertisements. You get the idea. But it would not match + advertizements (with a z). We could fix that by + changing our regular expression to: + /.*/adv((er)?ts?|erti(s|z)(ing|ements?))?/, which would then match + either spelling. - - - - # Useful customer aliases we can use later. These must come first! - {{alias}} - +no-cookies = +no-cookies-set +no-cookies-read - -no-cookies = -no-cookies-set -no-cookies-read - fragile = -block -no-cookies -filter -fast-redirects -hide-referer -no-popups - shop = -no-cookies -filter -fast-redirects - +imageblock = +block +image - - #For people who don't like to type too much: ;-) - c0 = +no-cookies - c1 = -no-cookies - c2 = -no-cookies-set +no-cookies-read - c3 = +no-cookies-set -no-cookies-read - #... etc. Customize to your heart's content. 
- - - + /.*/advert[0-9]+\.(gif|jpe?g) - Again + another path statement with forward slashes. Anything in the square brackets + [] can be matched. This is using 0-9 as a + shorthand expression to mean any digit zero through nine. It is the same as + saying 0123456789. So any digit matches. The + + means one or more of the preceding expression must be included. The preceding + expression here is what is in the square brackets -- in this case, any digit + zero through nine. Then, at the end, we have a grouping: (gif|jpe?g). + This includes a |, so this needs to match the expression on + either side of that bar character also. A simple gif on one side, and the other + side will in turn match either jpeg or jpg, + since the ? means the letter e is optional and + can be matched once or not at all. So we are building an expression here to + match a GIF or JPEG image file. It must include the literal + string advert, then one or more digits, and a . + (which is now a literal, and not a special character, since it is escaped + with \), and lastly either gif, or + jpeg, or jpg. Some possible matches would + include: //advert1.jpg, + /nasty/ads/advert1234.gif, + /banners/from/hell/advert99.jpg. It would not match + advert1.gif (no leading slash), or + /adverts232.jpg (the expression does not include an + s), or /advert1.jsp (jsp is not + in the expression anywhere). - Some examples using our shop and fragile - aliases from above: + s/microsoft(?!.com)/MicroSuck/i - This is + a substitution. MicroSuck will replace any occurrence of + microsoft. The i at the end of the expression + means ignore case. The (?!.com) means + the match should fail if microsoft is followed by + .com. In other words, this acts like a NOT + modifier. In case this is a hyperlink, we don't want to break it ;-). - - - - # These sites are very complex and require - # minimal interference. 
- {fragile} - .office.microsoft.com - .windowsupdate.microsoft.com - .nytimes.com - - # Shopping sites - still want to block ads. - {shop} - .quietpc.com - .worldpay.com # for quietpc.com - .jungle.com - .scan.co.uk + We are barely scratching the surface of regular expressions here so that you + can understand the default Privoxy + configuration files, and maybe use this knowledge to customize your own + installation. There is much, much more that can be done with regular + expressions. Now that you know enough to get started, you can learn more on + your own :/ + - # These shops require pop-ups - {shop -no-popups} - .dabs.com - .overclockers.co.uk - - - + + More reading on Perl Compatible Regular expressions: + http://www.perldoc.com/perl5.6/pod/perlre.html - - -The Filter File - - Any web page can be dynamically modified with the filter file. This - modification can be removal, or re-writing, of any web page content, - including tags and non-visible content. The default filter file is - re_filterfile, located in the config directory. - - - - This file uses regular expressions to alter or remove any string in the - target page. The expressions can only operate on one line at a time .Some - examples from the included default re_filterfile: - - - - Stop web pages from displaying annoying messages in the status bar by - deleting such references: - - - - - - - # The status bar is for displaying link targets, not pointless buzzwords. - # Again, check it out on http://www.airport-cgn.de/. - s/status='.*?';*//ig - - - - + +<application>Privoxy</application>'s Internal Pages - Just for kicks, replace any occurrence of Microsoft with - MicroSuck: - + Since Privoxy proxies each requested + web page, it is easy for Privoxy to + trap certain special URLs. 
In this way, we can talk directly to + Privoxy, and see how it is + configured, see how our rules are being applied, change these + rules and other configuration options, and even turn + Privoxy's filtering off, all with + a web browser. - - - - - s/microsoft(?!.com)/MicroSuck/ig - - - - Kill those auto-refresh tags: + The URLs listed below are the special ones that allow direct access + to Privoxy. Of course, + Privoxy must be running to access these. If + not, you will get a friendly error message. Internet access is not + necessary either. - - - - # Kill refresh tags. I like to refresh myself. Manually. - # check it out on http://www.airport-cgn.de/ and go to the arrivals page. - # - s/<meta[^>]*http-equiv[^>]*refresh.*URL=([^>]*?)"?>/<link rev="x-refresh" href=$1>/i - s/<meta[^>]*http-equiv="?page-enter"?[^>]*content=[^>]*>/<!--no page enter for me-->/i - - - - - - - - - + + + + Privoxy main page: + +
+ + http://config.privoxy.org/ + +
+ + Alternately, this may be reached at http://p.p/, but this + variation may not work as reliably as the above in some configurations. + +
- + + + Show information about the current configuration, including viewing and + editing of actions files: + +
+ + http://config.privoxy.org/show-status + +
+
+ + + + Show the source code version numbers: + +
+ + http://config.privoxy.org/show-version + +
+
+ + + + Show the browser's request headers: + +
+ + http://config.privoxy.org/show-request + +
+
+ + + + Show which actions apply to a URL and why: + +
+ + http://config.privoxy.org/show-url-info + +
+
+ + + + Toggle Privoxy on or off. In this case, Privoxy continues + to run, but only as a pass-through proxy, with no actions taking place: + +
+ + http://config.privoxy.org/toggle + +
+ + Short cuts. Turn off, then on: + +
+ + http://config.privoxy.org/toggle?set=disable + +
+
+ + http://config.privoxy.org/toggle?set=enable + +
+
+ +
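The special URLs listed above all share one base address, so they are easy to build from a script. The following is a minimal sketch only: the helper names are invented for this example, and the proxy address http://localhost:8118 is an assumption that should be replaced by whatever address your Privoxy actually listens on. A request only reaches the internal pages if it is sent through Privoxy.

```python
import urllib.request
from urllib.parse import urlencode

# Base address of the internal pages, as listed above.
CONFIG_BASE = "http://config.privoxy.org/"


def internal_url(page="", **params):
    """Build the URL of an internal page (hypothetical helper).

    `page` is the part after the host, e.g. "show-status"; keyword
    arguments become the query string, e.g. set="disable" for toggle.
    """
    url = CONFIG_BASE + page
    if params:
        url += "?" + urlencode(params)
    return url


def fetch_via_privoxy(url, proxy="http://localhost:8118"):
    """Fetch a URL through the proxy. The default address is an assumption."""
    handler = urllib.request.ProxyHandler({"http": proxy})
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(internal_url("show-status"))
    print(internal_url("toggle", set="disable"))
```

Calling fetch_via_privoxy(internal_url("show-version")) would then return the HTML of the version page, provided Privoxy is running at the assumed address.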
+ - -Templates - When Junkbuster displays one of its internal - pages, such as a 404 Not Found error page, it uses the appropriate template. - On Linux, BSD, and Unix, these are located in - /etc/junkbuster/templates by default. These may be - customized, if desired. + These may be bookmarked for quick reference. See next. - - -
- - - - - -Quickstart to Using Junkbuster + +Bookmarklets - Install package, then run and enjoy! JunkBuster - is typically started by specifying the main configuration file to be - used on the command line. Example Unix startup command: + Below are some bookmarklets to allow you to easily access a + mini version of some of Privoxy's + special pages. They are designed for MS Internet Explorer, but should work + equally well in Netscape, Mozilla, and other browsers which support + JavaScript. They are designed to run directly from your bookmarks - not by + clicking the links below (although that should work for testing). - - - - # /usr/sbin/junkbuster /etc/junkbuster/config - - + To save them, right-click the link and choose Add to Favorites + (IE) or Add Bookmark (Netscape). You will get a warning that + the bookmark may not be safe - just click OK. Then you can run the + Bookmarklet directly from your favorites/bookmarks. For even faster access, + you can put them on the Links bar (IE) or the Personal + Toolbar (Netscape), and run them with a single click. - An init script is provided for SuSE and Redhat. - + - -For for SuSE: /etc/rc.d/junkbuster start - + + + Privoxy - Enable + + - -For RedHat: /etc/rc.d/init.d/junkbuster start - + + + Privoxy - Disable + + + + + Privoxy - Toggle Privoxy (Toggles between enabled and disabled) + + - - If no configuration file is specified on the command line, - Junkbuster will look for a file named - config in the current directory. Except on Win32 where - it will try config.txt. If no file is specified on the - command line and no default configuration file can be found, - Junkbuster will fail to start. - + + + Privoxy- View Status + + - - Be sure your browser is set to use the proxy which is by default at - localhost, port 8118. With Netscape (and - Mozilla), this can be set under Edit - -> Preferences -> Advanced -> Proxies -> HTTP Proxy. - For Internet Explorer: Tools > - Internet Properties -> Connections -> LAN Setting. 
Then, - check Use Proxy and fill in the appropriate info (Address: - localhost, Port: 8118). Include if HTTPS proxy support too. - + + + Privoxy - Submit Filter Feedback + + - - The included default configuration files should give a reasonable starting - point, though may be somewhat aggressive in blocking junk. You will probably - want to keep an eye out for sites that require persistent cookies, and add these to - ijb.action as needed. By default, most of these will - be accepted only during the current browser session, until you add them to - the configuration. If you want the browser to handle this instead, you will - need to edit ijb.action and disable this feature. If you - use more than one browser, it would make more sense to let - Junkbuster handle this. In which case, the - browser(s) should be set to accept all cookies. + - - If a particular site shows problems loading properly, try adding it - to the {fragile} section of - ijb.action. This will turn off most actions for - this site. - - - Junkbuster is HTTP/1.1 compliant, but not all 1.1 - features are as yet implemented. If browsers that support HTTP/1.1 (like - Mozilla or recent versions of I.E.) experience - problems, you might try to force HTTP/1.0 compatibility. For Mozilla, look - under Edit -> Preferences -> Debug -> Networking. - Or set the +downgrade config option in - ijb.action. - - After running Junkbuster for a while, you can - start to fine tune the configuration to suit your personal, or site, - preferences and requirements. There are many, many aspects that can - be customized. Actions (as specified in ijb.action) - can be adjusted by pointing your browser to - http://i.j.b/, - and then follow the link to edit the actions list. - (This is an internal page and does not require Internet access.) + Credit: The site which gave me the general idea for these bookmarklets is + www.bookmarklets.com. They + have more information about bookmarklets. 
- - In fact, various aspects of Junkbuster - configuration can be viewed from this page, including - current configuration parameters, source code version numbers, - the browser's request headers, and actions that apply - to a given URL. In addition to the ijb.action file - editor mentioned above, Junkbuster can also - be turned on and off from this page. - - - If you encounter problems, please verify it is a - Junkbuster bug, by disabling - Junkbuster, and then trying the same page. - Also, try another browser if possible to eliminate browser or site - problems. Before reporting it as a bug, see if there is not a configuration - option that is enabled that is causing the page not to load. You can - then add an exception for that page or site. If a bug, please report it to - the developers (see below). - + + - - -Command Line Options + +Chain of Events - JunkBuster may be invoked with the following - command-line options: + Let's take a quick look at the basic sequence of events when a web page is + requested by your browser and Privoxy is on duty: - - --version - - - Print version info and exit, Unix only. + First, your web browser requests a web page. The browser knows to send + the request to Privoxy, which will in turn, + relay the request to the remote web server after passing the following + tests: - --help + Privoxy traps any request for its own internal CGI + pages (e.g http://p.p/) and sends the CGI page back to the browser. + + - Print a short usage info and exit, Unix only. + Next, Privoxy checks to see if the URL + matches any +block patterns. If + so, the URL is then blocked, and the remote web server will not be contacted. + +handle-as-image + is then checked and if it does not match, an + HTML BLOCKED page is sent back. Otherwise, if it does match, + an image is returned. The type of image depends on the setting of +set-image-blocker + (blank, checkerboard pattern, or an HTTP redirect to an image elsewhere). 
- --no-daemon + Untrusted URLs are blocked. If URLs are being added to the + trust file, then that is done. + + - Don't become a daemon, i.e. don't fork and become process group - leader, don't detach from controlling tty. Unix only. + If the URL pattern matches the +fast-redirects action, + it is then processed. Unwanted parts of the requested URL are stripped. - --pidfile FILE - + Now the rest of the client browser's request headers are processed. If any + of these match any of the relevant actions (e.g. +hide-user-agent, + etc.), headers are suppressed or forged as determined by these actions and + their parameters. + + - On startup, write the process ID to FILE. Delete the - FILE on exit. Failiure to create or delete the - FILE is non-fatal. If no FILE - option is given, no PID file will be used. Unix only. + Now the web server starts sending its response back (i.e. typically a web page and related + data). - --user USER[.GROUP] - + First, the server headers are read and processed to determine, among other + things, the MIME type (document type) and encoding. The headers are then + filtered as determined by the + +prevent-setting-cookies, + +session-cookies-only, + and +downgrade-http-version + actions. + + - After (optionally) writing the PID file, assume the user ID of - USER, and if included the GID of GROUP. Exit if the - privileges are not sufficient to do so. Unix only. + If the +kill-popups + action applies, and it is an HTML or JavaScript document, the popup-code in the + response is filtered on-the-fly as it is received. - configfile + If a +filter + or +deanimate-gifs + action applies (and the document type fits the action), the rest of the page is + read into memory (up to a configurable limit). Then the filter rules (from + default.filter) are processed against the buffered + content. Filters are applied in the order they are specified in the + default.filter file. 
Animated GIFs, if present, are + reduced to either the first or last frame, depending on the action + setting. The entire page, which is now filtered, is then sent by + Privoxy back to your browser. - If no configfile is included on the command line, - JunkBuster will look for a file named - config in the current directory (except on Win32 - where it will look for config.txt instead). Specify - full path to avoid confusion. + If neither +filter + nor +deanimate-gifs + matches, then Privoxy passes the raw data through + to the client browser as it becomes available. - + + + As the browser receives the (now probably filtered) page content, it + reads and then requests any URLs that may be embedded within the page + source, e.g. ad images, stylesheets, JavaScript, other HTML documents (e.g. + frames), sounds, etc. For each of these objects, the browser issues a new + request. And each such request is in turn processed as above. Note that a + complex web page may have many such embedded URLs. + + + - - - - - - -Contacting the Developers, Bug Reporting and Feature -Requests - -We value your feedback. However, to provide you with the best support, -please note: - - - - Use the Sourceforge support forum to get - help. - - Submit bugs only thru our Sourceforge bug - forum. -Make sure that the bug has not already been submitted. Please try to -verify that it is a Junkbuster bug, and not -a browser or site bug first. If you are using your own custom configuration, -please try the stock configs to see if the problem is a configuration -related bug. And if not using the latest development snapshot, please -try the latest one. Or even better, CVS sources. - - - - Submit feature requests only thru our Sourceforge feature request forum. - - - - - + +Anatomy of an Action -For any other issues, feel free to use the mailing lists. + The way Privoxy applies + actions + and filters + to any given URL can be complex, and not always so + easy to understand what is happening. 
And sometimes we need to be able to + see just what Privoxy is + doing. Especially, if something Privoxy is doing + is causing us a problem inadvertently. It can be a little daunting to look at + the actions and filters files themselves, since they tend to be filled with + regular expressions whose consequences are not always + so obvious. - Anyone interested in actively participating in development and related - discussions can join the appropriate mailing list - here. - Archives are available here too. + One quick test to see if Privoxy is causing a problem + or not, is to disable it temporarily. This should be the first troubleshooting + step. See the Bookmarklets section on a quick + and easy way to do this (be sure to flush caches afterward!). - - - - -Copyright and History - - -License - Internet Junkbuster is free software; you can - redistribute it and/or modify it under the terms of the GNU General Public - License as published by the Free Software Foundation; either version 2 of the - License, or (at your option) any later version. + Privoxy also provides the + http://config.privoxy.org/show-url-info + page that can show us very specifically how actions + are being applied to any given URL. This is a big help for troubleshooting. - This program is distributed in the hope that it will be useful, but WITHOUT - ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS - FOR A PARTICULAR PURPOSE. See the GNU General Public License for more - details, which is available from the Free Software Foundation, - Inc, 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + First, enter one URL (or partial URL) at the prompt, and then + Privoxy will tell us + how the current configuration will handle it. This will not + help with filtering effects (i.e. the +filter action) from + the default.filter file since this is handled very + differently and not so easy to trap! 
It also will not tell you about any other + URLs that may be embedded within the URL you are testing. For instance, images + such as ads are expressed as URLs within the raw page source of HTML pages. So + you will only get info for the actual URL that is pasted into the prompt area + -- not any sub-URLs. If you want to know about embedded URLs like ads, you + will have to dig those out of the HTML source. Use your browser's View + Page Source option for this. Or right click on the ad, and grab the + URL. - - - - - - - - -History - Junkbuster was originally written by Anonymous - Coders and Junkbuster's - Corporation, and was released as free open-source software under the - GNU GPL. Stefan - Waldherr made many improvements, and started the SourceForge project to - rekindle development. There are now several active developers contributing. - The last stable release was v2.0.2, which has now grown whiskers ;-). + Let's try an example, google.com, + and look at it one section at a time: - - - - - -See also - - - -   http://sourceforge.net/projects/ijbswa - - - - -   http://ijbswa.sourceforge.net/ - - - - -   http://i.j.b/ - - - - -   http://www.junkbusters.com/ht/en/cookies.html - - - - -   http://www.waldherr.org/junkbuster/ - - - - -   http://privacy.net/analyze/ - - - - -  http://www.squid-cache.org/ - - - - - + + Matches for http://google.com: +--- File standard --- +(no matches in this file) +--- File default --- - -Appendix +{ -add-header -block +deanimate-gifs{last} -downgrade-http-version +fast-redirects + -filter{popups} -filter{fun} -filter{shockwave-flash} -filter{crude-parental} + +filter{html-annoyances} +filter{js-annoyances} +filter{content-cookies} + +filter{webbugs} +filter{refresh-tags} +filter{nimda} +filter{banners-by-size} + +hide-forwarded-for-headers +hide-from-header{block} +hide-referer{forge} + -hide-user-agent -handle-as-image +set-image-blocker{pattern} -limit-connect + +prevent-compression +session-cookies-only -prevent-reading-cookies + 
-prevent-setting-cookies -kill-popups -send-vanilla-wafer -send-wafer } +/ + { -session-cookies-only } + .google.com - - -Regular Expressions - - Junkbuster can use regular expressions - in various config files. Assuming support for pcre (Perl - Compatible Regular Expressions) is compiled in, which is the default. Such - configuration directives do not require regular expressions, but they can be - used to increase flexibility by matching a pattern with wild-cards against - URLs. - + { -fast-redirects } + .google.com - - If you are reading this, you probably don't understand what regular - expressions are, or what they can do. So this will be a very brief - introduction only. A full explanation would require a book ;-) +--- File user --- +(no matches in this file) + - Regular expressions is a way of matching one character - expression against another to see if it matches or not. One of the - expressions is a literal string of readable characters - (letter, numbers, etc), and the other is a complex string of literal - characters combined with wild-cards, and other special characters, called - meta-characters. The meta-characters have special meanings and - are used to build the complex pattern to be matched against. Perl Compatible - Regular Expressions is an enhanced form of the regular expression language - with backward compatibility. + This tells us how we have defined our + actions, and + which ones match for our example, google.com. The first listing + is any matches for the standard.action file. No hits at + all here on standard. Then next is default, or + our default.action file. The large, multi-line listing, + is how the actions are set to match for all URLs, i.e. our default settings. + If you look at your actions file, this would be the section + just below the aliases section near the top. This will apply to + all URLs as signified by the single forward slash at the end of the listing + -- /. 
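The listing above can also be read mechanically. The sketch below is a simplified model and not Privoxy's own code: it assumes each action is written as +name, -name, or with a {parameter}, and that sections are applied in the order shown, with a later section overriding an earlier one (the manual's last-match-wins rule). The function names are invented for this example, and real parameter handling is more involved than what is modelled here.

```python
import re

# One token such as "+filter{webbugs}" or "-block":
# a sign, an action name, and an optional {parameter}.
TOKEN = re.compile(r"([+-])([a-z-]+)(?:\{([^}]*)\})?")


def parse_actions(spec):
    """Map each (name, parameter) pair to True for "+" or False for "-"."""
    return {
        (name, param): sign == "+"
        for sign, name, param in TOKEN.findall(spec)
    }


def merge(sections):
    """Apply sections in order; a later setting overrides an earlier one."""
    result = {}
    for spec in sections:
        result.update(parse_actions(spec))
    return result


if __name__ == "__main__":
    # A shortened version of the default section, then the two
    # .google.com sections from the listing.
    default = "{ -block +fast-redirects +session-cookies-only +filter{webbugs} }"
    final = merge([default, "{ -session-cookies-only }", "{ -fast-redirects }"])
    for (name, param), enabled in sorted(final.items()):
        suffix = "{" + param + "}" if param else ""
        print(("+" if enabled else "-") + name + suffix)
```

With the shortened default section used here, the two later sections switch fast-redirects and session-cookies-only off and leave everything else as the defaults set it.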
- To make a simple analogy, we do something similar when we use wild-card - characters when listing files with the dir command in DOS. - *.* matches all filenames. The special - character here is the asterisk which matches any and all characters. We can be - more specific and use ? to match just individual - characters. So dir file?.text would match - file1.txt, file2.txt, etc. We are pattern - matching, using a similar technique to regular expressions! + But we can define additional actions that would be exceptions to these general + rules, and then list specific URLs (or patterns) that these exceptions would + apply to. Last match wins. Just below this then are two explicit matches for + .google.com. The first is negating our previous cookie setting, + which was for +session-cookies-only + (i.e. not persistent). So we will allow persistent cookies for google. The + second turns off any + +fast-redirects + action, allowing this to take place unmolested. Note that there is a leading + dot here -- .google.com. This will match any hosts and + sub-domains, in the google.com domain also, such as + www.google.com. So, apparently, we have these two actions + defined somewhere in the lower part of our default.action + file, and google.com is referenced somewhere in these latter + sections. - Regular expressions do essentially the same thing, but are much, much more - powerful. There are many more special characters and ways of - building complex patterns however. Let's look at a few of the common ones, - and then some examples: + Then, for our user.action file, we again have no hits. - - - . - Matches any single character, e.g. a, - A, 4, :, or @. - - - - - - ? - The preceding character or expression is matched ZERO or ONE - times. Either/or. - - - - - - + - The preceding character or expression is matched ONE or MORE - times. - - - - - - * - The preceding character or expression is matched ZERO or MORE - times. 
- - - - - - \ - The escape character denotes that - the following character should be taken literally. This is used where one of the - special characters (e.g. .) needs to be taken literally and - not as a special meta-character. - - - - - - [] - Characters enclosed in brackets will be matched if - any of the enclosed characters are encountered. - - - - - - () - parentheses are used to group a sub-expression, - or multiple sub-expressions. - - - - - - | - The bar character works like an - or conditional statement. A match is successful if the - sub-expression on either side of | matches. - - + + And finally we pull it all together in the bottom section and summarize how + Privoxy is applying all its actions + to google.com: - - - s/string1/string2/g - This is used to rewrite strings of text. - string1 is replaced by string2 in this - example. - - + - These are just some of the ones you are likely to use when matching URLs with - Junkbuster, and is a long way from a definitive - list. This is enough to get us started with a few simple examples which may - be more illuminating: + + + Final results: + -add-header -block +deanimate-gifs{last} -downgrade-http-version -fast-redirects + -filter{popups} -filter{fun} -filter{shockwave-flash} -filter{crude-parental} + +filter{html-annoyances} +filter{js-annoyances} +filter{content-cookies} + +filter{webbugs} +filter{refresh-tags} +filter{nimda} +filter{banners-by-size} + +hide-forwarded-for-headers +hide-from-header{block} +hide-referer{forge} + -hide-user-agent -handle-as-image +set-image-blocker{pattern} -limit-connect + +prevent-compression -session-cookies-only -prevent-reading-cookies + -prevent-setting-cookies -kill-popups -send-vanilla-wafer -send-wafer + - /.*/banners/.* - A simple example - that uses the common combination of . and * to - denote any character, zero or more times. In other words, any string at all. 
- So we start with a literal forward slash, then our regular expression pattern - (.*) another literal forward slash, the string - banners, another forward slash, and lastly another - .*. We are building - a directory path here. This will match any file with the path that has a - directory named banners in it. The .* matches - any characters, and this could conceivably be more forward slashes, so it - might expand into a much longer looking path. For example, this could match: - /eye/hate/spammers/banners/annoy_me_please.gif, or just - /banners/annoying.html, or almost an infinite number of other - possible combinations, just so it has banners in the path - somewhere. + Notice the only difference here to the previous listing, is to + fast-redirects and session-cookies-only. - A now something a little more complex: + Now another example, ad.doubleclick.net: - /.*/adv((er)?ts?|ertis(ing|ements?))?/ - - We have several literal forward slashes again (/), so we are - building another expression that is a file path statement. We have another - .*, so we are matching against any conceivable sub-path, just so - it matches our expression. The only true literal that must - match our pattern is adv, together with - the forward slashes. What comes after the adv string is the - interesting part. + + + { +block +handle-as-image } + .ad.doubleclick.net + + { +block +handle-as-image } + ad*. + + { +block +handle-as-image } + .doubleclick.net + - Remember the ? means the preceding expression (either a - literal character or anything grouped with (...) in this case) - can exist or not, since this means either zero or one match. So - ((er)?ts?|ertis(ing|ements?)) is optional, as are the - individual sub-expressions: (er), - (ing|ements?), and the s. The | - means or. We have two of those. For instance, - (ing|ements?), can expand to match either ing - OR ements?. What is being done here, is an - attempt at matching as many variations of advertisement, and - similar, as possible. 
So this would expand to match just adv, - or advert, or adverts, or - advertising, or advertisement, or - advertisements. You get the idea. But it would not match - advertizements (with a z). We could fix that by - changing our regular expression to: - /.*/adv((er)?ts?|erti(s|z)(ing|ements?))?/, which would then match - either spelling. + We'll just show the interesting part here, the explicit matches. It is + matched three different times. Each as an +block +handle-as-image, + which is the expanded form of one of our aliases that had been defined as: + +imageblock. (Aliases are defined in + the first section of the actions file and typically used to combine more + than one action.) - /.*/advert[0-9]+\.(gif|jpe?g) - Again - another path statement with forward slashes. Anything in the square brackets - [] can be matched. This is using 0-9 as a - shorthand expression to mean any digit one through nine. It is the same as - saying 0123456789. So any digit matches. The + - means one or more of the preceding expression must be included. The preceding - expression here is what is in the square brackets -- in this case, any digit - one through nine. Then, at the end, we have a grouping: (gif|jpe?g). - This includes a |, so this needs to match the expression on - either side of that bar character also. A simple gif on one side, and the other - side will in turn match either jpeg or jpg, - since the ? means the letter e is optional and - can be matched once or not at all. So we are building an expression here to - match image GIF or JPEG type image file. It must include the literal - string advert, then one or more digits, and a . - (which is now a literal, and not a special character, since it is escaped - with \), and lastly either gif, or - jpeg, or jpg. Some possible matches would - include: //advert1.jpg, - /nasty/ads/advert1234.gif, - /banners/from/hell/advert99.jpg. 
It would not match - advert1.gif (no leading slash), or - /adverts232.jpg (the expression does not include an - s), or /advert1.jsp (jsp is not - in the expression anywhere). + Any one of these would have done the trick and blocked this as an unwanted + image. This is redundant since the last case effectively + would also cover the first. No point in taking chances with these guys + though ;-) Note that if you want an ad or obnoxious + URL to be invisible, it should be defined as ad.doubleclick.net + is done here -- as both a +block + and an + +handle-as-image. + The custom alias +imageblock just simplifies the process and makes + it more readable. - s/microsoft(?!.com)/MicroSuck/i - This is - a substitution. MicroSuck will replace any occurrence of - microsoft. The i at the end of the expression - means ignore case. The (?!.com) means - the match should fail if microsoft is followed by - .com. In other words, this acts like a NOT - modifier. In case this is a hyperlink, we don't want to break it ;-). + One last example. Let's try http://www.rhapsodyk.net/adsl/HOWTO/. + This one is giving us problems. We are getting a blank page. Hmmm... - We are barely scratching the surface of regular expressions here so that you - can understand the default Junkbuster - configuration files, and maybe use this knowledge to customize your own - installation. There is much, much more that can be done with regular - expressions.
Now that you know enough to get started, you can learn more on - your own :/ + + + Matches for http://www.rhapsodyk.net/adsl/HOWTO/: + + { -add-header -block +deanimate-gifs -downgrade-http-version +fast-redirects + +filter{html-annoyances} +filter{js-annoyances} +filter{kill-popups} + +filter{webbugs} +filter{nimda} +filter{banners-by-size} +filter{hal} + +filter{fun} +hide-forwarded-for-headers +hide-from-header{block} + +hide-referer{forge} -hide-user-agent -handle-as-image +set-image-blocker{blank} + +prevent-compression +session-cookies-only -prevent-setting-cookies + -prevent-reading-cookies +kill-popups -send-vanilla-wafer -send-wafer } + / + + { +block +handle-as-image } + /ads + - More reading on Perl Compatible Regular expressions: - http://www.perldoc.com/perl5.6/pod/perlre.html + Ooops, the /adsl/ is matching /ads! But + we did not want this at all! Now we see why we get the blank page. We could + now add a new action below this that explicitly does not + block ({-block}) paths with adsl. There are + various ways to handle such exceptions. Example: - + + - + { -block } + /adsl + + + + Now the page displays ;-) Be sure to flush your browser's caches when + making such changes. Or, try using Shift+Reload. + - - -JunkBuster's Internal Pages + + But now what about a situation where we get no explicit matches like + we did with: + - Since JunkBuster proxies each requested - web page, it is easy for JunkBuster to - trap certain URLs. In this way, we can talk directly to - JunkBuster, and see how it is - configured, see how our rules are being applied, change these - rules and other configuration options, and even turn - JunkBuster off. + + { +block +handle-as-image } + /ads + - The URLs listed below are the special ones that allow direct access - to JunkBuster. Of course, - JunkBuster must be running to access these. If - not, you will get a friendly error message. + That actually was very telling and pointed us quickly to where the problem + was. 
If you don't get this kind of match, then it means one of the default + rules in the first section is causing the problem. This would require some + guesswork, and maybe a little trial and error to isolate the offending rule. + One likely cause would be one of the {+filter} actions. Try + adding the URL for the site to one of the aliases that turn off +filter: + + + + + {shop} + .quietpc.com + .worldpay.com # for quietpc.com + .jungle.com + .scan.co.uk + .forbes.com + - + {shop} is an alias that expands to + { -filter -session-cookies-only }. + Or you could do your own exception to negate filtering: - - - Junkbuster main page: - -
- - http://ijbswa.sourceforge.net/config/ - -
- - Alternately, this may be reached at http://i.j.b/, - but this variation may not work as reliably as the above in some - configurations. - -
+
- - - Show information about the current configuration: - -
- - http://ijbswa.sourceforge.net/config/show-status - -
-
- - - - Show the source code version numbers: - -
- - http://ijbswa.sourceforge.net/config/show-version - -
-
- - - - Show the client's request headers: - -
- - http://ijbswa.sourceforge.net/config/show-request - -
-
- - - - Show which actions apply to a URL and why: - -
- - http://ijbswa.sourceforge.net/config/show-url-info - -
-
- - - - Toggle JunkBuster on or off: - -
- - http://ijbswa.sourceforge.net/config/toggle - -
- - Short cuts. Turn off, then on: - -
- - http://ijbswa.sourceforge.net/config/toggle?set=disable - -
-
- - http://ijbswa.sourceforge.net/config/toggle?set=enable - -
-
+ + - - - Edit the actions list file: - -
- - http://ijbswa.sourceforge.net/config/edit-actions - -
-
- - + {-filter} + .forbes.com +
- These may be bookmarked for quick reference. + This would probably be most appropriately put in user.action, + for local site exceptions. + + + {fragile} is an alias that disables most actions. This can be + used as a last resort for problem sites. Remember to flush caches! If this + still does not work, you will have to go through the remaining actions one by + one to find which one(s) are causing the problem.
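The /ads versus /adsl case described in this hunk can be sketched in a few lines of Python. This is a simplified, hypothetical model and not Privoxy's actual matcher: the `resolve` function and the `sections` list are invented for illustration, and path patterns are treated as plain prefixes rather than regular expressions. It is only meant to show that every matching section is applied in file order, so the last match wins.

```python
# Hypothetical, simplified model of "last match wins".
# Assumption: a path pattern matches when it is a prefix of the request path.
# Privoxy's real pattern matching is richer than this.

def resolve(url_path, sections):
    """Apply every section whose pattern prefixes url_path, in file order."""
    state = {}
    for actions, prefix in sections:
        if url_path.startswith(prefix):
            # Later sections override the settings of earlier ones.
            state.update(actions)
    return state


# Each entry pairs a set of actions (True = "+", False = "-") with a pattern.
sections = [
    ({"block": False, "handle-as-image": False}, "/"),
    ({"block": True, "handle-as-image": True}, "/ads"),
    ({"block": False}, "/adsl"),  # the exception added to fix the blank page
]

# Without the exception, /ads also catches /adsl/HOWTO/ and the page is blocked.
before = resolve("/adsl/HOWTO/", sections[:2])

# With the exception listed last, the block is switched off again.
after = resolve("/adsl/HOWTO/", sections)
```

In this sketch the exception only works because it comes after the /ads section; listing it first would leave the page blocked.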
@@ -3602,6 +5700,167 @@ For any other issues, feel free to use the