X-Git-Url: http://www.privoxy.org/gitweb/?p=privoxy.git;a=blobdiff_plain;f=doc%2Fsource%2Fuser-manual.sgml;h=1e1c9d2710e51310cbd2701484aab33d72286a06;hp=da10618b8c8e00ff30bae389eaa14ba44f77a7a3;hb=354e3dc6f1e2091e190238b0129aa962deff3472;hpb=05dfb69d1ff3a6e1aad8887f645a24268a9bb6b4

diff --git a/doc/source/user-manual.sgml b/doc/source/user-manual.sgml
index da10618b..1e1c9d27 100644
--- a/doc/source/user-manual.sgml
+++ b/doc/source/user-manual.sgml
@@ -1,295 +1,599 @@
- + + + + + + + + + + + + + + + + + +]> -
-Junkbuster User Manual +Privoxy User Manual -$Id: user-manual.sgml,v 1.4 2001/09/24 01:27:56 hal9 Exp $ +$Id: user-manual.sgml,v 1.82 2002/04/18 12:04:50 oes Exp $ - By: Junkbuster Developers + By: Privoxy Developers + - The user manual gives the users information on how to install and configure - Internet Junkbuster. Internet - Junkbuster is an application that provides privacy and - security to users of the World Wide Web. + + This is here to keep vim syntax file from breaking :/ + If I knew enough to fix it, I would. + PLEASE DO NOT REMOVE! HB: hal@foobox.net + +]]> + -You can find the latest version of the user manual at http://ijbswa.sourceforge.net/doc/user-manual/. - + The user manual gives users information on how to install, configure and use + Privoxy. + + + + &p-intro; + - Feel free to send a note to the developers at ijbswa-developers@lists.sourceforge.net. - + You can find the latest version of the user manual at http://www.privoxy.org/user-manual/. + Please see the Contact section on how to + contact the developers. + + + + + + + + + + + -Introduction +Introduction + - Internet Junkbuster is a web proxy with advanced - filtering capabilities for protecting privacy, filtering web page content, - managing cookies and removing ads, banners, pop-ups and other obnoxious - Internet Junk. Junkbuster has a very flexible - configuration and can be customized to suit individual needs and tastes. - Internet Junkbuster has application for both - stand-alone systems and multi-user networks. + This documentation is included with the current &p-status; version of + Privoxy, v.&p-version;soon ;-)]]>. + - This documentation is included with the current development version of - Internet Junkbuster and is incomplete at this - point. The most up to date reference for the time being is still the comments - in the source files and in the individual configuration files. 
Development - of version 3.0 is currently underway, and includes significant changes and - enhancements over earlier verions. + Since this is a &p-status; version, not all new features are well tested. This + documentation may be slightly out of sync as a result (especially with + CVS sources). And there may be bugs, though hopefully + not many! +]]> + + +New Features - Since this is a development version, there are bugs! + In addition to Internet Junkbuster's traditional + features of ad and banner blocking and cookie management, + Privoxy provides new features: + + + &newfeatures; + + + + + + + + +Installation - -License - Internet Junkbuster is free software; you can - redistribute it and/or modify it under the terms of the GNU General Public - License as published by the Free Software Foundation; either version 2 of the - License, or (at your option) any later version. + Privoxy is available both in convenient pre-compiled + packages for a wide range of operating systems, and as raw source code. + For most users, we recommend using the packages, which can be downloaded from our + Privoxy Project Page. - This program is distributed in the hope that it will be useful, but WITHOUT - ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS - FOR A PARTICULAR PURPOSE. See the GNU General Public License for more - details, which is available from the Free Software Foundation, - Inc, 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + If you like to live on the bleeding edge and are not afraid of using + possibly unstable development versions, you can check out the up-to-the-minute + version directly from the + CVS repository or simply download the nightly CVS + tarball. - - - - + + &supported; + +Binary Packages - -History - Junkbuster was originally written by JunkBusters - Corporation, and was released as free open-source software under the - GNU GPL. 
Stefan - Waldherr made many improvements, and started the SourceForge project to - rekindle development. + Note: If you have a previous Junkbuster or + Privoxy installation on your system, you + will either need to remove it, or that might be done by the setup + procedure. (See below for your platform). - + + In any case be sure to backup your old configuration + if it is valuable to you. In that case, also see the + note to upgraders. + - + + How to install the binary packages depends on your operating system: + - + +Redhat and SuSE RPMs + + RPMs can be installed with rpm -Uvh <name-of-rpm.rpm>, + and will use /etc/privoxy for configuration files. + - -Installation - Junkbuster is available as raw source code, or - pre-compiled binaries. See the Junkbuster Home Page - for current releases. Junkbuster is also available - via CVS. - This is the recommended approach at this time. + Note that if you have a Junkbuster RPM installed + on your system, you need to remove it first, because the packages conflict. + Otherwise, RPM will try removing Junkbuster automaticaly, before installing + privoxy. + -Source +Debian - For gzipped tar archives, unpack the source: + FIXME. + + + +Windows - - tar zxvf ijb_source_2.9* - cd ijb_source_2.9* - + Just double-click the installer, which will guide you through + the installation process. + + + +Solaris, NetBSD, FreeBSD, HP-UX - For retrieving the current CVS sources, you'll need the CVS - package installed first. To download CVS source: + Create a new directory, cd to it, then unzip and + untar the archive. For the most part, you'll have to figure out where + things go. FIXME. + + + +OS/2 - - cvs -d:pserver:anonymous@cvs.ijbswa.sourceforge.net:/cvsroot/ijbswa login - cvs -z3 -d:pserver:anonymous@cvs.ijbswa.sourceforge.net:/cvsroot/ijbswa co current - cd current - + First, make sure that no previous installations of + Junkbuster and / or + Privoxy are left on your + system. 
- This will create a directory named current/, which will
- contain the source tree.
+ Then, just double-click the WarpIN self-installing archive, which will
+ guide you through the installation process. A shadow of the
+ Privoxy executable will be placed in your
+ startup folder so it will start automatically whenever OS/2 starts.
- Then, in either case, to build from source:
+ The directory you choose to install Privoxy
+ into will contain all of the configuration files.
+
+
+Mac OS X
-
- ./configure
- make
- su
- make install
-
+ FIXME.
+
+
+AmigaOS
- For Redhat and SuSE Linux RPM packages, see below.
+ Unpack the .lha archive, then FIXME.
+
+
+
+
+Building from Source
+
+&buildsource;
+
+
+
+
+
+
+
+
+Quickstart to Using <application>Privoxy</application>
+
-Red Hat
+
+Note to Upgraders
- To build Redhat RPM packages, install source as above. Then:
+ There are very significant changes from older versions of
+ Junkbuster to the current
+ Privoxy. Configuration is substantially
+ changed. Junkbuster 2.0.x and earlier
+ configuration files will not migrate. The functionality of the old
+ blockfile, cookiefile and
+ imagelist is now combined into the
+ actions file (default.action
+ for most installations).
-
-
- ./configure
- make redhat-dist
-
+ A filter file (typically default.filter)
+ is new with Privoxy 2.9.x, and provides some
+ of the new sophistication (explained below). config is
+ much the same as before.
+
+
+ If upgrading from a 2.0.x version, you will have to use the new config
+ files, and possibly adapt any personal rules from your older files.
+ When porting personal rules over from the old blockfile
+ to the new actions file, please note that even the pattern syntax has
+ changed.
+ If upgrading from 2.9.x development versions, it is still recommended
+ to use the new configuration files.
+
+
+ A quick list of things to be aware of before upgrading:
- This will create both binary and src RPMs in the usual places.
Example: + + + + + The default listening port is now 8118 due to a conflict with another + service (NAS). + + + + + Some installers may remove earlier versions completely. Save any + important configuration files! + + + + + Privoxy is controllable with a web browser + at the special URL: http://config.privoxy.org/ + (Shortcut: http://p.p/). Many + aspects of configuration can be done here, including temporarily disabling + Privoxy. + + + + + The primary configuration file for cookie management, ad and banner + blocking, and many other aspects of Privoxy + configuration is default.action. It is strongly + recommended to become familiar with the new actions concept below, + before modifying this file. + + + + + + + Some installers may not automatically start + Privoxy after installation. + + + + + + + + +Starting <application>Privoxy</application> -    /usr/src/redhat/RPMS/i686/junkbuster-2.9.8-1.i686.rpm + Before launching Privoxy for the first time, you + will want to configure your browser(s) to use Privoxy + as a HTTP and HTTPS proxy. The default is localhost for the proxy address, + and port 8118 (earlier versions used port 8000). This is the one required + configuration that must be done! + + + + With Netscape (and + Mozilla), this can be set under Edit + -> Preferences -> Advanced -> Proxies -> HTTP Proxy. + For Internet Explorer: Tools -> + Internet Properties -> Connections -> LAN Setting. Then, + check Use Proxy and fill in the appropriate info (Address: + localhost, Port: 8118). Include if HTTPS proxy support too. + -    /usr/src/redhat/SRPMS/junkbuster-2.9.8-1.src.rpm + After doing this, flush your browser's disk and memory caches to force a + re-reading of all pages and to get rid of any ads that may be cached. You + are now ready to start enjoying the benefits of using + Privoxy. + - To install, of course: + Privoxy is typically started by specifying the + main configuration file to be used on the command line. 
Example Unix startup
+ command:
- rpm -Uvv /usr/src/redhat/RPMS/i686/junkbuster-2.9.8-1.i686.rpm
+
+ # /usr/sbin/privoxy /etc/privoxy/config
+
- This will place the Junkbuster configuration
- files in /etc/junkbuster/, and log files in
- /var/log/junkbuster/.
+ An init script is provided for SuSE and Redhat.
-
+
+ For SuSE: rcprivoxy start
+
-
-SuSE
- To build SuSE RPM packages, install source as above. Then:
+ For RedHat: /etc/rc.d/init.d/privoxy start
+
-
- ./configure
- make suse-dist
-
+ If no configuration file is specified on the command line,
+ Privoxy will look for a file named
+ config in the current directory, except on Win32, where
+ it will try config.txt. If no file is specified on the
+ command line and no default configuration file can be found,
+ Privoxy will fail to start.
+
- This will create both binary and src RPMs in the usual places. Example:
+ The included default configuration files should give a reasonable starting
+ point. Most of the per site configuration is done in the
+ actions files. These are where various cookie actions are
+ defined, ad and banner blocking, and other aspects of
+ Privoxy configuration. There are several such
+ files included, with varying levels of aggressiveness.
-    /usr/src/suse/RPMS/i686/junkbuster-2.9.8-1.i686.rpm
+ You will probably want to keep an eye out for sites that require persistent
+ cookies, and add these to default.action as needed. By
+ default, most of these will be accepted only during the current browser
+ session (aka session cookies), until you add them to the
+ configuration. If you want the browser to handle this instead, you will need
+ to edit default.action and disable this feature. If you
+ use more than one browser, it would make more sense to let
+ Privoxy handle this. In which case, the
+ browser(s) should be set to accept all cookies.
+ -    /usr/src/suse/SRPMS/junkbuster-2.9.8-1.src.rpm + Another feature where you will probably want to define exceptions for trusted + sites is the popup-killing (through the +popup and + +filter{popups} actions), because your favorite shopping, + banking, or leisure site may need popups. - To install, of course: + Privoxy is HTTP/1.1 compliant, but not all of + the optional 1.1 features are as yet supported. In the unlikely event that + you experience inexplicable problems with browsers that use HTTP/1.1 per default + (like Mozilla or recent versions of I.E.), you might + try to force HTTP/1.0 compatibility. For Mozilla, look under Edit -> + Preferences -> Debug -> Networking. + Alternatively, set the +downgrade config option in + default.action which will downgrade your browser's HTTP + requests from HTTP/1.1 to HTTP/1.0 before processing them. - - rpm -Uvv /usr/src/suse/RPMS/i686/junkbuster-2.9.8-1.i686.rpm - + After running Privoxy for a while, you can + start to fine tune the configuration to suit your personal, or site, + preferences and requirements. There are many, many aspects that can + be customized. Actions (as specified in default.action) + can be adjusted by pointing your browser to + http://config.privoxy.org/ + (shortcut: http://p.p/), + and then follow the link to edit the actions list. + (This is an internal page and does not require Internet access.) - This will place the Junkbuster configuration - files in /etc/junkbuster/, and log files in - /var/log/junkbuster/. + In fact, various aspects of Privoxy + configuration can be viewed from this page, including + current configuration parameters, source code version numbers, + the browser's request headers, and actions that apply + to a given URL. In addition to the default.action file + editor mentioned above, Privoxy can also + be turned on and off (toggled) from this page. - + + If you encounter problems, try loading the page without + Privoxy. 
If that helps, enter the URL where + you have the problems into the browser + based rule tracing utility. See which rules apply and why, and + then try turning them off for that site one after the other, until the problem + is gone. When you have found the culprit, you might want to turn the rest on + again. + + + If the above paragraph sounds gibberish to you, you might want to read more about the actions concept + or even dive deep into the Appendix + on actions. + - -Windows -I need help on this. Not a clue here. Also for -configuration section below. + + If you can't get rid of the problem at all, think you've found a bug in + Privoxy, want to propose a new feature or smarter rules, please see the + chapter "Contacting the Developers, .." below. + + -Other -I need help on this too. OS/2? What others? + +Command Line Options + + Privoxy may be invoked with the following + command-line options: + + + + + + + + --version + + + Print version info and exit, Unix only. + + + + + --help + + + Print a short usage info and exit, Unix only. + + + + + --no-daemon + + + Don't become a daemon, i.e. don't fork and become process group + leader, don't detach from controlling tty. Unix only. + + + + + --pidfile FILE + + + + On startup, write the process ID to FILE. Delete the + FILE on exit. Failure to create or delete the + FILE is non-fatal. If no FILE + option is given, no PID file will be used. Unix only. + + + + + --user USER[.GROUP] + + + + After (optionally) writing the PID file, assume the user ID of + USER, and if included the GID of GROUP. Exit if the + privileges are not sufficient to do so. Unix only. + + + + + configfile + + + If no configfile is included on the command line, + Privoxy will look for a file named + config in the current directory (except on Win32 + where it will look for config.txt instead). Specify + full path to avoid confusion. + + + + + @@ -298,19 +602,89 @@ configuration section below. 
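The command-line options listed above can be assembled by a small wrapper script. The following is a minimal Python sketch, not part of Privoxy: the helper name `build_privoxy_command` and its parameters are hypothetical, the binary path `/usr/sbin/privoxy` is borrowed from the Unix startup example in this manual, and the PID file path and user name in the demo are made up for illustration. It only builds the argument list and does not start the proxy.

```python
def build_privoxy_command(config_file, no_daemon=False, pidfile=None,
                          user=None, group=None,
                          binary="/usr/sbin/privoxy"):
    """Return an argument list using only the options documented above.

    Hypothetical helper for illustration; it is not shipped with Privoxy.
    """
    argv = [binary]
    if no_daemon:
        # --no-daemon: stay attached to the controlling tty (Unix only).
        argv.append("--no-daemon")
    if pidfile is not None:
        # --pidfile FILE: write the process ID to FILE on startup.
        argv += ["--pidfile", pidfile]
    if user is not None:
        # --user USER[.GROUP]: the group is optional and dot-separated.
        argv += ["--user", user if group is None else f"{user}.{group}"]
    elif group is not None:
        raise ValueError("a group can only be given together with a user")
    # The configuration file comes last; a full path avoids confusion.
    argv.append(config_file)
    return argv


if __name__ == "__main__":
    # Example values below are assumptions, not Privoxy defaults.
    print(build_privoxy_command("/etc/privoxy/config", no_daemon=True,
                                pidfile="/var/run/privoxy.pid",
                                user="privoxy", group="privoxy"))
```

Passing the full path of the configuration file follows the manual's own advice, since otherwise Privoxy looks for `config` (or `config.txt` on Win32) in the current directory.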
-Junkbuster Configuration +<application>Privoxy</application> Configuration + + All Privoxy configuration is stored + in text files. These files can be edited with a text editor. + Many important aspects of Privoxy can + also be controlled easily with a web browser. + + + + + + + +Controlling <application>Privoxy</application> with Your Web Browser + + Privoxy's user interface can be reached through the special + URL http://config.privoxy.org/ + (shortcut: http://p.p/), + which is a built-in page and works without Internet access. + You will see the following section: + + + + + + +Please choose from the following options: + + * Privoxy main page + * Show information about the current configuration + * Show the source code version numbers + * Show the request headers. + * Show which actions apply to a URL and why + * Toggle Privoxy on or off + * Edit the actions list + + + + + + This should be self-explanatory. Note the last item is an editor for the + actions list, which is where much of the ad, banner, cookie, + and URL blocking magic is configured as well as other advanced features of + Privoxy. This is an easy way to adjust various + aspects of Privoxy configuration. The actions + file, and other configuration files, are explained in detail below. + + + + Toggle Privoxy On or Off is handy for sites that might + have problems with your current actions and filters. You can in fact use + it as a test to see whether it is Privoxy + causing the problem or not. Privoxy continues + to run as a proxy in this case, but all filtering is disabled. There + is even a toggle Bookmarklet offered, so that you can toggle + Privoxy with one click from your browser. + + + + + + + + + + + + + +Configuration Files Overview - For Unix and Linux, all configuraton files are located in - /etc/junkbuster/ by default. For MS Windows and OS/2, - these are all in the same directory as the Junkbuster - executable. 
The name and number of configuration files has changed from - previous versions, and is subject to change as development progresses. + For Unix, *BSD and Linux, all configuration files are located in + /etc/privoxy/ by default. For MS Windows, OS/2, and + AmigaOS these are all in the same directory as the + Privoxy executable. - The installed defaults provide a reasonable starting point. For the - time being, there are only three default configuration files (this will - change in time): + The installed defaults provide a reasonable starting point, though possibly + aggressive by some standards. For the time being, there are only three + default configuration files (this may change in time): @@ -319,21 +693,30 @@ configuration section below. The main configuration file is named config - on Linux, Unix and OS/2, and junkbustr.txt on Windows. + on Linux, Unix, BSD, OS/2, and AmigaOS and config.txt + on Windows. - The actionsfile file is used to define various - actions relating to images, banners, pop-ups, banners and cookies. + default.action (the actions file) is used to define + which of a set of various actions relating to images, banners, + pop-ups, access restrictions, banners and cookies are to be applied where. + There is a web based editor for this file that can be accessed at http://config.privoxy.org/edit-actions/ + (Shortcut: http://p.p/edit-actions/). + (Other actions files are included as well with differing levels of filtering + and blocking, e.g. basic.action.) - The re_filterfile file can be used to rewrite the raw - page content, including text as well as embedded HTML and JavaScript. + default.filter (the filter file) can be used to re-write the raw + page content, including viewable text as well as embedded HTML and JavaScript, + and whatever else lurks on any given web page. The filtering jobs are only + pre-defined here; whether to apply them or not is up to the actions file. @@ -341,13 +724,39 @@ configuration section below. 
- actionsfile and re_filterfile
- can use Perl style regular expressions for maximum flexibility. All files use
- the # character to denote a comment. Such
- lines are not processed by Junkbuster. After
- making any changes, restart Junkbuster in order
- for the changes to take effect.
+ All files use the # character to denote a
+ comment (the rest of the line will be ignored) and understand line continuation
+ through placing a backslash ("\") as the very last character
+ in a line. If the # is preceded by a backslash, it loses
+ its special function. Placing a # in front of an otherwise
+ valid configuration line to prevent it from being interpreted is called "commenting
+ out" that line.
+
+
+
+ default.action and default.filter
+ can use Perl style regular expressions for maximum flexibility.
+
+
+
+ After making any changes, there is no need to restart
+ Privoxy in order for the changes to take
+ effect. Privoxy detects such changes
+ automatically. Note, however, that it may take one or two additional
+ requests for the change to take effect. When changing the listening address
+ of Privoxy, these wake up requests
+ must obviously be sent to the old listening address.
+
+
+
+ While under development, the configuration content is subject to change.
+ The below documentation may not be accurate by the time you read this.
+ Also, what constitutes a default setting may change, so
+ please check all your configuration files on important issues.
+]]>
+
@@ -356,7 +765,7 @@ configuration section below.
 The Main Configuration File
 Again, the main configuration file is named config on
- Linux, Unix and OS/2, and junkbustr.txt on Windows.
+ Linux/Unix/BSD and OS/2, and config.txt on Windows.
 Configuration lines consist of an initial keyword followed by a list of values, all separated by whitespace (any number of spaces or tabs). For example:
@@ -364,240 +773,385 @@ configuration section below.
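The line syntax described in this hunk (a keyword followed by whitespace-separated values, `#` comments, `\#` as a literal hash, and a trailing backslash for continuation) can be illustrated with a short Python sketch. This is not Privoxy's actual parser. The function names are hypothetical, and the order of processing (continuation lines are joined first, then comments are stripped) is an assumption, as are the sample values.

```python
def strip_comment(line):
    """Drop an unescaped hash and what follows; unescape a backslash-hash pair."""
    kept = []
    i = 0
    while i < len(line):
        if line.startswith("\\#", i):
            kept.append("#")       # escaped: keep a literal '#'
            i += 2
        elif line[i] == "#":
            break                  # unescaped: rest of the line is a comment
        else:
            kept.append(line[i])
            i += 1
    return "".join(kept)


def parse_config(text):
    """Return (keyword, values) pairs from config-style text.

    Assumption: a physical line ending in a backslash is joined with the
    next one before comments are stripped. A dangling continuation at the
    very end of the text is silently dropped.
    """
    entries = []
    logical = ""
    for raw in text.splitlines():
        logical += raw
        if logical.endswith("\\"):
            logical = logical[:-1]  # remove the backslash, keep collecting
            continue
        fields = strip_comment(logical).split()
        logical = ""
        if fields:
            entries.append((fields[0], fields[1:]))
    return entries


# Sample input; the e-mail address is made up for illustration.
SAMPLE = r"""
confdir /etc/privoxy   # No trailing /, please
#logfile logfile
logdir \
    /var/log/privoxy
admin-address admin\#1@example.invalid
"""

if __name__ == "__main__":
    for keyword, values in parse_config(SAMPLE):
        print(keyword, values)
```

In the sample, the `logfile` line is commented out and therefore produces no entry, which mirrors the "commenting out" practice described above.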
- + - blockfile blocklist.ini + confdir /etc/privoxy - - - - - - Indicates that the blockfile is named blocklist.ini. - - - - The # indicates a comment. Any part of a - line following a # is ignored, except if - the # is preceded by a - \. + + - Thus, by placing a # at the start of an - existing configuration line, you can make it a comment and it will be treated - as if it weren't there. This is called commenting out an - option and can be useful to turn off features: If you comment out the - logfile line, junkbuster will not - log to a file at all. Watch for the default: section in each - explanation to see what happens if the option is left unset (or commented - out). + Assigns the value /etc/privoxy to the option + confdir and thus indicates that the configuration + directory is named /etc/privoxy/. - Long lines can be continued on the next line by using a - \ as the very last character. + All options in the config file except for confdir and + logdir are optional. Watch out in the below description + for what happens if you leave them unset. - There are various aspects of Junkbuster behavior - that can be adjusted. + The main config file controls all aspects of Privoxy's + operation that are not location dependent (i.e. they apply universally, no matter + where you may be surfing). -Defining Other Configuration Files +Configuration and Log File Locations - Junkbuster can use a number of other files to tell it - what ads to block, what cookies to accept, etc. This section of the - configuration file tells Junkbuster where to find - all those other files. + Privoxy can (and normally does) use a number of + other files for additional configuration and logging. + This section of the configuration file tells Privoxy + where to find those other files. - - On Windows, Junkbuster - looks for these files in the same directory as the executable. On Unix and OS/2, - Junkbuster looks for these files in the current - working directory. 
In either case, an absolute path name can be used to - avoid problems. - - - - When development goes modular and multiuser, the blocker, filter, and - per-user config will be stored in subdirectories of confdir. - For now, only confdir/templates is used for storing HTML - templates for CGI results. - - - The location of the configuration files: - +confdir - - - - - confdir /etc/junkbuster # No trailing /, please. - - - - + + + Specifies: + + The directory where the other configuration files are located + + + + Type of value: + + Path name + + + + Default value: + + /etc/privoxy (Unix) or Privoxy installation dir (Windows) + + + + Effect if unset: + + Mandatory + + + + Notes: + + + No trailing /, please + + + When development goes modular and multi-user, the blocker, filter, and + per-user config will be stored in subdirectories of confdir. + For now, the configuration directory structure is flat, except for + confdir/templates, where the HTML templates for CGI + output reside (e.g. Privoxy's 404 error page). + + + + + - - The directory where all logging (i.e. logfile and - jarfile) takes place. No trailing - /, please: - - - - - - logdir /var/log/junkbuster - - - - +logdir - - Note that all file specifications below are relative to - the above two directories! - - - - The actionsfile contains patterns to specify the actions to - apply to requests for each site. Default: Cookies to and from all - destinations are filtered. Popups are disabled for all sites. All sites are - filtered if re_filterfile specified. No sites are blocked. An empty image is - displayed for filtered ads and other images (formerly - tinygif). The syntax of this file is explained in detail - below. - - - - - - - actionsfile actionsfile - - - - - - - The re_filterfile file contains content modification rules. 
- These rules permit powerful changes on the content of Web pages, e.g., you - could disable your favourite JavaScript annoyances, rewrite the actual - content, or just have some fun replacing Microsoft with - MicroSuck wherever it appears on a Web page. Default: No - content modification, or whatever the developers are playing with :-/ - - - - - - - re_filterfile re_filterfile - - - - + + + Specifies: + + + The directory where all logging takes place (i.e. where logfile and + jarfile are located) + + + + + Type of value: + + Path name + + + + Default value: + + /var/log/privoxy (Unix) or Privoxy installation dir (Windows) + + + + Effect if unset: + + Mandatory + + + + Notes: + + + No trailing /, please + + + + + - - The logfile is where all logging and error messages are written. The logfile - can be useful for tracking down a problem with - Junkbuster (e.g., it's not blocking an ad you - think it should block) but in most cases you probably will never look at it. - +actionsfile - - Your logfile will grow indefinitely, and you will probably want to - periodically remove it. On Unix systems, you can do this with a cron job - (see man cron). For Redhat, a logrotate - script has been included. - + + + Specifies: + + + The actions file to use + + + + + Type of value: + + File name, relative to confdir + + + + Default value: + + default.action (Unix) or default.action.txt (Windows) + + + + Effect if unset: + + + No action is taken at all. Simple neutral proxying. + + + + + Notes: + + + There is no point in using Privoxy without + an actions file. There are three different actions files included in the + distribution, with varying degrees of aggressiveness: + default.action, intermediate.action and + advanced.action. + + + + + - - On SuSE Linux systems, you can place a line like /var/log/junkbuster.* - +1024k 644 nobody.nogroup in /etc/logfiles, with - the effect that cron.daily will automatically archive, gzip, and empty the - log, when it exceeds 1M size. 
- +filterfile - - Default: Log to the a file named logfile. - Comment out to disable logging. - + + + Specifies: + + + The filter file to use + + + + + Type of value: + + File name, relative to confdir + + + + Default value: + + default.filter (Unix) or default.filter.txt (Windows) + + + + Effect if unset: + + + No textual content filtering takes place, i.e. all + +filter{name} + actions in the actions file are turned off + + + + + Notes: + + + The default.filter file contains content modification rules + that use regular expressions. These rules permit powerful + changes on the content of Web pages, e.g., you could disable your favorite + JavaScript annoyances, re-write the actual displayed text, or just have some + fun replacing Microsoft with MicroSuck wherever + it appears on a Web page. + + + + + - - - - - logfile logfile - - - - +logfile - - The jarfile defines where - Junkbuster stores the cookies it intercepts. Note - that if you use a jarfile, it may grow quite large. Default: - Don't store intercepted cookies. - + + + Specifies: + + + The log file to use + + + + + Type of value: + + File name, relative to logdir + + + + Default value: + + logfile (Unix) or privoxy.log (Windows) + + + + Effect if unset: + + + No log file is used, all log messages go to the console (stderr). + + + + + Notes: + + + The windows version will additionally log to the console. + + + The logfile is where all logging and error messages are written. The level + of detail and number of messages are set with the debug + option (see below). The logfile can be useful for tracking down a problem with + Privoxy (e.g., it's not blocking an ad you + think it should block) but in most cases you probably will never look at it. + + + Your logfile will grow indefinitely, and you will probably want to + periodically remove it. On Unix systems, you can do this with a cron job + (see man cron). For Redhat, a logrotate + script has been included. 
+ + + On SuSE Linux systems, you can place a line like /var/log/privoxy.* + +1024k 644 nobody.nogroup in /etc/logfiles, with + the effect that cron.daily will automatically archive, gzip, and empty the + log, when it exceeds 1M size. + + + + + - - - - - #jarfile jarfile - - - - +jarfile - - If you specify a trustfile, - Junkbuster will only allow access to sites that - are named in the trustfile. You can also mark sites as trusted referrers, - with the effect that access to untrusted sites will be granted, if a link - from a trusted referrer was used. The link target will then be added to the - trustfile. This is a very restrictive feature that typical - users most propably want to leave disabled. Default: Disabled, don't use the - trust mechanism. - + + + Specifies: + + + The file to store intercepted cookies in + + + + + Type of value: + + File name, relative to logdir + + + + Default value: + + jarfile (Unix) or privoxy.jar (Windows) + + + + Effect if unset: + + + Intercepted cookies are not stored at all. + + + + + Notes: + + + The jarfile may grow to ridiculous sizes over time. + + + + + - - - - - #trustfile trust - - - - - - - If you use the trust mechanism, it is a good idea to write up some online - documentation about your blocking policy and to specify the URL(s) here. They - will appear on the page that your users receive when they try to access - untrusted content. Use multiple times for multiple URLs. Default: Don't - display links on the untrusted info page. - +trustfile - - - - - trust-info-url http://www.your-site.com/why_we_block.html - trust-info-url http://www.your-site.com/what_we_allow.html - - - - + + + Specifies: + + + The trust file to use + + + + + Type of value: + + File name, relative to confdir + + + + Default value: + + Unset (commented out). When activated: trust (Unix) or trust.txt (Windows) + + + + Effect if unset: + + + The whole trust mechanism is turned off. 
+ + + + + Notes: + + + The trust mechanism is an experimental feature for building white-lists and should + be used with care. It is NOT recommended for the casual user. + + + If you specify a trust file, Privoxy will only allow + access to sites that are named in the trustfile. + You can also mark sites as trusted referrers (with +), with + the effect that access to untrusted sites will be granted, if a link from a + trusted referrer was used. + The link target will then be added to the trustfile. + Possible applications include limiting Internet access for children. + + + If you use + operator in the trust file, it may grow considerably over time. + + + + + @@ -608,235 +1162,989 @@ configuration section below. -Other Configuration Options +Local Set-up Documentation - - This part of the configuration file contains options that control how - Junkbuster operates. - + + If you intend to operate Privoxy for more users + that just yourself, it might be a good idea to let them know how to reach + you, what you block and why you do that, your policies etc. + - - Admin-address should be set to the email address of the proxy - administrator. It is used in many of the proxy-generated pages. Default: - fill@me.in.please. - +trust-info-url - - - - - #admin-address fill@me.in.please - - - - + + + Specifies: + + + A URL to be displayed in the error page that users will see if access to an untrusted page is denied. + + + + + Type of value: + + URL + + + + Default value: + + Two example URL are provided + + + + Effect if unset: + + + No links are displayed on the "untrusted" error page. + + + + + Notes: + + + The value of this option only matters if the experimental trust mechanism has been + activated. (See trustfile above.) + + + If you use the trust mechanism, it is a good idea to write up some on-line + documentation about your trust policy and to specify the URL(s) here. + Use multiple times for multiple URLs. 
+ + + The URL(s) should be added to the trustfile as well, so users don't end up + locked out from the information on why they were locked out in the first place! + + + + + - - Proxy-info-url can be set to a URL that contains more info - about this Junkbuster installation, it's - configuration and policies. It is used in many of the proxy-generated pages - and its use is highly recommended in multi-user installations, since your - users will want to know why certain content is blocked or modified. Default: - Don't show a link to online documentation. - +admin-address - - - - - proxy-info-url http://www.your-site.com/proxy.html - - - - + + + Specifies: + + + An email address to reach the proxy administrator. + + + + + Type of value: + + Email address + + + + Default value: + + Unset + + + + Effect if unset: + + + No email address is displayed on error pages and the CGI user interface. + + + + + Notes: + + + If both admin-address and proxy-info-url + are unset, the whole "Local Privoxy Support" box on all generated pages will + not be shown. + + + + + + +proxy-info-url + + + + Specifies: + + + A URL to documentation about the local Privoxy setup, + configuration or policies. + + + + + Type of value: + + URL + + + + Default value: + + Unset + + + + Effect if unset: + + + No link to local documentation is displayed on error pages and the CGI user interface. + + + + + Notes: + + + If both admin-address and proxy-info-url + are unset, the whole "Local Privoxy Support" box on all generated pages will + not be shown. + + + This URL shouldn't be blocked ;-) + + + + + - - Listen-address specifies the address and port where - Junkbuster will listen for connections from your - Web browser. The default is to listen on the localhost port 8000, and - this is suitable for most users. (In your web browser, under proxy - configuration, list the proxy server as localhost and the - port as 8000). 
- + + - - If you already have another service running on port 8000, or if you want to - serve requests from other machines (e.g. on your local network) as well, you - will need to override the default. The syntax is - listen-address [<ip-address>]:<port>. If you leave - out the IP adress, junkbuster will bind to all - interfaces (addresses) on your machine and may become reachable from the - internet. In that case, consider using access control lists (acl's) (see - aclfile above). - + - - For example, suppose you are running Junkbuster on - a machine which has the address 192.168.0.1 on your local private network - (192.168.0.0) and has another outside connection with a different address. - You want it to serve requests from inside only: - + +Debugging - - - - - listen-address 192.168.0.1:8000 - - - - + + These options are mainly useful when tracing a problem. + Note that you might also want to invoke + Privoxy with the --no-daemon + command line option when debugging. + - - If you want it to listen on all addresses (including the outside - connection): - +debug - - - - - listen-address :8000 - - - - + + + Specifies: + + + Key values that determine what information gets logged. + + + + + Type of value: + + Integer values + + + + Default value: + + 12289 (i.e.: URLs plus informational and warning messages) + + + + Effect if unset: + + + Nothing gets logged. + + + + + Notes: + + + The available debug levels are: + + + + debug 1 # show each GET/POST/CONNECT request + debug 2 # show each connection status + debug 4 # show I/O status + debug 8 # show header parsing + debug 16 # log all data into the logfile + debug 32 # debug force feature + debug 64 # debug regular expression filter + debug 128 # debug fast redirects + debug 256 # debug GIF de-animation + debug 512 # Common Log Format + debug 1024 # debug kill pop-ups + debug 4096 # Startup banner and warnings. 
+ debug 8192 # Non-fatal errors + + + + To select multiple debug levels, you can either add them or use + multiple debug lines. + + + A debug level of 1 is informative because it will show you each request + as it happens. 1, 4096 and 8192 are highly recommended + so that you will notice when things go wrong. The other levels are probably + only of interest if you are hunting down a specific problem. They can produce + a hell of an output (especially 16). + + + + The reporting of fatal errors (i.e. ones which crash + Privoxy) is always on and cannot be disabled. + + + If you want to use CLF (Common Log Format), you should set debug + 512 ONLY and not enable anything else. + + + + + - - If you do this, consider using ACLs (see aclfile above). Note: - you will need to point your browser(s) to the address and port that you have - configured here. Default: localhost:8000 (127.0.0.1:8000). - +single-threaded - - The debug option sets the level of debugging information to log in the - logfile (and to the console in the Windows version). A debug level of 1 is - informative because it will show you each request as it happens. Higher - levels of debug are probably only of interest to developers. - + + + Specifies: + + + Whether to run only one server thread + + + + + Type of value: + + None + + + + Default value: + + Unset + + + + Effect if unset: + + + Multi-threaded (or, where unavailable: forked) operation, i.e. the ability to + serve multiple requests simultaneously. + + + + + Notes: + + + This option is only there for debug purposes and you should never + need to use it. It will drastically reduce performance. 
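As a small illustration of the debug option described above: because the levels can be added, the documented default of 12289 corresponds to levels 1, 4096 and 8192 (1 + 4096 + 8192 = 12289). The two fragments below are a sketch of settings that should therefore be equivalent; use one variant or the other, not both.

```
# Variant A: a single line carrying the summed value
debug 12289

# Variant B: one debug line per level
debug 1     # show each GET/POST/CONNECT request
debug 4096  # Startup banner and warnings
debug 8192  # Non-fatal errors
```

Variant B is longer but makes it easier to see, and to comment out, individual levels while hunting down a problem.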
+ + + + + - - - - - debug 1 # GPC = show each GET/POST/CONNECT request - debug 2 # CONN = show each connection status - debug 4 # IO = show I/O status - debug 8 # HDR = show header parsing - debug 16 # LOG = log all data into the logfile - debug 32 # FRC = debug force feature - debug 64 # REF = debug regular expression filter - debug 128 # = debug fast redirects - debug 256 # = debug GIF deanimation - debug 512 # CLF = Common Log Format - debug 1024 # = debug kill popups - debug 4096 # INFO = Startup banner and warnings. - debug 8192 # ERROR = Non-fatal errors - - - - + - - It is highly recommended that you enable ERROR - reporting (debug 8192), at least until the next stable release. - + - - The reporting of FATAL errors (i.e. ones which crash - JunkBuster) is always on and cannot be disabled. - + +Access Control and Security + + + This section of the config file controls the security-relevant aspects + of Privoxy's configuration. + + +listen-address + + + + Specifies: + + + The IP address and TCP port on which Privoxy will + listen for client requests. + + + + + Type of value: + + [IP-Address]:Port + + + + Default value: + + localhost:8118 + + + + Effect if unset: + + + Bind to localhost (127.0.0.1), port 8118. This is suitable and recommended for + home users who run Privoxy on the same machine as + their browser. + + + + + Notes: + + + You will need to configure your browser(s) to this proxy address and port. + + + If you already have another service running on port 8118, or if you want to + serve requests from other machines (e.g. on your local network) as well, you + will need to override the default. + + + If you leave out the IP address, Privoxy will + bind to all interfaces (addresses) on your machine and may become reachable + from the Internet. In that case, consider using access control lists (acl's) + (see ACLs below), or a firewall. 
+ + + + + Example: + + + Suppose you are running Privoxy on + a machine which has the address 192.168.0.1 on your local private network + (192.168.0.0) and has another outside connection with a different address. + You want it to serve requests from inside only: + + + + listen-address 192.168.0.1:8118 + + + + + + + +toggle + + + + Specifies: + + + Initial state of "toggle" status + + + + + Type of value: + + 1 or 0 + + + + Default value: + + 1 + + + + Effect if unset: + + + Act as if toggled on + + + + + Notes: + + + If set to 0, Privoxy will start in + toggled off mode, i.e. behave like a normal, content-neutral + proxy. See enable-remote-toggle + below. This is not really useful anymore, since toggling is much easier + via the web + interface than by editing the conf file. + + + The Windows version will only display the toggle icon in the system tray + if this option is present. + + + + + + + +enable-remote-toggle + + + Specifies: + + + Whether or not the web-based toggle + feature may be used + + + + + Type of value: + + 0 or 1 + + + + Default value: + + 1 + + + + Effect if unset: + + + The web-based toggle feature is disabled. + + + + + Notes: + + + When toggled off, Privoxy acts like a normal, + content-neutral proxy, i.e. it acts as if none of the actions applied to + any URL. + + + For the time being, access to the toggle feature cannot be + controlled separately by ACLs or HTTP authentication, + so that everybody who can access Privoxy (see + ACLs and listen-address above) can + toggle it for all users. So this option is not recommended + for multi-user environments with untrusted users. + + + Note that you must have compiled Privoxy with + support for this feature, otherwise this option has no effect. 
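A minimal sketch of how the options in this section might be combined on a proxy shared with untrusted users, based only on the notes above. The address is the example one used under listen-address, and enable-edit-actions is the option documented in the next entry. Treat it as a starting point under those assumptions, not as a recommended configuration.

```
# Serve the internal network only (example address from the text)
listen-address 192.168.0.1:8118

# Start with filtering active
toggle 1

# Keep any single client from toggling or reconfiguring the proxy for all users
enable-remote-toggle 0
enable-edit-actions 0
```

Since access to these web-based features cannot be restricted separately by ACLs, switching them off is the only per-feature control this section describes.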
+ + + + + + + +enable-edit-actions + + + Specifies: + + + Whether or not the web-based actions + file editor may be used + + + + + Type of value: + + 0 or 1 + + + + Default value: + + 1 + + + + Effect if unset: + + + The web-based actions file editor is disabled. + + + + + Notes: + + + For the time being, access to the editor can not be + controlled separately by ACLs or HTTP authentication, + so that everybody who can access Privoxy (see + ACLs and listen-address above) can + modify its configuration for all users. So this option is not + recommended for multi-user environments with untrusted users. + + + Note that you must have compiled Privoxy with + support for this feature, otherwise this option has no effect. + + + + + + +ACLs: permit-access and deny-access + + + Specifies: + + + Who can access what. + + + + + Type of value: + + + src_addr[/src_masklen] + [dst_addr[/dst_masklen]] + + + Where src_addr and + dst_addr are IP addresses in dotted decimal notation or valid + DNS names, and src_masklen and + dst_masklen are subnet masks in CIDR notation, i.e. integer + values from 2 to 30 representing the length (in bits) of the network address. The masks and the whole + destination part are optional. + + + + + Default value: + + Unset + + + + Effect if unset: + + + Don't restrict access further than implied by listen-address + + + + + Notes: + + + Access controls are included at the request of ISPs and systems + administrators, and are not usually needed by individual users. + For a typical home user, it will normally suffice to ensure that + Privoxy only listens on the localhost or internal (home) + network address by means of the listen-address option. + + + Please see the warnings in the FAQ that this proxy is not intended to be a substitute + for a firewall or to encourage anyone to defer addressing basic security + weaknesses. + + + Multiple ACL lines are OK. 
+ If any ACLs are specified, then Privoxy + talks only to IP addresses that match at least one permit-access line + and do not match any subsequent deny-access line. In other words, the + last match wins, with the default being deny-access. + + + If Privoxy is using a forwarder (see forward below) + for a particular destination URL, the dst_addr + that is examined is the address of the forwarder and NOT the address + of the ultimate target. This is necessary because it may be impossible for the local + Privoxy to determine the IP address of the + ultimate target (that's often what gateways are used for). + + + You should prefer using IP addresses over DNS names, because the address lookups take + time. All DNS names must resolve! You cannot use domain patterns + like *.org or partial domain names. If a DNS name resolves to multiple + IP addresses, only the first one is used. + + + Denying access to particular sites by ACL may have undesired side effects + if the site in question is hosted on a machine which also hosts other sites. + + + + + Examples: + + + Explicitly define the default behavior if no ACL and + listen-address are set: localhost + is OK. The absence of a dst_addr implies that + all destination addresses are OK: + + + + permit-access localhost + + + + Allow any host on the same class C subnet as www.privoxy.org access to + nothing but www.example.com: + + + + permit-access www.privoxy.org/24 www.example.com/32 + + + + Allow access from any host on the 26-bit subnet 192.168.45.64 to anywhere, + with the exception that 192.168.45.73 may not access www.dirty-stuff.example.com: + + + + permit-access 192.168.45.64/26 + deny-access 192.168.45.73 www.dirty-stuff.example.com + + + + + + + +buffer-limit + + + + Specifies: + + + Maximum size of the buffer for content filtering. + + + + + Type of value: + + Size in Kbytes + + + + Default value: + + 4096 + + + + Effect if unset: + + + Use a 4 MB (4096 KB) limit. 
+ + + + + Notes: + + + For content filtering, i.e. the +filter and + +deanimate-gif actions, it is necessary that + Privoxy buffers the entire document body. + This can be potentially dangerous, since a server could just keep sending + data indefinitely and wait for your RAM to be exhausted -- with nasty consequences. + Hence this option. + + + When a document buffer size reaches the buffer-limit, it is + flushed to the client unfiltered and no further attempt to + filter the rest of the document is made. Remember that there may be multiple threads + running, which might require up to buffer-limit Kbytes + each, unless you have enabled single-threaded + above. + + + + + + + + + + + + + + +Forwarding + + + This feature allows routing of HTTP requests through a chain of + multiple proxies. + It can be used to better protect privacy and confidentiality when + accessing specific domains by routing requests to those domains + through an anonymous public proxy (see e.g. http://www.multiproxy.org/anon_list.htm), + or to use a caching proxy to speed up browsing. Chaining to a parent + proxy may also be necessary because the machine that Privoxy + runs on has no direct Internet access. + + + + Also specified here are SOCKS proxies. Privoxy + supports the SOCKS 4 and SOCKS 4A protocols. + + +forward + + + Specifies: + + + To which parent HTTP proxy specific requests should be routed. + + + + + Type of value: + + + target_domain[:port] + http_parent[:port] + + + Where target_domain is a domain name pattern (see the + chapter on domain matching in the actions file), + http_parent is the address of the parent HTTP proxy + as an IP address in dotted decimal notation or as a valid DNS name (or . to denote + no forwarding), and the optional + port parameters are TCP ports, i.e. integer + values from 1 to 65535. + + + + + Default value: + + Unset + + + + Effect if unset: + + + Don't use parent HTTP proxies. 
+ + + + + Notes: + + + If http_parent is ., then requests are not + forwarded to another HTTP proxy but are made directly to the web servers. + + + Multiple lines are OK; they are checked in sequence, and the last match wins. + + + + + Examples: + + + Everything goes to an example anonymizing proxy, except SSL on port 443 (which it doesn't handle): + + + + forward .* anon-proxy.example.org:8080 + forward :443 . + + + + Everything goes to our example ISP's caching proxy, except for requests + to that ISP's sites: + + + + forward .*. caching-proxy.example-isp.net:8000 + forward .example-isp.net . + + + + + + + +forward-socks4 and forward-socks4a + + + Specifies: + + + Through which SOCKS proxy (and to which parent HTTP proxy) specific requests should be routed. + + + + + Type of value: + + + target_domain[:port] + socks_proxy[:port] + http_parent[:port] + + + Where target_domain is a domain name pattern (see the + chapter on domain matching in the actions file), + http_parent and socks_proxy + are IP addresses in dotted decimal notation or valid DNS names (http_parent + may be . to denote no HTTP forwarding), and the optional + port parameters are TCP ports, i.e. integer values from 1 to 65535. + + + + + Default value: + + Unset + + + + Effect if unset: + + + Don't use SOCKS proxies. + + + + + Notes: + + + Multiple lines are OK; they are checked in sequence, and the last match wins. + + + The difference between forward-socks4 and forward-socks4a + is that in the SOCKS 4A protocol, the DNS resolution of the target hostname happens on the SOCKS + server, while in SOCKS 4 it happens locally. + + + If http_parent is ., then requests are not + forwarded to another HTTP proxy but are made (HTTP-wise) directly to the web servers, albeit through + a SOCKS proxy. 
+ + + + + Examples: + + + From the company example.com, direct connections are made to all + internal domains, but everything outbound goes through + their ISP's proxy by way of example.com's corporate SOCKS 4A gateway to + the Internet. + + + + forward-socks4a .*. socks-gw.example.com:1080 www-cache.example-isp.net:8080 + forward .example.com . + + + + A rule that uses a SOCKS 4 gateway for all destinations but no HTTP parent looks like this: + + + + forward-socks4 .*. socks-gw.example.com:1080 . + + + + + + + +Advanced Forwarding Examples - If you want to use CLF (Common Log Format), you should set debug - 512 ONLY, do not enable anything else. + If you have links to multiple ISPs that provide various special content + only to their subscribers, you can configure multiple Privoxies + which have connections to the respective ISPs to act as forwarders to each other, so that + your users can see the internal content of all ISPs. - Multiple debug directives, are OK - they're logical-OR'd - together. + Assume that host-a has a PPP connection to isp-a.net. And host-b has a PPP connection to + isp-b.net. Both run Privoxy. Their forwarding + configuration can look like this: - - - - debug 15 # same as setting the first 4 listed above - - - + host-a: - Default: + + forward .*. . + forward .isp-b.net host-b:8118 + - - - - debug 1 # URLs - debug 4096 # Info - debug 8192 # Errors - *we highly recommended enabling this* - - - + host-b: - Junkbuster normally uses - multi-threading, a software technique that permits it to - handle many different requests simultaneously. In some cases you may wish to - disable this -- particularly if you're trying to debug a problem. The - single-threaded option forces - Junkbuster to handle requests sequentially. - Default: Multi-threaded mode. + + forward .*. . 
+ forward .isp-a.net host-a:8118 + - - - - #single-threaded - - - + Now, your users can set their browser's proxy to use either + host-a or host-b and be able to browse the internal content + of both isp-a and isp-b. - toggle allows you to temporarily disable all - Junkbuster's filtering. Just set toggle - 0. + If you intend to chain Privoxy and + squid locally, then chain as + browser -> squid -> privoxy is the recommended way. - The Windows version of Junkbuster puts an icon in - the system tray, which allows you to change this option without having to - edit this file. If you right-click on that icon (or select the - Options menu), one choice is Enable. Clicking - on enable toggles Junkbuster on and off. This is - useful if you want to temporarily disable - Junkbuster, e.g., to access a site that requires - cookies which you normally have blocked. + Assuming that Privoxy and squid + run on the same box, your squid configuration could then look like this: - toggle 1 means Junkbuster runs - normally, toggle 0 means that - Junkbuster becomes a non-anonymizing non-blocking - proxy. Default: 1. + + # Define Privoxy as parent proxy (without ICP) + cache_peer 127.0.0.1 parent 8118 7 no-query + + # Define ACL for protocol FTP + acl ftp proto FTP + + # Do not forward FTP requests to Privoxy + always_direct allow ftp + + # Forward all the rest to Privoxy + never_direct allow all + - - - - toggle 1 - - - + You would then need to change your browser's proxy settings to squid's address and port. + Squid normally uses port 3128. If unsure consult http_port in squid.conf. + + @@ -845,683 +2153,1330 @@ configuration section below. -Access Control List (ACL) - - Access controls are included at the request of some ISPs and systems - administrators, and are not usually needed by individual users. Please note - the warnings in the FAQ that this proxy is not intended to be a substitute - for a firewall or to encourage anyone to defer addressing basic security - weaknesses. 
- - - - If no access settings are specified, the proxy talks to anyone that - connects. If any access settings file are specified, then the proxy - talks only to IP addresses permitted somewhere in this file and not - denied later in this file. - - +Windows GUI Options + - Summary -- if using an ACL: + Privoxy has a number of options specific to the + Windows GUI interface: - - - Client must have permission to receive service. - - - - - LAST match in ACL wins. - - - - - Default behavior is to deny service. - - - - The syntax for an entry in the Access Control List is: + If activity-animation is set to 1, the + Privoxy icon will animate when + Privoxy is active. To turn off, set to 0. - + - ACTION SRC_ADDR[/SRC_MASKLEN] [ DST_ADDR[/DST_MASKLEN] ] + activity-animation 1 - + - Where the individual fields are: + If log-messages is set to 1, + Privoxy will log messages to the console + window: - + - ACTION = permit-access or deny-access - - SRC_ADDR = client hostname or dotted IP address - SRC_MASKLEN = number of bits in the subnet mask for the source - - DST_ADDR = server or forwarder hostname or dotted IP address - DST_MASKLEN = number of bits in the subnet mask for the target + log-messages 1 - + - - The field separator (FS) is whitespace (space or tab). - - - - IMPORTANT NOTE: If the junkbuster is using a - forwarder (see below) or a gateway for a particular destination URL, the - DST_ADDR that is examined is the address of the forwarder - or the gateway and NOT the address of the ultimate - target. This is necessary because it may be impossible for the local - Junkbuster to determine the address of the - ultimate target (that's often what gateways are used for). + If log-buffer-size is set to 1, the size of the log buffer, + i.e. the amount of memory used for the log messages displayed in the + console window, will be limited to log-max-lines (see below). 
- Here are a few examples to show how the ACL features work: - - - - localhost is OK -- no DST_ADDR implies that - ALL destination addresses are OK: + Warning: Setting this to 0 will cause the buffer to grow without limit and + eat up all your memory! - + - permit-access localhost + log-buffer-size 1 - + - A silly example to illustrate permitting any host on the class-C subnet with - Junkbuster to go anywhere: + log-max-lines is the maximum number of lines held + in the log buffer. See above. - + - permit-access www.junkbusters.com/24 + log-max-lines 200 - + - Except deny one particular IP address from using it at all: + If log-highlight-messages is set to 1, + Privoxy will highlight portions of the log + messages with a bold-faced font: - + - deny-access ident.junkbusters.com + log-highlight-messages 1 - + - You can also specify an explicit network address and subnet mask. - Explicit addresses do not have to be resolved to be used. + The font used in the console window: - + - permit-access 207.153.200.0/24 + log-font-name Comic Sans MS - + - A subnet mask of 0 matches anything, so the next line permits everyone. + Font size used in the console window: - + - permit-access 0.0.0.0/0 + log-font-size 8 - + - - Note, you cannot say: + + show-on-task-bar controls whether or not + Privoxy will appear as a button on the Task bar + when minimized: - + - permit-access .org + show-on-task-bar 0 - + - to allow all *.org domains. Every IP address listed must resolve fully. - - - - An ISP may want to provide a Junkbuster that is - accessible by the world and yet restrict use of some of their - private content to hosts on its internal network (i.e. its own subscribers). - Say, for instance the ISP owns the Class-B IP address block 123.124.0.0 (a 16 - bit netmask). This is how they could do it: + If close-button-minimizes is set to 1, the Windows close + button will minimize Privoxy instead of closing + the program (close with the exit option on the File menu). 
- + - permit-access 0.0.0.0/0 0.0.0.0/0 # other clients can go anywhere - # with the following exceptions: - - deny-access 0.0.0.0/0 123.124.0.0/16 # block all external requests for - # sites on the ISP's network - - permit 0.0.0.0/0 www.my_isp.com # except for the ISP's main - # web site - - permit 123.124.0.0/16 0.0.0.0/0 # the ISP's clients can go - # anywhere + close-button-minimizes 1 - + - Note that if some hostnames are listed with multiple IP addresses, - the primary value returned by DNS (via gethostbyname()) is used. Default: - Anyone can access the proxy. + The hide-console option is specific to the MS-Win console + version of Privoxy. If this option is used, + Privoxy will disconnect from and hide the + command console. + + + + + + + #hide-console + + + + - - -Forwarding + +The Actions File - This feature allows routing of HTTP requests via multiple proxies. - It can be used to better protect privacy and confidentiality when - accessing specific domains by routing requests to those domains - to a special purpose filtering proxy such as lpwa.com. + The actions file (default.action, formerly: + actionsfile or ijb.action) is used + to define what actions Privoxy takes for which + URLs, and thus determines how ad images, cookies and various other aspects + of HTTP content and transactions are handled on which sites (or even parts + thereof). - - It can also be used in an environment with multiple networks to route - requests via multiple gateways allowing transparent access to multiple - networks without having to modify browser configurations. + + Anything you want can be blocked, including ads, banners, or just some obnoxious + URL that you would rather not see. Cookies can be accepted or rejected, or + accepted only during the current browser session (i.e. not written to disk), + content can be modified, JavaScripts tamed, user-tracking fooled, and much more. + See below for a complete list of available actions. 
+ + +Finding the Right Mix - Also specified here are SOCKS proxies. Junkbuster - SOCKS 4 and SOCKS 4A. The difference is that SOCKS 4A will resolve the target - hostname using DNS on the SOCKS server, not our local DNS client. + Note that some actions like cookie suppression or script disabling may + render some sites unusable, which rely on these techniques to work properly. + Finding the right mix of actions is not easy and certainly a matter of personal + taste. In general, it can be said that the more aggressive + your default settings (in the top section of the actions file) are, + the more exceptions for trusted sites you will have to + make later. If, for example, you want to kill popup windows per default, you'll + have to make exceptions from that rule for sites that you regularly use + and that require popups for actually useful content, like maybe your bank, + favorite shop, or newspaper. - The syntax of each line is: + We have tried to provide you with reasonable rules to start from in the + distribution actions file. But there is no general rule of thumb on these + things. There just are too many variables, and sites are constantly changing. + Sooner or later you will want to change the rules (and read this chapter). + + + +How to Edit - - - - forward target_domain[:port] http_proxy_host[:port] - forward-socks4 target_domain[:port] socks_proxy_host[:port] http_proxy_host[:port] - forward-socks4a target_domain[:port] socks_proxy_host[:port] http_proxy_host[:port] - - - + The easiest way to edit the actions file is with a browser by + using our browser-based editor, which is available at http://config.privoxy.org/edit-actions. - If http_proxy_host is ., then requests are not forwarded to a - HTTP proxy but are made directly to the web servers. + If you prefer plain text editing to GUIs, you can of course also directly edit the + default.action file. + - - Lines are checked in sequence, and the last match wins. 
- + +How Actions are Applied to URLs - There is an implicit line equivalent to the following, which specifies that - anything not finding a match on the list is to go out without forwarding - or gateway protocol, like so: + The actions file is divided into sections. There are special sections, + like the alias sections which will be discussed later. For now let's + concentrate on regular sections: They have a heading line (often split + up to multiple lines for readability) which consist of a list of actions, + separated by whitespace and enclosed in curly braces. Below that, there + is a list of URL patterns, each on a separate line. - - - - forward .* . # implicit - - - + To determine which actions apply to a request, the URL of the request is + compared to all patterns in this file. Every time it matches, the list of + applicable actions for the URL is incrementally updated, using the heading + of the section in which the pattern is located. If multiple matches for + the same URL set the same action differently, the last match wins. - In the following common configuration, everything goes to Lucent's LPWA, - except SSL on port 443 (which it doesn't handle): + You can trace this process by visiting http://config.privoxy.org/show-url-info. - - - - forward .* lpwa.com:8000 - forward :443 . - - - + More detail on this is provided in the Appendix, + Anatomy of an Action. + + + +Patterns - See the FAQ for instructions on how to automate the login procedure for LPWA. - Some users have reported difficulties related to LPWA's use of - . as the last element of the domain, and have said that this - can be fixed with this: - - - - - - - forward lpwa. lpwa.com:8000 - - - - - - - (NOTE: the syntax for specifiying target_domain has changed since the - previous paragraph was written -- it will not work now. More information - is welcome.) + Generally, a pattern has the form <domain>/<path>, + where both the <domain> and <path> + are optional. 
(This is why the pattern / matches all URLs). - - In this fictitious example, everything goes via an ISP's caching proxy, - except requests to that ISP: - + + + www.example.com/ + + + is a domain-only pattern and will match any request to www.example.com, + regardless of which document on that server is requested. + + + + + www.example.com + + + means exactly the same. For domain-only patterns, the trailing / may + be omitted. + + + + + www.example.com/index.html + + + matches only the single document /index.html + on www.example.com. + + + + + /index.html + + + matches the document /index.html, regardless of the domain, + i.e. on any web server. + + + + + index.html + + + matches nothing, since it would be interpreted as a domain name and + there is no top-level domain called .html. + + + + - - - - - forward .* caching.myisp.net:8000 - forward myisp.net . - - - - +The Domain Pattern - For the @home network, we're told the forwarding configuration is this: + The matching of the domain part offers some flexible options: if the + domain starts or ends with a dot, it becomes unanchored at that end. + For example: + + + .example.com + + + matches any domain that ENDS in + .example.com + + + + + www. + + + matches any domain that STARTS with + www. + + + + + .example. + + + matches any domain that CONTAINS .example. + (Correctly speaking: It matches any FQDN that contains example as a domain.) + + + + - - - - forward .* proxy:8080 - - - + Additionally, there are wild-cards that you can use in the domain names + themselves. They work pretty similar to shell wild-cards: * + stands for zero or more arbitrary characters, ? stands for + any single character, you can define character classes in square + brackets and all of that can be freely mixed: - - Also, we're told they insist on getting cookies and JavaScript, so you need - to add home.com to the cookie file. We consider JavaScript a security risk. - Java need not be enabled. 
- + + + ad*.example.com + + + matches adserver.example.com, + ads.example.com, etc but not sfads.example.com + + + + + *ad*.example.com + + + matches all of the above, and then some. + + + + + .?pix.com + + + matches www.ipix.com, + pictures.epix.com, a.b.c.d.e.upix.com etc. + + + + + www[1-9a-ez].example.c* + + + matches www1.example.com, + www4.example.cc, wwwd.example.cy, + wwwz.example.com etc., but not + wwww.example.com. + + + + - - In this example direct connections are made to all internal - domains, but everything else goes through Lucent's LPWA by way of the - company's SOCKS gateway to the Internet. - + + +The Path Pattern - - - - forward_socks4 .* lpwa.com:8000 firewall.my_company.com:1080 - forward my_company.com . - - - + Privoxy uses Perl compatible regular expressions + (through the PCRE library) for + matching the path. - This is how you could set up a site that always uses SOCKS but no forwarders: + There is an Appendix with a brief quick-start into regular + expressions, and full (very technical) documentation on PCRE regex syntax is available on-line + at http://www.pcre.org/man.txt. + You might also find the Perl man page on regular expressions (man perlre) + useful, which is available on-line at http://www.perldoc.com/perl5.6/pod/perlre.html. - - - - forward_socks4a .* . firewall.my_company.com:1080 - - - + Note that the path pattern is automatically left-anchored at the /, + i.e. it matches as if it would start with a ^. - An advanced example for network administrators: + Please also note that matching in the path is case + INSENSITIVE by default, but you can switch to case + sensitive at any point in the pattern by using the + (?-i) switch: + www.example.com/(?-i)PaTtErN.* will match only + documents whose path starts with PaTtErN in + exactly this capitalization. 
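The domain and path rules described above can be tried out with a small Python sketch. This is only an approximation written for illustration, not Privoxy's own matching code: the function names `domain_matches` and `url_matches` are hypothetical, host names are compared label by label using shell-style wild-cards, and Python's `re` module stands in for PCRE. The `(?-i)` switch is not modeled here.

```python
import re
from fnmatch import fnmatchcase


def domain_matches(pattern: str, host: str) -> bool:
    """Approximate the domain rules: a leading or trailing dot unanchors that end."""
    if not pattern:
        return True  # an empty domain part matches every host
    open_left = pattern.startswith(".")
    open_right = pattern.endswith(".")
    want = [label for label in pattern.lower().split(".") if label]
    have = host.lower().split(".")
    n, m = len(want), len(have)
    if n > m:
        return False
    if open_left and open_right:
        starts = range(m - n + 1)   # may sit anywhere in the host name
    elif open_left:
        starts = [m - n]            # must line up with the end of the host name
    elif open_right:
        starts = [0]                # must line up with the start of the host name
    elif n == m:
        starts = [0]                # fully anchored: label counts must agree
    else:
        return False
    # Each pattern label may use *, ? and [...] like a shell wild-card.
    return any(
        all(fnmatchcase(h, w) for h, w in zip(have[s:s + n], want))
        for s in starts
    )


def url_matches(pattern: str, host: str, path: str) -> bool:
    """Split <domain>/<path>; the path part is a left-anchored, case-insensitive regex."""
    domain, slash, rest = pattern.partition("/")
    if not domain_matches(domain, host):
        return False
    if not slash:
        return True  # domain-only pattern: any document on that host
    return re.match("/" + rest, path, re.IGNORECASE) is not None
```

With these assumptions, `url_matches("ad*.example.com", "ads.example.com", "/x.gif")` is true, while `url_matches("index.html", "www.example.com", "/index.html")` is false because `index.html` is read as a domain name.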
+ + + + + + + + + + +Actions - If you have links to multiple ISPs that provide various special content to - their subscribers, you can configure forwarding to pass requests to the - specific host that's connected to that ISP so that everybody can see all - of the content on all of the ISPs. + Actions are enabled if preceded with a +, and disabled if + preceded with a -. Actions are invoked by enclosing the + action name in curly braces (e.g. {+some_action}), followed by a list of + URLs to which the action applies. There are three classes of actions: - This is a bit tricky, but here's an example: - + + + + + Boolean (e.g. +/-block): + + + + + + {+name} # enable this action + {-name} # disable this action + + + + + - - host-a has a PPP connection to isp-a.com. And host-b has a PPP connection to - isp-b.com. host-a can run a Junkbuster proxy with - forwarding like this: + + + parameterized (e.g. +/-hide-user-agent): + + + + + + {+name{param}} # enable action and set parameter to param + {-name} # disable action + + + + + + + + + Multi-value (e.g. {+/-add-header{Name: value}}, {+/-wafer{name=value}}): + + + + + + {+name{param}} # enable action and add parameter param + {-name{param}} # remove the parameter param + {-name} # disable this action totally + + + + + + + - - - - forward .* . - forward isp-b.com host-b:8000 - - - + If nothing is specified in this file, no actions are taken. + So in this case Privoxy would just be a + normal, non-blocking, non-anonymizing proxy. You must specifically + enable the privacy and blocking features you need (although the + provided default default.action file will + give a good starting point). - host-b can run a Junkbuster proxy with forwarding - like this: + Later defined actions always over-ride earlier ones. So exceptions + to any rules you make, should come in the latter part of the file. For + multi-valued actions, the actions are applied in the order they are + specified. - - - - forward .* . 
- forward isp-a.com host-a:8000 - - - + The list of valid Privoxy actions are: - Now, anyone on the Internet (including users on host-a - and host-b) can set their browser's proxy to either - host-a or host-b and be able to browse the content on isp-a or isp-b. - + + + + + Add the specified HTTP header, which is not checked for validity. + You may specify this many times to specify many different headers: + + + + + + +add-header{Name: value} + + + + + + + + + + Block this URL totally. In a default installation, a blocked + URL will result in bright red banner that says BLOCKED, + with a reason why it is being blocked, and an option to see it anyway. + The page displayed for this is the blocked template + file. + + + + + + +block + + + + + + + + + + De-animate all animated GIF images, i.e. reduce them to their last frame. + This will also shrink the images considerably (in bytes, not pixels!). If + the option first is given, the first frame of the animation + is used as the replacement. If last is given, the last frame + of the animation is used instead, which probably makes more sense for most + banner animations, but also has the risk of not showing the entire last + frame (if it is only a delta to an earlier frame). + + + + + + +deanimate-gifs{last} + +deanimate-gifs{first} + + + + + + + + + +downgrade will downgrade HTTP/1.1 client requests to + HTTP/1.0 and downgrade the responses as well. Use this action for servers + that use HTTP/1.1 protocol features that + Privoxy doesn't handle well yet. HTTP/1.1 + is only partially implemented. Default is not to downgrade requests. + + + + + + +downgrade + + + + + + + + + Many sites, like yahoo.com, don't just link to other sites. Instead, they + will link to some script on their own server, giving the destination as a + parameter, which will then redirect you to the final target. URLs resulting + from this scheme typically look like: + http://some.place/some_script?http://some.where-else. 
+ + + Sometimes, there are even multiple consecutive redirects encoded in the + URL. These redirections via scripts make your web browsing more traceable, + since the server from which you follow such a link can see where you go to. + Apart from that, valuable bandwidth and time are wasted while your browser + asks the server for one redirect after the other. Plus, it feeds the + advertisers. + + + The +fast-redirects option enables interception of these + types of requests by Privoxy, which will cut off + all but the last valid URL in the request and send a local redirect back to + your browser without contacting the intermediate site(s). + + + + + + +fast-redirects + + + + + + + + + Apply the filters in the section_header + section of the default.filter file to the site(s). + default.filter sections are grouped according to like + functionality. Filters can be used to + re-write any of the raw page content. This is potentially a + very powerful feature! + + + + + + + +filter{section_header} + + + + + + + Filter sections that are pre-defined in the supplied + default.filter include: + + +
+ + + html-annoyances: Get rid of particularly annoying HTML abuse. + + + + + js-annoyances: Get rid of particularly annoying JavaScript abuse + + + + + content-cookies: Kill cookies that come in the HTML or JS content + + + + + popups: Kill all popups in JS and HTML + + + + + frameset-borders: Give frames a border and make them resizable + + + + + webbugs: Squish WebBugs (1x1 invisible GIFs used for user tracking) + + + + + refresh-tags: Kill automatic refresh tags (for dial-on-demand setups) + + + + + fun: Text replacements for subversive browsing fun! + + + + + nimda: Remove Nimda (virus) code. + + + + + banners-by-size: Kill banners by size (very efficient!) + + + + + shockwave-flash: Kill embedded Shockwave Flash objects + + + + + crude-parental: Kill all web pages that contain the words "sex" or "warez" + + +
+ + + + Note: Filtering requires buffering the page content, which may appear to slow down + page rendering since nothing is displayed until all content has passed + the filters. (It does not really take longer, but seems that way since + the page is not incrementally displayed.) This effect will be more noticeable + on slower connections. + + +
+ + + + Block any existing X-Forwarded-for header, and do not add a new one: + + + + + + +hide-forwarded + + + + + + + + + If the browser sends a From: header containing your e-mail + address, this either completely removes the header (block), or + changes it to the specified e-mail address. + + + + + + +hide-from{block} + +hide-from{spam@sittingduck.xqq} + + + + + + + + + Don't send the Referer: (sic) header to the web site. You + can block it, forge a URL to the same server as the request (which is + preferred because some sites will not send images otherwise) or set it to a + constant, user defined string of your choice. + + + + + + +hide-referer{block} + +hide-referer{forge} + +hide-referer{http://nowhere.com} + + + + + + + + + Alternative spelling of +hide-referer. It has the same + parameters, and can be freely mixed with, +hide-referer. + (referrer is the correct English spelling, however the HTTP + specification has a bug - it requires it to be spelled referer.) + + + + + + +hide-referrer{...} + + + + + + + + + Change the User-Agent: header so web servers can't tell your + browser type. Warning! This breaks many web sites. Specify the + user-agent value you want. Example, pretend to be using Netscape on + Linux: + + + + + + +hide-user-agent{Mozilla (X11; I; Linux 2.0.32 i586)} + + + + + + + + + + Treat this URL as an image. This only matters if it's also +blocked, + in which case a blocked image can be sent rather than a HTML page. + See +image-blocker{} below for the control over what is actually sent. + If you want invisible ads, they should be defined as + images and blocked. And also, + image-blocker should be set to blank. Note you + cannot treat HTML pages as images in most cases. For instance, frames + require an HTML page to display. So a frame that is an ad, cannot be + treated as an image. Forcing an image in this + situation just will not work. 
+ + + + + + +image + + + + + + + + Decides what to do with URLs that end up tagged with {+block + +image}, e.g. an advertisement. There are four options. + -image-blocker will send an HTML blocked page, + usually resulting in a broken image icon. + + + ++image-blocker{blank} will send a 1x1 transparent GIF +image. +image-blocker{http://xyz.com} will send an +HTTP temporary redirect to the specified image. This has the advantage of the +icon being cached by the browser, which will speed up the display. And finally, ++image-blocker{pattern} will send a checkerboard type pattern: + + + + + + + + + + +image-blocker{blank} + +image-blocker{pattern} + +image-blocker{http://p.p/send-banner} + + + + + + + + + By default (i.e. in the absence of a +limit-connect + action), Privoxy will, as a precaution, only allow CONNECT + requests to port 443, which is the standard port for https. + + + + The CONNECT method exists in HTTP to allow access to secure websites + (https:// URLs) through proxies. It works very simply: the proxy + connects to the server on the specified port, and then short-circuits + its connections to the client and to the remote server. + This can be a big security hole, since CONNECT-enabled proxies can + be abused as TCP relays very easily. + + + + If you want to allow CONNECT for more ports than this, or want to forbid + CONNECT altogether, you can specify a comma separated list of ports and + port ranges (the latter using dashes, with the minimum defaulting to 0 and + max to 65K): + + + + + + + +limit-connect{443} # This is the default and need not be specified. + +limit-connect{80,443} # Ports 80 and 443 are OK. + +limit-connect{-3, 7, 20-100, 500-} # Ports less than 3, 7, 20 to 100 + # and above 500 are OK. + + + + + + + + + + +no-compression prevents the website from compressing the + data. Some websites do this, which can be a problem for + Privoxy, since +filter, + +no-popup and +deanimate-gifs will not work on + compressed data.
This will slow down connections to those websites, + though. Default is no-compression is turned on. + + + + + + + +nocompression + + + + + + + + + If the website sets cookies, no-cookies-keep will make sure + they are erased when you exit and restart your web browser. This makes + profiling cookies useless, but won't break sites which require cookies so + that you can log in for transactions. Default: on. + + + + + + +no-cookies-keep + + + + + + + + + Prevent the website from reading cookies: + + + + + + +no-cookies-read + + + + + + + + + Prevent the website from setting cookies: + + + + + + +no-cookies-set + + + + + + + + + Filter the website through a built-in filter to disable those obnoxious + JavaScript pop-up windows via window.open(), etc. The two alternative + spellings are equivalent. + + + + + + +no-popup + +no-popups + + + + + + + + + This action only applies if you are using a jarfile + for saving cookies. It sends a cookie to every site stating that you do not + accept any copyright on cookies sent to you, and asking them not to track + you. Of course, this is a (relatively) unique header they could use to + track you. + + + + + + +vanilla-wafer + + + + + + + + + This allows you to add an arbitrary cookie. It can be specified multiple + times in order to add as many cookies as you like. + + + + + + +wafer{name=value} + + + + + - - Here's another practical example, for University of Kent at - Canterbury students with a network connection in their room, who - need to use the University's Squid web cache. +
- - - - forward *. ssbcache.ukc.ac.uk:3128 # Use the proxy, except for: - forward .ukc.ac.uk . # Anything on the same domain as us - forward * . # Host with no domain specified - forward 129.12.*.* . # A dotted IP on our /16 network. - forward 127.*.*.* . # Loopback address - forward localhost.localdomain . # Loopback address - forward www.ukc.mirror.ac.uk . # Specific host - - - + The meaning of any of the above is reversed by preceding the action with a + -, in place of the +. - If you intend to chain Junkbuster and - squid locally, then chain as - browser -> squid -> junkbuster is the recommended way. + Some examples: - Your squid configuration could then look like this: + Turn off cookies by default, then allow a few through for specified sites: - + - + - # Define junkbuster as parent cache - cache_peer 127.0.0.1 8000 parent 0 no-query - - # Define ACL for protocol FTP - acl FTP proto FTP - - # Do not forward ACL FTP to junkbuster - always_direct allow FTP + # Turn off all persistent cookies + { +no-cookies-read } + { +no-cookies-set } + # Allow cookies for this browser session ONLY + { +no-cookies-keep } - # Do not forward ACL CONNECT (https) to junkbuster - always_direct allow CONNECT + # Exceptions to the above, sites that benefit from persistent cookies + { -no-cookies-read } + { -no-cookies-set } + { -no-cookies-keep } + .javasoft.com + .sun.com + .yahoo.com + .msdn.microsoft.com + .redhat.com - # Forward the rest to junkbuster - never_direct allow all + # Alternative way of saying the same thing + {-no-cookies-set -no-cookies-read -no-cookies-keep} + .sourceforge.net + .sf.net - + - - - - - - - - -Windows GUI Options - - - Junkbuster has a number of options specific to the - Windows GUI interface: - - - If activity-animation is set to 1, the - Junkbuster icon will animate when - Junkbuster is active. To turn off, set to 0. + Now turn off fast redirects, and then we allow two exceptions: - + - activity-animation 1 + # Turn them off! 
+ {+fast-redirects} + + # Reverse it for these two sites, which don't work right without it. + {-fast-redirects} + www.ukc.ac.uk/cgi-bin/wac\.cgi\? + login.yahoo.com - + - If log-messages is set to 1, - Junkbuster will log messages to the console - window: - + Turn on page filtering according to rules in the defined sections + of default.filter, and make one exception for + Sourceforge: + - + - log-messages 1 + # Run everything through the filter file, using only the + # specified sections: + +filter{html-annoyances} +filter{js-annoyances} +filter{no-popups}\ + +filter{webbugs} +filter{nimda} +filter{banners-by-size} + + # Then disable filtering of code from sourceforge! + {-filter} + .cvs.sourceforge.net - + - - If log-buffer-size is set to 1, the size of the log buffer, - i.e. the amount of memory used for the log messages displayed in the - console window, will be limited to log-max-lines (see below). - - - Warning: Setting this to 0 will result in the buffer to grow infinitely and - eat up all your memory! + Now some URLs that we want blocked (normally generates + the blocked banner). 
Many of these use regular expressions + that will expand to match multiple URLs: - + - log-buffer-size 1 + # Blocklist: + {+block} + /.*/(.*[-_.])?ads?[0-9]?(/|[-_.].*|\.(gif|jpe?g)) + /.*/(.*[-_.])?count(er)?(\.cgi|\.dll|\.exe|[?/]) + /.*/(ng)?adclient\.cgi + /.*/(plain|live|rotate)[-_.]?ads?/ + /.*/(sponsor)s?[0-9]?/ + /.*/_?(plain|live)?ads?(-banners)?/ + /.*/abanners/ + /.*/ad(sdna_image|gifs?)/ + /.*/ad(server|stream|juggler)\.(cgi|pl|dll|exe) + /.*/adbanners/ + /.*/adserver + /.*/adstream\.cgi + /.*/adv((er)?ts?|ertis(ing|ements?))?/ + /.*/banner_?ads/ + /.*/banners?/ + /.*/banners?\.cgi/ + /.*/cgi-bin/centralad/getimage + /.*/images/addver\.gif + /.*/images/marketing/.*\.(gif|jpe?g) + /.*/popupads/ + /.*/siteads/ + /.*/sponsor.*\.gif + /.*/sponsors?[0-9]?/ + /.*/advert[0-9]+\.jpg + /Media/Images/Adds/ + /ad_images/ + /adimages/ + /.*/ads/ + /bannerfarm/ + /grafikk/annonse/ + /graphics/defaultAd/ + /image\.ng/AdType + /image\.ng/transactionID + /images/.*/.*_anim\.gif # alvin brattli + /ip_img/.*\.(gif|jpe?g) + /rotateads/ + /rotations/ + /worldnet/ad\.cgi + /cgi-bin/nph-adclick.exe/ + /.*/Image/BannerAdvertising/ + /.*/ad-bin/ + /.*/adlib/server\.cgi + /autoads/ - + - log-max-lines is the maximum number of lines held - in the log buffer. See above. - - - - - - - log-max-lines 200 - - - + Note that many of these actions have the potential to cause a page to + misbehave, possibly even not to display at all. There are many ways + a site designer may choose to design his site, and what HTTP header + content he may depend on. There is no way to have hard and fast rules + for all sites. See the Appendix + for a brief example on troubleshooting actions. 
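The way section headings accumulate into a final set of actions ("the last match wins") can be illustrated with a short Python sketch. It is a simplified model for illustration only, not Privoxy's implementation: `parse_heading`, `host_matches`, `resolve` and the `SECTIONS` list are hypothetical names, hosts are matched by plain suffix instead of the full pattern syntax, and multi-valued actions are treated like parameterized ones.

```python
import re

# One token looks like +name, -name or +name{param}.
TOKEN = re.compile(r"([+-])([\w-]+)(?:\{([^}]*)\})?")


def parse_heading(heading):
    """Turn '+block -filter +hide-referer{forge}' into (name, enabled, param) tuples."""
    return [(name, sign == "+", param) for sign, name, param in TOKEN.findall(heading)]


def host_matches(pattern, host):
    """Simplified stand-in for pattern matching: '/' matches all, '.x.com' by suffix."""
    if pattern == "/":
        return True
    suffix = pattern.lstrip(".")
    return host == suffix or host.endswith("." + suffix)


def resolve(sections, host):
    """Walk the sections top to bottom; later matching sections override earlier ones."""
    active = {}
    for heading, patterns in sections:
        if any(host_matches(p, host) for p in patterns):
            for name, enabled, param in parse_heading(heading):
                if enabled:
                    active[name] = param
                else:
                    active.pop(name, None)
    return active


# Modeled on the cookie example above, with "/" assumed as the catch-all pattern.
SECTIONS = [
    ("+no-cookies-read +no-cookies-set +no-cookies-keep", ["/"]),
    ("-no-cookies-read -no-cookies-set -no-cookies-keep", [".sun.com", ".yahoo.com"]),
    ("+hide-referer{forge}", ["/"]),
]
```

Under these assumptions, `resolve(SECTIONS, "www.sun.com")` keeps only `hide-referer`, because the exception section comes after the catch-all section that enabled the cookie actions.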
- - If log-highlight-messages is set to 1, - Junkbuster will highlight portions of the log - messages with a bold-faced font: - + - - - - - log-highlight-messages 1 - - - - + - - The font used in the console window: - + + +Aliases - - - - log-font-name Comic Sans MS - - - + Custom actions, known to Privoxy + as aliases, can be defined by combining other actions. + These can in turn be invoked just like the built-in actions. + Currently, an alias can contain any character except space, tab, =, + { or }. But please use only a- + z, 0-9, +, and + -. Alias names are not case sensitive, and + must be defined before anything else in the + default.actionfile! And there can only be one set of + aliases defined. - Font size used in the console window: + Now let's define a few aliases: - + - log-font-size 8 - - - - - - - show-on-task-bar controls whether or not - Junkbuster will appear as a button on the Task bar - when minimized: - + # Useful custom aliases we can use later. These must come first! + {{alias}} + +no-cookies = +no-cookies-set +no-cookies-read + -no-cookies = -no-cookies-set -no-cookies-read + fragile = -block -no-cookies -filter -fast-redirects -hide-referer -no-popups + shop = -no-cookies -filter -fast-redirects + +imageblock = +block +image - - - - - show-on-task-bar 0 + #For people who don't like to type too much: ;-) + c0 = +no-cookies + c1 = -no-cookies + c2 = -no-cookies-set +no-cookies-read + c3 = +no-cookies-set -no-cookies-read + #... etc. Customize to your heart's content. - + - If close-button-minimizes is set to 1, the Windows close - button will minimize Junkbuster instead of closing - the program (close with the exit option on the File menu). + Some examples using our shop and fragile + aliases from above: - + - close-button-minimizes 1 + # These sites are very complex and require + # minimal interference. + {fragile} + .office.microsoft.com + .windowsupdate.microsoft.com + .nytimes.com + + # Shopping sites - still want to block ads. 
+ {shop} + .quietpc.com + .worldpay.com # for quietpc.com + .jungle.com + .scan.co.uk + + # These shops require pop-ups + {shop -no-popups} + .dabs.com + .overclockers.co.uk - + - The hide-console option is specific to the MS-Win console - version of JunkBuster. If this option is used, - Junkbuster will disconnect from and hide the - command console. - + The shop and fragile aliases are often used for + problem sites that require most actions to be disabled + in order to function properly. - - - - - #hide-console - - - @@ -1531,147 +3486,106 @@ Removed references to Win32. HB 09/23/01 - -The Actions File - - - The actionsfile is used to define what actions - Junkbuster takes, and thus determines how images, - cookies and various other aspects of HTTP content and transactions are - handled. Images can be anything you want, including ads, banners, or just - some obnoxious image that you would rather not see. Cookies can be accepted - or rejected. The default file is in fact named actionsfile. - - - - To determine which actions apply to a request, the URL of the request is - compared to all patterns in this file. Every time it matches, the list of - applicable actions for the URL is incrementally updated. You can trace - this process by visiting http://i.j.b/show-url-info. - - - - There are four types of lines in this file: comments (begin with a - # character), actions, aliases and patterns, all of which are - explained below. - - - - - -URL Domain and Path Syntax - - Generally, a pattern has the form <domain>/<path>, where both the - <domain> and <path> part are optional. If you only specify a - domain part, the / can be left out: - - - - www.example.com - is a domain only pattern and will match any request to - www.example.com. - - - - www.example.com/ - means exactly the same. - - - - www.example.com/index.html - matches only the single - document /index.html on www.example.com. - - - - /index.html - matches the document /index.html, regardless of - the domain. 
- - - - index.html - matches nothing, since it would be - interpreted as a domain name and there is no top-level domain called - .html. - - - - The matching of the domain part offers some flexible options: if the - domain starts or ends with a dot, it becomes unanchored at that end. - For example: - - + +The Filter File - .example.com - matches any domain that ENDS in - .example.com. + Any web page can be dynamically modified with the filter file. This + modification can be removal, or re-writing, of any web page content, + including tags and non-visible content. The default filter file is + default.filter, located in the config directory. - www. - matches any domain that STARTS with - www. + This is potentially a very powerful feature, and requires knowledge of both + regular expressions and HTML in order to create custom + filters. However, a number of useful filters are included with + Privoxy for many common situations. - Additionally, there are wildcards that you can use in the domain names - themselves. They work pretty similar to shell wildcards: * - stands for zero or more arbitrary characters, ? stands for - any single character. And you can define charachter classes in square - brackets and they can be freely mixed: + The included example file is divided into sections. Each section begins + with the FILTER keyword, followed by the identifier + for that section, e.g. FILTER: webbugs. Each section performs + a similar type of filtering, such as html-annoyances. - ad*.example.com - matches adserver.example.com, - ads.example.com, etc but not sfads.example.com. + This file uses regular expressions to alter or remove any string in the + target page. The expressions can only operate on one line at a time. Some + examples from the included default default.filter: - *ad*.example.com - matches all of the above, and then some.
+ Stop web pages from displaying annoying messages in the status bar by + deleting such references: - .?pix.com - matches www.ipix.com, - pictures.epix.com, a.b.c.d.e.upix.com, etc. - + + + + FILTER: html-annoyances - - www[1-9a-ez].example.com - matches www1.example.com, - www4.example.com, wwwd.example.com, - wwwz.example.com, etc., but not - wwww.example.com. + # New browser windows should be resizeable and have a location and status + # bar. Make it so. + # + s/resizable="?(no|0)"?/resizable=1/ig s/noresize/yesresize/ig + s/location="?(no|0)"?/location=1/ig s/status="?(no|0)"?/status=1/ig + s/scrolling="?(no|0|Auto)"?/scrolling=1/ig + s/menubar="?(no|0)"?/menubar=1/ig + + # The <BLINK> tag was a crime! + # + s*<blink>|</blink>**ig + + # Is this evil? + # + #s/framespacing="?(no|0)"?//ig + #s/margin(height|width)=[0-9]*//gi + + + - If Junkbuster was compiled with - pcre support (default), Perl compatible regular expressions - can be used. See the pcre/docs/ direcory or man - perlre (also available on http://www.perldoc.com/perl5.6/pod/perlre.html) - for details. A brief discussion of regular expressions is in the - Appendix. For instance: + Just for kicks, replace any occurrence of Microsoft with + MicroSuck, and have a little fun with topical buzzwords: - /.*/advert[0-9]+\.jpe?g - would match a URL from any - domain, with any path that includes advert followed - immediately by one or more digits, then a . and ending in - either jpeg or jpg. So we match - example.com/ads/advert2.jpg, and - www.example.com/ads/banners/advert39.jpeg, but not - www.example.com/ads/banners/advert39.gif (no gifs in the - example pattern). 
+ + + + FILTER: fun + + s/microsoft(?!.com)/MicroSuck/ig + + # Buzzword Bingo: + # + s/industry-leading|cutting-edge|award-winning/<font color=red><b>BINGO!</b></font>/ig + + + - Please note that matching in the path is case - INSENSITIVE by default, but you can switch to case - sensitive at any point in the pattern by using the - (?-i) switch: + Kill those pesky little web-bugs: - www.example.com/(?-i)PaTtErN.* - will match only - documents whose path starts with PaTtErN in - exactly this capitalization. + + + + # webbugs: Squish WebBugs (1x1 invisible GIFs used for user tracking) + FILTER: webbugs + + s/<img\s+[^>]*?(width|height)\s*=\s*['"]?1\D[^>]*?(width|height)\s*=\s*['"]?1(\D[^>]*?)?>/<!-- Squished WebBug -->/sig + + + - + @@ -1679,768 +3593,813 @@ Removed references to Win32. HB 09/23/01 - -Actions + +Templates - Actions are enabled if preceded with a +, and disabled if - preceded with a -. Actions are invoked by enclosing the - action name in curly braces (e.g. {+some_action}), followed by a list of - URLs to which the action applies. There are three classes of actions: + When Privoxy displays one of its internal + pages, such as a 404 Not Found error page, it uses the appropriate template. + On Linux, BSD, and Unix, these are located in + /etc/privoxy/templates by default. These may be + customized, if desired. cgi-style.css is + used to control the HTML attributes (fonts, etc). - - + The default Blocked banner page with the bright red top + banner, is called just blocked. This + may be customized or replaced with something else if desired. - - - Boolean (e.g. +/-block): - - - - - - {+name} # enable this action - {-name} # disable this action - - - - - + + +
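Returning briefly to the filter file: the webbugs substitution quoted in that section can be tried outside Privoxy. The sketch below reuses the same expression with Python's `re` module standing in for PCRE, with the `s`, `i` and `g` flags mapped to `re.DOTALL`, `re.IGNORECASE` and replace-all. The function name `squish_webbugs` and the sample tags are made up for illustration.

```python
import re

# Same expression as the "webbugs" filter; flags "sig" become DOTALL | IGNORECASE,
# and re.sub() already replaces every occurrence (the "g" flag).
WEBBUG = re.compile(
    r"""<img\s+[^>]*?(width|height)\s*=\s*['"]?1\D[^>]*?(width|height)\s*=\s*['"]?1(\D[^>]*?)?>""",
    re.IGNORECASE | re.DOTALL,
)


def squish_webbugs(html):
    """Replace 1x1 image tags with an HTML comment, leaving other images alone."""
    return WEBBUG.sub("<!-- Squished WebBug -->", html)
```

For example, a tag such as `<img src="http://t.example/x.gif" width="1" height="1">` is replaced by the comment, while an image with `width="100" height="100"` is left untouched, since the `1` must be followed by a non-digit.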
- - - Parameterized (e.g. +/-hide-user-agent): - - - - - - {+name{param}} # enable action and set parameter to param - {-name} # disable action - - - - - - - - - Multi-value (e.g. {+/-add-header{Name: value}}, {+/-wafer{name=value}}): - - - - - - {+name{param}} # enable action and add parameter param - {-name{param}} # remove the parameter param - {-name} # disable this action totally - - - - - + + + + + + +Contacting the Developers, Bug Reporting and Feature +Requests + + + &contacting; + - - + + +Submitting Ads and <quote>Action</quote> Problems - If nothing is specified in this file, no actions are taken. - So in this case JunkBuster would just be a - normal, non-blocking, non-anonymizing proxy. You must specifically - enable the privacy and blocking features you need (although the - provided default actionsfile file will - give a good starting point). + Ads and banners that are not stopped by Privoxy + can be submitted to the developers by accessing a special page and filling + out the brief, required form. Conversely, you can also report pages, images, + etc. that Privoxy is blocking, but should not. + The form itself does require Internet access. - - Later defined actions always over-ride earlier ones. For multi-valued - actions, the actions are applied in the order they are specified. + To do this, point your browser to Privoxy + at http://config.privoxy.org/ + (shortcut: http://p.p/), and then select + Actions file feedback system, + near the bottom of the page. Paste in the URL that is the cause of the + unwanted behavior, and follow the prompts. The developers will + try to incorporate a fix for the problem you reported into future versions. - The list of valid Junkbuster actions are: + New default.actions files will occasionally be made + available based on your feedback. These + will be announced on the + ijbswa-announce + list. + - - - - - - Add the specified HTTP header, which is not checked for validity. 
- You may specify this many times to specify many different headers: - - - - - - +add-header{Name: value} - - - - - - - - - - Block this URL totally. - - - - - - +block - - - - - - - - - - De-animate all animated GIF images, i.e. reduce them to their last frame. - This will also shrink the images considerably (in bytes, not pixels!). If - the option first is given, the first frame of the animation - is used as the replacement. If last is given, the last frame - of the animation is used instead, which propably makes more sense for most - banner animations, but also has the risk of not showing the entire last - frame (if it is only a delta to an earlier frame). - - - - - - +deanimate-gifs{last} - +deanimate-gifs{first} - - - - - - - - - Many sites, like yahoo.com, don't just link to other sites. Instead, they - will link to some script on their own server, giving the destination as a - parameter, which will then redirect you to the final target. URLs resulting - from this scheme typically look like: - http://some.place/some_script?http://some.where-else. - - - Sometimes, there are even multiple consecutive redirects encoded in the - URL. These redirections via scripts make your web browing more traceable, - since the server from which you follow such a link can see where you go to. - Apart from that, valuable bandwidth and time is wasted, while your browser - ask the server for one redirect after the other. Plus, it feeds the - advertisers. - - - The +fast-redirects option enables interception of these - requests by Junkbuster, who will cut off all but - the last valid URL in the request and send a local redirect back to your - browser without contacting the remote site. 
- - - - - - +fast-redirects - - - - - + - - - Filter the website through the re_filterfile: - - - - - - +filter{filename} - - - - - - - - Block any existing X-Forwarded-for header, and do not add a new one: - - - - - - +hide-forwarded - - - - - + +Copyright and History - - - If the browser sends a From: header containing your e-mail - address, this either completely removes the header (block), or - changes it to the specified e-mail address. - - - - - - +hide-from{block} - +hide-from{spam@sittingduck.xqq} - - - - - - - - - Don't send the Referer: (sic) header to the web site. You - can block it, forge a URL to the same server as the request (which is - preferred because some sites will not send images otherwise) or set it to a - constant string of your choice. - - - - - - +hide-referer{block} - +hide-referer{forge} - +hide-referer{http://nowhere.com} - - - - - - - - - Alternative spelling of +hide-referer. It has the same - parameters, and can be freely mixed with, +hide-referer. - (referrer is the correct English spelling, however the HTTP - specification has a bug - it requires it to be spelled referer.) - - - - - - +hide-referrer{...} - - - - - +Copyright + + ©right; + + + + + + + + +History + + &history; + + + + + +See Also + + &seealso; + + + + + + +Appendix + + + + +Regular Expressions + + Privoxy can use regular expressions + in various config files. Assuming support for pcre (Perl + Compatible Regular Expressions) is compiled in, which is the default. Such + configuration directives do not require regular expressions, but they can be + used to increase flexibility by matching a pattern with wild-cards against + URLs. + + + + If you are reading this, you probably don't understand what regular + expressions are, or what they can do. So this will be a very brief + introduction only. A full explanation would require a book ;-) + + + + Regular expressions is a way of matching one character + expression against another to see if it matches or not. 
One of the + expressions is a literal string of readable characters + (letters, numbers, etc.), and the other is a complex string of literal + characters combined with wild-cards, and other special characters, called + meta-characters. The meta-characters have special meanings and + are used to build the complex pattern to be matched against. Perl Compatible + Regular Expressions is an enhanced form of the regular expression language + with backward compatibility. + + + + To make a simple analogy, we do something similar when we use wild-card + characters when listing files with the dir command in DOS. + *.* matches all filenames. The special + character here is the asterisk which matches any and all characters. We can be + more specific and use ? to match just individual + characters. So dir file?.txt would match + file1.txt, file2.txt, etc. We are pattern + matching, using a similar technique to regular expressions! + + + + Regular expressions do essentially the same thing, but are much, much more + powerful. There are many more special characters and ways of + building complex patterns. Let's look at a few of the common ones, + and then some examples: + + + + + . - Matches any single character, e.g. a, + A, 4, :, or @. + + + + + + ? - The preceding character or expression is matched ZERO or ONE + times. Either/or. + + + + + + + - The preceding character or expression is matched ONE or MORE + times. + + + + + + * - The preceding character or expression is matched ZERO or MORE + times. + + + + + + \ - The escape character denotes that + the following character should be taken literally. This is used where one of the + special characters (e.g. .) needs to be taken literally and + not as a special meta-character. + + + + + + [] - Characters enclosed in brackets will be matched if + any of the enclosed characters are encountered. + + + + + + () - Parentheses are used to group a sub-expression, + or multiple sub-expressions. 
+ + + + + + | - The bar character works like an + or conditional statement. A match is successful if the + sub-expression on either side of | matches. + + + + + + s/string1/string2/g - This is used to rewrite strings of text. + string1 is replaced by string2 in this + example. + + + + + These are just some of the ones you are likely to use when matching URLs with + Privoxy, and are a long way from a definitive + list. This is enough to get us started with a few simple examples which may + be more illuminating: + + + + /.*/banners/.* - A simple example + that uses the common combination of . and * to + denote any character, zero or more times. In other words, any string at all. + So we start with a literal forward slash, then our regular expression pattern + (.*), another literal forward slash, the string + banners, another forward slash, and lastly another + .*. We are building + a directory path here. This will match any file whose path has a + directory named banners in it. The .* matches + any characters, and this could conceivably be more forward slashes, so it + might expand into a much longer looking path. For example, this could match: + /eye/hate/spammers/banners/annoy_me_please.gif, or just + /banners/annoying.html, or almost an infinite number of other + possible combinations, just so it has banners in the path + somewhere. + + + + And now something a little more complex: + + + + /.*/adv((er)?ts?|ertis(ing|ements?))?/ - + We have several literal forward slashes again (/), so we are + building another expression that is a file path statement. We have another + .*, so we are matching against any conceivable sub-path, just so + it matches our expression. The only true literal that must + match our pattern is adv, together with + the forward slashes. What comes after the adv string is the + interesting part. + + + + Remember the ? means the preceding expression (either a + literal character or anything grouped with (...) 
in this case) + can exist or not, since this means either zero or one match. So + ((er)?ts?|ertis(ing|ements?)) is optional, as are the + individual sub-expressions: (er), + (ing|ements?), and the s. The | + means or. We have two of those. For instance, + (ing|ements?) can expand to match either ing + OR ements?. What is being done here is an + attempt at matching as many variations of advertisement, and + similar, as possible. So this would expand to match just adv, + or advert, or adverts, or + advertising, or advertisement, or + advertisements. You get the idea. But it would not match + advertizements (with a z). We could fix that by + changing our regular expression to: + /.*/adv((er)?ts?|erti(s|z)(ing|ements?))?/, which would then match + either spelling. + + + + /.*/advert[0-9]+\.(gif|jpe?g) - Again + another path statement with forward slashes. Anything in the square brackets + [] can be matched. This is using 0-9 as a + shorthand expression to mean any digit zero through nine. It is the same as + saying 0123456789. So any digit matches. The + + means one or more of the preceding expression must be included. The preceding + expression here is what is in the square brackets -- in this case, any digit + zero through nine. Then, at the end, we have a grouping: (gif|jpe?g). + This includes a |, so this needs to match the expression on + either side of that bar character also. A simple gif on one side, and the other + side will in turn match either jpeg or jpg, + since the ? means the letter e is optional and + can be matched once or not at all. So we are building an expression here to + match GIF or JPEG image files. It must include the literal + string advert, then one or more digits, and a . + (which is now a literal, and not a special character, since it is escaped + with \), and lastly either gif, or + jpeg, or jpg. Some possible matches would + include: //advert1.jpg, + /nasty/ads/advert1234.gif, + /banners/from/hell/advert99.jpg. 
It would not match + advert1.gif (no leading slash), or + /adverts232.jpg (the expression does not include an + s), or /advert1.jsp (jsp is not + in the expression anywhere). + + + + s/microsoft(?!.com)/MicroSuck/i - This is + a substitution. MicroSuck will replace any occurrence of + microsoft. The i at the end of the expression + means ignore case. The (?!.com) means + the match should fail if microsoft is followed by + .com. In other words, this acts like a NOT + modifier. In case this is a hyperlink, we don't want to break it ;-). + + + + We are barely scratching the surface of regular expressions here so that you + can understand the default Privoxy + configuration files, and maybe use this knowledge to customize your own + installation. There is much, much more that can be done with regular + expressions. Now that you know enough to get started, you can learn more on + your own :/ + + + + More reading on Perl Compatible Regular expressions: + http://www.perldoc.com/perl5.6/pod/perlre.html + + + + + + + + + +<application>Privoxy</application>'s Internal Pages + + + Since Privoxy proxies each requested + web page, it is easy for Privoxy to + trap certain special URLs. In this way, we can talk directly to + Privoxy, and see how it is + configured, see how our rules are being applied, change these + rules and other configuration options, and even turn + Privoxy's filtering off, all with + a web browser. + + + + + The URLs listed below are the special ones that allow direct access + to Privoxy. Of course, + Privoxy must be running to access these. If + not, you will get a friendly error message. Internet access is not + necessary either. + + + + - Change the User-Agent: header so web servers can't tell your - browser type. Warning! This breaks many web sites. Specify the - user-agent value you want. Example, pretend to be using Netscape on - Linux: - - - - - - +hide-user-agent{Mozilla (X11; I; Linux 2.0.32 i586)} - - - - - - Treat this URL as an image. 
This only matters if it's also +blocked, - in which case a blocked image can be sent rather than a HTML page. - See +image-blocker{} below for the control over what is actually sent. - - - - - - +image - - - + Show information about the current configuration: +
+ + http://config.privoxy.org/show-status + +
- Decides what to do with URLs that end up tagged with {+block - +image}. There are 4 options. -image-blocker will - send a HTML blocked page, usually resulting in a - broken image icon. +image-blocker{logo} will - send a JunkBuster image. - +image-blocker{blank} will send a 1x1 transparent GIF image. - And finally, +image-blocker{http://xyz.com} will send a HTTP - temporary redirect to the specified image. This has the advantage of the - icon being being cached by the browser, which will speed up the display. - - - - - - +image-blocker{logo} - +image-blocker{blank} - +image-blocker{http://i.j.b/send-banner} - - - + Show the source code version numbers: +
+ + http://config.privoxy.org/show-version + +
- Prevent the website from reading cookies: - - - - - - +no-cookies-read - - - + Show the client's request headers: +
+ + http://config.privoxy.org/show-request + +
- Prevent the website from setting cookies: - - - - - - +no-cookies-set - - - + Show which actions apply to a URL and why: +
+ + http://config.privoxy.org/show-url-info + +
- Filter the website through a built-in filter to disable those obnoxious - JavaScript pop-up windows via window.open(), etc. The two alternative - spellings are equivalent. + Toggle Privoxy on or off. When toggled off, Privoxy continues + to run, but only as a pass-through proxy, with no actions taking place: +
+ + http://config.privoxy.org/toggle + +
- - - - +no-popup - +no-popups - - - + Shortcuts to turn Privoxy off, then on: +
+ + http://config.privoxy.org/toggle?set=disable + +
+
+ + http://config.privoxy.org/toggle?set=enable + +
- + - This action only applies if you are using a jarfile - for saving cookies. It sends a cookie to every site stating that you do not - accept any copyright on cookies sent to you, and asking them not to track - you. Of course, this is a (relatively) unique header they could use to - track you. - - - - - - +vanilla-wafer - - - + Edit the actions list file: +
+ + http://config.privoxy.org/edit-actions + +
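The internal pages listed above can also be reached from a script. The following is only a sketch: the page names and the `set=enable` / `set=disable` parameters come from the list above, while the helper names `internal_url` and `fetch`, and the proxy address `127.0.0.1:8118`, are assumptions for illustration. Adjust the address to wherever your Privoxy actually listens.

```python
import urllib.request

CONFIG_BASE = "http://config.privoxy.org"

# Page names taken from the list of internal pages above.
PAGES = ("show-status", "show-version", "show-request",
         "show-url-info", "toggle", "edit-actions")


def internal_url(page, set_state=None):
    """Build the URL of one internal page (hypothetical helper)."""
    if page not in PAGES:
        raise ValueError(f"unknown internal page: {page}")
    if set_state is None:
        return f"{CONFIG_BASE}/{page}"
    # Only the toggle page is documented above as taking a set= parameter.
    if page != "toggle" or set_state not in ("enable", "disable"):
        raise ValueError("only toggle accepts set=enable or set=disable")
    return f"{CONFIG_BASE}/toggle?set={set_state}"


def fetch(page, set_state=None, proxy="http://127.0.0.1:8118"):
    """Request an internal page through the proxy and return its HTML.

    The proxy address is an assumption; the request must go through
    Privoxy, because Privoxy itself answers these special URLs.
    """
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy}))
    with opener.open(internal_url(page, set_state), timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

With Privoxy running, `fetch("toggle", "disable")` would correspond to the first shortcut URL above. Without the proxy, the request fails with a connection error.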
- - - This allows you to add an arbitrary cookie. It can be specified multiple - times in order to add as many cookies as you like. - - - - - - +wafer{name=value} - - - - - -
- The meaning of any of the above is reversed by preceding the action with a - -, in place of the +. - - - - Some examples: - - - - Turn off cookies by default, then allow a few through for specified sites: - - - - - - - # Turn off all cookies - { +no-cookies-read } - { +no-cookies-set } - - # Execeptions to the above, sites that need cookies - { -no-cookies-read } - { -no-cookies-set } - .javasoft.com - .sun.com - .yahoo.com - .msdn.microsoft.com - .redhat.com - - # Alternative way of saying the same thing - {-no-cookies-set -no-cookies-read} - .sourceforge.net - .sf.net - - - - - - - Now turn off fast redirects, and then we allow two exceptions: - + These may be bookmarked for quick reference. - - - - - # Turn them off! - {+fast-redirects} - - # Reverse it for these two sites, which don't work right without it. - {-fast-redirects} - www.ukc.ac.uk/cgi-bin/wac\.cgi\? - login.yahoo.com - - - + +Bookmarklets - Turn on page filtering, with one exception for sourceforge: + Below are some bookmarklets to allow you to easily access a + mini version of some of Privoxy's + special pages. They are designed for MS Internet Explorer, but should work + equally well in Netscape, Mozilla, and other browsers which support + JavaScript. They are designed to run directly from your bookmarks - not by + clicking the links below (although that should work for testing). - - - - - # Run everything through the default filter file (re_filterfile): - {+filter} - - # But please don't re_filter code from sourceforge! - {-filter} - .cvs.sourceforge.net - - - + To save them, right-click the link and choose Add to Favorites + (IE) or Add Bookmark (Netscape). You will get a warning that + the bookmark may not be safe - just click OK. Then you can run the + Bookmarklet directly from your favorites/bookmarks. For even faster access, + you can put them on the Links bar (IE) or the Personal + Toolbar (Netscape), and run them with a single click. 
- Now some URLs that we want blocked, ie we won't see them. - Many of these use regular expressions that will expand to match multiple - URLs: - + - - - - - # Blocklist: - {+block} - /.*/(.*[-_.])?ads?[0-9]?(/|[-_.].*|\.(gif|jpe?g)) - /.*/(.*[-_.])?count(er)?(\.cgi|\.dll|\.exe|[?/]) - /.*/(ng)?adclient\.cgi - /.*/(plain|live|rotate)[-_.]?ads?/ - /.*/(sponsor)s?[0-9]?/ - /.*/_?(plain|live)?ads?(-banners)?/ - /.*/abanners/ - /.*/ad(sdna_image|gifs?)/ - /.*/ad(server|stream|juggler)\.(cgi|pl|dll|exe) - /.*/adbanners/ - /.*/adserver - /.*/adstream\.cgi - /.*/adv((er)?ts?|ertis(ing|ements?))?/ - /.*/banner_?ads/ - /.*/banners?/ - /.*/banners?\.cgi/ - /.*/cgi-bin/centralad/getimage - /.*/images/addver\.gif - /.*/images/marketing/.*\.(gif|jpe?g) - /.*/popupads/ - /.*/siteads/ - /.*/sponsor.*\.gif - /.*/sponsors?[0-9]?/ - /.*/advert[0-9]+\.jpg - /Media/Images/Adds/ - /ad_images/ - /adimages/ - /.*/ads/ - /bannerfarm/ - /grafikk/annonse/ - /graphics/defaultAd/ - /image\.ng/AdType - /image\.ng/transactionID - /images/.*/.*_anim\.gif # alvin brattli - /ip_img/.*\.(gif|jpe?g) - /rotateads/ - /rotations/ - /worldnet/ad\.cgi - /cgi-bin/nph-adclick.exe/ - /.*/Image/BannerAdvertising/ - /.*/ad-bin/ - /.*/adlib/server\.cgi - /autoads/ - - - + + + Enable Privoxy + + + + + + Disable Privoxy + + + + + + Toggle Privoxy (Toggles between enabled and disabled) + + + + + + View Privoxy Status + + + + + + Actions file feedback system + + + + + + + + + + Credit: The site which gave me the general idea for these bookmarklets is + www.bookmarklets.com. They + have more information about bookmarklets. + - +
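Looking back at the Regular Expressions appendix: the worked patterns there can be tried out with any PCRE-like engine. The sketch below uses Python's `re` module as a stand-in for pcre and assumes a pattern is applied from the start of the path (which is what `re.match` does). Whether Privoxy anchors patterns in exactly this way is an assumption here, and the helper name `matches` is made up for illustration.

```python
import re

# Patterns copied from the Regular Expressions appendix.
ADV = re.compile(r"/.*/adv((er)?ts?|ertis(ing|ements?))?/")
ADV_Z = re.compile(r"/.*/adv((er)?ts?|erti(s|z)(ing|ements?))?/")
ADVERT_IMG = re.compile(r"/.*/advert[0-9]+\.(gif|jpe?g)")


def matches(pattern, path):
    """True if the pattern matches at the start of the path (assumed anchoring)."""
    return pattern.match(path) is not None


# Every variation of "advertisement" named in the appendix.
for word in ("adv", "advert", "adverts", "advertising",
             "advertisement", "advertisements"):
    print(word, matches(ADV, f"/x/{word}/"))

# The "z" spelling needs the extended pattern.
print(matches(ADV, "/x/advertizements/"), matches(ADV_Z, "/x/advertizements/"))

# Image pattern: one or more digits, a literal dot, then gif, jpeg or jpg.
for path in ("//advert1.jpg", "/nasty/ads/advert1234.gif", "/x/advert1.jsp"):
    print(path, matches(ADVERT_IMG, path))


def desuck(text):
    """The substitution from the appendix; the trailing i becomes re.IGNORECASE."""
    return re.sub(r"microsoft(?!.com)", "MicroSuck", text, flags=re.IGNORECASE)
```

Note that the dot in `(?!.com)` is unescaped, so in this engine it stands for any character, not only a literal dot.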
- -Aliases + +Anatomy of an Action + + + The way Privoxy applies actions + and filters to any given URL can be complex, and it is not always + easy to understand what is happening. Sometimes we need to be able to + see just what Privoxy is + doing, especially if something Privoxy is doing + is inadvertently causing us a problem. It can be a little daunting to look at + the actions and filters files themselves, since they tend to be filled with + regular expressions whose consequences are not always + obvious. Privoxy provides the + http://config.privoxy.org/show-url-info + page that can show us very specifically how actions + are being applied to any given URL. This is a big help for troubleshooting. + + - Custom actions, known to Junkbuster - as aliases, can be defined by combing other actions. - These can in turn be invoked just like the built-in actions. - Currently, an alias can contain any character except space, tab, =, - { or }. But please use only a- - z, 0-9, +, and - -. Alias names are not case sensitive, and must be defined - before they are used. + First, enter one URL (or partial URL) at the prompt, and then + Privoxy will tell us + how the current configuration will handle it. This will not + help with filtering effects from the default.filter file! It + also will not tell you about any other URLs that may be embedded within the + URL you are testing (i.e. a web page). For instance, images such as ads are expressed as URLs + within the raw page source of HTML pages. So you will only get info for the + actual URL that is pasted into the prompt area -- not any sub-URLs. If you + want to know about embedded URLs like ads, you will have to dig those out of + the HTML source. Use your browser's View Page Source option + for this, or right-click on the ad and grab the URL. 
- Now let's define a few aliases: + Let's look at an example, google.com, + one section at a time: - - - - # Aliases - {{alias}} - - # Useful aliases - +no-cookies = +no-cookies-set +no-cookies-read - -no-cookies = -no-cookies-set -no-cookies-read - fragile = -block -no-cookies -filter -fast-redirects -hide-referer -no-popups - shop = -no-cookies -filter -fast-redirects - +imageblock = +block +image + + System default actions: - #For people who don't like to type too much: ;-) - c0 = +no-cookies - c1 = -no-cookies - c2 = -no-cookies-set +no-cookies-read - c3 = +no-cookies-set -no-cookies-read - #... etc. Customize to your heart's content. - - - + { -add-header -block -deanimate-gifs -downgrade -fast-redirects -filter + -hide-forwarded -hide-from -hide-referer -hide-user-agent -image + -image-blocker -limit-connect -no-compression -no-cookies-keep + -no-cookies-read -no-cookies-set -no-popups -vanilla-wafer -wafer } + + - Some examples using our shop and fragile - aliases from above: + This is the top section, and only tells us of the compiled in defaults. This + is basically what Privoxy would do if there + were not any actions defined, i.e. it does nothing. Every action + is disabled. This is not particularly informative for our purposes here. OK, + next section: - - - - # These sites are very complex and require - # minimal interference. - {fragile} - .office.microsoft.com - .windowsupdate.microsoft.com + - # Shopping sites - still want to block ads. 
- {shop} - .quietpc.com - .worldpay.com # for quietpc.com - .jungle.com - .scan.co.uk + Matches for http://google.com: - # These shops require pop-ups - {shop -no-popups} - .dabs.com - .overclockers.co.uk - - - + { -add-header -block +deanimate-gifs -downgrade +fast-redirects + +filter{html-annoyances} +filter{js-annoyances} +filter{no-popups} + +filter{webbugs} +filter{nimda} +filter{banners-by-size} +filter{hal} + +filter{fun} +hide-forwarded +hide-from{block} +hide-referer{forge} + -hide-user-agent -image +image-blocker{blank} +no-compression + +no-cookies-keep -no-cookies-read -no-cookies-set +no-popups + -vanilla-wafer -wafer } + / + + { -no-cookies-keep -no-cookies-read -no-cookies-set } + .google.com + + { -fast-redirects } + .google.com + + - - + + This is much more informative, and tells us how we have defined our + actions, and which ones match for our example, + google.com. The first grouping shows our default + settings, which would apply to all URLs. If you look at your actions + file, this would be the section just below the aliases section + near the top. This applies to all URLs as signified by the single forward + slash -- /. + + - + + These are the default actions we have enabled. But we can define additional + actions that would be exceptions to these general rules, and then list + specific URLs that these exceptions would apply to. Last match wins. + Just below this then are two explicit matches for .google.com. + The first is negating our various cookie blocking actions (i.e. we will allow + cookies here). The second turns off fast-redirects for this site. Note + that there is a leading dot here -- .google.com. This also + matches any hosts and sub-domains in the google.com domain, such as + www.google.com. So, apparently, we have these actions defined + somewhere in the lower part of our actions file, and + google.com is referenced in these sections. + - - -The Filter File - The filter file defines what filtering of web pages - Junkbuster does. 
The default filter file is - re_filterfile, located in the config directory. In this - file, any document content, whether viewable text or - embedded non-visible content, can be changed. + And now we pull it all together in the bottom section and summarize how + Privoxy is applying all its actions + to google.com: + - This file uses regular expressions to alter or remove any string in the - target page. Some examples from the included default re_filterfile: + + + Final results: + + -add-header -block -deanimate-gifs -downgrade -fast-redirects + +filter{html-annoyances} +filter{js-annoyances} +filter{no-popups} + +filter{webbugs} +filter{nimda} +filter{banners-by-size} +filter{hal} + +filter{fun} +hide-forwarded +hide-from{block} +hide-referer{forge} + -hide-user-agent -image +image-blocker{blank} -limit-connect +no-compression + -no-cookies-keep -no-cookies-read -no-cookies-set +no-popups -vanilla-wafer + -wafer + + - Stop web pages from displaying annoying messages in the status bar by - deleting such references: + Now another example, ad.doubleclick.net: - - - - # The status bar is for displaying link targets, not pointless buzzwords. - # Again, check it out on http://www.airport-cgn.de/. - s/status='.*?';*//ig - - - + + + { +block +image } + .ad.doubleclick.net + + { +block +image } + ad*. + + { +block +image } + .doubleclick.net + + - Just for kicks, replace any occurrence of Microsoft with - MicroSuck: + We'll just show the interesting part here, the explicit matches. It is + matched three different times, each time as +block +image, + which is the expanded form of one of our aliases that had been defined as: + +imageblock. (Aliases are defined in the + first section of the actions file and typically used to combine more + than one action.) - - - - s/microsoft(?!.com)/MicroSuck/ig - - - + Any one of these would have done the trick and blocked this as an unwanted + image. This is unnecessarily redundant since the last case effectively + would also cover the first. 
No point in taking chances with these guys + though ;-) Note that if you want an ad or obnoxious + URL to be invisible, it should be defined as ad.doubleclick.net + is done here -- as both a +block and an + +image. The custom alias +imageblock does this + for us. - Kill those auto-refresh tags: + One last example. Let's try http://www.rhapsodyk.net/adsl/HOWTO/. + This one is giving us problems. We are getting a blank page. Hmmm... - - - - # Kill refresh tags. I like to refresh myself. Manually. - # check it out on http://www.airport-cgn.de/ and go to the arrivals page. - # - s/<meta[^>]*http-equiv[^>]*refresh.*URL=([^>]*?)"?>/<link rev="x-refresh" href=$1>/i - s/<meta[^>]*http-equiv="?page-enter"?[^>]*content=[^>]*>/<!--no page enter for me-->/i - - - + + + Matches for http://www.rhapsodyk.net/adsl/HOWTO/: + + { -add-header -block +deanimate-gifs -downgrade +fast-redirects + +filter{html-annoyances} +filter{js-annoyances} +filter{no-popups} + +filter{webbugs} +filter{nimda} +filter{banners-by-size} +filter{hal} + +filter{fun} +hide-forwarded +hide-from{block} +hide-referer{forge} + -hide-user-agent -image +image-blocker{blank} +no-compression + +no-cookies-keep -no-cookies-read -no-cookies-set +no-popups + -vanilla-wafer -wafer } + / + + { +block +image } + /ads + + - + + Ooops, the /adsl/ is matching /ads! But + we did not want this at all! Now we see why we get the blank page. We could + now add a new action below this that explicitly does not + block (-block) pages with adsl. There are various ways to + handle such exceptions. Example: + -
+ + - -Quickstart to Using Junkbuster -To be filled. + { -block } + /adsl + + -
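The blank-page example above comes down to two rules: a path pattern matches from the start of the path, so /ads also matches /adsl, and the last matching section wins. Here is a rough model of that behaviour, not Privoxy's actual code. The section list, the anchoring via `re.match`, and the function names are all assumptions made for illustration.

```python
import re

# Hypothetical, simplified actions file: (actions, path pattern) in file order.
# "+name" switches an action on, "-name" switches it off.
SECTIONS = [
    (["-block", "-image"], "/"),     # defaults: match every path
    (["+block", "+image"], "/ads"),  # the rule that caused the blank page
    (["-block"], "/adsl"),           # the exception added above
]


def path_matches(pattern, path):
    """Assumed matching rule: the pattern is anchored at the start of the path only."""
    return re.match(pattern, path) is not None


def resolve(path, sections=SECTIONS):
    """Apply every matching section in order; later sections override earlier ones."""
    state = {}
    for actions, pattern in sections:
        if path_matches(pattern, path):
            for action in actions:
                state[action[1:]] = action[0] == "+"
    return state


# Without the exception, /adsl/HOWTO/ is caught by the /ads rule.
print(resolve("/adsl/HOWTO/", SECTIONS[:2]))
# With the exception in place, the later -block wins.
print(resolve("/adsl/HOWTO/"))
```

In this model the exception only negates block; image stays switched on, which is harmless because image only matters for URLs that are also blocked.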
+ + Now the page displays ;-) Be sure to flush your browser's caches when + making such changes. Or, try using Shift+Reload. + - -Contact the developers -To be filled. mention the support forums as the primary channel of -communication (bugs, feature requests, etc.) + + But now what about a situation where we get no explicit matches like + we did with: - - -Copyright and History -To be filled. + + + + { -block } + /adsl + + - - -See also -To be filled. + + That actually was very telling and pointed us quickly to where the problem + was. If you don't get this kind of match, then it means one of the default + rules in the first section is causing the problem. This would require some + guesswork, and maybe a little trial and error to isolate the offending rule. + One likely cause would be one of the {+filter} actions. Try + adding the URL for the site to one of the aliases that turn off +filter: + + + + + + {shop} + .quietpc.com + .worldpay.com # for quietpc.com + .jungle.com + .scan.co.uk + .forbes.com + + - + + {shop} is an alias that expands to + { -filter -no-cookies -no-cookies-keep }. Or you could do + your own exception to negate filtering: + - -Appendix + + + {-filter} + .forbes.com + + + - - -Regular Expressions - WIP + {fragile} is an alias that disables most actions. This can be + used as a last resort for problem sites. Remember to flush caches! If this + still does not work, you will have to go through the remaining actions one by + one to find which one(s) is causing the problem. @@ -2468,8 +4427,241 @@ communication (bugs, feature requests, etc.) Temple Place - Suite 330, Boston, MA 02111-1307, USA. $Log: user-manual.sgml,v $ - Revision 1.4 2001/09/24 01:27:56 hal9 - Draft. Unfinished. 
+ Revision 1.82 2002/04/18 12:04:50 oes + Cosmetics + + Revision 1.81 2002/04/18 11:50:24 oes + Extended Install section - needs fixing by packagers + + Revision 1.80 2002/04/18 10:45:19 oes + Moved text to buildsource.sgml, renamed some filters, details + + Revision 1.79 2002/04/18 03:18:06 hal9 + Spellcheck, and minor touchups. + + Revision 1.78 2002/04/17 18:04:16 oes + Proofreading part 2 + + Revision 1.77 2002/04/17 13:51:23 oes + Proofreading, part one + + Revision 1.76 2002/04/16 04:25:51 hal9 + -Added 'Note to Upgraders' and re-ordered the 'Quickstart' section. + -Note about proxy may need requests to re-read config files. + + Revision 1.75 2002/04/12 02:08:48 david__schmidt + Remove OS/2 building info... it is already in the developer-manual + + Revision 1.74 2002/04/11 00:54:38 hal9 + Add small section on submitting actions. + + Revision 1.73 2002/04/10 18:45:15 swa + generated + + Revision 1.72 2002/04/10 04:06:19 hal9 + Added actions feedback to Bookmarklets section + + Revision 1.71 2002/04/08 22:59:26 hal9 + Version update. Spell chkconfig correctly :) + + Revision 1.70 2002/04/08 20:53:56 swa + ? + + Revision 1.69 2002/04/06 05:07:29 hal9 + -Add privoxy-man-page.sgml, for man page. + -Add authors.sgml for AUTHORS (and p-authors.sgml) + -Reworked various aspects of various docs. + -Added additional comments to sub-docs. + + Revision 1.68 2002/04/04 18:46:47 swa + consistent look. reuse of copyright, history et. al. + + Revision 1.67 2002/04/04 17:27:57 swa + more single file to be included at multiple points. make maintaining easier + + Revision 1.66 2002/04/04 06:48:37 hal9 + Structural changes to allow for conditional inclusion/exclusion of content + based on entity toggles, e.g. 'entity % p-not-stable "INCLUDE"'. And + definition of internal entities, e.g. 'entity p-version "2.9.13"' that will + eventually be set by Makefile. + More boilerplate text for use across multiple docs. 
+ + Revision 1.65 2002/04/03 19:52:07 swa + enhance squid section due to user suggestion + + Revision 1.64 2002/04/03 03:53:43 hal9 + A few minor bug fixes, and touch ups. Ready for review. + + Revision 1.63 2002/04/01 16:24:49 hal9 + Define entities to include boilerplate text. See doc/source/*. + + Revision 1.62 2002/03/30 04:15:53 hal9 + - Fix privoxy.org/config links. + - Paste in Bookmarklets from Toggle page. + - Move Quickstart nearer top, and minor rework. + + Revision 1.61 2002/03/29 01:31:08 hal9 + Minor update. + + Revision 1.60 2002/03/27 01:57:34 hal9 + Added more to Anatomy section. + + Revision 1.59 2002/03/27 00:54:33 hal9 + Touch up intro for new name. + + Revision 1.58 2002/03/26 22:29:55 swa + we have a new homepage! + + Revision 1.57 2002/03/24 20:33:30 hal9 + A few minor catch ups with name change. + + Revision 1.56 2002/03/24 16:17:06 swa + configure needs to be generated. + + Revision 1.55 2002/03/24 16:08:08 swa + we are too lazy to make a block-built + privoxy logo. hence removed the option. + + Revision 1.54 2002/03/24 15:46:20 swa + name change related issue. + + Revision 1.53 2002/03/24 11:51:00 swa + name change. changed filenames. + + Revision 1.52 2002/03/24 11:01:06 swa + name change + + Revision 1.51 2002/03/23 15:13:11 swa + renamed every reference to the old name with foobar. + fixed "application foobar application" tag, fixed + "the foobar" with "foobar". left junkbustser in cvs + comments and remarks to history untouched. + + Revision 1.50 2002/03/23 05:06:21 hal9 + Touch up. + + Revision 1.49 2002/03/21 17:01:05 hal9 + New section in Appendix. + + Revision 1.48 2002/03/12 06:33:01 hal9 + Catching up to Andreas and re_filterfile changes. + + Revision 1.47 2002/03/11 13:13:27 swa + correct feedback channels + + Revision 1.46 2002/03/10 00:51:08 hal9 + Added section on JB internal pages in Appendix. 
+ + Revision 1.45 2002/03/09 17:43:53 swa + more distros + + Revision 1.44 2002/03/09 17:08:48 hal9 + New section on Jon's actions file editor, and move some stuff around. + + Revision 1.43 2002/03/08 00:47:32 hal9 + Added imageblock{pattern}. + + Revision 1.42 2002/03/07 18:16:55 swa + looks better + + Revision 1.41 2002/03/07 16:46:43 hal9 + Fix a few markup problems for jade. + + Revision 1.40 2002/03/07 16:28:39 swa + provide correct feedback channels + + Revision 1.39 2002/03/06 16:19:28 hal9 + Note on perceived filtering slowdown per FR. + + Revision 1.38 2002/03/05 23:55:14 hal9 + Stupid I did it again. Double hyphen in comment breaks jade. + + Revision 1.37 2002/03/05 23:53:49 hal9 + jade barfs on '- -' embedded in comments. - -user option broke it. + + Revision 1.36 2002/03/05 22:53:28 hal9 + Add new - - user option. + + Revision 1.35 2002/03/05 00:17:27 hal9 + Added section on command line options. + + Revision 1.34 2002/03/04 19:32:07 oes + Changed default port to 8118 + + Revision 1.33 2002/03/03 19:46:13 hal9 + Emphasis on where/how to report bugs, etc + + Revision 1.32 2002/03/03 09:26:06 joergs + AmigaOS changes, config is now loaded from PROGDIR: instead of + AmiTCP:db/junkbuster/ if no configuration file is specified on the + command line. + + Revision 1.31 2002/03/02 22:45:52 david__schmidt + Just tweaking + + Revision 1.30 2002/03/02 22:00:14 hal9 + Updated 'New Features' list. Ran through spell-checker. + + Revision 1.29 2002/03/02 20:34:07 david__schmidt + Update OS/2 build section + + Revision 1.28 2002/02/24 14:34:24 jongfoster + Formatting changes. Now changing the doctype to DocBook XML 4.1 + will work - no other changes are needed. + + Revision 1.27 2002/01/11 14:14:32 hal9 + Added a very short section on Templates + + Revision 1.26 2002/01/09 20:02:50 hal9 + Fix bug re: auto-detect config file changes. + + Revision 1.25 2002/01/09 18:20:30 hal9 + Touch ups for *.action files. + + Revision 1.24 2001/12/02 01:13:42 hal9 + Fix typo. 
+ + Revision 1.23 2001/12/02 00:20:41 hal9 + Updates for recent changes. + + Revision 1.22 2001/11/05 23:57:51 hal9 + Minor update for startup now daemon mode. + + Revision 1.21 2001/10/31 21:11:03 hal9 + Correct 2 minor errors + + Revision 1.18 2001/10/24 18:45:26 hal9 + *** empty log message *** + + Revision 1.17 2001/10/24 17:10:55 hal9 + Catching up with Jon's recent work, and a few other things. + + Revision 1.16 2001/10/21 17:19:21 swa + wrong url in documentation + + Revision 1.15 2001/10/14 23:46:24 hal9 + Various minor changes. Fleshed out SEE ALSO section. + + Revision 1.13 2001/10/10 17:28:33 hal9 + Very minor changes. + + Revision 1.12 2001/09/28 02:57:04 hal9 + Ditto :/ + + Revision 1.11 2001/09/28 02:25:20 hal9 + Ditto. + + Revision 1.9 2001/09/27 23:50:29 hal9 + A few changes. A short section on regular expression in appendix. + + Revision 1.8 2001/09/25 00:34:59 hal9 + Some additions, and re-arranging. + + Revision 1.7 2001/09/24 14:31:36 hal9 + Diddling. + + Revision 1.6 2001/09/24 14:10:32 hal9 + Including David's OS/2 installation instructions. Revision 1.2 2001/09/13 15:27:40 swa cosmetics