X-Git-Url: http://www.privoxy.org/gitweb/?a=blobdiff_plain;f=doc%2Fsource%2Fuser-manual.sgml;h=3e36ed859925995f90c70a86b055aa3086e96ec1;hb=03472355cc98c0a5f3e65deb0e4569bd14e0fb54;hp=6be6df840cca9b7bef3f2456f848cf4c71c1bbe4;hpb=94e54af218937c38ccb7b9e4edfb6df1bdd58a78;p=privoxy.git
diff --git a/doc/source/user-manual.sgml b/doc/source/user-manual.sgml
index 6be6df84..3e36ed85 100644
--- a/doc/source/user-manual.sgml
+++ b/doc/source/user-manual.sgml
@@ -10,18 +10,21 @@
-
-
+
+
+
-
+
+
+Privoxy">
]>
- Copyright &my-copy; 2001, 2002 by
- Privoxy Developers
+ Copyright &my-copy; 2001 - 2006 by
+ Privoxy Developers
-$Id: user-manual.sgml,v 1.120 2002/05/23 19:16:43 roro Exp $
+$Id: user-manual.sgml,v 2.23 2006/10/02 22:43:53 hal9 Exp $
@@ -95,9 +87,9 @@ Hal.
]]>
- The user manual gives users information on how to install, configure and use
- Privoxy .
+ The Privoxy User Manual gives users information on how to
+ install, configure and use Privoxy .
@@ -105,9 +97,9 @@ Hal.
- You can find the latest version of the user manual at Privoxy User Manual at http://www.privoxy.org/user-manual/ .
- Please see the Contact section on how to
+ Please see the Contact section on how to
contact the developers.
@@ -125,10 +117,9 @@ Hal.
Privoxy , v.&p-version;soon ;-)]]>.
+ configuration files. Development of a new version is currently nearing
+ completion, and includes significant changes and enhancements over
+ earlier versions. ]]>.
@@ -144,10 +135,12 @@ Hal.
Features
- In addition to Internet Junkbuster's traditional
- features of ad and banner blocking and cookie management,
- Privoxy provides new features:
+ In addition to the core
+ features of ad blocking and
+ cookie management,
+ Privoxy provides many supplemental
+ features,
+ that give the end-user more control, more privacy and more freedom:
&newfeatures;
@@ -171,13 +164,11 @@ Hal.
- Note: If you have a previous Junkbuster or
- Privoxy installation on your system, you
- will need to remove it. On some platforms, this may be done for you as part
- of their installation procedure. (See below for your platform). In any case
- be sure to backup your old configuration if it is valuable to
- you. See the note to
- upgraders section below.
+ Note:
+ On some platforms, the installer may remove previously installed versions, if
+ found. (See below for your platform). In any case be sure to back up
+ your old configuration if it is valuable to you. See the note to upgraders section below.
@@ -187,7 +178,7 @@ How to install the binary packages depends on your operating system:
-Red Hat, SuSE RPMs and Conectiva
+Red Hat, SuSE and Conectiva RPMs
RPMs can be installed with rpm -Uvh privoxy-&p-version;-1.rpm ,
@@ -199,13 +190,12 @@ How to install the binary packages depends on your operating system:
Note that on Red Hat, Privoxy will
not be automatically started on system boot. You will
need to enable that using chkconfig ,
- ntsysv , or similar methods. Note that SuSE will
-automatically start Privoxy in the boot process.
+ ntsysv , or similar methods.
If you have problems with failed dependencies, try rebuilding the SRC RPM:
- rpm --rebuild privoxy-&p-version;-1.src.rpm; . This
+ rpm --rebuild privoxy-&p-version;-1.src.rpm . This
will use your locally installed libraries and RPM version.
@@ -213,17 +203,16 @@ automatically start Privoxy in the boot process.
Also note that if you have a Junkbuster RPM installed
on your system, you need to remove it first, because the packages conflict.
Otherwise, RPM will try to remove Junkbuster
- automatically, before installing Privoxy .
+ automatically if found, before installing Privoxy .
Debian
- DEBs can be installed with dpkg -i
- privoxy_&p-version;-1.deb , and will use
- /etc/privoxy for the location of configuration
- files.
+ DEBs can be installed with apt-get install privoxy ,
+ and will use /etc/privoxy for the location of
+ configuration files.
@@ -233,9 +222,41 @@ automatically start Privoxy in the boot process.
Just double-click the installer, which will guide you through
the installation process. You will find the configuration files
- in the same directory as you installed Privoxy in. We do not
- use the registry of Windows.
+ in the same directory as you installed Privoxy in.
+
+
+ Version 3.0.4 introduced full Windows service
+ functionality. On Windows only, the Privoxy
+ program has two new command line arguments to install and uninstall
+ Privoxy as a service .
+
+
+
+ Arguments:
+
+
+ --install [:service_name ]
+
+
+ --uninstall [:service_name ]
+
+
+
+
+
+ After invoking Privoxy with
+ --install , you will need to bring up the
+ Windows service console to assign the user you
+ want Privoxy to run under, and whether or not you
+ want it to run whenever the system starts. You can start the
+ Windows services console with the following
+ command: services.msc . If you do not take the manual step
+ of modifying Privoxy's service settings, it will
+ not start. Note too that you will need to give Privoxy a user account that
+ actually exists, or it will not be permitted to
+ write to its log and configuration files.
+
@@ -244,7 +265,7 @@ automatically start Privoxy in the boot process.
Create a new directory, cd to it, then unzip and
untar the archive. For the most part, you'll have to figure out where
- things go. FIXME.
+ things go.
@@ -255,7 +276,10 @@ automatically start Privoxy in the boot process.
First, make sure that no previous installations of
Junkbuster and / or
Privoxy are left on your
- system. You can do this by
+ system. Check that no Junkbuster
+ or Privoxy objects are in
+ your startup folder.
+
@@ -272,17 +296,32 @@ automatically start Privoxy in the boot process.
-Max OSX
-
- Unzip the downloaded package (you can either double-click on the file
- in the finder, or on the desktop if you downloaded it there). Then,
- double-click on the package installer icon and follow the installation
- process.
- Privoxy will be installed in the subdirectory
- /Applications/Privoxy.app .
- Privoxy will set itself up to start
- automatically on system bring-up via
- /System/Library/StartupItems/Privoxy .
+Mac OSX
+
+ Unzip the downloaded file (you can either double-click on the file
+ from the finder, or from the desktop if you downloaded it there).
+ Then, double-click on the package installer icon named
+ Privoxy.pkg
+ and follow the installation process.
+ Privoxy will be installed in the folder
+ /Library/Privoxy .
+ It will start automatically whenever you start up. To prevent it from
+ starting automatically, remove or rename the folder
+ /Library/StartupItems/Privoxy .
+
+
+ To start Privoxy by hand, double-click on
+ StartPrivoxy.command in the
+ /Library/Privoxy folder.
+ Or, type this command in the Terminal:
+
+
+
+ /Library/Privoxy/StartPrivoxy.command
+
+
+
+ You will be prompted for the administrator password.
@@ -294,16 +333,29 @@ automatically start Privoxy in the boot process.
directory, including all configuration and log files. To uninstall, just
remove this directory.
+
+
+
+Gentoo
- Start Privoxy (with RUN <>NIL:) in your
- startnet script (AmiTCP), in
- s:user-startup (RoadShow), as startup program in your
- startup script (Genesis), or as startup action (Miami and MiamiDx).
- Privoxy will automatically quit when you quit your
- TCP/IP stack (just ignore the harmless warning your TCP/IP stack may display that
- Privoxy is still running).
+ Gentoo source packages (Ebuilds) for Privoxy are
+ contained in the Gentoo Portage Tree (they are not on the download page,
+ but there is a Gentoo section, where you can see when a new
+ Privoxy Version is added to the Portage Tree).
+
+
+ Before installing Privoxy under Gentoo, first run
+ emerge rsync to get the latest changes from the
+ Portage tree. Then emerge privoxy installs the latest
+ version.
+
+
+ Configuration files are in /etc/privoxy , the
+ documentation is in /usr/share/doc/privoxy-&p-version;
+ and the Log directory is in /var/log/privoxy .
+
@@ -319,9 +371,13 @@ automatically start Privoxy in the boot process.
If you like to live on the bleeding edge and are not afraid of using
possibly unstable development versions, you can check out the up-to-the-minute
version directly from the
- CVS repository or simply download the nightly CVS
+ CVS repository .
+
@@ -329,82 +385,281 @@ automatically start Privoxy in the boot process.
+
+Keeping your Installation Up-to-Date
+
+ As user feedback comes in and development continues, we will make updated versions
+ of both the main actions file (as a separate
+ package ) and the software itself (including the actions file) available for
+ download.
+
+
+
+ If you wish to receive an email notification whenever we release updates of
+ Privoxy or the actions file, subscribe
+ to our announce mailing list , ijbswa-announce@lists.sourceforge.net.
+
+
+
+ In order not to lose your personal changes and adjustments when updating
+ to the latest default.action file we strongly
+ recommend that you use user.action and
+ user.filter for your local
+ customizations of Privoxy . See the Chapter on actions files for details.
+
+
+
+
-
-Note to Upgraders
-
- There are very significant changes from earlier
- Junkbuster versions to the current
- Privoxy . The number, names, syntax, and
- purposes of configuration files have substantially changed.
- Junkbuster 2.0.x configuration
- files will not migrate, Junkbuster 2.9.x
- and Privoxy configurations will need to be
- ported. The functionalities of the old blockfile ,
- cookiefile and imagelist
- are now combined into the actions
- files
.
- default.action , is the main actions file. Local
- exceptions should best be put into user.action .
-
+
+What's New in this Release
- A filter file
(typically
- default.filter ) is new as of Privoxy
- 2.9.x , and provides some of the new sophistication (explained
- below). config is much the same as before.
+ There are many improvements and new features since the last Privoxy stable release:
+
- If upgrading from a 2.0.x version, you will have to use the new config
- files, and possibly adapt any personal rules from your older files.
- When porting personal rules over from the old blockfile
- to the new actions files, please note that even the pattern syntax has
- changed. If upgrading from 2.9.x development versions, it is still
- recommended to use the new configuration files.
+
+
+
+ Multiple filter files can now be specified in config . This allows for
+ locally defined filters that can be maintained separately from the filters as
+ supplied by the developers, i.e. default.filter .
+
+
+
+
+
+ There are a number of new actions:
+
+
+
+
+
+
+
+ content-type-overwrite
+
+
+
+
+ crunch-client-header
+
+
+
+
+ crunch-if-none-match
+
+
+
+
+ crunch-server-header
+
+
+
+
+ filter-client-headers
+
+
+
+
+ filter-server-headers
+
+
+
+
+ force-text-mode
+
+
+
+
+ handle-as-empty-document
+
+
+
+
+ hide-accept-language
+
+
+
+
+ hide-content-disposition
+
+
+
+
+ hide-if-modified-since
+
+
+
+
+ inspect-jpegs
+
+
+
+
+ overwrite-last-modified
+
+
+
+
+ redirect
+
+
+
+
+ treat-forbidden-connects-like-blocks
+
+
+
+
+
+
+ In addition, fast-redirects
+ has been significantly improved with enhanced syntax.
+
+
+ And hide-referrer
+ has a new option, conditional block .
+
+
+
+
+
+
+ MS-Windows versions can now be
+ installed and
+ started as a Windows service .
+
+
+
+
+
+ config has two new options:
+ enable-remote-http-toggle,
+ and forwarded-connect-retries.
+
+
+ And there is improved handling of the user-manual
+ option, for placing documentation and help files on the local system.
+
+
+
+
+
+ There are six new filters.
+
+
+
+
+
+ Actions files problems and suggestions are now being directed to:
+ http://sourceforge.net/tracker/?group_id=11118&atid=460288 .
+ Please use this to report such configuration related problems as missed
+ ads, sites that don't function properly due to one action or another,
+ innocent images being blocked, etc.
+
+
+
+
+
+ In addition, there are numerous bug fixes and significant enhancements,
+ including error pages that should no longer be cached once the problem is fixed,
+ much better DNS error handling, and various logging improvements.
+
+
+
+
+
+ The default actions setting is now Cautious . Previous
+ releases had a default setting of Medium . Experienced
+ users may want to adjust this, as it is fairly conservative by &my-app;
+ standards and past practices. See
+ http://config.privoxy.org/edit-actions-list?f=default . New users
+ should try the default settings for a while before turning up the volume.
+
+
+
+
+
+
+
+
+Note to Upgraders
+
- A quick list of things to be aware of before upgrading:
+ A quick list of things to be aware of before upgrading from earlier
+ versions of Privoxy :
-
- The default listening port is now 8118 due to a conflict with another
- service (NAS).
+
+ Some installers may remove earlier versions completely, including
+ configuration files. Save any important configuration files!
-
+
- Some installers may remove earlier versions completely. Save any
- important configuration files!
+ On the other hand, other installers may not overwrite any existing configuration
+ files, thinking you will want to do that. You may want to manually check
+ your saved files against the newer versions to see if the improvements have
+ merit, or whether there are new options that you may want to consider.
+ There are a number of new features, but most won't be available unless
+ these features are incorporated into your configuration somehow.
-
- Privoxy is controllable with a web browser
- at the special URL: http://config.privoxy.org/
- (Shortcut: http://p.p/ ). Many
- aspects of configuration can be done here, including temporarily disabling
- Privoxy .
-
-
+
+ See the full documentation on
+ fast-redirects
+ which has changed syntax, and will require adjustments to local configs,
+ such as user.action . You must reference the new
+ syntax:
+
+
+
+ { +fast-redirects{check-decoded-url} }
+ .example.com
+ mybank.com
+ .google.
+
+
+
-
- The primary configuration files for cookie management, ad and banner
- blocking, and many other aspects of Privoxy
- configuration are the actions
- files. It is strongly recommended to become familiar with the new
- actions concept below, before modifying these files. Locally defined rules
- should go into user.action .
+
+ The jarfile ,
+ cookie logger, is off by default now.
+
+
+
+
+
+ What constitutes a default
configuration has changed,
+ and you may want to review which actions are on
by
+ default. This is primarily a matter of emphasis, but some features
+ you may have been used to, may now be off
by default.
+ There are also a number of new actions and filters you may want to
+ consider, most of which are not incorporated into the default settings as
+ yet (see above).
-
+
+
@@ -416,20 +671,14 @@ automatically start Privoxy in the boot process.
+
-Quickstart to Using Privoxy
+Quickstart to Using Privoxy
-
-
- If upgrading, from versions before 2.9.16, please back up any configuration
- files. See the Note to Upgraders Section.
-
-
-
Install Privoxy . See the
Set your browser to use Privoxy as HTTP and
- HTTPS proxy by setting the proxy configuration for address of
+ HTTPS (SSL) proxy
+ by setting the proxy configuration for address of
127.0.0.1 and port 8118 .
- (Junkbuster and earlier versions of
- Privoxy used port 8000.) See the section Starting Privoxy below
- for more details on this.
+ DO NOT activate proxying for FTP or
+ any protocols besides HTTP and HTTPS (SSL)! It won't work!
Flush your browser's disk and memory caches, to remove any cached ad images.
+ If using Privoxy to manage
+ cookies ,
+ you should remove any currently stored cookies too.
@@ -484,39 +735,47 @@ automatically start Privoxy in the boot process.
See the Configuration section for more
configuration options, and how to customize your installation.
- next section for a quick
introduction to how Privoxy blocks ads and
- banners.]]>
-
+ banners.
+
- If you experience ads that slipped through, innocent images that are
+ If you experience ads that slip through, innocent images that are
blocked, or otherwise feel the need to fine-tune
- Privoxy's behaviour, take a look at the Privoxy's behavior, take a look at the actions files. As a quick start, you might
find the richly commented examples
helpful. You can also view and edit the actions files through the web-based user interface . The
- Appendix Anatomy of an
- Action
has hints how to debug actions that
+ Appendix Troubleshooting: Anatomy of an
+ Action
has hints on how to understand and debug actions that
misbehave
.
+
+
+ For easy access to &my-app;'s most important controls, drag the provided
+ Bookmarklets into your browser's
+ personal toolbar.
+
+
+
Please see the section Contacting the
- Developers on how to report bugs or problems with websites or to get
+ Developers on how to report bugs or problems with websites, or how to get
help.
- Now enjoy surfing with enhanced comfort and privacy!
+ Now enjoy surfing with enhanced control, comfort and privacy!
@@ -540,7 +799,7 @@ automatically start Privoxy in the boot process.
This section will provide a quick summary of ad blocking so
you can get up to speed quickly without having to read the more extensive
- information provided below, though this is highly recommeneded.
+ information provided below, though this is highly recommended.
First a bit of a warning ... blocking ads is much like blocking SPAM: the
@@ -577,10 +836,10 @@ automatically start Privoxy in the boot process.
When you connect to a website, the full URL will either match one or more
of the sections as defined in Privoxy's configuration,
or not. If so, then Privoxy will perform the
- respective actions. If not, then nothing special happens. Futhermore, web
+ respective actions. If not, then nothing special happens. Furthermore, web
pages may contain embedded, secondary URLs that your web browser will
use to load additional components of the page, as it parses the
- original page's HTML content. An ad image for instance, is just a URL
+ original page's HTML content. An ad image for instance, is just an URL
embedded in the page somewhere. The image itself may be on the same server,
or a server somewhere else on the Internet. Complex web pages will have many
such embedded URLs.
@@ -613,12 +872,13 @@ automatically start Privoxy in the boot process.
tells Privoxy to treat this URL as an image.
Privoxy 's default configuration already does this
for all common image types (e.g. GIF), but there are many situations where this
- is not as easy to determine. So we'll force it in these cases. This is particularly
- important for ad blocking, since only if we know that it's an image, we can replace
- it by an image instead of the BLOCKED page, which would only result in a
- broken image
icon. There are some limitations to this though. For
- instance, you can't just brute-force an image substituion for an entire HTML page
- in most situations.
+ is not so easy to determine. So we'll force it in these cases. This is particularly
+ important for ad blocking, since only if we know that it's an image of
+ some kind, can we replace it with an image of our choosing, instead of the
+ Privoxy BLOCKED page (which would only result in
+ a broken image
icon). There are some limitations to this
+ though. For instance, you can't just brute-force an image substitution for
+ an entire HTML page in most situations.
@@ -637,7 +897,7 @@ automatically start Privoxy in the boot process.
- pattern - a checkboard pattern, so that an ad
+ pattern - a checkerboard pattern, so that an ad
replacement is obvious. This is the default.
@@ -705,10 +965,10 @@ automatically start Privoxy in the boot process.
Actions Files in Use
-
+
- Screenshot of Files in Use
+ [ Screenshot of Actions Files in Use ]
@@ -760,7 +1020,7 @@ automatically start Privoxy in the boot process.
For advanced users who want to hand edit their config files, you might want
to now go to the Actions Files Tutorial.
- The ideas explained thererin also apply to the web-based editor.
+ The ideas explained therein also apply to the web-based editor.
@@ -772,34 +1032,103 @@ automatically start Privoxy in the boot process.
-Starting Privoxy
+Starting Privoxy
Before launching Privoxy for the first time, you
will want to configure your browser(s) to use
- Privoxy as a HTTP and HTTPS proxy. The default is
+ Privoxy as an HTTP and HTTPS (SSL)
+ proxy . The default is
127.0.0.1 (or localhost) for the proxy address, and port 8118 (earlier versions
- used port 8000). This is the one configuration step that must be done!
+ used port 8000). This is the one configuration step that must be done
+ !
+
+
+ Please note that Privoxy can only proxy HTTP and
+ HTTPS traffic. It will not work with FTP or other protocols.
+
+
+
+
+ Proxy Configuration Showing
+ Mozilla/Netscape HTTP and HTTPS (SSL) Settings
+
+
+
+
+
+ [ Screenshot of Mozilla Proxy Configuration ]
+
+
+
+
+
+
+
+ With Firefox , this can be set under:
+
+ Tools -> Options -> General -> Connection Settings -> Manual Proxy Configuration
+
+
+
+
With Netscape (and
- Mozilla ), this can be set under Edit
- -> Preferences -> Advanced -> Proxies -> HTTP Proxy .
- For Internet Explorer : Tools ->
- Internet Properties -> Connections -> LAN Setting . Then,
- check Use Proxy
and fill in the appropriate info (Address:
- 127.0.0.1, Port: 8118). Include if HTTPS proxy support too.
+ Mozilla ), this can be set under:
+
+
+
+
+
+
+ Edit -> Preferences -> Advanced -> Proxies -> HTTP Proxy
+
+
+
+
+ For Internet Explorer v.5-6 :
+
+ Tools -> Internet Options -> Connections -> LAN Settings
+
+
+
+ Then, check Use Proxy
and fill in the appropriate info
+ (Address: 127.0.0.1, Port: 8118). Include HTTPS (SSL), if you want HTTPS
+ proxy support too (sometimes labeled Secure
). Make sure any
+ checkboxes like Use the same proxy server for all protocols
is
+ UNCHECKED . You want only HTTP and HTTPS (SSL)!
+
+
+
+
+ Proxy Configuration Showing
+ Internet Explorer HTTP and HTTPS (Secure) Settings
+
+
+
+
+
+ [ Screenshot of IE Proxy Configuration ]
+
+
+
+
+
+
After doing this, flush your browser's disk and memory caches to force a
- re-reading of all pages and to get rid of any ads that may be cached. You
- are now ready to start enjoying the benefits of using
+ re-reading of all pages and to get rid of any ads that may be cached. Remove
+ any cookies ,
+ if you want Privoxy to manage that. You are now
+ ready to start enjoying the benefits of using
Privoxy !
- Privoxy is typically started by specifying the
+ Privoxy itself is typically started by specifying the
main configuration file to be used on the command line. If no configuration
file is specified on the command line, Privoxy
will look for a file named config in the current
@@ -807,23 +1136,31 @@ automatically start Privoxy in the boot process.
-RedHat and Conectiva
+Red Hat, Fedora and Conectiva
-We use a script. Note that RedHat does not start Privoxy upon booting per
-default. It will use the file /etc/privoxy/config as its
-main configuration file.
+ A default Red Hat installation may not start &my-app; upon boot. It will use
+ the file /etc/privoxy/config as its main configuration
+ file.
# /etc/rc.d/init.d/privoxy start
+
+ Or ...
+
+
+
+ # service privoxy start
+
+
Debian
- We use a script. Note that Debian starts Privoxy upon booting per
+ We use a script. Note that Debian typically starts &my-app; upon booting by
default. It will use the file
/etc/privoxy/config as its main configuration
file.
@@ -852,10 +1189,18 @@ your PC.
Windows
-Click on the Privoxy Icon to start Privoxy. If no configuration file is
+Click on the &my-app; Icon to start Privoxy . If no configuration file is
specified on the command line, Privoxy will look
for a file named config.txt . Note that Windows will
- automatically start Privoxy upon booting you PC.
+ automatically start &my-app; when the system starts if you chose that option
+ when installing.
+
+
+ Privoxy can run with full Windows service functionality.
+ On Windows only, the &my-app; program has two new command line arguments
+ to install and uninstall &my-app; as a service. See the
+ Windows Installation
+ instructions for details.
@@ -874,14 +1219,29 @@ Example Unix startup command:
OS/2
-FIXME.
+ During installation, Privoxy is configured to
+ start automatically when the system restarts. You can start it manually by
+ double-clicking on the Privoxy icon in the
+ Privoxy folder.
-MAX OSX
+Mac OSX
-FIXME.
+ During installation, Privoxy is configured to
+ start automatically when the system restarts. To start &my-app; manually,
+ double-click on the StartPrivoxy.command icon in the
+ /Library/Privoxy folder. Or, type this command
+ in the Terminal:
+
+
+
+ /Library/Privoxy/StartPrivoxy.command
+
+
+
+ You will be prompted for the administrator password.
@@ -889,7 +1249,36 @@ FIXME.
AmigaOS
-FIXME.
+ Start Privoxy (with RUN <>NIL:) in your
+ startnet script (AmiTCP), in
+ s:user-startup (RoadShow), as startup program in your
+ startup script (Genesis), or as startup action (Miami and MiamiDx).
+ Privoxy will automatically quit when you quit your
+ TCP/IP stack (just ignore the harmless warning your TCP/IP stack may display that
+ Privoxy is still running).
+
+
+
+
+Gentoo
+
+ A script is again used. It will use the file /etc/privoxy/config
+ as its main configuration file.
+
+
+
+ /etc/init.d/privoxy start
+
+
+
+ Note that Privoxy is not automatically started at
+ boot time by default. You can change this with the rc-update
+ command.
+
+
+
+ rc-update add privoxy default
+
@@ -897,7 +1286,7 @@ FIXME.
See the section Command line options for
- furher info.
+ further info.
must find a better place for this paragraph
@@ -978,17 +1367,17 @@ must find a better place for this paragraph
- If the above paragraph sounds gibberish to you, you might want to read more about the actions concept
- or even dive deep into the Appendix
- on actions .
+ If the above paragraph sounds gibberish to you, you might want to read more about the actions concept
+ or even dive deep into the Appendix
+ on actions.
If you can't get rid of the problem at all, think you've found a bug in
Privoxy, want to propose a new feature or smarter rules, please see the
- section Contacting the
- Developers
below.
+ section Contacting the
+ Developers
below.
-->
@@ -1051,7 +1440,20 @@ must find a better place for this paragraph
USER , and if included the GID of GROUP. Exit if the
privileges are not sufficient to do so. Unix only.
-
+
+
+
+ --chroot
+
+
+
+ Before changing to the user ID given in the --user option,
+ chroot to that user's home directory, i.e. make the kernel pretend to the &my-app;
+ process that the directory tree starts there. If set up carefully, this can limit
+ the impact of possible vulnerabilities in &my-app; to the files contained in that hierarchy.
+ Unix only.
+
+
configfile
@@ -1069,6 +1471,14 @@ must find a better place for this paragraph
+
+ On MS Windows only there are two additional
+ command-line options to allow Privoxy to install and
+ run as a service . See the
+ Windows Installation section
+for details.
+
+
@@ -1077,7 +1487,7 @@ must find a better place for this paragraph
-Privoxy Configuration
+Privoxy Configuration
All Privoxy configuration is stored
in text files. These files can be edited with a text editor.
@@ -1089,7 +1499,7 @@ must find a better place for this paragraph
-Controlling Privoxy with Your Web Browser
+Controlling Privoxy with Your Web Browser
Privoxy 's user interface can be reached through the special
URL http://config.privoxy.org/
@@ -1102,7 +1512,7 @@ must find a better place for this paragraph
- Privoxy Menu
+ Privoxy Menu
@@ -1120,6 +1530,10 @@ must find a better place for this paragraph
▪ Toggle Privoxy on or off
+
+ ▪ Documentation
+
@@ -1214,17 +1628,32 @@ must find a better place for this paragraph
- default.filter (the filter
+ Filter files
(the filter
file) can be used to re-write the raw page content, including
viewable text as well as embedded HTML and JavaScript, and whatever else
lurks on any given web page. The filtering jobs are only pre-defined here;
- whether to apply them or not is up to the actions files.
+ whether to apply them or not is up to the actions files.
+ default.filter includes various filters made
+ available for use by the developers. Some are much more intrusive than
+ others, and all should be used with caution. You may define additional
+ filter files in config as you can with
+ actions files. We suggest user.filter for any
+ locally defined filters or customizations.
+
+ The syntax of all configuration files has remained the same throughout the
+ 3.x series. There have been enhancements, but no changes that would preclude
+ the use of any configuration file from one version to the next. (There is
+ one exception: +fast-redirects which
+ has enhanced syntax and will require updating any local configs from earlier
+ versions.)
+
+
All files use the #
character to denote a
comment (the rest of the line will be ignored) and understand line continuation
@@ -1232,11 +1661,11 @@ must find a better place for this paragraph
 in a line. If the # is preceded by a backslash, it loses
its special function. Placing a # in front of an otherwise
valid configuration line to prevent it from being interpreted is called "commenting
- out" that line.
+ out" that line. Blank lines are ignored.
- The actions files and default.filter
+ The actions files and filter files
can use Perl style regular expressions for
maximum flexibility.
@@ -1267,2238 +1696,1955 @@ must find a better place for this paragraph
-
-The Main Configuration File
+
+
+
+ &config;
+
-
- Again, the main configuration file is named config on
- Linux/Unix/BSD and OS/2, and config.txt on Windows.
- Configuration lines consist of an initial keyword followed by a list of
- values, all separated by whitespace (any number of spaces or tabs). For
- example:
-
-
-
-
-
-
- confdir /etc/privoxy
-
-
-
-
-
- Assigns the value /etc/privoxy to the option
- confdir and thus indicates that the configuration
- directory is named /etc/privoxy/
.
-
-
- All options in the config file except for confdir and
- logdir are optional. Watch out in the below description
- for what happens if you leave them unset.
-
+
-
- The main config file controls all aspects of Privoxy 's
- operation that are not location dependent (i.e. they apply universally, no matter
- where you may be surfing).
-
-
+
-
-Configuration and Log File Locations
+Actions Files
- Privoxy can (and normally does) use a number of
- other files for additional configuration, help and logging.
- This section of the configuration file tells Privoxy
- where to find those other files.
+ The actions files are used to define what actions
+ Privoxy takes for which URLs, and thus determines
+ how ad images, cookies and various other aspects of HTTP content and
+ transactions are handled, and on which sites (or even parts thereof).
+ There are a number of such actions, with a wide range of functionality.
+ Each action does something a little different.
+ These actions give us a veritable arsenal of tools with which to exert
+ our control, preferences and independence. Actions can be combined so that
+ their effects are aggregated when applied against a given set of URLs.
+
+
+ There
+ are three action files included with Privoxy with
+ differing purposes:
+
+
+
+
+
+
+ default.action - is the primary action file
+ that sets the initial values for all actions. It is intended to
+ provide a base level of functionality for
+ Privoxy's array of features. So it is
+ a set of broad rules that should work reasonably well as-is for most users.
+ This is the file that the developers are keeping updated, and making available to users.
+ It also reflects the user's preferences as set in standard.action ,
+ e.g. either Cautious (the default),
+ Medium , or Advanced (see
+ below).
+
+
+
+
+ user.action - is intended to be for local site
+ preferences and exceptions. As an example, if your ISP or your bank
+ has specific requirements and needs special handling, this kind of
+ thing should go here. This file will not be upgraded.
+
+
+
+
+ standard.action - is used by the web based editor
+ at
+ http://config.privoxy.org/edit-actions-list?f=default ,
+ to set various pre-defined sets of rules for the default actions section
+ in default.action .
+
+
+ Edit Set to Cautious Set to Medium Set to Advanced
+
+
+ These have increasing levels of aggressiveness and have no
+ influence on your browsing unless you select them explicitly in the
+ editor . A default installation should be pre-set to
+ Cautious (versions prior to 3.0.5 were set to
+ Medium ). New users should try this for a while before
+ adjusting the settings to more aggressive levels. The more aggressive
+ the settings, the greater the likelihood of problems such as sites
+ not working as they should.
+
+
+ The Edit button allows you to turn each
+ action on/off individually for fine-tuning. The Cautious
+ button changes the actions list to low/safe settings which will activate
+ a minimal set of &my-app;'s features, and subsequently there will be
+ less of a chance for accidental problems. The Medium
+ button sets the list to a medium level of ad blocking and a low level set of
+ privacy features. The Advanced button
+ sets the list to a high level of ad blocking and medium level of
+ privacy. See the chart below. The latter three buttons over-ride
+ any changes made with the Edit button. More
+ fine-tuning can be done in the lower sections of this internal page.
+
+
+ It is not recommended to edit the standard.action file
+ itself.
+
+
+ The default profiles, and their associated actions, as pre-defined in
+ standard.action are:
+
+
+ Default Configurations
+
+
+
+
+
+
+
+ Feature
+ Cautious
+ Medium
+ Advanced
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Ad-blocking Aggressiveness
+ medium
+ high
+ high
+
+
+
+ Ad-filtering by size
+ no
+ yes
+ yes
+
+
+
+ Ad-filtering by link
+ no
+ no
+ yes
+
+
+ Pop-up killing
+ blocks only
+ blocks only
+ blocks only
+
+
+
+ Privacy Features
+ low
+ medium
+ medium/high
+
+
+
+ Cookie handling
+ none
+ session-only
+ kill
+
+
+
+ Referer forging
+ no
+ yes
+ yes
+
+
+
+
+ GIF de-animation
+ no
+ yes
+ yes
+
+
+
+
+ Fast redirects
+ no
+ no
+ yes
+
+
+
+ HTML taming
+ no
+ no
+ yes
+
+
+
+ JavaScript taming
+ no
+ no
+ yes
+
+
+
+ Web-bug killing
+ no
+ yes
+ yes
+
+
+
+ Image tag reordering
+ no
+ no
+ yes
+
+
+
+
+
+
+
+
+
+
+
+
+ The list of actions files to be used is defined in the main configuration
+ file, and the files are processed in the order they are defined (e.g.
+ default.action is typically processed before
+ user.action). The content of these can all be viewed and
+ edited from http://config.privoxy.org/show-status .
+ The over-riding principle when applying actions is that the last action that
+ matches a given URL wins. The broadest, most general rules go first
+ (defined in default.action),
+ followed by any exceptions (typically also in
+ default.action), which are then followed lastly by any
+ local preferences (typically in user.action).
+ Generally, user.action has the last word.
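+
+ As a minimal sketch of this ordering (the host names are invented for
+ illustration and are not part of the distributed files), a broad rule in
+ default.action can be undone for a single host by a later section in
+ user.action:
+
+ # in default.action (processed first)
+ { +block }
+ .ads.example.com
+
+ # in user.action (processed later, so it has the last word)
+ { -block }
+ partner.ads.example.com
+
+ A request to partner.ads.example.com matches both sections. Because the
+ match in user.action comes last, the request is not blocked, while all
+ other hosts ending in .ads.example.com stay blocked.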
+
+
+
+ An actions file typically has multiple sections. If you want to use
+ "aliases" in an actions file, you have to place the (optional)
+ alias section at the top of that file.
+ Then comes the default set of rules which will apply universally to all
+ sites and pages (be very careful with using such a
+ universal set in user.action or any other actions file after
+ default.action , because it will override the result
+ from consulting any previous file). And then below that,
+ exceptions to the defined universal policies. You can regard
+ user.action as an appendix to default.action ,
+ with the advantage that it is a separate file, which makes preserving your
+ personal settings across Privoxy upgrades easier.
+
+
+
+ Actions can be used to block anything you want, including ads, banners, or
+ just some obnoxious URL that you would rather not see. Cookies can be accepted
+ or rejected, or accepted only during the current browser session (i.e. not
+ written to disk), content can be modified, JavaScripts tamed, user-tracking
+ fooled, and much more. See below for a complete list
+ of actions.
+
+
+Finding the Right Mix
- The user running Privoxy, must have read permission for all
- configuration files, and write permission to any files that would
- be modified, such as log files.
+ Note that some actions, like cookie suppression
+ or script disabling, may render some sites unusable that rely on these
+ techniques to work properly. Finding the right mix of actions is not always easy and
+ certainly a matter of personal taste. And, things can always change, requiring
+ refinements in the configuration. In general, it can be said that the more
+ "aggressive" your default settings (in the top section of the
+ actions file) are, the more exceptions for "trusted" sites you
+ will have to make later. If, for example, you want to crunch all cookies per
+ default, you'll have to make exceptions from that rule for sites that you
+ regularly use and that require cookies for actually useful purposes, like maybe
+ your bank, favorite shop, or newspaper.
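+
+ The cookie case could be sketched as follows, assuming a hypothetical bank
+ at bank.example.com; the two cookie actions are the ones documented later
+ in this chapter:
+
+ # top section: crunch cookies everywhere
+ { +crunch-incoming-cookies +crunch-outgoing-cookies }
+ /
+
+ # exception, placed later so that it wins for this site
+ { -crunch-incoming-cookies -crunch-outgoing-cookies }
+ .bank.example.com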
-confdir
+
+ We have tried to provide you with reasonable rules to start from in the
+ distribution actions files. But there is no general rule of thumb on these
+ things. There just are too many variables, and sites are constantly changing.
+ Sooner or later you will want to change the rules (and read this chapter again :).
+
+
-
-
- Specifies:
-
- The directory where the other configuration files are located
-
-
-
- Type of value:
-
- Path name
-
-
-
- Default value:
-
- /etc/privoxy (Unix) or Privoxy installation dir (Windows)
-
-
-
- Effect if unset:
-
- Mandatory
-
-
-
- Notes:
-
-
- No trailing /
, please
-
-
- When development goes modular and multi-user, the blocker, filter, and
- per-user config will be stored in subdirectories of confdir
.
- For now, the configuration directory structure is flat, except for
- confdir/templates , where the HTML templates for CGI
- output reside (e.g. Privoxy's 404 error page).
-
-
-
-
-
+
+
+How to Edit
+
+ The easiest way to edit the actions files is with a browser by
+ using our browser-based editor, which can be reached from http://config.privoxy.org/show-status .
+ The editor allows both fine-grained control over every single feature on a
+ per-URL basis, and easy choosing from wholesale sets of defaults like
+ "Cautious", "Medium" or "Advanced".
+ Warning: the "Advanced" setting is more aggressive, and
+ will be more likely to cause problems for some sites. Experienced users only!
+
+
+
+ If you prefer plain text editing to GUIs, you can of course also directly edit
+ the actions files with your favorite text editor. Look at
+ default.action which is richly commented with many
+ good examples.
+
+
+
+
+
+How Actions are Applied to URLs
+
+ Actions files are divided into sections. There are special sections,
+ like the "alias" sections which will
+ be discussed later. For now let's concentrate on regular sections: They have a
+ heading line (often split up to multiple lines for readability) which consists
+ of a list of actions, separated by whitespace and enclosed in curly braces.
+ Below that, there is a list of URL patterns, each on a separate line.
+
+
+
+ To determine which actions apply to a request, the URL of the request is
+ compared to all patterns in each actions file. Every time it matches, the list of
+ applicable actions for the URL is incrementally updated, using the heading
+ of the section in which the pattern is located. If multiple matches for
+ the same URL set the same action differently, the last match wins. If not,
+ the effects are aggregated. E.g. a URL might match a regular section with
+ a heading line of { +handle-as-image },
+ then later another one with just { +block }, resulting
+ in both actions being applied. And there may well be
+ cases where you will want to combine actions together. Such a section then
+ might look like:
+
+
+
+
+ { +handle-as-image +block }
+ # Block these as if they were images. Send no block page.
+ banners.example.com
+ media.example.com/.*banners
+ .example.com/images/ads/
+
+
+
+ You can trace this process for any given URL by visiting http://config.privoxy.org/show-url-info .
+
+
+ Examples and more detail on this is provided in the Appendix,
+ Troubleshooting: Anatomy of an Action section.
+
+
-logdir
+
+
+Patterns
+
+ As mentioned, Privoxy uses patterns
+ to determine what actions might apply to which sites and
+ pages your browser attempts to access. These "patterns" use wildcard-type
+ pattern matching to achieve a high degree of
+ flexibility. This allows one expression to be expanded and potentially match
+ against many similar patterns.
+
+
+
+ Generally, a Privoxy pattern has the form
+ <domain>/<path> , where both the
+ <domain> and <path> are
+ optional. (This is why the special / pattern matches all
+ URLs). Note that the protocol portion of the URL pattern (e.g.
+ http:// ) should not be included in
+ the pattern. This is assumed already!
+
+
+ The pattern matching syntax is different for the domain and path parts of
+ the URL. The domain part uses a simple globbing type matching technique,
+ while the path part uses a more flexible
+ "Regular Expressions (PCRE)" based syntax.
+
- Specifies:
+ www.example.com/
- The directory where all logging takes place (i.e. where logfile and
- jarfile are located)
+ is a domain-only pattern and will match any request to www.example.com ,
+ regardless of which document on that server is requested. So ALL pages in
+ this domain would be covered by the scope of this action. Note that a
+ simple example.com is different and would NOT match.
- Type of value:
+ www.example.com
- Path name
+
+ means exactly the same. For domain-only patterns, the trailing / may
+ be omitted.
+
- Default value:
+ www.example.com/index.html
- /var/log/privoxy (Unix) or Privoxy installation dir (Windows)
+
+ matches only the single document /index.html
+ on www.example.com .
+
- Effect if unset:
+ /index.html
- Mandatory
+
+ matches the document /index.html , regardless of the domain,
+ i.e. on any web server anywhere.
+
- Notes:
+ index.html
- No trailing /
, please
+ matches nothing, since it would be interpreted as a domain name and
+ there is no top-level domain called .html. So it's
+ a mistake.
-
-
-actionsfile
-
-
-
-
-
+
+
+The Domain Pattern
+
+
+ The matching of the domain part offers some flexible options: if the
+ domain starts or ends with a dot, it becomes unanchored at that end.
+ For example:
+
+
- Specifies:
+ .example.com
- The actions file(s) to use
+ matches any domain that ENDS in
+ .example.com
- Type of value:
-
- File name, relative to confdir , without the .action suffix
-
-
-
- Default values:
-
-
-
- standard # Internal purposes, no editing recommended
-
-
- default # Main actions file
-
-
- user # User customizations
-
-
-
-
-
- Effect if unset:
+ www.
- No actions are taken at all. Simple neutral proxying.
+ matches any domain that STARTS with
+ www.
- Notes:
+ .example.
- Multiple actionsfile lines are permitted, and are in fact recommended!
-
-
- The default values include standard.action, which is used for internal
- purposes and should be loaded, default.action, which is the
- main
actions file maintained by the developers, and
- user.action , where you can make your personal additions.
-
-
- Actions files are where all the per site and per URL configuration is done for
- ad blocking, cookie management, privacy considerations, etc.
- There is no point in using Privoxy without at
- least one actions file.
+ matches any domain that CONTAINS .example. .
+ And, by the way, also included would be any files or documents that exist
+ within that domain since no path limitations are specified. (Correctly
+ speaking: It matches any FQDN that contains example as
+ a domain.) This might be www.example.com ,
+ news.example.de , or
+ www.example.net/cgi/testing.pl for instance. All these
+ cases are matched.
-
-filterfile
-
+
+ Additionally, there are wild-cards that you can use in the domain names
+ themselves. These work similarly to shell globbing type wild-cards:
+ "*" represents zero or more arbitrary characters (this is
+ equivalent to the "Regular Expression" based syntax of ".*"),
+ "?" represents any single character (this is equivalent to the
+ regular expression syntax of a simple "."), and you can define
+ "character classes" in square brackets which is similar to
+ the same regular expression technique. All of this can be freely mixed:
+
+
- Specifies:
+ ad*.example.com
- The filter file to use
+ matches "adserver.example.com",
+ "ads.example.com", etc. but not "sfads.example.com"
- Type of value:
-
- File name, relative to confdir
-
-
-
- Default value:
+ *ad*.example.com
- default.filter (Unix) or default.filter.txt (Windows)
+
+ matches all of the above, and then some.
+
- Effect if unset:
+ .?pix.com
- No textual content filtering takes place, i.e. all
- + filter{name }
- actions in the actions files are turned neutral.
+ matches www.ipix.com ,
+ pictures.epix.com , a.b.c.d.e.upix.com etc.
- Notes:
+ www[1-9a-ez].example.c*
- The filter file contains content modification
- rules that use regular expressions. These rules permit
- powerful changes on the content of Web pages, e.g., you could disable your favorite
- JavaScript annoyances, re-write the actual displayed text, or just have some
- fun replacing Microsoft
with MicroSuck
wherever
- it appears on a Web page.
-
-
- The
- + filter{name }
- actions rely on the relevant filter (name )
- to be defined in the filter file!
-
-
- A pre-defined filter file called default.filter that contains
- a bunch of handy filters for common problems is included in the distribution.
- See the section on the filter
- action for a list.
+ matches www1.example.com ,
+ www4.example.cc , wwwd.example.cy ,
+ wwwz.example.com etc., but not
+ wwww.example.com .
-
-
-logfile
+
+
+ While flexible, this does not offer the sophistication of full regular expression based syntax.
+
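+ The domain forms described above can be mixed within one section. A small
+ sketch, reusing example patterns from this section under a +block heading
+ that was chosen only for illustration:
+
+ { +block }
+ # any domain that ends in .example.net
+ .example.net
+ # adserver.example.com, ads.example.com, but not sfads.example.com
+ ad*.example.com
+ # www1.example.com, www4.example.cc, wwwz.example.com, but not wwww.example.com
+ www[1-9a-ez].example.c*
+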
+
+
+
+
+
+
+
+The Path Pattern
+
+
+ Privoxy uses Perl compatible (PCRE)
+ "Regular Expression" based syntax
+ (through the PCRE library) for
+ matching the path portion (after the slash), and is thus more flexible.
+
+
+
+ There is an Appendix with a brief quick-start into regular
+ expressions, and full (very technical) documentation on PCRE regex syntax is available on-line
+ at http://www.pcre.org/man.txt .
+ You might also find the Perl man page on regular expressions (man perlre )
+ useful, which is available on-line at http://perldoc.perl.org/perlre.html .
+
+
+
+ Note that the path pattern is automatically left-anchored at the "/",
+ i.e. it matches as if it started with a "^" (regular expression speak
+ for the beginning of a line).
+
+
+
+ Please also note that matching in the path is CASE INSENSITIVE
+ by default, but you can switch to case sensitive at any point in the pattern by using the
+ "(?-i)" switch: www.example.com/(?-i)PaTtErN.* will match
+ only documents whose path starts with PaTtErN in
+ exactly this capitalization.
+
- Specifies:
+ .example.com/.*
- The log file to use
+ Is equivalent to just ".example.com", since any documents
+ within that domain are matched with or without the ".*"
+ regular expression. This is redundant.
- Type of value:
+ .example.com/.*/index.html
- File name, relative to logdir
+
+ Will match any page in the domain of "example.com" that is
+ named "index.html", and that is part of some path. For
+ example, it matches "www.example.com/testing/index.html" but
+ NOT "www.example.com/index.html" because the regular
+ expression called for at least two "/"s, thus the path
+ requirement. It also would match
+ "www.example.com/testing/index_html", because of the
+ special meta-character ".".
+
- Default value:
+ .example.com/(.*/)?index\.html
- logfile (Unix) or privoxy.log (Windows)
+
+ This regular expression is conditional so it will match any page
+ named "index.html" regardless of path which in this case can
+ have one or more "/"s. And this one must contain exactly
+ ".html" (but does not have to end with that!).
+
- Effect if unset:
+ .example.com/(.*/)(ads|banners?|junk)
- No log file is used, all log messages go to the console (stderr ).
+ This regular expression will match any path of "example.com"
+ that contains any of the words "ads", "banner",
+ "banners" (because of the "?") or "junk".
+ The path does not have to end in these words, just contain them.
- Notes:
+ .example.com/(.*/)(ads|banners?|junk)/.*\.(jpe?g|gif|png)$
- The windows version will additionally log to the console.
-
-
- The logfile is where all logging and error messages are written. The level
- of detail and number of messages are set with the debug
- option (see below). The logfile can be useful for tracking down a problem with
- Privoxy (e.g., it's not blocking an ad you
- think it should block) but in most cases you probably will never look at it.
-
-
- Your logfile will grow indefinitely, and you will probably want to
- periodically remove it. On Unix systems, you can do this with a cron job
- (see man cron
). For Red Hat, a logrotate
- script has been included.
-
-
- On SuSE Linux systems, you can place a line like /var/log/privoxy.*
- +1024k 644 nobody.nogroup
in /etc/logfiles , with
- the effect that cron.daily will automatically archive, gzip, and empty the
- log, when it exceeds 1M size.
-
-
- Any log files must be writable by whatever user Privoxy
- is being run as (default on UNIX, user id is privoxy
).
+ This is very much the same as above, except now it must end in either
+ ".jpg", ".jpeg", ".gif" or ".png". So this
+ one is limited to common image formats.
+
+
+ There are many, many good examples to be found in default.action ,
+ and more tutorials below in Appendix on regular expressions.
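+
+ As a hedged sketch of how domain and path patterns work together (the
+ section heading and the directory name Tracking are assumptions made for
+ this example), consider:
+
+ { +block +handle-as-image }
+ # image files in a directory named ads, banner, banners or junk, at any depth
+ .example.com/(.*/)?(ads|banners?|junk)/.*\.(jpe?g|gif|png)$
+ # case sensitive from (?-i) on: only paths that start with /Tracking/
+ .example.com/(?-i)Tracking/
+
+ The first pattern makes the leading directory part optional with (.*/)?,
+ so it covers /ads/logo.gif as well as /shop/img/ads/logo.gif.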
+
+
-jarfile
+
+
+
+
+
+
+
+
+Actions
+
+ All actions are disabled by default, until they are explicitly enabled
+ somewhere in an actions file. Actions are turned on if preceded with a
+ "+", and turned off if preceded with a "-". So a
+ +action means "do that action", e.g.
+ +block means "please block URLs that match the
+ following patterns", and -block means "don't
+ block URLs that match the following patterns, even if +block
+ previously applied."
+
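+ A short sketch of this on/off pairing, with a made-up site layout:
+
+ { +block }
+ .example.com/ads/
+
+ { -block }
+ .example.com/ads/terms\.html
+
+ Everything below /ads/ is blocked, except for paths that start with
+ /ads/terms.html, because the -block section matches later.
+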
+
+
+
+ Again, actions are invoked by placing them on a line, enclosed in curly braces and
+ separated by whitespace, like in
+ {+some-action -some-other-action{some-parameter}} ,
+ followed by a list of URL patterns, one per line, to which they apply.
+ Together, the actions line and the following pattern lines make up a section
+ of the actions file.
+
+
+
+ Actions fall into three categories:
+
+
+
+
+
+
+ Boolean, i.e. the action can only be "enabled" or
+ "disabled". Syntax:
+
+
+
+ +name # enable action name
+ -name # disable action name
+
+
+ Example: +block
+
+
+
+
+
+
+ Parameterized, where some value is required in order to enable this type of action.
+ Syntax:
+
+
+
+ +name {param } # enable action and set parameter to param ,
+ # overwriting parameter from previous match if necessary
+ -name # disable action. The parameter can be omitted
+
+
+ Note that if the URL matches multiple positive forms of a parameterized action,
+ the last match wins, i.e. the params from earlier matches are simply ignored.
+
+
+ Example: +hide-user-agent{ Mozilla 1.0 }
+
+
+
+
+
+ Multi-value. These look exactly like parameterized actions,
+ but they behave differently: If the action applies multiple times to the
+ same URL, but with different parameters, all the parameters
+ from all matches are remembered. This is used for actions
+ that can be executed for the same request repeatedly, like adding multiple
+ headers, or filtering through multiple filters. Syntax:
+
+
+
+ +name {param } # enable action and add param to the list of parameters
+ -name {param } # remove the parameter param from the list of parameters
+ # If it was the last one left, disable the action.
+ -name # disable this action completely and remove all parameters from the list
+
+
+ Examples: +add-header{X-Fun-Header: Some text} and
+ +filter{html-annoyances}
+
+
+
+
+
+
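+ To illustrate the difference between parameterized and multi-value actions,
+ here is a sketch with two sections that both match shop.example.com. The
+ filter name js-annoyances and the user agent strings are assumptions for
+ this example; a filter has to be defined in the filter file to have any effect:
+
+ { +filter{html-annoyances} +filter{js-annoyances} +hide-user-agent{Mozilla 1.0} }
+ .example.com
+
+ { -filter{js-annoyances} +hide-user-agent{Mozilla 2.0} }
+ shop.example.com
+
+ For shop.example.com the multi-value filter list ends up as
+ html-annoyances only, since the second section removes one entry, while the
+ parameterized hide-user-agent keeps just the last parameter, Mozilla 2.0.
+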
+
+ If nothing is specified in any actions file, no "actions" are
+ taken. So in this case Privoxy would just be a
+ normal, non-blocking, non-anonymizing proxy. You must specifically enable the
+ privacy and blocking features you need (although the provided default actions
+ files will give a good starting point).
+
+
+
+ Later defined actions always over-ride earlier ones. So exceptions
+ to any rules you make should come in the latter part of the file (or
+ in a file that is processed later when using multiple actions files such
+ as user.action). For multi-valued actions, the actions
+ are applied in the order they are specified. Actions files are processed in
+ the order they are defined in config (the default
+ installation has three actions files). It is also quite possible for any given
+ URL to match more than one "pattern" (because of wildcards and
+ regular expressions), and thus to trigger more than one set of actions! Last
+ match wins.
+
+
+
+
+ The list of valid Privoxy actions is:
+
+
+
+
+
+
+
+
+
+
+
+
+
-trustfile
+
+
+
+block
+
- Specifies:
+ Typical use:
-
- The trust file to use
-
+ Block ads or other unwanted content
+
- Type of value:
+ Effect:
- File name, relative to confdir
+
+ Requests for URLs to which this action applies are blocked, i.e. the
+ requests are trapped by &my-app; and the requested URL is never retrieved,
+ but is answered locally with a substitute page or image, as determined by
+ the handle-as-image ,
+ set-image-blocker , and
+ handle-as-empty-document actions.
+
+
+
- Default value:
+ Type:
+
- Unset (commented out) . When activated: trust (Unix) or trust.txt (Windows)
+ Boolean.
+
- Effect if unset:
+ Parameter:
-
- The whole trust mechanism is turned off.
-
+ N/A
-
+
+
Notes:
- The trust mechanism is an experimental feature for building white-lists and should
- be used with care. It is NOT recommended for the casual user.
+ Privoxy sends a special "BLOCKED" page
+ for requests to blocked pages. This page contains links to find out why the request
+ was blocked, and a click-through to the blocked content (the latter only if compiled with the
+ force feature enabled). The "BLOCKED" page adapts to the available
+ screen space -- it displays full-blown if space allows, or miniaturized and text-only
+ if loaded into a small frame or window. If you are using Privoxy
+ right now, you can take a look at the
+ "BLOCKED" page.
-
- If you specify a trust file, Privoxy will only allow
- access to sites that are named in the trustfile.
- You can also mark sites as trusted referrers (with + ), with
- the effect that access to untrusted sites will be granted, if a link from a
- trusted referrer was used.
- The link target will then be added to the trustfile
.
- Possible applications include limiting Internet access for children.
+
+ A very important exception occurs if both
+ block and handle-as-image
+ apply to the same request: it will then be replaced by an image. If
+ set-image-blocker
+ (see below) also applies, the type of image will be determined by its parameter;
+ if not, the standard checkerboard pattern is sent.
- If you use + operator in the trust file, it may grow considerably over time.
+ It is important to understand this process, in order
+ to understand how Privoxy deals with
+ ads and other unwanted content. Blocking is a core feature, and one
+ upon which various other features depend.
+
+
+ The filter
+ action can perform a very similar task, by blocking
+ banner images and other content through rewriting the relevant URLs in the
+ document's HTML source, so they don't get requested in the first place.
+ Note that this is a totally different technique, and it's easy to confuse the two.
-
-
-
-
-
+
+ Example usage (section):
+
+
+ {+block}
+# Block and replace with "blocked" page
+ .nasty-stuff.example.com
+
+{+block +handle-as-image}
+# Block and replace with image
+ .ad.doubleclick.net
+ .ads.r.us/banners/
+
+{+block +handle-as-empty-document}
+# Block and then ignore
+ adserver.exampleclick.net/.*\.js$
+
+
+
-
+
+
-
-Local Set-up Documentation
-
- If you intend to operate Privoxy for more users
- than just yourself, it might be a good idea to let them know how to reach
- you, what you block and why you do that, your policies, etc.
-
+
+
+
+content-type-overwrite
-user-manual
- Specifies:
+ Typical use:
-
- Location of the Privoxy User Manual.
-
+ Stop useless download menus from popping up, or change the browser's rendering mode
+
- Type of value:
+ Effect:
- A fully qualified URI
+
+ Replaces the "Content-Type:" HTTP server header.
+
+
- Default value:
+ Type:
+
- Unset
+ Parameterized.
+
- Effect if unset:
+ Parameter:
- http://www.privoxy.org/version /user-manual/
- will be used, where version is the Privoxy version.
-
+ Any string.
+
+
Notes:
-
- The User Manual URI is used for help links from some of the internal CGI pages.
- The manual itself is normally packaged with the binary distributions, so you probably want
- to set this to a locally installed copy. For multi-user setups, you could provide a copy on
- a local webserver for all your users and use the corresponding URL here.
-
- Examples:
+ The "Content-Type:" HTTP server header is used by the
+ browser to decide what to do with the document. The value of this
+ header can cause the browser to open a download menu instead of
+ displaying the document by itself, even if the document's format is
+ supported by the browser.
-
- Unix, in local filesystem:
-
-
- user-manual file:///usr/share/doc/privoxy-&p-version;/user-manual/
-
-
- Any platform, on local webserver (called local-webserver
):
-
-
- user-manual http://local-webserver/privoxy-user-manual/
-
-
- If set, this option should be the first option in the config file , because
- it is used while the config file is being read.
+ The declared content type can also affect which rendering mode
+ the browser chooses. If XHTML is delivered as "text/html",
+ many browsers treat it as yet another broken HTML document.
+ If it is sent as "application/xml", browsers with
+ XHTML support will only display it if the syntax is correct.
-
-
-
-
-
-
-trust-info-url
-
-
-
- Specifies:
-
- A URL to be displayed in the error page that users will see if access to an untrusted page is denied.
+ If you see a web site that proudly uses XHTML buttons, but sets
+ "Content-Type: text/html", you can use &my-app;
+ to overwrite it with "application/xml" and validate
+ the web master's claim inside your XHTML-supporting browser.
+ If the syntax is incorrect, the browser will complain loudly.
-
-
-
- Type of value:
-
- URL
-
-
-
- Default value:
-
- Two example URL are provided
-
-
-
- Effect if unset:
-
- No links are displayed on the "untrusted" error page.
+ You can also go the opposite direction: if your browser prints
+ error messages instead of rendering a document falsely declared
+ as XHTML, you can overwrite the content type with
+ "text/html" and have it rendered as a broken HTML document.
-
-
-
- Notes:
-
- The value of this option only matters if the experimental trust mechanism has been
- activated. (See trustfile above.)
+ By default content-type-overwrite only replaces
+ "Content-Type:" headers that look like some kind of text.
+ If you want to overwrite it unconditionally, you have to combine it with
+ force-text-mode.
+ This limitation exists for a reason; think twice before circumventing it.
- If you use the trust mechanism, it is a good idea to write up some on-line
- documentation about your trust policy and to specify the URL(s) here.
- Use multiple times for multiple URLs.
+ Most of the time it's easier to enable
+ filter-server-headers
+ and replace this action with a custom regular expression. It allows you
+ to activate it for every document of a certain site and it will still
+ only replace the content types you aimed at.
- The URL(s) should be added to the trustfile as well, so users don't end up
- locked out from the information on why they were locked out in the first place!
+ Of course you can apply content-type-overwrite
+ to a whole site and then make URL based exceptions, but it's a lot
+ more work to get the same precision.
+
+
+
+
+
+ Example usage (sections):
+
+
+ # Check if www.example.net/ really uses valid XHTML
+{+content-type-overwrite {application/xml}}
+www.example.net/
+
+# but leave the content type unmodified if the URL looks like a style sheet
+{-content-type-overwrite}
+www.example.net/.*\.css$
+www.example.net/.*style
+
-admin-address
+
+
+
-proxy-info-url
+
+
+crunch-if-none-match
+
- Specifies:
+ Typical use:
-
- A URL to documentation about the local Privoxy setup,
- configuration or policies.
-
+ Prevent yet another way to track the user's steps between sessions.
+
- Type of value:
+ Effect:
- URL
+
+ Deletes the "If-None-Match:" HTTP client header.
+
+
- Default value:
+ Type:
+
- Unset
+ Boolean.
+
- Effect if unset:
+ Parameter:
- No link to local documentation is displayed on error pages and the CGI user interface.
-
+ N/A
+
+
Notes:
- If both admin-address and proxy-info-url
- are unset, the whole "Local Privoxy Support" box on all generated pages will
- not be shown.
-
+ Removing the "If-None-Match:" HTTP client header
+ is useful for filter testing, where you want to force a real
+ reload instead of getting status code "304" which
+ would cause the browser to use a cached copy of the page.
+
+
+ It is also useful to make sure the header isn't used as a cookie
+ replacement.
+
+
+ Blocking the "If-None-Match:" header shouldn't cause any
+ caching problems, as long as the "If-Modified-Since:" header
+ isn't blocked as well.
+
- This URL shouldn't be blocked ;-)
-
+ It is recommended to use this action together with
+ hide-if-modified-since
+ and
+ overwrite-last-modified .
+
+
+
+
+
+ Example usage (section):
+
+
+ # Let the browser revalidate cached documents without being tracked across sessions
+{+hide-if-modified-since {-60} \
++overwrite-last-modified {randomize} \
++crunch-if-none-match}
+/
+
-
-
-
-
-Debugging
-
-
- These options are mainly useful when tracing a problem.
- Note that you might also want to invoke
- Privoxy with the --no-daemon
- command line option when debugging.
-
-
-debug
+
+crunch-incoming-cookies
- Specifies:
+ Typical use:
- Key values that determine what information gets logged to the
- logfile .
+ Prevent the web server from setting any cookies on your system
+
- Type of value:
+ Effect:
- Integer values
+
+ Deletes any "Set-Cookie:" HTTP headers from server replies.
+
+
- Default value:
+ Type:
+
- 12289 (i.e.: URLs plus informational and warning messages)
+ Boolean.
+
- Effect if unset:
+ Parameter:
- Nothing gets logged.
+ N/A
+
Notes:
- The available debug levels are:
-
-
-
- debug 1 # show each GET/POST/CONNECT request
- debug 2 # show each connection status
- debug 4 # show I/O status
- debug 8 # show header parsing
- debug 16 # log all data into the logfile
- debug 32 # debug force feature
- debug 64 # debug regular expression filter
- debug 128 # debug fast redirects
- debug 256 # debug GIF de-animation
- debug 512 # Common Log Format
- debug 1024 # debug kill pop-ups
- debug 4096 # Startup banner and warnings.
- debug 8192 # Non-fatal errors
-
-
-
- To select multiple debug levels, you can either add them or use
- multiple debug lines.
-
-
- A debug level of 1 is informative because it will show you each request
- as it happens. 1, 4096 and 8192 are highly recommended
- so that you will notice when things go wrong. The other levels are probably
- only of interest if you are hunting down a specific problem. They can produce
- a hell of an output (especially 16).
-
+ This action is only concerned with incoming cookies. For
+ outgoing cookies, use
+ crunch-outgoing-cookies .
+ Use both to disable cookies completely.
- The reporting of fatal errors (i.e. ones which crash
- Privoxy ) is always on and cannot be disabled.
+ It makes no sense at all to use this action in conjunction
+ with the session-cookies-only action,
+ since it would prevent the session cookies from being set. See also
+ filter-content-cookies .
+
+
+
+
+ Example usage:
+
- If you want to use CLF (Common Log Format), you should set debug
- 512
ONLY and not enable anything else.
+ +crunch-incoming-cookies
-single-threaded
+
+
-
-
-
-Access Control and Security
-
-
- This section of the config file controls the security-relevant aspects
- of Privoxy 's configuration.
-
-
-listen-address
+
+crunch-outgoing-cookies
- Specifies:
+ Typical use:
- The IP address and TCP port on which Privoxy will
- listen for client requests.
+ Prevent the web server from reading any cookies from your system
+
- Type of value:
+ Effect:
- [IP-Address ]:Port
+
+ Deletes any Cookie:
HTTP headers from client requests.
+
- Default value:
+ Type:
+
- 127.0.0.1:8118
+ Boolean.
+
- Effect if unset:
+ Parameter:
- Bind to 127.0.0.1 (localhost), port 8118. This is suitable and recommended for
- home users who run Privoxy on the same machine as
- their browser.
+ N/A
+
Notes:
- You will need to configure your browser(s) to this proxy address and port.
-
-
- If you already have another service running on port 8118, or if you want to
- serve requests from other machines (e.g. on your local network) as well, you
- will need to override the default.
+ This action is only concerned with outgoing cookies. For
+ incoming cookies, use
+ crunch-incoming-cookies .
+ Use both to disable cookies completely.
- If you leave out the IP address, Privoxy will
- bind to all interfaces (addresses) on your machine and may become reachable
- from the Internet. In that case, consider using access control lists (ACL's, see below), and/or
- a firewall.
-
-
- If you open Privoxy to untrusted users, you will
- also want to turn off the enable-edit-actions and
- enable-remote-toggle
- options!
+ It makes no sense at all to use this action in conjunction
+ with the session-cookies-only action,
+ since it would prevent the session cookies from being read.
+
- Example:
+ Example usage:
- Suppose you are running Privoxy on
- a machine which has the address 192.168.0.1 on your local private network
- (192.168.0.0) and has another outside connection with a different address.
- You want it to serve requests from inside only:
-
-
-
- listen-address 192.168.0.1:8118
-
+ +crunch-outgoing-cookies
+
-toggle
+
+
+
+deanimate-gifs
- Specifies:
+ Typical use:
-
- Initial state of "toggle" status
-
+ Stop those annoying, distracting animated GIF images.
+
- Type of value:
+ Effect:
- 1 or 0
+
+ De-animate GIF animations, i.e. reduce them to their first or last image.
+
+
- Default value:
+ Type:
+
- 1
+ Parameterized.
+
- Effect if unset:
+ Parameter:
- Act as if toggled on
+ last
or first
+
Notes:
- If set to 0, Privoxy will start in
- toggled off
mode, i.e. behave like a normal, content-neutral
- proxy where all ad blocking, filtering, etc are disabled. See
- enable-remote-toggle below. This is not really useful
- anymore, since toggling is much easier via the web interface than via
- editing the conf file.
+ This will also shrink the images considerably (in bytes, not pixels!). If
+ the option first
is given, the first frame of the animation
+ is used as the replacement. If last
is given, the last
+ frame of the animation is used instead, which probably makes more sense for
+ most banner animations, but also has the risk of not showing the entire
+ last frame (if it is only a delta to an earlier frame).
- The windows version will only display the toggle icon in the system tray
- if this option is present.
+ You can safely use this action with patterns that will also match non-GIF
+ objects, because no attempt will be made at anything that doesn't look like
+ a GIF.
-
-
-
-enable-remote-toggle
-
-
- Specifies:
-
-
- Whether or not the web-based toggle
- feature may be used
-
-
-
- Type of value:
+ Example usage:
- 0 or 1
+
+ +deanimate-gifs{last}
+
+
+
+
+
+
+downgrade-http-version
+
+
- Default value:
+ Typical use:
- 1
+ Work around (very rare) problems with HTTP/1.1
+
- Effect if unset:
+ Effect:
- The web-based toggle feature is disabled.
+ Downgrades HTTP/1.1 client requests and server replies to HTTP/1.0.
+
- Notes:
+ Type:
+
-
- When toggled off, Privoxy acts like a normal,
- content-neutral proxy, i.e. it acts as if none of the actions applied to
- any URL.
-
-
- For the time being, access to the toggle feature can not be
- controlled separately by ACLs
or HTTP authentication,
- so that everybody who can access Privoxy (see
- ACLs
and listen-address above) can
- toggle it for all users. So this option is not recommended
- for multi-user environments with untrusted users.
-
-
- Note that you must have compiled Privoxy with
- support for this feature, otherwise this option has no effect.
-
+ Boolean.
-
-
-
-enable-edit-actions
-
- Specifies:
+ Parameter:
- Whether or not the web-based actions
- file editor may be used
+ N/A
-
- Type of value:
-
- 0 or 1
-
-
-
- Default value:
-
- 1
-
-
-
- Effect if unset:
+
+
+ Notes:
- The web-based actions file editor is disabled.
+ This is a left-over from the time when Privoxy
+ didn't support important HTTP/1.1 features well. It is left here for the
+ unlikely case that you experience HTTP/1.1 related problems with some server
+ out there. Not all (optional) HTTP/1.1 features are supported yet, so there
+ is a chance you might need this action.
+
- Notes:
+ Example usage (section):
-
- For the time being, access to the editor can not be
- controlled separately by ACLs
or HTTP authentication,
- so that everybody who can access Privoxy (see
- ACLs
and listen-address above) can
- modify its configuration for all users. So this option is not
- recommended for multi-user environments with untrusted users.
-
-
- Note that you must have compiled Privoxy with
- support for this feature, otherwise this option has no effect.
-
+
+ {+downgrade-http-version}
+problem-host.example.com
+
+
-
-ACLs: permit-access and deny-access
-
-
+
+
+fast-redirects
- Specifies:
+ Typical use:
-
- Who can access what.
-
+ Fool some click-tracking scripts and speed up indirect links.
+
- Type of value:
+ Effect:
- src_addr [/src_masklen ]
- [dst_addr [/dst_masklen ]]
-
-
- Where src_addr and
- dst_addr are IP addresses in dotted decimal notation or valid
- DNS names, and src_masklen and
- dst_masklen are subnet masks in CIDR notation, i.e. integer
- values from 2 to 30 representing the length (in bits) of the network address. The masks and the whole
- destination part are optional.
+ Detects redirection URLs and redirects the browser without contacting
+ the redirection server first.
+
- Default value:
+ Type:
+
- Unset
+ Parameterized.
+
- Effect if unset:
+ Parameter:
-
- Don't restrict access further than implied by listen-address
-
+
+
+
+ simple-check
to just search for the string http://
+ to detect redirection URLs.
+
+
+
+
+ check-decoded-url
to decode URLs (if necessary) before searching
+ for redirection URLs.
+
+
+
+
Notes:
+
+ Many sites, like yahoo.com, don't just link to other sites. Instead, they
+ will link to some script on their own servers, giving the destination as a
+ parameter, which will then redirect you to the final target. URLs
+ resulting from this scheme typically look like:
+ http://www.example.org/click-tracker.cgi?target=http%3a//www.example.net/
.
+
- Access controls are included at the request of ISPs and systems
- administrators, and are not usually needed by individual users .
- For a typical home user, it will normally suffice to ensure that
- Privoxy only listens on the localhost
- (127.0.0.1) or internal (home) network address by means of the
- listen-address
- option.
-
-
- Please see the warnings in the FAQ that this proxy is not intended to be a substitute
- for a firewall or to encourage anyone to defer addressing basic security
- weaknesses.
-
-
- Multiple ACL lines are OK.
- If any ACLs are specified, then the Privoxy
- talks only to IP addresses that match at least one permit-access line
- and don't match any subsequent deny-access line. In other words, the
- last match wins, with the default being deny-access .
-
-
- If Privoxy is using a forwarder (see forward below)
- for a particular destination URL, the dst_addr
- that is examined is the address of the forwarder and NOT the address
- of the ultimate target. This is necessary because it may be impossible for the local
- Privoxy to determine the IP address of the
- ultimate target (that's often what gateways are used for).
-
-
- You should prefer using IP addresses over DNS names, because the address lookups take
- time. All DNS names must resolve! You can not use domain patterns
- like *.org
or partial domain names. If a DNS name resolves to multiple
- IP addresses, only the first one is used.
-
-
- Denying access to particular sites by ACL may have undesired side effects
- if the site in question is hosted on a machine which also hosts other sites.
-
-
-
-
- Examples:
-
-
- Explicitly define the default behavior if no ACL and
- listen-address are set: localhost
- is OK. The absence of a dst_addr implies that
- all destination addresses are OK:
-
-
-
- permit-access localhost
-
+ Sometimes, there are even multiple consecutive redirects encoded in the
+ URL. These redirections via scripts make your web browsing more traceable,
+ since the server from which you follow such a link can see where you go
+ to. Apart from that, valuable bandwidth and time are wasted while your
+ browser asks the server for one redirect after the other. Plus, it feeds
+ the advertisers.
- Allow any host on the same class C subnet as www.privoxy.org access to
- nothing but www.example.com:
+ This feature is currently not very smart and is scheduled for improvement.
+ If it is enabled by default, you will have to create some exceptions to
+ this action. It can lead to failures in several ways:
-
- permit-access www.privoxy.org/24 www.example.com/32
-
+ Not every URL with other URLs as parameters is evil.
+ Some sites offer a real service that requires this information to work.
+ For example, a validation service needs to know which document to validate.
+ fast-redirects assumes that every URL parameter that
+ looks like another URL is a redirection target, and will always redirect to
+ the last one. Most of the time the assumption is correct, but if it isn't,
+ the user gets redirected anyway.
- Allow access from any host on the 26-bit subnet 192.168.45.64 to anywhere,
- with the exception that 192.168.45.73 may not access www.dirty-stuff.example.com:
+ Another failure occurs if the URL contains other parameters after the URL parameter.
+ The URL:
+ http://www.example.org/?redirect=http%3a//www.example.net/&foo=bar
.
+ contains the redirection URL http://www.example.net/
,
+ followed by another parameter. fast-redirects doesn't know that
+ and will cause a redirect to http://www.example.net/&foo=bar
.
+ Depending on the target server configuration, the parameter will be silently ignored
+ or lead to a page not found
error. It is possible to fix these redirected
+ requests with filter-client-headers
+ but it requires a little effort.
-
- permit-access 192.168.45.64/26
- deny-access 192.168.45.73 www.dirty-stuff.example.com
-
+ To detect a redirection URL, fast-redirects only
+ looks for the string http://
, either in plain text
+ (invalid but often used) or encoded as http%3a//
.
+ Some sites use their own URL encoding scheme, encrypt the address
+ of the target server or replace it with a database id. In these cases
+ fast-redirects is fooled and the request reaches the
+ redirection server where it probably gets logged.
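The detection described in these notes can be sketched roughly as follows. This is only an illustration in Python, not Privoxy's actual implementation (which is written in C); the function name `find_redirect_target` and the `decode` flag are hypothetical and stand in loosely for the `simple-check` and `check-decoded-url` parameters.

```python
from typing import Optional
from urllib.parse import unquote


def find_redirect_target(url: str, decode: bool = False) -> Optional[str]:
    """Return the last embedded "http://" URL found in `url`, or None.

    Hypothetical sketch: with decode=False only a literal "http://" is
    searched for; with decode=True the URL is percent-decoded first, so
    "http%3a//" is found as well.
    """
    candidate = unquote(url) if decode else url
    # rfind picks the LAST occurrence, mirroring "will always redirect
    # to the last one". Index 0 is the scheme of the outer URL itself.
    idx = candidate.rfind("http://")
    if idx <= 0:
        return None
    # Everything after the match is kept, including unrelated trailing
    # parameters; this reproduces the "&foo=bar" failure described above.
    return candidate[idx:]
```

Called with `decode=True` on the second example URL from the notes, the sketch returns `http://www.example.net/&foo=bar`, which shows why trailing parameters end up in the redirect target.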
-
-
-
-buffer-limit
-
-
- Specifies:
-
-
- Maximum size of the buffer for content filtering.
-
-
-
-
- Type of value:
-
- Size in Kbytes
-
-
-
- Default value:
-
- 4096
-
-
-
- Effect if unset:
-
-
- Use a 4MB (4096 KB) limit.
-
-
-
- Notes:
+ Example usage:
-
- For content filtering, i.e. the +filter and
- +deanimate-gif actions, it is necessary that
- Privoxy buffers the entire document body.
- This can be potentially dangerous, since a server could just keep sending
- data indefinitely and wait for your RAM to exhaust -- with nasty consequences.
- Hence this option.
-
-
- When a document buffer size reaches the buffer-limit , it is
- flushed to the client unfiltered and no further attempt to
- filter the rest of the document is made. Remember that there may be multiple threads
- running, which might require up to buffer-limit Kbytes
- each , unless you have enabled single-threaded
- above.
-
+
+
+ { +fast-redirects{simple-check} }
+ .example.com
+
+ { +fast-redirects{check-decoded-url} }
+ another.example.com/testing
+
+
-
-
-
-
+
+filter
-
-Forwarding
-
-
- This feature allows routing of HTTP requests through a chain of
- multiple proxies.
- It can be used to better protect privacy and confidentiality when
- accessing specific domains by routing requests to those domains
- through an anonymous public proxy (see e.g. http://www.multiproxy.org/anon_list.htm )
- Or to use a caching proxy to speed up browsing. Or chaining to a parent
- proxy may be necessary because the machine that Privoxy
- runs on has no direct Internet access.
-
-
-
- Also specified here are SOCKS proxies. Privoxy
- supports the SOCKS 4 and SOCKS 4A protocols.
-
-
-forward
- Specifies:
+ Typical use:
-
- To which parent HTTP proxy specific requests should be routed.
-
+ Get rid of HTML and JavaScript annoyances, banner advertisements (by size),
+ do fun text replacements, add personalized effects, etc.
+
- Type of value:
+ Effect:
- target_domain [:port ]
- http_parent [/port ]
-
-
- Where target_domain is a domain name pattern (see the
- chapter on domain matching in the default.action file),
- http_parent is the address of the parent HTTP proxy
- as an IP addresses in dotted decimal notation or as a valid DNS name (or .
to denote
- no forwarding
, and the optional
- port parameters are TCP ports, i.e. integer
- values from 1 to 64535
+ All files of text-based type, most notably HTML and
+ JavaScript, to which this action applies, can be filtered on-the-fly
+ through the specified regular expression based substitutions. (Note: as of
+ version 3.0.3 plain text documents are exempted from filtering, because
+ web servers often use the text/plain MIME type for all
+ files whose type they don't know.) By default, filtering works only on the
+ raw document content itself (that which can be seen with View
+ Source ),
+ not the headers.
+
- Default value:
+ Type:
+
- Unset
+ Parameterized.
+
- Effect if unset:
+ Parameter:
- Don't use parent HTTP proxies.
+ The name of a filter, as defined in the filter file.
+ Filters can be defined in one or more files as defined by the
+ filterfile
+ option in the config file.
+ default.filter is the collection of filters
+ supplied by the developers. Locally defined filters should go
+ in their own file, such as user.filter .
+
+ When used in its negative form,
+ and without parameters, all filtering is completely disabled.
+
+
Notes:
- If http_parent is .
, then requests are not
- forwarded to another HTTP proxy but are made directly to the web servers.
+ For your convenience, there are a number of pre-defined filters available
+ in the distribution filter file that you can use. See the examples below for
+ a list.
- Multiple lines are OK, they are checked in sequence, and the last match wins.
+ Filtering requires buffering the page content, which may appear to
+ slow down page rendering since nothing is displayed until all content has
+ passed the filters. (It does not really take longer, but seems that way
+ since the page is not incrementally displayed.) This effect will be more
+ noticeable on slower connections.
-
-
-
- Examples:
-
- Everything goes to an example anonymizing proxy, except SSL on port 443 (which it doesn't handle):
+ Rolling your own
+ filters requires a knowledge of
+ Regular
+ Expressions
and
+ HTML
.
+ This is a very powerful feature, and potentially very intrusive.
+ Filters should be used with caution, and where an equivalent
+ action
is not available.
-
- forward .* anon-proxy.example.org:8080
- forward :443 .
-
+ The amount of data that can be filtered is limited to the
+ buffer-limit
+ option in the main config file. The
+ default is 4096 KB (4 MB). Once this limit is exceeded, the buffered
+ data, and all pending data, is passed through unfiltered.
- Everything goes to our example ISP's caching proxy, except for requests
- to that ISP's sites:
+ Inappropriate MIME types, such as zipped files, are not filtered at all.
+ (Again, only text-based types except plain text). Encrypted SSL data
+ (from HTTPS servers) cannot be filtered either, since this would violate
+ the integrity of the secure transaction. In some situations it might
+ be necessary to protect certain text, like source code, from filtering
+ by defining appropriate -filter exceptions.
-
- forward .*. caching-proxy.example-isp.net:8000
- forward .example-isp.net .
-
+ At this time, Privoxy cannot uncompress compressed
+ documents. If you want filtering to work on all documents, even those that
+ would normally be sent compressed, you must use the
+ prevent-compression
+ action in conjunction with filter .
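The buffering behaviour described in the notes above (filter the whole document, or pass everything through unfiltered once the limit is exceeded) can be pictured with the following minimal sketch. It is a Python illustration under stated assumptions, not Privoxy's code: `filter_stream`, `apply_filters` and the byte-based `buffer_limit` argument are hypothetical names chosen for this example.

```python
from typing import Callable, Iterable, Iterator


def filter_stream(
    chunks: Iterable[bytes],
    apply_filters: Callable[[bytes], bytes],
    buffer_limit: int = 4096 * 1024,  # assumed to mirror the 4096 KB default
) -> Iterator[bytes]:
    """Buffer a document body and filter it, unless it grows too large."""
    buffered = bytearray()
    stream = iter(chunks)
    for chunk in stream:
        buffered.extend(chunk)
        if len(buffered) > buffer_limit:
            # Limit exceeded: flush what was buffered, then hand on all
            # pending data untouched. Nothing is filtered in this case.
            yield bytes(buffered)
            yield from stream
            return
    # The whole body fit into the buffer, so it is filtered in one piece.
    yield apply_filters(bytes(buffered))
```

Because nothing is emitted until the body is complete (or the limit is hit), the sketch also shows why filtered pages are not displayed incrementally.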
-
-
-
-
-
-
-forward-socks4 and forward-socks4a
-
-
-
-
-
- Specifies:
-
-
- Through which SOCKS proxy (and to which parent HTTP proxy) specific requests should be routed.
-
-
-
-
- Type of value:
-
-
- target_domain [:port ]
- socks_proxy [/port ]
- http_parent [/port ]
-
-
- Where target_domain is a domain name pattern (see the
- chapter on domain matching in the default.action file),
- http_parent and socks_proxy
- are IP addresses in dotted decimal notation or valid DNS names (http_parent
- may be .
to denote no HTTP forwarding
), and the optional
- port parameters are TCP ports, i.e. integer values from 1 to 64535
-
-
-
-
- Default value:
-
- Unset
-
-
-
- Effect if unset:
-
-
- Don't use SOCKS proxies.
-
-
-
-
- Notes:
-
-
- Multiple lines are OK, they are checked in sequence, and the last match wins.
-
-
- The difference between forward-socks4 and forward-socks4a
- is that in the SOCKS 4A protocol, the DNS resolution of the target hostname happens on the SOCKS
- server, while in SOCKS 4 it happens locally.
-
-
- If http_parent is .
, then requests are not
- forwarded to another HTTP proxy but are made (HTTP-wise) directly to the web servers, albeit through
- a SOCKS proxy.
-
-
-
-
- Examples:
-
-
- From the company example.com, direct connections are made to all
- internal
domains, but everything outbound goes through
- their ISP's proxy by way of example.com's corporate SOCKS 4A gateway to
- the Internet.
-
-
-
- forward-socks4a .*. socks-gw.example.com:1080 www-cache.example-isp.net:8080
- forward .example.com .
-
-
-
- A rule that uses a SOCKS 4 gateway for all destinations but no HTTP parent looks like this:
-
-
-
- forward-socks4 .*. socks-gw.example.com:1080 .
-
-
-
-
-
-
-
-Advanced Forwarding Examples
-
-
- If you have links to multiple ISPs that provide various special content
- only to their subscribers, you can configure multiple Privoxies
- which have connections to the respective ISPs to act as forwarders to each other, so that
- your users can see the internal content of all ISPs.
-
-
-
- Assume that host-a has a PPP connection to isp-a.net. And host-b has a PPP connection to
- isp-b.net. Both run Privoxy . Their forwarding
- configuration can look like this:
-
-
-
- host-a:
-
-
-
-
- forward .*. .
- forward .isp-b.net host-b:8118
-
-
-
-
- host-b:
-
-
-
-
- forward .*. .
- forward .isp-a.net host-a:8118
-
-
-
-
- Now, your users can set their browser's proxy to use either
- host-a or host-b and be able to browse the internal content
- of both isp-a and isp-b.
-
-
-
- If you intend to chain Privoxy and
- squid locally, then chain as
- browser -> squid -> privoxy is the recommended way.
-
-
-
- Assuming that Privoxy and squid
- run on the same box, your squid configuration could then look like this:
-
-
-
-
- # Define Privoxy as parent proxy (without ICP)
- cache_peer 127.0.0.1 parent 8118 7 no-query
-
- # Define ACL for protocol FTP
- acl ftp proto FTP
-
- # Do not forward FTP requests to Privoxy
- always_direct allow ftp
-
- # Forward all the rest to Privoxy
- never_direct allow all
-
-
-
- You would then need to change your browser's proxy settings to squid 's address and port.
- Squid normally uses port 3128. If unsure consult http_port in squid.conf .
-
-
-
-
-
-
-
-
-
-
-
-
-Windows GUI Options
-
- Privoxy has a number of options specific to the
- Windows GUI interface:
-
-
-
-
- If activity-animation
is set to 1, the
- Privoxy icon will animate when
- Privoxy
is active. To turn off, set to 0.
-
-
-
-
-
-
- activity-animation 1
-
-
-
-
-
-
-
- If log-messages
is set to 1,
- Privoxy will log messages to the console
- window:
-
-
-
-
-
-
- log-messages 1
-
-
-
-
-
-
-
- If log-buffer-size
is set to 1, the size of the log buffer,
- i.e. the amount of memory used for the log messages displayed in the
- console window, will be limited to log-max-lines
(see below).
-
-
-
- Warning: Setting this to 0 will result in the buffer to grow infinitely and
- eat up all your memory!
-
-
-
-
-
-
- log-buffer-size 1
-
-
-
-
-
-
-
- log-max-lines is the maximum number of lines held
- in the log buffer. See above.
-
-
-
-
-
-
- log-max-lines 200
-
-
-
-
-
-
-
- If log-highlight-messages
is set to 1,
- Privoxy will highlight portions of the log
- messages with a bold-faced font:
-
-
-
-
-
-
- log-highlight-messages 1
-
-
-
-
-
-
-
- The font used in the console window:
-
-
-
-
-
-
- log-font-name Comic Sans MS
-
-
-
-
-
-
-
- Font size used in the console window:
-
-
-
-
-
-
- log-font-size 8
-
-
-
-
-
-
-
- show-on-task-bar
controls whether or not
- Privoxy will appear as a button on the Task bar
- when minimized:
-
-
-
-
-
-
- show-on-task-bar 0
-
-
-
-
-
-
-
- If close-button-minimizes
is set to 1, the Windows close
- button will minimize Privoxy instead of closing
- the program (close with the exit option on the File menu).
-
-
-
-
-
-
- close-button-minimizes 1
-
-
-
-
-
-
-
- The hide-console
option is specific to the MS-Win console
- version of Privoxy . If this option is used,
- Privoxy will disconnect from and hide the
- command console.
-
-
-
-
-
-
- #hide-console
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-Actions Files
-
-
- The actions files are used to define what actions
- Privoxy takes for which URLs, and thus determine
- how ad images, cookies and various other aspects of HTTP content and
- transactions are handled, and on which sites (or even parts thereof). There
- are three such files included with Privoxy (as of
- version 2.9.15), with differing purposes:
-
-
-
-
-
-
- default.action - is the primary action file
- that sets the initial values for all actions. It is intended to
- provide a base level of functionality for
- Privoxy's array of features. So it is
- a set of broad rules that should work reasonably well for users everywhere.
- This is the file that the developers are keeping updated, and making
- available to users.
-
-
-
-
- user.action - is intended to be for local site
- preferences and exceptions. As an example, if your ISP or your bank
- has specific requirements, and need special handling, this kind of
- thing should go here. This file will not be upgraded.
-
-
-
-
- standard.action - is used by the web based editor,
- to set various pre-defined sets of rules for the default actions section
- in default.action . These have increasing levels of
- aggressiveness and have no influence on your browsing unless
- you select them explicitly in the editor . It is not recommend
- to edit this file.
-
-
-
-
-
-
- The list of actions files to be used are defined in the main configuration
- file, and are processed in the order they are defined. The content of these
- can all be viewed and edited from http://config.privoxy.org/show-status .
-
-
-
- An actions file typically has multiple sections. If you want to use
- aliases
in an actions file, you have to place the (optional)
- alias section at the top of that file.
- Then comes the default set of rules which will apply universally to all
- sites and pages (be very careful with using such a
- universal set in user.action or any other actions file after
- default.action , because it will override the result
- from consulting any previous file). And then below that,
- exceptions to the defined universal policies. You can regard
- user.action as an appendix to default.action ,
- with the advantage that is a separate file, which makes preserving your
- personal settings across Privoxy upgrades easier.
-
-
-
- Actions can be used to block anything you want, including ads, banners, or
- just some obnoxious URL that you would rather not see. Cookies can be accepted
- or rejected, or accepted only during the current browser session (i.e. not
- written to disk), content can be modified, JavaScripts tamed, user-tracking
- fooled, and much more. See below for a complete list
- of actions.
-
-
-
-
-Finding the Right Mix
-
- Note that some actions, like cookie suppression
- or script disabling, may render some sites unusable that rely on these
- techniques to work properly. Finding the right mix of actions is not always easy and
- certainly a matter of personal taste. In general, it can be said that the more
- aggressive
your default settings (in the top section of the
- actions file) are, the more exceptions for trusted
sites you
- will have to make later. If, for example, you want to kill popup windows per
- default, you'll have to make exceptions from that rule for sites that you
- regularly use and that require popups for actually useful content, like maybe
- your bank, favorite shop, or newspaper.
-
-
-
- We have tried to provide you with reasonable rules to start from in the
- distribution actions files. But there is no general rule of thumb on these
- things. There just are too many variables, and sites are constantly changing.
- Sooner or later you will want to change the rules (and read this chapter again :).
-
-
-
-
-
-How to Edit
-
- The easiest way to edit the actions files is with a browser by
- using our browser-based editor, which can be reached from http://config.privoxy.org/show-status .
- The editor allows both fine-grained control over every single feature on a
- per-URL basis, and easy choosing from wholesale sets of defaults like
- Cautious
, Medium
or Advanced
.
-
-
-
- If you prefer plain text editing to GUIs, you can of course also directly edit the
- the actions files. Look at default.action which is richly
- commented.
-
-
-
-
-
-How Actions are Applied to URLs
-
- Actions files are divided into sections. There are special sections,
- like the alias
sections which will
- be discussed later. For now let's concentrate on regular sections: They have a
- heading line (often split up to multiple lines for readability) which consist
- of a list of actions, separated by whitespace and enclosed in curly braces.
- Below that, there is a list of URL patterns, each on a separate line.
-
-
-
- To determine which actions apply to a request, the URL of the request is
- compared to all patterns in each action file file. Every time it matches, the list of
- applicable actions for the URL is incrementally updated, using the heading
- of the section in which the pattern is located. If multiple matches for
- the same URL set the same action differently, the last match wins. If not,
- the effects are aggregated. E.g. a URL might match a regular section with
- a heading line of {
- +handle-as-image } ,
- then later another one with just {
- +block } , resulting
- in both actions to apply.
-
-
-
- You can trace this process for any given URL by visiting http://config.privoxy.org/show-url-info .
-
-
-
- More detail on this is provided in the Appendix,
- Anatomy of an Action.
-
-
-
-
-
-Patterns
-
- Generally, a pattern has the form <domain>/<path> ,
- where both the <domain> and <path>
- are optional. (This is why the pattern / matches all URLs).
-
-
-
-
- www.example.com/
-
-
- is a domain-only pattern and will match any request to www.example.com ,
- regardless of which document on that server is requested.
-
-
-
-
- www.example.com
-
-
- means exactly the same. For domain-only patterns, the trailing / may
- be omitted.
-
-
-
-
- www.example.com/index.html
-
-
- matches only the single document /index.html
- on www.example.com .
-
-
-
-
- /index.html
-
-
- matches the document /index.html , regardless of the domain,
- i.e. on any web server.
-
-
-
-
- index.html
-
-
- matches nothing, since it would be interpreted as a domain name and
- there is no top-level domain called .html .
-
-
-
-
-
-
-
-The Domain Pattern
-
-
- The matching of the domain part offers some flexible options: if the
- domain starts or ends with a dot, it becomes unanchored at that end.
- For example:
-
-
-
-
- .example.com
-
-
- matches any domain that ENDS in
- .example.com
-
-
-
-
- www.
-
-
- matches any domain that STARTS with
- www.
-
-
-
-
- .example.
-
-
- matches any domain that CONTAINS .example.
- (Correctly speaking: It matches any FQDN that contains example as a domain.)
-
-
-
-
-
-
- Additionally, there are wild-cards that you can use in the domain names
- themselves. They work pretty similar to shell wild-cards: *
- stands for zero or more arbitrary characters, ?
stands for
- any single character, you can define character classes in square
- brackets and all of that can be freely mixed:
-
-
-
-
- ad*.example.com
-
-
- matches adserver.example.com
,
- ads.example.com
, etc but not sfads.example.com
-
-
-
-
- *ad*.example.com
-
-
- matches all of the above, and then some.
-
-
-
-
- .?pix.com
-
-
- matches www.ipix.com ,
- pictures.epix.com , a.b.c.d.e.upix.com etc.
-
-
-
-
- www[1-9a-ez].example.c*
-
-
- matches www1.example.com ,
- www4.example.cc , wwwd.example.cy ,
- wwwz.example.com etc., but not
- wwww.example.com .
-
-
-
-
-
-
-
-
-
-
-
-The Path Pattern
-
-
- Privoxy uses Perl compatible regular expressions
- (through the PCRE library) for
- matching the path.
-
-
-
- There is an Appendix with a brief quick-start into regular
- expressions, and full (very technical) documentation on PCRE regex syntax is available on-line
- at http://www.pcre.org/man.txt .
- You might also find the Perl man page on regular expressions (man perlre )
- useful, which is available on-line at http://www.perldoc.com/perl5.6/pod/perlre.html .
-
-
-
- Note that the path pattern is automatically left-anchored at the /
,
- i.e. it matches as if it would start with a ^
(regular expression speak
- for the beginning of a line).
-
-
-
- Please also note that matching in the path is CASE INSENSITIVE
- by default, but you can switch to case sensitive at any point in the pattern by using the
- (?-i)
switch: www.example.com/(?-i)PaTtErN.* will match
- only documents whose path starts with PaTtErN in
- exactly this capitalization.
-
-
-
-
-
-
-
-
-
-
-
-Actions
-
- All actions are disabled by default, until they are explicitly enabled
- somewhere in an actions file. Actions are turned on if preceded with a
- +
, and turned off if preceded with a -
. So a
- +action means do that action
, e.g.
- +block means please block URLs that match the
- following patterns
, and -block means don't
- block URLs that match the following patterns, even if +block
- previously applied.
-
-
-
-
- Again, actions are invoked by placing them on a line, enclosed in curly braces and
- separated by whitespace, like in
- {+some-action -some-other-action{some-parameter}} ,
- followed by a list of URL patterns, one per line, to which they apply.
- Together, the actions line and the following pattern lines make up a section
- of the actions file.
-
-
-
- There are three classes of actions:
-
-
-
-
-
-
- Boolean, i.e the action can only be enabled
or
- disabled
. Syntax:
-
-
-
- +name # enable action name
- -name # disable action name
-
-
- Example: +block
-
-
-
-
-
-
- Parameterized, where some value is required in order to enable this type of action.
- Syntax:
-
-
-
- +name {param } # enable action and set parameter to param ,
- # overwriting parameter from previous match if necessary
- -name # disable action. The parameter can be omitted
-
-
- Note that if the URL matches multiple positive forms of a parameterized action,
- the last match wins, i.e. the params from earlier matches are simply ignored.
-
-
- Example: +hide-user-agent{ Mozilla 1.0 }
-
-
-
-
-
- Multi-value. These look exactly like parameterized actions,
- but they behave differently: If the action applies multiple times to the
- same URL, but with different parameters, all the parameters
- from all matches are remembered. This is used for actions
- that can be executed for the same request repeatedly, like adding multiple
- headers, or filtering through multiple filters. Syntax:
-
-
-
- +name {param } # enable action and add param to the list of parameters
- -name {param } # remove the parameter param from the list of parameters
- # If it was the last one left, disable the action.
- -name # disable this action completely and remove all parameters from the list
-
-
- Examples: +add-header{X-Fun-Header: Some text} and
- +filter{html-annoyances}
-
-
-
-
-
-
-
- If nothing is specified in any actions file, no actions
are
- taken. So in this case Privoxy would just be a
- normal, non-blocking, non-anonymizing proxy. You must specifically enable the
- privacy and blocking features you need (although the provided default actions
- files will give a good starting point).
-
-
-
- Later defined actions always over-ride earlier ones. So exceptions
- to any rules you make, should come in the latter part of the file (or
- in a file that is processed later when using multiple actions files). For
- multi-valued actions, the actions are applied in the order they are specified.
- Actions files are processed in the order they are defined in
- config (the default installation has three actions
- files). It also quite possible for any given URL pattern to match more than
- one pattern and thus more than one set of actions!
-
-
-
-
- The list of valid Privoxy actions are:
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+
Actions Files Tutorial
@@ -5210,7 +5764,7 @@ Every config file should start with a short comment stating its purpose:
- # Sample default.action file <developers@privoxy.org>
+ # Sample default.action file <ijbswa-developers@lists.sourceforge.net>
@@ -5242,19 +5796,19 @@ that also explains why and how aliases are used:
##########################################################################
{{alias}}
-# These aliases just save typing later:
-# (Note that some already use other aliases!)
-#
-+crunch-all-cookies = +crunch-incoming-cookies +crunch-outgoing-cookies
--crunch-all-cookies = -crunch-incoming-cookies -crunch-outgoing-cookies
-block-as-image = +block +handle-as-image
-mercy-for-cookies = -crunch-all-cookies -session-cookies-only
+ # These aliases just save typing later:
+ # (Note that some already use other aliases!)
+ #
+ +crunch-all-cookies = + crunch-incoming-cookies + crunch-outgoing-cookies
+ -crunch-all-cookies = - crunch-incoming-cookies - crunch-outgoing-cookies
+ +block-as-image = +block +handle-as-image
+ mercy-for-cookies = -crunch-all-cookies - session-cookies-only - filter{content-cookies}
-# These aliases define combinations of actions
-# that are useful for certain types of sites:
-#
-fragile = -block -crunch-all-cookies -filter -fast-redirects -hide-referer -kill-popups
-shop = mercy-for-cookies -filter{popups} -kill-popups
+ # These aliases define combinations of actions
+ # that are useful for certain types of sites:
+ #
+ fragile = - block - filter -crunch-all-cookies - fast-redirects - hide-referrer - kill-popups
+ shop = -crunch-all-cookies - filter{all-popups} - kill-popups
@@ -5267,7 +5821,7 @@ shop = mercy-for-cookies -filter{popups} -kill-popups
The first regular section is probably the most important. It has only
one pattern, /
, but this pattern
- matches all URLs.. Therefore, the
+ matches all URLs. Therefore, the
set of actions used in this default
section will
be applied to all requests as a start . It can be partly or
wholly overridden by later matches further down this file, or in user.action,
@@ -5278,7 +5832,7 @@ shop = mercy-for-cookies -filter{popups} -kill-popups
Again, at the start of matching, all actions are disabled, so there is
no real need to disable any actions here, but we will do that nonetheless,
- to have a complete listing for your reference. (Remember: A +
+ to have a complete listing for your reference. (Remember: a +
preceding the action name enables the action, a -
disables!).
Also note how this long line has been made more readable by splitting it into
multiple lines with line continuation.
@@ -5292,34 +5846,64 @@ shop = mercy-for-cookies -filter{popups} -kill-popups
{ \
- add-header \
- block \
+ - content-type-overwrite \
+ - crunch-client-header \
+ - crunch-if-none-match \
- crunch-incoming-cookies \
+ - crunch-server-header \
- crunch-outgoing-cookies \
+ deanimate-gifs \
- downgrade-http-version \
- + fast-redirects \
+ - fast-redirects{check-decoded-url} \
+ - filter{js-annoyances} \
+ - filter{js-events} \
+ filter{html-annoyances} \
- + filter{js-annoyances} \
- filter{content-cookies} \
- + filter{popups} \
+ + filter{refresh-tags} \
+ - filter{unsolicited-popups} \
+ - filter{all-popups} \
+ - filter{img-reorder} \
+ - filter{banners-by-size} \
+ - filter{banners-by-link} \
+ filter{webbugs} \
- - filter{refresh-tags} \
- - filter{fun} \
- + filter{nimda} \
- + filter{banners-by-size} \
+ - filter{tiny-textforms} \
+ - filter{jumping-windows} \
+ - filter{frameset-borders} \
+ - filter{demoronizer} \
- filter{shockwave-flash} \
+ - filter{quicktime-kioskmode} \
+ - filter{fun} \
- filter{crude-parental} \
+ + filter{ie-exploits} \
+ - filter-client-headers \
+ - filter-server-headers \
+ - filter-google \
+ - filter-yahoo \
+ - filter-msn \
+ - filter-blogspot \
+ - filter-xml-to-html \
+ - filter-html-to-xml \
+ - force-text-mode \
+ - handle-as-empty-document \
- handle-as-image \
+ - hide-accept-language \
+ - hide-content-disposition \
+ - hide-if-modified-since \
+ hide-forwarded-for-headers \
+ hide-from-header{block} \
+ hide-referrer{forge} \
- hide-user-agent \
+ - inspect-jpegs \
- kill-popups \
- limit-connect \
+ prevent-compression \
+ - overwrite-last-modified \
+ - redirect \
- send-vanilla-wafer \
- send-wafer \
+ session-cookies-only \
+ set-image-blocker{pattern} \
+ - treat-forbidden-connects-like-blocks \
}
/ # forward slash will match *all* potential URL patterns.
@@ -5331,8 +5915,6 @@ shop = mercy-for-cookies -filter{popups} -kill-popups
like not blocking (which is understandably the
default!) need exceptions, i.e. we need to specify explicitly what we
want to block in later sections.
- We will also want to make exceptions from our general pop-up-killing,
- and use our defined aliases for that.
@@ -5354,7 +5936,8 @@ shop = mercy-for-cookies -filter{popups} -kill-popups
#
{ fragile }
.office.microsoft.com # surprise, surprise!
-.windowsupdate.microsoft.com
+.windowsupdate.microsoft.com
+mail.google.com
@@ -5374,13 +5957,15 @@ shop = mercy-for-cookies -filter{popups} -kill-popups
.scan.co.uk
+
+
The fast-redirects
action, which we enabled per default above, breaks some sites. So disable
@@ -5445,7 +6032,7 @@ edit.*.yahoo.com
generate the banners, so it won't be visible from the URL that the
request is for an image. Hence we block them and
mark them as images in one go, with the help of our
- block-as-image alias defined above. (We could of
+ +block-as-image alias defined above. (We could of
course just as well use + block
+ handle-as-image here.)
Remember that the type of the replacement image is chosen by the
@@ -5459,20 +6046,19 @@ edit.*.yahoo.com
# Known ad generators:
#
-{ block-as-image }
+{ +block-as-image }
ar.atwola.com
.ad.doubleclick.net
.ad.*.doubleclick.net
.a.yimg.com/(?:(?!/i/).)*$
.a[0-9].yimg.com/(?:(?!/i/).)*$
bs*.gsanet.com
-bs*.einets.com
.qkimg.net
One of the most important jobs of Privoxy
- is to block banners. A huge bunch of them are already blocked
+ is to block banners. Many of these can be blocked
by the filter{banners-by-size}
action, which we enabled above, and which deletes the references to banner
images from the pages while they are loaded, so the browser doesn't request
@@ -5482,7 +6068,7 @@ bs*.einets.com
block action to them.
- First comes a bunch of generic patterns, which do most of the work, by
+ First come many generic patterns, which do most of the work, by
matching typical domain and path name components of banners. Then comes
a list of individual patterns for specific sites, which is omitted here
to keep the example short:
@@ -5510,7 +6096,7 @@ count*.
- You wouldn't believe how many advertisers actually call their banner
+ It's quite remarkable how many advertisers actually call their banner
servers ads.company .com, or call the directory
in which the banners are stored simply banners
. So the above
generic patterns are surprisingly effective.
@@ -5548,6 +6134,7 @@ count*.
{ - block }
adv[io]*. # (for advogato.org and advice.*)
adsl. # (has nothing to do with ads)
+adobe. # (has nothing to do with ads either)
ad[ud]*. # (adult.* and add.*)
.edu # (universities don't host banners (yet!))
.*loads. # (downloads, uploads etc)
@@ -5575,12 +6162,15 @@ www.ugu.com/sui/ugu/adv
# Don't filter code!
#
{ - filter }
-/.*cvs
+/(.*/)?cvs
+bugzilla.
+developer.
+wiki.
.sourceforge.net
- The actual default.action is of course more
+ The actual default.action is of course much more
comprehensive, but we hope this example made clear how it works.
@@ -5591,7 +6181,7 @@ www.ugu.com/sui/ugu/adv
So far we are painting with a broad brush by setting general policies,
which would be a reasonable starting point for many people. Now,
- you'd maybe want to be more specific and have customized rules that
+ you might want to be more specific and have customized rules that
are more suitable to your personal habits and preferences. These would
be for narrowly defined situations like your ISP or your bank, and should
be placed in user.action , which is parsed after all other
@@ -5624,82 +6214,140 @@ www.ugu.com/sui/ugu/adv
+# Aliases are local to the file they are defined in.
# (Re-)define aliases for this file:
#
{{alias}}
+#
+# These aliases just save typing later, and the alias names should
+# be self explanatory.
+#
++crunch-all-cookies = +crunch-incoming-cookies +crunch-outgoing-cookies
-crunch-all-cookies = -crunch-incoming-cookies -crunch-outgoing-cookies
-mercy-for-cookies = -crunch-all-cookies -session-cookies-only
-fragile = -block -crunch-all-cookies -filter -fast-redirects -hide-referer -kill-popups
-shop = mercy-for-cookies -filter{popups} -kill-popups
-allow-ads = -block -filter{banners-by-size} # (see below)
-
+ allow-all-cookies = -crunch-all-cookies -session-cookies-only
+ allow-popups = -filter{all-popups} -kill-popups
++block-as-image = +block +handle-as-image
+-block-as-image = -block
+
+# These aliases define combinations of actions that are useful for
+# certain types of sites:
+#
+fragile = -block -crunch-all-cookies -filter -fast-redirects -hide-referrer -kill-popups
+shop = -crunch-all-cookies allow-popups
+
+# Allow ads for selected useful free sites:
+#
+allow-ads = -block -filter{banners-by-size} -filter{banners-by-link}
+
+# Alias for specific file types that are text, but might have conflicting
+# MIME types. We want the browser to force these to be text documents.
+handle-as-text = -filter +content-type-overwrite{text/plain} +force-text-mode -hide-content-disposition
+
Say you have accounts on some sites that you visit regularly, and
you don't want to have to log in manually each time. So you'd like
to allow persistent cookies for these sites. The
- mercy-for-cookies alias defined above does exactly
- that, i.e. it disables crunching of cookies in any direction, and
- processing of cookies to make them temporary.
+ allow-all-cookies alias defined above does exactly
+ that, i.e. it disables crunching of cookies in any direction, and the
+ processing of cookies to make them only temporary.
+
+
+
+
+{ allow-all-cookies }
+ sourceforge.net
+ .yahoo.com
+ .msdn.microsoft.com
+ .redhat.com
+
+
+
+ Your bank is allergic to some filter, but you don't know which, so you disable them all:
-{ mercy-for-cookies }
-sunsolve.sun.com
-slashdot.org
-.yahoo.com
-.msdn.microsoft.com
-.redhat.com
+{ - filter }
+ .your-home-banking-site.com
- Your bank needs popups and is allergic to some filter, but you don't
- know which, so you disable them all:
+ Some file types you may not want to filter for various reasons:
-{ - filter - kill-popups }
-.your-home-banking-site.com
+# Technical documentation is likely to contain strings that might
+# erroneously get altered by the JavaScript-oriented filters:
+#
+.tldp.org
+/(.*/)?selfhtml/
+
+# And this stupid host sends streaming video with a wrong MIME type,
+# so that Privoxy thinks it is getting HTML and starts filtering:
+#
+stupid-server.example.com/
- While browsing the web with Privoxy you
- noticed some ads that sneaked through, but you were too lazy to
- report them through our fine and easy feedback
- system, so you have added them here:
+ Example of a simple block action. Say you've
+ seen an ad on your favourite page on example.com that you want to get rid of.
+ You have right-clicked the image, selected copy image location
+ and pasted the URL below while removing the leading http://, into a
+ { +block } section. Note that { +handle-as-image
+ } need not be specified, since all URLs ending in
+ .gif will be tagged as images by the general rules as set
+ in default.action anyway:
{ + block }
-www.a-popular-site.com/some/unobvious/path
-another.popular.site.net/more/junk/here/
+ www.example.com/nasty-ads/sponsor.gif
+ another.popular.site.net/more/junk/here/
- Note that, assuming the banners in the above example have regular image
- extensions (most do),
- + handle-as-image
- need not be specified, since all URLs ending in these extensions will
- already have been tagged as images in the relevant section of
- default.action by now.
+ The URLs of dynamically generated banners, especially from large banner
+ farms, often don't use the well-known image file name extensions, which
+ makes it impossible for Privoxy to guess
+ the file type just by looking at the URL.
+ You can use the +block-as-image alias defined above for
+ these cases.
+ Note that objects which match this rule but then turn out NOT to be an
+ image are typically rendered as a "broken image" icon by the
+ browser. Use cautiously.
- Then you noticed that the default configuration breaks Forbes Magazine,
+
+{ +block-as-image }
+ .doubleclick.net
+ .fastclick.net
+ /Realmedia/ads/
+ ar.atwola.com/
+
+
+
+ Now you noticed that the default configuration breaks Forbes Magazine,
but you were too lazy to find out which action is the culprit, and you
were again too lazy to give feedback, so
you just used the fragile alias on the site, and
- -- whoa! -- it worked:
+ -- whoa! -- it worked. The fragile
+ alias disables those actions that are most likely to break a site. It is
+ also useful for testing whether Privoxy is causing
+ the problem. We later find other regular sites
+ that misbehave, and add those to our personalized list of troublemakers:
{ fragile }
-.forbes.com
+ .forbes.com
+ webmail.example.com
+ .mybank.com
@@ -5712,7 +6360,7 @@ another.popular.site.net/more/junk/here/
{ + filter{fun} }
-/ # For ALL sites!
+ / # For ALL sites!
@@ -5724,7 +6372,7 @@ another.popular.site.net/more/junk/here/
- Finally, you might think about how your favourite free websites are
+ You might also worry about how your favourite free websites are
funded, and find that they rely on displaying banner advertisements
to survive. So you might want to specifically allow banners for those
sites that you feel provide value to you:
@@ -5733,17 +6381,47 @@ another.popular.site.net/more/junk/here/
{ allow-ads }
-.sourceforge.net
-.slashdot.org
-.osdn.net
+ .sourceforge.net
+ .slashdot.org
+ .osdn.net
Note that allow-ads has been aliased to
- - block
- - filter{banners-by-size}
- above.
+ - block ,
+ - filter{banners-by-size} , and
+ - filter{banners-by-link} above.
+
+
+
+ Invoke another alias here to force an override of the MIME type
+ application/x-sh, which would typically open a download
+ dialog. In my case, I want to look at the shell script, and then I can save
+ it should I choose to.
+
+
+
+
+{ handle-as-text }
+ /.*\.sh$
+
+
+
+ user.action is generally the best place to define
+ exceptions and additions to the default policies of
+ default.action . Some actions are safe to have their
+ default policies set here though. So let's set a default policy to have a
+ "blank" image as opposed to the checkerboard pattern for
+ ALL sites. "/" of course matches all URL
+ paths and patterns:
+
+
+
+
+{ + set-image-blocker{blank} }
+/ # ALL sites
+
@@ -5756,20 +6434,24 @@ another.popular.site.net/more/junk/here/
-The Filter File
-
-
- All text substitutions that can be invoked through the
- filter action
- must first be defined in the filter file, which is typically
- called default.filter and which can be
- selected through the
- filterfile config
- option.
+Filter Files
+
+
+ On-the-fly text substitutions that can be invoked through the
+ filter action need
+ to be defined in a "filter file". Once defined, they
+ can then be invoked as an "action". Multiple filter files can be
+ defined through the filterfile config directive. The filters
+ as supplied by the developers will be found in
+ default.filter . It is recommended that any locally
+ defined or modified filters go in a separately defined file such as
+ user.filter .
+
- Typical reasons for doing such substitutions are to eliminate
+ Typical reasons for doing these kinds of substitutions are to eliminate
common annoyances in HTML and JavaScript, such as pop-up windows,
exit consoles, crippled windows without navigation tools, the
infamous <BLINK> tag etc, to suppress images with certain
@@ -5778,11 +6460,16 @@ another.popular.site.net/more/junk/here/
- Filtering works on any text-based document type, including plain
- text, HTML, JavaScript, CSS etc. (all text/*
- MIME types). Substitutions are made at the source level, so if
- you want to roll your own
filters, you should be
- familiar with HTML syntax.
+ Filtering works on any text-based document type, including
+ HTML, JavaScript, CSS etc. (all text/*
+ MIME types, except text/plain ).
+ Substitutions are made at the source level, so if you want to "roll
+ your own" filters, you should first be familiar with HTML syntax,
+ and, of course, regular expressions. By default, filters are only applied
+ to the raw document content, but can be extended to the HTTP headers with
+ the supplemental actions:
+ filter-client-headers and
+ filter-server-headers.
@@ -5821,24 +6508,26 @@ another.popular.site.net/more/junk/here/
in a syntax that imitates Perl 's
s/// operator. If you are familiar with Perl, you
will find this to be quite intuitive, and may want to look at the
- PCRS man page
- for the subtle differences to Perl behaviour. Most notably, the non-standard
- option letter U is supported, which turns the default
- to ungreedy matching.
+ PCRS documentation for the subtle differences to Perl behaviour. Most
+ notably, the non-standard option letter U is supported,
+ which turns the default to ungreedy matching.
- If you are new to regular expressions, you might want to take a look at
+ If you are new to "Regular Expressions", you might want to take a look at
the Appendix on regular expressions, and
- see the Perl
+ see the Perl
manual for
- the
+ the
s/// operator's syntax and Perl-style regular
+ url="http://perldoc.perl.org/perlre.html">Perl-style regular
expressions in general.
The below examples might also help to get you started.
+
Filter File Tutorial
@@ -5972,90 +6661,526 @@ s|(<script.*)document\.referrer(.*</script>)|$1"Not Your Business!"$2|U
# The status bar is for displaying link targets, not pointless blahblah
#
-s/window\.status\s*=\s*['"].*?['"]/dUmMy=1/ig
+s/window\.status\s*=\s*(['"]).*?\1/dUmMy=1/ig
+
+
+
+ \s stands for whitespace characters (space, tab, newline,
+ carriage return, form feed), so that \s* means: "zero
+ or more whitespace". The ? in .*?
+ makes this matching of arbitrary text ungreedy. (Note that the U
+ option is not set). The ['"] construct means: "a single
+ or a double quote". Finally, \1 is
+ a back-reference to the first parenthesis just like $1 above,
+ with the difference that in the pattern, a backslash indicates
+ a back-reference, whereas in the substitute, it's the dollar.
+
+
+
+ So what does this job do? It replaces assignments of single- or double-quoted
+ strings to the window.status object with a dummy assignment
+ (using a variable name that is hopefully odd enough not to conflict with
+ real variables in scripts). Thus, it catches many cases where e.g. pointless
+ descriptions are displayed in the status bar instead of the link target when
+ you move your mouse over links.
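
The effect of the back-reference can be tried outside Privoxy. The following is only a sketch: it uses Python's re module as a stand-in for PCRS (an assumption, the two engines agree on the constructs used here), and the helper names are made up for illustration. It compares the old job, which accepted any quote as the closing one, with the new job, which requires the closing quote to match the opening one.

```python
import re

# Old job: the closing quote may be either kind, so an apostrophe
# inside a double-quoted string ends the match too early.
OLD_JOB = re.compile(r"""window\.status\s*=\s*['"].*?['"]""", re.IGNORECASE)

# New job: \1 refers back to whichever quote opened the string.
NEW_JOB = re.compile(r"""window\.status\s*=\s*(['"]).*?\1""", re.IGNORECASE)


def neutralize_status_old(text):
    """Apply the old substitution to every match (the 'g' option)."""
    return OLD_JOB.sub("dUmMy=1", text)


def neutralize_status_new(text):
    """Apply the new substitution to every match (the 'g' option)."""
    return NEW_JOB.sub("dUmMy=1", text)


if __name__ == "__main__":
    script = 'window.status = "it\'s here";'
    print(neutralize_status_old(script))  # leaves debris behind
    print(neutralize_status_new(script))  # replaces the whole assignment
```

With the sample input, the old pattern stops at the apostrophe and leaves part of the string literal in the page, while the new one consumes the complete assignment.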
+
+
+
+
+# Kill OnUnload popups. Yummy. Test: http://www.zdnet.com/zdsubs/yahoo/tree/yfs.html
+#
+s/(<body [^>]*)onunload(.*>)/$1never$2/iU
+
+
+
+ Including the
+ OnUnload
+ event binding in the HTML DOM was a CRIME.
+ When I close a browser window, I want it to close and die. Basta.
+ This job replaces the "onunload" attribute in
+ "<body>" tags with the dummy word never.
+ Note that the i option makes the pattern matching
+ case-insensitive. Also note that ungreedy matching alone doesn't always guarantee
+ a minimal match: In the first parenthesis, we had to use [^>]*
+ instead of .* to prevent the match from exceeding the
+ <body> tag if it doesn't contain "OnUnload", but the page's
+ content does.
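
To see why [^>]* is needed, here is a small sketch using Python's re module in place of PCRS (an assumption). Python has no U option, so the ungreedy quantifiers are written out with an explicit ?; the function names are hypothetical.

```python
import re

# Equivalent of the job with the U option: every quantifier is ungreedy.
# [^>]*? cannot step over the '>' that closes the <body> tag.
SAFE_JOB = re.compile(r"(<body [^>]*?)onunload(.*?>)", re.IGNORECASE)

# The earlier form used .* in the first group, which may run past the tag.
LOOSE_JOB = re.compile(r"(<body .*?)onunload(.*?>)", re.IGNORECASE)


def kill_onunload(html):
    """Rename the onunload attribute inside <body ...> tags only."""
    return SAFE_JOB.sub(r"\1never\2", html)


def kill_onunload_loose(html):
    """Same substitution with the unrestricted first group."""
    return LOOSE_JOB.sub(r"\1never\2", html)


if __name__ == "__main__":
    page = '<body class="x"><p>the onunload event</p>'
    print(kill_onunload(page))        # unchanged: the tag has no onunload
    print(kill_onunload_loose(page))  # page text gets altered by mistake
```

Even though the loose pattern is ungreedy, it still finds a match by extending beyond the tag, which is exactly the case the paragraph above warns about.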
+
+
+
+ The last example is from the fun department:
+
+
+
+
+FILTER: fun Fun text replacements
+
+# Spice the daily news:
+#
+s/microsoft(?!\.com)/MicroSuck/ig
+
+
+
+ Note the (?!\.com) part (a so-called negative lookahead)
+ in the job's pattern, which means: Don't match if the string
+ ".com" appears directly following "microsoft"
+ in the page. This prevents links to microsoft.com from being trashed, while
+ still replacing the word everywhere else.
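
The lookahead can be checked quickly with Python's re module, used here as a stand-in for PCRS (an assumption). The function name is made up for this sketch.

```python
import re

# (?!\.com) is a negative lookahead: it consumes nothing, but vetoes
# the match when ".com" follows immediately.
FUN_JOB = re.compile(r"microsoft(?!\.com)", re.IGNORECASE)


def spice_news(text):
    """Replace the word everywhere except in the host name."""
    return FUN_JOB.sub("MicroSuck", text)


if __name__ == "__main__":
    print(spice_news("Microsoft news at www.microsoft.com today"))
```

The host name survives because the lookahead rejects that occurrence, while the stand-alone word is replaced.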
+
+
+
+
+# Buzzword Bingo (example for extended regex syntax)
+#
+s* industry[ -]leading \
+| cutting[ -]edge \
+| customer[ -]focused \
+| market[ -]driven \
+| award[ -]winning # Comments are OK, too! \
+| high[ -]performance \
+| solutions[ -]based \
+| unmatched \
+| unparalleled \
+| unrivalled \
+*<font color="red"><b>BINGO!</b></font> \
+*igx
+
+
+
+ The x option in this job turns on extended syntax, and allows for
+ e.g. the liberal use of (non-interpreted!) whitespace for nicer formatting.
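
A shortened version of the bingo job can be reproduced with Python's re.VERBOSE flag, which plays the role of the x option (Python stands in for PCRS here, and the function name is hypothetical). As in the job above, whitespace in the pattern is ignored, but a space inside a character class such as [ -] still counts.

```python
import re

# re.VERBOSE ignores layout whitespace and allows comments in the pattern.
BUZZWORDS = re.compile(
    r"""
      industry[ -]leading
    | cutting[ -]edge
    | award[ -]winning      # comments are OK, too
    | unparalleled
    """,
    re.IGNORECASE | re.VERBOSE,
)

BINGO = '<font color="red"><b>BINGO!</b></font>'


def buzzword_bingo(text):
    """Replace every listed buzzword with the BINGO marker."""
    return BUZZWORDS.sub(BINGO, text)


if __name__ == "__main__":
    print(buzzword_bingo("Our Cutting-Edge, award winning product"))
```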
+
+
+
+ You get the idea?
+
+
+
+
+
+The Pre-defined Filters
+
+
+
+
+The distribution default.filter file contains a selection of
+pre-defined filters for your convenience:
-
- \s stands for whitespace characters (space, tab, newline,
- carriage return, form feed), so that \s* means: zero
- or more whitespace
. The ? in .*?
- makes this matching of arbitrary text ungreedy. (Note that the U
- option is not set). The ['"] construct means: a single
- or a double quote
.
-
+
+
+ js-annoyances
+
+
+ The purpose of this filter is to get rid of particularly annoying JavaScript abuse.
+ To that end, it
+
+
+
+ replaces JavaScript references to the browser's referrer information
+ with the string "Not Your Business!". This complements the hide-referrer action on the content level.
+
+
+
+
+ removes the bindings to the DOM's unload
+ event which we feel has no right to exist and is responsible for most "exit consoles", i.e.
+ nasty windows that pop up when you close another one.
+
+
+
+
+ removes code that causes new windows to be opened with undesired properties, such as being
+ full-screen, non-resizeable, without location, status or menu bar etc.
+
+
+
+
+
+ Use with caution. This is an aggressive filter, and can break sites that
+ rely heavily on JavaScript.
+
+
+
+
+
+ js-events
+
+
+ This is a very radical measure. It removes virtually all JavaScript event bindings, which
+ means that scripts can not react to user actions such as mouse movements or clicks, window
+ resizing etc, anymore. Use with caution!
+
+
+ We strongly discourage using this filter as a default since it breaks
+ many legitimate scripts. It is meant for use only on extra-nasty sites (should you really
+ need to go there).
+
+
+
+
+
+ html-annoyances
+
+
+ This filter will undo many common instances of HTML based abuse.
+
+
+ The BLINK and MARQUEE tags
+ are neutralized (yeah baby!), and browser windows will be created as
+ resizeable (as of course they should be!), and will have location,
+ scroll and menu bars -- even if specified otherwise.
+
+
+
+
+
+ content-cookies
+
+
+ Most cookies are set in the HTTP dialog, where they can be intercepted
+ by the
+ crunch-incoming-cookies
+ and crunch-outgoing-cookies
+ actions. But web sites increasingly make use of HTML meta tags and JavaScript
+ to sneak cookies to the browser on the content level.
+
+
+ This filter disables most HTML and JavaScript code that reads or sets
+ cookies. It cannot detect all clever uses of these types of code, so it
+ should not be relied on as an absolute fix. Use it wherever you would also
+ use the cookie crunch actions.
+
+
+
+
+
+ refresh-tags
+
+
+ Disable any refresh tags if the interval is greater than nine seconds (so
+ that redirections done via refresh tags are not destroyed). This is useful
+ for dial-on-demand setups, or for those who find this HTML feature
+ annoying.
+
+
+
+
+
+ unsolicited-popups
+
+
+ This filter attempts to prevent only "unsolicited" pop-up
+ windows from opening, yet still allow pop-up windows that the user
+ has explicitly chosen to open. It was added in version 3.0.1,
+ as an improvement over earlier such filters.
+
+
+ Technical note: The filter works by redefining the window.open JavaScript
+ function to a dummy function, PrivoxyWindowOpen() ,
+ during the loading and rendering phase of each HTML page access, and
+ restoring the function afterward.
+
+
+ This is recommended only for browsers that cannot perform this function
+ reliably themselves. And be aware that some sites require such windows
+ in order to function normally. Use with caution.
+
+
+
+
+
+ all-popups
+
+
+ Attempt to prevent all pop-up windows from opening.
+ Note this should be used with even more discretion than the above, since
+ it is more likely to break some sites that require pop-ups for normal
+ usage. Use with caution.
+
+
+
+
+
+ img-reorder
+
+
+ This is a helper filter that has no value if used alone. It makes the
+ banners-by-size and banners-by-link
+ (see below) filters more effective and should be enabled together with them.
+
+
+
+
+
+ banners-by-size
+
+
+ This filter removes image tags purely based on what size they are. Fortunately
+ for us, many ads and banner images tend to conform to certain standardized
+ sizes, which makes this filter quite effective for ad stripping purposes.
+
+
+ Occasionally this filter will cause false positives on images that are not ads,
+ but just happen to be of one of the standard banner sizes.
+
+
+ Recommended only for those who require extreme ad blocking. The default
+ block rules should catch 95+% of all ads without this filter enabled.
+
+
+
+
+
+ banners-by-link
+
+
+ This is an experimental filter that attempts to kill any banners if
+ their URLs seem to point to known or suspected click trackers. It is currently
+ not of much value and is not recommended for use by default.
+
+
+
+
+
+ webbugs
+
+
+ Webbugs are small, invisible images (technically 1x1 GIF images), that
+ are used to track users across websites, and collect information on them.
+ As an HTML page is loaded by the browser, an embedded image tag causes the
+ browser to contact a third-party site, disclosing the tracking information
+ through the requested URL and/or cookies for that third-party domain, without
+ the user ever becoming aware of the interaction with the third-party site.
+ HTML-ized spam also uses a similar technique to verify email addresses.
+
+
+ This filter removes the HTML code that loads such "webbugs".
+
+
+
+
+
+ tiny-textforms
+
+
+ A rather special-purpose filter that can be used to enlarge textareas (those
+ multi-line text boxes in web forms) and turn off hard word wrap in them.
+ It was written for the sourceforge.net tracker system where such boxes are
+ a nuisance, but it can be handy on other sites, too.
+
+
+ It is not recommended to use this filter as a default.
+
+
+
+
+
+ jumping-windows
+
+
+ Many consider windows that move, or resize themselves to be abusive. This filter
+ neutralizes the related JavaScript code. Note that some sites might not display
+ or behave as intended when using this filter. Use with caution.
+
+
+
+
+
+ frameset-borders
+
+
+ Some web designers seem to assume that everyone in the world will view their
+ web sites using the same browser brand and version, screen resolution etc,
+ because only that assumption could explain why they'd use static frame sizes,
+ yet prevent their frames from being resized by the user, should they be too
+ small to show their whole content.
+
+
+ This filter removes the related HTML code. It should only be applied to sites
+ which need it.
+
+
+
+
+
+ demoronizer
+
+
+ Many Microsoft products that generate HTML use non-standard extensions (read:
+ violations) of the ISO 8859-1 aka Latin-1 character set. This can cause those
+ HTML documents to display with errors on standard-compliant platforms.
+
+
+ This filter translates the MS-only characters into Latin-1 equivalents.
+ It is not necessary when using MS products, and will cause corruption of
+ all documents that use 8-bit character sets other than Latin-1. It's mostly
+ worthwhile for Europeans on non-MS platforms who sometimes see weird garbage
+ characters on some pages, or for user agents that don't correct for this on
+ the fly.
+
+
+
+
+
+
+ shockwave-flash
+
+
+ A filter for shockwave haters. As the name suggests, this filter strips code
+ out of web pages that is used to embed shockwave flash objects.
+
+
+
+
+
-
- So what does this job do? It replaces assignments of single- or double-quoted
- strings to the window.status
object with a dummy assignment
- (using a variable name that is hopefully odd enough not to conflict with
- real variables in scripts). Thus, it catches many cases where e.g. pointless
- descriptions are displayed in the status bar instead of the link target when
- you move your mouse over links.
-
+
+ quicktime-kioskmode
+
+
+ Change HTML code that embeds Quicktime objects so that kioskmode, which
+ prevents saving, is disabled.
+
+
+
-
-
-# Kill OnUnload popups. Yummy. Test: http://www.zdnet.com/zdsubs/yahoo/tree/yfs.html
-#
-s/(<body .*)onunload(.*>)/$1never$2/iU
-
+
+ fun
+
+
+ Text replacements for subversive browsing fun. Make fun of your favorite
+ Monopolist or play buzzword bingo.
+
+
+
-
- Including the
- OnUnload
- event binding in the HTML DOM was a CRIME .
- When I close a browser window, I want it to close and die. Basta.
- This job replaces the onunload
attribute in
- <body>
tags with the dummy word never .
- Note that the i option makes the pattern matching
- case-insensitive.
-
+
+ crude-parental
+
+
+ A demonstration-only filter that shows how Privoxy
+ can be used to delete web content on a keyword basis.
+
+
+
-
- The last example is from the fun department:
-
+
+ ie-exploits
+
+
+ An experimental collection of text replacements to disable malicious HTML and JavaScript
+ code that exploits known security holes in Internet Explorer.
+
+
+ Presently, it only protects against Nimda and a cross-site scripting bug, and
+ would need active maintenance to provide more substantial protection.
+
+
+
-
-
-FILTER: fun Fun text replacements
+
+ site-specifics
+
+
+ Some web sites have very specific problems, the cure for which doesn't apply
+ anywhere else, or could even cause damage on other sites.
+
+
+ This is a collection of such site-specific cures which should only be applied
+ to the sites they were intended for, which is what the supplied
+ default.action file does. Users shouldn't need to change
+ anything regarding this filter.
+
+
+
-# Spice the daily news:
-#
-s/microsoft(?!\.com)/MicroSuck/ig
-
+
+ google
+
+
+ A CSS based block for Google text ads. Also removes a width limitation
+ and the toolbar advertisement.
+
+
+
+
+
+ yahoo
+
+
+ Another CSS based block, this time for Yahoo text ads. Also removes
+ a width limitation.
+
+
+
-
- Note the (?!\.com) part (a so-called negative lookahead)
- in the job's pattern, which means: Don't match, if the string
- .com
appears directly following microsoft
- in the page. This prevents links to microsoft.com from being messed, while
- still replacing the word everywhere else.
-
+
+ msn
+
+
+ Another CSS based block, this time for MSN text ads. Also removes
+ tracking URLs, as well as a width limitation.
+
+
+
-
-
-# Buzzword Bingo (example for extended regex syntax)
-#
-s* industry[ -]leading \
-| cutting[ -]edge \
-| award[ -]winning # Comments are OK, too! \
-| high[ -]performance \
-| solutions[ -]based \
-| unmatched \
-| unparalleled \
-| unrivalled \
-*<font color="red"><b>BINGO!</b></font> \
-*igx
-
+
+ blogspot
+
+
+ Cleans up some Blogspot blogs. Read the fine print before using this one!
+
+
+ This filter also intentionally removes some navigation stuff and sets the
+ page width to 100%. As a result, some rounded corners
would
+ appear too early or not at all and as fixing this would require a browser
+ that understands background-size (CSS3), they are removed instead.
+
+
+
-
- The x option in this job turns on extended syntax, and allows for
- e.g. the liberal use of (non-interpreted!) whitespace for nicer formatting.
-
+
+ xml-to-html
+
+
+ Header filter to change the Content-Type from xml to html.
+
+
+
+
+ html-to-xml
+
+
+ Header filter to change the Content-Type from html to xml.
+
+
+
+
+
+
-
- You get the idea?
-
@@ -6066,7 +7191,7 @@ s* industry[ -]leading \
-Templates
+Privoxy's Template Files
All Privoxy built-in pages, i.e. error pages such as the
404 - No Such Domain
@@ -6076,12 +7201,12 @@ s* industry[ -]leading \
and all pages of its web-based
user interface , are generated from templates .
(Privoxy must be running for the above links to work as
- intended)
+ intended.)
These templates are stored in a subdirectory of the configuration
- directory called templates . On unixish platforms,
+ directory called templates . On Unixish platforms,
this is typically
/etc/privoxy/templates/ .
@@ -6108,7 +7233,7 @@ s* industry[ -]leading \
blocks of HTML code disappear when a specific symbol is set. We use this
for many purposes, one of them being to include the beta warning in all
our user interface (CGI) pages when Privoxy
- in in an alpha or beta development stage:
+ is in an alpha or beta development stage:
@@ -6165,7 +7290,7 @@ Requests
-Privoxy Copyright, License and History
+Privoxy Copyright, License and History
©right;
@@ -6220,7 +7345,11 @@ Requests
expressions in its actions
files and filter file,
through the PCRE and
+
+ PCRS libraries.
@@ -6300,7 +7429,7 @@ Requests
- [] - Characters enclosed in brackets will be matched if
+ [ ] - Characters enclosed in brackets will be matched if
any of the enclosed characters are encountered. For instance, [0-9]
matches any numeric digit (zero through nine). As an example, we can combine
this with +
 to match any digit one or more times: [0-9]+
.
@@ -6309,7 +7438,7 @@ Requests
- () - parentheses are used to group a sub-expression,
+ ( ) - parentheses are used to group a sub-expression,
or multiple sub-expressions.
@@ -6351,7 +7480,7 @@ Requests
- A now something a little more complex:
+ And now something a little more complex:
@@ -6389,7 +7518,7 @@ Requests
/.*/advert[0-9]+\.(gif|jpe?g) - Again
another path statement with forward slashes. Anything in the square brackets
- []
can be matched. This is using 0-9
as a
+ [ ]
can be matched. This is using 0-9
as a
 shorthand expression to mean any digit zero through nine. It is the same as
saying 0123456789
. So any digit matches. The +
means one or more of the preceding expression must be included. The preceding
@@ -6425,11 +7554,11 @@ Requests
More reading on Perl Compatible Regular expressions:
- http://www.perldoc.com/perl5.6/pod/perlre.html
+ http://perldoc.perl.org/perlre.html
- For information on regular expression based substititions and their applications
+ For information on regular expression based substitutions and their applications
in filters, please see the filter file tutorial
in this manual.
@@ -6440,7 +7569,7 @@ Requests
-Privoxy 's Internal Pages
+Privoxy's Internal Pages
Since Privoxy proxies each requested
@@ -6476,7 +7605,7 @@ Requests
There is a shortcut: http://p.p/ (But it
- doesn't provide a fallback to a real page, in case the request is not
+ doesn't provide a fall-back to a real page, in case the request is not
sent through Privoxy )
@@ -6608,19 +7737,24 @@ Requests
url="javascript:void(window.open('http://config.privoxy.org/toggle?mini=y','ijbstatus','width=250,height=2,resizable=yes,scrollbars=no,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());">Privoxy- View Status
-
+
+
+
+ Privoxy - Why?
+
+
Credit: The site which gave us the general idea for these bookmarklets is
- www.bookmarklets.com . They
+ www.bookmarklets.com . They
have more information about bookmarklets.
@@ -6651,7 +7785,7 @@ Requests
Privoxy traps any request for its own internal CGI
+ pages (e.g. http://p.p/ ) and sends the CGI page back to the browser.
+ pages (e.g http://p.p/ ) and sends the CGI page back to the browser.
@@ -6700,7 +7834,7 @@ Requests
First, the server headers are read and processed to determine, among other
things, the MIME type (document type) and encoding. The headers are then
- filtered as deterimed by the
+ filtered as determined by the
+crunch-incoming-cookies
,
+session-cookies-only
,
and +downgrade-http-version
@@ -6721,10 +7855,10 @@ Requests
linkend="DEANIMATE-GIFS">+deanimate-gifs
action applies (and the document type fits the action), the rest of the page is
read into memory (up to a configurable limit). Then the filter rules (from
- default.filter ) are processed against the buffered
- content. Filters are applied in the order they are specified in the
- default.filter file. Animated GIFs, if present, are
- reduced to either the first or last frame, depending on the action
+ default.filter and any other filter files) are
+ processed against the buffered content. Filters are applied in the order
+ they are specified in one of the filter files. Animated GIFs, if present,
+ are reduced to either the first or last frame, depending on the action
 setting. The entire page, which is now filtered, is then sent by
Privoxy back to your browser.
@@ -6738,7 +7872,7 @@ Requests
- As the browser receives the now (probably filtered) page content, it
+ As the browser receives the now (possibly filtered) page content, it
reads and then requests any URLs that may be embedded within the page
source, e.g. ad images, stylesheets, JavaScript, other HTML documents (e.g.
frames), sounds, etc. For each of these objects, the browser issues a new
@@ -6755,7 +7889,7 @@ Requests
-Anatomy of an Action
+Troubleshooting: Anatomy of an Action
The way Privoxy applies
@@ -6774,7 +7908,15 @@ Requests
One quick test to see if Privoxy is causing a problem
or not, is to disable it temporarily. This should be the first troubleshooting
step. See the Bookmarklets section on a quick
- and easy way to do this (be sure to flush caches afterward!).
+ and easy way to do this (be sure to flush caches afterward!). Looking at the
+ logs is a good idea too.
+
+
+ Another easy troubleshooting step, if you have customized your installation,
+ is to revert to the installed defaults and see if that helps. The developers
+ sometimes get complaints about one thing or another where the problem turns
+ out to be caused by a customized configuration.
@@ -6790,7 +7932,7 @@ Requests
how the current configuration will handle it. This will not
help with filtering effects (i.e. the +filter
action) from
- the default.filter file since this is handled very
+ one of the filter files since this is handled very
differently and not so easy to trap! It also will not tell you about any other
URLs that may be embedded within the URL you are testing. For instance, images
such as ads are expressed as URLs within the raw page source of HTML pages. So
@@ -6803,7 +7945,8 @@ Requests
Let's try an example, google.com ,
- and look at it one section at a time:
+ and look at it one section at a time in a sample configuration (your real
+ configuration may vary):
@@ -6812,38 +7955,69 @@ Requests
In file: default.action [ View ] [ Edit ]
-{-add-header
- -block
- -crunch-outgoing-cookies
- -crunch-incoming-cookies
- +deanimate-gifs{last}
- -downgrade-http-version
- +fast-redirects
- -filter{popups}
- -filter{fun}
- -filter{shockwave-flash}
- -filter{crude-parental}
- +filter{html-annoyances}
- +filter{js-annoyances}
- +filter{content-cookies}
- +filter{webbugs}
- +filter{refresh-tags}
- +filter{nimda}
- +filter{banners-by-size}
- +hide-forwarded-for-headers
- +hide-from-header{block}
- +hide-referer{forge}
- -hide-user-agent
- -handle-as-image
- -kill-popups
- -limit-connect
- +prevent-compression
- -send-vanilla-wafer
- -send-wafer
- +session-cookies-only
- +set-image-blocker{pattern} }
+ {-add-header
+ -block
+ -content-type-overwrite
+ -crunch-client-header
+ -crunch-if-none-match
+ -crunch-incoming-cookies
+ -crunch-outgoing-cookies
+ -crunch-server-header
+ +deanimate-gifs {last}
+ -downgrade-http-version
+ +fast-redirects {check-decoded-url}
+ -filter {js-events}
+ -filter {content-cookies}
+ -filter {all-popups}
+ -filter {banners-by-link}
+ -filter {tiny-textforms}
+ -filter {frameset-borders}
+ -filter {demoronizer}
+ -filter {shockwave-flash}
+ -filter {quicktime-kioskmode}
+ -filter {fun}
+ -filter {crude-parental}
+ -filter {site-specifics}
+ -filter {js-annoyances}
+ -filter {html-annoyances}
+ +filter {refresh-tags}
+ -filter {unsolicited-popups}
+ +filter {img-reorder}
+ +filter {banners-by-size}
+ +filter {webbugs}
+ +filter {jumping-windows}
+ +filter {ie-exploits}
+ -filter {google}
+ -filter {yahoo}
+ -filter {msn}
+ -filter {blogspot}
+ -filter {xml-to-html}
+ -filter {html-to-xml}
+ -filter-client-headers
+ -filter-server-headers
+ -force-text-mode
+ -handle-as-empty-document
+ -handle-as-image
+ -hide-accept-language
+ -hide-content-disposition
+ +hide-forwarded-for-headers
+ +hide-from-header {block}
+ -hide-if-modified-since
+ +hide-referrer {forge}
+ -hide-user-agent
+ -inspect-jpegs
+ -kill-popups
+ -limit-connect
+ -overwrite-last-modified
+ +prevent-compression
+ -redirect
+ -send-vanilla-wafer
+ -send-wafer
+ +session-cookies-only
+ +set-image-blocker {pattern}
+ -treat-forbidden-connects-like-blocks }
/
-
+
{ -session-cookies-only }
.google.com
@@ -6856,41 +8030,53 @@ In file: user.action [ View ] [ Edit ]
- This tells us how we have defined our
+ This is telling us how we have defined our
actions
, and
- which ones match for our example, google.com
. The first listing
- is any matches for the standard.action file. No hits at
- all here on standard
. Then next is default
, or
- our default.action file. The large, multi-line listing,
- is how the actions are set to match for all URLs, i.e. our default settings.
- If you look at your actions
file, this would be the section
- just below the aliases
section near the top. This will apply to
- all URLs as signified by the single forward slash at the end of the listing
- -- /
.
-
-
-
- But we can define additional actions that would be exceptions to these general
- rules, and then list specific URLs (or patterns) that these exceptions would
- apply to. Last match wins. Just below this then are two explicit matches for
- .google.com
. The first is negating our previous cookie setting,
- which was for google.com.
+ Displayed are all the actions that are available to us. Remember,
+ the + sign denotes on
. -
+ denotes off
. So some are on
here, but many
+ are off
. Each example we try may provide a slightly different
+ end result, depending on our configuration directives.
+
+
+ The first listing
+ is for our default.action file. The large, multi-line
+ listing, is how the actions are set to match for all URLs, i.e. our default
+ settings. If you look at your actions
file, this would be the
+ section just below the aliases
section near the top. This
+ will apply to all URLs as signified by the single forward slash at the end
+ of the listing -- /
.
+
+
+
+ But we have defined additional actions that would be exceptions to these general
+ rules, and then we list specific URLs (or patterns) that these exceptions
+ would apply to. Last match wins. Just below this then are two explicit
+ matches for .google.com
. The first is negating our previous
+ cookie setting, which was for +session-cookies-only
- (i.e. not persistent). So we will allow persistent cookies for google. The
- second turns off any
- off any +fast-redirects
action, allowing this to take place unmolested. Note that there is a leading
dot here -- .google.com
. This will match any hosts and
sub-domains, in the google.com domain also, such as
- www.google.com
. So, apparently, we have these two actions
- defined somewhere in the lower part of our default.action
- file, and google.com
is referenced somewhere in these latter
- sections.
+ www.google.com
or mail.google.com
. But it would not
+ match www.google.de
! So, apparently, we have these two actions
+ defined as exceptions to the general rules at the top somewhere in the lower
+ part of our default.action file, and
+ google.com
is referenced somewhere in these latter sections.
Then, for our user.action file, we again have no hits.
+ So there is nothing google-specific that we might have added to our own, local
+ configuration. If there was, those actions would over-rule any actions from
+ previously processed files, such as default.action .
+ user.action typically has the last word. This is the
+ best place to put hard and fast exceptions,
@@ -6905,42 +8091,74 @@ In file: user.action [ View ] [ Edit ]
+ -session-cookies-only
+ +set-image-blocker {pattern}
+ -treat-forbidden-connects-like-blocks
Notice the only difference here to the previous listing, is to
- fast-redirects
and session-cookies-only
.
+ fast-redirects
and session-cookies-only
,
+ which are activated specifically for this site in our configuration,
+ and thus show in the Final Results
.
@@ -6950,22 +8168,23 @@ In file: user.action [ View ] [ Edit ]
- { +block +handle-as-image }
- .ad.doubleclick.net
-
- { +block +handle-as-image }
+ { +block }
ad*.
+ { +block }
+ .ad.
+
{ +block +handle-as-image }
- .doubleclick.net
+ .[a-vx-z]*.doubleclick.net
- We'll just show the interesting part here, the explicit matches. It is
- matched three different times. Each as an +block +handle-as-image
,
+ We'll just show the interesting part here - the explicit matches. It is
+ matched three different times. Two +block
sections,
+ and a +block +handle-as-image
,
which is the expanded form of one of our aliases that had been defined as:
- +imageblock
. ( +block-as-image. (Aliases
are defined in
the first section of the actions file and typically used to combine more
than one action.)
@@ -6980,50 +8199,82 @@ In file: user.action [ View ] [ Edit ]+block
and an
- +handle-as-image
.
- The custom alias +imageblock
just simplifies the process and make
- it more readable.
+ +handle-as-image
.
+ The custom alias +block-as-image
just
+ simplifies the process and makes it more readable.
- One last example. Let's try http://www.rhapsodyk.net/adsl/HOWTO/
.
- This one is giving us problems. We are getting a blank page. Hmmm...
+ One last example. Let's try http://www.example.net/adsl/HOWTO/
.
+ This one is giving us problems. We are getting a blank page. Hmmm ...
- Matches for http://www.rhapsodyk.net/adsl/HOWTO/:
+ Matches for http://www.example.net/adsl/HOWTO/:
In file: default.action [ View ] [ Edit ]
{-add-header
- -block
- -crunch-incoming-cookies
- -crunch-outgoing-cookies
+ -block
+ -content-type-overwrite
+ -crunch-client-header
+ -crunch-if-none-match
+ -crunch-incoming-cookies
+ -crunch-outgoing-cookies
+ -crunch-server-header
+deanimate-gifs
-downgrade-http-version
- +fast-redirects
- +filter{html-annoyances}
- +filter{js-annoyances}
- +filter{kill-popups}
- +filter{webbugs}
- +filter{nimda}
- +filter{banners-by-size}
- +filter{hal}
- +filter{fun}
+ +fast-redirects {check-decoded-url}
+ -filter {js-events}
+ -filter {content-cookies}
+ -filter {all-popups}
+ -filter {banners-by-link}
+ -filter {tiny-textforms}
+ -filter {frameset-borders}
+ -filter {demoronizer}
+ -filter {shockwave-flash}
+ -filter {quicktime-kioskmode}
+ -filter {fun}
+ -filter {crude-parental}
+ -filter {site-specifics}
+ -filter {js-annoyances}
+ -filter {html-annoyances}
+ +filter {refresh-tags}
+ -filter {unsolicited-popups}
+ +filter {img-reorder}
+ +filter {banners-by-size}
+ +filter {webbugs}
+ +filter {jumping-windows}
+ +filter {ie-exploits}
+ -filter {google}
+ -filter {yahoo}
+ -filter {msn}
+ -filter {blogspot}
+ -filter {xml-to-html}
+ -filter {html-to-xml}
+ -filter-client-headers
+ -filter-server-headers
+ -force-text-mode
+ -handle-as-empty-document
+ -handle-as-image
+ -hide-accept-language
+ -hide-content-disposition
+hide-forwarded-for-headers
+hide-from-header{block}
+hide-referer{forge}
-hide-user-agent
- -handle-as-image
- +kill-popups
+ -inspect-jpegs
+ -kill-popups
+ -overwrite-last-modified
+prevent-compression
+ -redirect
-send-vanilla-wafer
-send-wafer
+session-cookies-only
- +set-image-blocker{blank} }
+ +set-image-blocker{blank}
+ -treat-forbidden-connects-like-blocks }
/
{ +block +handle-as-image }
@@ -7032,11 +8283,17 @@ In file: user.action [ View ] [ Edit ]
- Ooops, the /adsl/
is matching /ads
! But
- we did not want this at all! Now we see why we get the blank page. We could
- now add a new action below this that explicitly does not
- block ({-block}
) paths with adsl
. There are
- various ways to handle such exceptions. Example:
+ Ooops, the /adsl/
is matching /ads
in our
+ configuration! But we did not want this at all! Now we see why we get the
+ blank page. It is actually triggering two different actions here, and
+ the effects are aggregated so that the URL is blocked, and &my-app; is told
+ to treat the block as if it were an image. But this is, of course, all wrong.
+ We could now add a new action below this (or better in our own
+ user.action file) that explicitly
+ un blocks (
+ {-block}
) paths with
+ adsl
in them (remember, last match in the configuration
+ wins). There are various ways to handle such exceptions. Example:
@@ -7048,8 +8305,10 @@ In file: user.action [ View ] [ Edit ]
- Now the page displays ;-) Be sure to flush your browser's caches when
- making such changes. Or, try using Shift+Reload .
+ Now the page displays ;-)
+ Remember to flush your browser's caches when making these kinds of changes to
+ your configuration to ensure that you get a freshly delivered page! Or, try
+ using Shift+Reload .
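
The "last match wins" behaviour described above can be illustrated with a short Python sketch. This is not how Privoxy is implemented; the section list, the `final_actions` helper and the substring-based matchers are hypothetical stand-ins for the real actions files and URL patterns, chosen only to mirror the `/ads` versus `/adsl/` example.

```python
from typing import Callable, Dict, List, Tuple

# One "section" = the actions it sets (True for +action, False for -action)
# plus a predicate standing in for the section's URL patterns (an assumption).
Section = Tuple[Dict[str, bool], Callable[[str], bool]]


def final_actions(url: str, sections: List[Section]) -> Dict[str, bool]:
    """Apply every matching section in order; later sections override earlier ones."""
    result: Dict[str, bool] = {}
    for actions, matches in sections:
        if matches(url):
            result.update(actions)
    return result


sections: List[Section] = [
    # Defaults, like the "/" pattern that matches every URL.
    ({"block": False, "handle-as-image": False}, lambda url: True),
    # An ad rule that unintentionally also hits "/adsl/".
    ({"block": True, "handle-as-image": True}, lambda url: "/ads" in url),
    # The { -block } exception, listed last so that it wins.
    ({"block": False}, lambda url: "/adsl/" in url),
]

final = final_actions("http://www.example.net/adsl/HOWTO/", sections)
print(final)
```

In this sketch the exception only negates `block`; `handle-as-image` stays set by the earlier section, which is harmless because it only matters when a request is actually blocked.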
@@ -7066,18 +8325,21 @@ In file: user.action [ View ] [ Edit ]
- That actually was very telling and pointed us quickly to where the problem
+ That actually was very helpful and pointed us quickly to where the problem
was. If you don't get this kind of match, then it means one of the default
- rules in the first section is causing the problem. This would require some
- guesswork, and maybe a little trial and error to isolate the offending rule.
- One likely cause would be one of the {+filter}
actions. Try
- adding the URL for the site to one of aliases that turn off +filter
:
+ rules in the first section of default.action is causing
+ the problem. This would require some guesswork, and maybe a little trial and
+ error to isolate the offending rule. One likely cause would be one of the
+ +filter
actions.
+ These tend to be harder to troubleshoot.
+ Try adding the URL for the site to one of the aliases that turn off
+ +filter
:
- {shop}
+ { shop }
.quietpc.com
.worldpay.com # for quietpc.com
.jungle.com
@@ -7087,8 +8349,8 @@ In file: user.action [ View ] [ Edit ]
- {shop}
is an alias
that expands to
- { -filter -session-cookies-only }
.
+ { shop }
is an alias
that expands to
+ { -filter -session-cookies-only }
.
Or you could do your own exception to negate filtering:
@@ -7096,21 +8358,55 @@ In file: user.action [ View ] [ Edit ]
- {-filter}
+ { -filter }
+ # Disable ALL filter actions for sites in this section
.forbes.com
+ developer.ibm.com
+ localhost
- This would probably be most appropriately put in user.action ,
- for local site exceptions.
+ This would turn off all filtering for these sites. This is best
+ put in user.action , for local site
+ exceptions. Note that when a simple domain pattern is used by itself (without
+ the subsequent path portion), all sub-pages within that domain are included
+ automatically in the scope of the action.
+
+
+
+ Images that are inexplicably being blocked may well be hitting the
++filter{banners-by-size}
+ rule, which assumes
+ that images of certain sizes are ad banners (works well
+ most of the time since these tend to be standardized).
+
+
+
+ { fragile }
is an alias that disables most of the
+ actions that are most likely to cause trouble. This can be used as a
+ last resort for problem sites.
+
+
+
+
+ { fragile }
+ # Handle with care: easy to break
+ mail.google.
+ mybank.example.com
+
- {fragile}
is an alias that disables most actions. This can be
- used as a last resort for problem sites. Remember to flush caches! If this
- still does not work, you will have to go through the remaining actions one by
- one to find which one(s) is causing the problem.
+ Remember to flush caches! Note that the
+ mail.google reference lacks the TLD portion (e.g.
+ .com
). This will effectively match any TLD with
+ google in it, such as mail.google.de ,
+ just as an example.
+
+
+ If this still does not work, you will have to go through the remaining
+ actions one by one to find which one(s) is causing the problem.
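
The domain-pattern rules mentioned in this section (a leading dot matches any sub-domain, a missing TLD matches any TLD) can be sketched in Python as follows. This is a simplified model under stated assumptions, not Privoxy's actual matching code: the `host_matches` function is hypothetical, it ignores the path part of a pattern, and it uses Python's `fnmatch` wildcards per host label as an approximation.

```python
from fnmatch import fnmatchcase


def host_matches(pattern: str, host: str) -> bool:
    """Approximate a domain pattern match (simplified, illustrative only)."""
    p = pattern.lower().strip(".").split(".")
    h = host.lower().split(".")
    n = len(p)
    if n > len(h):
        return False

    # A leading or trailing dot leaves that side of the pattern open.
    anchored_left = not pattern.startswith(".")
    anchored_right = not pattern.endswith(".")

    def labels_match(window):
        return all(fnmatchcase(label, pat) for label, pat in zip(window, p))

    if anchored_left and anchored_right:
        return n == len(h) and labels_match(h)
    if anchored_left:
        return labels_match(h[:n])
    if anchored_right:
        return labels_match(h[-n:])
    # Open on both sides: the pattern may sit anywhere in the host name.
    return any(labels_match(h[i:i + n]) for i in range(len(h) - n + 1))


for pattern, host in [
    (".google.com", "www.google.com"),
    (".google.com", "www.google.de"),
    ("mail.google.", "mail.google.de"),
]:
    print(pattern, host, host_matches(pattern, host))
```

Under these assumptions `.google.com` covers `www.google.com` and `mail.google.com` but not `www.google.de`, while `mail.google.` covers `mail.google.de` as well as `mail.google.com`.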
@@ -7134,10 +8430,225 @@ In file: user.action [ View ] [ Edit ] style.
+ - Small fixes in the actions chapter
+ - Small clarifications in the quickstart to ad blocking
+ - Removed from s since the new doc CSS
+ renders them red (bad in TOC).
+
Revision 1.120 2002/05/23 19:16:43 roro
Correct Debian specials (installation and startup).