- full path to avoid confusion. If no config file is found,
- <application>Privoxy</application> will fail to start.
- </para>
- </listitem>
-
- </itemizedlist>
-</para>
-
-<para>
- On <application>MS Windows</application> only there are two additional
- command-line options to allow <application>Privoxy</application> to install and
- run as a <emphasis>service</emphasis>. See the
-<link linkend="installation-pack-win">Windows Installation section</link>
-for details.
-</para>
-
-</sect2>
-
-</sect1>
-
-<!-- ~ End section ~ -->
-
-
-<!-- ~~~~~ New section ~~~~~ -->
-<sect1 id="configuration"><title>Privoxy Configuration</title>
- <para>
- All <application>Privoxy</application> configuration is stored
- in text files. These files can be edited with a text editor.
- Many important aspects of <application>Privoxy</application> can
- also be controlled easily with a web browser.
- </para>
-
-
-<!-- ~~~~~ New section ~~~~~ -->
-
-<sect2>
-<title>Controlling Privoxy with Your Web Browser</title>
-<para>
- <application>Privoxy</application>'s user interface can be reached through the special
- URL <ulink url="http://config.privoxy.org/">http://config.privoxy.org/</ulink>
- (shortcut: <ulink url="http://p.p/">http://p.p/</ulink>),
- which is a built-in page and works without Internet access.
- You will see the following section:
-
-</para>
-
-<!-- Needs to be put in a table and colorized -->
-<screen>
- <msgtext>
- <bridgehead renderas="sect2"> Privoxy Menu</bridgehead>
-
- <simplelist>
- <member>
- ▪ <ulink url="http://config.privoxy.org/show-status">View &amp; change the current configuration</ulink>
- </member>
- <member>
- ▪ <ulink url="http://config.privoxy.org/show-version">View the source code version numbers</ulink>
- </member>
- <member>
- ▪ <ulink url="http://config.privoxy.org/show-request">View the request headers.</ulink>
- </member>
- <member>
- ▪ <ulink url="http://config.privoxy.org/show-url-info">Look up which actions apply to a URL and why</ulink>
- </member>
- <member>
- ▪ <ulink url="http://config.privoxy.org/toggle">Toggle Privoxy on or off</ulink>
- </member>
- <member>
- ▪ <ulink
- url="http://www.privoxy.org/&p-version;/user-manual/">Documentation</ulink>
- </member>
- </simplelist>
- </msgtext>
-</screen>
-
-
-<para>
- This should be self-explanatory. Note the first item leads to an editor for the
- <link linkend="actions-file">actions files</link>, which is where the ad, banner,
- cookie, and URL blocking magic is configured as well as other advanced features of
- <application>Privoxy</application>. This is an easy way to adjust various
- aspects of <application>Privoxy</application> configuration. The actions
- file, and other configuration files, are explained in detail below.
-</para>
-
-<para>
- <quote>Toggle Privoxy On or Off</quote> is handy for sites that might
- have problems with your current actions and filters. You can in fact use
- it as a test to see whether it is <application>Privoxy</application>
- causing the problem or not. <application>Privoxy</application> continues
- to run as a proxy in this case, but all manipulation is disabled, i.e.
- <application>Privoxy</application> acts like a normal forwarding proxy. There
- is even a toggle <link linkend="bookmarklets">Bookmarklet</link> offered, so
- that you can toggle <application>Privoxy</application> with one click from
- your browser.
-</para>
-
-<para>
- Note that several of the features described above are disabled by default
- in <application>Privoxy</application> 3.0.7 beta and later.
- Check the
- <ulink url="config.html">configuration file</ulink> to learn why
- and in which cases it's safe to enable them again.
-</para>
-
-</sect2>
-
-<!-- ~ End section ~ -->
-
-
-
-
-<!-- ~~~~~ New section ~~~~~ -->
-
-<sect2 id="confoverview">
-<title>Configuration Files Overview</title>
-<para>
- For Unix, *BSD and Linux, all configuration files are located in
- <filename>/etc/privoxy/</filename> by default. For MS Windows, OS/2, and
- AmigaOS these are all in the same directory as the
- <application>Privoxy</application> executable. <![%p-not-stable;[ The names
- and number of configuration files have changed from previous versions, and are
- subject to change as development progresses.]]>
-</para>
-
-<para>
- The installed defaults provide a reasonable starting point, though
- some settings may be aggressive by some standards. For the time being, the
- principal configuration files are:
-</para>
-
-<para>
- <itemizedlist>
-
- <listitem>
- <para>
- The <link linkend="config">main configuration file</link> is named <filename>config</filename>
- on Linux, Unix, BSD, OS/2, and AmigaOS and <filename>config.txt</filename>
- on Windows. This is a required file.
- </para>
- </listitem>
-
- <listitem>
- <para>
- <filename>match-all.action</filename> is used to define which <quote>actions</quote>
- relating to banner-blocking, images, pop-ups, content modification, cookie handling
- etc. should be applied by default. It should be the first actions file loaded.
- </para>
- <para>
- <filename>default.action</filename> defines many exceptions (both positive and negative)
- from the default set of actions that's configured in <filename>match-all.action</filename>.
- It should be the second actions file loaded and shouldn't be edited by the user.
- </para>
- <para>
- Multiple actions files may be defined in <filename>config</filename>. These
- are processed in the order they are defined. Local customizations and locally
- preferred exceptions to the default policies as defined in
- <filename>match-all.action</filename> (which you will most probably want
- to define sooner or later) are best applied in <filename>user.action</filename>,
- where you can preserve them across upgrades. The file isn't installed by all
- installers, but you can easily create it yourself with a text editor.
- </para>
- <para>
- There is also a web based editor that can be accessed from
- <ulink
- url="http://config.privoxy.org/show-status">http://config.privoxy.org/show-status</ulink>
- (Shortcut: <ulink
- url="http://p.p/show-status">http://p.p/show-status</ulink>) for the
- various actions files.
- </para>
- </listitem>
-
- <listitem>
- <para>
- <quote>Filter files</quote> (the <link linkend="filter-file">filter
- file</link>) can be used to re-write the raw page content, including
- viewable text as well as embedded HTML and JavaScript, and whatever else
- lurks on any given web page. The filtering jobs are only pre-defined here;
- whether to apply them or not is up to the actions files.
- <filename>default.filter</filename> includes various filters made
- available for use by the developers. Some are much more intrusive than
- others, and all should be used with caution. You may define additional
- filter files in <filename>config</filename> as you can with
- actions files. We suggest <filename>user.filter</filename> for any
- locally defined filters or customizations.
- </para>
- </listitem>
-
- </itemizedlist>
-</para>
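-
-<para>
- As a rough sketch, the lines in <filename>config</filename> that tie these
- files together might look like the following. The file names are the ones
- discussed above, but the exact lines in your installed
- <filename>config</filename> may differ, so treat this as an illustration
- rather than a copy of the shipped defaults:
-</para>
-
-<para>
- <screen>
- # Actions files are processed in the order in which they are listed.
- actionsfile match-all.action
- actionsfile default.action
- actionsfile user.action
-
- # Filter files only define filters; the actions files decide whether to use them.
- filterfile default.filter
- filterfile user.filter</screen>
-</para>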
-
-<para>
- The syntax of the configuration and filter files may change between different
- Privoxy versions; unfortunately, some enhancements cost backwards compatibility.
- <!-- Add link to documentation-->
-</para>
-
-<para>
- All files use the <quote><literal>#</literal></quote> character to denote a
- comment (the rest of the line will be ignored) and understand line continuation
- through placing a backslash ("<literal>\</literal>") as the very last character
- in a line. If the <literal>#</literal> is preceded by a backslash, it loses
- its special function. Placing a <literal>#</literal> in front of an otherwise
- valid configuration line to prevent it from being interpreted is called "commenting
- out" that line. Blank lines are ignored.
-</para>
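-
-<para>
- The following fragment sketches these rules in an actions file. The host
- names are placeholders, and the particular actions were chosen only for
- illustration:
-</para>
-
-<para>
- <screen>
- # This whole line is a comment and is ignored.
- { +block{Banner ads.} \
-   +handle-as-image }
- banners.example.com   # A comment may also follow a pattern.
- # The next pattern is commented out and therefore inactive:
- #media.example.com/.*banners</screen>
-</para>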
-
-<para>
- The actions files and filter files
- can use Perl style <link linkend="regex">regular expressions</link> for
- maximum flexibility.
-</para>
-
-<para>
- After making any changes, there is no need to restart
- <application>Privoxy</application> in order for the changes to take
- effect. <application>Privoxy</application> detects such changes
- automatically. Note, however, that it may take one or two additional
- requests for the change to take effect. When changing the listening address
- of <application>Privoxy</application>, these <quote>wake up</quote> requests
- must obviously be sent to the <emphasis>old</emphasis> listening address.
-</para>
-
-<![%p-not-stable;[
-<para>
- While under development, the configuration content is subject to change.
- The below documentation may not be accurate by the time you read this.
- Also, what constitutes a <quote>default</quote> setting, may change, so
- please check all your configuration files on important issues.
-</para>
-]]>
-
-</sect2>
-</sect1>
-<!-- ~ End section ~ -->
-
-
-<!-- ~~~~~~~~ New section Header ~~~~~~~~~ -->
-
-<!-- **************************************************** -->
-<!-- Include config.sgml here -->
-<!-- This is where the entire config file is detailed. -->
- &config;
-<!-- end include -->
-
-
-<!-- ~ End section ~ -->
-
-
-
-<!-- ~~~~~~~~ New section Header ~~~~~~~~~ -->
-
-<sect1 id="actions-file"><title>Actions Files</title>
-
-
-<!--
- XXX: similar descriptions are in the Configuration Files sections.
- We should only describe them at one place.
--->
-<para>
- The actions files are used to define what <emphasis>actions</emphasis>
- <application>Privoxy</application> takes for which URLs, and thus determines
- how ad images, cookies and various other aspects of HTTP content and
- transactions are handled, and on which sites (or even parts thereof).
- There are a number of such actions, with a wide range of functionality.
- Each action does something a little different.
- These actions give us a veritable arsenal of tools with which to exert
- our control, preferences and independence. Actions can be combined so that
- their effects are aggregated when applied against a given set of URLs.
-</para>
-<para>
- There
- are three actions files included with <application>Privoxy</application>, each with
- a different purpose:
-</para>
-<para>
- <itemizedlist>
- <listitem>
- <para>
- <filename>match-all.action</filename> - is used to define which
- <quote>actions</quote> relating to banner-blocking, images, pop-ups,
- content modification, cookie handling etc. should be applied by default.
- It should be the first actions file loaded.
- </para>
- </listitem>
- <listitem>
- <para>
- <filename>default.action</filename> - defines many exceptions (both
- positive and negative) from the default set of actions that's configured
- in <filename>match-all.action</filename>. It is a set of rules that should
- work reasonably well as-is for most users. This file is only supposed to
- be edited by the developers. It should be the second actions file loaded.
- </para>
- </listitem>
- <listitem>
- <para>
- <filename>user.action</filename> - is intended for local site
- preferences and exceptions. For example, if your ISP or your bank
- has specific requirements and needs special handling, this kind of
- thing should go here. This file will not be overwritten by upgrades.
- </para>
- </listitem>
- <listitem>
- <para>
- <guibutton>Edit</guibutton> <guibutton>Set to Cautious</guibutton> <guibutton>Set to Medium</guibutton> <guibutton>Set to Advanced</guibutton>
- </para>
- <para>
- These have increasing levels of aggressiveness <emphasis>and have no
- influence on your browsing unless you select them explicitly in the
- editor</emphasis>. A default installation should be pre-set to
- <literal>Cautious</literal>. New users should try this for a while before
- adjusting the settings to more aggressive levels. The more aggressive
- the settings, the more likely problems become, such as sites
- not working as they should.
- </para>
- <para>
- The <guibutton>Edit</guibutton> button allows you to turn each
- action on/off individually for fine-tuning. The <guibutton>Cautious</guibutton>
- button changes the actions list to low/safe settings which will activate
- ad blocking and a minimal set of &my-app;'s features, and subsequently
- there will be less of a chance for accidental problems. The
- <guibutton>Medium</guibutton> button sets the list to a medium level of
- other features and a low level set of privacy features. The
- <guibutton>Advanced</guibutton> button sets the list to a high level of
- ad blocking and medium level of privacy. See the chart below. The latter
- three buttons override any changes made with the
- <guibutton>Edit</guibutton> button. More fine-tuning can be done in the
- lower sections of this internal page.
- </para>
- <para>
- While the actions file editor allows you to enable these settings in all
- actions files, they are only supposed to be enabled in the first one
- to make sure you don't unintentionally overrule earlier rules.
- </para>
- <para>
- The default profiles, and their associated actions, as pre-defined in
- <filename>default.action</filename> are:
- </para>
- <para>
- <table frame=all><title>Default Configurations</title>
- <tgroup cols=4 align=left colsep=1 rowsep=1>
- <colspec colname=c1>
- <colspec colname=c2>
- <colspec colname=c3>
- <colspec colname=c4>
- <thead>
- <row>
- <entry>Feature</entry>
- <entry>Cautious</entry>
- <entry>Medium</entry>
- <entry>Advanced</entry>
- </row>
- </thead>
- <!-- <tfoot> -->
- <!-- <row> -->
- <!-- <entry>f1</entry> -->
- <!-- <entry>f2</entry> -->
- <!-- <entry>f3</entry> -->
- <!-- <entry>f4</entry> -->
- <!-- </row> -->
- <!-- </tfoot> -->
- <tbody>
-
- <row>
- <entry>Ad-blocking Aggressiveness</entry>
- <entry>medium</entry>
- <entry>high</entry>
- <entry>high</entry>
- </row>
-
- <row>
- <entry>Ad-filtering by size</entry>
- <entry>no</entry>
- <entry>yes</entry>
- <entry>yes</entry>
- </row>
-
- <row>
- <entry>Ad-filtering by link</entry>
- <entry>no</entry>
- <entry>no</entry>
- <entry>yes</entry>
- </row>
- <row>
- <entry>Pop-up killing</entry>
- <entry>blocks only</entry>
- <entry>blocks only</entry>
- <entry>blocks only</entry>
- </row>
-
- <row>
- <entry>Privacy Features</entry>
- <entry>low</entry>
- <entry>medium</entry>
- <entry>medium/high</entry>
- </row>
-
- <row>
- <entry>Cookie handling</entry>
- <entry>none</entry>
- <entry>session-only</entry>
- <entry>kill</entry>
- </row>
-
- <row>
- <entry>Referer forging</entry>
- <entry>no</entry>
- <entry>yes</entry>
- <entry>yes</entry>
- </row>
-
- <row>
- <entry>GIF de-animation</entry>
- <entry>no</entry>
- <entry>yes</entry>
- <entry>yes</entry>
- </row>
-
- <row>
- <entry>Fast redirects</entry>
- <entry>no</entry>
- <entry>no</entry>
- <entry>yes</entry>
- </row>
-
- <row>
- <entry>HTML taming</entry>
- <entry>no</entry>
- <entry>no</entry>
- <entry>yes</entry>
- </row>
-
- <row>
- <entry>JavaScript taming</entry>
- <entry>no</entry>
- <entry>no</entry>
- <entry>yes</entry>
- </row>
-
- <row>
- <entry>Web-bug killing</entry>
- <entry>no</entry>
- <entry>yes</entry>
- <entry>yes</entry>
- </row>
-
- <row>
- <entry>Image tag reordering</entry>
- <entry>no</entry>
- <entry>yes</entry>
- <entry>yes</entry>
- </row>
-
- </tbody>
- </tgroup>
- </table>
- </para>
-
- </listitem>
- </itemizedlist>
-</para>
-
-<para>
- The list of actions files to be used is defined in the main configuration
- file, and the files are processed in the order they are defined (e.g.
- <filename>default.action</filename> is typically processed before
- <filename>user.action</filename>). The content of these can all be viewed and
- edited from <ulink
- url="http://config.privoxy.org/show-status">http://config.privoxy.org/show-status</ulink>.
- The overriding principle when applying actions is that the last action that
- matches a given URL wins. The broadest, most general rules go first
- (defined in <filename>match-all.action</filename>),
- followed by any exceptions (typically also in
- <filename>default.action</filename>), which are then followed lastly by any
- local preferences (typically in <emphasis>user</emphasis><filename>.action</filename>).
- Generally, <filename>user.action</filename> has the last word.
- </para>
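-
-<para>
- A minimal sketch of this ordering, assuming a hypothetical site
- <literal>example.com</literal> and showing the relevant sections of two
- files one after the other (the rules are invented for illustration and are
- not taken from the shipped files):
-</para>
-
-<para>
- <screen>
- # In default.action (processed earlier):
- { +block{Hypothetical ad host.} }
- .example.com
-
- # In user.action (processed later, so it wins for this host):
- { -block }
- www.example.com</screen>
-</para>
-
-<para>
- With these sections, a request for <literal>www.example.com</literal> ends up
- unblocked, while <literal>ads.example.com</literal> is still blocked.
-</para>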
-
-<para>
- An actions file typically has multiple sections. If you want to use
- <quote>aliases</quote> in an actions file, you have to place the (optional)
- <link linkend="aliases">alias section</link> at the top of that file.
- Then comes the default set of rules which will apply universally to all
- sites and pages (be <emphasis>very careful</emphasis> with using such a
- universal set in <filename>user.action</filename> or any other actions file after
- <filename>default.action</filename>, because it will override the result
- from consulting any previous file). Below that come the
- exceptions to the defined universal policies. You can regard
- <filename>user.action</filename> as an appendix to <filename>default.action</filename>,
- with the advantage that it is a separate file, which makes preserving your
- personal settings across <application>Privoxy</application> upgrades easier.
-</para>
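-
-<para>
- The layout described above might be sketched as follows. The alias name
- <literal>crunch-all-cookies</literal> and the host name are examples only;
- any other alias name or pattern would do:
-</para>
-
-<para>
- <screen>
- # Optional alias section, at the top of the file.
- {{alias}}
- +crunch-all-cookies = +crunch-incoming-cookies +crunch-outgoing-cookies
- -crunch-all-cookies = -crunch-incoming-cookies -crunch-outgoing-cookies
-
- # Universal rules: the pattern "/" matches every URL.
- { +crunch-all-cookies }
- /
-
- # Exceptions to the universal rules come last.
- { -crunch-all-cookies }
- .example.com</screen>
-</para>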
-
-<para>
- Actions can be used to block anything you want, including ads, banners, or
- just some obnoxious URL whose content you would rather not see. Cookies can be accepted
- or rejected, or accepted only during the current browser session (i.e. not
- written to disk), content can be modified, some JavaScripts tamed, user-tracking
- fooled, and much more. See below for a <link linkend="actions">complete list
- of actions</link>.
-</para>
-
-<!-- ~~~~~ New section ~~~~~ -->
-<sect2>
-<title>Finding the Right Mix</title>
-<para>
- Note that some <link linkend="actions">actions</link>, like cookie suppression
- or script disabling, may render some sites unusable that rely on these
- techniques to work properly. Finding the right mix of actions is not always easy and
- certainly a matter of personal taste. And, things can always change, requiring
- refinements in the configuration. In general, it can be said that the more
- <quote>aggressive</quote> your default settings (in the top section of the
- actions file) are, the more exceptions for <quote>trusted</quote> sites you
- will have to make later. If, for example, you want to crunch all cookies by
- default, you'll have to make exceptions from that rule for sites that you
- regularly use and that require cookies for actually useful purposes, like maybe
- your bank, favorite shop, or newspaper.
-</para>
-
-<para>
- We have tried to provide you with reasonable rules to start from in the
- distribution actions files. But there is no general rule of thumb on these
- things. There just are too many variables, and sites are constantly changing.
- Sooner or later you will want to change the rules (and read this chapter again :).
-</para>
-</sect2>
-
-<!-- ~~~~~ New section ~~~~~ -->
-<sect2>
-<title>How to Edit</title>
-<para>
- The easiest way to edit the actions files is with a browser by
- using our browser-based editor, which can be reached from <ulink
- url="http://config.privoxy.org/show-status">http://config.privoxy.org/show-status</ulink>.
- Note: the config file option <link
- linkend="enable-edit-actions">enable-edit-actions</link> must be enabled for
- this to work. The editor allows both fine-grained control over every single
- feature on a per-URL basis, and easy choosing from wholesale sets of defaults
- like <quote>Cautious</quote>, <quote>Medium</quote> or
- <quote>Advanced</quote>. Warning: the <quote>Advanced</quote> setting is more
- aggressive, and will be more likely to cause problems for some sites.
- Experienced users only!
- </para>
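-
-<para>
- As a sketch, enabling the editor comes down to a single line in
- <filename>config</filename>. Whether enabling it is appropriate depends on
- who else can reach your <application>Privoxy</application>, so read the notes
- on this option in the configuration file first:
-</para>
-
-<para>
- <screen>
- # 1 enables the browser-based actions file editor, 0 disables it.
- enable-edit-actions 1</screen>
-</para>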
-
-<para>
- If you prefer plain text editing to GUIs, you can of course also edit
- the actions files directly with your favorite text editor. Look at
- <filename>default.action</filename> which is richly commented with many
- good examples.
-</para>
-</sect2>
-
-
-<sect2 id="actions-apply">
-<title>How Actions are Applied to Requests</title>
-<para>
- Actions files are divided into sections. There are special sections,
- like the <quote><link linkend="aliases">alias</link></quote> sections which will
- be discussed later. For now let's concentrate on regular sections: They have a
- heading line (often split up to multiple lines for readability) which consists
- of a list of actions, separated by whitespace and enclosed in curly braces.
- Below that, there is a list of URL and tag patterns, each on a separate line.
-</para>
-
-<para>
- To determine which actions apply to a request, the URL of the request is
- compared to all URL patterns in each <quote>actions file</quote>.
- Every time it matches, the list of applicable actions for the request is
- incrementally updated, using the heading of the section in which the
- pattern is located. The same is done again for tags and tag patterns later on.
-</para>
-
-<para>
- If multiple applying sections set the same action differently,
- the last match wins. If not, the effects are aggregated.
- E.g. a URL might match a regular section with a heading line of <literal>{
- +<link linkend="handle-as-image">handle-as-image</link> }</literal>,
- then later another one with just <literal>{
- +<link linkend="block">block</link> }</literal>, resulting
- in <emphasis>both</emphasis> actions being applied. And there may well be
- cases where you will want to combine actions together. Such a section then
- might look like:
-</para>
-
- <para>
- <screen>
- { +<literal>handle-as-image</literal> +<literal>block{Banner ads.}</literal> }
- # Block these as if they were images. Send no block page.
- banners.example.com
- media.example.com/.*banners
- .example.com/images/ads/</screen>
- </para>
-
-<para>
- You can trace this process for URL patterns and any given URL by visiting <ulink
- url="http://config.privoxy.org/show-url-info">http://config.privoxy.org/show-url-info</ulink>.
-</para>
-
-<para>
- Examples and more detail on this are provided in the Appendix, in the <link linkend="ACTIONSANAT">
- Troubleshooting: Anatomy of an Action</link> section.
-</para>
-</sect2>
-
-<!-- ~~~~~ New section ~~~~~ -->
-<sect2 id="af-patterns">
-<title>Patterns</title>
-<para>
- As mentioned, <application>Privoxy</application> uses <quote>patterns</quote>
- to determine what <emphasis>actions</emphasis> might apply to which sites and
- pages your browser attempts to access. These <quote>patterns</quote> use wild
- card type <emphasis>pattern</emphasis> matching to achieve a high degree of
- flexibility. This allows one expression to match many similar
- URLs.
-</para>
-
-<para>
- Generally, a URL pattern has the form
- <literal>&lt;domain&gt;&lt;port&gt;/&lt;path&gt;</literal>, where the
- <literal>&lt;domain&gt;</literal>, the <literal>&lt;port&gt;</literal>
- and the <literal>&lt;path&gt;</literal> are optional. (This is why the special
- <literal>/</literal> pattern matches all URLs). Note that the protocol
- portion of the URL pattern (e.g. <literal>http://</literal>) should
- <emphasis>not</emphasis> be included in the pattern. This is assumed already!
-</para>
-<para>
- The pattern matching syntax is different for the domain and path parts of
- the URL. The domain part uses a simple globbing type matching technique,
- while the path part uses more flexible
- <ulink url="http://en.wikipedia.org/wiki/Regular_expressions"><quote>Regular
- Expressions</quote></ulink> (POSIX 1003.2).
-</para>
-<para>
- The port part of a pattern is a decimal port number preceded by a colon
- (<literal>:</literal>). If the domain part contains a numerical IPv6 address,
- it has to be put into angle brackets
- (<literal>&lt;</literal>, <literal>&gt;</literal>).
-</para>
-
-<variablelist>
- <varlistentry>
- <term><literal>www.example.com/</literal></term>
- <listitem>
- <para>
- is a domain-only pattern and will match any request to <literal>www.example.com</literal>,
- regardless of which document on that server is requested. So ALL pages in
- this domain would be covered by the scope of this action. Note that a
- simple <literal>example.com</literal> is different and would NOT match.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term><literal>www.example.com</literal></term>
- <listitem>
- <para>
- means exactly the same. For domain-only patterns, the trailing <literal>/</literal> may
- be omitted.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term><literal>www.example.com/index.html</literal></term>
- <listitem>
- <para>
- matches all the documents on <literal>www.example.com</literal>
- whose name starts with <literal>/index.html</literal>.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term><literal>www.example.com/index.html$</literal></term>
- <listitem>
- <para>
- matches only the single document <literal>/index.html</literal>
- on <literal>www.example.com</literal>.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term><literal>/index.html$</literal></term>
- <listitem>
- <para>
- matches the document <literal>/index.html</literal>, regardless of the domain,
- i.e. on <emphasis>any</emphasis> web server anywhere.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term><literal>/</literal></term>
- <listitem>
- <para>
- Matches any URL because there's no requirement for either the
- domain or the path to match anything.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term><literal>:8000/</literal></term>
- <listitem>
- <para>
- Matches any URL pointing to TCP port 8000.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term><literal>&lt;2001:db8::1&gt;/</literal></term>
- <listitem>
- <para>
- Matches any URL with the host address <literal>2001:db8::1</literal>.
- (Note that the real URL uses plain brackets, not angle brackets.)
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term><literal>index.html</literal></term>
- <listitem>
- <para>
- matches nothing, since it would be interpreted as a domain name and
- there is no top-level domain called <literal>.html</literal>. So it's
- a mistake.
- </para>
- </listitem>
- </varlistentry>
-</variablelist>
-
-
-<!-- ~~~~~ New section ~~~~~ -->
-<sect3><title>The Domain Pattern</title>
-
-<para>
- The matching of the domain part offers some flexible options: if the
- domain starts or ends with a dot, it becomes unanchored at that end.
- For example:
-</para>
-
-<variablelist>
- <varlistentry>
- <term><literal>.example.com</literal></term>
- <listitem>
- <para>
- matches any domain with first-level domain <literal>com</literal>
- and second-level domain <literal>example</literal>.
- For example <literal>www.example.com</literal>,
- <literal>example.com</literal> and <literal>foo.bar.baz.example.com</literal>.
- Note that it wouldn't match if the second-level domain was <literal>another-example</literal>.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term><literal>www.</literal></term>
- <listitem>
- <para>
- matches any domain that <emphasis>STARTS</emphasis> with
- <literal>www.</literal> (It also matches the domain
- <literal>www</literal> but most of the time that doesn't matter.)
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term><literal>.example.</literal></term>
- <listitem>
- <para>
- matches any domain that <emphasis>CONTAINS</emphasis> <literal>.example.</literal>.
- And, by the way, also included would be any files or documents that exist
- within that domain since no path limitations are specified. (Correctly
- speaking: It matches any FQDN that contains <literal>example</literal> as
- a domain.) This might be <literal>www.example.com</literal>,
- <literal>news.example.de</literal>, or
- <literal>www.example.net/cgi/testing.pl</literal> for instance. All these
- cases are matched.
- </para>
- </listitem>
- </varlistentry>
-</variablelist>
-
-<para>
- Additionally, there are wild-cards that you can use in the domain names
- themselves. These work similarly to shell globbing type wild-cards:
- <quote>*</quote> represents zero or more arbitrary characters (this is
- equivalent to the
- <ulink url="http://en.wikipedia.org/wiki/Regular_expressions"><quote>Regular
- Expression</quote></ulink> based syntax of <quote>.*</quote>),
- <quote>?</quote> represents any single character (this is equivalent to the
- regular expression syntax of a simple <quote>.</quote>), and you can define
- <quote>character classes</quote> in square brackets which is similar to
- the same regular expression technique. All of this can be freely mixed:
-</para>
-
-<variablelist>
- <varlistentry>
- <term><literal>ad*.example.com</literal></term>
- <listitem>
- <para>
- matches <quote>adserver.example.com</quote>,
- <quote>ads.example.com</quote>, etc., but not <quote>sfads.example.com</quote>.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term><literal>*ad*.example.com</literal></term>
- <listitem>
- <para>
- matches all of the above, and then some.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term><literal>.?pix.com</literal></term>
- <listitem>
- <para>
- matches <literal>www.ipix.com</literal>,
- <literal>pictures.epix.com</literal>, <literal>a.b.c.d.e.upix.com</literal> etc.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term><literal>www[1-9a-ez].example.c*</literal></term>
- <listitem>
- <para>
- matches <literal>www1.example.com</literal>,
- <literal>www4.example.cc</literal>, <literal>wwwd.example.cy</literal>,
- <literal>wwwz.example.com</literal> etc., but <emphasis>not</emphasis>
- <literal>wwww.example.com</literal>.
- </para>
- </listitem>
- </varlistentry>
-</variablelist>
-
-<para>
- While flexible, this is not the sophistication of full regular expression based syntax.
-</para>
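-
-<para>
- Put together, a section that uses such domain patterns might look like this
- sketch (the action and the host names are purely illustrative):
-</para>
-
-<para>
- <screen>
- { +block{Hypothetical ad servers.} }
- # ads.example.com, adserver.example.com, ...
- ad*.example.com
- # www1.example.com, wwwd.example.cy, ... but not wwww.example.com
- www[1-9a-ez].example.c*</screen>
-</para>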
-
-</sect3>
-
-<!-- ~ End section ~ -->
-
-
-<!-- ~~~~~ New section ~~~~~ -->
-<sect3><title>The Path Pattern</title>
-
-<para>
- <application>Privoxy</application> uses <quote>modern</quote> POSIX 1003.2
- <ulink url="http://en.wikipedia.org/wiki/Regular_expressions"><quote>Regular
- Expressions</quote></ulink> for matching the path portion (after the slash),
- and is thus more flexible.
-</para>
-
-<para>
- There is an <link linkend="regex">Appendix</link> with a brief quick-start into regular
- expressions. You might also want to have a look at your operating system's documentation
- on regular expressions (try <literal>man re_format</literal>).
-</para>
-
-<para>
- Note that the path pattern is automatically left-anchored at the <quote>/</quote>,
- i.e. it matches as if it started with a <quote>^</quote> (regular expression speak
- for the beginning of a line).
-</para>
-
-<para>
- Please also note that matching in the path is <emphasis>CASE INSENSITIVE</emphasis>
- by default, but you can switch to case sensitive at any point in the pattern by using the
- <quote>(?-i)</quote> switch: <literal>www.example.com/(?-i)PaTtErN.*</literal> will match
- only documents whose path starts with <literal>PaTtErN</literal> in
- <emphasis>exactly</emphasis> this capitalization.
-</para>
-
-<variablelist>
- <varlistentry>
- <term><literal>.example.com/.*</literal></term>
- <listitem>
- <para>
- Is equivalent to just <quote>.example.com</quote>, since any documents
- within that domain are matched with or without the <quote>.*</quote>
- regular expression. This is redundant.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term><literal>.example.com/.*/index.html$</literal></term>
- <listitem>
- <para>
- Will match any page in the domain of <quote>example.com</quote> that is
- named <quote>index.html</quote>, and that is part of some path. For
- example, it matches <quote>www.example.com/testing/index.html</quote> but
- NOT <quote>www.example.com/index.html</quote> because the regular
- expression called for at least two <quote>/'s</quote>, thus the path
- requirement. It also would match
- <quote>www.example.com/testing/index_html</quote>, because of the
- special meta-character <quote>.</quote>.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term><literal>.example.com/(.*/)?index\.html$</literal></term>
- <listitem>
- <para>
- In this regular expression the directory part is optional, so it will match any page
- named <quote>index.html</quote> regardless of its path, which in this case can
- contain one or more <quote>/'s</quote>. Because the dot is escaped, the name must contain exactly
- <quote>.html</quote>, and because of the trailing <quote>$</quote> the path must end with it.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term><literal>.example.com/(.*/)(ads|banners?|junk)</literal></term>
- <listitem>
- <para>
- This regular expression will match any path of <quote>example.com</quote>
- that contains any of the words <quote>ads</quote>, <quote>banner</quote>,
- <quote>banners</quote> (because of the <quote>?</quote>) or <quote>junk</quote>.
- The path does not have to end in these words, just contain them.
- Because of the <quote>(.*/)</quote> group, the path has to contain at least
- two slashes (including the one at the start of the path).
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term><literal>.example.com/(.*/)(ads|banners?|junk)/.*\.(jpe?g|gif|png)$</literal></term>
- <listitem>
- <para>
- This is very much the same as above, except now it must end in either
- <quote>.jpg</quote>, <quote>.jpeg</quote>, <quote>.gif</quote> or <quote>.png</quote>. So this
- one is limited to common image formats.
- </para>
- </listitem>
- </varlistentry>
-
-</variablelist>
-<para>
- There are many, many good examples to be found in <filename>default.action</filename>,
- and more tutorials below in the <link linkend="regex">Appendix on regular expressions</link>.
-</para>
-
-</sect3>
-
-<!-- ~ End section ~ -->
-
-
-<!-- ~~~~~ New section ~~~~~ -->
-<sect3 id="tag-pattern"><title>The Tag Pattern</title>
-
-<para>
- Tag patterns are used to change the actions that apply based on the
- request's tags. Tags can be created with either the
- <link linkend="CLIENT-HEADER-TAGGER">client-header-tagger</link>
- or the <link linkend="SERVER-HEADER-TAGGER">server-header-tagger</link> action.
-</para>
-
-<para>
- Tag patterns have to start with <quote>TAG:</quote>, so &my-app;
- can tell them apart from URL patterns. Everything after the colon,
- including white space, is interpreted as a regular expression with
- path pattern syntax, except that tag patterns aren't left-anchored
- automatically (&my-app; doesn't silently add a <quote>^</quote>;
- you have to do it yourself if you need it).
-</para>
-
-<para>
- To match all requests that are tagged with <quote>foo</quote>
- your pattern line should be <quote>TAG:^foo$</quote>.
- <quote>TAG:foo</quote> would work as well, but it would also
- match requests whose tags contain <quote>foo</quote> somewhere.
- <quote>TAG: foo</quote> wouldn't work as it requires white space.
-</para>
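-
-<para>
- As a minimal sketch of the difference between anchored and unanchored
- tag patterns, assuming the <literal>user-agent</literal> tagger used
- elsewhere in this manual is available (the action that is switched off
- is chosen only for illustration):
-</para>
-
-<para>
- <screen>
-# Create one tag per request from the User-Agent header
-{+client-header-tagger{user-agent}}
-/
-
-# Anchored: matches only tags that begin with "User-Agent: MPlayer/"
-{-deanimate-gifs}
-TAG:^User-Agent: MPlayer/
-
-# Unanchored: matches any tag that contains "MPlayer" somewhere
-{-deanimate-gifs}
-TAG:MPlayer
- </screen>
-</para>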
-
-<para>
- Sections can contain URL and tag patterns at the same time,
- but tag patterns are checked after the URL patterns and thus
- always overrule them, even if they are located before the URL patterns.
-</para>
-
-<para>
- Once a new tag is added, Privoxy checks right away if it's matched by one
- of the tag patterns and updates the action settings accordingly. As a result
- tags can be used to activate other tagger actions, as long as these other
- taggers look for headers that haven't already been parsed.
-</para>
-
-<para>
- For example you could tag client requests which use the
- <literal>POST</literal> method,
- then use this tag to activate another tagger that adds a tag if cookies
- are sent, and then use a block action based on the cookie tag. This allows
- the outcome of one action to be used as input for a subsequent action. However, if
- you reversed the position of the described taggers and activated the
- method tagger based on the cookie tagger, no method tags would be created.
- The method tagger would look for the request line, but at the time
- the cookie tag is created, the request line has already been parsed.
-</para>
-
-<para>
- While this is a limitation you should be aware of, this kind of
- indirection is seldom needed anyway and even the example doesn't
- make too much sense.
-</para>
-
-</sect3>
-
-</sect2>
-
-<!-- ~ End section ~ -->
-
-
-<!-- ~~~~~ New section ~~~~~ -->
-
-<sect2 id="actions">
-<title>Actions</title>
-<para>
- All actions are disabled by default, until they are explicitly enabled
- somewhere in an actions file. Actions are turned on if preceded with a
- <quote>+</quote>, and turned off if preceded with a <quote>-</quote>. So a
- <literal>+action</literal> means <quote>do that action</quote>, e.g.
- <literal>+block</literal> means <quote>please block URLs that match the
- following patterns</quote>, and <literal>-block</literal> means <quote>don't
- block URLs that match the following patterns, even if <literal>+block</literal>
- previously applied.</quote>
-
-</para>
-
-<para>
- Again, actions are invoked by placing them on a line, enclosed in curly braces and
- separated by whitespace, like in
- <literal>{+some-action -some-other-action{some-parameter}}</literal>,
- followed by a list of URL patterns, one per line, to which they apply.
- Together, the actions line and the following pattern lines make up a section
- of the actions file.
-</para>
-
-<para>
- Actions fall into three categories:
-</para>
-
-<para>
- <itemizedlist>
- <listitem>
- <para>
- Boolean, i.e. the action can only be <quote>enabled</quote> or
- <quote>disabled</quote>. Syntax:
- </para>
- <para>
- <screen>
- +<replaceable class="function">name</replaceable> # enable action <replaceable class="parameter">name</replaceable>
- -<replaceable class="function">name</replaceable> # disable action <replaceable class="parameter">name</replaceable></screen>
- </para>
- <para>
- Example: <literal>+handle-as-image</literal>
- </para>
- </listitem>
-
-
- <listitem>
- <para>
- Parameterized, where some value is required in order to enable this type of action.
- Syntax:
- </para>
- <para>
- <screen>
- +<replaceable class="function">name</replaceable>{<replaceable class="parameter">param</replaceable>} # enable action and set parameter to <replaceable class="parameter">param</replaceable>,
- # overwriting parameter from previous match if necessary
- -<replaceable class="function">name</replaceable> # disable action. The parameter can be omitted</screen>
- </para>
- <para>
- Note that if the URL matches multiple positive forms of a parameterized action,
- the last match wins, i.e. the params from earlier matches are simply ignored.
- </para>
- <para>
- Example: <literal>+hide-user-agent{Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.8.1.4) Gecko/20070602 Firefox/2.0.0.4}</literal>
- </para>
- </listitem>
-
- <listitem>
- <para>
- Multi-value. These look exactly like parameterized actions,
- but they behave differently: If the action applies multiple times to the
- same URL, but with different parameters, <emphasis>all</emphasis> the parameters
- from <emphasis>all</emphasis> matches are remembered. This is used for actions
- that can be executed for the same request repeatedly, like adding multiple
- headers, or filtering through multiple filters. Syntax:
- </para>
- <para>
- <screen>
- +<replaceable class="function">name</replaceable>{<replaceable class="parameter">param</replaceable>} # enable action and add <replaceable class="parameter">param</replaceable> to the list of parameters
- -<replaceable class="function">name</replaceable>{<replaceable class="parameter">param</replaceable>} # remove the parameter <replaceable class="parameter">param</replaceable> from the list of parameters
- # If it was the last one left, disable the action.
- -<replaceable class="function">name</replaceable> # disable this action completely and remove all parameters from the list</screen>
- </para>
- <para>
- Examples: <literal>+add-header{X-Fun-Header: Some text}</literal> and
- <literal>+filter{html-annoyances}</literal>
- </para>
- </listitem>
-
- </itemizedlist>
-</para>
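-
-<para>
- The difference between parameterized and multi-value actions can be
- sketched as follows. The host names, agent strings and header names are
- made up for illustration:
-</para>
-
-<para>
- <screen>
-# Parameterized: the last matching parameter wins
-{+hide-user-agent{Example-Agent/1.0}}
-.example.com
-
-{+hide-user-agent{Example-Agent/2.0}}
-www.example.com
-# www.example.com is sent "Example-Agent/2.0",
-# other hosts in example.com are sent "Example-Agent/1.0"
-
-# Multi-value: the parameters from all matches are collected
-{+add-header{X-Example-One: 1}}
-.example.com
-
-{+add-header{X-Example-Two: 2}}
-www.example.com
-# requests to www.example.com get both headers
-
-# Remove a single parameter from the list again
-{-add-header{X-Example-One: 1}}
-www.example.com/plain/
-# below /plain/ only "X-Example-Two: 2" is added</screen>
-</para>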
-
-<para>
- If nothing is specified in any actions file, no <quote>actions</quote> are
- taken. So in this case <application>Privoxy</application> would just be a
- normal, non-blocking, non-filtering proxy. You must specifically enable the
- privacy and blocking features you need (although the provided default actions
- files will give a good starting point).
-</para>
-
-<para>
- Later defined action sections always override earlier ones of the same type.
- So exceptions to any rules you make should come in the latter part of the file (or
- in a file that is processed later when using multiple actions files, such
- as <filename>user.action</filename>). For multi-valued actions, the actions
- are applied in the order they are specified. Actions files are processed in
- the order they are defined in <filename>config</filename> (the default
- installation has three actions files). It is also quite possible for any given
- URL to match more than one <quote>pattern</quote> (because of wildcards and
- regular expressions), and thus to trigger more than one set of actions! Last
- match wins.
-</para>
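-
-<para>
- A minimal sketch of a rule followed by a later exception (the host name
- and path are placeholders):
-</para>
-
-<para>
- <screen>
-{+block{Everything on this site is blocked.}}
-.example.com
-
-# Defined later, so it overrides the section above for matching URLs
-{-block}
-www.example.com/allowed/</screen>
-</para>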
-
-<!-- start actions listing -->
-<para>
- The valid <application>Privoxy</application> actions are:
-</para>
-
-
-<!-- ********************************************************** -->
-<!-- Please note the below defined actions use id's that are -->
-<!-- probably linked from other places, so please don't change. -->
-<!-- -->
-<!-- ********************************************************** -->
-
-
-<!-- ~~~~~ New section ~~~~~ -->
-
-<sect3 renderas="sect4" id="add-header">
-<title>add-header</title>
-
-<variablelist>
- <varlistentry>
- <term>Typical use:</term>
- <listitem>
- <para>Confuse log analysis, custom applications</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Effect:</term>
- <listitem>
- <para>
- Sends a user-defined HTTP header to the web server.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Type:</term>
- <!-- boolean, parameterized, Multi-value -->
- <listitem>
- <para>Multi-value.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Parameter:</term>
- <listitem>
- <para>
- Any string value is possible. Validity of the defined HTTP headers is not checked.
- It is recommended that you use the <quote><literal>X-</literal></quote> prefix
- for custom headers.
- </para>
- </listitem>
- </varlistentry>
-
-<varlistentry>
- <term>Notes:</term>
- <listitem>
- <para>
- This action may be specified multiple times, in order to define multiple
- headers. This is rarely needed for the typical user. If you don't know what
- <quote>HTTP headers</quote> are, you definitely don't need to worry about this
- one.
- </para>
- <para>
- Headers added by this action are not modified by other actions.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Example usage:</term>
- <listitem>
- <para>
- <screen>+add-header{X-User-Tracking: sucks}</screen>
- </para>
- </listitem>
- </varlistentry>
-</variablelist>
-</sect3>
-
-
-<!-- ~~~~~ New section ~~~~~ -->
-<sect3 renderas="sect4" id="block">
-<title>block</title>
-
-<variablelist>
- <varlistentry>
- <term>Typical use:</term>
- <listitem>
- <para>Block ads or other unwanted content</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Effect:</term>
- <listitem>
- <para>
- Requests for URLs to which this action applies are blocked, i.e. the
- requests are trapped by &my-app; and the requested URL is never retrieved,
- but is answered locally with a substitute page or image, as determined by
- the <literal><link
- linkend="handle-as-image">handle-as-image</link></literal>,
- <literal><link
- linkend="set-image-blocker">set-image-blocker</link></literal>, and
- <literal><link
- linkend="handle-as-empty-document">handle-as-empty-document</link></literal> actions.
-
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Type:</term>
- <!-- boolean, parameterized, Multi-value -->
- <listitem>
- <para>Parameterized.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Parameter:</term>
- <listitem>
- <para>A block reason that should be given to the user.</para>
- </listitem>
- </varlistentry>
-
-<varlistentry>
- <term>Notes:</term>
- <listitem>
- <para>
- <application>Privoxy</application> sends a special <quote>BLOCKED</quote> page
- for requests to blocked pages. This page contains the block reason given as
- parameter, a link to find out why the block action applies, and a click-through
- to the blocked content (the latter only if the force feature is available and
- enabled).
- </para>
- <para>
- A very important exception occurs if <emphasis>both</emphasis>
- <literal>block</literal> and <literal><link linkend="handle-as-image">handle-as-image</link></literal>
- apply to the same request: it will then be replaced by an image. If
- <literal><link linkend="set-image-blocker">set-image-blocker</link></literal>
- (see below) also applies, the type of image will be determined by its parameter;
- if not, the standard checkerboard pattern is sent.
- </para>
- <para>
- It is important to understand this process, in order
- to understand how <application>Privoxy</application> deals with
- ads and other unwanted content. Blocking is a core feature, and one
- upon which various other features depend.
- </para>
- <para>
- The <literal><link linkend="filter">filter</link></literal>
- action can perform a very similar task, by <quote>blocking</quote>
- banner images and other content through rewriting the relevant URLs in the
- document's HTML source, so they don't get requested in the first place.
- Note that this is a totally different technique, and it's easy to confuse the two.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Example usage (section):</term>
- <listitem>
- <para>
- <screen>{+block{No nasty stuff for you.}}
-# Block and replace with "blocked" page
- .nasty-stuff.example.com
-
-{+block{Doubleclick banners.} +handle-as-image}
-# Block and replace with image
- .ad.doubleclick.net
- .ads.r.us/banners/
-
-{+block{Layered ads.} +handle-as-empty-document}
-# Block and then ignore
- adserver.example.net/.*\.js$</screen>
- </para>
- </listitem>
- </varlistentry>
-
-
-</variablelist>
-</sect3>
-
-
-<!-- ~~~~~ New section ~~~~~ -->
-<sect3 renderas="sect4" id="change-x-forwarded-for">
-<title>change-x-forwarded-for</title>
-
-<variablelist>
- <varlistentry>
- <term>Typical use:</term>
- <listitem>
- <para>Improve privacy by not forwarding the source of the request in the HTTP headers.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Effect:</term>
- <listitem>
- <para>
- Deletes the <quote>X-Forwarded-For:</quote> HTTP header from the client request,
- or adds a new one.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Type:</term>
- <!-- Boolean, Parameterized, Multi-value -->
- <listitem>
- <para>Parameterized.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Parameter:</term>
- <listitem>
- <itemizedlist>
- <listitem>
- <para><quote>block</quote> to delete the header.</para>
- </listitem>
- <listitem>
- <para>
- <quote>add</quote> to create the header (or append
- the client's IP address to an already existing one).
- </para>
- </listitem>
- </itemizedlist>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Notes:</term>
- <listitem>
- <para>
- It is safe and recommended to use <literal>block</literal>.
- </para>
- <para>
- Forwarding the source address of the request may make
- sense in some multi-user setups but is also a privacy risk.
- </para>
- </listitem>
- </varlistentry>
- <varlistentry>
- <term>Example usage:</term>
- <listitem>
- <para>
- <screen>+change-x-forwarded-for{block}</screen>
- </para>
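- <para>
- A sketch for the multi-user case mentioned in the notes, with a placeholder
- host name: the header is removed everywhere except for requests to one
- internal server.
- </para>
- <para>
- <screen>{+change-x-forwarded-for{block}}
-/
-
-{+change-x-forwarded-for{add}}
-intranet.example.com</screen>
- </para>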
- </listitem>
- </varlistentry>
-</variablelist>
-</sect3>
-
-<!-- ~~~~~ New section ~~~~~ -->
-<sect3 renderas="sect4" id="client-header-filter">
-<title>client-header-filter</title>
-
-<variablelist>
- <varlistentry>
- <term>Typical use:</term>
- <listitem>
- <para>
- Rewrite or remove single client headers.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Effect:</term>
- <listitem>
- <para>
- All client headers to which this action applies are filtered on-the-fly through
- the specified regular expression based substitutions.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Type:</term>
- <!-- boolean, parameterized, Multi-value -->
- <listitem>
- <para>Parameterized.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Parameter:</term>
- <listitem>
- <para>
- The name of a client-header filter, as defined in one of the
- <link linkend="filter-file">filter files</link>.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Notes:</term>
- <listitem>
- <para>
- Client-header filters are applied to each header on its own, not to
- all at once. This makes it easier to diagnose problems, but on the downside
- you can't write filters that only change header x if header y's value is z.
- You can do that by using tags though.
- </para>
- <para>
- Client-header filters are executed after the other header actions have finished
- and use their output as input.
- </para>
- <para>
- If the request URL gets changed, &my-app; will detect that and use the new
- one. This can be used to rewrite the request destination behind the client's
- back, for example to specify a Tor exit relay for certain requests.
- </para>
- <para>
- Please refer to the <link linkend="filter-file">filter file chapter</link>
- to learn which client-header filters are available by default, and how to
- create your own.
- </para>
-
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Example usage (section):</term>
- <listitem>
- <para>
- <screen>
-# Hide Tor exit notation in Host and Referer Headers
-{+client-header-filter{hide-tor-exit-notation}}
-/
- </screen>
- </para>
- </listitem>
- </varlistentry>
-
-</variablelist>
-</sect3>
-
-
-<!-- ~~~~~ New section ~~~~~ -->
-<sect3 renderas="sect4" id="client-header-tagger">
-<title>client-header-tagger</title>
-
-<variablelist>
- <varlistentry>
- <term>Typical use:</term>
- <listitem>
- <para>
- Block requests based on their headers.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Effect:</term>
- <listitem>
- <para>
- Client headers to which this action applies are filtered on-the-fly through
- the specified regular expression based substitutions; the result is used as
- a tag.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Type:</term>
- <!-- boolean, parameterized, Multi-value -->
- <listitem>
- <para>Parameterized.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Parameter:</term>
- <listitem>
- <para>
- The name of a client-header tagger, as defined in one of the
- <link linkend="filter-file">filter files</link>.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Notes:</term>
- <listitem>
- <para>
- Client-header taggers are applied to each header on its own,
- and as the header isn't modified, each tagger <quote>sees</quote>
- the original.
- </para>
- <para>
- Client-header taggers are the first actions that are executed
- and their tags can be used to control every other action.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Example usage (section):</term>
- <listitem>
- <para>
- <screen>
-# Tag every request with the User-Agent header
-{+client-header-tagger{user-agent}}
-/
-
-# Tagging itself doesn't change the action
-# settings, sections with TAG patterns do:
-#
-# If it's a download agent, use a different forwarding proxy,
-# show the real User-Agent and make sure resume works.
-{+forward-override{forward-socks5 10.0.0.2:2222 .} \
- -hide-if-modified-since \
- -overwrite-last-modified \
- -hide-user-agent \
- -filter \
- -deanimate-gifs \
-}
-TAG:^User-Agent: NetBSD-ftp/
-TAG:^User-Agent: Novell ZYPP Installer
-TAG:^User-Agent: RPM APT-HTTP/
-TAG:^User-Agent: fetch libfetch/
-TAG:^User-Agent: Ubuntu APT-HTTP/
-TAG:^User-Agent: MPlayer/
- </screen>
- </para>
- </listitem>
- </varlistentry>
-
-</variablelist>
-</sect3>
-
-
-<!-- ~~~~~ New section ~~~~~ -->
-<sect3 renderas="sect4" id="content-type-overwrite">
-<title>content-type-overwrite</title>
-
-<variablelist>
- <varlistentry>
- <term>Typical use:</term>
- <listitem>
- <para>Stop useless download menus from popping up, or change the browser's rendering mode</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Effect:</term>
- <listitem>
- <para>
- Replaces the <quote>Content-Type:</quote> HTTP server header.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Type:</term>
- <!-- Boolean, Parameterized, Multi-value -->
- <listitem>
- <para>Parameterized.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Parameter:</term>
- <listitem>
- <para>
- Any string.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Notes:</term>
- <listitem>
- <para>
- The <quote>Content-Type:</quote> HTTP server header is used by the
- browser to decide what to do with the document. The value of this
- header can cause the browser to open a download menu instead of
- displaying the document by itself, even if the document's format is
- supported by the browser.
- </para>
- <para>
- The declared content type can also affect which rendering mode
- the browser chooses. If XHTML is delivered as <quote>text/html</quote>,
- many browsers treat it as yet another broken HTML document.
- If it is sent as <quote>application/xml</quote>, browsers with
- XHTML support will only display it if the syntax is correct.
- </para>
- <para>
- If you see a web site that proudly uses XHTML buttons, but sets
- <quote>Content-Type: text/html</quote>, you can use &my-app;
- to overwrite it with <quote>application/xml</quote> and validate
- the web master's claim inside your XHTML-supporting browser.
- If the syntax is incorrect, the browser will complain loudly.
- </para>
- <para>
- You can also go the opposite direction: if your browser prints
- error messages instead of rendering a document falsely declared
- as XHTML, you can overwrite the content type with
- <quote>text/html</quote> and have it rendered as a broken HTML document.
- </para>
- <para>
- By default <literal>content-type-overwrite</literal> only replaces
- <quote>Content-Type:</quote> headers that look like some kind of text.
- If you want to overwrite it unconditionally, you have to combine it with
- <literal><link linkend="force-text-mode">force-text-mode</link></literal>.
- This limitation exists for a reason; think twice before circumventing it.
- </para>
- <para>
- Most of the time it's easier to replace this action with a custom
- <literal><link linkend="server-header-filter">server-header filter</link></literal>.
- It allows you to activate it for every document of a certain site and it will still
- only replace the content types you aimed at.
- </para>
- <para>
- Of course you can apply <literal>content-type-overwrite</literal>
- to a whole site and then make URL based exceptions, but it's a lot
- more work to get the same precision.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Example usage (sections):</term>
- <listitem>
- <para>
- <screen># Check if www.example.net/ really uses valid XHTML
-{ +content-type-overwrite{application/xml} }
-www.example.net/
-
-# but leave the content type unmodified if the URL looks like a style sheet
-{-content-type-overwrite}
-www.example.net/.*\.css$
-www.example.net/.*style
-</screen>
- </para>
- </listitem>
- </varlistentry>
-</variablelist>
-</sect3>
-
-
-<!-- ~~~~~ New section ~~~~~ -->
-<sect3 renderas="sect4" id="crunch-client-header">
-<!--
-new action
--->
-<title>crunch-client-header</title>
-
-<variablelist>
- <varlistentry>
- <term>Typical use:</term>
- <listitem>
- <para>Remove a client header <application>Privoxy</application> has no dedicated action for.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Effect:</term>
- <listitem>
- <para>
- Deletes every header sent by the client that contains the string the user supplied as parameter.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Type:</term>
- <!-- Boolean, Parameterized, Multi-value -->
- <listitem>
- <para>Parameterized.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Parameter:</term>
- <listitem>
- <para>
- Any string.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Notes:</term>
- <listitem>
- <para>
- This action allows you to block client headers for which no dedicated
- <application>Privoxy</application> action exists.
- <application>Privoxy</application> will remove every client header that
- contains the string you supplied as parameter.
- </para>
- <para>
- Regular expressions are <emphasis>not supported</emphasis> and you can't
- use this action to block different headers in the same request, unless
- they contain the same string.
- </para>
- <para>
- <literal>crunch-client-header</literal> is only meant for quick tests.
- If you have to block several different headers, or only want to modify
- parts of them, you should use a
- <literal><link linkend="client-header-filter">client-header filter</link></literal>.
- </para>
- <warning>
- <para>
- Don't block any header without understanding the consequences.
- </para>
- </warning>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Example usage (section):</term>
- <listitem>
- <para>
- <screen># Block the non-existent "Privacy-Violation:" client header
-{ +crunch-client-header{Privacy-Violation:} }
-/
- </screen>
- </para>
- </listitem>
- </varlistentry>
-</variablelist>
-</sect3>
-
-
-<!-- ~~~~~ New section ~~~~~ -->
-<sect3 renderas="sect4" id="crunch-if-none-match">
-<title>crunch-if-none-match</title>
-<!--
-new action
--->
-<variablelist>
- <varlistentry>
- <term>Typical use:</term>
- <listitem>
- <para>Prevent yet another way to track the user's steps between sessions.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Effect:</term>
- <listitem>
- <para>
- Deletes the <quote>If-None-Match:</quote> HTTP client header.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Type:</term>
- <!-- Boolean, Parameterized, Multi-value -->
- <listitem>
- <para>Boolean.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Parameter:</term>
- <listitem>
- <para>
- N/A
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Notes:</term>
- <listitem>
- <para>
- Removing the <quote>If-None-Match:</quote> HTTP client header
- is useful for filter testing, where you want to force a real
- reload instead of getting status code <quote>304</quote> which
- would cause the browser to use a cached copy of the page.
- </para>
- <para>
- It is also useful to make sure the header isn't used as a cookie
- replacement (unlikely but possible).
- </para>
- <para>
- Blocking the <quote>If-None-Match:</quote> header shouldn't cause any
- caching problems, as long as the <quote>If-Modified-Since:</quote> header
- isn't blocked or missing as well.
- </para>
- <para>
- It is recommended to use this action together with
- <literal><link linkend="hide-if-modified-since">hide-if-modified-since</link></literal>
- and
- <literal><link linkend="overwrite-last-modified">overwrite-last-modified</link></literal>.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Example usage (section):</term>
- <listitem>
- <para>
- <screen># Let the browser revalidate cached documents but don't
-# allow the server to use the revalidation headers for user tracking.
-{+hide-if-modified-since{-60} \
- +overwrite-last-modified{randomize} \
- +crunch-if-none-match}
-/ </screen>
- </para>
- </listitem>
- </varlistentry>
-</variablelist>
-</sect3>
-
-
-<!-- ~~~~~ New section ~~~~~ -->
-<sect3 renderas="sect4" id="crunch-incoming-cookies">
-<title>crunch-incoming-cookies</title>
-
-<variablelist>
- <varlistentry>
- <term>Typical use:</term>
- <listitem>
- <para>
- Prevent the web server from setting HTTP cookies on your system
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Effect:</term>
- <listitem>
- <para>
- Deletes any <quote>Set-Cookie:</quote> HTTP headers from server replies.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Type:</term>
- <!-- Boolean, Parameterized, Multi-value -->
- <listitem>
- <para>Boolean.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Parameter:</term>
- <listitem>
- <para>
- N/A
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Notes:</term>
- <listitem>
- <para>
- This action is only concerned with <emphasis>incoming</emphasis> HTTP cookies. For
- <emphasis>outgoing</emphasis> HTTP cookies, use
- <literal><link linkend="crunch-outgoing-cookies">crunch-outgoing-cookies</link></literal>.
- Use <emphasis>both</emphasis> to disable HTTP cookies completely.
- </para>
- <para>
- It makes <emphasis>no sense at all</emphasis> to use this action in conjunction
- with the <literal><link linkend="session-cookies-only">session-cookies-only</link></literal> action,
- since it would prevent the session cookies from being set. See also
- <literal><link linkend="filter-content-cookies">filter-content-cookies</link></literal>.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Example usage:</term>
- <listitem>
- <para>
- <screen>+crunch-incoming-cookies</screen>
- </para>
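- <para>
- As a sketch, a section that disables HTTP cookies completely by combining
- both cookie-crunching actions for all URLs:
- </para>
- <para>
- <screen>{+crunch-incoming-cookies +crunch-outgoing-cookies}
-/</screen>
- </para>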
- </listitem>
- </varlistentry>
-</variablelist>
-</sect3>
-
-
-<!-- ~~~~~ New section ~~~~~ -->
-<sect3 renderas="sect4" id="crunch-server-header">
-<title>crunch-server-header</title>
-<!--
-new action
--->
-<variablelist>
- <varlistentry>
- <term>Typical use:</term>
- <listitem>
- <para>Remove a server header <application>Privoxy</application> has no dedicated action for.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Effect:</term>
- <listitem>
- <para>
- Deletes every header sent by the server that contains the string the user supplied as parameter.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Type:</term>
- <!-- Boolean, Parameterized, Multi-value -->
- <listitem>
- <para>Parameterized.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Parameter:</term>
- <listitem>
- <para>
- Any string.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Notes:</term>
- <listitem>
- <para>
- This action allows you to block server headers for which no dedicated
- <application>Privoxy</application> action exists. <application>Privoxy</application>
- will remove every server header that contains the string you supplied as parameter.
- </para>
- <para>
- Regular expressions are <emphasis>not supported</emphasis> and you can't
- use this action to block different headers in the same request, unless
- they contain the same string.
- </para>
- <para>
- <literal>crunch-server-header</literal> is only meant for quick tests.
- If you have to block several different headers, or only want to modify
- parts of them, you should use a custom
- <literal><link linkend="server-header-filter">server-header filter</link></literal>.
- </para>
- <warning>
- <para>
- Don't block any header without understanding the consequences.
- </para>
- </warning>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Example usage (section):</term>
- <listitem>
- <para>
- <screen># Crunch server headers that try to prevent caching
-{ +crunch-server-header{no-cache} }
-/ </screen>
- </para>
- </listitem>
- </varlistentry>
-</variablelist>
-</sect3>
-
-
-<!-- ~~~~~ New section ~~~~~ -->
-<sect3 renderas="sect4" id="crunch-outgoing-cookies">
-<title>crunch-outgoing-cookies</title>
-
-<variablelist>
- <varlistentry>
- <term>Typical use:</term>
- <listitem>
- <para>
- Prevent the web server from reading any HTTP cookies from your system
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Effect:</term>
- <listitem>
- <para>
- Deletes any <quote>Cookie:</quote> HTTP headers from client requests.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Type:</term>
- <!-- Boolean, Parameterized, Multi-value -->
- <listitem>
- <para>Boolean.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Parameter:</term>
- <listitem>
- <para>
- N/A
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Notes:</term>
- <listitem>
- <para>
- This action is only concerned with <emphasis>outgoing</emphasis> HTTP cookies. For
- <emphasis>incoming</emphasis> HTTP cookies, use
- <literal><link linkend="crunch-incoming-cookies">crunch-incoming-cookies</link></literal>.
- Use <emphasis>both</emphasis> to disable HTTP cookies completely.
- </para>
- <para>
- It makes <emphasis>no sense at all</emphasis> to use this action in conjunction
- with the <literal><link linkend="session-cookies-only">session-cookies-only</link></literal> action,
- since it would prevent the session cookies from being read.
- </para>
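- <para>
- As a minimal sketch of disabling HTTP cookies completely, both actions can be
- combined in one section of an actions file. The host pattern
- <literal>.example.com</literal> is only a placeholder:
- </para>
- <para>
- <screen>{ +crunch-incoming-cookies +crunch-outgoing-cookies }
-.example.com</screen>
- </para>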
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Example usage:</term>
- <listitem>
- <para>
- <screen>+crunch-outgoing-cookies</screen>
- </para>
- </listitem>
- </varlistentry>
-
-</variablelist>
-</sect3>
-
-
-<!-- ~~~~~ New section ~~~~~ -->
-<sect3 renderas="sect4" id="deanimate-gifs">
-<title>deanimate-gifs</title>
-
-<variablelist>
- <varlistentry>
- <term>Typical use:</term>
- <listitem>
- <para>Stop those annoying, distracting animated GIF images.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Effect:</term>
- <listitem>
- <para>
- De-animate GIF animations, i.e. reduce them to their first or last image.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Type:</term>
- <!-- boolean, parameterized, Multi-value -->
- <listitem>
- <para>Parameterized.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Parameter:</term>
- <listitem>
- <para>
- <quote>last</quote> or <quote>first</quote>
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Notes:</term>
- <listitem>
- <para>
- This will also shrink the images considerably (in bytes, not pixels!). If
- the option <quote>first</quote> is given, the first frame of the animation
- is used as the replacement. If <quote>last</quote> is given, the last
- frame of the animation is used instead, which probably makes more sense for
- most banner animations, but also has the risk of not showing the entire
- last frame (if it is only a delta to an earlier frame).
- </para>
- <para>
- You can safely use this action with patterns that will also match non-GIF
- objects, because no attempt will be made at anything that doesn't look like
- a GIF.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Example usage:</term>
- <listitem>
- <para>
- <screen>+deanimate-gifs{last}</screen>
- </para>
- </listitem>
- </varlistentry>
-</variablelist>
-</sect3>
-
-<!-- ~~~~~ New section ~~~~~ -->
-<sect3 renderas="sect4" id="downgrade-http-version">
-<title>downgrade-http-version</title>
-
-<variablelist>
- <varlistentry>
- <term>Typical use:</term>
- <listitem>
- <para>Work around (very rare) problems with HTTP/1.1</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Effect:</term>
- <listitem>
- <para>
- Downgrades HTTP/1.1 client requests and server replies to HTTP/1.0.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Type:</term>
- <!-- boolean, parameterized, Multi-value -->
- <listitem>
- <para>Boolean.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Parameter:</term>
- <listitem>
- <para>
- N/A
- </para>
- </listitem>
- </varlistentry>
-
-<varlistentry>
- <term>Notes:</term>
- <listitem>
- <para>
- This is a left-over from the time when <application>Privoxy</application>
- didn't support important HTTP/1.1 features well. It is left here for the
- unlikely case that you experience HTTP/1.1 related problems with some server
- out there. Not all HTTP/1.1 features and requirements are supported yet,
- so there is a chance you might need this action.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Example usage (section):</term>
- <listitem>
- <para>
- <screen>{+downgrade-http-version}
-problem-host.example.com</screen>
- </para>
- </listitem>
- </varlistentry>
-
-</variablelist>
-</sect3>
-
-<!-- ~~~~~ New section ~~~~~ -->
-<sect3 renderas="sect4" id="fast-redirects">
-<title>fast-redirects</title>
-
-<variablelist>
- <varlistentry>
- <term>Typical use:</term>
- <listitem>
- <para>Fool some click-tracking scripts and speed up indirect links.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Effect:</term>
- <listitem>
- <para>
- Detects redirection URLs and redirects the browser without contacting
- the redirection server first.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Type:</term>
- <!-- boolean, parameterized, Multi-value -->
- <listitem>
- <para>Parameterized.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Parameter:</term>
- <listitem>
- <itemizedlist>
- <listitem>
- <para>
- <quote>simple-check</quote> to just search for the string <quote>http://</quote>
- to detect redirection URLs.
- </para>
- </listitem>
- <listitem>
- <para>
- <quote>check-decoded-url</quote> to decode URLs (if necessary) before searching
- for redirection URLs.
- </para>
- </listitem>
- </itemizedlist>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Notes:</term>
- <listitem>
- <para>
- Many sites, like yahoo.com, don't just link to other sites. Instead, they
- will link to some script on their own servers, giving the destination as a
- parameter, which will then redirect you to the final target. URLs
- resulting from this scheme typically look like:
- <quote>http://www.example.org/click-tracker.cgi?target=http%3a//www.example.net/</quote>.
- </para>
- <para>
- Sometimes, there are even multiple consecutive redirects encoded in the
- URL. These redirections via scripts make your web browsing more traceable,
- since the server from which you follow such a link can see where you go
- to. Apart from that, valuable bandwidth and time are wasted while your
- browser asks the server for one redirect after the other. Plus, it feeds
- the advertisers.
- </para>
- <para>
- This feature is currently not very smart and is scheduled for improvement.
- If it is enabled by default, you will have to create some exceptions to
- this action. It can lead to failures in several ways:
- </para>
- <para>
- Not every URL with other URLs as parameters is evil.
- Some sites offer a real service that requires this information to work.
- For example, a validation service needs to know which document to validate.
- <literal>fast-redirects</literal> assumes that every URL parameter that
- looks like another URL is a redirection target, and will always redirect to
- the last one. Most of the time the assumption is correct, but if it isn't,
- the user gets redirected anyway.
- </para>
- <para>
- Another failure occurs if the URL contains other parameters after the URL parameter.
- The URL:
- <quote>http://www.example.org/?redirect=http%3a//www.example.net/&foo=bar</quote>
- contains the redirection URL <quote>http://www.example.net/</quote>,
- followed by another parameter. <literal>fast-redirects</literal> doesn't know that
- and will cause a redirect to <quote>http://www.example.net/&foo=bar</quote>.
- Depending on the target server configuration, the parameter will be silently ignored
- or lead to a <quote>page not found</quote> error. You can prevent this problem by
- first using the <literal><link linkend="redirect">redirect</link></literal> action
- to remove the last part of the URL, but it requires a little effort.
- </para>
- <para>
- To detect a redirection URL, <literal>fast-redirects</literal> only
- looks for the string <quote>http://</quote>, either in plain text
- (invalid but often used) or encoded as <quote>http%3a//</quote>.
- Some sites use their own URL encoding scheme, encrypt the address
- of the target server or replace it with a database id. In these cases
- <literal>fast-redirects</literal> is fooled and the request reaches the
- redirection server where it probably gets logged.
- </para>
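- <para>
- An exception for a site that legitimately takes URLs as parameters could
- look like the following sketch. The host name is hypothetical, and the first
- section merely assumes that <literal>fast-redirects</literal> has been
- enabled for all URLs:
- </para>
- <para>
- <screen>{ +fast-redirects{check-decoded-url} }
-/
-
-{ -fast-redirects }
-validator.example.org</screen>
- </para>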
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Example usage:</term>
- <listitem>
- <para>
- <screen>
- { +fast-redirects{simple-check} }
- one.example.com
-
- { +fast-redirects{check-decoded-url} }
- another.example.com/testing</screen>
- </para>
- </listitem>
- </varlistentry>
-
-</variablelist>
-</sect3>
-
-
-<!-- ~~~~~ New section ~~~~~ -->
-<sect3 renderas="sect4" id="filter">
-<title>filter</title>
-
-<variablelist>
- <varlistentry>
- <term>Typical use:</term>
- <listitem>
- <para>Get rid of HTML and JavaScript annoyances, banner advertisements (by size),
- do fun text replacements, add personalized effects, etc.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Effect:</term>
- <listitem>
- <para>
- All text-based content, most notably HTML and JavaScript, to which
- this action applies can be filtered on-the-fly through the specified regular
- expression based substitutions. (Note: as of version 3.0.3 plain text documents
- are exempted from filtering, because web servers often use the
- <literal>text/plain</literal> MIME type for all files whose type they don't know.)
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Type:</term>
- <!-- boolean, parameterized, Multi-value -->
- <listitem>
- <para>Parameterized.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Parameter:</term>
- <listitem>
- <para>
- The name of a content filter, as defined in the <link linkend="filter-file">filter file</link>.
- Filters can be defined in one or more files as defined by the
- <literal><link linkend="filterfile">filterfile</link></literal>
- option in the <link linkend="config">config file</link>.
- <filename>default.filter</filename> is the collection of filters
- supplied by the developers. Locally defined filters should go
- in their own file, such as <filename>user.filter</filename>.
- </para>
- <para>
- When used in its negative form,
- and without parameters, <emphasis>all</emphasis> filtering is completely disabled.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Notes:</term>
- <listitem>
- <para>
- For your convenience, there are a number of pre-defined filters available
- in the distribution filter file that you can use. See the examples below for
- a list.
- </para>
- <para>
- Filtering requires buffering the page content, which may appear to
- slow down page rendering since nothing is displayed until all content has
- passed the filters. (The total time until the page is completely rendered
- doesn't change much, but it may be perceived as slower since the page is
- not incrementally displayed.)
- This effect will be more noticeable on slower connections.
- </para>
- <para>
- <quote>Rolling your own</quote>
- filters requires knowledge of
- <ulink url="http://en.wikipedia.org/wiki/Regular_expressions"><quote>Regular
- Expressions</quote></ulink> and
- <ulink url="http://en.wikipedia.org/wiki/Html"><quote>HTML</quote></ulink>.
- This is a very powerful feature, and potentially very intrusive.
- Filters should be used with caution, and only where an equivalent
- <quote>action</quote> is not available.
- </para>
- <para>
- The amount of data that can be filtered is limited to the
- <literal><link linkend="buffer-limit">buffer-limit</link></literal>
- option in the main <link linkend="config">config file</link>. The
- default is 4096 KB (4 MB). Once this limit is exceeded, the buffered
- data, and all pending data, is passed through unfiltered.
- </para>
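- <para>
- For illustration, the default stated above would correspond to a line like
- the following in the main config file. This is only a sketch; the value is
- given in KB:
- </para>
- <para>
- <screen>buffer-limit 4096</screen>
- </para>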
- <para>
- Inappropriate MIME types, such as zipped files, are not filtered at all.
- (Again, only text-based types other than plain text are filtered.) Encrypted SSL data
- (from HTTPS servers) cannot be filtered either, since this would violate
- the integrity of the secure transaction. In some situations it might
- be necessary to protect certain text, like source code, from filtering
- by defining appropriate <literal>-filter</literal> exceptions.
- </para>
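- <para>
- Such an exception might be sketched as follows. The host and path are
- placeholders for a location that serves source code; <literal>-filter</literal>
- without a parameter disables all filters for the matching URLs:
- </para>
- <para>
- <screen>{ -filter }
-.example.org/source/</screen>
- </para>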
- <para>
- Compressed content can't be filtered either, but if &my-app;
- is compiled with zlib support and a supported compression algorithm
- is used (gzip or deflate), &my-app; can first decompress the content
- and then filter it.
- </para>
- <para>
- If you use a &my-app; version without zlib support, but want filtering to work on
- as many documents as possible, even those that would normally be sent compressed,
- you must use the <literal><link linkend="prevent-compression">prevent-compression</link></literal>
- action in conjunction with <literal>filter</literal>.
- </para>
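- <para>
- A section combining the two actions could look like this sketch. The host
- pattern is a placeholder and <literal>banners-by-size</literal> is just one
- of the predefined filters, chosen for illustration:
- </para>
- <para>
- <screen>{ +prevent-compression +filter{banners-by-size} }
-.example.com</screen>
- </para>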
- <para>
- Content filtering can achieve some of the same effects as the
- <literal><link linkend="block">block</link></literal>
- action, i.e. it can be used to block ads and banners. But the mechanism
- works quite differently. One effective use is to block ad banners
- based on their size (see below), since many of these seem to be somewhat
- standardized.
- </para>
- <para>
- <link linkend="contact">Feedback</link> with suggestions for new or
- improved filters is particularly welcome!
- </para>
- <para>
- The list below has only the names and a one-line description of each
- predefined filter. There are <link linkend="predefined-filters">more
- verbose explanations</link> of what these filters do in the <link
- linkend="filter-file">filter file chapter</link>.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>Example usage (with filters from the distribution <filename>default.filter</filename> file).
- See <link linkend="PREDEFINED-FILTERS">the Predefined Filters section</link> for
- more explanation on each:</term>
- <listitem>
- <para>
- <anchor id="filter-js-annoyances">
- <screen>+filter{js-annoyances} # Get rid of particularly annoying JavaScript abuse.</screen>
- </para>
- <para>
- <anchor id="filter-js-events">
- <screen>+filter{js-events} # Kill all JS event bindings and timers (Radically destructive! Only for extra nasty sites).</screen>
- </para>
- <para>
- <anchor id="filter-html-annoyances">
- <screen>+filter{html-annoyances} # Get rid of particularly annoying HTML abuse.</screen>
- </para>
- <para>
- <anchor id="filter-content-cookies">
- <screen>+filter{content-cookies} # Kill cookies that come in the HTML or JS content.</screen>
- </para>
- <para>
- <anchor id="filter-refresh-tags">
- <screen>+filter{refresh-tags} # Kill automatic refresh tags (for dial-on-demand setups).</screen>
- </para>
- <para>
- <anchor id="filter-unsolicited-popups">
- <screen>+filter{unsolicited-popups} # Disable only unsolicited pop-up windows. Useful if your browser lacks this ability.</screen>
- </para>
- <para>
- <anchor id="filter-all-popups">
- <screen>+filter{all-popups} # Kill all popups in JavaScript and HTML. Useful if your browser lacks this ability.</screen>
- </para>
- <para>
- <anchor id="filter-img-reorder">
- <screen>+filter{img-reorder} # Reorder attributes in <img> tags to make the banners-by-* filters more effective.</screen>
- </para>
- <para>
- <anchor id="filter-banners-by-size">
- <screen>+filter{banners-by-size} # Kill banners by size.</screen>
- </para>
- <para>
- <anchor id="filter-banners-by-link">
- <screen>+filter{banners-by-link} # Kill banners by their links to known clicktrackers.</screen>
- </para>
- <para>
- <anchor id="filter-webbugs">
- <screen>+filter{webbugs} # Squish WebBugs (1x1 invisible GIFs used for user tracking).</screen>
- </para>
- <para>
- <anchor id="filter-tiny-textforms">
- <screen>+filter{tiny-textforms} # Extend those tiny textareas up to 40x80 and kill the hard wrap.</screen>
- </para>
- <para>
- <anchor id="filter-jumping-windows">
- <screen>+filter{jumping-windows} # Prevent windows from resizing and moving themselves.</screen>
- </para>
- <para>
- <anchor id="filter-frameset-borders">
- <screen>+filter{frameset-borders} # Give frames a border and make them resizable.</screen>
- </para>
- <para>
- <anchor id="filter-demoronizer">
- <screen>+filter{demoronizer} # Fix MS's non-standard use of standard charsets.</screen>
- </para>
- <para>
- <anchor id="filter-shockwave-flash">
- <screen>+filter{shockwave-flash} # Kill embedded Shockwave Flash objects.</screen>
- </para>
- <para>
- <anchor id="filter-quicktime-kioskmode">
- <screen>+filter{quicktime-kioskmode} # Make Quicktime movies saveable.</screen>
- </para>
- <para>
- <anchor id="filter-fun">
- <screen>+filter{fun} # Text replacements for subversive browsing fun!</screen>
- </para>
- <para>
- <anchor id="filter-crude-parental">
- <screen>+filter{crude-parental} # Crude parental filtering. Note that this filter doesn't work reliably.</screen>
- </para>
- <para>
- <anchor id="filter-ie-exploits">
- <screen>+filter{ie-exploits} # Disable some known Internet Explorer bug exploits.</screen>
- </para>
- <para>
- <anchor id="filter-site-specifics">
- <screen>+filter{site-specifics} # Cure for site-specific problems. Don't apply generally!</screen>
- </para>
- <para>
- <anchor id="filter-no-ping">
- <screen>+filter{no-ping} # Removes non-standard ping attributes in <a> and <area> tags.</screen>
- </para>
- <para>
- <anchor id="filter-google">
- <screen>+filter{google} # CSS-based block for Google text ads. Also removes a width limitation and the toolbar advertisement.</screen>
- </para>
- <para>
- <anchor id="filter-yahoo">
- <screen>+filter{yahoo} # CSS-based block for Yahoo text ads. Also removes a width limitation.</screen>
- </para>
- <para>
- <anchor id="filter-msn">
- <screen>+filter{msn} # CSS-based block for MSN text ads. Also removes tracking URLs and a width limitation.</screen>
- </para>
- <para>
- <anchor id="filter-blogspot">
- <screen>+filter{blogspot} # Cleans up some Blogspot blogs. Read the fine print before using this.</screen>
- </para>
- </listitem>
- </varlistentry>
-</variablelist>
-</sect3>
-
-
-<!-- ~~~~~ New section ~~~~~ -->
-<sect3 renderas="sect4" id="force-text-mode">
-<title>force-text-mode</title>
-<!--
-new action
--->
-<variablelist>
- <varlistentry>
- <term>Typical use:</term>
- <listitem>
- <para>Force <application>Privoxy</application> to treat a document as if it were in some kind of <emphasis>text</emphasis> format.</para>
- </listitem>
- </varlistentry>