Stop pretending that we release updated action files on their own

diff --git a/doc/source/user-manual.sgml b/doc/source/user-manual.sgml

index 408a612..ef21bd2 100644
--- a/doc/source/user-manual.sgml
+++ b/doc/source/user-manual.sgml
@@ -11,8 +11,8 @@
  <!entity license SYSTEM "license.sgml">
  <!entity p-authors SYSTEM "p-authors.sgml">
  <!entity config SYSTEM "p-config.sgml">
-<!entity p-version "3.0.18">
-<!entity p-status "UNRELEASED">
+<!entity p-version "3.0.20">
+<!entity p-status "beta">
  <!entity % p-authors-formal "INCLUDE"> <!-- include additional text, etc  -->
  <!entity % p-not-stable "INCLUDE">
  <!entity % p-stable "IGNORE">
@@ -34,9 +34,9 @@
                  This file belongs into
                  ijbswa.sourceforge.net:/home/groups/i/ij/ijbswa/htdocs/
  
- $Id: user-manual.sgml,v 2.134 2011/08/18 11:45:02 fabiankeil Exp $
+ $Id: user-manual.sgml,v 2.159 2013/01/09 15:03:06 fabiankeil Exp $
  
- Copyright (C) 2001-2011 Privoxy Developers http://www.privoxy.org/
+ Copyright (C) 2001-2013 Privoxy Developers http://www.privoxy.org/
   See LICENSE.
  
   ========================================================================
@@ -55,12 +55,12 @@
   <subscript>
  <!-- Completely the wrong markup, but very little is allowed  -->
  <!-- in this part of an article. FIXME -->
- <link linkend="copyright">Copyright</link> &my-copy; 2001-2011 by
+ <link linkend="copyright">Copyright</link> &my-copy; 2001-2013 by
   <ulink url="http://www.privoxy.org/">Privoxy Developers</ulink>
   </subscript>
  </pubdate>
  
-<pubdate>$Id: user-manual.sgml,v 2.134 2011/08/18 11:45:02 fabiankeil Exp $</pubdate>
+<pubdate>$Id: user-manual.sgml,v 2.159 2013/01/09 15:03:06 fabiankeil Exp $</pubdate>
  
  <!--
  
@@ -301,22 +301,74 @@ How to install the binary packages depends on your operating system:
  <!--   ~~~~~       New section      ~~~~~     -->
  <sect3 id="installation-mac"><title>Mac OS X</title>
  <para>
- Unzip the downloaded file (you can either double-click on the zip file
- icon from the Finder, or from the desktop if you downloaded it there).
- Then, double-click on the package installer icon and follow the
- installation process.
+ Installation instructions for the OS X platform depend upon whether
+ you downloaded a ready-built installation package (.pkg or .mpkg) or have
+ downloaded the source code.
+</para>
+</sect3>
+<sect3 renderas="sect4" id="OS-X-install-from-package">
+<title>Installation from ready-built package</title>
+<para>
+ The downloaded file will either be a .pkg (for OS X 10.5 upwards) or a bzipped
+ .mpkg file (for OS X 10.4). The former can be double-clicked as is and the
+ installation will start; double-clicking the latter will unzip the .mpkg file
+ which can then be double-clicked to commence the installation.
+</para>
+<para>
+ The privoxy service will automatically start after a successful installation
+ (and thereafter every time your computer starts up); however, you will need to
+ configure your web browser(s) to use it. To do so, configure them to use a
+ proxy for HTTP and HTTPS at the address 127.0.0.1:8118.
+</para>
+<para>
+ To prevent the privoxy service from automatically starting when your computer
+ starts up, remove or rename the file <literal>/Library/LaunchDaemons/org.ijbswa.privoxy.plist</literal>
+ (on OS X 10.5 and higher) or the folder named
+ <literal>/Library/StartupItems/Privoxy</literal> (on OS X 10.4 'Tiger').
+</para>
+<para>
+ To manually start or stop the privoxy service, use the scripts startPrivoxy.sh
+ and stopPrivoxy.sh supplied in /Applications/Privoxy. They must be run from an
+ administrator account, using sudo.
+</para>
+<para>
+ To uninstall, run /Applications/Privoxy/uninstall.command using sudo from an
+ administrator account.
+</para>
+</sect3>
+<sect3 renderas="sect4" id="OS-X-install-from-source">
+<title>Installation from source</title>
+<para>
+ To build and install the Privoxy source code on OS X you will need to obtain
+ the macsetup module from the Privoxy Sourceforge CVS repository (refer to
+ Sourceforge help for details of how to set up a CVS client to have read-only
+ access to the repository). This module contains scripts that leverage the usual
+ open-source tools (available as part of Apple's free-of-charge Xcode
+ distribution or via the usual open-source software package managers for OS X,
+ such as MacPorts, Homebrew or Fink) to build and then install the privoxy binary
+ and associated files. The macsetup module's README file contains complete
+ instructions for its use.
+</para>
+<para>
+ The privoxy service will automatically start after a successful installation
+ (and thereafter every time your computer starts up); however, you will need to
+ configure your web browser(s) to use it. To do so, configure them to use a
+ proxy for HTTP and HTTPS at the address 127.0.0.1:8118.
  </para>
  <para>
- The privoxy service will automatically start after a successful
- installation (in addition to every time your computer starts up).  To
- prevent the privoxy service from automatically starting when your
- computer starts up, remove or rename the folder named
- <literal>/Library/StartupItems/Privoxy</literal>.
+ To prevent the privoxy service from automatically starting when your computer
+ starts up, remove or rename the file <literal>/Library/LaunchDaemons/org.ijbswa.privoxy.plist</literal>
+ (on OS X 10.5 and higher) or the folder named
+ <literal>/Library/StartupItems/Privoxy</literal> (on OS X 10.4 'Tiger').
  </para>
  <para>
   To manually start or stop the privoxy service, use the Privoxy Utility
- for Mac OS X.  This application controls the privoxy service (e.g.
- starting and stopping the service as well as uninstalling the software).
+ for Mac OS X (also part of the macsetup module).  This application can start
+ and stop the privoxy service and display its log and configuration files.
+</para>
+<para>
+ To uninstall, run the macsetup module's uninstall.sh using sudo from an
+ administrator account.
  </para>
  </sect3>
  
@@ -402,13 +454,6 @@ How to install the binary packages depends on your operating system:
  </sect2>
  <!--   ~~~~~       New section      ~~~~~     -->
  <sect2 id="installation-keepupdated"><title>Keeping your Installation Up-to-Date</title>
-<para>
- As user feedback comes in and development continues, we will make updated versions
- of both the main <link linkend="actions-file">actions file</link> (as a <ulink
- url="http://sourceforge.net/project/showfiles.php?group_id=11118&amp;release_id=103670">separate
- package</ulink>) and the software itself (including the actions file) available for
- download.
-</para>
  
  <para>
   If you wish to receive an email notification whenever we release updates of
@@ -437,642 +482,1158 @@ How to install the binary packages depends on your operating system:
  <sect1 id="whatsnew">
  <title>What's New in this Release</title>
  <para>
- <application>Privoxy 3.0.17</application> is a stable release.
- The changes since 3.0.16 stable are:
+ <application>Privoxy 3.0.19</application> is a stable release.
+ The changes since 3.0.18 stable are:
  </para>
  
  <para>
   <itemizedlist>
-  <listitem>
-   <para>
-    Fixed last-chunk-detection for responses where the content was small
-    enough to be read with the body, causing Privoxy to wait for the
-    end of the content until the server closed the connection or the
-    request timed out. Reported by "Karsten" in #3028326.
-   </para>
-  </listitem>
-  <listitem>
-   <para>
-    Responses with status code 204 weren't properly detected as body-less
-    like RFC2616 mandates. Like the previous bug, this caused Privoxy to
-    wait for the end of the content until the server closed the connection
-    or the request timed out. Fixes #3022042 and #3025553, reported by a
-    user with no visible name. Most likely also fixes a bunch of other
-    AJAX-related problem reports that got closed in the past due to
-    insufficient information and lack of feedback.
-   </para>
-  </listitem>
-  <listitem>
-   <para>
-    Fixed an ACL bug that made it impossible to build a blacklist.
-    Usually the ACL directives are used in a whitelist, which worked
-    as expected, but blacklisting is still useful for public proxies
-    where one only needs to deny known abusers access.
-   </para>
-  </listitem>
-  <listitem>
-   <para>
-    Added LOG_LEVEL_RECEIVED to log the not-yet-parsed data read from the
-    network. This should make debugging various parsing issues a lot easier.
-   </para>
-  </listitem>
-  <listitem>
-   <para>
-    The IPv6 code is enabled by default on Windows versions that support it.
-    Patch submitted by oCameLo in #2942729.
-   </para>
-  </listitem>
-  <listitem>
-   <para>
-    In mingw32 versions, the user.filter file is reachable through the
-    GUI, just like default.filter is. Feature request 3040263.
-   </para>
-  </listitem>
-  <listitem>
-   <para>
-    Added the configure option --enable-large-file-support to set a few
-    defines that are required by platforms like GNU/Linux to support files
-    larger then 2GB. Mainly interesting for users without proper logfile
-    management.
-   </para>
-  </listitem>
-  <listitem>
-   <para>
-    Logging with "debug 16" no longer stops at the first nul byte which is
-    pretty useless. Non-printable characters are replaced with their hex value
-    so the result can't span multiple lines making parsing them harder then
-    necessary.
-   </para>
-  </listitem>
-  <listitem>
+    <listitem>
     <para>
-    Privoxy logs when reading an action, filter or trust file.
+    Bug fixes:
+    <itemizedlist>
+    <listitem>
+     <para>
+      Prevent a segmentation fault when de-chunking buffered content.
+      It could be triggered by malicious web servers if Privoxy was
+      configured to filter the content and running on a platform
+      where SIZE_T_MAX isn't larger than UINT_MAX, which probably
+      includes most 32-bit systems. On those platforms, all Privoxy
+      versions before 3.0.19 appear to be affected.
+      To be on the safe side, this bug should be presumed to allow
+      code execution as proving that it doesn't seems unrealistic.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Do not expect a response from the SOCKS4/4A server until it
+      got something to respond to. This regression was introduced
+      in 3.0.18 and prevented the SOCKS4/4A negotiation from working.
+      Reported by qqqqqw in #3459781.
+     </para>
+     </listitem>
+    </itemizedlist>
     </para>
    </listitem>
    <listitem>
     <para>
-    Fixed incorrect regression test markup which caused a test in
-    3.0.16 to fail while Privoxy itself was working correctly.
-    While Privoxy accepts hide-referer, too, the action name is actually
-    hide-referrer which is also the name used one the final results page,
-    where the test expected the alias.
+    General improvements:
+    <itemizedlist>
+    <listitem>
+     <para>
+      Fix an off-by-one in an error message about connect failures.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Use a GNUMakefile variable for the webserver root directory and
+      update the path. Sourceforge changed it which broke various
+      web-related targets.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Update the CODE_STATUS description.
+     </para>
+     </listitem>
+    </itemizedlist>
     </para>
    </listitem>
+ </itemizedlist>
+</para>
+
+<para>
+ The following changes were made between 3.0.17 and 3.0.18:
+</para>
+
+<para>
+ <itemizedlist>
    <listitem>
     <para>
-    CGI interface improvements:
+    Bug fixes:
      <itemizedlist>
      <listitem>
       <para>
-      In finish_http_response(), continue to add the 'Connection: close'
-      header if the client connection will not be kept alive.
-      Anonymously pointed out in #2987454.
+      If a generated redirect URL contains characters RFC 3986 doesn't
+      permit, they are (re)encoded. Not doing this makes Privoxy versions
+      from 3.0.5 to 3.0.17 susceptible to HTTP response splitting (CWE-113)
+      attacks if the +fast-redirects{check-decoded-url} action is used.
       </para>
      </listitem>
      <listitem>
       <para>
-      Apostrophes in block messages no longer cause parse errors
-      when the blocked page is viewed with JavaScript enabled.
-      Reported by dg1727 in #3062296.
+      Fix a logic bug that could cause Privoxy to reuse a server
+      socket after it got tainted by a server-header-tagger-induced
+      block that was triggered before the whole server response had
+      been read. If keep-alive was enabled and the request following
+      the blocked one was to the same host and using the same forwarding
+      settings, Privoxy would send it on the tainted server socket.
+      While the server would simply treat it as a pipelined request,
+      Privoxy would later on fail to properly parse the server's
+      response as it would try to parse the unread data from the
+      first response as server headers for the second one.
+      Regression introduced in 3.0.17.
       </para>
      </listitem>
      <listitem>
       <para>
-      Fix a bunch of anchors that used underscores instead of dashes.
+      When implying keep-alive in client_connection(), remember that
+      the client didn't. Fixes a regression introduced in 3.0.13 that
+      would cause Privoxy to wait for additional client requests after
+      receiving a HTTP/1.1 request with "Connection: close" set
+      and connection sharing enabled.
+      With clients that terminate the client connection after detecting
+      that the whole body has been received it doesn't really matter,
+      but with clients that don't the connection would be kept open until
+      it timed out.
       </para>
      </listitem>
      <listitem>
       <para>
-      Allow to keep the client connection alive after crunching the previous request.
-      Already opened server connections can be kept alive, too.
+      Fix a subtle race condition between prepare_csp_for_next_request()
+      and sweep(). A thread preparing itself for the next client request
+      could briefly appear to be inactive.
+      If all other threads were already using more recent files,
+      the thread could get its files swept away under its feet.
+      So far this has only been reproduced while stress testing in
+      valgrind while touching action files in a loop. It's unlikely
+      to have caused any actual problems in the real world.
       </para>
      </listitem>
      <listitem>
       <para>
-      In cgi_show_url_info(), don't forget to prefix URLs that only contain
-      http:// or https:// in the path. Fixes #2975765 reported by Adam Piggott.
+      Disable filters if SDCH compression is used unless filtering is forced.
+      If SDCH was combined with a supported compression algorithm, Privoxy
+      previously could try to decompress it and ditch the Content-Encoding
+      header even though the SDCH compression wasn't dealt with.
+      Reported by zebul666 in #3225863.
       </para>
      </listitem>
      <listitem>
       <para>
-      Show the 404 CGI page if cgi_send_user_manual() is called while
-      local user manual delivery is disabled.
+      Make a copy of the --user value and only mess with that when splitting
+      user and group. On some operating systems modifying the value directly
+      is reflected in the output of ps and friends and can be misleading.
+      Reported by zepard in #3292710.
       </para>
      </listitem>
-   </itemizedlist>
+    <listitem>
+     <para>
+      If forwarded-connect-retries is set, only retry if Privoxy is actually
+      forwarding the request. Previously direct connections would be retried
+      as well.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Fixed a small memory leak when retrying connections with IPv6
+      support enabled.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Remove an incorrect assertion in compile_dynamic_pcrs_job_list().
+      It could be triggered by a pcrs job with an invalid pcre
+      pattern (for example one that contains a lone quantifier).
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      If the --user argument user[.group] contains a dot, always bail out
+      if no group has been specified. Previously the intended, but undocumented
+      (and apparently untested), behaviour was to try interpreting the whole
+      argument as user name, but the detection was flawed and checked for '0'
+      instead of '\0', thus merely preventing group names beginning with a zero.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      In html_code_map[], use a numeric character reference instead of &apos;
+      which wasn't standardized before XHTML 1.0.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Fix an invalid free when compiled with FEATURE_GRACEFUL_TERMINATION
+      and shut down through http://config.privoxy.org/die
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      In get_actions(), fix the "temporary" backwards compatibility hack
+      to accept block actions without reason.
+      It also covered other actions that should be rejected as invalid.
+      Reported by Billy Crook.
+     </para>
+     </listitem>
+    </itemizedlist>
     </para>
    </listitem>
    <listitem>
     <para>
-    Action file improvements:
+    General improvements:
      <itemizedlist>
      <listitem>
       <para>
-      Enable user.filter by default. Suggested by David White in #3001830.
+      Privoxy can (re)compress buffered content before delivering
+      it to the client. Disabled by default as most users wouldn't
+      benefit from it.
       </para>
      </listitem>
      <listitem>
       <para>
-      Block .sitestat.com/. Reported by johnd16 in #3002725.
+      The +fast-redirects{check-decoded-url} action checks URL
+      segments separately. If there are other parameters behind
+      the redirect URL, this makes it unnecessary to cut them off
+      by additionally using a +redirect{} pcrs command.
+      Initial patch submitted by Jamie Zawinski in #3429848.
       </para>
      </listitem>
      <listitem>
       <para>
-      Block .atemda.com/. Reported by johnd16 in #3002723.
+      When loading action sections, verify that the referenced filters
+      exist. Currently missing filters only result in an error message,
+      but eventually the severity will be upgraded to fatal.
       </para>
      </listitem>
      <listitem>
       <para>
-      Block js.adlink.net/. Reported by johnd16 in #3002720.
+      Allow binding to multiple separate addresses.
+      Patch set submitted by Petr Pisar in #3354485.
       </para>
      </listitem>
      <listitem>
       <para>
-      Block .analytics.yahoo.com/. Reported by johnd16 in #3002713.
+      Set socket_error to errno if connecting fails in rfc2553_connect_to().
+      Previously rejected direct connections could be incorrectly reported
+      as DNS issues if Privoxy was compiled with IPv6 support.
       </para>
      </listitem>
      <listitem>
       <para>
-      Block sb.scorecardresearch.com, too. Reported by dg1727 in #2992652.
+      Adjust url_code_map[] so spaces are replaced with %20 instead of '+'.
+      While '+' can be used by clients submitting form data, this is not
+      actually what Privoxy is using the lookups for. This is more of a
+      cosmetic issue and doesn't fix any known problems.
       </para>
      </listitem>
      <listitem>
       <para>
-      Fix problems noticed on Yahoo mail and news pages.
+      When compiled without FEATURE_FAST_REDIRECTS, do not silently
+      ignore +fast-redirects{} directives.
       </para>
      </listitem>
      <listitem>
       <para>
-      Remove the too broad yahoo section, only keeping the
-      fast-redirects exception as discussed on ijbswa-devel@.
+      Added a workaround for GNU libc's strptime() reporting negative
+      year values when the parsed year is only specified with two digits.
+      On affected systems cookies with such a date would not be turned
+      into session cookies by the +session-cookies-only action.
+      Reported by Vaeinoe in #3403560.
       </para>
      </listitem>
      <listitem>
       <para>
-      Don't block adesklets.sourceforge.net. Reported in #2974204.
+      Fixed bind failures with certain GNU libc versions if no non-loopback
+      IP address has been configured on the system. This is mainly an issue
+      if the system is using DHCP and Privoxy is started before the network
+      is completely configured.
+      Reported by Raphael Marichez in #3349356.
+      Additional insight from Petr Pisar.
       </para>
      </listitem>
      <listitem>
       <para>
-      Block chartbeat ping tracking. Reported in #2975895.
+      Privoxy log messages now use the ISO 8601 date format %Y-%m-%d.
+      It's only slightly longer than the old format, but contains
+      the full date including the year and allows sorting by date
+      (when grepping in multiple log files) without hassle.
       </para>
      </listitem>
      <listitem>
       <para>
-      Tag CSS and image requests with cautious and medium settings, too.
+      In get_last_url(), do not bother trying to decode URLs that do
+      not contain at least one '%' sign. It reduces the log noise and
+      a number of unnecessary memory allocations.
       </para>
      </listitem>
      <listitem>
       <para>
-      Don't handle view.atdmt.com as image. It's used for click-throughs
-      so users should be able to "go there anyway".
-      Reported by Adam Piggott in #2975927.
+      In case of SOCKS5 failures, dump the socks response in the log message.
       </para>
      </listitem>
      <listitem>
       <para>
-      Also let the refresh-tags filter remove invalid refresh tags where
-      the 'url=' part is missing. Anonymously reported in #2986382.
-      While at it, update the description to mention the fact that only
-      refresh tags with refresh times above 9 seconds are covered.
+      Simplify the signal setup in main().
       </para>
      </listitem>
      <listitem>
       <para>
-      javascript needs to be blocked with +handle-as-empty-document to
-      work around Firefox bug 492459.  So move .js blockers from
-      +block{Might be a web-bug.} -handle-as-empty-document to
-      +block{Might be a web-bug.} +handle-as-empty-document.
+      Streamline socks5_connect() slightly.
       </para>
      </listitem>
      <listitem>
       <para>
-      ijbswa-Feature Requests-3006719 - Block 160x578 Banners.
+      In socks5_connect(), require a complete socks response from the server.
+      Previously Privoxy didn't care how much data the server response
+      contained as long as the first two bytes contained the expected
+      values. While at it, shrink the buffer size so Privoxy can't read
+      more than a whole socks response.
       </para>
      </listitem>
      <listitem>
       <para>
-      Block another omniture tracking domain.
+      In chat(), do not bother to generate a client request in case of
+      direct CONNECT requests. It will not be used anyway.
       </para>
      </listitem>
      <listitem>
       <para>
-      Added a range-requests tagger.
+      Reduce server_last_modified()'s stack size.
       </para>
      </listitem>
      <listitem>
       <para>
-      Added two sections to get Flickr's Ajax interface working with
-      default pre-settings. If you change the configuration to block
-      cookies by default, you'll need additional exceptions.
-      Reported by Mathias Homann in #3101419 and by Patrick on ijbswa-users@.
+      Shorten get_http_time() by using strftime().
       </para>
      </listitem>
-    </itemizedlist>
-   </para>
-  </listitem>
-  <listitem>
-   <para>
-    Documentation improvements:
-    <itemizedlist>
      <listitem>
       <para>
-      Explicitly mention how to match all URLs.
+      Constify the known_http_methods pointers in unknown_method().
       </para>
      </listitem>
      <listitem>
       <para>
-      Consistently recommend socks5 in the Tor FAQ entry and mention
-      its advantage compared to socks4a. Reported by David in #2960129.
+      Constify the time_formats pointers in parse_header_time().
       </para>
      </listitem>
      <listitem>
       <para>
-      Slightly improve the explanation of why filtering may appear
-      slower than it is.
+      Constify the formerly_valid_actions pointers in action_used_to_be_valid().
       </para>
      </listitem>
      <listitem>
       <para>
-      Grammar fixes for the ACL section.
+      Introduce a GNUMakefile MAN_PAGE variable that defaults to privoxy.1.
+      The Debian package uses section 8 for the man page and this
+      should simplify the patch.
       </para>
      </listitem>
      <listitem>
       <para>
-      Fixed a link to the 'intercepting' entry and add another one.
+      Deduplicate the INADDR_NONE definition for Solaris by moving it to jbsockets.h.
       </para>
      </listitem>
      <listitem>
       <para>
-      Rename the 'Other' section to 'Mailing Lists' and reword it
-      to make it clear that nobody is forced to use the trackers
+      In block_url(), ditch the obsolete workaround for ancient Netscape versions
+      that supposedly couldn't properly deal with status code 403.
       </para>
      </listitem>
      <listitem>
       <para>
-      Note that 'anonymously' posting on the trackers may not always
-      be possible.
+      Remove a useless NULL pointer check in load_trustfile().
       </para>
      </listitem>
      <listitem>
       <para>
-      Suggest to enable debug 32768 when suspecting parsing problems.
+      Remove two useless NULL pointer checks in load_one_re_filterfile().
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Change url_code_map[] from an array of pointers to an array of arrays.
+      It removes an unnecessary layer of indirection and on 64-bit systems reduces
+      the size of the binary a bit.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Fix various typos. Fixes taken from Debian's 29_typos.dpatch by Roland Rosenfeld.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Add a dok-tidy GNUMakefile target to clean up the messy HTML
+      generated by the other dok targets.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      GNUisms in the GNUMakefile have been removed.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Change the HTTP version in static responses to 1.1.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Synced config.sub and config.guess with upstream
+      2011-11-11/386c7218162c145f5f9e1ff7f558a3fbb66c37c5.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Add a dedicated function to parse the values of toggles. Reduces duplicated
+      code in load_config() and provides better error handling. Invalid or missing
+      toggle values are now a fatal error instead of being silently ignored.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Terminate HTML lines in static error messages with \n instead of \r\n.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Simplify cgi_error_unknown() a bit.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      In LogPutString(), don't bother looking at pszText when not
+      actually logging anything.
       </para>
      </listitem>
-    </itemizedlist>
-   </para>
-  </listitem>
-  <listitem>
-   <para>
-    Privoxy-Log-Parser improvements:
-    <itemizedlist>
      <listitem>
       <para>
-      Gather statistics for ressources, methods, and HTTP versions
-      used by the client.
+      Change ssplit()'s fourth parameter from int to size_t.
+      Fixes a clang complaint.
       </para>
      </listitem>
      <listitem>
       <para>
-      Also gather statistics for blocked and redirected requests.
+      Add a warning that the statistics currently can't be trusted.
+      Mention Privoxy-Log-Parser's --statistics option as
+      an alternative for the time being.
       </para>
      </listitem>
      <listitem>
       <para>
-      Provide the percentage of keep-alive offers the client accepted.
+      In rfc2553_connect_to(), start setting cgi->error_message on error.
       </para>
      </listitem>
      <listitem>
       <para>
-      Add a --url-statistics-threshold option.
+      Change the expected status code returned for http://p.p/die depending
+      on whether or not FEATURE_GRACEFUL_TERMINATION is available.
       </para>
      </listitem>
      <listitem>
       <para>
-      Add a --host-statistics-threshold option to also gather
-      statistics about how many request where made per host.
+      In cgi_die(), mark the client connection for closing.
+      If the client will fetch the style sheet through another connection
+      it gets the main thread out of the accept() state and should thus
+      trigger the actual shutdown.
       </para>
      </listitem>
      <listitem>
       <para>
-      Fix a bug in handle_loglevel_header() where a 'scan: ' got lost.
+      Add a proper CGI message for cgi_die().
       </para>
      </listitem>
      <listitem>
       <para>
-      Add a --shorten-thread-ids option to replace the thread id with
-      a decimal number.
+      Don't enforce a logical line length limit in read_config_line().
       </para>
      </listitem>
      <listitem>
       <para>
-      Accept and ignore: Looks like we got the last chunk together
-      with the server headers. We better stop reading.
+      Slightly refactor server_last_modified() to remove useless gmtime*() calls.
       </para>
      </listitem>
      <listitem>
       <para>
-      Accept and ignore: Continue hack in da house.
+      In get_content_type(), also recognize '.jpeg' as JPEG extension.
       </para>
      </listitem>
      <listitem>
       <para>
-      Accept and higlight: Rejecting connection from 10.0.0.2.
-      Maximum number of connections reached.
+      Add '.png' to the list of recognized file extensions in get_content_type().
       </para>
      </listitem>
      <listitem>
       <para>
-      Accept and highlight: Loading actions file: /usr/local/etc/privoxy/default.action
+      In block_url(), consistently use the block reason "Request blocked by Privoxy".
+      In two places the reason was "Request for blocked URL" which hides the
+      fact that the request got blocked by Privoxy and isn't necessarily
+      correct as the block may be due to tags.
       </para>
      </listitem>
      <listitem>
       <para>
-      Accept and highlight: Loading filter file: /usr/local/etc/privoxy/default.filter
+      In listen_loop(), reload the configuration files after accepting
+      a new connection instead of before.
+      Previously the first connection that arrived after a configuration
+      change would still be handled with the old configuration.
       </para>
      </listitem>
      <listitem>
       <para>
-      Accept and highlight: Killed all-caps Host header line: HOST: bestproxydb.com
+      In chat()'s receive-data loop, skip a client socket check if
+      the socket will be written to right away anyway. This can
+      increase the transfer speed for unfiltered content on fast
+      network connections.
       </para>
      </listitem>
      <listitem>
       <para>
-      Accept and highlight: Reducing expected bytes to 0. Marking
-      the server socket tainted after throwing 4 bytes away.
+      The socket timeout is used for SOCKS negotiations as well, which
+      previously couldn't time out.
       </para>
      </listitem>
      <listitem>
       <para>
-      Accept: Merged multiple header lines to: 'X-FORWARDED-PROTO: http X-HOST: 127.0.0.1'
+      Don't keep the client connection alive if any configuration file
+      changed since the time the connection came in. This is closer to
+      Privoxy's behaviour before keep-alive support for client connections
+      was added and also less confusing in general.
       </para>
      </listitem>
+    <listitem>
+     <para>
+      Treat all Content-Type header values containing the pattern
+      'script' as a sign of text. Reported by pribog in #3134970.
+     </para>
+     </listitem>
      </itemizedlist>
     </para>
    </listitem>
    <listitem>
     <para>
-    Code cleanups:
+    Action file improvements:
      <itemizedlist>
      <listitem>
       <para>
-      Remove the next member from the client_state struct. Only the main
-      thread needs access to all client states so give it its own struct.
+      Moved the site-specific block pattern section below the one for the
+      generic patterns so for requests that are matched in both, the block
+      reason for the domain is shown which is usually more useful than showing
+      the one for the generic pattern.
       </para>
      </listitem>
      <listitem>
       <para>
-      Garbage-collect request_contains_null_bytes().
+      Remove -prevent-compression from the fragile alias. It's no longer
+      used anywhere by default and isn't known to break stuff anyway.
       </para>
      </listitem>
      <listitem>
       <para>
-      Ditch redundant code in unload_configfile().
+      Add a (disabled) section to block various Facebook tracking URLs.
+      Reported by Dan Stahlke in #3421764.
       </para>
      </listitem>
      <listitem>
       <para>
-      Ditch LogGetURLUnderCursor() which doesn't seem to be used anywhere.
+      Add a (disabled) section to rewrite and redirect click-tracking
+      URLs used on news.google.com.
+      Reported by Dan Stahlke in #3421755.
       </para>
      </listitem>
      <listitem>
       <para>
-      In write_socket(), remove the write-only variable write_len in
-      an ifdef __OS2__ block. Spotted by cppcheck.
+      Unblock linuxcounter.net/.
+      Reported by Dan Stahlke in #3422612.
       </para>
      </listitem>
      <listitem>
       <para>
-      In connect_to(), don't declare the variable 'flags' on OS/2 where
-      it isn't used. Spotted by cppcheck.
+      Block 'www91.intel.com/' which is used by Omniture.
+      Reported by Adam Piggott in #3167370.
       </para>
      </listitem>
      <listitem>
       <para>
-      Limit the scope of various variables. Spotted by cppcheck.
+      Disable the handle-as-empty-doc-returns-ok option and mark it as deprecated.
+      Reminded by tceverling in #2790091.
       </para>
      </listitem>
      <listitem>
       <para>
-      In add_to_iob(), turn an interestingly looking for loop into a
-      boring while loop.
+      Add ".ivwbox.de/" to the "Cross-site user tracking" section.
+      Reported by Nettozahler in #3172525.
       </para>
      </listitem>
      <listitem>
       <para>
-      Code cleanup in preparation for external filters.
+      Unblock and fast-redirect ".awin1.com/.*=http://".
+      Reported by Adam Piggott in #3170921.
       </para>
      </listitem>
      <listitem>
       <para>
-      In listen_loop(), mention the socket on which we accepted the
-      connection, not just the source IP address.
+      Block "b.collective-media.net/".
       </para>
      </listitem>
      <listitem>
       <para>
-      In write_socket(), also log the socket we're writing to.
+      Widen the Debian popcon exception to "qa.debian.org/popcon".
+      Seen in Debian's 05_default_action.dpatch by Roland Rosenfeld.
       </para>
      </listitem>
      <listitem>
       <para>
-      In log_error(), assert that escaped characters get logged
-      completely or not at all.
+      Block ".gemius.pl/" which only seems to be used for user tracking.
+      Reported by johnd16 in #3002731. Additional input from Lee and movax.
       </para>
      </listitem>
      <listitem>
       <para>
-      In log_error(), assert that ival and sval have reasonable values.
-      There's no reason not to abort() if they don't.
+      Disable banners-by-size filters for '.thinkgeek.com/'.
+      The filter only seems to catch pictures of the inventory.
       </para>
      </listitem>
      <listitem>
       <para>
-      Remove an incorrect cgi_error_unknown() call in a
-      cannot-happen-situation in send_crunch_response().
+      Block requests for 'go.idmnet.bbelements.com/please/showit/'.
+      Reported by kacperdominik in #3372959.
       </para>
      </listitem>
      <listitem>
       <para>
-      Clean up white-space in http_response definition and
-      move the crunch_reason to the beginning.
+      Unblock adainitiative.org/.
       </para>
      </listitem>
      <listitem>
       <para>
-      Turn http_response.reason into an enum and rename it
-      to http_response.crunch_reason.
+      Add a fast-redirects exception for '.googleusercontent.com/.*=cache'.
       </para>
      </listitem>
      <listitem>
       <para>
-      Silence a 'gcc (Debian 4.3.2-1.1) 4.3.2' warning on i686 GNU/Linux.
+      Add a fast-redirects exception for webcache.googleusercontent.com/.
       </para>
      </listitem>
      <listitem>
       <para>
-      Fix white-space in a log message in remove_chunked_transfer_coding().
-      While at it, add a note that the message doesn't seem to
-      be entirely correct and should be improved later on.
+      Unblock http://adassier.wordpress.com/ and http://adassier.files.wordpress.com/.
       </para>
-    </listitem>
+     </listitem>
      </itemizedlist>
     </para>
    </listitem>
    <listitem>
     <para>
-    GNUmakefile improvements:
+    Filter file improvements:
      <itemizedlist>
      <listitem>
       <para>
-      Use $(SSH) instead of ssh, so one only needs to specify a username once.
+      Let the yahoo filter hide '.ads'.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Let the msn filter hide overlay ads for Facebook 'likes' in search
+      results and elements with the id 's_notf_div'. They only seem to be
+      used to advertise site 'enhancements'.
       </para>
      </listitem>
      <listitem>
       <para>
-      Removed references to the action feedback thingy that hasn't been
-      working for years.
+      Let the js-events filter additionally disarm setInterval().
+      Suggested by dg1727 in #3423775.
+     </para>
+     </listitem>
+    </itemizedlist>
+   </para>
+  </listitem>
+  <listitem>
+   <para>
+    Documentation improvements:
+    <itemizedlist>
+    <listitem>
+     <para>
+      Clarify the effect of compiling Privoxy with zlib support.
+      Suggested by dg1727 in #3423782.
       </para>
      </listitem>
      <listitem>
       <para>
-      Consistently use shell.sourceforge.net instead of shell.sf.net so
-      one doesn't need to check server fingerprints twice.
+      Point out that the SourceForge messaging system works like a black
+      hole and should thus not be used to contact individual developers.
       </para>
      </listitem>
      <listitem>
       <para>
-      Removed GNUisms in the webserver and webactions targets so they
-      work with standard tar.
+      Mention some of the problems one can experience when not explicitly
+      configuring an IP address as listen address.
       </para>
      </listitem>
+    <listitem>
+     <para>
+      Explicitly mention that hostnames can be used instead of IP addresses
+      for the listen-address, that only the first address returned will be
+      used and what happens if the address is invalid.
+      Requested by Calestyo in #3302213.
+     </para>
+     </listitem>
      </itemizedlist>
     </para>
    </listitem>
- </itemizedlist>
-</para>
-
-
-<!--   ~~~~~       New section      ~~~~~     -->
-
-<sect2 id="upgradersnote">
-<title>Note to Upgraders</title>
-
-<para>
- A quick list of things to be aware of before upgrading from earlier
- versions of <application>Privoxy</application>:
-</para>
-
-<para>
- <itemizedlist>
-
- <listitem>
-  <para>
-   The recommended way to upgrade &my-app; is to backup your old
-   configuration files, install the new ones, verify that &my-app;
-   is working correctly and finally merge back your changes using
-   <application>diff</application> and maybe <application>patch</application>.
-  </para>
-  <para>
-   There are a number of new features in each &my-app; release and
-   most of them have to be explicitly enabled in the configuration
-   files. Old configuration files obviously don't do that and due
-   to syntax changes using old configuration files with a new
-   &my-app; isn't always possible anyway.
-  </para>
- </listitem>
- <listitem>
-  <para>
-    Note that some installers remove earlier versions completely,
-    including configuration files, therefore you should really save
-    any important configuration files!
-  </para>
- </listitem>
- <listitem>
-  <para>
-   On the other hand, other installers don't overwrite existing configuration
-   files, thinking you will want to do that yourself.
-  </para>
- </listitem>
- <listitem>
-  <para>
-   <filename>standard.action</filename> has been merged into
-   the <filename>default.action</filename> file.
-  </para>
- </listitem>
- <listitem>
-  <para>
-   In the default configuration only fatal errors are logged now.
-   You can change that in the <link linkend="DEBUG">debug section</link>
-   of the configuration file. You may also want to enable more verbose
-   logging until you verified that the new &my-app; version is working
-   as expected.
-  </para>
- </listitem>
-
- <listitem>
-    <para>
-     Three other config file settings are now off by default:
-     <link linkend="enable-remote-toggle">enable-remote-toggle</link>,
-     <link linkend="enable-remote-http-toggle">enable-remote-http-toggle</link>,
-     and  <link linkend="enable-edit-actions">enable-edit-actions</link>.
-     If you use or want these, you will need to explicitly enable them, and
-     be aware of the security issues involved.
-    </para>
-  </listitem>
-
-<!--
- <listitem>
-  <para>
-   What constitutes a <quote>default</quote> configuration has changed,
-   and you may want to review which actions are <quote>on</quote> by
-   default. This is primarily a matter of emphasis, but some features
-   you may have been used to, may now be <quote>off</quote> by default.
-   There are also a number of new actions and filters you may want to
-   consider, most of which are not fully incorporated into the default
-   settings as yet (see above).
-  </para>
- </listitem>
--->
-<!--
    <listitem>
     <para>
-    The default actions setting is now <literal>Cautious</literal>. Previous
-    releases had a default setting of <literal>Medium</literal>. Experienced
-    users may want to adjust this, as it is fairly conservative by &my-app;
-    standards and past practices. See <ulink
-    url="http://config.privoxy.org/edit-actions-list?f=default">
-    http://config.privoxy.org/edit-actions-list?f=default</ulink>. New users
-    should try the default settings for a while before turning up the volume.
+    Log message improvements:
+    <itemizedlist>
+    <listitem>
+     <para>
+      If only the server connection is kept alive, do not pretend to
+      wait for a new client request.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Remove a superfluous log message in forget_connection().
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      In chat(), properly report missing server responses as such
+      instead of calling them empty.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      In forwarded_connect(), fix a log message nobody should ever see.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Fix a log message in socks5_connect(). A failed write operation
+      was logged as a failed read operation.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Let load_one_actions_file() properly complain about a missing
+      '{' at the beginning of the file.
+      Simply stating that a line is invalid isn't particularly helpful.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Do not claim to listen on a socket until Privoxy actually does.
+      Patch submitted by Petr Pisar in #3354485.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Prevent a duplicated LOG_LEVEL_CLF message when sending out
+      the "no-server-data" response.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Also log the client socket when dropping a connection.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Include the destination host in the 'Request ... marked for
+      blocking. limit-connect{...} doesn't allow CONNECT ...' message.
+      Patch submitted by Saperski in #3296250.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Prevent a duplicated log message if none of the resolved IP
+      addresses were reachable.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      In connect_to(), do not pretend to retry if forwarded-connect-retries
+      is zero or unset.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      When a specified user or group can't be found, put the name in
+      single-quotes when logging it.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      In rfc2553_connect_to(), explain getnameinfo() errors better.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Remove a useless log message in chat().
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      When retrying to connect, also log the maximum number of connection
+      attempts.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Rephrase a log message in compile_dynamic_pcrs_job_list().
+      Divide the error code and its meaning with a colon. Call the pcrs
+      job dynamic and not the filter. Filters may contain dynamic and
+      non-dynamic pcrs jobs at the same time. Only mention the name of
+      the filter or tagger, but don't claim it's a filter when it could
+      be a tagger.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      In a fatal error message in load_one_actions_file(), cover both
+      URL and TAG patterns.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      In pcrs_strerror(), properly report unknown positive error code
+      values as such. Previously they were handled like 0 (no error).
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      In compile_dynamic_pcrs_job_list(), also log the actual error code as
+      pcrs_strerror() doesn't handle all errors reported by pcre.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Don't bother trying to continue chatting if the client didn't ask for it.
+      Reduces log noise a bit.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Make two fatal error messages in load_one_actions_file() more descriptive.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      In cgi_send_user_manual(), log when rejecting a file name due to '/' or '..'.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      In load_file(), log a message if opening a file failed.
+      The CGI error message alone isn't too helpful.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      In connection_destination_matches(), improve two log messages
+      to help understand why the destinations don't match.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Rephrase a log message in serve(). Client request arrival
+      should be differentiated from closed client connections now.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      In serve(), log if a client connection isn't reused due to a
+      configuration file change.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Let mark_server_socket_tainted() always mark the server socket tainted,
+      just don't talk about it in cases where it has no effect. It doesn't change
+      Privoxy's behaviour, but makes understanding the log file easier.
+     </para>
+     </listitem>
+    </itemizedlist>
     </para>
    </listitem>
-
    <listitem>
     <para>
-    The default setting has filtering turned <emphasis>off</emphasis>, which
-    subsequently means that compression is <emphasis>on</emphasis>. Remember
-    that filtering does not work on compressed pages, so if you use, or want to
-    use, filtering, you will need to force compression off. Example:
+    configure:
+    <itemizedlist>
+    <listitem>
+     <para>
+      Added a --disable-ipv6-support switch for platforms where support
+      is detected but doesn't actually work.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Do not check for the existence of strerror() and memmove() twice.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Remove a useless test for setpgrp(2). Privoxy doesn't need it and
+      it can cause problems when cross-compiling.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Rename the --disable-acl-files switch to --disable-acl-support.
+      Since about 2001, ACL directives are specified in the standard
+      config file.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Update the URL of the 'Removing outdated PCRE version after the
+      next stable release' posting. The old URL stopped working after
+      one of SF's recent site "optimizations". Reported by Han Liu.
+     </para>
+     </listitem>
+    </itemizedlist>
     </para>
+  </listitem>
+  <listitem>
     <para>
- <screen>
-  { +<link linkend="filter">filter</link>{google}  +<link linkend="prevent-compression">prevent-compression</link> }
-   .google.</screen>
+    Privoxy-Regression-Test:
+    <itemizedlist>
+    <listitem>
+     <para>
+      Added --shuffle-tests option to increase the chances of detecting race conditions.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Added a --local-test-file option that allows using Privoxy-Regression-Test without Privoxy.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Added tests for missing socks4 and socks4a forwarders.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      The --privoxy-address option now works with IPv6 addresses containing brackets, too.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Perform limited sanity checks for parameters that are supposed to have numerical values.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Added a --sleep-time option to specify a number of seconds to
+      sleep between tests. Defaults to 0.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Disable the range-requests tagger for tests that break if it's enabled.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Log messages use the ISO 8601 date format %Y-%m-%d.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Fix spelling in two error messages.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      In the --help output, include a list of supported tests and their default levels.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Adjust the tests to properly deal with FEATURE_TOGGLE being disabled.
+     </para>
+     </listitem>
+    </itemizedlist>
     </para>
+  </listitem>
+  <listitem>
     <para>
-    Or if you use a number of filters, or filter many sites, you may just want
-    to turn off compression for all sites in
-    <filename>default.action</filename> (or
-    <filename>user.action</filename>).
+    Privoxy-Log-Parser:
+    <itemizedlist>
+    <listitem>
+     <para>
+      Perform limited sanity checks for command line parameters that
+      are supposed to have numerical values.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Implement a --unbreak-lines-only option to try to revert MUA breakage.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Accept and highlight: Added header: Content-Encoding: deflate
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Accept and highlight: Compressed content from 29258 to 8630 bytes.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Accept and highlight: Client request arrived in time on socket 21.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Highlight: Didn't receive data in time: a.fsdn.com:443
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Accept log messages with ISO 8601 time stamps, too.
+     </para>
+     </listitem>
+    </itemizedlist>
     </para>
-
    </listitem>
-
    <listitem>
-  <para>
-   Also, <link linkend="SESSION-COOKIES-ONLY">session-cookies-only</link> is
-   off by default now. If you've liked this feature in the past, you may want
-   to turn it back on in <filename>user.action</filename> now.
+   <para>
+    uagen:
+    <itemizedlist>
+    <listitem>
+     <para>
+      Bump generated Firefox version to 8.0.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      Only randomize the release date if the new --randomize-release-date
+      option is enabled. Firefox versions after 4 use a fixed date string
+      without meaning.
+     </para>
+     </listitem>
+    </itemizedlist>
+   </para>
+  </listitem>
+ </itemizedlist>
+</para>
+
+
+<!--   ~~~~~       New section      ~~~~~     -->
+
+<sect2 id="upgradersnote">
+<title>Note to Upgraders</title>
+
+<para>
+ A quick list of things to be aware of before upgrading from earlier
+ versions of <application>Privoxy</application>:
+</para>
+
+<para>
+ <itemizedlist>
+
+ <listitem>
+  <para>
+   The recommended way to upgrade &my-app; is to back up your old
+   configuration files, install the new ones, verify that &my-app;
+   is working correctly and finally merge back your changes using
+   <application>diff</application> and maybe <application>patch</application>.
+  </para>
+  <para>
+   There are a number of new features in each &my-app; release and
+   most of them have to be explicitly enabled in the configuration
+   files. Old configuration files obviously don't do that and due
+   to syntax changes using old configuration files with a new
+   &my-app; isn't always possible anyway.
+  </para>
+ </listitem>
+ <listitem>
+  <para>
+    Note that some installers remove earlier versions completely,
+    including configuration files, therefore you should really save
+    any important configuration files!
+  </para>
+ </listitem>
+ <listitem>
+  <para>
+   On the other hand, other installers don't overwrite existing configuration
+   files, thinking you will want to do that yourself.
+  </para>
+ </listitem>
+ <listitem>
+  <para>
+   <filename>standard.action</filename> has been merged into
+   the <filename>default.action</filename> file.
+  </para>
+ </listitem>
+ <listitem>
+  <para>
+   In the default configuration only fatal errors are logged now.
+   You can change that in the <link linkend="DEBUG">debug section</link>
+   of the configuration file. You may also want to enable more verbose
+   logging until you verified that the new &my-app; version is working
+   as expected.
+  </para>
+ </listitem>
+
+ <listitem>
+    <para>
+     Three other config file settings are now off by default:
+     <link linkend="enable-remote-toggle">enable-remote-toggle</link>,
+     <link linkend="enable-remote-http-toggle">enable-remote-http-toggle</link>,
+     and  <link linkend="enable-edit-actions">enable-edit-actions</link>.
+     If you use or want these, you will need to explicitly enable them, and
+     be aware of the security issues involved.
+    </para>
+  </listitem>
+
+<!--
+ <listitem>
+  <para>
+   What constitutes a <quote>default</quote> configuration has changed,
+   and you may want to review which actions are <quote>on</quote> by
+   default. This is primarily a matter of emphasis, but some features
+   you may have been used to, may now be <quote>off</quote> by default.
+   There are also a number of new actions and filters you may want to
+   consider, most of which are not fully incorporated into the default
+   settings as yet (see above).
+  </para>
+ </listitem>
+-->
+<!--
+  <listitem>
+   <para>
+    The default actions setting is now <literal>Cautious</literal>. Previous
+    releases had a default setting of <literal>Medium</literal>. Experienced
+    users may want to adjust this, as it is fairly conservative by &my-app;
+    standards and past practices. See <ulink
+    url="http://config.privoxy.org/edit-actions-list?f=default">
+    http://config.privoxy.org/edit-actions-list?f=default</ulink>. New users
+    should try the default settings for a while before turning up the volume.
+   </para>
+  </listitem>
+
+  <listitem>
+   <para>
+    The default setting has filtering turned <emphasis>off</emphasis>, which
+    subsequently means that compression is <emphasis>on</emphasis>. Remember
+    that filtering does not work on compressed pages, so if you use, or want to
+    use, filtering, you will need to force compression off. Example:
+   </para>
+   <para>
+ <screen>
+  { +<link linkend="filter">filter</link>{google}  +<link linkend="prevent-compression">prevent-compression</link> }
+   .google.</screen>
+   </para>
+   <para>
+    Or if you use a number of filters, or filter many sites, you may just want
+    to turn off compression for all sites in
+    <filename>default.action</filename> (or
+    <filename>user.action</filename>).
+   </para>
+
+  </listitem>
+
+  <listitem>
+  <para>
+   Also, <link linkend="SESSION-COOKIES-ONLY">session-cookies-only</link> is
+   off by default now. If you've liked this feature in the past, you may want
+   to turn it back on in <filename>user.action</filename> now.
    </para>
    </listitem>
  
@@ -1872,6 +2433,27 @@ must find a better place for this paragraph
  <para>
   <itemizedlist>
  
+ <listitem>
+  <para>
+   <emphasis>--config-test</emphasis>
+  </para>
+  <para>
+   Exit after loading the configuration files before binding to
+   the listen address. The exit code signals whether or not the
+   configuration files have been successfully loaded.
+  </para>
+  <para>
+   If the exit code is 1, at least one of the configuration files
+   is invalid. If it is 0, all the configuration files have been
+   successfully loaded (but may still contain errors that can
+   currently only be detected at run time).
+  </para>
+  <para>
+   This option doesn't affect the log setting. Combining it with
+   <emphasis>--no-daemon</emphasis> is recommended if a configured
+   log file shouldn't be used.
+  </para>
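+  <para>
+   A minimal sketch of checking the exit code from a shell, assuming
+   the main configuration file is installed as
+   <filename>/usr/local/etc/privoxy/config</filename> (the path is only
+   an illustration and depends on the installation):
+  </para>
+  <para>
+   <screen>
+# Load the configuration files and exit before binding to the
+# listen address. --no-daemon is added as recommended above.
+if privoxy --no-daemon --config-test /usr/local/etc/privoxy/config; then
+    echo "configuration files loaded successfully"
+else
+    echo "at least one configuration file is invalid"
+fi</screen>
+  </para>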
+ </listitem>
   <listitem>
    <para>
      <emphasis>--version</emphasis>
@@ -3383,7 +3965,7 @@ for details.
      and use their output as input.
     </para>
     <para>
-    If the request URL gets changed, &my-app; will detect that and use the new
+    If the request URI gets changed, &my-app; will detect that and use the new
      one. This can be used to rewrite the request destination behind the client's
      back, for example to specify a Tor exit relay for certain requests.
     </para>
@@ -3405,7 +3987,7 @@ for details.
  {+client-header-filter{hide-tor-exit-notation}}
  /
      </screen>
-    </para>
+   </para>
    </listitem>
   </varlistentry>
  
@@ -3499,6 +4081,22 @@ TAG:^User-Agent: fetch libfetch/
  TAG:^User-Agent: Ubuntu APT-HTTP/
  TAG:^User-Agent: MPlayer/
      </screen>
+   </para>
+   <para>
+     <screen>
+# Tag all requests with the Range header set
+{+client-header-tagger{range-requests}}
+/
+
+# Disable filtering for the tagged requests.
+#
+# With filtering enabled Privoxy would remove the Range headers
+# to be able to filter the whole response. The downside is that
+# it prevents clients from resuming downloads or skipping over
+# parts of multimedia files.
+{-filter -deanimate-gifs}
+TAG:^RANGE-REQUEST$
+    </screen>
      </para>
    </listitem>
   </varlistentry>
@@ -4117,9 +4715,19 @@ new action
     <para>
      This is a left-over from the time when <application>Privoxy</application>
      didn't support important HTTP/1.1 features well. It is left here for the
-    unlikely case that you experience HTTP/1.1 related problems with some server
-    out there. Not all HTTP/1.1 features and requirements are supported yet,
-    so there is a chance you might need this action.
+    unlikely case that you experience HTTP/1.1-related problems with some server
+    out there.
+   </para>
+   <para>
+    Note that enabling this action is only a workaround. It should not
+    be enabled for sites that work without it. While it shouldn't break
+    any pages, it has a (usually negative) performance impact.
+  </para>
+  <para>
+    If you come across a site where enabling this action helps, please report it,
+    so the cause of the problem can be analyzed. If the problem turns out to be
+    caused by a bug in <application>Privoxy</application> it should be
+    fixed so the following release works without the workaround.
     </para>
    </listitem>
   </varlistentry>
@@ -4356,10 +4964,10 @@ problem-host.example.com</screen>
      by defining appropriate <literal>-filter</literal> exceptions.
     </para>
     <para>
-    Compressed content can't be filtered either, unless &my-app;
-    is compiled with zlib support (requires at least &my-app; 3.0.7),
-    in which case &my-app; will decompress the content before filtering
-    it.
+    Compressed content can't be filtered either, but if &my-app;
+    is compiled with zlib support and a supported compression algorithm
+    is used (gzip or deflate), &my-app; can first decompress the content
+    and then filter it.
     </para>
     <para>
      If you use a &my-app; version without zlib support, but want filtering to work on
@@ -5463,6 +6071,94 @@ new action
  </variablelist>
  </sect3>
  
+
+<!--   ~~~~~       New section      ~~~~~     -->
+<sect3 renderas="sect4" id="limit-cookie-lifetime">
+<title>limit-cookie-lifetime</title>
+
+<variablelist>
+ <varlistentry>
+  <term>Typical use:</term>
+  <listitem>
+   <para>Limit the lifetime of HTTP cookies to a couple of minutes or hours.</para>
+  </listitem>
+ </varlistentry>
+
+ <varlistentry>
+  <term>Effect:</term>
+  <listitem>
+   <para>
+    Overwrites the expires field in Set-Cookie server headers if it's above the specified limit.
+   </para>
+  </listitem>
+ </varlistentry>
+
+ <varlistentry>
+  <term>Type:</term>
+  <!-- Boolean, Parameterized, Multi-value -->
+  <listitem>
+   <para>Parameterized.</para>
+  </listitem>
+ </varlistentry>
+
+ <varlistentry>
+  <term>Parameter:</term>
+  <listitem>
+   <para>
+    The lifetime limit in minutes, or 0.
+   </para>
+  </listitem>
+ </varlistentry>
+
+ <varlistentry>
+  <term>Notes:</term>
+  <listitem>
+   <para>
+    This action reduces the lifetime of HTTP cookies coming from the
+    server to the specified number of minutes, starting from the time
+    the cookie passes Privoxy.
+   </para>
+   <para>
+    Cookies with a lifetime below the limit are not modified.
+    The lifetime of session cookies is set to the specified limit.
+   </para>
+   <para>
+    The effect of this action depends on the server.
+   </para>
+   <para>
+    In case of servers which refresh their cookies with each response
+    (or at least frequently), the lifetime limit set by this action
+    is updated as well.
+    Thus, a session associated with the cookie continues to work with
+    this action enabled, as long as a new request is made before the
+    last limit set is reached.
+   </para>
+   <para>
+    However, some servers send their cookies once, with a lifetime of several
+    years (the year 2037 is a popular choice), and do not refresh them
+    until a certain event in the future, for example the user logging out.
+    In this case this action may limit the absolute lifetime of the session,
+    even if requests are made frequently.
+   </para>
+   <para>
+    If the parameter is <quote>0</quote>, this action behaves like
+    <literal><link linkend="session-cookies-only">session-cookies-only</link></literal>.
+   </para>
+  </listitem>
+ </varlistentry>
+
+ <varlistentry>
+  <term>Example usages:</term>
+  <listitem>
+    <para>
+     <screen>+limit-cookie-lifetime{60}
+       </screen>
+   </para>
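+   <para>
+    As a rough sketch of how this could look in an actions file, the
+    following sections use the hypothetical hosts
+    <literal>shop.example.com</literal> and <literal>tracker.example.net</literal>
+    (both are placeholders chosen for illustration). The first section
+    caps cookie lifetimes at two hours; the second uses the parameter
+    <quote>0</quote>, which behaves like
+    <literal><link linkend="session-cookies-only">session-cookies-only</link></literal>:
+   </para>
+   <para>
+    <screen># Hypothetical hosts, for illustration only.
+{ +limit-cookie-lifetime{120} }
+shop.example.com
+
+# A limit of 0 behaves like session-cookies-only.
+{ +limit-cookie-lifetime{0} }
+tracker.example.net</screen>
+   </para>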
+  </listitem>
+ </varlistentry>
+</variablelist>
+</sect3>
+
  <!--   ~~~~~       New section      ~~~~~     -->
  <sect3 renderas="sect4" id="prevent-compression">
  <title>prevent-compression</title>
@@ -5719,6 +6415,10 @@ new action
      either provided as parameter, or derived by applying a
      single pcrs command to the original URL.
     </para>
+   <para>
+    The syntax for pcrs commands is documented in the
+    <link linkend="filter-file">filter file</link> section.
+   </para>
     <para>
      This action will be ignored if you use it together with
      <literal><link linkend="block">block</link></literal>.
@@ -6149,3740 +6849,2769 @@ example.org/instance-that-is-delivered-as-xml-but-is-not
  </variablelist>
  </sect3>
  
-
-<!--   ~~~~~       New section      ~~~~~     -->
-<sect3>
-<title>Summary</title>
-<para>
- Note that many of these actions have the potential to cause a page to
- misbehave, possibly even not to display at all. There are many ways
- a site designer may choose to design his site, and what HTTP header
- content, and other criteria, he may depend on. There is no way to have hard
- and fast rules for all sites. See the <link
- linkend="ACTIONSANAT">Appendix</link> for a brief example on troubleshooting
- actions.
-</para>
-</sect3>
-</sect2>
-
-<!--   ~~~~~       New section      ~~~~~     -->
-<sect2 id="aliases">
-<title>Aliases</title>
-<para>
- Custom <quote>actions</quote>, known to <application>Privoxy</application>
- as <quote>aliases</quote>, can be defined by combining other actions.
- These can in turn be invoked just like the built-in actions.
- Currently, an alias name can contain any character except space, tab,
- <quote>=</quote>,
- <quote>{</quote> and <quote>}</quote>, but we <emphasis>strongly
- recommend</emphasis> that you only use <quote>a</quote> to <quote>z</quote>,
- <quote>0</quote> to <quote>9</quote>, <quote>+</quote>, and <quote>-</quote>.
- Alias names are not case sensitive, and are not required to start with a
- <quote>+</quote> or <quote>-</quote> sign, since they are merely textually
- expanded.
-</para>
-<para>
- Aliases can be used throughout the actions file, but they <emphasis>must be
- defined in a special section at the top of the file!</emphasis>
- And there can only be one such section per actions file. Each actions file may
- have its own alias section, and the aliases defined in it are only visible
- within that file.
-</para>
-<para>
- There are two main reasons to use aliases: One is to save typing for frequently
- used combinations of actions, the other one is a gain in flexibility: If you
- decide once how you want to handle shops by defining an alias called
- <quote>shop</quote>, you can later change your policy on shops in
- <emphasis>one</emphasis> place, and your changes will take effect everywhere
- in the actions file where the <quote>shop</quote> alias is used. Calling aliases
- by their purpose also makes your actions files more readable.
-</para>
-<para>
- Currently, there is one big drawback to using aliases, though:
- <application>Privoxy</application>'s built-in web-based action file
- editor honors aliases when reading the actions files, but it expands
- them before writing. So the effects of your aliases are of course preserved,
- but the aliases themselves are lost when you edit sections that use aliases
- with it.
-</para>
-
-<para>
- Now let's define some aliases...
-</para>
-
-<para>
- <screen>
- # Useful custom aliases we can use later.
- #
- # Note the (required!) section header line and that this section
- # must be at the top of the actions file!
- #
- {{alias}}
-
- # These aliases just save typing later:
- # (Note that some already use other aliases!)
- #
- +crunch-all-cookies = +<link linkend="CRUNCH-INCOMING-COOKIES">crunch-incoming-cookies</link> +<link linkend="CRUNCH-OUTGOING-COOKIES">crunch-outgoing-cookies</link>
- -crunch-all-cookies = -<link linkend="CRUNCH-INCOMING-COOKIES">crunch-incoming-cookies</link> -<link linkend="CRUNCH-OUTGOING-COOKIES">crunch-outgoing-cookies</link>
- +block-as-image      = +block{Blocked image.} +handle-as-image
- allow-all-cookies   = -crunch-all-cookies -<link linkend="SESSION-COOKIES-ONLY">session-cookies-only</link> -<link linkend="FILTER-CONTENT-COOKIES">filter{content-cookies}</link>
-
- # These aliases define combinations of actions
- # that are useful for certain types of sites:
- #
- fragile     = -<link linkend="BLOCK">block</link> -<link linkend="FILTER">filter</link> -crunch-all-cookies -<link linkend="FAST-REDIRECTS">fast-redirects</link> -<link linkend="HIDE-REFERER">hide-referrer</link> -<link linkend="PREVENT-COMPRESSION">prevent-compression</link>
-
- shop        = -crunch-all-cookies -<link linkend="FILTER-ALL-POPUPS">filter{all-popups}</link>
-
- # Short names for other aliases, for really lazy people ;-)
- #
- c0 = +crunch-all-cookies
- c1 = -crunch-all-cookies</screen>
-</para>
-
-<para>
- ...and put them to use. These sections would appear in the lower part of an
- actions file and define exceptions to the default actions (as specified further
- up for the <quote>/</quote> pattern):
-</para>
-
-<para>
- <screen>
- # These sites are either very complex or very keen on
- # user data and require minimal interference to work:
- #
- {fragile}
- .office.microsoft.com
- .windowsupdate.microsoft.com
- # Gmail is really mail.google.com, not gmail.com
- mail.google.com
-
- # Shopping sites:
- # Allow cookies (for setting and retrieving your customer data)
- #
- {shop}
- .quietpc.com
- .worldpay.com   # for quietpc.com
- mybank.example.com
-
- # These shops require pop-ups:
- #
- {-filter{all-popups} -filter{unsolicited-popups}}
-  .dabs.com
-  .overclockers.co.uk</screen>
-</para>
-
-<para>
- Aliases like <quote>shop</quote> and <quote>fragile</quote> are typically used for
- <quote>problem</quote> sites that require more than one action to be disabled
- in order to function properly.
-</para>
-</sect2>
-<!--
-hal stop here
--->
-<!--   ~~~~~       New section      ~~~~~     -->
-<sect2 id="act-examples">
-<title>Actions Files Tutorial</title>
-<para>
- The above chapters have shown <link linkend="actions-file">which actions files
- there are and how they are organized</link>, how actions are <link
- linkend="actions">specified</link> and <link linkend="actions-apply">applied
- to URLs</link>, how <link linkend="af-patterns">patterns</link> work, and how to
- define and use <link linkend="aliases">aliases</link>. Now, let's look at an
- example <filename>match-all.action</filename>, <filename>default.action</filename>
- and <filename>user.action</filename> file and see how all these pieces come together:
-</para>
-
-<sect3>
-<title>match-all.action</title>
-<para>
- Remember <emphasis>all actions are disabled when matching starts</emphasis>,
- so we have to explicitly enable the ones we want.
-</para>
-
-<para>
- While the <filename>match-all.action</filename> file only contains a
- single section, it is probably the most important one. It has only one
- pattern, <quote><literal>/</literal></quote>, but this pattern
- <link linkend="af-patterns">matches all URLs</link>. Therefore, the set of
- actions used in this <quote>default</quote> section <emphasis>will
- be applied to all requests as a start</emphasis>. It can be partly or
- wholly overridden by other actions files like <filename>default.action</filename>
- and <filename>user.action</filename>, but it will still be largely responsible
- for your overall browsing experience.
-</para>
-
-<para>
- Again, at the start of matching, all actions are disabled, so there is
- no need to disable any actions here. (Remember: a <quote>+</quote>
- preceding the action name enables the action, a <quote>-</quote> disables!).
- Also note how this long line has been made more readable by splitting it into
- multiple lines with line continuation.
-</para>
-
-<para>
- <screen>
-{ \
- +<link linkend="CHANGE-X-FORWARDED-FOR">change-x-forwarded-for{block}</link> \
- +<link linkend="HIDE-FROM-HEADER">hide-from-header{block}</link> \
- +<link linkend="SET-IMAGE-BLOCKER">set-image-blocker{pattern}</link> \
-}
-/ # Match all URLs
- </screen>
-</para>
-
-<para>
- The default behavior is now set.
-</para>
-</sect3>
-
-<sect3>
-<title>default.action</title>
-
-<para>
- If you aren't a developer, there's no need for you to edit the
- <filename>default.action</filename> file. It is maintained by
- the &my-app; developers and if you disagree with some of the
- sections, you should overrule them in your <filename>user.action</filename>.
-</para>
-
-<para>
- Understanding the <filename>default.action</filename> file can
- help you with your <filename>user.action</filename>, though.
-</para>
-
-<para>
- The first section in this file is a special section for internal use
- that prevents older &my-app; versions from reading the file:
-</para>
-
-<para>
- <screen>
-##########################################################################
-# Settings -- Don't change! For internal Privoxy use ONLY.
-##########################################################################
-{{settings}}
-for-privoxy-version=3.0.11</screen>
-</para>
-
-<para>
- After that comes the (optional) alias section. We'll use the example
- section from the above <link linkend="aliases">chapter on aliases</link>,
- that also explains why and how aliases are used:
-</para>
-
-<para>
- <screen>
-##########################################################################
-# Aliases
-##########################################################################
-{{alias}}
-
- # These aliases just save typing later:
- # (Note that some already use other aliases!)
- #
- +crunch-all-cookies = +<link linkend="CRUNCH-INCOMING-COOKIES">crunch-incoming-cookies</link> +<link linkend="CRUNCH-OUTGOING-COOKIES">crunch-outgoing-cookies</link>
- -crunch-all-cookies = -<link linkend="CRUNCH-INCOMING-COOKIES">crunch-incoming-cookies</link> -<link linkend="CRUNCH-OUTGOING-COOKIES">crunch-outgoing-cookies</link>
- +block-as-image      = +block{Blocked image.} +handle-as-image
- mercy-for-cookies   = -crunch-all-cookies -<link linkend="SESSION-COOKIES-ONLY">session-cookies-only</link> -<link linkend="FILTER-CONTENT-COOKIES">filter{content-cookies}</link>
-
- # These aliases define combinations of actions
- # that are useful for certain types of sites:
- #
- fragile     = -<link linkend="BLOCK">block</link> -<link linkend="FILTER">filter</link> -crunch-all-cookies -<link linkend="FAST-REDIRECTS">fast-redirects</link> -<link linkend="HIDE-REFERER">hide-referrer</link>
- shop        = -crunch-all-cookies -<link linkend="FILTER-ALL-POPUPS">filter{all-popups}</link></screen>
-</para>
-
-<para>
- The first of our specialized sections is concerned with <quote>fragile</quote>
- sites, i.e. sites that require minimum interference, because they are either
- very complex or very keen on tracking you (and have mechanisms in place that
- make them unusable for people who avoid being tracked). We will simply use
- our pre-defined <literal>fragile</literal> alias instead of stating the list
- of actions explicitly:
-</para>
-
-<para>
- <screen>
-##########################################################################
-# Exceptions for sites that'll break under the default action set:
-##########################################################################
-
-# "Fragile" Use a minimum set of actions for these sites (see alias above):
-#
-{ fragile }
-.office.microsoft.com           # surprise, surprise!
-.windowsupdate.microsoft.com
-mail.google.com</screen>
-</para>
-
-<para>
- Shopping sites are not as fragile, but they typically
- require cookies to log in, and pop-up windows for shopping
- carts or item details. Again, we'll use a pre-defined alias:
-</para>
-
-<para>
- <screen>
-# Shopping sites:
-#
-{ shop }
-.quietpc.com
-.worldpay.com   # for quietpc.com
-.jungle.com
-.scan.co.uk</screen>
-</para>
-
-<para>
- The <literal><link linkend="FAST-REDIRECTS">fast-redirects</link></literal>
- action, which may have been enabled in <filename>match-all.action</filename>,
- breaks some sites. So disable it for popular sites where we know it misbehaves:
-</para>
-
-<para>
- <screen>
-{ -<link linkend="FAST-REDIRECTS">fast-redirects</link> }
-login.yahoo.com
-edit.*.yahoo.com
-.google.com
-.altavista.com/.*(like|url|link):http
-.altavista.com/trans.*urltext=http
-.nytimes.com</screen>
-</para>
-
-<para>
- It is important that <application>Privoxy</application> knows which
- URLs belong to images, so that <emphasis>if</emphasis> they are to
- be blocked, a substitute image can be sent, rather than an HTML page.
- Contacting the remote site to find out is not an option, since it
- would destroy the loading time advantage of banner blocking, and it
- would feed the advertisers information about you. We can mark any
- URL as an image with the <literal><link
- linkend="handle-as-image">handle-as-image</link></literal> action,
- and marking all URLs that end in a known image file extension is a
- good start:
-</para>
-
-<para>
- <screen>
-##########################################################################
-# Images:
-##########################################################################
-
-# Define which file types will be treated as images, in case they get
-# blocked further down this file:
-#
-{ +<link linkend="HANDLE-AS-IMAGE">handle-as-image</link> }
-/.*\.(gif|jpe?g|png|bmp|ico)$</screen>
-</para>
-
-<para>
- And then there are known banner sources. They often use scripts to
- generate the banners, so it won't be visible from the URL that the
- request is for an image. Hence we block them <emphasis>and</emphasis>
- mark them as images in one go, with the help of our
- <literal>+block-as-image</literal> alias defined above. (We could of
- course just as well use <literal>+<link linkend="block">block</link>
- +<link linkend="handle-as-image">handle-as-image</link></literal> here.)
- Remember that the type of the replacement image is chosen by the
- <literal><link linkend="set-image-blocker">set-image-blocker</link></literal>
- action. Since all URLs have matched the default section with its
- <literal>+<link linkend="set-image-blocker">set-image-blocker</link>{pattern}</literal>
- action before, it still applies and needn't be repeated:
-</para>
-
-<para>
- <screen>
-# Known ad generators:
-#
-{ +block-as-image }
-ar.atwola.com
-.ad.doubleclick.net
-.ad.*.doubleclick.net
-.a.yimg.com/(?:(?!/i/).)*$
-.a[0-9].yimg.com/(?:(?!/i/).)*$
-bs*.gsanet.com
-.qkimg.net</screen>
-</para>
-
-<para>
- One of the most important jobs of <application>Privoxy</application>
- is to block banners. Many of these can be <quote>blocked</quote>
- by the <literal><link linkend="filter">filter</link>{banners-by-size}</literal>
- action, which we enabled above, and which deletes the references to banner
- images from the pages while they are loaded, so the browser doesn't request
- them anymore, and hence they don't need to be blocked here. But this naturally
- doesn't catch all banners, and some people choose not to use filters, so we
- need a comprehensive list of patterns for banner URLs here, and apply the
- <literal><link linkend="block">block</link></literal> action to them.
-</para>
-<para>
- First come many generic patterns, which do most of the work, by
- matching typical domain and path name components of banners. Then comes
- a list of individual patterns for specific sites, which is omitted here
- to keep the example short:
-</para>
-
-<para>
- <screen>
-##########################################################################
-# Block these fine banners:
-##########################################################################
-{ <link linkend="BLOCK">+block{Banner ads.}</link> }
-
-# Generic patterns:
-#
-ad*.
-.*ads.
-banner?.
-count*.
-/.*count(er)?\.(pl|cgi|exe|dll|asp|php[34]?)
-/(?:.*/)?(publicite|werbung|rekla(ma|me|am)|annonse|maino(kset|nta|s)?)/
-
-# Site-specific patterns (abbreviated):
-#
-.hitbox.com</screen>
-</para>
-
-<para>
- It's quite remarkable how many advertisers actually call their banner
- servers ads.<replaceable>company</replaceable>.com, or call the directory
- in which the banners are stored simply <quote>banners</quote>. So the above
- generic patterns are surprisingly effective.
-</para>
-<para>
- But being very generic, they necessarily also catch URLs that we don't want
- to block. The pattern <literal>.*ads.</literal> e.g. catches
- <quote>nasty-<emphasis>ads</emphasis>.nasty-corp.com</quote> as intended,
- but also <quote>downlo<emphasis>ads</emphasis>.sourceforge.net</quote> or
- <quote><emphasis>ads</emphasis>l.some-provider.net.</quote> So here come some
- well-known exceptions to the <literal>+<link linkend="BLOCK">block</link></literal>
- section above.
-</para>
-<para>
- Note that these are exceptions to exceptions from the default! Consider the URL
- <quote>downloads.sourceforge.net</quote>: Initially, all actions are deactivated,
- so it wouldn't get blocked. Then comes the defaults section, which matches the
- URL, but just deactivates the <literal><link linkend="BLOCK">block</link></literal>
- action once again. Then it matches <literal>.*ads.</literal>, an exception to the
- general non-blocking policy, and suddenly
- <literal><link linkend="BLOCK">+block</link></literal> applies. And now, it'll match
- <literal>.*loads.</literal>, where <literal><link linkend="BLOCK">-block</link></literal>
- applies, so (unless it matches <emphasis>again</emphasis> further down) it ends up
- with no <literal><link linkend="BLOCK">block</link></literal> action applying.
-</para>
-
-<para>
- <screen>
-##########################################################################
-# Save some innocent victims of the above generic block patterns:
-##########################################################################
-
-# By domain:
-#
-{ -<link linkend="BLOCK">block</link> }
-adv[io]*.  # (for advogato.org and advice.*)
-adsl.      # (has nothing to do with ads)
-adobe.     # (has nothing to do with ads either)
-ad[ud]*.   # (adult.* and add.*)
-.edu       # (universities don't host banners (yet!))
-.*loads.   # (downloads, uploads etc)
-
-# By path:
-#
-/.*loads/
-
-# Site-specific:
-#
-www.globalintersec.com/adv # (adv = advanced)
-www.ugu.com/sui/ugu/adv</screen>
-</para>
-
-<para>
- Filtering source code can have nasty side effects,
- so make an exception for our friends at sourceforge.net,
- and all paths with <quote>cvs</quote> in them. Note that
- <literal>-<link linkend="FILTER">filter</link></literal>
- disables <emphasis>all</emphasis> filters in one fell swoop!
-</para>
-
-<para>
- <screen>
-# Don't filter code!
-#
-{ -<link linkend="FILTER">filter</link> }
-/(.*/)?cvs
-bugzilla.
-developer.
-wiki.
-.sourceforge.net</screen>
-</para>
-
-<para>
- The actual <filename>default.action</filename> is of course much more
- comprehensive, but we hope this example made clear how it works.
-</para>
-
-</sect3>
-
-<sect3><title>user.action</title>
-
-<para>
- So far we are painting with a broad brush by setting general policies,
- which would be a reasonable starting point for many people. Now,
- you might want to be more specific and have customized rules that
- are more suitable to your personal habits and preferences. These would
- be for narrowly defined situations like your ISP or your bank, and should
- be placed in <filename>user.action</filename>, which is parsed after all other
- actions files and hence has the last word, overriding any previously
- defined actions. <filename>user.action</filename> is also a
- <emphasis>safe</emphasis> place for your personal settings, since
- <filename>default.action</filename> is actively maintained by the
- <application>Privoxy</application> developers and you'll probably want
- to install updated versions from time to time.
-</para>
-
-<para>
- So let's look at a few examples of things that one might typically do in
- <filename>user.action</filename>:
-</para>
-
-
-<!-- brief sample user.action here -->
-
-<para>
- <screen>
-# My user.action file. &lt;fred@example.com&gt;</screen>
-</para>
-
-<para>
- As <link linkend="aliases">aliases</link> are local to the actions
- file that they are defined in, you can't use the ones from
- <filename>default.action</filename>, unless you repeat them here:
-</para>
-
-<para>
- <screen>
-# Aliases are local to the file they are defined in.
-# (Re-)define aliases for this file:
-#
-{{alias}}
-#
-# These aliases just save typing later, and the alias names should
-# be self explanatory.
-#
-+crunch-all-cookies = +crunch-incoming-cookies +crunch-outgoing-cookies
--crunch-all-cookies = -crunch-incoming-cookies -crunch-outgoing-cookies
- allow-all-cookies  = -crunch-all-cookies -session-cookies-only
- allow-popups       = -filter{all-popups}
-+block-as-image     = +block{Blocked as image.} +handle-as-image
--block-as-image     = -block
-
-# These aliases define combinations of actions that are useful for
-# certain types of sites:
-#
-fragile     = -block -crunch-all-cookies -filter -fast-redirects -hide-referrer
-shop        = -crunch-all-cookies allow-popups
-
-# Allow ads for selected useful free sites:
-#
-allow-ads   = -block -filter{banners-by-size} -filter{banners-by-link}
-
-# Alias for specific file types that are text, but might have conflicting
-# MIME types. We want the browser to force these to be text documents.
-handle-as-text = -<link linkend="FILTER">filter</link> +-<link linkend="content-type-overwrite">content-type-overwrite{text/plain}</link> +-<link linkend="FORCE-TEXT-MODE">force-text-mode</link> -<link linkend="HIDE-CONTENT-DISPOSITION">hide-content-disposition</link></screen>
-
-</para>
-
-<para>
- Say you have accounts on some sites that you visit regularly, and
- you don't want to have to log in manually each time. So you'd like
- to allow persistent cookies for these sites. The
- <literal>allow-all-cookies</literal> alias defined above does exactly
- that, i.e. it disables crunching of cookies in any direction, and the
- processing of cookies to make them only temporary.
-</para>
-
-<para>
- <screen>
-{ allow-all-cookies }
- sourceforge.net
- .yahoo.com
- .msdn.microsoft.com
- .redhat.com</screen>
-</para>
-
-<para>
- Your bank is allergic to some filter, but you don't know which, so you disable them all:
-</para>
-
-<para>
- <screen>
-{ -<link linkend="FILTER">filter</link> }
- .your-home-banking-site.com</screen>
-</para>
-
-<para>
- Some file types you may not want to filter for various reasons:
-</para>
-
-<para>
- <screen>
-# Technical documentation is likely to contain strings that might
-# erroneously get altered by the JavaScript-oriented filters:
-#
-.tldp.org
-/(.*/)?selfhtml/
-
-# And this stupid host sends streaming video with a wrong MIME type,
-# so that Privoxy thinks it is getting HTML and starts filtering:
-#
-stupid-server.example.com/</screen>
-</para>
-
-<para>
- Example of a simple <link linkend="BLOCK">block</link> action. Say you've
- seen an ad on your favourite page on example.com that you want to get rid of.
- You have right-clicked the image, selected <quote>copy image location</quote>
- and pasted the URL below while removing the leading http://, into a
- <literal>{ +block{} }</literal> section. Note that <literal>{ +handle-as-image
- }</literal> need not be specified, since all URLs ending in
- <literal>.gif</literal> will be tagged as images by the general rules as set
- in default.action anyway:
-</para>
-
-<para>
- <screen>
-{ +<link linkend="BLOCK">block</link>{Nasty ads.} }
- www.example.com/nasty-ads/sponsor\.gif
- another.example.net/more/junk/here/</screen>
-</para>
-
-<para>
- The URLs of dynamically generated banners, especially from large banner
- farms, often don't use the well-known image file name extensions, which
- makes it impossible for <application>Privoxy</application> to guess
- the file type just by looking at the URL.
- You can use the <literal>+block-as-image</literal> alias defined above for
- these cases.
- Note that objects which match this rule but then turn out NOT to be an
- image are typically rendered as a <quote>broken image</quote> icon by the
- browser. Use cautiously.
-</para>
-
-<para>
- <screen>
-{ +block-as-image }
- .doubleclick.net
- .fastclick.net
- /Realmedia/ads/
- ar.atwola.com/</screen>
-</para>
-
-<para>
- Now you noticed that the default configuration breaks Forbes Magazine,
- but you were too lazy to find out which action is the culprit, and you
- were again too lazy to give <link linkend="contact">feedback</link>, so
- you just used the <literal>fragile</literal> alias on the site, and
- -- <emphasis>whoa!</emphasis> -- it worked. The <literal>fragile</literal>
- alias disables those actions that are most likely to break a site. It is
- also good for testing whether <application>Privoxy</application>
- is causing the problem or not. We later find other regular sites
- that misbehave, and add those to our personalized list of troublemakers:
-</para>
-
-<para>
-<screen>
-{ fragile }
- .forbes.com
- webmail.example.com
- .mybank.com</screen>
-</para>
-
-<para>
- You like the <quote>fun</quote> text replacements in <filename>default.filter</filename>,
- but the filter is disabled in the distributed actions file.
- So you'd like to turn it on in your private,
- update-safe config, once and for all:
-</para>
-
-<para>
-<screen>
-{ +<link linkend="filter-fun">filter{fun}</link> }
- / # For ALL sites!</screen>
-</para>
-
-<para>
- Note that the above is not really a good idea: There are exceptions
- to the filters in <filename>default.action</filename> for things that
- really shouldn't be filtered, like code on CVS->Web interfaces. Since
- <filename>user.action</filename> has the last word, these exceptions
- won't be valid for the <quote>fun</quote> filtering specified here.
-</para>
-
-<para>
- You might also worry about how your favourite free websites are
- funded, and find that they rely on displaying banner advertisements
- to survive. So you might want to specifically allow banners for those
- sites that you feel provide value to you:
-</para>
-
-<para>
-<screen>
-{ allow-ads }
- .sourceforge.net
- .slashdot.org
- .osdn.net</screen>
-</para>
-
-<para>
- Note that <literal>allow-ads</literal> has been aliased to
- <literal>-<link linkend="block">block</link></literal>,
- <literal>-<link linkend="filter-banners-by-size">filter{banners-by-size}</link></literal>, and
- <literal>-<link linkend="filter-banners-by-link">filter{banners-by-link}</link></literal> above.
-</para>
-
-<para>
- Invoke another alias here to force an override of the MIME type <literal>
- application/x-sh</literal>, which would typically open a download
- dialog. In my case, I want to look at the shell script, and then I can save
- it should I choose to.
-</para>
-
-<para>
-<screen>
-{ handle-as-text }
- /.*\.sh$</screen>
-</para>
-
-<para>
- <filename>user.action</filename> is generally the best place to define
- exceptions and additions to the default policies of
- <filename>default.action</filename>. Some actions are safe to have their
- default policies set here though. So let's set a default policy to have a
- <quote>blank</quote> image as opposed to the checkerboard pattern for
- <emphasis>ALL</emphasis> sites. <quote>/</quote> of course matches all URL
- paths and patterns:
-</para>
-
-<para>
-<screen>
-{ +<link linkend="set-image-blocker">set-image-blocker{blank}</link> }
-/ # ALL sites</screen>
-</para>
-
-</sect3>
-</sect2>
-
-<!--  ~  End section  ~  -->
-
-</sect1>
-
-<!--  ~  End section  ~  -->
-
-<!--   ~~~~~~~~       New section Header    ~~~~~~~~~     -->
-
-<sect1 id="filter-file">
-<title>Filter Files</title>
-
-<para>
- On-the-fly text substitutions need
- to be defined in a <quote>filter file</quote>. Once defined, they
- can then be invoked as an <quote>action</quote>.
-</para>
-
-<para>
- &my-app; supports three different filter actions:
- <literal><link linkend="filter">filter</link></literal> to
- rewrite the content that is sent to the client,
- <literal><link linkend="client-header-filter">client-header-filter</link></literal>
- to rewrite headers that are sent by the client, and
- <literal><link linkend="server-header-filter">server-header-filter</link></literal>
- to rewrite headers that are sent by the server.
-</para>
-
-<para>
- &my-app; also supports two tagger actions:
- <literal><link linkend="client-header-tagger">client-header-tagger</link></literal>
- and
- <literal><link linkend="server-header-tagger">server-header-tagger</link></literal>.
- Taggers and filters use the same syntax in the filter files; the difference
- is that taggers don't modify the text they are filtering, but use a rewritten
- version of the filtered text as tag. The tags can then be used to change the
- applying actions through sections with <link linkend="tag-pattern">tag-patterns</link>.
-</para>
-
-
-<para>
- Multiple filter files can be defined through the <literal> <link
- linkend="filterfile">filterfile</link></literal> config directive. The filters
- as supplied by the developers are located in
- <filename>default.filter</filename>. It is recommended that any locally
- defined or modified filters go in a separately defined file such as
- <filename>user.filter</filename>.
- </para>
-
-<para>
- Common tasks for content filters are to eliminate common annoyances in
- HTML and JavaScript, such as pop-up windows,
- exit consoles, crippled windows without navigation tools, the
- infamous &lt;BLINK&gt; tag, etc., to suppress images with certain
- width and height attributes (standard banner sizes or web-bugs),
- or just to have fun.
-</para>
-
-<para>
- Enabled content filters are applied to any content whose
- <quote>Content Type</quote> header is recognised as a sign
- of text-based content, with the exception of <literal>text/plain</literal>.
- Use the <link linkend="FORCE-TEXT-MODE">force-text-mode</link> action
- to also filter other content.
-</para>
-
-<para>
- Substitutions are made at the source level, so if you want to <quote>roll
- your own</quote> filters, you should first be familiar with HTML syntax,
- and, of course, regular expressions.
-</para>
-
-<para>
- Just like the <link linkend="actions-file">actions files</link>, the
- filter file is organized in sections, which are called <emphasis>filters</emphasis>
- here. Each filter consists of a heading line that starts with one of the
- <emphasis>keywords</emphasis> <literal>FILTER:</literal>,
- <literal>CLIENT-HEADER-FILTER:</literal> or <literal>SERVER-HEADER-FILTER:</literal>
- followed by the filter's <emphasis>name</emphasis>, and a short (one line)
- <emphasis>description</emphasis> of what it does. Below that line
- come the <emphasis>jobs</emphasis>, i.e. lines that define the actual
- text substitutions. By convention, the name of a filter
- should describe what the filter <emphasis>eliminates</emphasis>. The
- description is used in the <ulink url="http://config.privoxy.org/">web-based
- user interface</ulink>.
-</para>
-
-<para>
- Once a filter called <replaceable>name</replaceable> has been defined
- in the filter file, it can be invoked by using an action of the form
- +<literal><link linkend="filter">filter</link>{<replaceable>name</replaceable>}</literal>
- in any <link linkend="actions-file">actions file</link>.
-</para>
-
-<para>
- Filter definitions start with a header line that contains the filter
- type, the filter name and the filter description.
- A content filter header line for a filter called <quote>foo</quote> could look
- like this:
-</para>
-
-<para>
- <screen>FILTER: foo Replace all "foo" with "bar"</screen>
-</para>
-
-<para>
- Below that line, and up to the next header line, come the jobs that
- define what text replacements the filter executes. They are specified
- in a syntax that imitates <ulink url="http://www.perl.org/">Perl</ulink>'s
- <literal>s///</literal> operator. If you are familiar with Perl, you
- will find this to be quite intuitive, and may want to look at the
- PCRS documentation for the subtle differences to Perl behaviour. Most
- notably, the non-standard option letter <literal>U</literal> is supported,
- which turns the default to ungreedy matching.
-</para>
-
-<para>
- If you are new to
-  <ulink url="http://en.wikipedia.org/wiki/Regular_expressions"><quote>Regular
-  Expressions</quote></ulink>, you might want to take a look at
- the <link linkend="regex">Appendix on regular expressions</link>, and
- see the <ulink url="http://perldoc.perl.org/perlre.html">Perl
- manual</ulink> for
- <ulink url="http://perldoc.perl.org/perlop.html">the
- <literal>s///</literal> operator's syntax</ulink> and <ulink
- url="http://perldoc.perl.org/perlre.html">Perl-style regular
- expressions</ulink> in general.
- The below examples might also help to get you started.
-</para>
-
-
-<!--   ~~~~~~~~       New section Header    ~~~~~~~~~     -->
-
-<sect2><title>Filter File Tutorial</title>
-<para>
- Now, let's complete our <quote>foo</quote> content filter. We have already defined
- the heading, but the jobs are still missing. Since all it does is to replace
- <quote>foo</quote> with <quote>bar</quote>, there is only one (trivial) job
- needed:
-</para>
-
-<para>
- <screen>s/foo/bar/</screen>
-</para>
-
-<para>
- But wait! Didn't the description say that <emphasis>all</emphasis> occurrences
- of <quote>foo</quote> should be replaced? Our current job will only take
- care of the first <quote>foo</quote> on each page. For global substitution,
- we'll need to add the <literal>g</literal> option:
-</para>
-
-<para>
- <screen>s/foo/bar/g</screen>
-</para>
-
-<para>
- Our complete filter now looks like this:
-</para>
-<para>
- <screen>FILTER: foo Replace all "foo" with "bar"
-s/foo/bar/g</screen>
-</para>
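-
-<para>
- To try the filter, it has to be loaded and enabled. The following is a
- minimal sketch, assuming the filter was saved in
- <filename>user.filter</filename> and that <literal>.example.com</literal>
- stands in for the sites you actually want to filter. In the main
- configuration file:
-</para>
-<para>
- <screen>filterfile user.filter</screen>
-</para>
-<para>
- And in an actions file such as <filename>user.action</filename>:
-</para>
-<para>
- <screen>{ +filter{foo} }
-.example.com</screen>
-</para>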
-
-<para>
- Let's look at some real filters for more interesting examples. Here you see
- a filter that protects against some common annoyances that arise from JavaScript
- abuse. Let's look at its jobs one after the other:
-</para>
-
-
-<para>
- <screen>
-FILTER: js-annoyances Get rid of particularly annoying JavaScript abuse
-
-# Get rid of JavaScript referrer tracking. Test page: http://www.randomoddness.com/untitled.htm
-#
-s|(&lt;script.*)document\.referrer(.*&lt;/script&gt;)|$1"Not Your Business!"$2|Usg</screen>
-</para>
-
-<para>
- Following the header line and a comment, you see the job. Note that it uses
- <literal>|</literal> as the delimiter instead of <literal>/</literal>, because
- the pattern contains a forward slash, which would otherwise have to be escaped
- by a backslash (<literal>\</literal>).
-</para>
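-
-<para>
- For comparison, here is a sketch of the same job written with
- <literal>/</literal> as the delimiter. The slash in the closing tag
- then has to be escaped:
-</para>
-<para>
- <screen>s/(&lt;script.*)document\.referrer(.*&lt;\/script&gt;)/$1"Not Your Business!"$2/Usg</screen>
-</para>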
-
-<para>
- Now, let's examine the pattern: it starts with the text <literal>&lt;script.*</literal>
- enclosed in parentheses. Since the dot matches any character, and <literal>*</literal>
- means: <quote>Match an arbitrary number of the element left of myself</quote>, this
- matches <quote>&lt;script</quote>, followed by <emphasis>any</emphasis> text, i.e.
- it matches the whole page, from the start of the first &lt;script&gt; tag.
-</para>
-
-<para>
- That's more than we want, but the pattern continues: <literal>document\.referrer</literal>
- matches only the exact string <quote>document.referrer</quote>. The dot needed to
- be <emphasis>escaped</emphasis>, i.e. preceded by a backslash, to take away its
- special meaning as a joker, and make it just a regular dot. So far, the meaning is:
- Match from the start of the first &lt;script&gt; tag in the page, up to, and including,
- the text <quote>document.referrer</quote>, if <emphasis>both</emphasis> are present
- in the page (and appear in that order).
-</para>
-
-<para>
- But there's still more pattern to go. The next element, again enclosed in parentheses,
- is <literal>.*&lt;/script&gt;</literal>. You already know what <literal>.*</literal>
- means, so the whole pattern translates to: Match from the start of the first  &lt;script&gt;
- tag in a page to the end of the last &lt;script&gt; tag, provided that the text
- <quote>document.referrer</quote> appears somewhere in between.
-</para>
-
-<para>
- This is still not the whole story, since we have ignored the options and the parentheses:
- The portions of the page matched by sub-patterns that are enclosed in parentheses will be
- remembered and made available through the variables <literal>$1, $2, ...</literal> in
- the substitute. The <literal>U</literal> option switches to ungreedy matching, which means
- that the first <literal>.*</literal> in the pattern will only <quote>eat up</quote> all
- text in between <quote>&lt;script</quote> and the <emphasis>first</emphasis> occurrence
- of <quote>document.referrer</quote>, and that the second <literal>.*</literal> will
- only span the text up to the <emphasis>first</emphasis> <quote>&lt;/script&gt;</quote>
- tag. Furthermore, the <literal>s</literal> option says that the match may span
- multiple lines in the page, and the <literal>g</literal> option again means that the
- substitution is global.
-</para>
-
-<para>
- So, to summarize, the pattern means: Match all scripts that contain the text
- <quote>document.referrer</quote>. Remember the parts of the script from
- (and including) the start tag up to (and excluding) the string
- <quote>document.referrer</quote> as <literal>$1</literal>, and the part following
- that string, up to and including the closing tag, as <literal>$2</literal>.
-</para>
-
-<para>
- Now the pattern is deciphered, but wasn't this about substituting things? So
- let's look at the substitute: <literal>$1"Not Your Business!"$2</literal> is
- easy to read: The text remembered as <literal>$1</literal>, followed by
- <literal>"Not Your Business!"</literal> (<emphasis>including</emphasis>
- the quotation marks!), followed by the text remembered as <literal>$2</literal>.
- This produces an exact copy of the original string, with the middle part
- (the <quote>document.referrer</quote>) replaced by <literal>"Not Your
- Business!"</literal>.
-</para>
-
-<para>
- The whole job now reads: Replace <quote>document.referrer</quote> by
- <literal>"Not Your Business!"</literal> wherever it appears inside a
- &lt;script&gt; tag. Note that this job won't break JavaScript syntax,
- since both the original and the replacement are syntactically valid
- string objects. The script just won't have access to the referrer
- information anymore.
-</para>
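-
-<para>
- As an illustration, consider this made-up script fragment (the variable
- name is just an example):
-</para>
-<para>
- <screen>&lt;script&gt;var origin = document.referrer;&lt;/script&gt;</screen>
-</para>
-<para>
- With the filter enabled, the browser should receive this instead:
-</para>
-<para>
- <screen>&lt;script&gt;var origin = "Not Your Business!";&lt;/script&gt;</screen>
-</para>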
-
-<para>
- We'll show you two other jobs from the JavaScript taming department, but
- this time only point out the constructs of special interest:
-</para>
-
-<para>
- <screen>
-# The status bar is for displaying link targets, not pointless blahblah
-#
-s/window\.status\s*=\s*(['"]).*?\1/dUmMy=1/ig</screen>
-</para>
-
-<para>
- <literal>\s</literal> stands for whitespace characters (space, tab, newline,
- carriage return, form feed), so that <literal>\s*</literal> means: <quote>zero
- or more whitespace</quote>. The <literal>?</literal> in <literal>.*?</literal>
- makes this matching of arbitrary text ungreedy. (Note that the <literal>U</literal>
- option is not set). The <literal>['"]</literal> construct means: <quote>a single
- <emphasis>or</emphasis> a double quote</quote>. Finally, <literal>\1</literal> is
- a back-reference to the first parenthesis just like <literal>$1</literal> above,
- with the difference that in the <emphasis>pattern</emphasis>, a backslash indicates
- a back-reference, whereas in the <emphasis>substitute</emphasis>, it's the dollar.
-</para>
-
-<para>
- So what does this job do? It replaces assignments of single- or double-quoted
- strings to the <quote>window.status</quote> object with a dummy assignment
- (using a variable name that is hopefully odd enough not to conflict with
- real variables in scripts). Thus, it catches many cases where e.g. pointless
- descriptions are displayed in the status bar instead of the link target when
- you move your mouse over links.
-</para>
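-
-<para>
- A made-up example of what this looks like in practice: a script line
- such as
-</para>
-<para>
- <screen>window.status = "Click me!";</screen>
-</para>
-<para>
- should reach the browser as
-</para>
-<para>
- <screen>dUmMy=1;</screen>
-</para>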
-
-<para>
- <screen>
-# Kill OnUnload popups. Yummy. Test: http://www.zdnet.com/zdsubs/yahoo/tree/yfs.html
-#
-s/(&lt;body [^&gt;]*)onunload(.*&gt;)/$1never$2/iU</screen>
-</para>
-
-<para>
- Including the
- <ulink url="http://www.w3.org/TR/2000/REC-DOM-Level-2-Events-20001113/events.html#Events-eventgroupings-htmlevents">OnUnload
- event binding</ulink> in the HTML DOM was a <emphasis>CRIME</emphasis>.
- When I close a browser window, I want it to close and die. Basta.
- This job replaces the <quote>onunload</quote> attribute in
- <quote>&lt;body&gt;</quote> tags with the dummy word <literal>never</literal>.
- Note that the <literal>i</literal> option makes the pattern matching
- case-insensitive. Also note that ungreedy matching alone doesn't always guarantee
- a minimal match: In the first parenthesis, we had to use <literal>[^&gt;]*</literal>
- instead of <literal>.*</literal> to prevent the match from exceeding the
- &lt;body&gt; tag if it doesn't contain <quote>OnUnload</quote>, but the page's
- content does.
-</para>
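-
-<para>
- To illustrate with a made-up tag (the handler name is hypothetical),
- this input
-</para>
-<para>
- <screen>&lt;body bgcolor="white" onUnload="showExitConsole()"&gt;</screen>
-</para>
-<para>
- should be rewritten to
-</para>
-<para>
- <screen>&lt;body bgcolor="white" never="showExitConsole()"&gt;</screen>
-</para>
-<para>
- Browsers ignore the unknown <literal>never</literal> attribute, so the
- handler is never registered.
-</para>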
-
-<para>
- The last example is from the fun department:
-</para>
-
-<para>
- <screen>
-FILTER: fun Fun text replacements
-
-# Spice the daily news:
-#
-s/microsoft(?!\.com)/MicroSuck/ig</screen>
-</para>
-
-<para>
- Note the <literal>(?!\.com)</literal> part (a so-called negative lookahead)
- in the job's pattern, which means: Don't match, if the string
- <quote>.com</quote> appears directly following <quote>microsoft</quote>
- in the page. This prevents links to microsoft.com from being trashed, while
- still replacing the word everywhere else.
-</para>
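-
-<para>
- For example, given this made-up sentence:
-</para>
-<para>
- <screen>Microsoft ships a new release, see www.microsoft.com for details.</screen>
-</para>
-<para>
- only the first occurrence should be replaced, because the second one is
- directly followed by <quote>.com</quote>:
-</para>
-<para>
- <screen>MicroSuck ships a new release, see www.microsoft.com for details.</screen>
-</para>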
-
-<para>
- <screen>
-# Buzzword Bingo (example for extended regex syntax)
-#
-s* industry[ -]leading \
-|  cutting[ -]edge \
-|  customer[ -]focused \
-|  market[ -]driven \
-|  award[ -]winning # Comments are OK, too! \
-|  high[ -]performance \
-|  solutions[ -]based \
-|  unmatched \
-|  unparalleled \
-|  unrivalled \
-*&lt;font color="red"&gt;&lt;b&gt;BINGO!&lt;/b&gt;&lt;/font&gt; \
-*igx</screen>
-</para>
-
-<para>
- The <literal>x</literal> option in this job turns on extended syntax, and allows for
- e.g. the liberal use of (non-interpreted!) whitespace for nicer formatting.
-</para>
-
-<para>
- You get the idea?
-</para>
-</sect2>
-
-<!--   ~~~~~~~~       New section Header    ~~~~~~~~~     -->
-
-<sect2 id="predefined-filters"><title>The Pre-defined Filters</title>
-
-<!--
-
- Note each filter is also listed in the +filter action section above. Please
- keep these listings in sync.
-
--->
-
-<para>
-The distribution <filename>default.filter</filename> file contains a selection of
-pre-defined filters for your convenience:
-</para>
-
-<variablelist>
- <varlistentry>
-  <term><emphasis>js-annoyances</emphasis></term>
-  <listitem>
-   <para>
-    The purpose of this filter is to get rid of particularly annoying JavaScript abuse.
-    To that end, it
-   <itemizedlist>
-    <listitem>
-     <para>
-      replaces JavaScript references to the browser's referrer information
-      with the string "Not Your Business!". This complements the <literal><link
-      linkend="hide-referrer">hide-referrer</link></literal> action on the content level.
-     </para>
-    </listitem>
-    <listitem>
-     <para>
-      removes the bindings to the DOM's
-      <ulink url="http://www.w3.org/TR/2000/REC-DOM-Level-2-Events-20001113/events.html#Events-eventgroupings-htmlevents">unload
-      event</ulink> which we feel has no right to exist and is responsible for most <quote>exit consoles</quote>, i.e.
-      nasty windows that pop up when you close another one.
-     </para>
-    </listitem>
-    <listitem>
-     <para>
-      removes code that causes new windows to be opened with undesired properties, such as being
-      full-screen, non-resizeable, without location, status or menu bar etc.
-     </para>
-    </listitem>
-   </itemizedlist>
-   </para>
-   <para>
-    Use with caution. This is an aggressive filter, and can break sites that
-    rely heavily on JavaScript.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term><emphasis>js-events</emphasis></term>
-  <listitem>
-   <para>
-    This is a very radical measure. It removes virtually all JavaScript event bindings, which
-    means that scripts can not react to user actions such as mouse movements or clicks, window
-    resizing etc, anymore. Use with caution!
-   </para>
-   <para>
-    We <emphasis>strongly discourage</emphasis> using this filter as a default since it breaks
-    many legitimate scripts. It is meant for use only on extra-nasty sites (should you really
-    need to go there).
-   </para>
-  </listitem>
- </varlistentry>
-
-<varlistentry>
-  <term><emphasis>html-annoyances</emphasis></term>
-  <listitem>
-   <para>
-    This filter will undo many common instances of HTML based abuse.
-   </para>
-   <para>
-    The <literal>BLINK</literal> and <literal>MARQUEE</literal> tags
-    are neutralized (yeah baby!), and browser windows will be created as
-    resizeable (as of course they should be!), and will have location,
-    scroll and menu bars -- even if specified otherwise.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term><emphasis>content-cookies</emphasis></term>
-  <listitem>
-   <para>
-    Most cookies are set in the HTTP dialog, where they can be intercepted
-    by the
-    <literal><link linkend="crunch-incoming-cookies">crunch-incoming-cookies</link></literal>
-    and <literal><link linkend="crunch-outgoing-cookies">crunch-outgoing-cookies</link></literal>
-    actions. But web sites increasingly make use of HTML meta tags and JavaScript
-    to sneak cookies to the browser on the content level.
-   </para>
-   <para>
-    This filter disables most HTML and JavaScript code that reads or sets
-    cookies. It cannot detect all clever uses of these types of code, so it
-    should not be relied on as an absolute fix. Use it wherever you would also
-    use the cookie crunch actions.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term><emphasis>refresh-tags</emphasis></term>
-  <listitem>
-   <para>
-    Disable any refresh tags if the interval is greater than nine seconds (so
-    that redirections done via refresh tags are not destroyed). This is useful
-    for dial-on-demand setups, or for those who find this HTML feature
-    annoying.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term><emphasis>unsolicited-popups</emphasis></term>
-  <listitem>
-   <para>
-    This filter attempts to prevent only <quote>unsolicited</quote> pop-up
-    windows from opening, yet still allow pop-up windows that the user
-    has explicitly chosen to open. It was added in version 3.0.1,
-    as an improvement over earlier such filters.
-   </para>
-   <para>
-    Technical note: The filter works by redefining the window.open JavaScript
-    function to a dummy function, <literal>PrivoxyWindowOpen()</literal>,
-    during the loading and rendering phase of each HTML page access, and
-    restoring the function afterward.
-   </para>
-   <para>
-    This is recommended only for browsers that cannot perform this function
-    reliably themselves. And be aware that some sites require such windows
-    in order to function normally. Use with caution.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term><emphasis>all-popups</emphasis></term>
-  <listitem>
-   <para>
-    Attempt to prevent <emphasis>all</emphasis> pop-up windows from opening.
-    Note this should be used with even more discretion than the above, since
-    it is more likely to break some sites that require pop-ups for normal
-    usage. Use with caution.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term><emphasis>img-reorder</emphasis></term>
-  <listitem>
-   <para>
-    This is a helper filter that has no value if used alone. It makes the
-    <literal>banners-by-size</literal> and <literal>banners-by-link</literal>
-    (see below) filters more effective and should be enabled together with them.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term><emphasis>banners-by-size</emphasis></term>
-  <listitem>
-   <para>
-    This filter removes image tags purely based on what size they are. Fortunately
-    for us, many ads and banner images tend to conform to certain standardized
-    sizes, which makes this filter quite effective for ad stripping purposes.
-   </para>
-   <para>
-    Occasionally this filter will cause false positives on images that are not ads,
-    but just happen to be of one of the standard banner sizes.
-   </para>
-   <para>
-    Recommended only for those who require extreme ad blocking. The default
-    block rules should catch 95+% of all ads <emphasis>without</emphasis> this filter enabled.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term><emphasis>banners-by-link</emphasis></term>
-  <listitem>
-   <para>
-    This is an experimental filter that attempts to kill any banners if
-    their URLs seem to point to known or suspected click trackers. It is currently
-    not of much value and is not recommended for use by default.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term><emphasis>webbugs</emphasis></term>
-  <listitem>
-   <para>
-    Webbugs are small, invisible images (technically 1X1 GIF images), that
-    are used to track users across websites, and collect information on them.
-    As an HTML page is loaded by the browser, an embedded image tag causes the
-    browser to contact a third-party site, disclosing the tracking information
-    through the requested URL and/or cookies for that third-party domain, without
-    the user ever becoming aware of the interaction with the third-party site.
-    HTML-ized spam also uses a similar technique to verify email addresses.
-   </para>
-   <para>
-    This filter removes the HTML code that loads such <quote>webbugs</quote>.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term><emphasis>tiny-textforms</emphasis></term>
-  <listitem>
-   <para>
-    A rather special-purpose filter that can be used to enlarge textareas (those
-    multi-line text boxes in web forms) and turn off hard word wrap in them.
-    It was written for the sourceforge.net tracker system where such boxes are
-    a nuisance, but it can be handy on other sites, too.
-   </para>
-   <para>
-    It is not recommended to use this filter as a default.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term><emphasis>jumping-windows</emphasis></term>
-  <listitem>
-   <para>
-    Many consider windows that move, or resize themselves to be abusive. This filter
-    neutralizes the related JavaScript code. Note that some sites might not display
-    or behave as intended when using this filter. Use with caution.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term><emphasis>frameset-borders</emphasis></term>
-  <listitem>
-   <para>
-    Some web designers seem to assume that everyone in the world will view their
-    web sites using the same browser brand and version, screen resolution etc,
-    because only that assumption could explain why they'd use static frame sizes,
-    yet prevent their frames from being resized by the user, should they be too
-    small to show their whole content.
-   </para>
-   <para>
-    This filter removes the related HTML code. It should only be applied to sites
-    which need it.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term><emphasis>demoronizer</emphasis></term>
-  <listitem>
-   <para>
-    Many Microsoft products that generate HTML use non-standard extensions (read:
-    violations) of the ISO 8859-1 aka Latin-1 character set. This can cause those
-    HTML documents to display with errors on standard-compliant platforms.
-   </para>
-   <para>
-    This filter translates the MS-only characters into Latin-1 equivalents.
-    It is not necessary when using MS products, and will cause corruption of
-    all documents that use 8-bit character sets other than Latin-1. It's mostly
-    worthwhile for Europeans on non-MS platforms, if weird garbage characters
-    sometimes appear on some pages, or user agents that don't correct for this on
-    the fly.
-<!--
-    My version of Mozilla (ancient) shows little square boxes for quote
-    characters, and apostrophes on moronized pages. So many pages have this, I
-    can read them fine now. HB 08/27/06
--->
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term><emphasis>shockwave-flash</emphasis></term>
-  <listitem>
-   <para>
-    A filter for shockwave haters. As the name suggests, this filter strips code
-    out of web pages that is used to embed shockwave flash objects.
-   </para>
-   <para>
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term><emphasis>quicktime-kioskmode</emphasis></term>
-  <listitem>
-   <para>
-    Change HTML code that embeds Quicktime objects so that kioskmode, which
-    prevents saving, is disabled.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term><emphasis>fun</emphasis></term>
-  <listitem>
-   <para>
-    Text replacements for subversive browsing fun. Make fun of your favorite
-    Monopolist or play buzzword bingo.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term><emphasis>crude-parental</emphasis></term>
-  <listitem>
-   <para>
-    A demonstration-only filter that shows how <application>Privoxy</application>
-    can be used to delete web content on a keyword basis.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term><emphasis>ie-exploits</emphasis></term>
-  <listitem>
-   <para>
-    An experimental collection of text replacements to disable malicious HTML and JavaScript
-    code that exploits known security holes in Internet Explorer.
-   </para>
-   <para>
-    Presently, it only protects against Nimda and a cross-site scripting bug, and
-    would need active maintenance to provide more substantial protection.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term><emphasis>site-specifics</emphasis></term>
-  <listitem>
-   <para>
-    Some web sites have very specific problems, the cure for which doesn't apply
-    anywhere else, or could even cause damage on other sites.
-   </para>
-   <para>
-    This is a collection of such site-specific cures which should only be applied
-    to the sites they were intended for, which is what the supplied
-    <filename>default.action</filename> file does. Users shouldn't need to change
-    anything regarding this filter.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term><emphasis>google</emphasis></term>
-  <listitem>
-   <para>
-    A CSS based block for Google text ads. Also removes a width limitation
-    and the toolbar advertisement.
-   </para>
-  </listitem>
- </varlistentry>
-
-  <varlistentry>
-  <term><emphasis>yahoo</emphasis></term>
-  <listitem>
-   <para>
-    Another CSS based block, this time for Yahoo text ads. And removes
-    a width limitation as well.
-   </para>
-  </listitem>
- </varlistentry>
-
-  <varlistentry>
-  <term><emphasis>msn</emphasis></term>
-  <listitem>
-   <para>
-    Another CSS based block, this time for MSN text ads. And removes
-    tracking URLs, as well as a width limitation.
-   </para>
-  </listitem>
- </varlistentry>
-
- <varlistentry>
-  <term><emphasis>blogspot</emphasis></term>
-  <listitem>
-   <para>
-    Cleans up some Blogspot blogs. Read the fine print before using this one!
-   </para>
-   <para>
-    This filter also intentionally removes some navigation stuff and sets the
-    page width to 100%. As a result, some rounded <quote>corners</quote> would
-    appear too early or not at all. As fixing this would require a browser
-    that understands background-size (CSS3), they are removed instead.
-   </para>
-  </listitem>
- </varlistentry>
-
-  <varlistentry>
-  <term><emphasis>xml-to-html</emphasis></term>
-  <listitem>
-   <para>
-    Server-header filter to change the Content-Type from xml to html.
-   </para>
-  </listitem>
- </varlistentry>
-
-  <varlistentry>
-  <term><emphasis>html-to-xml</emphasis></term>
-  <listitem>
-   <para>
-    Server-header filter to change the Content-Type from html to xml.
-   </para>
-  </listitem>
- </varlistentry>
-
-  <varlistentry>
-  <term><emphasis>no-ping</emphasis></term>
-  <listitem>
-   <para>
-    Removes the non-standard <literal>ping</literal> attribute from
-    anchor and area HTML tags.
-   </para>
-  </listitem>
- </varlistentry>
-
-  <varlistentry>
-  <term><emphasis>hide-tor-exit-notation</emphasis></term>
-  <listitem>
-   <para>
-    Client-header filter to remove the <command>Tor</command> exit node notation
-    found in Host and Referer headers.
-   </para>
-   <para>
-    If &my-app; and <command>Tor</command> are chained and &my-app;
-    is configured to use socks4a, one can use <quote>http://www.example.org.foobar.exit/</quote>
-    to access the host <quote>www.example.org</quote> through the
-    <command>Tor</command> exit node <quote>foobar</quote>.
-   </para>
-   <para>
-    As the HTTP client isn't aware of this notation, it treats the
-    whole string <quote>www.example.org.foobar.exit</quote> as host and uses it
-    for the <quote>Host</quote> and <quote>Referer</quote> headers. From the
-    server's point of view the resulting headers are invalid and can cause problems.
-   </para>
-   <para>
-    An invalid <quote>Referer</quote> header can trigger <quote>hot-linking</quote>
-    protections, an invalid <quote>Host</quote> header will make it impossible for
-    the server to find the right vhost (several domains hosted on the same IP address).
-   </para>
-   <para>
-    This client-header filter removes the <quote>foobar.exit</quote> part in those headers
-    to prevent the mentioned problems. Note that it only modifies
-    the HTTP headers, it doesn't make it impossible for the server
-    to detect your <command>Tor</command> exit node based on the IP address
-    the request is coming from.
-   </para>
-  </listitem>
- </varlistentry>
-
-<!--
- <varlistentry>
-  <term><emphasis> </emphasis></term>
-  <listitem>
-   <para>
-   </para>
-   <para>
-   </para>
-  </listitem>
- </varlistentry>
--->
-</variablelist>
-
-</sect2>
-</sect1>
-
-<!--  ~  End section  ~  -->
-
-
-
-<!--   ~~~~~       New section      ~~~~~     -->
-
-<sect1 id="templates">
-<title>Privoxy's Template Files</title>
-<para>
- All <application>Privoxy</application> built-in pages, i.e. error pages such as the
- <ulink url="http://show-the-404-error.page"><quote>404 - No Such Domain</quote>
- error page</ulink>, the <ulink
- url="http://ads.bannerserver.example.com/nasty-ads/sponsor.html"><quote>BLOCKED</quote>
- page</ulink>
- and all pages of its <ulink url="http://config.privoxy.org/">web-based
- user interface</ulink>, are generated from <emphasis>templates</emphasis>.
- (<application>Privoxy</application> must be running for the above links to work as
- intended.)
-</para>
-
-<para>
- These templates are stored in a subdirectory of the <link linkend="confdir">configuration
- directory</link> called <filename>templates</filename>. On Unixish platforms,
- this is typically
- <ulink url="file:///etc/privoxy/templates/"><filename>/etc/privoxy/templates/</filename></ulink>.
-</para>
-
-<para>
- The templates are basically normal HTML files, but with place-holders (called symbols
- or exports), which <application>Privoxy</application> fills at run time. It
- is possible to edit the templates with a normal text editor, should you want
- to customize them. (<emphasis>Not recommended for the casual
- user</emphasis>). Should you create your own custom templates, you should use
- the <filename>config</filename> setting <link linkend="templdir">templdir</link>
- to specify an alternate location, so your templates do not get overwritten
- during upgrades.
- </para>
- <para>
- Note that just like in configuration files, lines starting
- with <literal>#</literal> are ignored when the templates are filled in.
-</para>
-
-<para>
- The place-holders are of the form <literal>@name@</literal>, and you will
- find a list of available symbols, which vary from template to template,
- in the comments at the start of each file. Note that these comments are not
- always accurate, and that it's probably best to look at the existing HTML
- code to find out which symbols are supported and what they are filled in with.
-</para>
-
-<para>
- A special application of this substitution mechanism is to make whole
- blocks of HTML code disappear when a specific symbol is set. We use this
- for many purposes, one of them being to include the beta warning in all
- our user interface (CGI) pages when <application>Privoxy</application>
- is in an alpha or beta development stage:
-</para>
-
-<para>
- <screen>
-&lt;!-- @if-unstable-start --&gt;
-
-  ... beta warning HTML code goes here ...
-
-&lt;!-- if-unstable-end@ --&gt;</screen>
-</para>
-
-<para>
- If the "unstable" symbol is set, everything in between and including
- <literal>@if-unstable-start</literal> and <literal>if-unstable-end@</literal>
- will disappear, leaving nothing but an empty comment:
-</para>
-
-<para>
- <screen>&lt;!--  --&gt;</screen>
-</para>
-
-<para>
- There's also an if-then-else construct and an <literal>#include</literal>
- mechanism, but you'll surely find out if you are inclined to edit the
- templates ;-)
-</para>
-
-<para>
- All templates refer to a style located at
- <ulink url="http://config.privoxy.org/send-stylesheet"><literal>http://config.privoxy.org/send-stylesheet</literal></ulink>.
- This is, of course, locally served by <application>Privoxy</application>
- and the source for it can be found and edited in the
- <filename>cgi-style.css</filename> template.
-</para>
-
-</sect1>
-
-<!--  ~  End section  ~  -->
-
-
-
-<!--   ~~~~~       New section      ~~~~~     -->
-
-<sect1 id="contact"><title>Contacting the Developers, Bug Reporting and Feature
-Requests</title>
-
-<!-- Include contacting.sgml boilerplate: -->
- &contacting;
-<!-- end boilerplate -->
-
-</sect1>
-
-<!--  ~  End section  ~  -->
-
-
-<!--   ~~~~~       New section      ~~~~~     -->
-<sect1 id="copyright"><title>Privoxy Copyright, License and History</title>
-
-<!-- Include copyright.sgml: -->
- &copyright;
-<!-- end copyright -->
-
-<!--   ~~~~~       New section      ~~~~~     -->
-<sect2><title>License</title>
-<!-- Include license.sgml: -->
- &license;
-<!-- end license -->
-</sect2>
-<!--  ~  End section  ~  -->
-
-
-<!--   ~~~~~       New section      ~~~~~     -->
-
-<sect2 id="history"><title>History</title>
-<!-- Include history.sgml: -->
- &history;
-<!-- end history -->
-</sect2>
-
-<sect2 id="authors"><title>Authors</title>
-<!-- Include p-authors.sgml: -->
- &p-authors;
-<!-- end authors -->
-</sect2>
-
-</sect1>
-
-<!--  ~  End section  ~  -->
-
-
-<!--   ~~~~~       New section      ~~~~~     -->
-<sect1 id="seealso"><title>See Also</title>
-<!-- Include seealso.sgml: -->
- &seealso;
-<!-- end seealso -->
-</sect1>
-
-
-
-<!--   ~~~~~       New section      ~~~~~     -->
-<sect1 id="appendix"><title>Appendix</title>
-
-
-<!--   ~~~~~       New section      ~~~~~     -->
-<sect2 id="regex">
-<title>Regular Expressions</title>
-<para>
- <application>Privoxy</application> uses Perl-style <quote>regular
- expressions</quote> in its <link linkend="actions-file">actions
- files</link> and <link linkend="filter-file">filter file</link>,
- through the <ulink url="http://www.pcre.org/">PCRE</ulink> and
-<!--
- dead 08/27/06
- <ulink url="http://www.oesterhelt.org/pcrs/">PCRS</ulink> libraries.
--->
- <application>PCRS</application> libraries.
-</para>
-
-<para>
- If you are reading this, you probably don't understand what <quote>regular
- expressions</quote> are, or what they can do. So this will be a very brief
- introduction only. A full explanation would require a <ulink
- url="http://www.oreilly.com/catalog/regex/">book</ulink> ;-)
-</para>
-
-<para>
- Regular expressions provide a language to describe patterns that can be
- run against strings of characters (letters, numbers, etc.), to see if they
- match the string or not. The patterns are themselves (sometimes complex)
- strings of literal characters, combined with wild-cards and other special
- characters, called meta-characters. The <quote>meta-characters</quote> have
- special meanings and are used to build complex patterns to be matched against.
- Perl Compatible Regular Expressions are an especially convenient
- <quote>dialect</quote> of the regular expression language.
-</para>
-
-<para>
- To make a simple analogy, we do something similar when we use wild-card
- characters when listing files with the <command>dir</command> command in DOS.
- <literal>*.*</literal> matches all filenames. The <quote>special</quote>
- character here is the asterisk which matches any and all characters. We can be
- more specific and use <literal>?</literal> to match just individual
- characters. So <quote>dir file?.txt</quote> would match
- <quote>file1.txt</quote>, <quote>file2.txt</quote>, etc. We are pattern
- matching, using a similar technique to <quote>regular expressions</quote>!
-</para>
-
-<para>
- Regular expressions do essentially the same thing, but are much, much more
- powerful. There are many more <quote>special characters</quote> and ways of
- building complex patterns however. Let's look at a few of the common ones,
- and then some examples:
-</para>
-
-<para><simplelist>
- <member>
-  <emphasis>.</emphasis> - Matches any single character, e.g. <quote>a</quote>,
-  <quote>A</quote>, <quote>4</quote>, <quote>:</quote>, or <quote>@</quote>.
- </member>
-</simplelist></para>
-
-<para><simplelist>
- <member>
-  <emphasis>?</emphasis> - The preceding character or expression is matched ZERO or ONE
-  times. Either/or.
- </member>
-</simplelist></para>
-
-<para><simplelist>
- <member>
-  <emphasis>+</emphasis> - The preceding character or expression is matched ONE or MORE
-  times.
- </member>
-</simplelist></para>
-
-<para><simplelist>
- <member>
-  <emphasis>*</emphasis> - The preceding character or expression is matched ZERO or MORE
-  times.
- </member>
-</simplelist></para>
-
-<para><simplelist>
- <member>
-  <emphasis>\</emphasis> - The <quote>escape</quote> character denotes that
-  the following character should be taken literally. This is used where one of the
-  special characters (e.g. <quote>.</quote>) needs to be taken literally and
-  not as a special meta-character. Example: <quote>example\.com</quote>, makes
-  sure the period is recognized only as a period (and not expanded to its
-  meta-character meaning of any single character).
- </member>
-</simplelist></para>
-
-<para><simplelist>
- <member>
-  <emphasis>[ ]</emphasis> - Characters enclosed in brackets will be matched if
-  any of the enclosed characters are encountered. For instance, <quote>[0-9]</quote>
-  matches any numeric digit (zero through nine). As an example, we can combine
-  this with <quote>+</quote> to match any digit one or more times: <quote>[0-9]+</quote>.
- </member>
-</simplelist></para>
-
-<para><simplelist>
- <member>
-  <emphasis>( )</emphasis> - Parentheses are used to group a sub-expression,
-  or multiple sub-expressions.
- </member>
-</simplelist></para>
-
-<para><simplelist>
- <member>
-  <emphasis>|</emphasis> - The <quote>bar</quote> character works like an
-  <quote>or</quote> conditional statement. A match is successful if the
-  sub-expression on either side of <quote>|</quote> matches. As an example:
-  <quote>/(this|that) example/</quote> uses grouping and the bar character
-  and would match either <quote>this example</quote> or <quote>that
-  example</quote>, and nothing else.
- </member>
-</simplelist></para>
-
-<para>
- These are just some of the ones you are likely to use when matching URLs with
- <application>Privoxy</application>, and this is a long way from a definitive
- list. This is enough to get us started with a few simple examples which may
- be more illuminating:
-</para>
-
-<para>
- <emphasis><literal>/.*/banners/.*</literal></emphasis> - A simple example
- that uses the common combination of <quote>.</quote> and <quote>*</quote> to
- denote any character, zero or more times. In other words, any string at all.
- So we start with a literal forward slash, then our regular expression pattern
- (<quote>.*</quote>), another literal forward slash, the string
- <quote>banners</quote>, another forward slash, and lastly another
- <quote>.*</quote>. We are building
- a directory path here. This will match any file with the path that has a
- directory named <quote>banners</quote> in it. The <quote>.*</quote> matches
- any characters, and this could conceivably be more forward slashes, so it
- might expand into a much longer looking path. For example, this could match:
- <quote>/eye/hate/spammers/banners/annoy_me_please.gif</quote>, or just
- <quote>/banners/annoying.html</quote>, or almost an infinite number of other
- possible combinations, just so it has <quote>banners</quote> in the path
- somewhere.
-</para>
-
-<para>
- And now something a little more complex:
-</para>
-
-<para>
- <emphasis><literal>/.*/adv((er)?ts?|ertis(ing|ements?))?/</literal></emphasis> -
- We have several literal forward slashes again (<quote>/</quote>), so we are
- building another expression that is a file path statement. We have another
- <quote>.*</quote>, so we are matching against any conceivable sub-path, just so
- it matches our expression. The only true literal that <emphasis>must
- match</emphasis> our pattern is <application>adv</application>, together with
- the forward slashes. What comes after the <quote>adv</quote> string is the
- interesting part.
-</para>
-
-<para>
- Remember the <quote>?</quote> means the preceding expression (either a
- literal character or anything grouped with <quote>(...)</quote> in this case)
- can exist or not, since this means either zero or one match. So
- <quote>((er)?ts?|ertis(ing|ements?))</quote> is optional, as are the
- individual sub-expressions: <quote>(er)</quote>,
- <quote>(ing|ements?)</quote>, and the <quote>s</quote>. The <quote>|</quote>
- means <quote>or</quote>. We have two of those. For instance,
- <quote>(ing|ements?)</quote>, can expand to match either <quote>ing</quote>
- <emphasis>OR</emphasis> <quote>ements?</quote>. What is being done here, is an
- attempt at matching as many variations of <quote>advertisement</quote>, and
- similar, as possible. So this would expand to match just <quote>adv</quote>,
- or <quote>advert</quote>, or <quote>adverts</quote>, or
- <quote>advertising</quote>, or <quote>advertisement</quote>, or
- <quote>advertisements</quote>. You get the idea. But it would not match
- <quote>advertizements</quote> (with a <quote>z</quote>). We could fix that by
- changing our regular expression to:
- <quote>/.*/adv((er)?ts?|erti(s|z)(ing|ements?))?/</quote>, which would then match
- either spelling.
-</para>
-
+
+<!--   ~~~~~       New section      ~~~~~     -->
+<sect3>
+<title>Summary</title>
  <para>
- <emphasis><literal>/.*/advert[0-9]+\.(gif|jpe?g)</literal></emphasis> - Again
- another path statement with forward slashes. Anything in the square brackets
- <quote>[ ]</quote> can be matched. This is using <quote>0-9</quote> as a
- shorthand expression to mean any digit zero through nine. It is the same as
- saying <quote>0123456789</quote>. So any digit matches. The <quote>+</quote>
- means one or more of the preceding expression must be included. The preceding
- expression here is what is in the square brackets -- in this case, any digit
- zero through nine. Then, at the end, we have a grouping: <quote>(gif|jpe?g)</quote>.
- This includes a <quote>|</quote>, so this needs to match the expression on
- either side of that bar character also. A simple <quote>gif</quote> on one side, and the other
- side will in turn match either <quote>jpeg</quote> or <quote>jpg</quote>,
- since the <quote>?</quote> means the letter <quote>e</quote> is optional and
- can be matched once or not at all. So we are building an expression here to
- match a GIF or JPEG image file. It must include the literal
- string <quote>advert</quote>, then one or more digits, and a <quote>.</quote>
- (which is now a literal, and not a special character, since it is escaped
- with <quote>\</quote>), and lastly either <quote>gif</quote>, or
- <quote>jpeg</quote>, or <quote>jpg</quote>. Some possible matches would
- include: <quote>//advert1.jpg</quote>,
- <quote>/nasty/ads/advert1234.gif</quote>,
- <quote>/banners/from/hell/advert99.jpg</quote>. It would not match
- <quote>advert1.gif</quote> (no leading slash), or
- <quote>/adverts232.jpg</quote> (the expression does not include an
- <quote>s</quote>), or <quote>/advert1.jsp</quote> (<quote>jsp</quote> is not
- in the expression anywhere).
+ Note that many of these actions have the potential to cause a page to
+ misbehave, or even to fail to display at all. Site designers build their
+ sites in many different ways and may depend on particular HTTP header
+ content or other criteria, so there can be no hard and fast rules that
+ work for all sites. See the <link
+ linkend="ACTIONSANAT">Appendix</link> for a brief example of troubleshooting
+ actions.
  </para>
+</sect3>
+</sect2>
  
+<!--   ~~~~~       New section      ~~~~~     -->
+<sect2 id="aliases">
+<title>Aliases</title>
  <para>
- We are barely scratching the surface of regular expressions here so that you
- can understand the default <application>Privoxy</application>
- configuration files, and maybe use this knowledge to customize your own
- installation. There is much, much more that can be done with regular
- expressions. Now that you know enough to get started, you can learn more on
- your own :/
+ Custom <quote>actions</quote>, known to <application>Privoxy</application>
+ as <quote>aliases</quote>, can be defined by combining other actions.
+ These can in turn be invoked just like the built-in actions.
+ Currently, an alias name can contain any character except space, tab,
+ <quote>=</quote>,
+ <quote>{</quote> and <quote>}</quote>, but we <emphasis>strongly
+ recommend</emphasis> that you only use <quote>a</quote> to <quote>z</quote>,
+ <quote>0</quote> to <quote>9</quote>, <quote>+</quote>, and <quote>-</quote>.
+ Alias names are not case sensitive, and are not required to start with a
+ <quote>+</quote> or <quote>-</quote> sign, since they are merely textually
+ expanded.
  </para>
-
  <para>
- More reading on Perl Compatible Regular expressions:
- <ulink url="http://perldoc.perl.org/perlre.html">http://perldoc.perl.org/perlre.html</ulink>
+ Aliases can be used throughout the actions file, but they <emphasis>must be
+ defined in a special section at the top of the file!</emphasis>
+ And there can only be one such section per actions file. Each actions file may
+ have its own alias section, and the aliases defined in it are only visible
+ within that file.
+</para>
+<para>
+ There are two main reasons to use aliases: One is to save typing for frequently
+ used combinations of actions, the other one is a gain in flexibility: If you
+ decide once how you want to handle shops by defining an alias called
+ <quote>shop</quote>, you can later change your policy on shops in
+ <emphasis>one</emphasis> place, and your changes will take effect everywhere
+ in the actions file where the <quote>shop</quote> alias is used. Calling aliases
+ by their purpose also makes your actions files more readable.
+</para>
+<para>
+ Currently, there is one big drawback to using aliases, though:
+ <application>Privoxy</application>'s built-in web-based action file
+ editor honors aliases when reading the actions files, but it expands
+ them before writing. So the effects of your aliases are of course preserved,
+ but the aliases themselves are lost when you edit sections that use aliases
+ with it.
  </para>
  
  <para>
- For information on regular expression based substitutions and their applications
- in filters, please see the <link linkend="filter-file">filter file tutorial</link>
- in this manual.
+ Now let's define some aliases...
  </para>
-</sect2>
  
-<!--  ~  End section  ~  -->
+<para>
+ <screen>
+ # Useful custom aliases we can use later.
+ #
+ # Note the (required!) section header line and that this section
+ # must be at the top of the actions file!
+ #
+ {{alias}}
  
+ # These aliases just save typing later:
+ # (Note that some already use other aliases!)
+ #
+ +crunch-all-cookies = +<link linkend="CRUNCH-INCOMING-COOKIES">crunch-incoming-cookies</link> +<link linkend="CRUNCH-OUTGOING-COOKIES">crunch-outgoing-cookies</link>
+ -crunch-all-cookies = -<link linkend="CRUNCH-INCOMING-COOKIES">crunch-incoming-cookies</link> -<link linkend="CRUNCH-OUTGOING-COOKIES">crunch-outgoing-cookies</link>
+ +block-as-image      = +block{Blocked image.} +handle-as-image
+ allow-all-cookies   = -crunch-all-cookies -<link linkend="SESSION-COOKIES-ONLY">session-cookies-only</link> -<link linkend="FILTER-CONTENT-COOKIES">filter{content-cookies}</link>
  
-<!--   ~~~~~       New section      ~~~~~     -->
-<sect2>
-<title>Privoxy's Internal Pages</title>
+ # These aliases define combinations of actions
+ # that are useful for certain types of sites:
+ #
+ fragile     = -<link linkend="BLOCK">block</link> -<link linkend="FILTER">filter</link> -crunch-all-cookies -<link linkend="FAST-REDIRECTS">fast-redirects</link> -<link linkend="HIDE-REFERER">hide-referrer</link> -<link linkend="PREVENT-COMPRESSION">prevent-compression</link>
  
-<para>
- Since <application>Privoxy</application> proxies each requested
- web page, it is easy for <application>Privoxy</application> to
- trap certain special URLs. In this way, we can talk directly to
- <application>Privoxy</application>, and see how it is
- configured, see how our rules are being applied, change these
- rules and other configuration options, and even turn
- <application>Privoxy's</application> filtering off, all with
- a web browser.
+ shop        = -crunch-all-cookies -<link linkend="FILTER-ALL-POPUPS">filter{all-popups}</link>
  
+ # Short names for other aliases, for really lazy people ;-)
+ #
+ c0 = +crunch-all-cookies
+ c1 = -crunch-all-cookies</screen>
  </para>
  
  <para>
- The URLs listed below are the special ones that allow direct access
- to <application>Privoxy</application>. Of course,
- <application>Privoxy</application> must be running to access these. If
- not, you will get a friendly error message. Internet access is not
- necessary either.
+ ...and put them to use. These sections would appear in the lower part of an
+ actions file and define exceptions to the default actions (as specified further
+ up for the <quote>/</quote> pattern):
  </para>
  
  <para>
- <itemizedlist>
-
- <listitem>
-  <para>
-   Privoxy main page:
-  </para>
-  <blockquote>
-   <para>
-     <ulink url="http://config.privoxy.org/">http://config.privoxy.org/</ulink>
-   </para>
-  </blockquote>
-  <para>
-   There is a shortcut: <ulink url="http://p.p/">http://p.p/</ulink> (But it
-   doesn't provide a fall-back to a real page, in case the request is not
-   sent through <application>Privoxy</application>)
-  </para>
- </listitem>
-
- <listitem>
-  <para>
-    Show information about the current configuration, including viewing and
-    editing of actions files:
-  </para>
-   <blockquote>
-   <para>
-    <ulink url="http://config.privoxy.org/show-status">http://config.privoxy.org/show-status</ulink>
-   </para>
-  </blockquote>
- </listitem>
-
- <listitem>
-  <para>
-    Show the source code version numbers:
-  </para>
-  <blockquote>
-   <para>
-    <ulink url="http://config.privoxy.org/show-version">http://config.privoxy.org/show-version</ulink>
-   </para>
-  </blockquote>
- </listitem>
-
- <listitem>
-  <para>
-   Show the browser's request headers:
-  </para>
-  <blockquote>
-   <para>
-    <ulink url="http://config.privoxy.org/show-request">http://config.privoxy.org/show-request</ulink>
-   </para>
-  </blockquote>
- </listitem>
+ <screen>
+ # These sites are either very complex or very keen on
+ # user data and require minimal interference to work:
+ #
+ {fragile}
+ .office.microsoft.com
+ .windowsupdate.microsoft.com
+ # Gmail is really mail.google.com, not gmail.com
+ mail.google.com
  
- <listitem>
-  <para>
-   Show which actions apply to a URL and why:
-  </para>
-   <blockquote>
-   <para>
-    <ulink url="http://config.privoxy.org/show-url-info">http://config.privoxy.org/show-url-info</ulink>
-   </para>
-  </blockquote>
- </listitem>
+ # Shopping sites:
+ # Allow cookies (for setting and retrieving your customer data)
+ #
+ {shop}
+ .quietpc.com
+ .worldpay.com   # for quietpc.com
+ mybank.example.com
  
- <listitem>
-  <para>
-   Toggle Privoxy on or off. This feature can be turned off/on in the main
-   <filename>config</filename> file. When toggled <quote>off</quote>, <quote>Privoxy</quote>
-   continues to run, but only as a pass-through proxy, with no actions taking
-   place:
-  </para>
-   <blockquote>
-   <para>
-    <ulink url="http://config.privoxy.org/toggle">http://config.privoxy.org/toggle</ulink>
-   </para>
-  </blockquote>
-  <para>
-   Short cuts. Turn off, then on:
-  </para>
-   <blockquote>
-   <para>
-     <ulink url="http://config.privoxy.org/toggle?set=disable">http://config.privoxy.org/toggle?set=disable</ulink>
-   </para>
-  </blockquote>
-   <blockquote>
-   <para>
-     <ulink url="http://config.privoxy.org/toggle?set=enable">http://config.privoxy.org/toggle?set=enable</ulink>
-   </para>
-  </blockquote>
- </listitem>
+ # These shops require pop-ups:
+ #
+ {-filter{all-popups} -filter{unsolicited-popups}}
+  .dabs.com
+  .overclockers.co.uk</screen>
+</para>
  
- </itemizedlist>
+<para>
+ Aliases like <quote>shop</quote> and <quote>fragile</quote> are typically used for
+ <quote>problem</quote> sites that require more than one action to be disabled
+ in order to function properly.
+</para>
+</sect2>
+<!--
+hal stop here
+-->
+<!--   ~~~~~       New section      ~~~~~     -->
+<sect2 id="act-examples">
+<title>Actions Files Tutorial</title>
+<para>
+ The above chapters have shown <link linkend="actions-file">which actions files
+ there are and how they are organized</link>, how actions are <link
+ linkend="actions">specified</link> and <link linkend="actions-apply">applied
+ to URLs</link>, how <link linkend="af-patterns">patterns</link> work, and how to
+ define and use <link linkend="aliases">aliases</link>. Now, let's look at an
+ example <filename>match-all.action</filename>, <filename>default.action</filename>
+ and <filename>user.action</filename> file and see how all these pieces come together:
  </para>
  
+<sect3>
+<title>match-all.action</title>
  <para>
- These may be bookmarked for quick reference. See next.
-
+ Remember <emphasis>all actions are disabled when matching starts</emphasis>,
+ so we have to explicitly enable the ones we want.
  </para>
  
-<sect3 id="bookmarklets">
-<title>Bookmarklets</title>
  <para>
- Below are some <quote>bookmarklets</quote> to allow you to easily access a
- <quote>mini</quote> version of some of <application>Privoxy's</application>
- special pages. They are designed for MS Internet Explorer, but should work
- equally well in Netscape, Mozilla, and other browsers which support
- JavaScript. They are designed to run directly from your bookmarks - not by
- clicking the links below (although that should work for testing).
+ While the <filename>match-all.action</filename> file only contains a
+ single section, it is probably the most important one. It has only one
+ pattern, <quote><literal>/</literal></quote>, but this pattern
+ <link linkend="af-patterns">matches all URLs</link>. Therefore, the set of
+ actions used in this <quote>default</quote> section <emphasis>will
+ be applied to all requests as a start</emphasis>. It can be partly or
+ wholly overridden by other actions files like <filename>default.action</filename>
+ and <filename>user.action</filename>, but it will still be largely responsible
+ for your overall browsing experience.
  </para>
+
  <para>
- To save them, right-click the link and choose <quote>Add to Favorites</quote>
- (IE) or <quote>Add Bookmark</quote> (Netscape). You will get a warning that
- the bookmark <quote>may not be safe</quote> - just click OK. Then you can run the
- Bookmarklet directly from your favorites/bookmarks. For even faster access,
- you can put them on the <quote>Links</quote> bar (IE) or the <quote>Personal
- Toolbar</quote> (Netscape), and run them with a single click.
+ Again, at the start of matching, all actions are disabled, so there is
+ no need to disable any actions here. (Remember: a <quote>+</quote>
+ preceding the action name enables the action, a <quote>-</quote> disables!).
+ Also note how this long line has been made more readable by splitting it into
+ multiple lines with line continuation.
  </para>
  
  <para>
- <itemizedlist>
-
-  <listitem>
-   <para>
-    <ulink
-    url="javascript:void(window.open('http://config.privoxy.org/toggle?mini=y&#38;set=enabled','ijbstatus','width=250,height=100,resizable=yes,scrollbars=no,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());">Privoxy - Enable</ulink>
-   </para>
-  </listitem>
+ <screen>
+{ \
+ +<link linkend="CHANGE-X-FORWARDED-FOR">change-x-forwarded-for{block}</link> \
+ +<link linkend="HIDE-FROM-HEADER">hide-from-header{block}</link> \
+ +<link linkend="SET-IMAGE-BLOCKER">set-image-blocker{pattern}</link> \
+}
+/ # Match all URLs
+ </screen>
+</para>
  
-  <listitem>
-   <para>
-    <ulink
-    url="javascript:void(window.open('http://config.privoxy.org/toggle?mini=y&#38;set=disabled','ijbstatus','width=250,height=100,resizable=yes,scrollbars=no,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());">Privoxy - Disable</ulink>
-   </para>
-  </listitem>
+<para>
+ The default behavior is now set.
+</para>
+</sect3>
  
-  <listitem>
-   <para>
-    <ulink
-    url="javascript:void(window.open('http://config.privoxy.org/toggle?mini=y&#38;set=toggle','ijbstatus','width=250,height=100,resizable=yes,scrollbars=no,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());">Privoxy - Toggle Privoxy</ulink> (Toggles between enabled and disabled)
-   </para>
-  </listitem>
+<sect3>
+<title>default.action</title>
  
-  <listitem>
-   <para>
-    <ulink
-    url="javascript:void(window.open('http://config.privoxy.org/toggle?mini=y','ijbstatus','width=250,height=2,resizable=yes,scrollbars=no,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());">Privoxy - View Status</ulink>
-   </para>
-  </listitem>
-<!--
-  <listitem>
-   <para>
-    <ulink url="javascript:w=Math.floor(screen.width/2);h=Math.floor(screen.height*0.9);void(window.open('http://www.privoxy.org/actions/index.php?url='+escape(location.href),'Feedback','screenx='+w+',width='+w+',height='+h+',scrollbars=yes,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());">Privoxy - Submit Actions File Feedback</ulink>
-   </para>
-  </listitem>
- -->
-  <listitem>
-   <para>
-    <ulink url="javascript:void(window.open('http://config.privoxy.org/show-url-info?url='+escape(location.href),'Why').focus());">Privoxy - Why?</ulink>
-   </para>
-  </listitem>
- </itemizedlist>
+<para>
+ If you aren't a developer, there's no need for you to edit the
+ <filename>default.action</filename> file. It is maintained by
+ the &my-app; developers and if you disagree with some of the
+ sections, you should overrule them in your <filename>user.action</filename>.
  </para>
  
  <para>
- Credit: The site which gave us the general idea for these bookmarklets is
- <ulink url="http://www.bookmarklets.com/">www.bookmarklets.com</ulink>. They
- have more information about bookmarklets.
+ Understanding the <filename>default.action</filename> file can
+ help you with your <filename>user.action</filename>, though.
  </para>
  
+<para>
+ The first section in this file is a special section for internal use
+ that prevents older &my-app; versions from reading the file:
+</para>
  
-</sect3>
-
-</sect2>
-
+<para>
+ <screen>
+##########################################################################
+# Settings -- Don't change! For internal Privoxy use ONLY.
+##########################################################################
+{{settings}}
+for-privoxy-version=3.0.11</screen>
+</para>
  
-<!--   ~~~~~       New section      ~~~~~     -->
-<sect2 id="chain">
-<title>Chain of Events</title>
  <para>
- Let's take a quick look at how some of <application>Privoxy's</application>
- core features are triggered, and the ensuing sequence of events when a web
- page is requested by your browser:
+ After that comes the (optional) alias section. We'll use the example
+ section from the above <link linkend="aliases">chapter on aliases</link>,
+ which also explains why and how aliases are used:
  </para>
  
  <para>
- <itemizedlist>
- <listitem>
-  <para>
-   First, your web browser requests a web page. The browser knows to send
-   the request to <application>Privoxy</application>, which will in turn,
-   relay the request to the remote web server after passing the following
-   tests:
-  </para>
- </listitem>
- <listitem>
-  <para>
-   <application>Privoxy</application> traps any request for its own internal CGI
-   pages (e.g. <ulink url="http://p.p/">http://p.p/</ulink>) and sends the CGI page back to the browser.
-  </para>
- </listitem>
- <listitem>
-  <para>
-   Next, <application>Privoxy</application> checks to see if the URL
-   matches any <link
-   linkend="BLOCK"><quote>+block</quote></link> patterns. If
-   so, the URL is then blocked, and the remote web server will not be contacted.
-   <link linkend="HANDLE-AS-IMAGE"><quote>+handle-as-image</quote></link>
-   and
-   <link linkend="HANDLE-AS-EMPTY-DOCUMENT"><quote>+handle-as-empty-document</quote></link>
-   are then checked, and if there is no match, an
-   HTML <quote>BLOCKED</quote> page is sent back to the browser. Otherwise, if
-   it does match, an image is returned for the former, and an empty text
-   document for the latter. The type of image would depend on the setting of
-   <link linkend="SET-IMAGE-BLOCKER"><quote>+set-image-blocker</quote></link>
-   (blank, checkerboard pattern, or an HTTP redirect to an image elsewhere).
-  </para>
- </listitem>
- <listitem>
-  <para>
-   Untrusted URLs are blocked. If URLs are being added to the
-   <filename>trust</filename> file, then that is done.
-  </para>
- </listitem>
- <listitem>
-  <para>
-   If the URL pattern matches the <link
-   linkend="FAST-REDIRECTS"><quote>+fast-redirects</quote></link> action,
-   it is then processed. Unwanted parts of the requested URL are stripped.
-  </para>
- </listitem>
- <listitem>
-  <para>
-   Now the rest of the client browser's request headers are processed. If any
-   of these match any of the relevant actions (e.g. <link
-   linkend="HIDE-USER-AGENT"><quote>+hide-user-agent</quote></link>,
-   etc.), headers are suppressed or forged as determined by these actions and
-   their parameters.
-  </para>
- </listitem>
- <listitem>
-  <para>
-   Now the web server starts sending its response back (i.e. typically a web
-   page).
-  </para>
- </listitem>
- <listitem>
-  <para>
-   First, the server headers are read and processed to determine, among other
-   things, the MIME type (document type) and encoding. The headers are then
-   filtered as determined by the
-   <link linkend="CRUNCH-INCOMING-COOKIES"><quote>+crunch-incoming-cookies</quote></link>,
-   <link linkend="SESSION-COOKIES-ONLY"><quote>+session-cookies-only</quote></link>,
-   and <link linkend="DOWNGRADE-HTTP-VERSION"><quote>+downgrade-http-version</quote></link>
-   actions.
-  </para>
- </listitem>
- <listitem>
-  <para>
-   If any <link linkend="FILTER"><quote>+filter</quote></link> action
-   or <link
-   linkend="DEANIMATE-GIFS"><quote>+deanimate-gifs</quote></link>
-   action applies (and the document type fits the action), the rest of the page is
-   read into memory (up to a configurable limit). Then the filter rules (from
-   <filename>default.filter</filename> and any other filter files) are
-   processed against the buffered content. Filters are applied in the order
-   they are specified in one of the filter files. Animated GIFs, if present,
-   are reduced to either the first or last frame, depending on the action
-   setting. The entire page, which is now filtered, is then sent by
-   <application>Privoxy</application> back to your browser.
-  </para>
-  <para>
-   If neither a <link linkend="FILTER"><quote>+filter</quote></link> action
-   nor <link
-   linkend="DEANIMATE-GIFS"><quote>+deanimate-gifs</quote></link>
-   matches, then <application>Privoxy</application> passes the raw data through
-   to the client browser as it becomes available.
-  </para>
- </listitem>
- <listitem>
-  <para>
-   As the browser receives the (now possibly filtered) page content, it
-   reads and then requests any URLs that may be embedded within the page
-   source, e.g. ad images, stylesheets, JavaScript, other HTML documents (e.g.
-   frames), sounds, etc. For each of these objects, the browser issues a
-   separate request (this is easily viewable in <application>Privoxy's</application>
-   logs). And each such request is in turn processed just as above. Note that a
-   complex web page will have many, many such embedded URLs. If these
-   secondary requests are to a different server, then quite possibly a very
-   differing set of actions is triggered.
-  </para>
- </listitem>
+ <screen>
+##########################################################################
+# Aliases
+##########################################################################
+{{alias}}
  
- </itemizedlist>
+ # These aliases just save typing later:
+ # (Note that some already use other aliases!)
+ #
+ +crunch-all-cookies = +<link linkend="CRUNCH-INCOMING-COOKIES">crunch-incoming-cookies</link> +<link linkend="CRUNCH-OUTGOING-COOKIES">crunch-outgoing-cookies</link>
+ -crunch-all-cookies = -<link linkend="CRUNCH-INCOMING-COOKIES">crunch-incoming-cookies</link> -<link linkend="CRUNCH-OUTGOING-COOKIES">crunch-outgoing-cookies</link>
+ +block-as-image      = +block{Blocked image.} +handle-as-image
+ mercy-for-cookies   = -crunch-all-cookies -<link linkend="SESSION-COOKIES-ONLY">session-cookies-only</link> -<link linkend="FILTER-CONTENT-COOKIES">filter{content-cookies}</link>
+
+ # These aliases define combinations of actions
+ # that are useful for certain types of sites:
+ #
+ fragile     = -<link linkend="BLOCK">block</link> -<link linkend="FILTER">filter</link> -crunch-all-cookies -<link linkend="FAST-REDIRECTS">fast-redirects</link> -<link linkend="HIDE-REFERER">hide-referrer</link>
+ shop        = -crunch-all-cookies -<link linkend="FILTER-ALL-POPUPS">filter{all-popups}</link></screen>
  </para>
+
  <para>
- NOTE: This is somewhat of a simplistic overview of what happens with each URL
- request. For the sake of brevity and simplicity, we have focused on
- <application>Privoxy's</application> core features only.
+ The first of our specialized sections is concerned with <quote>fragile</quote>
+ sites, i.e. sites that require minimum interference, because they are either
+ very complex or very keen on tracking you (and have mechanisms in place that
+ make them unusable for people who avoid being tracked). We will simply use
+ our pre-defined <literal>fragile</literal> alias instead of stating the list
+ of actions explicitly:
  </para>
  
-</sect2>
-
-
-<!--   ~~~~~       New section      ~~~~~     -->
-<sect2 id="actionsanat">
-<title>Troubleshooting: Anatomy of an Action</title>
-
  <para>
- The way <application>Privoxy</application> applies
- <link linkend="ACTIONS">actions</link> and <link linkend="FILTER">filters</link>
- to any given URL can be complex, and not always so
- easy to understand what is happening. And sometimes we need to be able to
- <emphasis>see</emphasis> just what <application>Privoxy</application> is
- doing. Especially, if something <application>Privoxy</application> is doing
- is causing us a problem inadvertently. It can be a little daunting to look at
- the actions and filters files themselves, since they tend to be filled with
- <link linkend="regex">regular expressions</link> whose consequences are not
- always so obvious.
+ <screen>
+##########################################################################
+# Exceptions for sites that'll break under the default action set:
+##########################################################################
+
+# "Fragile" Use a minimum set of actions for these sites (see alias above):
+#
+{ fragile }
+.office.microsoft.com           # surprise, surprise!
+.windowsupdate.microsoft.com
+mail.google.com</screen>
  </para>
  
  <para>
- One quick test to see if <application>Privoxy</application> is causing a problem
- or not, is to disable it temporarily. This should be the first troubleshooting
- step. See <link linkend="bookmarklets">the Bookmarklets</link> section on a quick
- and easy way to do this (be sure to flush caches afterward!). Looking at the
- logs is a good idea too. (Note that both the toggle feature and logging are
- enabled via <filename>config</filename> file settings, and may need to be
- turned <quote>on</quote>.)
+ Shopping sites are not as fragile, but they typically
+ require cookies to log in, and pop-up windows for shopping
+ carts or item details. Again, we'll use a pre-defined alias:
  </para>
+
  <para>
- Another easy troubleshooting step to try is if you have done any
- customization of your installation, revert back to the installed
- defaults and see if that helps. There are times the developers get complaints
- about one thing or another, and the problem is more related to a customized
- configuration issue.
+ <screen>
+# Shopping sites:
+#
+{ shop }
+.quietpc.com
+.worldpay.com   # for quietpc.com
+.jungle.com
+.scan.co.uk</screen>
  </para>
  
  <para>
- <application>Privoxy</application> also provides the
- <ulink url="http://config.privoxy.org/show-url-info">http://config.privoxy.org/show-url-info</ulink>
- page that can show us very specifically how <application>actions</application>
- are being applied to any given URL. This is a big help for troubleshooting.
+ The <literal><link linkend="FAST-REDIRECTS">fast-redirects</link></literal>
+ action, which may have been enabled in <filename>match-all.action</filename>,
+ breaks some sites. So disable it for popular sites where we know it misbehaves:
  </para>
  
  <para>
- First, enter one URL (or partial URL) at the prompt, and then
- <application>Privoxy</application> will tell us
- how the current configuration will handle it. This will not
- help with filtering effects (i.e. the <link
- linkend="FILTER"><quote>+filter</quote></link> action) from
- one of the filter files since this is handled very
- differently and not so easy to trap! It also will not tell you about any other
- URLs that may be embedded within the URL you are testing. For instance, images
- such as ads are expressed as URLs within the raw page source of HTML pages. So
- you will only get info for the actual URL that is pasted into the prompt area
- -- not any sub-URLs. If you want to know about embedded URLs like ads, you
- will have to dig those out of the HTML source. Use your browser's <quote>View
- Page Source</quote> option for this. Or right click on the ad, and grab the
- URL.
+ <screen>
+{ -<link linkend="FAST-REDIRECTS">fast-redirects</link> }
+login.yahoo.com
+edit.*.yahoo.com
+.google.com
+.altavista.com/.*(like|url|link):http
+.altavista.com/trans.*urltext=http
+.nytimes.com</screen>
  </para>
  
  <para>
- Let's try an example, <ulink url="http://google.com">google.com</ulink>,
- and look at it one section at a time in a sample configuration (your real
- configuration may vary):
+ It is important that <application>Privoxy</application> knows which
+ URLs belong to images, so that <emphasis>if</emphasis> they are to
+ be blocked, a substitute image can be sent, rather than an HTML page.
+ Contacting the remote site to find out is not an option, since it
+ would destroy the loading time advantage of banner blocking, and it
+ would feed the advertisers information about you. We can mark any
+ URL as an image with the <literal><link
+ linkend="handle-as-image">handle-as-image</link></literal> action,
+ and marking all URLs that end in a known image file extension is a
+ good start:
  </para>
  
  <para>
   <screen>
- Matches for http://www.google.com:
-
- In file: default.action <guibutton>[ View ]</guibutton> <guibutton>[ Edit ]</guibutton>
-
- {+change-x-forwarded-for{block}
- +deanimate-gifs {last}
- +fast-redirects {check-decoded-url}
- +filter {refresh-tags}
- +filter {img-reorder}
- +filter {banners-by-size}
- +filter {webbugs}
- +filter {jumping-windows}
- +filter {ie-exploits}
- +hide-from-header {block}
- +hide-referrer {forge}
- +session-cookies-only
- +set-image-blocker {pattern}
-/
+##########################################################################
+# Images:
+##########################################################################
  
- { -session-cookies-only }
- .google.com
+# Define which file types will be treated as images, in case they get
+# blocked further down this file:
+#
+{ +<link linkend="HANDLE-AS-IMAGE">handle-as-image</link> }
+/.*\.(gif|jpe?g|png|bmp|ico)$</screen>
+</para>
  
- { -fast-redirects }
- .google.com
+<para>
+ And then there are known banner sources. They often use scripts to
+ generate the banners, so it won't be visible from the URL that the
+ request is for an image. Hence we block them <emphasis>and</emphasis>
+ mark them as images in one go, with the help of our
+ <literal>+block-as-image</literal> alias defined above. (We could of
+ course just as well use <literal>+<link linkend="block">block</link>
+ +<link linkend="handle-as-image">handle-as-image</link></literal> here.)
+ Remember that the type of the replacement image is chosen by the
+ <literal><link linkend="set-image-blocker">set-image-blocker</link></literal>
+ action. Since all URLs have matched the default section with its
+ <literal>+<link linkend="set-image-blocker">set-image-blocker</link>{pattern}</literal>
+ action before, it still applies and needn't be repeated:
+</para>
  
-In file: user.action <guibutton>[ View ]</guibutton> <guibutton>[ Edit ]</guibutton>
-(no matches in this file)
-</screen>
+<para>
+ <screen>
+# Known ad generators:
+#
+{ +block-as-image }
+ar.atwola.com
+.ad.doubleclick.net
+.ad.*.doubleclick.net
+.a.yimg.com/(?:(?!/i/).)*$
+.a[0-9].yimg.com/(?:(?!/i/).)*$
+bs*.gsanet.com
+.qkimg.net</screen>
  </para>
  
  <para>
- This is telling us how we have defined our
- <link linkend="ACTIONS"><quote>actions</quote></link>, and
- which ones match for our test case, <quote>google.com</quote>.
- Displayed is all the actions that are available to us. Remember,
- the <literal>+</literal> sign denotes <quote>on</quote>. <literal>-</literal>
- denotes <quote>off</quote>. So some are <quote>on</quote> here, but many
- are <quote>off</quote>. Each example we try may provide a slightly different
- end result, depending on our configuration directives.
+ One of the most important jobs of <application>Privoxy</application>
+ is to block banners. Many of these can be <quote>blocked</quote>
+ by the <literal><link linkend="filter">filter</link>{banners-by-size}</literal>
+ action, which we enabled above, and which deletes the references to banner
+ images from the pages while they are loaded, so the browser doesn't request
+ them anymore, and hence they don't need to be blocked here. But this naturally
+ doesn't catch all banners, and some people choose not to use filters, so we
+ need a comprehensive list of patterns for banner URLs here, and apply the
+ <literal><link linkend="block">block</link></literal> action to them.
  </para>
  <para>
- The first listing
-  is for our <filename>default.action</filename> file. The large, multi-line
-  listing, is how the actions are set to match for all URLs, i.e. our default
-  settings. If you look at your <quote>actions</quote> file, this would be the
-  section just below the <quote>aliases</quote> section near the top. This
-  will apply to all URLs as signified by the single forward slash at the end
-  of the listing -- <quote> / </quote>.
+ First come many generic patterns, which do most of the work by
+ matching typical domain and path name components of banners. Then comes
+ a list of individual patterns for specific sites, which is omitted here
+ to keep the example short:
  </para>
  
  <para>
- But we have defined additional actions that would be exceptions to these general
- rules, and then we list specific URLs (or patterns) that these exceptions
- would apply to. Last match wins. Just below this then are two explicit
- matches for <quote>.google.com</quote>. The first is negating our previous
- cookie setting, which was for <link
- linkend="SESSION-COOKIES-ONLY"><quote>+session-cookies-only</quote></link>
- (i.e. not persistent). So we will allow persistent cookies for google, at
- least that is how it is in this example. The second turns
- <emphasis>off</emphasis> any <link
- linkend="FAST-REDIRECTS"><quote>+fast-redirects</quote></link>
- action, allowing this to take place unmolested. Note that there is a leading
- dot here -- <quote>.google.com</quote>. This will match any hosts and
- sub-domains, in the google.com domain also, such as
- <quote>www.google.com</quote> or <quote>mail.google.com</quote>. But it would not
- match <quote>www.google.de</quote>! So, apparently, we have these two actions
- defined as exceptions to the general rules at the top somewhere in the lower
- part of our <filename>default.action</filename> file, and
- <quote>google.com</quote> is referenced somewhere in these latter sections.
+ <screen>
+##########################################################################
+# Block these fine banners:
+##########################################################################
+{ <link linkend="BLOCK">+block{Banner ads.}</link> }
+
+# Generic patterns:
+#
+ad*.
+.*ads.
+banner?.
+count*.
+/.*count(er)?\.(pl|cgi|exe|dll|asp|php[34]?)
+/(?:.*/)?(publicite|werbung|rekla(ma|me|am)|annonse|maino(kset|nta|s)?)/
+
+# Site-specific patterns (abbreviated):
+#
+.hitbox.com</screen>
  </para>
  
  <para>
- Then, for our <filename>user.action</filename> file, we again have no hits.
- So there is nothing google-specific that we might have added to our own, local
- configuration. If there was, those actions would over-rule any actions from
- previously processed files, such as <filename>default.action</filename>.
- <filename>user.action</filename> typically has the last word. This is the
- best place to put hard and fast exceptions,
+ It's quite remarkable how many advertisers actually call their banner
+ servers ads.<replaceable>company</replaceable>.com, or call the directory
+ in which the banners are stored simply <quote>banners</quote>. So the above
+ generic patterns are surprisingly effective.
  </para>
-
  <para>
- And finally we pull it all together in the bottom section and summarize how
- <application>Privoxy</application> is applying all its <quote>actions</quote>
- to <quote>google.com</quote>:
-
+ But being very generic, they necessarily also catch URLs that we don't want
+ to block. The pattern <literal>.*ads.</literal>, for example, catches
+ <quote>nasty-<emphasis>ads</emphasis>.nasty-corp.com</quote> as intended,
+ but also <quote>downlo<emphasis>ads</emphasis>.sourceforge.net</quote> or
+ <quote><emphasis>ads</emphasis>l.some-provider.net</quote>. So here come some
+ well-known exceptions to the <literal>+<link linkend="BLOCK">block</link></literal>
+ section above.
+</para>
+<para>
+ Note that these are exceptions to exceptions from the default! Consider the URL
+ <quote>downloads.sourceforge.net</quote>: Initially, all actions are deactivated,
+ so it wouldn't get blocked. Then comes the defaults section, which matches the
+ URL, but just deactivates the <literal><link linkend="BLOCK">block</link></literal>
+ action once again. Then it matches <literal>.*ads.</literal>, an exception to the
+ general non-blocking policy, and suddenly
+ <literal><link linkend="BLOCK">+block</link></literal> applies. And now, it'll match
+ <literal>.*loads.</literal>, where <literal><link linkend="BLOCK">-block</link></literal>
+ applies, so (unless it matches <emphasis>again</emphasis> further down) it ends up
+ with no <literal><link linkend="BLOCK">block</link></literal> action applying.
  </para>
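+
+<para>
+ As a rough sketch (not actual <application>Privoxy</application> output), the
+ sequence described above can be written down section by section, with the
+ state of the <literal>block</literal> action noted after each match. The
+ other actions of the defaults section are left out here:
+</para>
+
+<para>
+ <screen>
+# URL: the "downloads" host from the paragraph above
+#
+# (initial state)                      block is off
+# (defaults section)        /          matches: block stays off
+# { +block{Banner ads.} }   .*ads.     matches: block is on
+# { -block }                .*loads.   matches: block is off again
+#
+# Unless a later section matches as well, the last result stands.</screen>
+</para>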
  
  <para>
   <screen>
+##########################################################################
+# Save some innocent victims of the above generic block patterns:
+##########################################################################
  
- Final results:
+# By domain:
+#
+{ -<link linkend="BLOCK">block</link> }
+adv[io]*.  # (for advogato.org and advice.*)
+adsl.      # (has nothing to do with ads)
+adobe.     # (has nothing to do with ads either)
+ad[ud]*.   # (adult.* and add.*)
+.edu       # (universities don't host banners (yet!))
+.*loads.   # (downloads, uploads etc)
  
- -add-header
- -block
- +change-x-forwarded-for{block}
- -client-header-filter{hide-tor-exit-notation}
- -content-type-overwrite
- -crunch-client-header
- -crunch-if-none-match
- -crunch-incoming-cookies
- -crunch-outgoing-cookies
- -crunch-server-header
- +deanimate-gifs {last}
- -downgrade-http-version
- -fast-redirects
- -filter {js-events}
- -filter {content-cookies}
- -filter {all-popups}
- -filter {banners-by-link}
- -filter {tiny-textforms}
- -filter {frameset-borders}
- -filter {demoronizer}
- -filter {shockwave-flash}
- -filter {quicktime-kioskmode}
- -filter {fun}
- -filter {crude-parental}
- -filter {site-specifics}
- -filter {js-annoyances}
- -filter {html-annoyances}
- +filter {refresh-tags}
- -filter {unsolicited-popups}
- +filter {img-reorder}
- +filter {banners-by-size}
- +filter {webbugs}
- +filter {jumping-windows}
- +filter {ie-exploits}
- -filter {google}
- -filter {yahoo}
- -filter {msn}
- -filter {blogspot}
- -filter {no-ping}
- -force-text-mode
- -handle-as-empty-document
- -handle-as-image
- -hide-accept-language
- -hide-content-disposition
- +hide-from-header {block}
- -hide-if-modified-since
- +hide-referrer {forge}
- -hide-user-agent
- -limit-connect
- -overwrite-last-modified
- -prevent-compression
- -redirect
- -server-header-filter{xml-to-html}
- -server-header-filter{html-to-xml}
- -session-cookies-only
- +set-image-blocker {pattern} </screen>
+# By path:
+#
+/.*loads/
+
+# Site-specific:
+#
+www.globalintersec.com/adv # (adv = advanced)
+www.ugu.com/sui/ugu/adv</screen>
  </para>
  
  <para>
- Notice the only difference here to the previous listing, is to
- <quote>fast-redirects</quote> and <quote>session-cookies-only</quote>,
- which are activated specifically for this site in our configuration,
- and thus show in the <quote>Final Results</quote>.
+ Filtering source code can have nasty side effects,
+ so make an exception for our friends at sourceforge.net,
+ and all paths with <quote>cvs</quote> in them. Note that
+ <literal>-<link linkend="FILTER">filter</link></literal>
+ disables <emphasis>all</emphasis> filters in one fell swoop!
  </para>
  
  <para>
- Now another example, <quote>ad.doubleclick.net</quote>:
+ <screen>
+# Don't filter code!
+#
+{ -<link linkend="FILTER">filter</link> }
+/(.*/)?cvs
+bugzilla.
+developer.
+wiki.
+.sourceforge.net</screen>
  </para>
  
  <para>
- <screen>
+ The actual <filename>default.action</filename> is of course much more
+ comprehensive, but we hope this example made clear how it works.
+</para>
  
- { +block{Domains starts with "ad"} }
-  ad*.
+</sect3>
  
- { +block{Domain contains "ad"} }
-  .ad.
+<sect3><title>user.action</title>
  
- { +block{Doubleclick banner server} +handle-as-image }
-  .[a-vx-z]*.doubleclick.net
-</screen>
+<para>
+ So far we are painting with a broad brush by setting general policies,
+ which would be a reasonable starting point for many people. Now,
+ you might want to be more specific and have customized rules that
+ are more suitable to your personal habits and preferences. These would
+ be for narrowly defined situations like your ISP or your bank, and should
+ be placed in <filename>user.action</filename>, which is parsed after all other
+ actions files and hence has the last word, over-riding any previously
+ defined actions. <filename>user.action</filename> is also a
+ <emphasis>safe</emphasis> place for your personal settings, since
+ <filename>default.action</filename> is actively maintained by the
+ <application>Privoxy</application> developers and you'll probably want
+ to install updated versions from time to time.
  </para>
  
  <para>
- We'll just show the interesting part here - the explicit matches. It is
- matched three different times. Two <quote>+block{}</quote> sections,
- and a <quote>+block{} +handle-as-image</quote>,
- which is the expanded form of one of our aliases that had been defined as:
- <quote>+block-as-image</quote>. (<link
- linkend="ALIASES"><quote>Aliases</quote></link> are defined in
- the first section of the actions file and typically used to combine more
- than one action.)
+ So let's look at a few examples of things that one might typically do in
+ <filename>user.action</filename>:
  </para>
  
+
+<!-- brief sample user.action here -->
+
  <para>
- Any one of these would have done the trick and blocked this as an unwanted
- image. This is unnecessarily redundant since the last case effectively
- would also cover the first. No point in taking chances with these guys
- though ;-) Note that if you want an ad or obnoxious
- URL to be invisible, it should be defined as <quote>ad.doubleclick.net</quote>
- is done here -- as both a <link
- linkend="BLOCK"><quote>+block{}</quote></link>
- <emphasis>and</emphasis> an
- <link linkend="HANDLE-AS-IMAGE"><quote>+handle-as-image</quote></link>.
- The custom alias <quote><literal>+block-as-image</literal></quote> just
- simplifies the process and make it more readable.
+ <screen>
+# My user.action file. &lt;fred@example.com&gt;</screen>
  </para>
  
  <para>
- One last example. Let's try <quote>http://www.example.net/adsl/HOWTO/</quote>.
- This one is giving us problems. We are getting a blank page. Hmmm ...
+ As <link linkend="aliases">aliases</link> are local to the actions
+ file that they are defined in, you can't use the ones from
+ <filename>default.action</filename>, unless you repeat them here:
  </para>
  
  <para>
   <screen>
+# Aliases are local to the file they are defined in.
+# (Re-)define aliases for this file:
+#
+{{alias}}
+#
+# These aliases just save typing later, and the alias names should
+# be self explanatory.
+#
++crunch-all-cookies = +crunch-incoming-cookies +crunch-outgoing-cookies
+-crunch-all-cookies = -crunch-incoming-cookies -crunch-outgoing-cookies
+ allow-all-cookies  = -crunch-all-cookies -session-cookies-only
+ allow-popups       = -filter{all-popups}
++block-as-image     = +block{Blocked as image.} +handle-as-image
+-block-as-image     = -block
  
- Matches for http://www.example.net/adsl/HOWTO/:
+# These aliases define combinations of actions that are useful for
+# certain types of sites:
+#
+fragile     = -block -crunch-all-cookies -filter -fast-redirects -hide-referrer
+shop        = -crunch-all-cookies allow-popups
  
- In file: default.action <guibutton>[ View ]</guibutton> <guibutton>[ Edit ]</guibutton>
+# Allow ads for selected useful free sites:
+#
+allow-ads   = -block -filter{banners-by-size} -filter{banners-by-link}
  
- {-add-header
-  -block
-  +change-x-forwarded-for{block}
-  -client-header-filter{hide-tor-exit-notation}
-  -content-type-overwrite
-  -crunch-client-header
-  -crunch-if-none-match
-  -crunch-incoming-cookies
-  -crunch-outgoing-cookies
-  -crunch-server-header
-  +deanimate-gifs
-  -downgrade-http-version
-  +fast-redirects {check-decoded-url}
-  -filter {js-events}
-  -filter {content-cookies}
-  -filter {all-popups}
-  -filter {banners-by-link}
-  -filter {tiny-textforms}
-  -filter {frameset-borders}
-  -filter {demoronizer}
-  -filter {shockwave-flash}
-  -filter {quicktime-kioskmode}
-  -filter {fun}
-  -filter {crude-parental}
-  -filter {site-specifics}
-  -filter {js-annoyances}
-  -filter {html-annoyances}
-  +filter {refresh-tags}
-  -filter {unsolicited-popups}
-  +filter {img-reorder}
-  +filter {banners-by-size}
-  +filter {webbugs}
-  +filter {jumping-windows}
-  +filter {ie-exploits}
-  -filter {google}
-  -filter {yahoo}
-  -filter {msn}
-  -filter {blogspot}
-  -filter {no-ping}
-  -force-text-mode
-  -handle-as-empty-document
-  -handle-as-image
-  -hide-accept-language
-  -hide-content-disposition
-  +hide-from-header{block}
-  +hide-referer{forge}
-  -hide-user-agent
-  -overwrite-last-modified
-  +prevent-compression
-  -redirect
-  -server-header-filter{xml-to-html}
-  -server-header-filter{html-to-xml}
-  +session-cookies-only
-  +set-image-blocker{blank} }
-   /
+# Alias for specific file types that are text, but might have conflicting
+# MIME types. We want the browser to force these to be text documents.
+handle-as-text = -<link linkend="FILTER">filter</link> +<link linkend="content-type-overwrite">content-type-overwrite{text/plain}</link> +<link linkend="FORCE-TEXT-MODE">force-text-mode</link> -<link linkend="HIDE-CONTENT-DISPOSITION">hide-content-disposition</link></screen>
  
- { +block{Path contains "ads".} +handle-as-image }
-  /ads
-</screen>
  </para>
  
  <para>
- Ooops, the <quote>/adsl/</quote> is matching <quote>/ads</quote> in our
- configuration! But we did not want this at all! Now we see why we get the
- blank page. It is actually triggering two different actions here, and
- the effects are aggregated so that the URL is blocked, and &my-app; is told
- to treat the block as if it were an image. But this is, of course, all wrong.
-  We could now add a new action below this (or better in our own
-  <filename>user.action</filename> file) that explicitly
-  <emphasis>un</emphasis> blocks (
-  <link linkend="BLOCK"><quote>{-block}</quote></link>) paths with
-  <quote>adsl</quote> in them (remember, last match in the configuration
-  wins). There are various ways to handle such exceptions. Example:
+ Say you have accounts on some sites that you visit regularly, and
+ you don't want to have to log in manually each time. So you'd like
+ to allow persistent cookies for these sites. The
+ <literal>allow-all-cookies</literal> alias defined above does exactly
+ that, i.e. it disables the crunching of cookies in either direction, as well
+ as the rewriting of cookies that makes them only temporary.
  </para>
  
  <para>
   <screen>
+{ allow-all-cookies }
+ sourceforge.net
+ .yahoo.com
+ .msdn.microsoft.com
+ .redhat.com</screen>
+</para>
  
- { -block }
-  /adsl
-</screen>
+<para>
+ Your bank is allergic to some filter, but you don't know which, so you disable them all:
  </para>
  
  <para>
- Now the page displays ;-)
- Remember to flush your browser's caches when making these kinds of changes to
- your configuration to insure that you get a freshly delivered page! Or, try
- using <literal>Shift+Reload</literal>.
+ <screen>
+{ -<link linkend="FILTER">filter</link> }
+ .your-home-banking-site.com</screen>
  </para>
  
  <para>
- But now what about a situation where we get no explicit matches like
- we did with:
+ Some file types you may not want to filter for various reasons:
  </para>
  
  <para>
   <screen>
+# Technical documentation is likely to contain strings that might
+# erroneously get altered by the JavaScript-oriented filters:
+#
+.tldp.org
+/(.*/)?selfhtml/
  
- { +block{Path starts with "ads".} +handle-as-image }
- /ads
-</screen>
+# And this stupid host sends streaming video with a wrong MIME type,
+# so that Privoxy thinks it is getting HTML and starts filtering:
+#
+stupid-server.example.com/</screen>
  </para>
  
  <para>
- That actually was very helpful and pointed us quickly to where the problem
- was. If you don't get this kind of match, then it means one of the default
- rules in the first section of <filename>default.action</filename> is causing
- the problem. This would require some guesswork, and maybe a little trial and
- error to isolate the offending rule. One likely cause would be one of the
- <link linkend="FILTER"><quote>+filter</quote></link> actions.
- These tend to be harder to troubleshoot.
- Try adding the URL for the site to one of aliases that turn off
- <link linkend="FILTER"><quote>+filter</quote></link>:
+ Example of a simple <link linkend="BLOCK">block</link> action. Say you've
+ seen an ad on your favourite page on example.com that you want to get rid of.
+ You have right-clicked the image, selected <quote>copy image location</quote>
+ and pasted the URL below while removing the leading http://, into a
+ <literal>{ +block{} }</literal> section. Note that <literal>{ +handle-as-image
+ }</literal> need not be specified, since all URLs ending in
+ <literal>.gif</literal> will be tagged as images by the general rules as set
+ in default.action anyway:
  </para>
  
  <para>
   <screen>
-
- { shop }
- .quietpc.com
- .worldpay.com   # for quietpc.com
- .jungle.com
- .scan.co.uk
- .forbes.com
-</screen>
+{ +<link linkend="BLOCK">block</link>{Nasty ads.} }
+ www.example.com/nasty-ads/sponsor\.gif
+ another.example.net/more/junk/here/</screen>
  </para>
  
  <para>
- <quote><literal>{ shop }</literal></quote> is an <quote>alias</quote> that expands to
- <quote><literal>{ -filter -session-cookies-only }</literal></quote>.
- Or you could do your own exception to negate filtering:
-
+ The URLs of dynamically generated banners, especially from large banner
+ farms, often don't use the well-known image file name extensions, which
+ makes it impossible for <application>Privoxy</application> to guess
+ the file type just by looking at the URL.
+ You can use the <literal>+block-as-image</literal> alias defined above for
+ these cases.
+ Note that objects which match this rule but then turn out NOT to be an
+ image are typically rendered as a <quote>broken image</quote> icon by the
+ browser. Use cautiously.
  </para>
  
  <para>
   <screen>
-
- { -filter }
- # Disable ALL filter actions for sites in this section
- .forbes.com
- developer.ibm.com
- localhost
-</screen>
+{ +block-as-image }
+ .doubleclick.net
+ .fastclick.net
+ /Realmedia/ads/
+ ar.atwola.com/</screen>
  </para>
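+
+<para>
+ Should one of these patterns catch something that is not an image, a later
+ section can undo the block for that particular URL, since the last match
+ wins. A minimal sketch, using the <literal>-block-as-image</literal> alias
+ defined above (which expands to <literal>-block</literal>); the host and
+ path are made up for illustration:
+</para>
+
+<para>
+ <screen>
+# Must come after the { +block-as-image } section to take effect.
+# Hypothetical URL, for illustration only:
+#
+{ -block-as-image }
+ www.example.org/Realmedia/ads/about\.html</screen>
+</para>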
  
  <para>
- This would turn off all filtering for these sites. This is best
- put in <filename>user.action</filename>, for local site
- exceptions. Note that when a simple domain pattern is used by itself (without
- the subsequent path portion), all sub-pages within that domain are included
- automatically in the scope of the action.
+ Now you noticed that the default configuration breaks Forbes Magazine,
+ but you were too lazy to find out which action is the culprit, and you
+ were again too lazy to give <link linkend="contact">feedback</link>, so
+ you just used the <literal>fragile</literal> alias on the site, and
+ -- <emphasis>whoa!</emphasis> -- it worked. The <literal>fragile</literal>
+ alias disables those actions that are most likely to break a site. It is
+ also good for testing purposes, to see if it is <application>Privoxy</application>
+ that is causing the problem or not. We later find other regular sites
+ that misbehave, and add those to our personalized list of troublemakers:
  </para>
  
  <para>
- Images that are inexplicably being blocked, may well be hitting the
-<link linkend="FILTER-BANNERS-BY-SIZE"><quote>+filter{banners-by-size}</quote></link>
- rule, which assumes
- that images of certain sizes are ad banners (works well
- <emphasis>most of the time</emphasis>  since these tend to be standardized).
+<screen>
+{ fragile }
+ .forbes.com
+ webmail.example.com
+ .mybank.com</screen>
  </para>
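+
+<para>
+ If you would rather narrow the problem down than disable everything at once,
+ one possible approach (a sketch, not a tested recipe) is to switch off a
+ single component of the <literal>fragile</literal> alias at a time and
+ reload the page after each change:
+</para>
+
+<para>
+<screen>
+# First attempt: disable only the filters for the problem site.
+#
+{ -filter }
+ .forbes.com
+
+# If that doesn't help, replace the section above with another
+# component of the alias, for example:
+#
+# { -fast-redirects }
+#  .forbes.com</screen>
+</para>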
  
  <para>
- <quote><literal>{ fragile }</literal></quote> is an alias that disables most
- actions that are the most likely to cause trouble. This can be used as a
- last resort for problem sites.
+ You like the <quote>fun</quote> text replacements in <filename>default.filter</filename>,
+ but they are disabled in the distributed actions file.
+ So you'd like to turn it on in your private,
+ update-safe config, once and for all:
  </para>
-<para>
- <screen>
  
- { fragile }
- # Handle with care: easy to break
- mail.google.
- mybank.example.com</screen>
+<para>
+<screen>
+{ +<link linkend="filter-fun">filter{fun}</link> }
+ / # For ALL sites!</screen>
  </para>
  
-
  <para>
- <emphasis>Remember to flush caches!</emphasis> Note that the
- <literal>mail.google</literal> reference lacks the TLD portion (e.g.
- <quote>.com</quote>). This will effectively match any TLD with
- <literal>google</literal> in it, such as <literal>mail.google.de.</literal>,
- just as an example.
+ Note that the above is not really a good idea: There are exceptions
+ to the filters in <filename>default.action</filename> for things that
+ really shouldn't be filtered, like code on CVS->Web interfaces. Since
+ <filename>user.action</filename> has the last word, these exceptions
+ won't be valid for the <quote>fun</quote> filtering specified here.
  </para>
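+
+<para>
+ One possible way around this, sketched here under the assumption that you
+ want the same exceptions as in the <filename>default.action</filename>
+ example earlier, is to repeat them in <filename>user.action</filename>
+ below the section that enables the filter, so that they match last:
+</para>
+
+<para>
+<screen>
+# Placed after the { +filter{fun} } section, so these have the last word.
+# The patterns are copied from the "Don't filter code!" example:
+#
+{ -<link linkend="FILTER">filter</link> }
+ /(.*/)?cvs
+ .sourceforge.net</screen>
+</para>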
+
  <para>
- If this still does not work, you will have to go through the remaining
- actions one by one to find which one(s) is causing the problem.
+ You might also worry about how your favourite free websites are
+ funded, and find that they rely on displaying banner advertisements
+ to survive. So you might want to specifically allow banners for those
+ sites that you feel provide value to you:
  </para>
  
-</sect2>
-
-</sect1>
-
- <!--
-
- This program is free software; you can redistribute it
- and/or modify it under the terms of the GNU General
- Public License as published by the Free Software
- Foundation; either version 2 of the License, or (at
- your option) any later version.
-
- This program is distributed in the hope that it will
- be useful, but WITHOUT ANY WARRANTY; without even the
- implied warranty of MERCHANTABILITY or FITNESS FOR A
- PARTICULAR PURPOSE.  See the GNU General Public
- License for more details.
-
- The GNU General Public License should be included with
- this file.  If not, you can view it at
- http://www.gnu.org/copyleft/gpl.html
- or write to the Free Software Foundation, Inc.,
- 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301,
- USA
-
- $Log: user-manual.sgml,v $
- Revision 2.134  2011/08/18 11:45:02  fabiankeil
- Don't use unspecified MSN sites as examples for User-Agent-based descrimination
-
- Without knowing the URLs, nobody can easily verify it and it could
- be mistaken as FUD. I also assume that it's no longer an issue anyway.
-
- Revision 2.133  2011/08/18 11:42:50  fabiankeil
- Bump some more documentation copyright ranges.
-
- Revision 2.132  2011/08/17 10:40:07  fabiankeil
- Update the entities.
-
- This commit is chronological out of order.
-
- Revision 2.131  2011/04/19 13:14:10  fabiankeil
- Fix spelling errors in the documentation. Found with codespell.
-
- Revision 2.130  2010/12/01 19:28:28  fabiankeil
- Hopefully unbreak the dok target when using some kind of jade.
-
- Reported by Lee.
-
- Revision 2.129  2010/11/13 20:17:11  fabiankeil
- Merge ChangeLog updates
-
- Revision 2.128  2010/11/10 22:00:13  fabiankeil
- Update the first paragraph of the 'What's New' section.
-
- Revision 2.127  2010/11/10 21:48:54  fabiankeil
- Update the "What's New" section.
-
- Revision 2.126  2010/11/06 12:55:48  fabiankeil
- Set p-version to 3.0.17
-
- Revision 2.125  2010/09/03 17:39:37  fabiankeil
- Slightly improve the explanation of why filtering may appear slower than it is.
-
- Revision 2.124  2010/05/01 18:21:30  fabiankeil
- Explicitly mention how to match any URL.
-
- Revision 2.123  2010/02/19 16:00:38  fabiankeil
- Even more fixes.
-
- Revision 2.122  2010/02/19 15:22:47  fabiankeil
- Add missing word.
-
- Revision 2.121  2010/02/15 15:30:13  fabiankeil
- Mention the use of the no-such-domain template for DNS problems with FEATURE_IPV6_SUPPORT enabled.
-
- Revision 2.120  2010/02/13 17:38:39  fabiankeil
- Update entities for 3.0.16 stable.
-
- Revision 2.119  2010/02/13 16:37:37  fabiankeil
- Update 'What's new?' section.
-
- Revision 2.118  2010/02/11 13:59:48  fabiankeil
- Mention that the headers added by the add-header action aren't modified by other actions.
-
- Revision 2.117  2010/01/11 12:56:04  fabiankeil
- Bump copyright range as p-config.sgml's copyright line is only used in the config file.
-
- Revision 2.116  2009/11/15 14:24:12  fabiankeil
- Prepare to generate docs for 3.0.16 UNRELEASED.
-
- Revision 2.115  2009/10/10 06:19:34  fabiankeil
- Ditch a duplicated 'since'.
-
- Revision 2.114  2009/10/10 05:51:48  fabiankeil
- Update "What's new" section.
-
- Revision 2.113  2009/10/10 05:48:55  fabiankeil
- Prepare for 3.0.15 beta.
-
- Revision 2.112  2009/07/24 12:20:30  fabiankeil
- Remove duplicated period.
-
- Revision 2.111  2009/07/18 18:11:11  fabiankeil
- Don't claim that NTLM should work when there are multiple reports that it doesn't.
-
- Revision 2.110  2009/07/18 16:25:17  fabiankeil
- Fix trailing whitespace.
-
- Revision 2.109  2009/07/18 16:24:39  fabiankeil
- Bump entities for 3.0.14 beta.
-
- Revision 2.108  2009/07/18 15:49:23  fabiankeil
- Add most of the changes in 3.0.14 to the "What's New" section.
-
- Revision 2.107  2009/06/12 14:30:58  fabiankeil
- Update entities for 3.0.13 beta.
-
- Revision 2.106  2009/06/12 11:04:13  fabiankeil
- Import ChangeLog for 3.0.13 beta.
-
- Revision 2.105  2009/04/17 11:32:57  fabiankeil
- Grammar and spelling fixes.
-
- Revision 2.104  2009/04/17 11:27:49  fabiankeil
- Petr Pisar's privoxy-3.0.12-ipv6-3.diff.
-
- Revision 2.103  2009/03/21 10:49:05  fabiankeil
- Merge updated ChangeLog.
-
- Revision 2.102  2009/03/15 19:31:36  fabiankeil
- Update "What's New in this Release" section.
-
- Revision 2.101  2009/02/25 19:01:56  fabiankeil
- Fix typo.
-
- Revision 2.100  2009/02/19 17:14:11  fabiankeil
- - Copy the release cycle description from announce.txt into
-   the "What's New" section.
- - Stop referring to the ChangeLog for a "complete list of changes".
-   The "What's New" section already contains the complete list.
-
- Revision 2.99  2009/02/19 02:20:22  hal9
- Make some links in seealso conditional. Man page is now privoxy only links.
-
- Revision 2.98  2009/02/16 17:10:33  fabiankeil
- Fix entry about shortened log messages. Noticed by Lee.
+<para>
+<screen>
+{ allow-ads }
+ .sourceforge.net
+ .slashdot.org
+ .osdn.net</screen>
+</para>
  
- Revision 2.97  2009/02/14 18:01:00  fabiankeil
- Import ChangeLog.
+<para>
+ Note that <literal>allow-ads</literal> has been aliased to
+ <literal>-<link linkend="block">block</link></literal>,
+ <literal>-<link linkend="filter-banners-by-size">filter{banners-by-size}</link></literal>, and
+ <literal>-<link linkend="filter-banners-by-link">filter{banners-by-link}</link></literal> above.
+</para>
  
- Revision 2.96  2009/02/14 13:14:03  fabiankeil
- Unbreak syntax.
+<para>
+ Invoke another alias here to force an override of the MIME type
+ <literal>application/x-sh</literal>, which typically would open a download
+ dialog. In my case, I want to look at the shell script, and then I can save
+ it should I choose to.
+</para>
  
- Revision 2.95  2009/02/14 12:51:26  fabiankeil
- Mention match-all.action in the "Actions Files Tutorial" section.
+<para>
+<screen>
+{ handle-as-text }
+ /.*\.sh$</screen>
+</para>
  
- Revision 2.94  2009/02/14 11:50:31  fabiankeil
- Some indentation fixes.
+<para>
+ <filename>user.action</filename> is generally the best place to define
+ exceptions and additions to the default policies of
+ <filename>default.action</filename>. Some actions are safe to have their
+ default policies set here though. So let's set a default policy to have a
+ <quote>blank</quote> image as opposed to the checkerboard pattern for
+ <emphasis>ALL</emphasis> sites. <quote>/</quote> of course matches all URL
+ paths and patterns:
+</para>
  
- Revision 2.93  2009/02/14 10:14:42  fabiankeil
- Mention match-all.action in the action file descriptions.
+<para>
+<screen>
+{ +<link linkend="set-image-blocker">set-image-blocker{blank}</link> }
+/ # ALL sites</screen>
+</para>
  
- Revision 2.92  2009/02/12 16:08:26  fabiankeil
- Declare the code stable.
+</sect3>
+</sect2>
  
- Revision 2.91  2009/01/13 16:50:35  fabiankeil
- The standard.action file is gone.
+<!--  ~  End section  ~  -->
  
- Revision 2.90  2008/09/26 16:53:09  fabiankeil
- Update "What's new" section.
+</sect1>
  
- Revision 2.89  2008/09/21 15:38:56  fabiankeil
- Fix Portage tree sync instructions in Gentoo section.
- Anonymously reported at ijbswa-developers@.
+<!--  ~  End section  ~  -->
  
- Revision 2.88  2008/09/21 14:42:52  fabiankeil
- Add documentation for change-x-forwarded-for{},
- remove documentation for hide-forwarded-for-headers.
+<!--   ~~~~~~~~       New section Header    ~~~~~~~~~     -->
  
- Revision 2.87  2008/08/30 15:37:35  fabiankeil
- Update entities.
+<sect1 id="filter-file">
+<title>Filter Files</title>
  
- Revision 2.86  2008/08/16 10:12:23  fabiankeil
- Merge two sentences and move the URL to the end of the item.
+<para>
+ On-the-fly text substitutions need
+ to be defined in a <quote>filter file</quote>. Once defined, they
+ can then be invoked as an <quote>action</quote>.
+</para>
  
- Revision 2.85  2008/08/16 10:04:59  fabiankeil
- Some more syntax fixes. This version actually builds.
+<para>
+ &my-app; supports three different filter actions:
+ <literal><link linkend="filter">filter</link></literal> to
+ rewrite the content that is sent to the client,
+ <literal><link linkend="client-header-filter">client-header-filter</link></literal>
+ to rewrite headers that are sent by the client, and
+ <literal><link linkend="server-header-filter">server-header-filter</link></literal>
+ to rewrite headers that are sent by the server.
+</para>
  
- Revision 2.84  2008/08/16 09:42:45  fabiankeil
- Turns out building docs works better if the syntax is valid.
+<para>
+ &my-app; also supports two tagger actions:
+ <literal><link linkend="client-header-tagger">client-header-tagger</link></literal>
+ and
+ <literal><link linkend="server-header-tagger">server-header-tagger</link></literal>.
+ Taggers and filters use the same syntax in the filter files; the difference
+ is that taggers don't modify the text they are filtering, but use a rewritten
+ version of the filtered text as a tag. The tags can then be used to change the
+ applying actions through sections with <link linkend="tag-pattern">tag-patterns</link>.
+</para>
  
- Revision 2.83  2008/08/16 09:32:02  fabiankeil
- Mention changes since 3.0.9 beta.
  
- Revision 2.82  2008/08/16 09:00:52  fabiankeil
- Fix example URL pattern (once more with feeling).
+<para>
+ Multiple filter files can be defined through the <literal> <link
+ linkend="filterfile">filterfile</link></literal> config directive. The filters
+ as supplied by the developers are located in
+ <filename>default.filter</filename>. It is recommended that any locally
+ defined or modified filters go in a separately defined file such as
+ <filename>user.filter</filename>.
+ </para>
  
- Revision 2.81  2008/08/16 08:51:28  fabiankeil
- Update version-related entities.
+<para>
+ Common tasks for content filters are to eliminate common annoyances in
+ HTML and JavaScript, such as pop-up windows,
+ exit consoles, crippled windows without navigation tools, the
+ infamous &lt;BLINK&gt; tag etc, to suppress images with certain
+ width and height attributes (standard banner sizes or web-bugs),
+ or just to have fun.
+</para>
  
- Revision 2.80  2008/07/18 16:54:30  fabiankeil
- Remove erroneous whitespace in documentation link.
- Reported by John Chronister in #2021611.
+<para>
+ Enabled content filters are applied to any content whose
+ <quote>Content-Type</quote> header is recognised as a sign
+ of text-based content, with the exception of <literal>text/plain</literal>.
+ Use the <link linkend="FORCE-TEXT-MODE">force-text-mode</link> action
+ to also filter other content.
+</para>
  
- Revision 2.79  2008/06/27 18:00:53  markm68k
- remove outdated startup information for mac os x
+<para>
+ Substitutions are made at the source level, so if you want to <quote>roll
+ your own</quote> filters, you should first be familiar with HTML syntax,
+ and, of course, regular expressions.
+</para>
  
- Revision 2.78  2008/06/21 17:03:03  fabiankeil
- Fix typo.
+<para>
+ Just like the <link linkend="actions-file">actions files</link>, the
+ filter file is organized in sections, which are called <emphasis>filters</emphasis>
+ here. Each filter consists of a heading line, that starts with one of the
+ <emphasis>keywords</emphasis> <literal>FILTER:</literal>,
+ <literal>CLIENT-HEADER-FILTER:</literal> or <literal>SERVER-HEADER-FILTER:</literal>
+ followed by the filter's <emphasis>name</emphasis>, and a short (one line)
+ <emphasis>description</emphasis> of what it does. Below that line
+ come the <emphasis>jobs</emphasis>, i.e. lines that define the actual
+ text substitutions. By convention, the name of a filter
+ should describe what the filter <emphasis>eliminates</emphasis>. The
+ description is used in the <ulink url="http://config.privoxy.org/">web-based
+ user interface</ulink>.
+</para>
  
- Revision 2.77  2008/06/14 13:45:22  fabiankeil
- Re-add a colon I unintentionally removed a few revisions ago.
+<para>
+ Once a filter called <replaceable>name</replaceable> has been defined
+ in the filter file, it can be invoked by using an action of the form
+ +<literal><link linkend="filter">filter</link>{<replaceable>name</replaceable>}</literal>
+ in any <link linkend="actions-file">actions file</link>.
+</para>
  
- Revision 2.76  2008/06/14 13:21:28  fabiankeil
- Prepare for the upcoming 3.0.9 beta release.
+<para>
+ Filter definitions start with a header line that contains the filter
+ type, the filter name and the filter description.
+ A content filter header line for a filter called <quote>foo</quote> could look
+ like this:
+</para>
  
- Revision 2.75  2008/06/13 16:06:48  fabiankeil
- Update the "What's New in this Release" section with
- the ChangeLog entries changelog2doc.pl could handle.
+<para>
+ <screen>FILTER: foo Replace all "foo" with "bar"</screen>
+</para>
  
- Revision 2.74  2008/05/26 15:55:46  fabiankeil
- - Update "default profiles" table.
- - Add some more pcrs redirect examples and note that
-   enabling debug 128 helps to get redirects working.
+<para>
+ Below that line, and up to the next header line, come the jobs that
+ define what text replacements the filter executes. They are specified
+ in a syntax that imitates <ulink url="http://www.perl.org/">Perl</ulink>'s
+ <literal>s///</literal> operator. If you are familiar with Perl, you
+ will find this to be quite intuitive, and may want to look at the
+ PCRS documentation for the subtle differences to Perl behaviour. Most
+ notably, the non-standard option letter <literal>U</literal> is supported,
+ which turns the default to ungreedy matching.
+</para>
  
- Revision 2.73  2008/05/23 14:43:18  fabiankeil
- Remove previously out-commented block that caused syntax problems.
+<para>
+ If you are new to
+  <ulink url="http://en.wikipedia.org/wiki/Regular_expressions"><quote>Regular
+  Expressions</quote></ulink>, you might want to take a look at
+ the <link linkend="regex">Appendix on regular expressions</link>, and
+ see the <ulink url="http://perldoc.perl.org/perlre.html">Perl
+ manual</ulink> for
+ <ulink url="http://perldoc.perl.org/perlop.html">the
+ <literal>s///</literal> operator's syntax</ulink> and <ulink
+ url="http://perldoc.perl.org/perlre.html">Perl-style regular
+ expressions</ulink> in general.
+ The below examples might also help to get you started.
+</para>
  
- Revision 2.72  2008/05/12 10:26:14  fabiankeil
- Synchronize content filter descriptions with the ones in default.filter.
  
- Revision 2.71  2008/04/10 17:37:16  fabiankeil
- Actually we use "modern" POSIX 1003.2 regular
- expressions in path patterns, not PCRE.
+<!--   ~~~~~~~~       New section Header    ~~~~~~~~~     -->
  
- Revision 2.70  2008/04/10 15:59:12  fabiankeil
- Add another section to the client-header-tagger example that shows
- how to actually change the action settings once the tag is created.
+<sect2><title>Filter File Tutorial</title>
+<para>
+ Now, let's complete our <quote>foo</quote> content filter. We have already defined
+ the heading, but the jobs are still missing. Since all it does is to replace
+ <quote>foo</quote> with <quote>bar</quote>, there is only one (trivial) job
+ needed:
+</para>
  
- Revision 2.69  2008/03/29 12:14:25  fabiankeil
- Remove send-wafer and send-vanilla-wafer actions.
+<para>
+ <screen>s/foo/bar/</screen>
+</para>
  
- Revision 2.68  2008/03/28 15:13:43  fabiankeil
- Remove inspect-jpegs action.
+<para>
+ But wait! Didn't the comment say that <emphasis>all</emphasis> occurrences
+ of <quote>foo</quote> should be replaced? Our current job will only take
+ care of the first <quote>foo</quote> on each page. For global substitution,
+ we'll need to add the <literal>g</literal> option:
+</para>
  
- Revision 2.67  2008/03/27 18:31:21  fabiankeil
- Remove kill-popups action.
+<para>
+ <screen>s/foo/bar/g</screen>
+</para>
  
- Revision 2.66  2008/03/06 16:33:47  fabiankeil
- If limit-connect isn't used, don't limit CONNECT requests to port 443.
+<para>
+ Our complete filter now looks like this:
+</para>
+<para>
+ <screen>FILTER: foo Replace all "foo" with "bar"
+s/foo/bar/g</screen>
+</para>
  
- Revision 2.65  2008/03/04 18:30:40  fabiankeil
- Remove the treat-forbidden-connects-like-blocks action. We now
- use the "blocked" page for forbidden CONNECT requests by default.
+<para>
+ Let's look at some real filters for more interesting examples. Here you see
+ a filter that protects against some common annoyances that arise from JavaScript
+ abuse. Let's look at its jobs one after the other:
+</para>
  
- Revision 2.64  2008/03/01 14:10:28  fabiankeil
- Use new block syntax. Still needs some polishing.
  
- Revision 2.63  2008/02/22 05:50:37  markm68k
- fix merge problem
+<para>
+ <screen>
+FILTER: js-annoyances Get rid of particularly annoying JavaScript abuse
  
- Revision 2.62  2008/02/11 11:52:23  hal9
- Fix entity ... s/&/&amp;
+# Get rid of JavaScript referrer tracking. Test page: http://www.randomoddness.com/untitled.htm
+#
+s|(&lt;script.*)document\.referrer(.*&lt;/script&gt;)|$1"Not Your Business!"$2|Usg</screen>
+</para>
  
- Revision 2.61  2008/02/11 03:41:47  markm68k
- more updates for mac os x
+<para>
+ Following the header line and a comment, you see the job. Note that it uses
+ <literal>|</literal> as the delimiter instead of <literal>/</literal>, because
+ the pattern contains a forward slash, which would otherwise have to be escaped
+ by a backslash (<literal>\</literal>).
+</para>
  
- Revision 2.60  2008/02/11 03:40:25  markm68k
- more updates for mac os x
+<para>
+ Now, let's examine the pattern: it starts with the text <literal>&lt;script.*</literal>
+ enclosed in parentheses. Since the dot matches any character, and <literal>*</literal>
+ means: <quote>Match an arbitrary number of the element left of myself</quote>, this
+ matches <quote>&lt;script</quote>, followed by <emphasis>any</emphasis> text, i.e.
+ it matches the whole page, from the start of the first &lt;script&gt; tag.
+</para>
  
- Revision 2.59  2008/02/11 00:52:34  markm68k
- reflect new changes for mac os x
+<para>
+ That's more than we want, but the pattern continues: <literal>document\.referrer</literal>
+ matches only the exact string <quote>document.referrer</quote>. The dot needed to
+ be <emphasis>escaped</emphasis>, i.e. preceded by a backslash, to take away its
+ special meaning as a joker, and make it just a regular dot. So far, the meaning is:
+ Match from the start of the first &lt;script&gt; tag in the page, up to, and including,
+ the text <quote>document.referrer</quote>, if <emphasis>both</emphasis> are present
+ in the page (and appear in that order).
+</para>
  
- Revision 2.58  2008/02/03 21:37:40  hal9
- Apply patch from Mark: s/OSX/OS X/
+<para>
+ But there's still more pattern to go. The next element, again enclosed in parentheses,
+ is <literal>.*&lt;/script&gt;</literal>. You already know what <literal>.*</literal>
+ means, so the whole pattern translates to: Match from the start of the first &lt;script&gt;
+ tag in a page to the end of the last &lt;/script&gt; tag, provided that the text
+ <quote>document.referrer</quote> appears somewhere in between.
+</para>
  
- Revision 2.57  2008/02/03 19:10:14  fabiankeil
- Mention forward-socks5.
+<para>
+ This is still not the whole story, since we have ignored the options and the parentheses:
+ The portions of the page matched by sub-patterns that are enclosed in parentheses will be
+ remembered and be available through the variables <literal>$1, $2, ...</literal> in
+ the substitute. The <literal>U</literal> option switches to ungreedy matching, which means
+ that the first <literal>.*</literal> in the pattern will only <quote>eat up</quote> all
+ text in between <quote>&lt;script</quote> and the <emphasis>first</emphasis> occurrence
+ of <quote>document.referrer</quote>, and that the second <literal>.*</literal> will
+ only span the text up to the <emphasis>first</emphasis> <quote>&lt;/script&gt;</quote>
+ tag. Furthermore, the <literal>s</literal> option says that the match may span
+ multiple lines in the page, and the <literal>g</literal> option again means that the
+ substitution is global.
+</para>
  
- Revision 2.56  2008/01/31 19:11:35  fabiankeil
- Let the +client-header-filter{hide-tor-exit-notation} example apply
- to all requests as "tainted" Referers aren't limited to exit TLDs.
+<para>
+ So, to summarize, the pattern means: Match all scripts that contain the text
+ <quote>document.referrer</quote>. Remember the parts of the script from
+ (and including) the start tag up to (and excluding) the string
+ <quote>document.referrer</quote> as <literal>$1</literal>, and the part following
+ that string, up to and including the closing tag, as <literal>$2</literal>.
+</para>
  
- Revision 2.55  2008/01/19 21:26:37  hal9
- Add IE7 to configuration section per Gerry.
+<para>
+ Now the pattern is deciphered, but wasn't this about substituting things? So
+ let's look at the substitute: <literal>$1"Not Your Business!"$2</literal> is
+ easy to read: The text remembered as <literal>$1</literal>, followed by
+ <literal>"Not Your Business!"</literal> (<emphasis>including</emphasis>
+ the quotation marks!), followed by the text remembered as <literal>$2</literal>.
+ This produces an exact copy of the original string, with the middle part
+ (the <quote>document.referrer</quote>) replaced by <literal>"Not Your
+ Business!"</literal>.
+</para>
  
- Revision 2.54  2008/01/19 17:52:39  hal9
- Re-commit to fix various minor issues for new release.
+<para>
+ The whole job now reads: Replace <quote>document.referrer</quote> by
+ <literal>"Not Your Business!"</literal> wherever it appears inside a
+ &lt;script&gt; tag. Note that this job won't break JavaScript syntax,
+ since both the original and the replacement are syntactically valid
+ string objects. The script just won't have access to the referrer
+ information anymore.
+</para>
  
- Revision 2.53  2008/01/19 15:03:05  hal9
- Doc sources tagged for 3.0.8 release.
+<para>
+ We'll show you two other jobs from the JavaScript taming department, but
+ this time only point out the constructs of special interest:
+</para>
  
- Revision 2.52  2008/01/17 01:49:51  hal9
- Change copyright notice for docs s/2007/2008/. All these will be rebuilt soon
- enough.
+<para>
+ <screen>
+# The status bar is for displaying link targets, not pointless blahblah
+#
+s/window\.status\s*=\s*(['"]).*?\1/dUmMy=1/ig</screen>
+</para>
  
- Revision 2.51  2007/12/23 16:48:24  fabiankeil
- Use more precise example descriptions for the mysterious domain patterns.
+<para>
+ <literal>\s</literal> stands for whitespace characters (space, tab, newline,
+ carriage return, form feed), so that <literal>\s*</literal> means: <quote>zero
+ or more whitespace</quote>. The <literal>?</literal> in <literal>.*?</literal>
+ makes this matching of arbitrary text ungreedy. (Note that the <literal>U</literal>
+ option is not set). The <literal>['"]</literal> construct means: <quote>a single
+ <emphasis>or</emphasis> a double quote</quote>. Finally, <literal>\1</literal> is
+ a back-reference to the first parenthesis just like <literal>$1</literal> above,
+ with the difference that in the <emphasis>pattern</emphasis>, a backslash indicates
+ a back-reference, whereas in the <emphasis>substitute</emphasis>, it's the dollar.
+</para>
  
- Revision 2.50  2007/12/08 12:44:36  fabiankeil
- - Remove already commented out pre-3.0.7 changes.
- - Update the "new log defaults" paragraph.
+<para>
+ So what does this job do? It replaces assignments of single- or double-quoted
+ strings to the <quote>window.status</quote> object with a dummy assignment
+ (using a variable name that is hopefully odd enough not to conflict with
+ real variables in scripts). Thus, it catches many cases where e.g. pointless
+ descriptions are displayed in the status bar instead of the link target when
+ you move your mouse over links.
+</para>
  
- Revision 2.49  2007/12/06 18:21:55  fabiankeil
- Update hide-forwarded-for-headers description.
+<para>
+ <screen>
+# Kill OnUnload popups. Yummy. Test: http://www.zdnet.com/zdsubs/yahoo/tree/yfs.html
+#
+s/(&lt;body [^&gt;]*)onunload(.*&gt;)/$1never$2/iU</screen>
+</para>
  
- Revision 2.48  2007/11/24 19:07:17  fabiankeil
- - Mention request rewriting.
- - Enable the conditional-forge paragraph.
- - Minor rewordings.
+<para>
+ Including the
+ <ulink url="http://www.w3.org/TR/2000/REC-DOM-Level-2-Events-20001113/events.html#Events-eventgroupings-htmlevents">OnUnload
+ event binding</ulink> in the HTML DOM was a <emphasis>CRIME</emphasis>.
+ When I close a browser window, I want it to close and die. Basta.
+ This job replaces the <quote>onunload</quote> attribute in
+ <quote>&lt;body&gt;</quote> tags with the dummy word <literal>never</literal>.
+ Note that the <literal>i</literal> option makes the pattern matching
+ case-insensitive. Also note that ungreedy matching alone doesn't always guarantee
+ a minimal match: In the first parenthesis, we had to use <literal>[^&gt;]*</literal>
+ instead of <literal>.*</literal> to prevent the match from exceeding the
+ &lt;body&gt; tag if it doesn't contain <quote>OnUnload</quote>, but the page's
+ content does.
+</para>
  
- Revision 2.47  2007/11/18 14:59:47  fabiankeil
- A few "Note to Upgraders" updates.
+<para>
+ The last example is from the fun department:
+</para>
  
- Revision 2.46  2007/11/17 17:24:44  fabiankeil
- - Use new action defaults.
- - Minor fixes and rewordings.
+<para>
+ <screen>
+FILTER: fun Fun text replacements
  
- Revision 2.45  2007/11/16 11:48:46  hal9
- Fix one typo, and add a couple of small refinements.
+# Spice the daily news:
+#
+s/microsoft(?!\.com)/MicroSuck/ig</screen>
+</para>
  
- Revision 2.44  2007/11/15 03:30:20  hal9
- Results of spell check.
+<para>
+ Note the <literal>(?!\.com)</literal> part (a so-called negative lookahead)
+ in the job's pattern, which means: Don't match, if the string
+ <quote>.com</quote> appears directly following <quote>microsoft</quote>
+ in the page. This prevents links to microsoft.com from being trashed, while
+ still replacing the word everywhere else.
+</para>
  
- Revision 2.43  2007/11/14 18:45:39  fabiankeil
- - Mention some more contributors in the "New in this Release" list.
- - Minor rewordings.
+<para>
+ <screen>
+# Buzzword Bingo (example for extended regex syntax)
+#
+s* industry[ -]leading \
+|  cutting[ -]edge \
+|  customer[ -]focused \
+|  market[ -]driven \
+|  award[ -]winning # Comments are OK, too! \
+|  high[ -]performance \
+|  solutions[ -]based \
+|  unmatched \
+|  unparalleled \
+|  unrivalled \
+*&lt;font color="red"&gt;&lt;b&gt;BINGO!&lt;/b&gt;&lt;/font&gt; \
+*igx</screen>
+</para>
  
- Revision 2.42  2007/11/12 03:32:40  hal9
- Updates for "What's New" and "Notes to Upgraders". Various other changes in
- preparation for new release. User Manual is almost ready.
+<para>
+ The <literal>x</literal> option in this job turns on extended syntax, and allows for
+ e.g. the liberal use of (non-interpreted!) whitespace for nicer formatting.
+</para>
  
- Revision 2.41  2007/11/11 16:32:11  hal9
- This is primarily syncing What's New and Note to Upgraders sections with the many
- new features and changes (gleaned from memory but mostly from ChangeLog).
+<para>
+ You get the idea?
+</para>
+</sect2>
  
- Revision 2.40  2007/11/10 17:10:59  fabiankeil
- In the first third of the file, mention several times that
- the action editor is disabled by default in 3.0.7 beta and later.
+<!--   ~~~~~~~~       New section Header    ~~~~~~~~~     -->
  
- Revision 2.39  2007/11/05 02:34:49  hal9
- Various changes in preparation for the upcoming release. Much yet to be done.
+<sect2 id="predefined-filters"><title>The Pre-defined Filters</title>
  
- Revision 2.38  2007/09/22 16:01:42  fabiankeil
- Update embedded show-url-info output.
+<!--
  
- Revision 2.37  2007/08/27 16:09:55  fabiankeil
- Fix pre-chroot-nslookup description which I failed to
- copy and paste properly. Reported by Stephen Gildea.
+ Note each filter is also listed in the +filter action section above. Please
+ keep these listings in sync.
  
- Revision 2.36  2007/08/26 16:47:14  fabiankeil
- Add Stephen Gildea's pre-chroot-nslookup patch [#1276666],
- extensive comments moved to user manual.
+-->
  
- Revision 2.35  2007/08/26 14:59:49  fabiankeil
- Minor rewordings and fixes.
+<para>
+The distribution <filename>default.filter</filename> file contains a selection of
+pre-defined filters for your convenience:
+</para>
  
- Revision 2.34  2007/08/05 15:19:50  fabiankeil
- - Don't claim HTTP/1.1 compliance.
- - Use $ in some of the path pattern examples.
- - Use a hide-user-agent example argument without
-   leading and trailing space.
- - Make it clear that the cookie actions work with
-   HTTP cookies only.
- - Rephrase the inspect-jpegs text to underline
-   that it's only meant to protect against a single
-   exploit.
+<variablelist>
+ <varlistentry>
+  <term><emphasis>js-annoyances</emphasis></term>
+  <listitem>
+   <para>
+    The purpose of this filter is to get rid of particularly annoying JavaScript abuse.
+    To that end, it
+   <itemizedlist>
+    <listitem>
+     <para>
+      replaces JavaScript references to the browser's referrer information
+      with the string "Not Your Business!". This complements the <literal><link
+      linkend="hide-referrer">hide-referrer</link></literal> action on the content level.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      removes the bindings to the DOM's
+      <ulink url="http://www.w3.org/TR/2000/REC-DOM-Level-2-Events-20001113/events.html#Events-eventgroupings-htmlevents">unload
+      event</ulink> which we feel has no right to exist and is responsible for most <quote>exit consoles</quote>, i.e.
+      nasty windows that pop up when you close another one.
+     </para>
+    </listitem>
+    <listitem>
+     <para>
+      removes code that causes new windows to be opened with undesired properties, such as being
+      full-screen, non-resizeable, without location, status or menu bar etc.
+     </para>
+    </listitem>
+   </itemizedlist>
+   </para>
+   <para>
+    Use with caution. This is an aggressive filter, and can break sites that
+    rely heavily on JavaScript.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 2.33  2007/07/27 10:57:35  hal9
- Add references for user-agent strings for hide-user-agenet
+ <varlistentry>
+  <term><emphasis>js-events</emphasis></term>
+  <listitem>
+   <para>
+    This is a very radical measure. It removes virtually all JavaScript event bindings, which
+    means that scripts cannot react to user actions such as mouse movements or clicks, window
+    resizing etc, anymore. Use with caution!
+   </para>
+   <para>
+    We <emphasis>strongly discourage</emphasis> using this filter as a default since it breaks
+    many legitimate scripts. It is meant for use only on extra-nasty sites (should you really
+    need to go there).
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 2.32  2007/06/07 12:36:22  fabiankeil
- Apply Roland's 29_usermanual.dpatch to fix a bunch
- of syntax errors I collected over the last months.
+<varlistentry>
+  <term><emphasis>html-annoyances</emphasis></term>
+  <listitem>
+   <para>
+    This filter will undo many common instances of HTML based abuse.
+   </para>
+   <para>
+    The <literal>BLINK</literal> and <literal>MARQUEE</literal> tags
+    are neutralized (yeah baby!), and browser windows will be created as
+    resizeable (as of course they should be!), and will have location,
+    scroll and menu bars -- even if specified otherwise.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 2.31  2007/06/02 14:01:37  fabiankeil
- Start to document forward-override{}.
+ <varlistentry>
+  <term><emphasis>content-cookies</emphasis></term>
+  <listitem>
+   <para>
+    Most cookies are set in the HTTP dialog, where they can be intercepted
+    by the
+    <literal><link linkend="crunch-incoming-cookies">crunch-incoming-cookies</link></literal>
+    and <literal><link linkend="crunch-outgoing-cookies">crunch-outgoing-cookies</link></literal>
+    actions. But web sites increasingly make use of HTML meta tags and JavaScript
+    to sneak cookies to the browser on the content level.
+   </para>
+   <para>
+    This filter disables most HTML and JavaScript code that reads or sets
+    cookies. It cannot detect all clever uses of these types of code, so it
+    should not be relied on as an absolute fix. Use it wherever you would also
+    use the cookie crunch actions.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 2.30  2007/04/25 15:10:36  fabiankeil
- - Describe installation for FreeBSD.
- - Start to document taggers and tag patterns.
- - Don't confuse devils and daemons.
+ <varlistentry>
+  <term><emphasis>refresh-tags</emphasis></term>
+  <listitem>
+   <para>
+    Disable any refresh tags if the interval is greater than nine seconds (so
+    that redirections done via refresh tags are not destroyed). This is useful
+    for dial-on-demand setups, or for those who find this HTML feature
+    annoying.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 2.29  2007/04/05 11:47:51  fabiankeil
- Some updates regarding header filtering,
- handling of compressed content and redirect's
- support for pcrs commands.
+ <varlistentry>
+  <term><emphasis>unsolicited-popups</emphasis></term>
+  <listitem>
+   <para>
+    This filter attempts to prevent only <quote>unsolicited</quote> pop-up
+    windows from opening, yet still allow pop-up windows that the user
+    has explicitly chosen to open. It was added in version 3.0.1,
+    as an improvement over earlier such filters.
+   </para>
+   <para>
+    Technical note: The filter works by redefining the window.open JavaScript
+    function to a dummy function, <literal>PrivoxyWindowOpen()</literal>,
+    during the loading and rendering phase of each HTML page access, and
+    restoring the function afterward.
+   </para>
+   <para>
+    This is recommended only for browsers that cannot perform this function
+    reliably themselves. And be aware that some sites require such windows
+    in order to function normally. Use with caution.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 2.28  2006/12/10 23:42:48  hal9
- Fix various typos reported by Adam P. Thanks.
+ <varlistentry>
+  <term><emphasis>all-popups</emphasis></term>
+  <listitem>
+   <para>
+    Attempt to prevent <emphasis>all</emphasis> pop-up windows from opening.
+    Note this should be used with even more discretion than the above, since
+    it is more likely to break some sites that require pop-ups for normal
+    usage. Use with caution.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 2.27  2006/11/14 01:57:47  hal9
- Dump all docs prior to 3.0.6 release. Various minor changes to faq and user
- manual.
+ <varlistentry>
+  <term><emphasis>img-reorder</emphasis></term>
+  <listitem>
+   <para>
+    This is a helper filter that has no value if used alone. It makes the
+    <literal>banners-by-size</literal> and <literal>banners-by-link</literal>
+    (see below) filters more effective and should be enabled together with them.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 2.26  2006/10/24 11:16:44  hal9
- Add new filters.
+ <varlistentry>
+  <term><emphasis>banners-by-size</emphasis></term>
+  <listitem>
+   <para>
+    This filter removes image tags purely based on what size they are. Fortunately
+    for us, many ads and banner images tend to conform to certain standardized
+    sizes, which makes this filter quite effective for ad stripping purposes.
+   </para>
+   <para>
+    Occasionally this filter will cause false positives on images that are not ads,
+    but just happen to be of one of the standard banner sizes.
+   </para>
+   <para>
+    Recommended only for those who require extreme ad blocking. The default
+    block rules should catch 95+% of all ads <emphasis>without</emphasis> this filter enabled.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 2.25  2006/10/18 10:50:33  hal9
- Add note that since filters are off in Cautious, compression is ON. Turn off
- compression to make filters work on all sites.
+ <varlistentry>
+  <term><emphasis>banners-by-link</emphasis></term>
+  <listitem>
+   <para>
+    This is an experimental filter that attempts to kill any banners if
+    their URLs seem to point to known or suspected click trackers. It is currently
+    not of much value and is not recommended for use by default.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 2.24  2006/10/03 11:13:54  hal9
- More references to the new filters. Include html this time around.
+ <varlistentry>
+  <term><emphasis>webbugs</emphasis></term>
+  <listitem>
+   <para>
+    Webbugs are small, invisible images (technically 1x1 GIF images) that
+    are used to track users across websites and collect information on them.
+    As an HTML page is loaded by the browser, an embedded image tag causes the
+    browser to contact a third-party site, disclosing the tracking information
+    through the requested URL and/or cookies for that third-party domain, without
+    the user ever becoming aware of the interaction with the third-party site.
+    HTML-ized spam also uses a similar technique to verify email addresses.
+   </para>
+   <para>
+    This filter removes the HTML code that loads such <quote>webbugs</quote>.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 2.23  2006/10/02 22:43:53  hal9
- Contains new filter definitions from Fabian, and few other miscellaneous
- touch-ups.
+ <varlistentry>
+  <term><emphasis>tiny-textforms</emphasis></term>
+  <listitem>
+   <para>
+    A rather special-purpose filter that can be used to enlarge textareas (those
+    multi-line text boxes in web forms) and turn off hard word wrap in them.
+    It was written for the sourceforge.net tracker system where such boxes are
+    a nuisance, but it can be handy on other sites, too.
+   </para>
+   <para>
+    It is not recommended to use this filter as a default.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 2.22  2006/09/22 01:27:55  hal9
- Final commit of probably various minor changes here and there. Unless
- something changes this should be ready for pending release.
+ <varlistentry>
+  <term><emphasis>jumping-windows</emphasis></term>
+  <listitem>
+   <para>
+    Many consider windows that move or resize themselves to be abusive. This filter
+    neutralizes the related JavaScript code. Note that some sites might not display
+    or behave as intended when using this filter. Use with caution.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 2.21  2006/09/20 03:21:36  david__schmidt
- Just the tiniest tweak.  Wafer thin!
+ <varlistentry>
+  <term><emphasis>frameset-borders</emphasis></term>
+  <listitem>
+   <para>
+    Some web designers seem to assume that everyone in the world will view their
+    web sites using the same browser brand and version, screen resolution etc.,
+    because only that assumption could explain why they'd use static frame sizes,
+    yet prevent their frames from being resized by the user, should they be too
+    small to show their whole content.
+   </para>
+   <para>
+    This filter removes the related HTML code. It should only be applied to sites
+    which need it.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 2.20  2006/09/10 14:53:54  hal9
- Results of spell check. User manual has some updates to standard.actions file
- info.
+ <varlistentry>
+  <term><emphasis>demoronizer</emphasis></term>
+  <listitem>
+   <para>
+    Many Microsoft products that generate HTML use non-standard extensions (read:
+    violations) of the ISO 8859-1 aka Latin-1 character set. This can cause those
+    HTML documents to display with errors on standard-compliant platforms.
+   </para>
+   <para>
+    This filter translates the MS-only characters into Latin-1 equivalents.
+    It is not necessary when using MS products, and will cause corruption of
+    all documents that use 8-bit character sets other than Latin-1. It's mostly
+    worthwhile for Europeans on non-MS platforms who sometimes see weird garbage
+    characters on some pages, or whose user agents don't correct for this on
+    the fly.
+<!--
+    My version of Mozilla (ancient) shows little square boxes for quote
+    characters, and apostrophes on moronized pages. So many pages have this, I
+    can read them fine now. HB 08/27/06
+-->
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 2.19  2006/09/08 12:19:02  fabiankeil
- Adjust hide-if-modified-since example values
- to reflect the recent changes.
+ <varlistentry>
+  <term><emphasis>shockwave-flash</emphasis></term>
+  <listitem>
+   <para>
+    A filter for Shockwave haters. As the name suggests, this filter strips code
+    out of web pages that is used to embed Shockwave Flash objects.
+   </para>
+   <para>
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 2.18  2006/09/08 02:38:57  hal9
- Various changes:
-  -Fix a number of broken links.
-  -Migrate the new Windows service command line options, and reference as
-   needed.
-  -Rebuild so that can be used with the new "user-manual" config capabilities.
-  -Etc.
+ <varlistentry>
+  <term><emphasis>quicktime-kioskmode</emphasis></term>
+  <listitem>
+   <para>
+    Change HTML code that embeds QuickTime objects so that kioskmode, which
+    prevents saving, is disabled.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 2.17  2006/09/05 13:25:12  david__schmidt
- Add Windows service invocation stuff (duplicated) in FAQ and in user manual under Windows startup.  One probably ought to reference the other.
+ <varlistentry>
+  <term><emphasis>fun</emphasis></term>
+  <listitem>
+   <para>
+    Text replacements for subversive browsing fun. Make fun of your favorite
+    Monopolist or play buzzword bingo.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 2.16  2006/09/02 12:49:37  hal9
- Various small updates for new actions, filterfiles, etc.
+ <varlistentry>
+  <term><emphasis>crude-parental</emphasis></term>
+  <listitem>
+   <para>
+    A demonstration-only filter that shows how <application>Privoxy</application>
+    can be used to delete web content on a keyword basis.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 2.15  2006/08/30 11:15:22  hal9
- More work on the new actions, especially filter-*-headers, and What's New
- section. User Manual is close to final form for 3.0.4 release. Some tinkering
- and proof reading left to do.
+ <varlistentry>
+  <term><emphasis>ie-exploits</emphasis></term>
+  <listitem>
+   <para>
+    An experimental collection of text replacements to disable malicious HTML and JavaScript
+    code that exploits known security holes in Internet Explorer.
+   </para>
+   <para>
+    Presently, it only protects against Nimda and a cross-site scripting bug, and
+    would need active maintenance to provide more substantial protection.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 2.14  2006/08/29 10:59:36  hal9
- Add a "Whats New in this release" Section. Further work on multiple filter
- files, and assorted other minor changes.
+ <varlistentry>
+  <term><emphasis>site-specifics</emphasis></term>
+  <listitem>
+   <para>
+    Some web sites have very specific problems, the cure for which doesn't apply
+    anywhere else, or could even cause damage on other sites.
+   </para>
+   <para>
+    This is a collection of such site-specific cures which should only be applied
+    to the sites they were intended for, which is what the supplied
+    <filename>default.action</filename> file does. Users shouldn't need to change
+    anything regarding this filter.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 2.13  2006/08/22 11:04:59  hal9
- Silence warnings and errors. This should build now. New filters were only
- stubbed in. More to be done.
+ <varlistentry>
+  <term><emphasis>google</emphasis></term>
+  <listitem>
+   <para>
+    A CSS-based block for Google text ads. Also removes a width limitation
+    and the toolbar advertisement.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 2.12  2006/08/14 08:40:39  fabiankeil
- Documented new actions that were part of
- the "minor Privoxy improvements".
+  <varlistentry>
+  <term><emphasis>yahoo</emphasis></term>
+  <listitem>
+   <para>
+    Another CSS-based block, this time for Yahoo text ads. It also removes
+    a width limitation.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 2.11  2006/07/18 14:48:51  david__schmidt
- Reorganizing the repository: swapping out what was HEAD (the old 3.1 branch)
- with what was really the latest development (the v_3_0_branch branch)
+  <varlistentry>
+  <term><emphasis>msn</emphasis></term>
+  <listitem>
+   <para>
+    Another CSS-based block, this time for MSN text ads. It also removes
+    tracking URLs and a width limitation.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 1.123.2.43  2005/05/23 09:59:10  hal9
- Fix typo 'loose'
+ <varlistentry>
+  <term><emphasis>blogspot</emphasis></term>
+  <listitem>
+   <para>
+    Cleans up some Blogspot blogs. Read the fine print before using this one!
+   </para>
+   <para>
+    This filter also intentionally removes some navigation stuff and sets the
+    page width to 100%. As a result, some rounded <quote>corners</quote> would
+    appear too early or not at all. As fixing this would require a browser
+    that understands background-size (CSS3), they are removed instead.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 1.123.2.42  2004/12/04 14:39:57  hal9
- Fix two minor typos per bug SF report.
+  <varlistentry>
+  <term><emphasis>xml-to-html</emphasis></term>
+  <listitem>
+   <para>
+    Server-header filter to change the Content-Type from xml to html.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 1.123.2.41  2004/03/23 12:58:42  oes
- Fixed an inaccuracy
+  <varlistentry>
+  <term><emphasis>html-to-xml</emphasis></term>
+  <listitem>
+   <para>
+    Server-header filter to change the Content-Type from html to xml.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 1.123.2.40  2004/02/27 12:48:49  hal9
- Add comment re: redirecting to local file system for set-image-blocker may
- is dependent on browser.
+  <varlistentry>
+  <term><emphasis>no-ping</emphasis></term>
+  <listitem>
+   <para>
+    Removes the non-standard <literal>ping</literal> attribute from
+    anchor and area HTML tags.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 1.123.2.39  2004/01/30 22:31:40  oes
- Added a hint re bookmarklets to Quickstart section
+  <varlistentry>
+  <term><emphasis>hide-tor-exit-notation</emphasis></term>
+  <listitem>
+   <para>
+    Client-header filter to remove the <command>Tor</command> exit node notation
+    found in Host and Referer headers.
+   </para>
+   <para>
+    If &my-app; and <command>Tor</command> are chained and &my-app;
+    is configured to use socks4a, one can use <quote>http://www.example.org.foobar.exit/</quote>
+    to access the host <quote>www.example.org</quote> through the
+    <command>Tor</command> exit node <quote>foobar</quote>.
+   </para>
+   <para>
+    As the HTTP client isn't aware of this notation, it treats the
+    whole string <quote>www.example.org.foobar.exit</quote> as host and uses it
+    for the <quote>Host</quote> and <quote>Referer</quote> headers. From the
+    server's point of view the resulting headers are invalid and can cause problems.
+   </para>
+   <para>
+    An invalid <quote>Referer</quote> header can trigger <quote>hot-linking</quote>
+    protections; an invalid <quote>Host</quote> header will make it impossible for
+    the server to find the right vhost (several domains hosted on the same IP address).
+   </para>
+   <para>
+    This client-header filter removes the <quote>foobar.exit</quote> part in those headers
+    to prevent the mentioned problems. Note that it only modifies
+    the HTTP headers; it doesn't make it impossible for the server
+    to detect your <command>Tor</command> exit node based on the IP address
+    the request is coming from.
+   </para>
+  </listitem>
+ </varlistentry>
  
- Revision 1.123.2.38  2004/01/30 16:47:51  oes
- Some minor clarifications
+<!--
+ <varlistentry>
+  <term><emphasis> </emphasis></term>
+  <listitem>
+   <para>
+   </para>
+   <para>
+   </para>
+  </listitem>
+ </varlistentry>
+-->
+</variablelist>
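+
+<para>
+ As a rough sketch of how the filters described above are put to use, the
+ following fragment for an actions file such as <filename>user.action</filename>
+ enables a few of them. The host <literal>.example.com</literal> is only a
+ placeholder, and whether these filters suit a given site is an assumption
+ you will want to verify for yourself:
+</para>
+
+<para>
+ <screen>
+# Helper filter plus size-based banner removal, for one (placeholder) site only
+{ +filter{img-reorder} +filter{banners-by-size} }
+.example.com
+
+# Header filters are enabled through their own actions
+{ +client-header-filter{hide-tor-exit-notation} }
+.example.com</screen>
+</para>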
  
- Revision 1.123.2.37  2004/01/29 22:36:11  hal9
- Updates for no longer filtering text/plain, and demoronizer default settings,
- and copyright notice dates.
+</sect2>
+</sect1>
  
- Revision 1.123.2.36  2003/12/10 02:26:26  hal9
- Changed the demoronizer filter description.
+<!--  ~  End section  ~  -->
  
- Revision 1.123.2.35  2003/11/06 13:36:37  oes
- Updated link to nightly CVS tarball
  
- Revision 1.123.2.34  2003/06/26 23:50:16  hal9
- Add a small bit on filtering and problems re: source code being corrupted.
  
- Revision 1.123.2.33  2003/05/08 18:17:33  roro
- Use apt-get instead of dpkg to install Debian package, which is more
- solid, uses the correct and most recent Debian version automatically.
+<!--   ~~~~~       New section      ~~~~~     -->
  
- Revision 1.123.2.32  2003/04/11 03:13:57  hal9
- Add small note about only one filterfile (as opposed to multiple actions
- files).
+<sect1 id="templates">
+<title>Privoxy's Template Files</title>
+<para>
+ All <application>Privoxy</application> built-in pages, i.e. error pages such as the
+ <ulink url="http://show-the-404-error.page"><quote>404 - No Such Domain</quote>
+ error page</ulink>, the <ulink
+ url="http://ads.bannerserver.example.com/nasty-ads/sponsor.html"><quote>BLOCKED</quote>
+ page</ulink>
+ and all pages of its <ulink url="http://config.privoxy.org/">web-based
+ user interface</ulink>, are generated from <emphasis>templates</emphasis>.
+ (<application>Privoxy</application> must be running for the above links to work as
+ intended.)
+</para>
  
- Revision 1.123.2.31  2003/03/26 02:03:43  oes
- Updated hard-coded copyright dates
+<para>
+ These templates are stored in a subdirectory of the <link linkend="confdir">configuration
+ directory</link> called <filename>templates</filename>. On Unixish platforms,
+ this is typically
+ <ulink url="file:///etc/privoxy/templates/"><filename>/etc/privoxy/templates/</filename></ulink>.
+</para>
  
- Revision 1.123.2.30  2003/03/24 12:58:56  hal9
- Add new section on Predefined Filters.
+<para>
+ The templates are basically normal HTML files, but with place-holders (called symbols
+ or exports), which <application>Privoxy</application> fills at run time. It
+ is possible to edit the templates with a normal text editor, should you want
+ to customize them. (<emphasis>Not recommended for the casual
+ user</emphasis>). Should you create your own custom templates, you should use
+ the <filename>config</filename> setting <link linkend="templdir">templdir</link>
+ to specify an alternate location, so your templates do not get overwritten
+ during upgrades.
+ </para>
+ <para>
+ Note that just like in configuration files, lines starting
+ with <literal>#</literal> are ignored when the templates are filled in.
+</para>
  
- Revision 1.123.2.29  2003/03/20 02:45:29  hal9
- More problems with \-\-chroot causing markup problems :(
+<para>
+ The place-holders are of the form <literal>@name@</literal>, and you will
+ find a list of available symbols, which vary from template to template,
+ in the comments at the start of each file. Note that these comments are not
+ always accurate, and that it's probably best to look at the existing HTML
+ code to find out which symbols are supported and what they are filled in with.
+</para>
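+
+<para>
+ A minimal sketch of what such a template might look like is shown below.
+ The symbol <literal>@example-symbol@</literal> is made up for illustration
+ and is not claimed to be one of the symbols that
+ <application>Privoxy</application> actually exports:
+</para>
+
+<para>
+ <screen>
+# This line starts with a hash sign and is ignored when the template is filled in.
+&lt;html&gt;
+ &lt;body&gt;
+  &lt;p&gt;Value filled in at run time: @example-symbol@&lt;/p&gt;
+ &lt;/body&gt;
+&lt;/html&gt;</screen>
+</para>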
  
- Revision 1.123.2.28  2003/03/19 00:35:24  hal9
- Manual edit of revision log because 'chroot' (even inside a comment) was
- causing Docbook to hang here (due to double hyphen and the processor thinking
- it was a comment).
+<para>
+ A special application of this substitution mechanism is to make whole
+ blocks of HTML code disappear when a specific symbol is set. We use this
+ for many purposes, one of them being to include the beta warning in all
+ our user interface (CGI) pages when <application>Privoxy</application>
+ is in an alpha or beta development stage:
+</para>
  
- Revision 1.123.2.27  2003/03/18 19:37:14  oes
- s/Advanced|Radical/Adventuresome/g to avoid complaints re fun filter
+<para>
+ <screen>
+&lt;!-- @if-unstable-start --&gt;
  
- Revision 1.123.2.26  2003/03/17 16:50:53  oes
- Added documentation for new chroot option
+  ... beta warning HTML code goes here ...
  
- Revision 1.123.2.25  2003/03/15 18:36:55  oes
- Adapted to the new filters
+&lt;!-- if-unstable-end@ --&gt;</screen>
+</para>
  
- Revision 1.123.2.24  2002/11/17 06:41:06  hal9
- Move default profiles table from FAQ to U-M, and other minor related changes.
- Add faq on cookies.
+<para>
+ If the <quote>unstable</quote> symbol is set, everything in between and including
+ <literal>@if-unstable-start</literal> and <literal>if-unstable-end@</literal>
+ will disappear, leaving nothing but an empty comment:
+</para>
  
- Revision 1.123.2.23  2002/10/21 02:32:01  hal9
- Updates to the user.action examples section. A few new ones.
+<para>
+ <screen>&lt;!--  --&gt;</screen>
+</para>
  
- Revision 1.123.2.22  2002/10/12 00:51:53  hal9
- Add demoronizer to filter section.
+<para>
+ There's also an if-then-else construct and an <literal>#include</literal>
+ mechanism, but you'll surely find out if you are inclined to edit the
+ templates ;-)
+</para>
  
- Revision 1.123.2.21  2002/10/10 04:09:35  hal9
- s/Advanced/Radical/ and added very brief note.
+<para>
+ All templates refer to a style located at
+ <ulink url="http://config.privoxy.org/send-stylesheet"><literal>http://config.privoxy.org/send-stylesheet</literal></ulink>.
+ This is, of course, locally served by <application>Privoxy</application>
+ and the source for it can be found and edited in the
+ <filename>cgi-style.css</filename> template.
+</para>
  
- Revision 1.123.2.20  2002/10/10 03:49:21  hal9
- Add notes to session-cookies-only and Quickstart about pre-existing
- cookies. Also, note content-cookies work differently.
+</sect1>
  
- Revision 1.123.2.19  2002/09/26 01:25:36  hal9
- More explanation on Privoxy patterns, more on content-cookies and SSL.
+<!--  ~  End section  ~  -->
  
- Revision 1.123.2.18  2002/08/22 23:47:58  hal9
- Add 'Documentation' to Privoxy Menu shot in Configuration section to match
- CGIs.
  
- Revision 1.123.2.17  2002/08/18 01:13:05  hal9
- Spell checked (only one typo this time!).
  
- Revision 1.123.2.16  2002/08/09 19:20:54  david__schmidt
- Update to Mac OS X startup script name
+<!--   ~~~~~       New section      ~~~~~     -->
  
- Revision 1.123.2.15  2002/08/07 17:32:11  oes
- Converted some internal links from ulink to link for PDF creation; no content changed
+<sect1 id="contact"><title>Contacting the Developers, Bug Reporting and Feature
+Requests</title>
  
- Revision 1.123.2.14  2002/08/06 09:16:13  oes
- Nits re: actions file download
+<!-- Include contacting.sgml boilerplate: -->
+ &contacting;
+<!-- end boilerplate -->
  
- Revision 1.123.2.13  2002/08/02 18:23:19  g_sauthoff
- Just 2 small corrections to the Gentoo sections
+</sect1>
  
- Revision 1.123.2.12  2002/08/02 18:17:21  g_sauthoff
- Added 2 Gentoo sections
+<!--  ~  End section  ~  -->
  
- Revision 1.123.2.11  2002/07/26 15:20:31  oes
- - Added version info to title
- - Added info on new filters
- - Revised parts of the filter file tutorial
- - Added info on where to get updated actions files
  
- Revision 1.123.2.10  2002/07/25 21:42:29  hal9
- Add brief notes on not proxying non-HTTP protocols.
+<!--   ~~~~~       New section      ~~~~~     -->
+<sect1 id="copyright"><title>Privoxy Copyright, License and History</title>
  
- Revision 1.123.2.9  2002/07/11 03:40:28  david__schmidt
+<!-- Include copyright.sgml: -->
+ &copyright;
+<!-- end copyright -->
  
- Updated Mac OS X sections due to installation location change
+<!--   ~~~~~       New section      ~~~~~     -->
+<sect2><title>License</title>
+<!-- Include license.sgml: -->
+ &license;
+<!-- end license -->
+</sect2>
+<!--  ~  End section  ~  -->
  
- Revision 1.123.2.8  2002/06/09 16:36:32  hal9
- Clarifications on filtering and MIME. Hardcode 'latest release' in index.html.
  
- Revision 1.123.2.7  2002/06/09 00:29:34  hal9
- Touch ups on filtering, in actions section and Anatomy.
+<!--   ~~~~~       New section      ~~~~~     -->
  
- Revision 1.123.2.6  2002/06/06 23:11:03  hal9
- Fix broken link. Linkchecked all docs.
+<sect2 id="history"><title>History</title>
+<!-- Include history.sgml: -->
+ &history;
+<!-- end history -->
+</sect2>
  
- Revision 1.123.2.5  2002/05/29 02:01:02  hal9
- This is break out of the entire config section from u-m, so it can
- eventually be used to generate the comments, etc in the main config file
- so that these are in sync with each other.
+<sect2 id="authors"><title>Authors</title>
+<!-- Include p-authors.sgml: -->
+ &p-authors;
+<!-- end authors -->
+</sect2>
  
- Revision 1.123.2.4  2002/05/27 03:28:45  hal9
- Ooops missed something from David.
+</sect1>
  
- Revision 1.123.2.3  2002/05/27 03:23:17  hal9
- Fix FIXMEs for OS2 and Mac OS X startup. Fix Redhat typos (should be Red Hat).
- That's a wrap, I think.
+<!--  ~  End section  ~  -->
  
- Revision 1.123.2.2  2002/05/26 19:02:09  hal9
- Move Amiga stuff around to take of FIXME in start up section.
  
- Revision 1.123.2.1  2002/05/26 17:04:25  hal9
- -Spellcheck, very minor edits, and sync across branches
+<!--   ~~~~~       New section      ~~~~~     -->
+<sect1 id="seealso"><title>See Also</title>
+<!-- Include seealso.sgml: -->
+ &seealso;
+<!-- end seealso -->
+</sect1>
  
- Revision 1.123  2002/05/24 23:19:23  hal9
- Include new image (Proxy setup). More fun with guibutton.
- Minor corrections/clarifications here and there.
  
- Revision 1.122  2002/05/24 13:24:08  oes
- Added Bookmarklet for one-click pre-filled access to show-url-info
  
- Revision 1.121  2002/05/23 23:20:17  oes
-  - Changed more (all?) references to actions to the
-    <literal><link> style.
-  - Small fixes in the actions chapter
-  - Small clarifications in the quickstart to ad blocking
-  - Removed <emphasis> from <title>s since the new doc CSS
-    renders them red (bad in TOC).
+<!--   ~~~~~       New section      ~~~~~     -->
+<sect1 id="appendix"><title>Appendix</title>
  
- Revision 1.120  2002/05/23 19:16:43  roro
- Correct Debian specials (installation and startup).
  
- Revision 1.119  2002/05/22 17:17:05  oes
- Added Security hint
+<!--   ~~~~~       New section      ~~~~~     -->
+<sect2 id="regex">
+<title>Regular Expressions</title>
+<para>
+ <application>Privoxy</application> uses Perl-style <quote>regular
+ expressions</quote> in its <link linkend="actions-file">actions
+ files</link> and <link linkend="filter-file">filter file</link>,
+ through the <ulink url="http://www.pcre.org/">PCRE</ulink> and
+<!--
+ dead 08/27/06
+ <ulink url="http://www.oesterhelt.org/pcrs/">PCRS</ulink> libraries.
+-->
+ <application>PCRS</application> libraries.
+</para>
  
- Revision 1.118  2002/05/21 04:54:55  hal9
- -New Section: Quickstart to Ad Blocking
- -Reformat Actions Anatomy to match new CGI layout
+<para>
+ If you are reading this, you probably don't understand what <quote>regular
+ expressions</quote> are, or what they can do. So this will be a very brief
+ introduction only. A full explanation would require a <ulink
+ url="http://www.oreilly.com/catalog/regex/">book</ulink> ;-)
+</para>
  
- Revision 1.117  2002/05/17 13:56:16  oes
-  - Reworked & extended Templates chapter
-  - Small changes to Regex appendix
-  - #included authors.sgml into (C) and hist chapter
+<para>
+ Regular expressions provide a language to describe patterns that can be
+ run against strings of characters (letters, numbers, etc.), to see if they
+ match the string or not. The patterns are themselves (sometimes complex)
+ strings of literal characters, combined with wild-cards, and other special
+ characters, called meta-characters. The <quote>meta-characters</quote> have
+ special meanings and are used to build complex patterns to be matched against.
+ Perl Compatible Regular Expressions are an especially convenient
+ <quote>dialect</quote> of the regular expression language.
+</para>
  
- Revision 1.116  2002/05/17 03:23:46  hal9
- Fixing merge conflict in Quickstart section.
+<para>
+ To make a simple analogy, we do something similar when we use wild-card
+ characters when listing files with the <command>dir</command> command in DOS.
+ <literal>*.*</literal> matches all filenames. The <quote>special</quote>
+ character here is the asterisk which matches any and all characters. We can be
+ more specific and use <literal>?</literal> to match just individual
+ characters. So <quote>dir file?.txt</quote> would match
+ <quote>file1.txt</quote>, <quote>file2.txt</quote>, etc. We are pattern
+ matching, using a similar technique to <quote>regular expressions</quote>!
+</para>
  
- Revision 1.115  2002/05/16 16:25:00  oes
- Extended the Filter File chapter & minor fixes
+<para>
+ Regular expressions do essentially the same thing, but are much, much more
+ powerful. There are many more <quote>special characters</quote> and ways of
+ building complex patterns, however. Let's look at a few of the common ones,
+ and then some examples:
+</para>
  
- Revision 1.114  2002/05/16 09:42:50  oes
- More ulink->link, added some hints to Quickstart section
+<para><simplelist>
+ <member>
+  <emphasis>.</emphasis> - Matches any single character, e.g. <quote>a</quote>,
+  <quote>A</quote>, <quote>4</quote>, <quote>:</quote>, or <quote>@</quote>.
+ </member>
+</simplelist></para>
  
- Revision 1.113  2002/05/15 21:07:25  oes
- Extended and further commented the example actions files
+<para><simplelist>
+ <member>
+  <emphasis>?</emphasis> - The preceding character or expression is matched ZERO or ONE
+  times. Either/or.
+ </member>
+</simplelist></para>
  
- Revision 1.112  2002/05/15 03:57:14  hal9
- Spell check. A few minor edits here and there for better syntax and
- clarification.
+<para><simplelist>
+ <member>
+  <emphasis>+</emphasis> - The preceding character or expression is matched ONE or MORE
+  times.
+ </member>
+</simplelist></para>
  
- Revision 1.111  2002/05/14 23:01:36  oes
- Fixing the fixes
+<para><simplelist>
+ <member>
+  <emphasis>*</emphasis> - The preceding character or expression is matched ZERO or MORE
+  times.
+ </member>
+</simplelist></para>
  
- Revision 1.110  2002/05/14 19:10:45  oes
- Restored alphabetical order of actions
+<para><simplelist>
+ <member>
+  <emphasis>\</emphasis> - The <quote>escape</quote> character denotes that
+  the following character should be taken literally. This is used where one of the
+  special characters (e.g. <quote>.</quote>) needs to be taken literally and
+  not as a special meta-character. Example: <quote>example\.com</quote>, makes
+  sure the period is recognized only as a period (and not expanded to its
+  meta-character meaning of any single character).
+ </member>
+</simplelist></para>
  
- Revision 1.109  2002/05/14 17:23:11  oes
- Renamed the prevent-*-cookies actions, extended aliases section and moved it before the example AFs
+<para><simplelist>
+ <member>
+  <emphasis>[ ]</emphasis> - Characters enclosed in brackets will be matched if
+  any of the enclosed characters are encountered. For instance, <quote>[0-9]</quote>
+  matches any numeric digit (zero through nine). As an example, we can combine
+  this with <quote>+</quote> to match any digit one or more times: <quote>[0-9]+</quote>.
+ </member>
+</simplelist></para>
  
- Revision 1.108  2002/05/14 15:29:12  oes
- Completed proofreading the actions chapter
+<para><simplelist>
+ <member>
+  <emphasis>( )</emphasis> - Parentheses are used to group a sub-expression,
+  or multiple sub-expressions.
+ </member>
+</simplelist></para>
  
- Revision 1.107  2002/05/12 03:20:41  hal9
- Small clarifications for 127.0.0.1 vs localhost for listen-address since this
- apparently an important distinction for some OS's.
+<para><simplelist>
+ <member>
+  <emphasis>|</emphasis> - The <quote>bar</quote> character works like an
+  <quote>or</quote> conditional statement. A match is successful if the
+  sub-expression on either side of <quote>|</quote> matches. As an example:
+  <quote>/(this|that) example/</quote> uses grouping and the bar character
+  and would match either <quote>this example</quote> or <quote>that
+  example</quote>, and nothing else.
+ </member>
+</simplelist></para>
  
- Revision 1.106  2002/05/10 01:48:20  hal9
- This is mostly proposed copyright/licensing additions and changes. Docs
- are still GPL, but licensing and copyright are more visible. Also, copyright
- changed in doc header comments (eliminate references to JB except FAQ).
+<para>
+ These are just some of the ones you are likely to use when matching URLs with
+ <application>Privoxy</application>, and are a long way from a definitive
+ list. This is enough to get us started with a few simple examples which may
+ be more illuminating:
+</para>
  
- Revision 1.105  2002/05/05 20:26:02  hal9
- Sorting out license vs copyright in these docs.
+<para>
+ <emphasis><literal>/.*/banners/.*</literal></emphasis> - A  simple example
+ that uses the common combination of <quote>.</quote> and <quote>*</quote> to
+ denote any character, zero or more times. In other words, any string at all.
+ So we start with a literal forward slash, then our regular expression pattern
+ (<quote>.*</quote>) another literal forward slash, the string
+ <quote>banners</quote>, another forward slash, and lastly another
+ <quote>.*</quote>. We are building
+ a directory path here. This will match any file with the path that has a
+ directory named <quote>banners</quote> in it. The <quote>.*</quote> matches
+ any characters, and this could conceivably be more forward slashes, so it
+ might expand into a much longer looking path. For example, this could match:
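+ <quote>/eye/hate/spammers/banners/annoy_me_please.gif</quote>, or just
+ <quote>/ads/banners/annoying.html</quote>, or almost an infinite number of other
+ possible combinations, as long as the path has a <quote>/banners/</quote>
+ directory somewhere after the leading slash. (A path that begins directly with
+ <quote>/banners/</quote> would not match, since the pattern requires a second
+ slash before <quote>banners</quote>.)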
+</para>
  
- Revision 1.104  2002/05/04 08:44:45  swa
- bumped version
+<para>
+ And now something a little more complex:
+</para>
  
- Revision 1.103  2002/05/04 00:40:53  hal9
- -Remove the TOC first page kludge. It's fixed proper now in ldp.dsl.in.
- -Some minor additions to Quickstart.
+<para>
+ <emphasis><literal>/.*/adv((er)?ts?|ertis(ing|ements?))?/</literal></emphasis> -
+ We have several literal forward slashes again (<quote>/</quote>), so we are
+ building another expression that is a file path statement. We have another
+ <quote>.*</quote>, so we are matching against any conceivable sub-path, just so
+ it matches our expression. The only true literal that <emphasis>must
+ match</emphasis> our pattern is <quote>adv</quote>, together with
+ the forward slashes. What comes after the <quote>adv</quote> string is the
+ interesting part.
+</para>
  
- Revision 1.102  2002/05/03 17:46:00  oes
- Further proofread & reactivated short build instructions
+<para>
+ Remember the <quote>?</quote> means the preceding expression (either a
+ literal character or anything grouped with <quote>(...)</quote> in this case)
+ can exist or not, since this means either zero or one match. So
+ <quote>((er)?ts?|ertis(ing|ements?))</quote> is optional, as are the
+ individual sub-expressions: <quote>(er)</quote>,
+ <quote>(ing|ements?)</quote>, and the <quote>s</quote>. The <quote>|</quote>
+ means <quote>or</quote>. We have two of those. For instance,
+ <quote>(ing|ements?)</quote>, can expand to match either <quote>ing</quote>
+ <emphasis>OR</emphasis> <quote>ements?</quote>. What is being done here, is an
+ attempt at matching as many variations of <quote>advertisement</quote>, and
+ similar, as possible. So this would expand to match just <quote>adv</quote>,
+ or <quote>advert</quote>, or <quote>adverts</quote>, or
+ <quote>advertising</quote>, or <quote>advertisement</quote>, or
+ <quote>advertisements</quote>. You get the idea. But it would not match
+ <quote>advertizements</quote> (with a <quote>z</quote>). We could fix that by
+ changing our regular expression to:
+ <quote>/.*/adv((er)?ts?|erti(s|z)(ing|ements?))?/</quote>, which would then match
+ either spelling.
+</para>
  
- Revision 1.101  2002/05/03 03:58:30  hal9
- Move the user-manual config directive to top of section. Add note about
- Privoxy needing read permissions for configs, and write for logs.
+<para>
+ <emphasis><literal>/.*/advert[0-9]+\.(gif|jpe?g)</literal></emphasis> - Again
+ another path statement with forward slashes. Anything in the square brackets
+ <quote>[ ]</quote> can be matched. This is using <quote>0-9</quote> as a
+ shorthand expression to mean any digit zero through nine. It is the same as
+ saying <quote>0123456789</quote>. So any digit matches. The <quote>+</quote>
+ means one or more of the preceding expression must be included. The preceding
+ expression here is what is in the square brackets -- in this case, any digit
+ zero through nine. Then, at the end, we have a grouping: <quote>(gif|jpe?g)</quote>.
+ This includes a <quote>|</quote>, so this needs to match the expression on
+ either side of that bar character also. A simple <quote>gif</quote> on one side, and the other
+ side will in turn match either <quote>jpeg</quote> or <quote>jpg</quote>,
+ since the <quote>?</quote> means the letter <quote>e</quote> is optional and
+ can be matched once or not at all. So we are building an expression here to
+ match a GIF or JPEG image file. It must include the literal
+ string <quote>advert</quote>, then one or more digits, and a <quote>.</quote>
+ (which is now a literal, and not a special character, since it is escaped
+ with <quote>\</quote>), and lastly either <quote>gif</quote>, or
+ <quote>jpeg</quote>, or <quote>jpg</quote>. Some possible matches would
+ include: <quote>//advert1.jpg</quote>,
+ <quote>/nasty/ads/advert1234.gif</quote>,
+ <quote>/banners/from/hell/advert99.jpg</quote>. It would not match
+ <quote>advert1.gif</quote> (no leading slash), or
+ <quote>/adverts232.jpg</quote> (the expression does not include an
+ <quote>s</quote>), or <quote>/advert1.jsp</quote> (<quote>jsp</quote> is not
+ in the expression anywhere).
+</para>
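+<para>
+ The three path patterns above can be tried out with any Perl-style regular
+ expression engine. The following is a minimal sketch in Python, not part of
+ <application>Privoxy</application>: the helper <literal>path_matches</literal>
+ is a hypothetical name, and anchoring the pattern at the start of the path
+ with <literal>re.match</literal> is an assumption about how path patterns
+ are applied.
+</para>
+<para>
+ <screen>
```python
import re

# Path patterns quoted in the text above.
BANNERS = r"/.*/banners/.*"
ADV = r"/.*/adv((er)?ts?|ertis(ing|ements?))?/"
ADV_Z = r"/.*/adv((er)?ts?|erti(s|z)(ing|ements?))?/"
ADVERT_IMG = r"/.*/advert[0-9]+\.(gif|jpe?g)"


def path_matches(pattern, path):
    """Return True if pattern matches starting at the beginning of path.

    Assumption: matching is anchored at the start of the path only.
    """
    return re.match(pattern, path) is not None


if __name__ == "__main__":
    samples = [
        (BANNERS, "/eye/hate/spammers/banners/annoy_me_please.gif"),
        (ADV, "/images/advertisements/"),
        (ADV, "/images/advertizements/"),
        (ADV_Z, "/images/advertizements/"),
        (ADVERT_IMG, "/nasty/ads/advert1234.gif"),
        (ADVERT_IMG, "/advert1.jsp"),
    ]
    for pattern, path in samples:
        print(pattern, path, path_matches(pattern, path))
```
+ </screen>
+</para>
+<para>
+ Running the sketch prints one line per sample, which makes it easy to check a
+ new pattern against a handful of paths before adding it to an actions file.
+</para>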
  
- Revision 1.100  2002/04/29 03:05:55  hal9
- Add clarification on differences of new actions files.
+<para>
+ We are barely scratching the surface of regular expressions here so that you
+ can understand the default <application>Privoxy</application>
+ configuration files, and maybe use this knowledge to customize your own
+ installation. There is much, much more that can be done with regular
+ expressions. Now that you know enough to get started, you can learn more on
+ your own :/
+</para>
  
- Revision 1.99  2002/04/28 16:59:05  swa
- more structure in starting section
+<para>
+ More reading on Perl Compatible Regular expressions:
+ <ulink url="http://perldoc.perl.org/perlre.html">http://perldoc.perl.org/perlre.html</ulink>
+</para>
  
- Revision 1.98  2002/04/28 05:43:59  hal9
- This is the break up of configuration.html into multiple files. This
- will probably break links elsewhere :(
+<para>
+ For information on regular expression based substitutions and their applications
+ in filters, please see the <link linkend="filter-file">filter file tutorial</link>
+ in this manual.
+</para>
+</sect2>
  
- Revision 1.97  2002/04/27 21:04:42  hal9
- -Rewrite of Actions File example.
- -Add section for user-manual directive in config.
+<!--  ~  End section  ~  -->
  
- Revision 1.96  2002/04/27 05:32:00  hal9
- -Add short section to Filter Files to tie in with +filter action.
- -Start rewrite of examples in Actions Examples (not finished).
  
- Revision 1.95  2002/04/26 17:23:29  swa
- bookmarks cleaned, changed structure of user manual, screen and programlisting cleanups, and numerous other changes that I forgot
+<!--   ~~~~~       New section      ~~~~~     -->
+<sect2>
+<title>Privoxy's Internal Pages</title>
  
- Revision 1.94  2002/04/26 05:24:36  hal9
- -Add most of Andreas suggestions to Chain of Events section.
- -A few other minor corrections and touch up.
+<para>
+ Since <application>Privoxy</application> proxies each requested
+ web page, it is easy for <application>Privoxy</application> to
+ trap certain special URLs. In this way, we can talk directly to
+ <application>Privoxy</application>, and see how it is
+ configured, see how our rules are being applied, change these
+ rules and other configuration options, and even turn
+ <application>Privoxy's</application> filtering off, all with
+ a web browser.
  
- Revision 1.92  2002/04/25 18:55:13  hal9
- More catchups on new actions files, and new actions names.
- Other assorted cleanups, and minor modifications.
+</para>
  
- Revision 1.91  2002/04/24 02:39:31  hal9
- Add 'Chain of Events' section.
+<para>
+ The URLs listed below are the special ones that allow direct access
+ to <application>Privoxy</application>. Of course,
+ <application>Privoxy</application> must be running to access these. If
+ not, you will get a friendly error message. Internet access is not
+ necessary either.
+</para>
  
- Revision 1.90  2002/04/23 21:41:25  hal9
- Linuxconf is deprecated on RH, substitute chkconfig.
+<para>
+ <itemizedlist>
  
- Revision 1.89  2002/04/23 21:05:28  oes
- Added hint for startup on Red Hat
+ <listitem>
+  <para>
+   Privoxy main page:
+  </para>
+  <blockquote>
+   <para>
+     <ulink url="http://config.privoxy.org/">http://config.privoxy.org/</ulink>
+   </para>
+  </blockquote>
+  <para>
+   There is a shortcut: <ulink url="http://p.p/">http://p.p/</ulink> (But it
+   doesn't provide a fall-back to a real page, in case the request is not
+   sent through <application>Privoxy</application>)
+  </para>
+ </listitem>
  
- Revision 1.88  2002/04/23 05:37:54  hal9
- Add AmigaOS install stuff.
+ <listitem>
+  <para>
+    Show information about the current configuration, including viewing and
+    editing of actions files:
+  </para>
+   <blockquote>
+   <para>
+    <ulink url="http://config.privoxy.org/show-status">http://config.privoxy.org/show-status</ulink>
+   </para>
+  </blockquote>
+ </listitem>
  
- Revision 1.87  2002/04/23 02:53:15  david__schmidt
- Updated Mac OS X installation section
- Added a few English tweaks here an there
+ <listitem>
+  <para>
+    Show the source code version numbers:
+  </para>
+  <blockquote>
+   <para>
+    <ulink url="http://config.privoxy.org/show-version">http://config.privoxy.org/show-version</ulink>
+   </para>
+  </blockquote>
+ </listitem>
  
- Revision 1.86  2002/04/21 01:46:32  hal9
- Re-write actions section.
+ <listitem>
+  <para>
+   Show the browser's request headers:
+  </para>
+  <blockquote>
+   <para>
+    <ulink url="http://config.privoxy.org/show-request">http://config.privoxy.org/show-request</ulink>
+   </para>
+  </blockquote>
+ </listitem>
  
- Revision 1.85  2002/04/18 21:23:23  hal9
- Fix ugly typo (mine).
+ <listitem>
+  <para>
+   Show which actions apply to a URL and why:
+  </para>
+   <blockquote>
+   <para>
+    <ulink url="http://config.privoxy.org/show-url-info">http://config.privoxy.org/show-url-info</ulink>
+   </para>
+  </blockquote>
+ </listitem>
  
- Revision 1.84  2002/04/18 21:17:13  hal9
- Spell Redhat correctly (ie Red Hat). A few minor grammar corrections.
+ <listitem>
+  <para>
+   Toggle Privoxy on or off. This feature can be turned off/on in the main
+   <filename>config</filename> file. When toggled <quote>off</quote>, <application>Privoxy</application>
+   continues to run, but only as a pass-through proxy, with no actions taking
+   place:
+  </para>
+   <blockquote>
+   <para>
+    <ulink url="http://config.privoxy.org/toggle">http://config.privoxy.org/toggle</ulink>
+   </para>
+  </blockquote>
+  <para>
+   Shortcuts. Turn off, then on:
+  </para>
+   <blockquote>
+   <para>
+     <ulink url="http://config.privoxy.org/toggle?set=disable">http://config.privoxy.org/toggle?set=disable</ulink>
+   </para>
+  </blockquote>
+   <blockquote>
+   <para>
+     <ulink url="http://config.privoxy.org/toggle?set=enable">http://config.privoxy.org/toggle?set=enable</ulink>
+   </para>
+  </blockquote>
+ </listitem>
  
- Revision 1.83  2002/04/18 18:21:12  oes
- Added RPM install detail
+ </itemizedlist>
+</para>
  
- Revision 1.82  2002/04/18 12:04:50  oes
- Cosmetics
+<para>
+ These may be bookmarked for quick reference. See next.
  
- Revision 1.81  2002/04/18 11:50:24  oes
- Extended Install section - needs fixing by packagers
+</para>
  
- Revision 1.80  2002/04/18 10:45:19  oes
- Moved text to buildsource.sgml, renamed some filters, details
+<sect3 id="bookmarklets">
+<title>Bookmarklets</title>
+<para>
+ Below are some <quote>bookmarklets</quote> to allow you to easily access a
+ <quote>mini</quote> version of some of <application>Privoxy's</application>
+ special pages. They are designed for MS Internet Explorer, but should work
+ equally well in Netscape, Mozilla, and other browsers which support
+ JavaScript. They are designed to run directly from your bookmarks - not by
+ clicking the links below (although that should work for testing).
+</para>
+<para>
+ To save them, right-click the link and choose <quote>Add to Favorites</quote>
+ (IE) or <quote>Add Bookmark</quote> (Netscape). You will get a warning that
+ the bookmark <quote>may not be safe</quote> - just click OK. Then you can run the
+ Bookmarklet directly from your favorites/bookmarks. For even faster access,
+ you can put them on the <quote>Links</quote> bar (IE) or the <quote>Personal
+ Toolbar</quote> (Netscape), and run them with a single click.
+</para>
  
- Revision 1.79  2002/04/18 03:18:06  hal9
- Spellcheck, and minor touchups.
+<para>
+ <itemizedlist>
  
- Revision 1.78  2002/04/17 18:04:16  oes
- Proofreading part 2
+  <listitem>
+   <para>
+    <ulink
+    url="javascript:void(window.open('http://config.privoxy.org/toggle?mini=y&#38;set=enabled','ijbstatus','width=250,height=100,resizable=yes,scrollbars=no,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());">Privoxy - Enable</ulink>
+   </para>
+  </listitem>
  
- Revision 1.77  2002/04/17 13:51:23  oes
- Proofreading, part one
+  <listitem>
+   <para>
+    <ulink
+    url="javascript:void(window.open('http://config.privoxy.org/toggle?mini=y&#38;set=disabled','ijbstatus','width=250,height=100,resizable=yes,scrollbars=no,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());">Privoxy - Disable</ulink>
+   </para>
+  </listitem>
  
- Revision 1.76  2002/04/16 04:25:51  hal9
- -Added 'Note to Upgraders' and re-ordered the 'Quickstart' section.
- -Note about proxy may need requests to re-read config files.
+  <listitem>
+   <para>
+    <ulink
+    url="javascript:void(window.open('http://config.privoxy.org/toggle?mini=y&#38;set=toggle','ijbstatus','width=250,height=100,resizable=yes,scrollbars=no,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());">Privoxy - Toggle Privoxy</ulink> (Toggles between enabled and disabled)
+   </para>
+  </listitem>
  
- Revision 1.75  2002/04/12 02:08:48  david__schmidt
- Remove OS/2 building info... it is already in the developer-manual
+  <listitem>
+   <para>
+    <ulink
+    url="javascript:void(window.open('http://config.privoxy.org/toggle?mini=y','ijbstatus','width=250,height=2,resizable=yes,scrollbars=no,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());">Privoxy - View Status</ulink>
+   </para>
+  </listitem>
+<!--
+  <listitem>
+   <para>
+    <ulink url="javascript:w=Math.floor(screen.width/2);h=Math.floor(screen.height*0.9);void(window.open('http://www.privoxy.org/actions/index.php?url='+escape(location.href),'Feedback','screenx='+w+',width='+w+',height='+h+',scrollbars=yes,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());">Privoxy - Submit Actions File Feedback</ulink>
+   </para>
+  </listitem>
+ -->
+  <listitem>
+   <para>
+    <ulink url="javascript:void(window.open('http://config.privoxy.org/show-url-info?url='+escape(location.href),'Why').focus());">Privoxy - Why?</ulink>
+   </para>
+  </listitem>
+ </itemizedlist>
+</para>
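+<para>
+ The <quote>Privoxy - Why?</quote> bookmarklet works by appending the address
+ of the current page to the <literal>show-url-info</literal> URL. The same
+ address can be built outside the browser. The following is a minimal sketch
+ in Python; <literal>why_url</literal> is a hypothetical helper name, and
+ percent-encoding with <literal>urllib.parse.quote</literal> is an assumption
+ that stands in for the JavaScript <literal>escape()</literal> call, which
+ encodes a slightly different set of characters.
+</para>
+<para>
+ <screen>
```python
from urllib.parse import quote

# Address of the internal page, as listed in this section.
SHOW_URL_INFO = "http://config.privoxy.org/show-url-info"


def why_url(page_url):
    """Build the show-url-info address for page_url, percent-encoding it."""
    return SHOW_URL_INFO + "?url=" + quote(page_url, safe="")


if __name__ == "__main__":
    print(why_url("http://www.google.com/"))
```
+ </screen>
+</para>
+<para>
+ The resulting address only produces a useful answer when it is requested
+ through <application>Privoxy</application>.
+</para>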
  
- Revision 1.74  2002/04/11 00:54:38  hal9
- Add small section on submitting actions.
+<para>
+ Credit: The site which gave us the general idea for these bookmarklets is
+ <ulink url="http://www.bookmarklets.com/">www.bookmarklets.com</ulink>. They
+ have more information about bookmarklets.
+</para>
  
- Revision 1.73  2002/04/10 18:45:15  swa
- generated
  
- Revision 1.72  2002/04/10 04:06:19  hal9
- Added actions feedback  to Bookmarklets section
+</sect3>
  
- Revision 1.71  2002/04/08 22:59:26  hal9
- Version update. Spell chkconfig correctly :)
+</sect2>
  
- Revision 1.70  2002/04/08 20:53:56  swa
- ?
  
- Revision 1.69  2002/04/06 05:07:29  hal9
- -Add privoxy-man-page.sgml, for man page.
- -Add authors.sgml for AUTHORS (and p-authors.sgml)
- -Reworked various aspects of various docs.
- -Added additional comments to sub-docs.
+<!--   ~~~~~       New section      ~~~~~     -->
+<sect2 id="chain">
+<title>Chain of Events</title>
+<para>
+ Let's take a quick look at how some of <application>Privoxy's</application>
+ core features are triggered, and the ensuing sequence of events when a web
+ page is requested by your browser:
+</para>
  
- Revision 1.68  2002/04/04 18:46:47  swa
- consistent look. reuse of copyright, history et. al.
+<para>
+ <itemizedlist>
+ <listitem>
+  <para>
+   First, your web browser requests a web page. The browser knows to send
+   the request to <application>Privoxy</application>, which will in turn,
+   relay the request to the remote web server after passing the following
+   tests:
+  </para>
+ </listitem>
+ <listitem>
+  <para>
+   <application>Privoxy</application> traps any request for its own internal CGI
+   pages (e.g. <ulink url="http://p.p/">http://p.p/</ulink>) and sends the CGI page back to the browser.
+  </para>
+ </listitem>
+ <listitem>
+  <para>
+   Next, <application>Privoxy</application> checks to see if the URL
+   matches any <link
+   linkend="BLOCK"><quote>+block</quote></link> patterns. If
+   so, the URL is then blocked, and the remote web server will not be contacted.
+   <link linkend="HANDLE-AS-IMAGE"><quote>+handle-as-image</quote></link>
+   and
+   <link linkend="HANDLE-AS-EMPTY-DOCUMENT"><quote>+handle-as-empty-document</quote></link>
+   are then checked, and if there is no match, an
+   HTML <quote>BLOCKED</quote> page is sent back to the browser. Otherwise, if
+   it does match, an image is returned for the former, and an empty text
+   document for the latter. The type of image would depend on the setting of
+   <link linkend="SET-IMAGE-BLOCKER"><quote>+set-image-blocker</quote></link>
+   (blank, checkerboard pattern, or an HTTP redirect to an image elsewhere).
+  </para>
+ </listitem>
+ <listitem>
+  <para>
+   Untrusted URLs are blocked. If URLs are being added to the
+   <filename>trust</filename> file, then that is done.
+  </para>
+ </listitem>
+ <listitem>
+  <para>
+   If the URL pattern matches the <link
+   linkend="FAST-REDIRECTS"><quote>+fast-redirects</quote></link> action,
+   it is then processed. Unwanted parts of the requested URL are stripped.
+  </para>
+ </listitem>
+ <listitem>
+  <para>
+   Now the rest of the client browser's request headers are processed. If any
+   of these match any of the relevant actions (e.g. <link
+   linkend="HIDE-USER-AGENT"><quote>+hide-user-agent</quote></link>,
+   etc.), headers are suppressed or forged as determined by these actions and
+   their parameters.
+  </para>
+ </listitem>
+ <listitem>
+  <para>
+   Now the web server starts sending its response back (i.e. typically a web
+   page).
+  </para>
+ </listitem>
+ <listitem>
+  <para>
+   First, the server headers are read and processed to determine, among other
+   things, the MIME type (document type) and encoding. The headers are then
+   filtered as determined by the
+   <link linkend="CRUNCH-INCOMING-COOKIES"><quote>+crunch-incoming-cookies</quote></link>,
+   <link linkend="SESSION-COOKIES-ONLY"><quote>+session-cookies-only</quote></link>,
+   and <link linkend="DOWNGRADE-HTTP-VERSION"><quote>+downgrade-http-version</quote></link>
+   actions.
+  </para>
+ </listitem>
+ <listitem>
+  <para>
+   If any <link linkend="FILTER"><quote>+filter</quote></link> action
+   or <link
+   linkend="DEANIMATE-GIFS"><quote>+deanimate-gifs</quote></link>
+   action applies (and the document type fits the action), the rest of the page is
+   read into memory (up to a configurable limit). Then the filter rules (from
+   <filename>default.filter</filename> and any other filter files) are
+   processed against the buffered content. Filters are applied in the order
+   they are specified in one of the filter files. Animated GIFs, if present,
+   are reduced to either the first or last frame, depending on the action
+   setting. The entire page, which is now filtered, is then sent by
+   <application>Privoxy</application> back to your browser.
+  </para>
+  <para>
+   If neither a <link linkend="FILTER"><quote>+filter</quote></link> action
+   nor <link
+   linkend="DEANIMATE-GIFS"><quote>+deanimate-gifs</quote></link>
+   matches, then <application>Privoxy</application> passes the raw data through
+   to the client browser as it becomes available.
+  </para>
+ </listitem>
+ <listitem>
+  <para>
+   As the browser receives the (now possibly filtered) page content, it
+   reads and then requests any URLs that may be embedded within the page
+   source, e.g. ad images, stylesheets, JavaScript, other HTML documents (e.g.
+   frames), sounds, etc. For each of these objects, the browser issues a
+   separate request (this is easily viewable in <application>Privoxy's</application>
+   logs). And each such request is in turn processed just as above. Note that a
+   complex web page will have many, many such embedded URLs. If these
+   secondary requests are to a different server, then quite possibly a very
+   differing set of actions is triggered.
+  </para>
+ </listitem>
  
- Revision 1.67  2002/04/04 17:27:57  swa
- more single file to be included at multiple points. make maintaining easier
+ </itemizedlist>
+</para>
+<para>
+ NOTE: This is a somewhat simplistic overview of what happens with each URL
+ request. For the sake of brevity and simplicity, we have focused on
+ <application>Privoxy's</application> core features only.
+</para>
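+<para>
+ The blocking step in the list above can be summarized as a small decision
+ function. The following Python sketch is only an illustration of the prose,
+ not <application>Privoxy's</application> implementation:
+ <literal>blocked_response</literal> and its return values are hypothetical
+ names, and checking <quote>handle-as-image</quote> before
+ <quote>handle-as-empty-document</quote> when both are set is an assumption.
+</para>
+<para>
+ <screen>
```python
def blocked_response(actions):
    """Pick the reply for a request, given the set of enabled action names.

    Mirrors the prose: a request that is not blocked is forwarded; a blocked
    request gets an image, an empty document, or the HTML BLOCKED page.
    """
    if "block" not in actions:
        return "forward"
    if "handle-as-image" in actions:
        # Assumption: the image check takes precedence if both are enabled.
        return "image"
    if "handle-as-empty-document" in actions:
        return "empty-document"
    return "blocked-page"


if __name__ == "__main__":
    print(blocked_response({"block", "handle-as-image"}))
    print(blocked_response({"block"}))
    print(blocked_response({"deanimate-gifs"}))
```
+ </screen>
+</para>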
  
- Revision 1.66  2002/04/04 06:48:37  hal9
- Structural changes to allow for conditional inclusion/exclusion of content
- based on entity toggles, e.g. 'entity % p-not-stable  "INCLUDE"'. And
- definition of internal entities, e.g. 'entity p-version "2.9.13"' that will
- eventually be set by Makefile.
- More boilerplate text for use across multiple docs.
+</sect2>
  
- Revision 1.65  2002/04/03 19:52:07  swa
- enhance squid section due to user suggestion
  
- Revision 1.64  2002/04/03 03:53:43  hal9
- A few minor bug fixes, and touch ups. Ready for review.
+<!--   ~~~~~       New section      ~~~~~     -->
+<sect2 id="actionsanat">
+<title>Troubleshooting: Anatomy of an Action</title>
  
- Revision 1.63  2002/04/01 16:24:49  hal9
- Define entities to include boilerplate text. See doc/source/*.
+<para>
+ The way <application>Privoxy</application> applies
+ <link linkend="ACTIONS">actions</link> and <link linkend="FILTER">filters</link>
+ to any given URL can be complex, and it is not always easy to understand
+ what is happening. Sometimes we need to be able to
+ <emphasis>see</emphasis> just what <application>Privoxy</application> is
+ doing, especially if something <application>Privoxy</application> is doing
+ is inadvertently causing us a problem. It can be a little daunting to look at
+ the actions and filters files themselves, since they tend to be filled with
+ <link linkend="regex">regular expressions</link> whose consequences are not
+ always so obvious.
+</para>
  
- Revision 1.62  2002/03/30 04:15:53  hal9
- - Fix privoxy.org/config links.
- - Paste in Bookmarklets from Toggle page.
- - Move Quickstart nearer top, and minor rework.
+<para>
+ One quick test to see whether <application>Privoxy</application> is causing a problem
+ is to disable it temporarily. This should be the first troubleshooting
+ step. See <link linkend="bookmarklets">the Bookmarklets</link> section for a quick
+ and easy way to do this (be sure to flush caches afterward!). Looking at the
+ logs is a good idea too. (Note that both the toggle feature and logging are
+ enabled via <filename>config</filename> file settings, and may need to be
+ turned <quote>on</quote>.)
+</para>
+<para>
+ Another easy troubleshooting step, if you have customized your
+ installation, is to revert to the installed
+ defaults and see if that helps. The developers sometimes get complaints
+ about one thing or another when the problem is really caused by a customized
+ configuration.
+</para>
  
- Revision 1.61  2002/03/29 01:31:08  hal9
- Minor update.
+<para>
+ <application>Privoxy</application> also provides the
+ <ulink url="http://config.privoxy.org/show-url-info">http://config.privoxy.org/show-url-info</ulink>
+ page that can show us very specifically how <application>actions</application>
+ are being applied to any given URL. This is a big help for troubleshooting.
+</para>
  
- Revision 1.60  2002/03/27 01:57:34  hal9
- Added more to Anatomy section.
+<para>
+ First, enter one URL (or partial URL) at the prompt, and then
+ <application>Privoxy</application> will tell us
+ how the current configuration will handle it. This will not
+ help with filtering effects (i.e. the <link
+ linkend="FILTER"><quote>+filter</quote></link> action) from
+ one of the filter files since this is handled very
+ differently and is not so easy to trap! It also will not tell you about any other
+ URLs that may be embedded within the URL you are testing. For instance, images
+ such as ads are expressed as URLs within the raw page source of HTML pages. So
+ you will only get info for the actual URL that is pasted into the prompt area
+ -- not any sub-URLs. If you want to know about embedded URLs like ads, you
+ will have to dig those out of the HTML source. Use your browser's <quote>View
+ Page Source</quote> option for this. Or right click on the ad, and grab the
+ URL.
+</para>
  
- Revision 1.59  2002/03/27 00:54:33  hal9
- Touch up intro for new name.
+<para>
+ Let's try an example, <ulink url="http://google.com">google.com</ulink>,
+ and look at it one section at a time in a sample configuration (your real
+ configuration may vary):
+</para>
  
- Revision 1.58  2002/03/26 22:29:55  swa
- we have a new homepage!
+<para>
+ <screen>
+ Matches for http://www.google.com:
  
- Revision 1.57  2002/03/24 20:33:30  hal9
- A few minor catch ups with name change.
+ In file: default.action <guibutton>[ View ]</guibutton> <guibutton>[ Edit ]</guibutton>
  
- Revision 1.56  2002/03/24 16:17:06  swa
- configure needs to be generated.
+ {+change-x-forwarded-for{block}
+ +deanimate-gifs {last}
+ +fast-redirects {check-decoded-url}
+ +filter {refresh-tags}
+ +filter {img-reorder}
+ +filter {banners-by-size}
+ +filter {webbugs}
+ +filter {jumping-windows}
+ +filter {ie-exploits}
+ +hide-from-header {block}
+ +hide-referrer {forge}
+ +session-cookies-only
+ +set-image-blocker {pattern}
+/
  
- Revision 1.55  2002/03/24 16:08:08  swa
- we are too lazy to make a block-built
- privoxy logo. hence removed the option.
+ { -session-cookies-only }
+ .google.com
  
- Revision 1.54  2002/03/24 15:46:20  swa
- name change related issue.
+ { -fast-redirects }
+ .google.com
  
- Revision 1.53  2002/03/24 11:51:00  swa
- name change. changed filenames.
+In file: user.action <guibutton>[ View ]</guibutton> <guibutton>[ Edit ]</guibutton>
+(no matches in this file)
+</screen>
+</para>
  
- Revision 1.52  2002/03/24 11:01:06  swa
- name change
+<para>
+ This is telling us how we have defined our
+ <link linkend="ACTIONS"><quote>actions</quote></link>, and
+ which ones match for our test case, <quote>google.com</quote>.
+ Displayed are all the actions that are available to us. Remember,
+ the <literal>+</literal> sign denotes <quote>on</quote>. <literal>-</literal>
+ denotes <quote>off</quote>. So some are <quote>on</quote> here, but many
+ are <quote>off</quote>. Each example we try may provide a slightly different
+ end result, depending on our configuration directives.
+</para>
+<para>
+ The first listing
+  is for our <filename>default.action</filename> file. The large, multi-line
+  listing, is how the actions are set to match for all URLs, i.e. our default
+  settings. If you look at your <quote>actions</quote> file, this would be the
+  section just below the <quote>aliases</quote> section near the top. This
+  will apply to all URLs as signified by the single forward slash at the end
+  of the listing -- <quote> / </quote>.
+</para>
  
- Revision 1.51  2002/03/23 15:13:11  swa
- renamed every reference to the old name with foobar.
- fixed "application foobar application" tag, fixed
- "the foobar" with "foobar". left junkbustser in cvs
- comments and remarks to history untouched.
+<para>
+ But we have defined additional actions that would be exceptions to these general
+ rules, and then we list specific URLs (or patterns) that these exceptions
+ would apply to. Last match wins. Just below this then are two explicit
+ matches for <quote>.google.com</quote>. The first is negating our previous
+ cookie setting, which was for <link
+ linkend="SESSION-COOKIES-ONLY"><quote>+session-cookies-only</quote></link>
+ (i.e. not persistent). So we will allow persistent cookies for google, at
+ least that is how it is in this example. The second turns
+ <emphasis>off</emphasis> any <link
+ linkend="FAST-REDIRECTS"><quote>+fast-redirects</quote></link>
+ action, allowing this to take place unmolested. Note that there is a leading
+ dot here -- <quote>.google.com</quote>. This will also match any hosts and
+ sub-domains in the google.com domain, such as
+ <quote>www.google.com</quote> or <quote>mail.google.com</quote>. But it would not
+ match <quote>www.google.de</quote>! So, apparently, we have these two actions
+ defined as exceptions to the general rules at the top somewhere in the lower
+ part of our <filename>default.action</filename> file, and
+ <quote>google.com</quote> is referenced somewhere in these latter sections.
+</para>
  
- Revision 1.50  2002/03/23 05:06:21  hal9
- Touch up.
+<para>
+ Then, for our <filename>user.action</filename> file, we again have no hits.
+ So there is nothing google-specific that we might have added to our own, local
+ configuration. If there was, those actions would over-rule any actions from
+ previously processed files, such as <filename>default.action</filename>.
+ <filename>user.action</filename> typically has the last word. This is the
+ best place to put hard and fast exceptions.
+</para>
  
- Revision 1.49  2002/03/21 17:01:05  hal9
- New section in Appendix.
+<para>
+ And finally we pull it all together in the bottom section and summarize how
+ <application>Privoxy</application> is applying all its <quote>actions</quote>
+ to <quote>google.com</quote>:
  
- Revision 1.48  2002/03/12 06:33:01  hal9
- Catching up to Andreas and re_filterfile changes.
+</para>
  
- Revision 1.47  2002/03/11 13:13:27  swa
- correct feedback channels
+<para>
+ <screen>
  
- Revision 1.46  2002/03/10 00:51:08  hal9
- Added section on JB internal pages in Appendix.
+ Final results:
  
- Revision 1.45  2002/03/09 17:43:53  swa
- more distros
+ -add-header
+ -block
+ +change-x-forwarded-for{block}
+ -client-header-filter{hide-tor-exit-notation}
+ -content-type-overwrite
+ -crunch-client-header
+ -crunch-if-none-match
+ -crunch-incoming-cookies
+ -crunch-outgoing-cookies
+ -crunch-server-header
+ +deanimate-gifs {last}
+ -downgrade-http-version
+ -fast-redirects
+ -filter {js-events}
+ -filter {content-cookies}
+ -filter {all-popups}
+ -filter {banners-by-link}
+ -filter {tiny-textforms}
+ -filter {frameset-borders}
+ -filter {demoronizer}
+ -filter {shockwave-flash}
+ -filter {quicktime-kioskmode}
+ -filter {fun}
+ -filter {crude-parental}
+ -filter {site-specifics}
+ -filter {js-annoyances}
+ -filter {html-annoyances}
+ +filter {refresh-tags}
+ -filter {unsolicited-popups}
+ +filter {img-reorder}
+ +filter {banners-by-size}
+ +filter {webbugs}
+ +filter {jumping-windows}
+ +filter {ie-exploits}
+ -filter {google}
+ -filter {yahoo}
+ -filter {msn}
+ -filter {blogspot}
+ -filter {no-ping}
+ -force-text-mode
+ -handle-as-empty-document
+ -handle-as-image
+ -hide-accept-language
+ -hide-content-disposition
+ +hide-from-header {block}
+ -hide-if-modified-since
+ +hide-referrer {forge}
+ -hide-user-agent
+ -limit-connect
+ -overwrite-last-modified
+ -prevent-compression
+ -redirect
+ -server-header-filter{xml-to-html}
+ -server-header-filter{html-to-xml}
+ -session-cookies-only
+ +set-image-blocker {pattern} </screen>
+</para>
  
- Revision 1.44  2002/03/09 17:08:48  hal9
- New section on Jon's actions file editor, and move some stuff around.
+<para>
+ Notice that the only differences from the previous listing are in
+ <quote>fast-redirects</quote> and <quote>session-cookies-only</quote>,
+ which are set specifically for this site in our configuration,
+ and thus show up in the <quote>Final results</quote>.
+</para>
  
- Revision 1.43  2002/03/08 00:47:32  hal9
- Added imageblock{pattern}.
+<para>
+ Now another example, <quote>ad.doubleclick.net</quote>:
+</para>
  
- Revision 1.42  2002/03/07 18:16:55  swa
- looks better
+<para>
+ <screen>
  
- Revision 1.41  2002/03/07 16:46:43  hal9
- Fix a few markup problems for jade.
+ { +block{Domain starts with "ad"} }
+  ad*.
  
- Revision 1.40  2002/03/07 16:28:39  swa
- provide correct feedback channels
+ { +block{Domain contains "ad"} }
+  .ad.
  
- Revision 1.39  2002/03/06 16:19:28  hal9
- Note on perceived filtering slowdown per FR.
+ { +block{Doubleclick banner server} +handle-as-image }
+  .[a-vx-z]*.doubleclick.net
+</screen>
+</para>
  
- Revision 1.38  2002/03/05 23:55:14  hal9
- Stupid I did it again. Double hyphen in comment breaks jade.
+<para>
+ We'll just show the interesting part here: the explicit matches. The URL is
+ matched three different times: by two <quote>+block{}</quote> sections,
+ and by a <quote>+block{} +handle-as-image</quote> section,
+ which is the expanded form of one of our aliases that had been defined as
+ <quote>+block-as-image</quote>. (<link
+ linkend="ALIASES"><quote>Aliases</quote></link> are defined in
+ the first section of the actions file and typically used to combine more
+ than one action.)
+</para>
  
- Revision 1.37  2002/03/05 23:53:49  hal9
- jade barfs on '- -' embedded in comments. - -user option broke it.
+<para>
+ Any one of these would have done the trick and blocked this as an unwanted
+ image. This is redundant, since the last case effectively
+ would also cover the first. No point in taking chances with these guys
+ though ;-) Note that if you want an ad or obnoxious
+ URL to be invisible, it should be defined as <quote>ad.doubleclick.net</quote>
+ is done here: as both a <link
+ linkend="BLOCK"><quote>+block{}</quote></link>
+ <emphasis>and</emphasis> an
+ <link linkend="HANDLE-AS-IMAGE"><quote>+handle-as-image</quote></link>.
+ The custom alias <quote><literal>+block-as-image</literal></quote> just
+ simplifies the process and makes it more readable.
+</para>
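+<para>
+ As a rough sketch of how such an alias could be declared (the wording of the
+ block reason is only an assumption here, so check the alias section of your
+ own actions file for the real definition):
+</para>
+<para>
+ <screen>
+ {{alias}}
+ +block-as-image = +block{Blocked image.} +handle-as-image
+</screen>
+</para>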
  
- Revision 1.36  2002/03/05 22:53:28  hal9
- Add new - - user option.
+<para>
+ One last example. Let's try <quote>http://www.example.net/adsl/HOWTO/</quote>.
+ This one is giving us problems. We are getting a blank page. Hmmm ...
+</para>
  
- Revision 1.35  2002/03/05 00:17:27  hal9
- Added section on command line options.
+<para>
+ <screen>
  
- Revision 1.34  2002/03/04 19:32:07  oes
- Changed default port to 8118
+ Matches for http://www.example.net/adsl/HOWTO/:
  
- Revision 1.33  2002/03/03 19:46:13  hal9
- Emphasis on where/how to report bugs, etc
+ In file: default.action <guibutton>[ View ]</guibutton> <guibutton>[ Edit ]</guibutton>
  
- Revision 1.32  2002/03/03 09:26:06  joergs
- AmigaOS changes, config is now loaded from PROGDIR: instead of
- AmiTCP:db/junkbuster/ if no configuration file is specified on the
- command line.
+ {-add-header
+  -block
+  +change-x-forwarded-for{block}
+  -client-header-filter{hide-tor-exit-notation}
+  -content-type-overwrite
+  -crunch-client-header
+  -crunch-if-none-match
+  -crunch-incoming-cookies
+  -crunch-outgoing-cookies
+  -crunch-server-header
+  +deanimate-gifs
+  -downgrade-http-version
+  +fast-redirects {check-decoded-url}
+  -filter {js-events}
+  -filter {content-cookies}
+  -filter {all-popups}
+  -filter {banners-by-link}
+  -filter {tiny-textforms}
+  -filter {frameset-borders}
+  -filter {demoronizer}
+  -filter {shockwave-flash}
+  -filter {quicktime-kioskmode}
+  -filter {fun}
+  -filter {crude-parental}
+  -filter {site-specifics}
+  -filter {js-annoyances}
+  -filter {html-annoyances}
+  +filter {refresh-tags}
+  -filter {unsolicited-popups}
+  +filter {img-reorder}
+  +filter {banners-by-size}
+  +filter {webbugs}
+  +filter {jumping-windows}
+  +filter {ie-exploits}
+  -filter {google}
+  -filter {yahoo}
+  -filter {msn}
+  -filter {blogspot}
+  -filter {no-ping}
+  -force-text-mode
+  -handle-as-empty-document
+  -handle-as-image
+  -hide-accept-language
+  -hide-content-disposition
+  +hide-from-header{block}
+  +hide-referer{forge}
+  -hide-user-agent
+  -overwrite-last-modified
+  +prevent-compression
+  -redirect
+  -server-header-filter{xml-to-html}
+  -server-header-filter{html-to-xml}
+  +session-cookies-only
+  +set-image-blocker{blank} }
+   /
  
- Revision 1.31  2002/03/02 22:45:52  david__schmidt
- Just tweaking
+ { +block{Path contains "ads".} +handle-as-image }
+  /ads
+</screen>
+</para>
  
- Revision 1.30  2002/03/02 22:00:14  hal9
- Updated 'New Features' list. Ran through spell-checker.
+<para>
+ Oops, the <quote>/adsl/</quote> is matching <quote>/ads</quote> in our
+ configuration! But we did not want this at all! Now we see why we get the
+ blank page. It is actually triggering two different actions here, and
+ the effects are aggregated so that the URL is blocked, and &my-app; is told
+ to treat the block as if it were an image. But this is, of course, all wrong.
+ We could now add a new action below this (or better in our own
+ <filename>user.action</filename> file) that explicitly
+ <emphasis>un</emphasis>blocks
+ (<link linkend="BLOCK"><quote>{-block}</quote></link>) paths with
+ <quote>adsl</quote> in them (remember, last match in the configuration
+ wins). There are various ways to handle such exceptions. Example:
+</para>
  
- Revision 1.29  2002/03/02 20:34:07  david__schmidt
- Update OS/2 build section
+<para>
+ <screen>
  
- Revision 1.28  2002/02/24 14:34:24  jongfoster
- Formatting changes.  Now changing the doctype to DocBook XML 4.1
- will work - no other changes are needed.
+ { -block }
+  /adsl
+</screen>
+</para>
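+<para>
+ Another possible way to write the exception, sketched here under the
+ assumption that only this one site needs it, is to restrict the pattern to
+ the host and to switch off <quote>handle-as-image</quote> as well:
+</para>
+<para>
+ <screen>
+ { -block -handle-as-image }
+  www.example.net/adsl
+</screen>
+</para>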
  
- Revision 1.27  2002/01/11 14:14:32  hal9
- Added a very short section on Templates
+<para>
+ Now the page displays ;-)
+ Remember to flush your browser's caches when making these kinds of changes to
+ your configuration to ensure that you get a freshly delivered page! Or, try
+ using <literal>Shift+Reload</literal>.
+</para>
  
- Revision 1.26  2002/01/09 20:02:50  hal9
- Fix bug re: auto-detect config file changes.
+<para>
+ But now what about a situation where we get no explicit matches like
+ we did with:
+</para>
  
- Revision 1.25  2002/01/09 18:20:30  hal9
- Touch ups for *.action files.
+<para>
+ <screen>
  
- Revision 1.24  2001/12/02 01:13:42  hal9
- Fix typo.
+ { +block{Path contains "ads".} +handle-as-image }
+  /ads
+</screen>
+</para>
  
- Revision 1.23  2001/12/02 00:20:41  hal9
- Updates for recent changes.
+<para>
+ That actually was very helpful and pointed us quickly to where the problem
+ was. If you don't get this kind of match, then it means one of the default
+ rules in the first section of <filename>default.action</filename> is causing
+ the problem. This would require some guesswork, and maybe a little trial and
+ error to isolate the offending rule. One likely cause would be one of the
+ <link linkend="FILTER"><quote>+filter</quote></link> actions.
+ These tend to be harder to troubleshoot.
+ Try adding the URL for the site to one of the aliases that turn off
+ <link linkend="FILTER"><quote>+filter</quote></link>:
+</para>
  
- Revision 1.22  2001/11/05 23:57:51  hal9
- Minor update for startup now daemon mode.
+<para>
+ <screen>
  
- Revision 1.21  2001/10/31 21:11:03  hal9
- Correct 2 minor errors
+ { shop }
+ .quietpc.com
+ .worldpay.com   # for quietpc.com
+ .jungle.com
+ .scan.co.uk
+ .forbes.com
+</screen>
+</para>
  
- Revision 1.18  2001/10/24 18:45:26  hal9
- *** empty log message ***
+<para>
+ <quote><literal>{ shop }</literal></quote> is an <quote>alias</quote> that expands to
+ <quote><literal>{ -filter -session-cookies-only }</literal></quote>.
+ Or you could write your own exception to negate filtering:
  
- Revision 1.17  2001/10/24 17:10:55  hal9
- Catching up with Jon's recent work, and a few other things.
+</para>
  
- Revision 1.16  2001/10/21 17:19:21  swa
- wrong url in documentation
+<para>
+ <screen>
  
- Revision 1.15  2001/10/14 23:46:24  hal9
- Various minor changes. Fleshed out SEE ALSO section.
+ { -filter }
+ # Disable ALL filter actions for sites in this section
+ .forbes.com
+ developer.ibm.com
+ localhost
+</screen>
+</para>
  
- Revision 1.13  2001/10/10 17:28:33  hal9
- Very minor changes.
+<para>
+ This would turn off all filtering for these sites. This is best
+ put in <filename>user.action</filename>, for local site
+ exceptions. Note that when a simple domain pattern is used by itself (without
+ the subsequent path portion), all sub-pages within that domain are included
+ automatically in the scope of the action.
+</para>
  
- Revision 1.12  2001/09/28 02:57:04  hal9
- Ditto :/
+<para>
+ Images that are inexplicably being blocked may well be hitting the
+<link linkend="FILTER-BANNERS-BY-SIZE"><quote>+filter{banners-by-size}</quote></link>
+ rule, which assumes
+ that images of certain sizes are ad banners (works well
+ <emphasis>most of the time</emphasis> since these tend to be standardized).
+</para>
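+<para>
+ If that turns out to be the cause, it may be enough to switch off just this
+ one filter for the affected site rather than all filtering. A minimal
+ sketch, where <quote>images.example.org</quote> is only a placeholder host:
+</para>
+<para>
+ <screen>
+ { -filter{banners-by-size} }
+ images.example.org
+</screen>
+</para>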
  
- Revision 1.11  2001/09/28 02:25:20  hal9
- Ditto.
+<para>
+ <quote><literal>{ fragile }</literal></quote> is an alias that disables most
+ of the actions that are likely to cause trouble. This can be used as a
+ last resort for problem sites.
+</para>
+<para>
+ <screen>
  
- Revision 1.9  2001/09/27 23:50:29  hal9
- A few changes. A short section on regular expression in appendix.
+ { fragile }
+ # Handle with care: easy to break
+ mail.google.
+ mybank.example.com</screen>
+</para>
  
- Revision 1.8  2001/09/25 00:34:59  hal9
- Some additions, and re-arranging.
  
- Revision 1.7  2001/09/24 14:31:36  hal9
- Diddling.
+<para>
+ <emphasis>Remember to flush caches!</emphasis> Note that the
+ <literal>mail.google</literal> reference lacks the TLD portion (e.g.
+ <quote>.com</quote>). This will effectively match
+ <literal>mail.google</literal> under any TLD, such as
+ <literal>mail.google.de</literal>, just as an example.
+</para>
+<para>
+ If this still does not work, you will have to go through the remaining
+ actions one by one to find which one(s) is causing the problem.
+</para>
  
- Revision 1.6  2001/09/24 14:10:32  hal9
- Including David's OS/2 installation instructions.
+</sect2>
  
- Revision 1.2  2001/09/13 15:27:40  swa
- cosmetics
+</sect1>
  
- Revision 1.1  2001/09/12 15:36:41  swa
- source files for junkbuster documentation
+ <!--
  
- Revision 1.3  2001/09/10 17:43:59  swa
- first proposal of a structure.
+ This program is free software; you can redistribute it
+ and/or modify it under the terms of the GNU General
+ Public License as published by the Free Software
+ Foundation; either version 2 of the License, or (at
+ your option) any later version.
  
- Revision 1.2  2001/06/13 14:28:31  swa
- docs should have an author.
+ This program is distributed in the hope that it will
+ be useful, but WITHOUT ANY WARRANTY; without even the
+ implied warranty of MERCHANTABILITY or FITNESS FOR A
+ PARTICULAR PURPOSE.  See the GNU General Public
+ License for more details.
  
- Revision 1.1  2001/06/13 14:20:37  swa
- first import of project's documentation for the webserver.
+ The GNU General Public License should be included with
+ this file.  If not, you can view it at
+ http://www.gnu.org/copyleft/gpl.html
+ or write to the Free Software Foundation, Inc.,
+ 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301,
+ USA
  
   -->