rebuild docs

diff --git a/doc/webserver/user-manual/appendix.html b/doc/webserver/user-manual/appendix.html

index d261148..56b7cf1 100644
--- a/doc/webserver/user-manual/appendix.html
+++ b/doc/webserver/user-manual/appendix.html
@@ -1,377 +1,1307 @@
-<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
-"http://www.w3.org/TR/html4/loose.dtd">
-<html>
-<head>
-  <title>Appendix</title>
-  <meta name="GENERATOR" content="Modular DocBook HTML Stylesheet Version 1.79">
-  <link rel="HOME" title="Privoxy 3.0.27 User Manual" href="index.html">
-  <link rel="PREVIOUS" title="See Also" href="seealso.html">
-  <link rel="STYLESHEET" type="text/css" href="../p_doc.css">
-  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
-  <link rel="STYLESHEET" type="text/css" href="p_doc.css">
-</head>
-<body class="SECT1" bgcolor="#EEEEEE" text="#000000" link="#0000FF" vlink="#840084" alink="#0000FF">
-  <div class="NAVHEADER">
-    <table summary="Header navigation table" width="100%" border="0" cellpadding="0" cellspacing="0">
-      <tr>
-        <th colspan="3" align="center">Privoxy 3.0.27 User Manual</th>
-      </tr>
-      <tr>
-        <td width="10%" align="left" valign="bottom"><a href="seealso.html" accesskey="P">Prev</a></td>
-        <td width="80%" align="center" valign="bottom"></td>
-        <td width="10%" align="right" valign="bottom">&nbsp;</td>
-      </tr>
-    </table>
-    <hr align="left" width="100%">
-  </div>
-  <div class="SECT1">
-    <h1 class="SECT1"><a name="APPENDIX" id="APPENDIX">14. Appendix</a></h1>
-    <div class="SECT2">
-      <h2 class="SECT2"><a name="REGEX" id="REGEX">14.1. Regular Expressions</a></h2>
-      <p><span class="APPLICATION">Privoxy</span> uses Perl-style <span class="QUOTE">"regular expressions"</span> in
-      its <a href="actions-file.html">actions files</a> and <a href="filter-file.html">filter file</a>, through the
-      <a href="http://www.pcre.org/" target="_top">PCRE</a> and <span class="APPLICATION">PCRS</span> libraries.</p>
-      <p>If you are reading this, you probably don't understand what <span class="QUOTE">"regular expressions"</span>
-      are, or what they can do. So this will be a very brief introduction only. A full explanation would require a
-      <a href="http://www.oreilly.com/catalog/regex/" target="_top">book</a> ;-)</p>
-      <p>Regular expressions provide a language to describe patterns that can be run against strings of characters
-      (letters, numbers, etc.), to see if they match the string or not. The patterns are themselves (sometimes complex)
-      strings of literal characters, combined with wild-cards, and other special characters, called meta-characters.
-      The <span class="QUOTE">"meta-characters"</span> have special meanings and are used to build complex patterns to
-      be matched against. Perl Compatible Regular Expressions are an especially convenient <span class=
-      "QUOTE">"dialect"</span> of the regular expression language.</p>
-      <p>To make a simple analogy, we do something similar when we use wild-card characters when listing files with the
-      <b class="COMMAND">dir</b> command in DOS. <tt class="LITERAL">*.*</tt> matches all filenames. The <span class=
-      "QUOTE">"special"</span> character here is the asterisk which matches any and all characters. We can be more
-      specific and use <tt class="LITERAL">?</tt> to match just individual characters. So <span class="QUOTE">"dir
-      file?.txt"</span> would match <span class="QUOTE">"file1.txt"</span>, <span class="QUOTE">"file2.txt"</span>,
-      etc. We are pattern matching, using a similar technique to <span class="QUOTE">"regular expressions"</span>!</p>
-      <p>Regular expressions do essentially the same thing, but are much, much more powerful. There are many more
-      <span class="QUOTE">"special characters"</span> and ways of building complex patterns however. Let's look at a
-      few of the common ones, and then some examples:</p>
-      <table border="0">
-        <tbody>
-          <tr>
-            <td><span class="emphasis"><i class="EMPHASIS">.</i></span> - Matches any single character, e.g.
-            <span class="QUOTE">"a"</span>, <span class="QUOTE">"A"</span>, <span class="QUOTE">"4"</span>,
-            <span class="QUOTE">":"</span>, or <span class="QUOTE">"@"</span>.</td>
-          </tr>
-        </tbody>
-      </table>
-      <table border="0">
-        <tbody>
-          <tr>
-            <td><span class="emphasis"><i class="EMPHASIS">?</i></span> - The preceding character or expression is
-            matched ZERO or ONE times. Either/or.</td>
-          </tr>
-        </tbody>
-      </table>
-      <table border="0">
-        <tbody>
-          <tr>
-            <td><span class="emphasis"><i class="EMPHASIS">+</i></span> - The preceding character or expression is
-            matched ONE or MORE times.</td>
-          </tr>
-        </tbody>
-      </table>
-      <table border="0">
-        <tbody>
-          <tr>
-            <td><span class="emphasis"><i class="EMPHASIS">*</i></span> - The preceding character or expression is
-            matched ZERO or MORE times.</td>
-          </tr>
-        </tbody>
-      </table>
-      <table border="0">
-        <tbody>
-          <tr>
-            <td><span class="emphasis"><i class="EMPHASIS">\</i></span> - The <span class="QUOTE">"escape"</span>
-            character denotes that the following character should be taken literally. This is used where one of the
-            special characters (e.g. <span class="QUOTE">"."</span>) needs to be taken literally and not as a special
-            meta-character. Example: <span class="QUOTE">"example\.com"</span>, makes sure the period is recognized
-            only as a period (and not expanded to its meta-character meaning of any single character).</td>
-          </tr>
-        </tbody>
-      </table>
-      <table border="0">
-        <tbody>
-          <tr>
-            <td><span class="emphasis"><i class="EMPHASIS">[ ]</i></span> - Characters enclosed in brackets will be
-            matched if any of the enclosed characters are encountered. For instance, <span class="QUOTE">"[0-9]"</span>
-            matches any numeric digit (zero through nine). As an example, we can combine this with <span class=
-            "QUOTE">"+"</span> to match any digit one or more times: <span class="QUOTE">"[0-9]+"</span>.</td>
-          </tr>
-        </tbody>
-      </table>
-      <table border="0">
-        <tbody>
-          <tr>
-            <td><span class="emphasis"><i class="EMPHASIS">( )</i></span> - parentheses are used to group a
-            sub-expression, or multiple sub-expressions.</td>
-          </tr>
-        </tbody>
-      </table>
-      <table border="0">
-        <tbody>
-          <tr>
-            <td><span class="emphasis"><i class="EMPHASIS">|</i></span> - The <span class="QUOTE">"bar"</span>
-            character works like an <span class="QUOTE">"or"</span> conditional statement. A match is successful if the
-            sub-expression on either side of <span class="QUOTE">"|"</span> matches. As an example: <span class=
-            "QUOTE">"/(this|that) example/"</span> uses grouping and the bar character and would match either
-            <span class="QUOTE">"this example"</span> or <span class="QUOTE">"that example"</span>, and nothing
-            else.</td>
-          </tr>
-        </tbody>
-      </table>
-      <p>These are just some of the ones you are likely to use when matching URLs with <span class=
-      "APPLICATION">Privoxy</span>, and are a long way from a definitive list. This is enough to get us started with a
-      few simple examples which may be more illuminating:</p>
-      <p><span class="emphasis"><i class="EMPHASIS"><tt class="LITERAL">/.*/banners/.*</tt></i></span> - A simple
-      example that uses the common combination of <span class="QUOTE">"."</span> and <span class="QUOTE">"*"</span> to
-      denote any character, zero or more times. In other words, any string at all. So we start with a literal forward
-      slash, then our regular expression pattern (<span class="QUOTE">".*"</span>) another literal forward slash, the
-      string <span class="QUOTE">"banners"</span>, another forward slash, and lastly another <span class=
-      "QUOTE">".*"</span>. We are building a directory path here. This will match any file with the path that has a
-      directory named <span class="QUOTE">"banners"</span> in it. The <span class="QUOTE">".*"</span> matches any
-      characters, and this could conceivably be more forward slashes, so it might expand into a much longer looking
-      path. For example, this could match: <span class="QUOTE">"/eye/hate/spammers/banners/annoy_me_please.gif"</span>,
-      or just <span class="QUOTE">"/banners/annoying.html"</span>, or almost an infinite number of other possible
-      combinations, just so it has <span class="QUOTE">"banners"</span> in the path somewhere.</p>
-      <p>And now something a little more complex:</p>
-      <p><span class="emphasis"><i class="EMPHASIS"><tt class=
-      "LITERAL">/.*/adv((er)?ts?|ertis(ing|ements?))?/</tt></i></span> - We have several literal forward slashes again
-      (<span class="QUOTE">"/"</span>), so we are building another expression that is a file path statement. We have
-      another <span class="QUOTE">".*"</span>, so we are matching against any conceivable sub-path, just so it matches
-      our expression. The only true literal that <span class="emphasis"><i class="EMPHASIS">must match</i></span> our
-      pattern is <span class="APPLICATION">adv</span>, together with the forward slashes. What comes after the
-      <span class="QUOTE">"adv"</span> string is the interesting part.</p>
-      <p>Remember the <span class="QUOTE">"?"</span> means the preceding expression (either a literal character or
-      anything grouped with <span class="QUOTE">"(...)"</span> in this case) can exist or not, since this means either
-      zero or one match. So <span class="QUOTE">"((er)?ts?|ertis(ing|ements?))"</span> is optional, as are the
-      individual sub-expressions: <span class="QUOTE">"(er)"</span>, <span class="QUOTE">"(ing|ements?)"</span>, and
-      the <span class="QUOTE">"s"</span>. The <span class="QUOTE">"|"</span> means <span class="QUOTE">"or"</span>. We
-      have two of those. For instance, <span class="QUOTE">"(ing|ements?)"</span>, can expand to match either
-      <span class="QUOTE">"ing"</span> <span class="emphasis"><i class="EMPHASIS">OR</i></span> <span class=
-      "QUOTE">"ements?"</span>. What is being done here, is an attempt at matching as many variations of <span class=
-      "QUOTE">"advertisement"</span>, and similar, as possible. So this would expand to match just <span class=
-      "QUOTE">"adv"</span>, or <span class="QUOTE">"advert"</span>, or <span class="QUOTE">"adverts"</span>, or
-      <span class="QUOTE">"advertising"</span>, or <span class="QUOTE">"advertisement"</span>, or <span class=
-      "QUOTE">"advertisements"</span>. You get the idea. But it would not match <span class=
-      "QUOTE">"advertizements"</span> (with a <span class="QUOTE">"z"</span>). We could fix that by changing our
-      regular expression to: <span class="QUOTE">"/.*/adv((er)?ts?|erti(s|z)(ing|ements?))?/"</span>, which would then
-      match either spelling.</p>
-      <p><span class="emphasis"><i class="EMPHASIS"><tt class="LITERAL">/.*/advert[0-9]+\.(gif|jpe?g)</tt></i></span> -
-      Again another path statement with forward slashes. Anything in the square brackets <span class="QUOTE">"[
-      ]"</span> can be matched. This is using <span class="QUOTE">"0-9"</span> as a shorthand expression to mean any
-      digit zero through nine. It is the same as saying <span class="QUOTE">"0123456789"</span>. So any digit matches.
-      The <span class="QUOTE">"+"</span> means one or more of the preceding expression must be included. The preceding
-      expression here is what is in the square brackets -- in this case, any digit zero through nine. Then, at the end,
-      we have a grouping: <span class="QUOTE">"(gif|jpe?g)"</span>. This includes a <span class="QUOTE">"|"</span>, so
-      this needs to match the expression on either side of that bar character also. A simple <span class=
-      "QUOTE">"gif"</span> on one side, and the other side will in turn match either <span class="QUOTE">"jpeg"</span>
-      or <span class="QUOTE">"jpg"</span>, since the <span class="QUOTE">"?"</span> means the letter <span class=
-      "QUOTE">"e"</span> is optional and can be matched once or not at all. So we are building an expression here to
-      match a GIF or JPEG image file. It must include the literal string <span class="QUOTE">"advert"</span>,
-      then one or more digits, and a <span class="QUOTE">"."</span> (which is now a literal, and not a special
-      character, since it is escaped with <span class="QUOTE">"\"</span>), and lastly either <span class=
-      "QUOTE">"gif"</span>, or <span class="QUOTE">"jpeg"</span>, or <span class="QUOTE">"jpg"</span>. Some possible
-      matches would include: <span class="QUOTE">"//advert1.jpg"</span>, <span class=
-      "QUOTE">"/nasty/ads/advert1234.gif"</span>, <span class="QUOTE">"/banners/from/hell/advert99.jpg"</span>. It
-      would not match <span class="QUOTE">"advert1.gif"</span> (no leading slash), or <span class=
-      "QUOTE">"/adverts232.jpg"</span> (the expression does not include an <span class="QUOTE">"s"</span>), or
-      <span class="QUOTE">"/advert1.jsp"</span> (<span class="QUOTE">"jsp"</span> is not in the expression
-      anywhere).</p>
-      <p>We are barely scratching the surface of regular expressions here so that you can understand the default
-      <span class="APPLICATION">Privoxy</span> configuration files, and maybe use this knowledge to customize your own
-      installation. There is much, much more that can be done with regular expressions. Now that you know enough to get
-      started, you can learn more on your own :/</p>
-      <p>More reading on Perl Compatible Regular expressions: <a href="http://perldoc.perl.org/perlre.html" target=
-      "_top">http://perldoc.perl.org/perlre.html</a></p>
-      <p>For information on regular expression based substitutions and their applications in filters, please see the
-      <a href="filter-file.html">filter file tutorial</a> in this manual.</p>
-    </div>
-    <div class="SECT2">
-      <h2 class="SECT2"><a name="INTERNAL-PAGES" id="INTERNAL-PAGES">14.2. Privoxy's Internal Pages</a></h2>
-      <p>Since <span class="APPLICATION">Privoxy</span> proxies each requested web page, it is easy for <span class=
-      "APPLICATION">Privoxy</span> to trap certain special URLs. In this way, we can talk directly to <span class=
-      "APPLICATION">Privoxy</span>, and see how it is configured, see how our rules are being applied, change these
-      rules and other configuration options, and even turn <span class="APPLICATION">Privoxy's</span> filtering off,
-      all with a web browser.</p>
-      <p>The URLs listed below are the special ones that allow direct access to <span class=
-      "APPLICATION">Privoxy</span>. Of course, <span class="APPLICATION">Privoxy</span> must be running to access
-      these. If not, you will get a friendly error message. Internet access is not necessary either.</p>
-      <ul>
-        <li>
-          <p>Privoxy main page:</p><a name="AEN6099" id="AEN6099"></a>
-          <blockquote class="BLOCKQUOTE">
-            <p><a href="http://config.privoxy.org/" target="_top">http://config.privoxy.org/</a></p>
-          </blockquote>
-          <p>There is a shortcut: <a href="http://p.p/" target="_top">http://p.p/</a> (But it doesn't provide a
-          fall-back to a real page, in case the request is not sent through <span class=
-          "APPLICATION">Privoxy</span>)</p>
-        </li>
-        <li>
-          <p>Show information about the current configuration, including viewing and editing of actions
-          files:</p><a name="AEN6107" id="AEN6107"></a>
-          <blockquote class="BLOCKQUOTE">
-            <p><a href="http://config.privoxy.org/show-status" target=
-            "_top">http://config.privoxy.org/show-status</a></p>
-          </blockquote>
-        </li>
-        <li>
-          <p>Show the source code version numbers:</p><a name="AEN6112" id="AEN6112"></a>
-          <blockquote class="BLOCKQUOTE">
-            <p><a href="http://config.privoxy.org/show-version" target=
-            "_top">http://config.privoxy.org/show-version</a></p>
-          </blockquote>
-        </li>
-        <li>
-          <p>Show the browser's request headers:</p><a name="AEN6117" id="AEN6117"></a>
-          <blockquote class="BLOCKQUOTE">
-            <p><a href="http://config.privoxy.org/show-request" target=
-            "_top">http://config.privoxy.org/show-request</a></p>
-          </blockquote>
-        </li>
-        <li>
-          <p>Show which actions apply to a URL and why:</p><a name="AEN6122" id="AEN6122"></a>
-          <blockquote class="BLOCKQUOTE">
-            <p><a href="http://config.privoxy.org/show-url-info" target=
-            "_top">http://config.privoxy.org/show-url-info</a></p>
-          </blockquote>
-        </li>
-        <li>
-          <p>Toggle Privoxy on or off. This feature can be turned off/on in the main <tt class="FILENAME">config</tt>
-          file. When toggled <span class="QUOTE">"off"</span>, <span class="QUOTE">"Privoxy"</span> continues to run,
-          but only as a pass-through proxy, with no actions taking place:</p><a name="AEN6130" id="AEN6130"></a>
-          <blockquote class="BLOCKQUOTE">
-            <p><a href="http://config.privoxy.org/toggle" target="_top">http://config.privoxy.org/toggle</a></p>
-          </blockquote>
-          <p>Short cuts. Turn off, then on:</p><a name="AEN6134" id="AEN6134"></a>
-          <blockquote class="BLOCKQUOTE">
-            <p><a href="http://config.privoxy.org/toggle?set=disable" target=
-            "_top">http://config.privoxy.org/toggle?set=disable</a></p>
-          </blockquote><a name="AEN6137" id="AEN6137"></a>
-          <blockquote class="BLOCKQUOTE">
-            <p><a href="http://config.privoxy.org/toggle?set=enable" target=
-            "_top">http://config.privoxy.org/toggle?set=enable</a></p>
-          </blockquote>
-        </li>
-      </ul>
-    </div>
-    <div class="SECT2">
-      <h2 class="SECT2"><a name="CHAIN" id="CHAIN">14.3. Chain of Events</a></h2>
-      <p>Let's take a quick look at how some of <span class="APPLICATION">Privoxy's</span> core features are triggered,
-      and the ensuing sequence of events when a web page is requested by your browser:</p>
-      <ul>
-        <li>
-          <p>First, your web browser requests a web page. The browser knows to send the request to <span class=
-          "APPLICATION">Privoxy</span>, which will in turn, relay the request to the remote web server after passing
-          the following tests:</p>
-        </li>
-        <li>
-          <p><span class="APPLICATION">Privoxy</span> traps any request for its own internal CGI pages (e.g. <a href=
-          "http://p.p/" target="_top">http://p.p/</a>) and sends the CGI page back to the browser.</p>
-        </li>
-        <li>
-          <p>Next, <span class="APPLICATION">Privoxy</span> checks to see if the URL matches any <a href=
-          "actions-file.html#BLOCK"><span class="QUOTE">"+block"</span></a> patterns. If so, the URL is then blocked,
-          and the remote web server will not be contacted. <a href="actions-file.html#HANDLE-AS-IMAGE"><span class=
-          "QUOTE">"+handle-as-image"</span></a> and <a href="actions-file.html#HANDLE-AS-EMPTY-DOCUMENT"><span class=
-          "QUOTE">"+handle-as-empty-document"</span></a> are then checked, and if there is no match, an HTML
-          <span class="QUOTE">"BLOCKED"</span> page is sent back to the browser. Otherwise, if it does match, an image
-          is returned for the former, and an empty text document for the latter. The type of image would depend on the
-          setting of <a href="actions-file.html#SET-IMAGE-BLOCKER"><span class="QUOTE">"+set-image-blocker"</span></a>
-          (blank, checkerboard pattern, or an HTTP redirect to an image elsewhere).</p>
-        </li>
-        <li>
-          <p>Untrusted URLs are blocked. If URLs are being added to the <tt class="FILENAME">trust</tt> file, then that
-          is done.</p>
-        </li>
-        <li>
-          <p>If the URL pattern matches the <a href="actions-file.html#FAST-REDIRECTS"><span class=
-          "QUOTE">"+fast-redirects"</span></a> action, it is then processed. Unwanted parts of the requested URL are
-          stripped.</p>
-        </li>
-        <li>
-          <p>Now the rest of the client browser's request headers are processed. If any of these match any of the
-          relevant actions (e.g. <a href="actions-file.html#HIDE-USER-AGENT"><span class=
-          "QUOTE">"+hide-user-agent"</span></a>, etc.), headers are suppressed or forged as determined by these actions
-          and their parameters.</p>
-        </li>
-        <li>
-          <p>Now the web server starts sending its response back (i.e. typically a web page).</p>
-        </li>
-        <li>
-          <p>First, the server headers are read and processed to determine, among other things, the MIME type (document
-          type) and encoding. The headers are then filtered as determined by the <a href=
-          "actions-file.html#CRUNCH-INCOMING-COOKIES"><span class="QUOTE">"+crunch-incoming-cookies"</span></a>,
-          <a href="actions-file.html#SESSION-COOKIES-ONLY"><span class="QUOTE">"+session-cookies-only"</span></a>, and
-          <a href="actions-file.html#DOWNGRADE-HTTP-VERSION"><span class="QUOTE">"+downgrade-http-version"</span></a>
-          actions.</p>
-        </li>
-        <li>
-          <p>If any <a href="actions-file.html#FILTER"><span class="QUOTE">"+filter"</span></a> action or <a href=
-          "actions-file.html#DEANIMATE-GIFS"><span class="QUOTE">"+deanimate-gifs"</span></a> action applies (and the
-          document type fits the action), the rest of the page is read into memory (up to a configurable limit). Then
-          the filter rules (from <tt class="FILENAME">default.filter</tt> and any other filter files) are processed
-          against the buffered content. Filters are applied in the order they are specified in one of the filter files.
-          Animated GIFs, if present, are reduced to either the first or last frame, depending on the action setting. The
-          entire page, which is now filtered, is then sent by <span class="APPLICATION">Privoxy</span> back to your
-          browser.</p>
-          <p>If neither a <a href="actions-file.html#FILTER"><span class="QUOTE">"+filter"</span></a> action nor
-          <a href="actions-file.html#DEANIMATE-GIFS"><span class="QUOTE">"+deanimate-gifs"</span></a> matches, then
-          <span class="APPLICATION">Privoxy</span> passes the raw data through to the client browser as it becomes
-          available.</p>
-        </li>
-        <li>
-          <p>As the browser receives the now (possibly filtered) page content, it reads and then requests any URLs that
-          may be embedded within the page source, e.g. ad images, stylesheets, JavaScript, other HTML documents (e.g.
-          frames), sounds, etc. For each of these objects, the browser issues a separate request (this is easily
-          viewable in <span class="APPLICATION">Privoxy's</span> logs). And each such request is in turn processed just
-          as above. Note that a complex web page will have many, many such embedded URLs. If these secondary requests
-          are to a different server, then quite possibly a very differing set of actions is triggered.</p>
-        </li>
-      </ul>
-      <p>NOTE: This is somewhat of a simplistic overview of what happens with each URL request. For the sake of brevity
-      and simplicity, we have focused on <span class="APPLICATION">Privoxy's</span> core features only.</p>
-    </div>
-    <div class="SECT2">
-      <h2 class="SECT2"><a name="ACTIONSANAT" id="ACTIONSANAT">14.4. Troubleshooting: Anatomy of an Action</a></h2>
-      <p>The way <span class="APPLICATION">Privoxy</span> applies <a href="actions-file.html#ACTIONS">actions</a> and
-      <a href="actions-file.html#FILTER">filters</a> to any given URL can be complex, and it is not always easy to
-      understand what is happening. And sometimes we need to be able to <span class="emphasis"><i class=
-      "EMPHASIS">see</i></span> just what <span class="APPLICATION">Privoxy</span> is doing. Especially, if something
-      <span class="APPLICATION">Privoxy</span> is doing is causing us a problem inadvertently. It can be a little
-      daunting to look at the actions and filters files themselves, since they tend to be filled with <a href=
-      "appendix.html#REGEX">regular expressions</a> whose consequences are not always so obvious.</p>
-      <p>One quick test to see if <span class="APPLICATION">Privoxy</span> is causing a problem or not, is to disable
-      it temporarily. This should be the first troubleshooting step (be sure to flush caches afterward!). Looking at
-      the logs is a good idea too. (Note that both the toggle feature and logging are enabled via <tt class=
-      "FILENAME">config</tt> file settings, and may need to be turned <span class="QUOTE">"on"</span>.)</p>
-      <p>Another easy troubleshooting step, if you have customized your installation, is to revert to the
-      installed defaults and see if that helps. There are times the developers get complaints about one
-      thing or another, and the problem is more related to a customized configuration issue.</p>
-      <p><span class="APPLICATION">Privoxy</span> also provides the <a href="http://config.privoxy.org/show-url-info"
-      target="_top">http://config.privoxy.org/show-url-info</a> page that can show us very specifically how
-      <span class="APPLICATION">actions</span> are being applied to any given URL. This is a big help for
-      troubleshooting.</p>
-      <p>First, enter one URL (or partial URL) at the prompt, and then <span class="APPLICATION">Privoxy</span> will
-      tell us how the current configuration will handle it. This will not help with filtering effects (i.e. the
-      <a href="actions-file.html#FILTER"><span class="QUOTE">"+filter"</span></a> action) from one of the filter files
-      since this is handled very differently and not so easy to trap! It also will not tell you about any other URLs
-      that may be embedded within the URL you are testing. For instance, images such as ads are expressed as URLs
-      within the raw page source of HTML pages. So you will only get info for the actual URL that is pasted into the
-      prompt area -- not any sub-URLs. If you want to know about embedded URLs like ads, you will have to dig those out
-      of the HTML source. Use your browser's <span class="QUOTE">"View Page Source"</span> option for this. Or right
-      click on the ad, and grab the URL.</p>
-      <p>Let's try an example, <a href="http://google.com" target="_top">google.com</a>, and look at it one section at
-      a time in a sample configuration (your real configuration may vary):</p>
-      <table border="0" bgcolor="#E0E0E0" width="100%">
-        <tr>
-          <td>
-            <pre class="SCREEN"> Matches for http://www.google.com:
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN""http://www.w3.org/TR/html4/loose.dtd">
+<HTML
+><HEAD
+><TITLE
+>Appendix</TITLE
+><META
+NAME="GENERATOR"
+CONTENT="Modular DocBook HTML Stylesheet Version 1.79"><LINK
+REL="HOME"
+TITLE="Privoxy 3.0.27 User Manual"
+HREF="index.html"><LINK
+REL="PREVIOUS"
+TITLE="See Also"
+HREF="seealso.html"><LINK
+REL="STYLESHEET"
+TYPE="text/css"
+HREF="../p_doc.css"><META
+HTTP-EQUIV="Content-Type"
+CONTENT="text/html;
+charset=ISO-8859-1">
+<LINK REL="STYLESHEET" TYPE="text/css" HREF="p_doc.css">
+</head
+><BODY
+CLASS="SECT1"
+BGCOLOR="#EEEEEE"
+TEXT="#000000"
+LINK="#0000FF"
+VLINK="#840084"
+ALINK="#0000FF"
+><DIV
+CLASS="NAVHEADER"
+><TABLE
+SUMMARY="Header navigation table"
+WIDTH="100%"
+BORDER="0"
+CELLPADDING="0"
+CELLSPACING="0"
+><TR
+><TH
+COLSPAN="3"
+ALIGN="center"
+>Privoxy 3.0.27 User Manual</TH
+></TR
+><TR
+><TD
+WIDTH="10%"
+ALIGN="left"
+VALIGN="bottom"
+><A
+HREF="seealso.html"
+ACCESSKEY="P"
+>Prev</A
+></TD
+><TD
+WIDTH="80%"
+ALIGN="center"
+VALIGN="bottom"
+></TD
+><TD
+WIDTH="10%"
+ALIGN="right"
+VALIGN="bottom"
+>&nbsp;</TD
+></TR
+></TABLE
+><HR
+ALIGN="LEFT"
+WIDTH="100%"></DIV
+><DIV
+CLASS="SECT1"
+><H1
+CLASS="SECT1"
+><A
+NAME="APPENDIX"
+>14. Appendix</A
+></H1
+><DIV
+CLASS="SECT2"
+><H2
+CLASS="SECT2"
+><A
+NAME="REGEX"
+>14.1. Regular Expressions</A
+></H2
+><P
+> <SPAN
+CLASS="APPLICATION"
+>Privoxy</SPAN
+> uses Perl-style <SPAN
+CLASS="QUOTE"
+>"regular
+ expressions"</SPAN
+> in its <A
+HREF="actions-file.html"
+>actions
+ files</A
+> and <A
+HREF="filter-file.html"
+>filter file</A
+>,
+ through the <A
+HREF="http://www.pcre.org/"
+TARGET="_top"
+>PCRE</A
+> and
+ <SPAN
+CLASS="APPLICATION"
+>PCRS</SPAN
+> libraries.</P
+><P
+> If you are reading this, you probably don't understand what <SPAN
+CLASS="QUOTE"
+>"regular
+ expressions"</SPAN
+> are, or what they can do. So this will be a very brief
+ introduction only. A full explanation would require a <A
+HREF="http://www.oreilly.com/catalog/regex/"
+TARGET="_top"
+>book</A
+> ;-)</P
+><P
+> Regular expressions provide a language to describe patterns that can be
+ run against strings of characters (letters, numbers, etc.), to see if they
+ match the string or not. The patterns are themselves (sometimes complex)
+ strings of literal characters, combined with wild-cards, and other special
+ characters, called meta-characters. The <SPAN
+CLASS="QUOTE"
+>"meta-characters"</SPAN
+> have
+ special meanings and are used to build complex patterns to be matched against.
+ Perl Compatible Regular Expressions are an especially convenient
+ <SPAN
+CLASS="QUOTE"
+>"dialect"</SPAN
+> of the regular expression language.</P
+><P
+> To make a simple analogy, we do something similar when we use wild-card
+ characters when listing files with the <B
+CLASS="COMMAND"
+>dir</B
+> command in DOS.
+ <TT
+CLASS="LITERAL"
+>*.*</TT
+> matches all filenames. The <SPAN
+CLASS="QUOTE"
+>"special"</SPAN
+>
+ character here is the asterisk which matches any and all characters. We can be
+ more specific and use <TT
+CLASS="LITERAL"
+>?</TT
+> to match just individual
+ characters. So <SPAN
+CLASS="QUOTE"
+>"dir file?.txt"</SPAN
+> would match
+ <SPAN
+CLASS="QUOTE"
+>"file1.txt"</SPAN
+>, <SPAN
+CLASS="QUOTE"
+>"file2.txt"</SPAN
+>, etc. We are pattern
+ matching, using a similar technique to <SPAN
+CLASS="QUOTE"
+>"regular expressions"</SPAN
+>!</P
+><P
+> Regular expressions do essentially the same thing, but are much, much more
+ powerful. There are many more <SPAN
+CLASS="QUOTE"
+>"special characters"</SPAN
+> and ways of
+ building complex patterns however. Let's look at a few of the common ones,
+ and then some examples:</P
+><P
+></P
+><TABLE
+BORDER="0"
+><TBODY
+><TR
+><TD
+>  <SPAN
+CLASS="emphasis"
+><I
+CLASS="EMPHASIS"
+>.</I
+></SPAN
+> - Matches any single character, e.g. <SPAN
+CLASS="QUOTE"
+>"a"</SPAN
+>,
+  <SPAN
+CLASS="QUOTE"
+>"A"</SPAN
+>, <SPAN
+CLASS="QUOTE"
+>"4"</SPAN
+>, <SPAN
+CLASS="QUOTE"
+>":"</SPAN
+>, or <SPAN
+CLASS="QUOTE"
+>"@"</SPAN
+>.
+ </TD
+></TR
+></TBODY
+></TABLE
+><P
+></P
+><P
+></P
+><TABLE
+BORDER="0"
+><TBODY
+><TR
+><TD
+>  <SPAN
+CLASS="emphasis"
+><I
+CLASS="EMPHASIS"
+>?</I
+></SPAN
+> - The preceding character or expression is matched ZERO or ONE
+  times. Either/or.
+ </TD
+></TR
+></TBODY
+></TABLE
+><P
+></P
+><P
+></P
+><TABLE
+BORDER="0"
+><TBODY
+><TR
+><TD
+>  <SPAN
+CLASS="emphasis"
+><I
+CLASS="EMPHASIS"
+>+</I
+></SPAN
+> - The preceding character or expression is matched ONE or MORE
+  times.
+ </TD
+></TR
+></TBODY
+></TABLE
+><P
+></P
+><P
+></P
+><TABLE
+BORDER="0"
+><TBODY
+><TR
+><TD
+>  <SPAN
+CLASS="emphasis"
+><I
+CLASS="EMPHASIS"
+>*</I
+></SPAN
+> - The preceding character or expression is matched ZERO or MORE
+  times.
+ </TD
+></TR
+></TBODY
+></TABLE
+><P
+></P
+><P
+></P
+><TABLE
+BORDER="0"
+><TBODY
+><TR
+><TD
+>  <SPAN
+CLASS="emphasis"
+><I
+CLASS="EMPHASIS"
+>\</I
+></SPAN
+> - The <SPAN
+CLASS="QUOTE"
+>"escape"</SPAN
+> character denotes that
+  the following character should be taken literally. This is used where one of the
+  special characters (e.g. <SPAN
+CLASS="QUOTE"
+>"."</SPAN
+>) needs to be taken literally and
+  not as a special meta-character. Example: <SPAN
+CLASS="QUOTE"
+>"example\.com"</SPAN
+>, makes
+  sure the period is recognized only as a period (and not expanded to its
+  meta-character meaning of any single character).
+ </TD
+></TR
+></TBODY
+></TABLE
+><P
+></P
+><P
+></P
+><TABLE
+BORDER="0"
+><TBODY
+><TR
+><TD
+>  <SPAN
+CLASS="emphasis"
+><I
+CLASS="EMPHASIS"
+>[ ]</I
+></SPAN
+> - Characters enclosed in brackets will be matched if
+  any of the enclosed characters are encountered. For instance, <SPAN
+CLASS="QUOTE"
+>"[0-9]"</SPAN
+>
+  matches any numeric digit (zero through nine). As an example, we can combine
+  this with <SPAN
+CLASS="QUOTE"
+>"+"</SPAN
+> to match any digit one or more times: <SPAN
+CLASS="QUOTE"
+>"[0-9]+"</SPAN
+>.
+ </TD
+></TR
+></TBODY
+></TABLE
+><P
+></P
+><P
+></P
+><TABLE
+BORDER="0"
+><TBODY
+><TR
+><TD
+>  <SPAN
+CLASS="emphasis"
+><I
+CLASS="EMPHASIS"
+>( )</I
+></SPAN
+> - Parentheses are used to group a sub-expression,
+  or multiple sub-expressions.
+ </TD
+></TR
+></TBODY
+></TABLE
+><P
+></P
+><P
+></P
+><TABLE
+BORDER="0"
+><TBODY
+><TR
+><TD
+>  <SPAN
+CLASS="emphasis"
+><I
+CLASS="EMPHASIS"
+>|</I
+></SPAN
+> - The <SPAN
+CLASS="QUOTE"
+>"bar"</SPAN
+> character works like an
+  <SPAN
+CLASS="QUOTE"
+>"or"</SPAN
+> conditional statement. A match is successful if the
+  sub-expression on either side of <SPAN
+CLASS="QUOTE"
+>"|"</SPAN
+> matches. As an example:
+  <SPAN
+CLASS="QUOTE"
+>"/(this|that) example/"</SPAN
+> uses grouping and the bar character
+  and would match either <SPAN
+CLASS="QUOTE"
+>"this example"</SPAN
+> or <SPAN
+CLASS="QUOTE"
+>"that
+  example"</SPAN
+>, and nothing else.
+ </TD
+></TR
+></TBODY
+></TABLE
+><P
+></P
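+><P
+> To see the meta-characters listed above at work outside of Privoxy, here is a
+ minimal sketch in Python. It assumes that Python's re module is an acceptable
+ stand-in for PCRE (the two agree on the basic constructs shown here); the
+ helper name matches and the sample strings are ours, not part of Privoxy:</P
+><TABLE
+BORDER="0"
+BGCOLOR="#E0E0E0"
+WIDTH="100%"
+><TR
+><TD
+><PRE
+CLASS="SCREEN"
+>
```python
import re

# Assumption: Python's re module stands in for PCRE; the basic
# meta-characters from the list above behave the same way in both.
def matches(pattern, text):
    """Return True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

checks = [
    ("a.c", "a@c", True),                      # . matches any single character
    ("colou?r", "color", True),                # ? makes the preceding "u" optional
    ("[0-9]+", "2024", True),                  # + needs one or more digits
    ("[0-9]+", "", False),                     # ... so the empty string fails
    ("[0-9]*", "", True),                      # * also accepts zero digits
    (r"example\.com", "example.com", True),    # \. is a literal period
    (r"example\.com", "exampleXcom", False),   # and matches nothing else
    ("(this|that) example", "that example", True),    # grouping plus "or"
    ("(this|that) example", "those example", False),
]

for pattern, text, expected in checks:
    assert matches(pattern, text) == expected
    print(pattern, repr(text), matches(pattern, text))
```
+</PRE
+></TD
+></TR
+></TABLE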
+><P
+> These are just some of the meta-characters you are likely to use when matching URLs with
+ <SPAN
+CLASS="APPLICATION"
+>Privoxy</SPAN
+>, and are a long way from a definitive
+ list. This is enough to get us started with a few simple examples which may
+ be more illuminating:</P
+><P
+> <SPAN
+CLASS="emphasis"
+><I
+CLASS="EMPHASIS"
+><TT
+CLASS="LITERAL"
+>/.*/banners/.*</TT
+></I
+></SPAN
+> - A  simple example
+ that uses the common combination of <SPAN
+CLASS="QUOTE"
+>"."</SPAN
+> and <SPAN
+CLASS="QUOTE"
+>"*"</SPAN
+> to
+ denote any character, zero or more times. In other words, any string at all.
+ So we start with a literal forward slash, then our regular expression pattern
+ (<SPAN
+CLASS="QUOTE"
+>".*"</SPAN
+>), another literal forward slash, the string
+ <SPAN
+CLASS="QUOTE"
+>"banners"</SPAN
+>, another forward slash, and lastly another
+ <SPAN
+CLASS="QUOTE"
+>".*"</SPAN
+>. We are building
+ a directory path here. This will match any file whose path contains a
+ directory named <SPAN
+CLASS="QUOTE"
+>"banners"</SPAN
+> in it. The <SPAN
+CLASS="QUOTE"
+>".*"</SPAN
+> matches
+ any characters, and this could conceivably be more forward slashes, so it
+ might expand into a much longer looking path. For example, this could match:
+ <SPAN
+CLASS="QUOTE"
+>"/eye/hate/spammers/banners/annoy_me_please.gif"</SPAN
+>, or just
+ <SPAN
+CLASS="QUOTE"
+>"//banners/annoying.html"</SPAN
+>, or almost an infinite number of other
+ possible combinations, just so it has <SPAN
+CLASS="QUOTE"
+>"banners"</SPAN
+> in the path
+ somewhere.</P
+><P
+> And now something a little more complex:</P
+><P
+> <SPAN
+CLASS="emphasis"
+><I
+CLASS="EMPHASIS"
+><TT
+CLASS="LITERAL"
+>/.*/adv((er)?ts?|ertis(ing|ements?))?/</TT
+></I
+></SPAN
+> -
+ We have several literal forward slashes again (<SPAN
+CLASS="QUOTE"
+>"/"</SPAN
+>), so we are
+ building another expression that is a file path statement. We have another
+ <SPAN
+CLASS="QUOTE"
+>".*"</SPAN
+>, so we are matching against any conceivable sub-path, just so
+ it matches our expression. The only true literal that <SPAN
+CLASS="emphasis"
+><I
+CLASS="EMPHASIS"
+>must
+ match</I
+></SPAN
+> our pattern is <SPAN
+CLASS="APPLICATION"
+>adv</SPAN
+>, together with
+ the forward slashes. What comes after the <SPAN
+CLASS="QUOTE"
+>"adv"</SPAN
+> string is the
+ interesting part.</P
+><P
+> Remember the <SPAN
+CLASS="QUOTE"
+>"?"</SPAN
+> means the preceding expression (either a
+ literal character or anything grouped with <SPAN
+CLASS="QUOTE"
+>"(...)"</SPAN
+> in this case)
+ can exist or not, since this means either zero or one match. So
+ <SPAN
+CLASS="QUOTE"
+>"((er)?ts?|ertis(ing|ements?))"</SPAN
+> is optional, as are the
+ individual sub-expressions: <SPAN
+CLASS="QUOTE"
+>"(er)"</SPAN
+>,
+ <SPAN
+CLASS="QUOTE"
+>"(ing|ements?)"</SPAN
+>, and the <SPAN
+CLASS="QUOTE"
+>"s"</SPAN
+>. The <SPAN
+CLASS="QUOTE"
+>"|"</SPAN
+>
+ means <SPAN
+CLASS="QUOTE"
+>"or"</SPAN
+>. We have two of those. For instance,
+ <SPAN
+CLASS="QUOTE"
+>"(ing|ements?)"</SPAN
+>, can expand to match either <SPAN
+CLASS="QUOTE"
+>"ing"</SPAN
+>
+ <SPAN
+CLASS="emphasis"
+><I
+CLASS="EMPHASIS"
+>OR</I
+></SPAN
+> <SPAN
+CLASS="QUOTE"
+>"ements?"</SPAN
+>. What is being done here, is an
+ attempt at matching as many variations of <SPAN
+CLASS="QUOTE"
+>"advertisement"</SPAN
+>, and
+ similar, as possible. So this would expand to match just <SPAN
+CLASS="QUOTE"
+>"adv"</SPAN
+>,
+ or <SPAN
+CLASS="QUOTE"
+>"advert"</SPAN
+>, or <SPAN
+CLASS="QUOTE"
+>"adverts"</SPAN
+>, or
+ <SPAN
+CLASS="QUOTE"
+>"advertising"</SPAN
+>, or <SPAN
+CLASS="QUOTE"
+>"advertisement"</SPAN
+>, or
+ <SPAN
+CLASS="QUOTE"
+>"advertisements"</SPAN
+>. You get the idea. But it would not match
+ <SPAN
+CLASS="QUOTE"
+>"advertizements"</SPAN
+> (with a <SPAN
+CLASS="QUOTE"
+>"z"</SPAN
+>). We could fix that by
+ changing our regular expression to:
+ <SPAN
+CLASS="QUOTE"
+>"/.*/adv((er)?ts?|erti(s|z)(ing|ements?))?/"</SPAN
+>, which would then match
+ either spelling.</P
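+><P
+> The variations described above can be checked with a short Python sketch.
+ It assumes Python's re module in place of PCRE and tests only the directory
+ name, without the surrounding slashes; the helper accepted and the word list
+ are made up for illustration:</P
+><TABLE
+BORDER="0"
+BGCOLOR="#E0E0E0"
+WIDTH="100%"
+><TR
+><TD
+><PRE
+CLASS="SCREEN"
+>
```python
import re

# Assumption: Python's re is used in place of PCRE, and re.fullmatch is
# applied to the directory name alone rather than to a whole URL path.
ORIGINAL = r"adv((er)?ts?|ertis(ing|ements?))?"
WITH_Z = r"adv((er)?ts?|erti(s|z)(ing|ements?))?"

def accepted(pattern, words):
    """Return the words that the pattern matches in full."""
    return [w for w in words if re.fullmatch(pattern, w)]

words = ["adv", "advert", "adverts", "advertising", "advertisement",
         "advertisements", "advertizements", "advice"]

print(accepted(ORIGINAL, words))
print(accepted(WITH_Z, words))
```
+</PRE
+></TD
+></TR
+></TABLE
+><P
+> The first list printed contains the six spellings given above; the second
+ also contains the spelling with a z. The unrelated word advice is rejected
+ by both patterns.</P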
+><P
+> <SPAN
+CLASS="emphasis"
+><I
+CLASS="EMPHASIS"
+><TT
+CLASS="LITERAL"
+>/.*/advert[0-9]+\.(gif|jpe?g)</TT
+></I
+></SPAN
+> - Again
+ another path statement with forward slashes. Anything in the square brackets
+ <SPAN
+CLASS="QUOTE"
+>"[ ]"</SPAN
+> can be matched. This is using <SPAN
+CLASS="QUOTE"
+>"0-9"</SPAN
+> as a
+ shorthand expression to mean any digit zero through nine. It is the same as
+ saying <SPAN
+CLASS="QUOTE"
+>"0123456789"</SPAN
+>. So any digit matches. The <SPAN
+CLASS="QUOTE"
+>"+"</SPAN
+>
+ means one or more of the preceding expression must be included. The preceding
+ expression here is what is in the square brackets -- in this case, any digit
+ zero through nine. Then, at the end, we have a grouping: <SPAN
+CLASS="QUOTE"
+>"(gif|jpe?g)"</SPAN
+>.
+ This includes a <SPAN
+CLASS="QUOTE"
+>"|"</SPAN
+>, so the group matches if the expression on
+ either side of that bar character matches: a simple <SPAN
+CLASS="QUOTE"
+>"gif"</SPAN
+> on one side, and the other
+ side will in turn match either <SPAN
+CLASS="QUOTE"
+>"jpeg"</SPAN
+> or <SPAN
+CLASS="QUOTE"
+>"jpg"</SPAN
+>,
+ since the <SPAN
+CLASS="QUOTE"
+>"?"</SPAN
+> means the letter <SPAN
+CLASS="QUOTE"
+>"e"</SPAN
+> is optional and
+ can be matched once or not at all. So we are building an expression here to
+ match GIF or JPEG image files. It must include the literal
+ string <SPAN
+CLASS="QUOTE"
+>"advert"</SPAN
+>, then one or more digits, and a <SPAN
+CLASS="QUOTE"
+>"."</SPAN
+>
+ (which is now a literal, and not a special character, since it is escaped
+ with <SPAN
+CLASS="QUOTE"
+>"\"</SPAN
+>), and lastly either <SPAN
+CLASS="QUOTE"
+>"gif"</SPAN
+>, or
+ <SPAN
+CLASS="QUOTE"
+>"jpeg"</SPAN
+>, or <SPAN
+CLASS="QUOTE"
+>"jpg"</SPAN
+>. Some possible matches would
+ include: <SPAN
+CLASS="QUOTE"
+>"//advert1.jpg"</SPAN
+>,
+ <SPAN
+CLASS="QUOTE"
+>"/nasty/ads/advert1234.gif"</SPAN
+>,
+ <SPAN
+CLASS="QUOTE"
+>"/banners/from/hell/advert99.jpg"</SPAN
+>. It would not match
+ <SPAN
+CLASS="QUOTE"
+>"advert1.gif"</SPAN
+> (no leading slash), or
+ <SPAN
+CLASS="QUOTE"
+>"/adverts232.jpg"</SPAN
+> (the expression does not include an
+ <SPAN
+CLASS="QUOTE"
+>"s"</SPAN
+>), or <SPAN
+CLASS="QUOTE"
+>"/advert1.jsp"</SPAN
+> (<SPAN
+CLASS="QUOTE"
+>"jsp"</SPAN
+> is not
+ in the expression anywhere).</P
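+><P
+> The matches and non-matches listed above can be reproduced with a small
+ Python sketch. Two assumptions apply: Python's re module stands in for PCRE,
+ and re.match (anchored at the start of the path only) approximates how the
+ path pattern is applied. The helper name path_matches is hypothetical:</P
+><TABLE
+BORDER="0"
+BGCOLOR="#E0E0E0"
+WIDTH="100%"
+><TR
+><TD
+><PRE
+CLASS="SCREEN"
+>
```python
import re

# Assumptions: Python's re stands in for PCRE, and re.match (anchored at
# the start of the path only) approximates how the path pattern is applied.
PATTERN = re.compile(r"/.*/advert[0-9]+\.(gif|jpe?g)")

def path_matches(path):
    """Return True if the URL path matches the example pattern."""
    return PATTERN.match(path) is not None

should_match = ["//advert1.jpg", "/nasty/ads/advert1234.gif",
                "/banners/from/hell/advert99.jpg"]
should_not_match = ["advert1.gif", "/adverts232.jpg", "/advert1.jsp"]

for path in should_match + should_not_match:
    print(path, path_matches(path))
```
+</PRE
+></TD
+></TR
+></TABLE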
+><P
+> We are barely scratching the surface of regular expressions here so that you
+ can understand the default <SPAN
+CLASS="APPLICATION"
+>Privoxy</SPAN
+>
+ configuration files, and maybe use this knowledge to customize your own
+ installation. There is much, much more that can be done with regular
+ expressions. Now that you know enough to get started, you can learn more on
+ your own :/</P
+><P
+> More reading on Perl Compatible Regular expressions:
+ <A
+HREF="http://perldoc.perl.org/perlre.html"
+TARGET="_top"
+>http://perldoc.perl.org/perlre.html</A
+></P
+><P
+> For information on regular expression based substitutions and their applications
+ in filters, please see the <A
+HREF="filter-file.html"
+>filter file tutorial</A
+>
+ in this manual.</P
+></DIV
+><DIV
+CLASS="SECT2"
+><H2
+CLASS="SECT2"
+><A
+NAME="INTERNAL-PAGES"
+>14.2. Privoxy's Internal Pages</A
+></H2
+><P
+> Since <SPAN
+CLASS="APPLICATION"
+>Privoxy</SPAN
+> proxies each requested
+ web page, it is easy for <SPAN
+CLASS="APPLICATION"
+>Privoxy</SPAN
+> to
+ trap certain special URLs. In this way, we can talk directly to
+ <SPAN
+CLASS="APPLICATION"
+>Privoxy</SPAN
+>, and see how it is
+ configured, see how our rules are being applied, change these
+ rules and other configuration options, and even turn
+ <SPAN
+CLASS="APPLICATION"
+>Privoxy's</SPAN
+> filtering off, all with
+ a web browser.</P
+><P
+> The URLs listed below are the special ones that allow direct access
+ to <SPAN
+CLASS="APPLICATION"
+>Privoxy</SPAN
+>. Of course,
+ <SPAN
+CLASS="APPLICATION"
+>Privoxy</SPAN
+> must be running to access these. If
+ not, you will get a friendly error message. Internet access is not
+ necessary either.</P
+><P
+></P
+><UL
+><LI
+><P
+>   Privoxy main page:
+  </P
+><A
+NAME="AEN5956"
+></A
+><BLOCKQUOTE
+CLASS="BLOCKQUOTE"
+><P
+>     <A
+HREF="http://config.privoxy.org/"
+TARGET="_top"
+>http://config.privoxy.org/</A
+>
+   </P
+></BLOCKQUOTE
+><P
+>   There is a shortcut: <A
+HREF="http://p.p/"
+TARGET="_top"
+>http://p.p/</A
+> (But it
+   doesn't provide a fall-back to a real page, in case the request is not
+   sent through <SPAN
+CLASS="APPLICATION"
+>Privoxy</SPAN
+>)
+  </P
+></LI
+><LI
+><P
+>    Show information about the current configuration, including viewing and
+    editing of actions files:
+  </P
+><A
+NAME="AEN5964"
+></A
+><BLOCKQUOTE
+CLASS="BLOCKQUOTE"
+><P
+>    <A
+HREF="http://config.privoxy.org/show-status"
+TARGET="_top"
+>http://config.privoxy.org/show-status</A
+>
+   </P
+></BLOCKQUOTE
+></LI
+><LI
+><P
+>    Show the source code version numbers:
+  </P
+><A
+NAME="AEN5969"
+></A
+><BLOCKQUOTE
+CLASS="BLOCKQUOTE"
+><P
+>    <A
+HREF="http://config.privoxy.org/show-version"
+TARGET="_top"
+>http://config.privoxy.org/show-version</A
+>
+   </P
+></BLOCKQUOTE
+></LI
+><LI
+><P
+>   Show the browser's request headers:
+  </P
+><A
+NAME="AEN5974"
+></A
+><BLOCKQUOTE
+CLASS="BLOCKQUOTE"
+><P
+>    <A
+HREF="http://config.privoxy.org/show-request"
+TARGET="_top"
+>http://config.privoxy.org/show-request</A
+>
+   </P
+></BLOCKQUOTE
+></LI
+><LI
+><P
+>   Show which actions apply to a URL and why:
+  </P
+><A
+NAME="AEN5979"
+></A
+><BLOCKQUOTE
+CLASS="BLOCKQUOTE"
+><P
+>    <A
+HREF="http://config.privoxy.org/show-url-info"
+TARGET="_top"
+>http://config.privoxy.org/show-url-info</A
+>
+   </P
+></BLOCKQUOTE
+></LI
+><LI
+><P
+>   Toggle Privoxy on or off. This feature can be turned off/on in the main
+   <TT
+CLASS="FILENAME"
+>config</TT
+> file. When toggled <SPAN
+CLASS="QUOTE"
+>"off"</SPAN
+>, <SPAN
+CLASS="QUOTE"
+>"Privoxy"</SPAN
+>
+   continues to run, but only as a pass-through proxy, with no actions taking
+   place:
+  </P
+><A
+NAME="AEN5987"
+></A
+><BLOCKQUOTE
+CLASS="BLOCKQUOTE"
+><P
+>    <A
+HREF="http://config.privoxy.org/toggle"
+TARGET="_top"
+>http://config.privoxy.org/toggle</A
+>
+   </P
+></BLOCKQUOTE
+><P
+>   Short cuts. Turn off, then on:
+  </P
+><A
+NAME="AEN5991"
+></A
+><BLOCKQUOTE
+CLASS="BLOCKQUOTE"
+><P
+>     <A
+HREF="http://config.privoxy.org/toggle?set=disable"
+TARGET="_top"
+>http://config.privoxy.org/toggle?set=disable</A
+>
+   </P
+></BLOCKQUOTE
+><A
+NAME="AEN5994"
+></A
+><BLOCKQUOTE
+CLASS="BLOCKQUOTE"
+><P
+>     <A
+HREF="http://config.privoxy.org/toggle?set=enable"
+TARGET="_top"
+>http://config.privoxy.org/toggle?set=enable</A
+>
+   </P
+></BLOCKQUOTE
+></LI
+></UL
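+><P
+> All of the URLs above share one base address, so a script can assemble them
+ mechanically. The following Python sketch only builds the addresses; it does
+ not fetch them, since that would require the request to pass through a
+ running Privoxy. The function name internal_url is our own invention:</P
+><TABLE
+BORDER="0"
+BGCOLOR="#E0E0E0"
+WIDTH="100%"
+><TR
+><TD
+><PRE
+CLASS="SCREEN"
+>
```python
from urllib.parse import urlencode

# Base address of the internal pages, as listed above.
BASE = "http://config.privoxy.org/"

def internal_url(page="", **params):
    """Build the address of an internal page (hypothetical helper)."""
    url = BASE + page
    if params:
        url += "?" + urlencode(params)
    return url

print(internal_url())
print(internal_url("show-status"))
print(internal_url("toggle", set="disable"))
```
+</PRE
+></TD
+></TR
+></TABLE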
+></DIV
+><DIV
+CLASS="SECT2"
+><H2
+CLASS="SECT2"
+><A
+NAME="CHAIN"
+>14.3. Chain of Events</A
+></H2
+><P
+> Let's take a quick look at how some of <SPAN
+CLASS="APPLICATION"
+>Privoxy's</SPAN
+>
+ core features are triggered, and the ensuing sequence of events when a web
+ page is requested by your browser:</P
+><P
+></P
+><UL
+><LI
+><P
+>   First, your web browser requests a web page. The browser knows to send
+   the request to <SPAN
+CLASS="APPLICATION"
+>Privoxy</SPAN
+>, which will, in turn,
+   relay the request to the remote web server once it has passed the following
+   tests:
+  </P
+></LI
+><LI
+><P
+>   <SPAN
+CLASS="APPLICATION"
+>Privoxy</SPAN
+> traps any request for its own internal CGI
+   pages (e.g. <A
+HREF="http://p.p/"
+TARGET="_top"
+>http://p.p/</A
+>) and sends the CGI page back to the browser.
+  </P
+></LI
+><LI
+><P
+>   Next, <SPAN
+CLASS="APPLICATION"
+>Privoxy</SPAN
+> checks to see if the URL
+   matches any <A
+HREF="actions-file.html#BLOCK"
+><SPAN
+CLASS="QUOTE"
+>"+block"</SPAN
+></A
+> patterns. If
+   so, the URL is then blocked, and the remote web server will not be contacted.
+   <A
+HREF="actions-file.html#HANDLE-AS-IMAGE"
+><SPAN
+CLASS="QUOTE"
+>"+handle-as-image"</SPAN
+></A
+>
+   and
+   <A
+HREF="actions-file.html#HANDLE-AS-EMPTY-DOCUMENT"
+><SPAN
+CLASS="QUOTE"
+>"+handle-as-empty-document"</SPAN
+></A
+>
+   are then checked, and if there is no match, an
+   HTML <SPAN
+CLASS="QUOTE"
+>"BLOCKED"</SPAN
+> page is sent back to the browser. Otherwise, if
+   it does match, an image is returned for the former, and an empty text
+   document for the latter. The type of image would depend on the setting of
+   <A
+HREF="actions-file.html#SET-IMAGE-BLOCKER"
+><SPAN
+CLASS="QUOTE"
+>"+set-image-blocker"</SPAN
+></A
+>
+   (blank, checkerboard pattern, or an HTTP redirect to an image elsewhere).
+  </P
+></LI
+><LI
+><P
+>   Untrusted URLs are blocked. If URLs are being added to the
+   <TT
+CLASS="FILENAME"
+>trust</TT
+> file, then that is done.
+  </P
+></LI
+><LI
+><P
+>   If the URL pattern matches the <A
+HREF="actions-file.html#FAST-REDIRECTS"
+><SPAN
+CLASS="QUOTE"
+>"+fast-redirects"</SPAN
+></A
+> action,
+   it is then processed. Unwanted parts of the requested URL are stripped.
+  </P
+></LI
+><LI
+><P
+>   Now the rest of the client browser's request headers are processed. If any
+   of these match any of the relevant actions (e.g. <A
+HREF="actions-file.html#HIDE-USER-AGENT"
+><SPAN
+CLASS="QUOTE"
+>"+hide-user-agent"</SPAN
+></A
+>,
+   etc.), headers are suppressed or forged as determined by these actions and
+   their parameters.
+  </P
+></LI
+><LI
+><P
+>   Now the web server starts sending its response back (i.e. typically a web
+   page).
+  </P
+></LI
+><LI
+><P
+>   First, the server headers are read and processed to determine, among other
+   things, the MIME type (document type) and encoding. The headers are then
+   filtered as determined by the
+   <A
+HREF="actions-file.html#CRUNCH-INCOMING-COOKIES"
+><SPAN
+CLASS="QUOTE"
+>"+crunch-incoming-cookies"</SPAN
+></A
+>,
+   <A
+HREF="actions-file.html#SESSION-COOKIES-ONLY"
+><SPAN
+CLASS="QUOTE"
+>"+session-cookies-only"</SPAN
+></A
+>,
+   and <A
+HREF="actions-file.html#DOWNGRADE-HTTP-VERSION"
+><SPAN
+CLASS="QUOTE"
+>"+downgrade-http-version"</SPAN
+></A
+>
+   actions.
+  </P
+></LI
+><LI
+><P
+>   If any <A
+HREF="actions-file.html#FILTER"
+><SPAN
+CLASS="QUOTE"
+>"+filter"</SPAN
+></A
+> action
+   or <A
+HREF="actions-file.html#DEANIMATE-GIFS"
+><SPAN
+CLASS="QUOTE"
+>"+deanimate-gifs"</SPAN
+></A
+>
+   action applies (and the document type fits the action), the rest of the page is
+   read into memory (up to a configurable limit). Then the filter rules (from
+   <TT
+CLASS="FILENAME"
+>default.filter</TT
+> and any other filter files) are
+   processed against the buffered content. Filters are applied in the order
+   they are specified in one of the filter files. Animated GIFs, if present,
+   are reduced to either the first or last frame, depending on the action
+   setting. The entire page, which is now filtered, is then sent by
+   <SPAN
+CLASS="APPLICATION"
+>Privoxy</SPAN
+> back to your browser.
+  </P
+><P
+>   If neither a <A
+HREF="actions-file.html#FILTER"
+><SPAN
+CLASS="QUOTE"
+>"+filter"</SPAN
+></A
+> action
+   nor <A
+HREF="actions-file.html#DEANIMATE-GIFS"
+><SPAN
+CLASS="QUOTE"
+>"+deanimate-gifs"</SPAN
+></A
+>
+   matches, then <SPAN
+CLASS="APPLICATION"
+>Privoxy</SPAN
+> passes the raw data through
+   to the client browser as it becomes available.
+  </P
+></LI
+><LI
+><P
+>   As the browser receives the (now possibly filtered) page content, it
+   reads and then requests any URLs that may be embedded within the page
+   source, e.g. ad images, stylesheets, JavaScript, other HTML documents (e.g.
+   frames), sounds, etc. For each of these objects, the browser issues a
+   separate request (this is easily viewable in <SPAN
+CLASS="APPLICATION"
+>Privoxy's</SPAN
+>
+   logs). And each such request is in turn processed just as above. Note that a
+   complex web page will have many, many such embedded URLs. If these
+   secondary requests are to a different server, then quite possibly a very
+   different set of actions is triggered.
+  </P
+></LI
+></UL
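+><P
+> The blocking step in the list above amounts to a small decision procedure.
+ The Python sketch below is a deliberately simplified model, not Privoxy's
+ actual code: the function name blocked_response and the returned labels are
+ hypothetical, and checking handle-as-image before handle-as-empty-document
+ is an assumption, since the text above does not say which wins when both
+ apply:</P
+><TABLE
+BORDER="0"
+BGCOLOR="#E0E0E0"
+WIDTH="100%"
+><TR
+><TD
+><PRE
+CLASS="SCREEN"
+>
```python
def blocked_response(actions):
    """Pick the reply for a request, given the set of active action names.

    Simplified model of the blocking step; returns None when the request
    is not blocked and would be forwarded to the remote server.
    """
    if "block" not in actions:
        return None
    if "handle-as-image" in actions:           # assumed to be checked first
        return "image"
    if "handle-as-empty-document" in actions:
        return "empty document"
    return "BLOCKED page"

print(blocked_response({"block"}))
print(blocked_response({"block", "handle-as-image"}))
print(blocked_response({"filter"}))
```
+</PRE
+></TD
+></TR
+></TABLE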
+><P
+> NOTE: This is a somewhat simplified overview of what happens with each URL
+ request. For the sake of brevity and simplicity, we have focused on
+ <SPAN
+CLASS="APPLICATION"
+>Privoxy's</SPAN
+> core features only.</P
+></DIV
+><DIV
+CLASS="SECT2"
+><H2
+CLASS="SECT2"
+><A
+NAME="ACTIONSANAT"
+>14.4. Troubleshooting: Anatomy of an Action</A
+></H2
+><P
+> The way <SPAN
+CLASS="APPLICATION"
+>Privoxy</SPAN
+> applies
+ <A
+HREF="actions-file.html#ACTIONS"
+>actions</A
+> and <A
+HREF="actions-file.html#FILTER"
+>filters</A
+>
+ to any given URL can be complex, and it is not always
+ easy to understand what is happening. And sometimes we need to be able to
+ <SPAN
+CLASS="emphasis"
+><I
+CLASS="EMPHASIS"
+>see</I
+></SPAN
+> just what <SPAN
+CLASS="APPLICATION"
+>Privoxy</SPAN
+> is
+ doing, especially if something <SPAN
+CLASS="APPLICATION"
+>Privoxy</SPAN
+> is doing
+ is causing us a problem inadvertently. It can be a little daunting to look at
+ the actions and filters files themselves, since they tend to be filled with
+ <A
+HREF="appendix.html#REGEX"
+>regular expressions</A
+> whose consequences are not
+ always so obvious.</P
+><P
+> One quick test to see if <SPAN
+CLASS="APPLICATION"
+>Privoxy</SPAN
+> is causing a problem
+ or not, is to disable it temporarily. This should be the first troubleshooting
+ step (be sure to flush caches afterward!). Looking at the
+ logs is a good idea too. (Note that both the toggle feature and logging are
+ enabled via <TT
+CLASS="FILENAME"
+>config</TT
+> file settings, and may need to be
+ turned <SPAN
+CLASS="QUOTE"
+>"on"</SPAN
+>.)</P
+><P
+> Another easy troubleshooting step, if you have customized your
+ installation, is to revert to the installed
+ defaults and see if that helps. There are times the developers get complaints
+ about one thing or another, and the problem turns out to be caused by a customized
+ configuration.</P
+><P
+> <SPAN
+CLASS="APPLICATION"
+>Privoxy</SPAN
+> also provides the
+ <A
+HREF="http://config.privoxy.org/show-url-info"
+TARGET="_top"
+>http://config.privoxy.org/show-url-info</A
+>
+ page that can show us very specifically how <SPAN
+CLASS="APPLICATION"
+>actions</SPAN
+>
+ are being applied to any given URL. This is a big help for troubleshooting.</P
+><P
+> First, enter one URL (or partial URL) at the prompt, and then
+ <SPAN
+CLASS="APPLICATION"
+>Privoxy</SPAN
+> will tell us
+ how the current configuration will handle it. This will not
+ help with filtering effects (i.e. the <A
+HREF="actions-file.html#FILTER"
+><SPAN
+CLASS="QUOTE"
+>"+filter"</SPAN
+></A
+> action) from
+ one of the filter files, since filtering is handled very
+ differently and is not so easy to trap! It also will not tell you about any other
+ URLs that may be embedded within the URL you are testing. For instance, images
+ such as ads are expressed as URLs within the raw page source of HTML pages. So
+ you will only get info for the actual URL that is pasted into the prompt area
+ -- not any sub-URLs. If you want to know about embedded URLs like ads, you
+ will have to dig those out of the HTML source. Use your browser's <SPAN
+CLASS="QUOTE"
+>"View
+ Page Source"</SPAN
+> option for this. Or right click on the ad, and grab the
+ URL.</P
+><P
+> Let's try an example, <A
+HREF="http://google.com"
+TARGET="_top"
+>google.com</A
+>,
+ and look at it one section at a time in a sample configuration (your real
+ configuration may vary):</P
+><TABLE
+BORDER="0"
+BGCOLOR="#E0E0E0"
+WIDTH="100%"
+><TR
+><TD
+><PRE
+CLASS="SCREEN"
+> Matches for http://www.google.com:
  
- In file: default.action <span class="GUIBUTTON">[ View ]</span> <span class="GUIBUTTON">[ Edit ]</span>
+ In file: default.action <SPAN
+CLASS="GUIBUTTON"
+>[ View ]</SPAN
+> <SPAN
+CLASS="GUIBUTTON"
+>[ Edit ]</SPAN
+>
  
   {+change-x-forwarded-for{block}
   +deanimate-gifs {last}
@@ -394,49 +1324,167 @@
   { -fast-redirects }
   .google.com
  
-In file: user.action <span class="GUIBUTTON">[ View ]</span> <span class="GUIBUTTON">[ Edit ]</span>
-(no matches in this file)</pre>
-          </td>
-        </tr>
-      </table>
-      <p>This is telling us how we have defined our <a href="actions-file.html#ACTIONS"><span class=
-      "QUOTE">"actions"</span></a>, and which ones match for our test case, <span class="QUOTE">"google.com"</span>.
-      Displayed is all the actions that are available to us. Remember, the <tt class="LITERAL">+</tt> sign denotes
-      <span class="QUOTE">"on"</span>. <tt class="LITERAL">-</tt> denotes <span class="QUOTE">"off"</span>. So some are
-      <span class="QUOTE">"on"</span> here, but many are <span class="QUOTE">"off"</span>. Each example we try may
-      provide a slightly different end result, depending on our configuration directives.</p>
-      <p>The first listing is for our <tt class="FILENAME">default.action</tt> file. The large, multi-line listing, is
-      how the actions are set to match for all URLs, i.e. our default settings. If you look at your <span class=
-      "QUOTE">"actions"</span> file, this would be the section just below the <span class="QUOTE">"aliases"</span>
-      section near the top. This will apply to all URLs as signified by the single forward slash at the end of the
-      listing -- <span class="QUOTE">" / "</span>.</p>
-      <p>But we have defined additional actions that would be exceptions to these general rules, and then we list
-      specific URLs (or patterns) that these exceptions would apply to. Last match wins. Just below this then are two
-      explicit matches for <span class="QUOTE">".google.com"</span>. The first is negating our previous cookie setting,
-      which was for <a href="actions-file.html#SESSION-COOKIES-ONLY"><span class=
-      "QUOTE">"+session-cookies-only"</span></a> (i.e. not persistent). So we will allow persistent cookies for google,
-      at least that is how it is in this example. The second turns <span class="emphasis"><i class=
-      "EMPHASIS">off</i></span> any <a href="actions-file.html#FAST-REDIRECTS"><span class=
-      "QUOTE">"+fast-redirects"</span></a> action, allowing this to take place unmolested. Note that there is a leading
-      dot here -- <span class="QUOTE">".google.com"</span>. This will match any hosts and sub-domains, in the
-      google.com domain also, such as <span class="QUOTE">"www.google.com"</span> or <span class=
-      "QUOTE">"mail.google.com"</span>. But it would not match <span class="QUOTE">"www.google.de"</span>! So,
-      apparently, we have these two actions defined as exceptions to the general rules at the top somewhere in the
-      lower part of our <tt class="FILENAME">default.action</tt> file, and <span class="QUOTE">"google.com"</span> is
-      referenced somewhere in these latter sections.</p>
-      <p>Then, for our <tt class="FILENAME">user.action</tt> file, we again have no hits. So there is nothing
-      google-specific that we might have added to our own, local configuration. If there was, those actions would
-      over-rule any actions from previously processed files, such as <tt class="FILENAME">default.action</tt>.
-      <tt class="FILENAME">user.action</tt> typically has the last word. This is the best place to put hard and fast
-      exceptions,</p>
-      <p>And finally we pull it all together in the bottom section and summarize how <span class=
-      "APPLICATION">Privoxy</span> is applying all its <span class="QUOTE">"actions"</span> to <span class=
-      "QUOTE">"google.com"</span>:</p>
-      <table border="0" bgcolor="#E0E0E0" width="100%">
-        <tr>
-          <td>
-            <pre class="SCREEN">
- Final results:
+In file: user.action <SPAN
+CLASS="GUIBUTTON"
+>[ View ]</SPAN
+> <SPAN
+CLASS="GUIBUTTON"
+>[ Edit ]</SPAN
+>
+(no matches in this file)</PRE
+></TD
+></TR
+></TABLE
+><P
+> This is telling us how we have defined our
+ <A
+HREF="actions-file.html#ACTIONS"
+><SPAN
+CLASS="QUOTE"
+>"actions"</SPAN
+></A
+>, and
+ which ones match for our test case, <SPAN
+CLASS="QUOTE"
+>"google.com"</SPAN
+>.
+ Displayed are all the actions that are available to us. Remember,
+ the <TT
+CLASS="LITERAL"
+>+</TT
+> sign denotes <SPAN
+CLASS="QUOTE"
+>"on"</SPAN
+>. <TT
+CLASS="LITERAL"
+>-</TT
+>
+ denotes <SPAN
+CLASS="QUOTE"
+>"off"</SPAN
+>. So some are <SPAN
+CLASS="QUOTE"
+>"on"</SPAN
+> here, but many
+ are <SPAN
+CLASS="QUOTE"
+>"off"</SPAN
+>. Each example we try may provide a slightly different
+ end result, depending on our configuration directives.</P
+><P
+> The first listing
+  is for our <TT
+CLASS="FILENAME"
+>default.action</TT
+> file. The large, multi-line
+  listing is how the actions are set to match for all URLs, i.e. our default
+  settings. If you look at your <SPAN
+CLASS="QUOTE"
+>"actions"</SPAN
+> file, this would be the
+  section just below the <SPAN
+CLASS="QUOTE"
+>"aliases"</SPAN
+> section near the top. This
+  will apply to all URLs as signified by the single forward slash at the end
+  of the listing -- <SPAN
+CLASS="QUOTE"
+>" / "</SPAN
+>.</P
+><P
+> But we have defined additional actions that would be exceptions to these general
+ rules, and then we list specific URLs (or patterns) that these exceptions
+ would apply to. Last match wins. Just below this then are two explicit
+ matches for <SPAN
+CLASS="QUOTE"
+>".google.com"</SPAN
+>. The first is negating our previous
+ cookie setting, which was for <A
+HREF="actions-file.html#SESSION-COOKIES-ONLY"
+><SPAN
+CLASS="QUOTE"
+>"+session-cookies-only"</SPAN
+></A
+>
+ (i.e. not persistent). So we will allow persistent cookies for google, at
+ least that is how it is in this example. The second turns
+ <SPAN
+CLASS="emphasis"
+><I
+CLASS="EMPHASIS"
+>off</I
+></SPAN
+> any <A
+HREF="actions-file.html#FAST-REDIRECTS"
+><SPAN
+CLASS="QUOTE"
+>"+fast-redirects"</SPAN
+></A
+>
+ action, allowing this to take place unmolested. Note that there is a leading
+ dot here -- <SPAN
+CLASS="QUOTE"
+>".google.com"</SPAN
+>. This will also match any hosts and
+ sub-domains in the google.com domain, such as
+ <SPAN
+CLASS="QUOTE"
+>"www.google.com"</SPAN
+> or <SPAN
+CLASS="QUOTE"
+>"mail.google.com"</SPAN
+>. But it would not
+ match <SPAN
+CLASS="QUOTE"
+>"www.google.de"</SPAN
+>! So, apparently, we have these two actions
+ defined as exceptions to the general rules at the top somewhere in the lower
+ part of our <TT
+CLASS="FILENAME"
+>default.action</TT
+> file, and
+ <SPAN
+CLASS="QUOTE"
+>"google.com"</SPAN
+> is referenced somewhere in these latter sections.</P
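+><P
+> The effect of the leading dot can be illustrated with a rough Python sketch.
+ It handles only plain patterns with a leading dot, not wildcards, and it is
+ an approximation rather than Privoxy's real matcher. The function name
+ domain_matches is hypothetical, and treating the bare domain itself as a
+ match is an assumption:</P
+><TABLE
+BORDER="0"
+BGCOLOR="#E0E0E0"
+WIDTH="100%"
+><TR
+><TD
+><PRE
+CLASS="SCREEN"
+>
```python
def domain_matches(pattern, host):
    """Approximate check for a leading-dot pattern such as ".google.com".

    Assumption: the bare domain ("google.com") counts as a match as well
    as any host below it. Wildcards are not handled in this sketch.
    """
    suffix = pattern.lstrip(".")
    return host == suffix or host.endswith("." + suffix)

for host in ["www.google.com", "mail.google.com", "www.google.de"]:
    print(host, domain_matches(".google.com", host))
```
+</PRE
+></TD
+></TR
+></TABLE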
+><P
+> Then, for our <TT
+CLASS="FILENAME"
+>user.action</TT
+> file, we again have no hits.
+ So there is nothing google-specific that we might have added to our own, local
+ configuration. If there were, those actions would overrule any actions from
+ previously processed files, such as <TT
+CLASS="FILENAME"
+>default.action</TT
+>.
+ <TT
+CLASS="FILENAME"
+>user.action</TT
+> typically has the last word. This is the
+ best place to put hard and fast exceptions.</P
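+><P
+> The rule that the last match wins, both within one actions file and across
+ files, can be modelled in a few lines of Python. This is a sketch under
+ stated assumptions, not Privoxy's implementation: the section lists are
+ hypothetical and heavily shortened, each pattern is replaced by a plain
+ Python function, and the names final_actions, any_host and google_hosts are
+ invented for the example:</P
+><TABLE
+BORDER="0"
+BGCOLOR="#E0E0E0"
+WIDTH="100%"
+><TR
+><TD
+><PRE
+CLASS="SCREEN"
+>
```python
def any_host(host):
    """Stands in for the catch-all pattern "/"."""
    return True

def google_hosts(host):
    """Stands in for the pattern ".google.com" (simplified)."""
    return host == "google.com" or host.endswith(".google.com")

def final_actions(files, host):
    """Merge the settings of every matching section, in file order."""
    result = {}
    for sections in files:
        for applies_to, settings in sections:
            if applies_to(host):
                result.update(settings)  # a later match overrides an earlier one
    return result

# Hypothetical, heavily shortened stand-ins for default.action and user.action.
default_action = [
    (any_host, {"session-cookies-only": True, "fast-redirects": True}),
    (google_hosts, {"session-cookies-only": False}),
    (google_hosts, {"fast-redirects": False}),
]
user_action = []

print(final_actions([default_action, user_action], "www.google.com"))
print(final_actions([default_action, user_action], "www.google.de"))
```
+</PRE
+></TD
+></TR
+></TABLE
+><P
+> Adding a section to the user_action list would override the corresponding
+ settings from default_action, because it is processed later.</P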
+><P
+> And finally we pull it all together in the bottom section and summarize how
+ <SPAN
+CLASS="APPLICATION"
+>Privoxy</SPAN
+> is applying all its <SPAN
+CLASS="QUOTE"
+>"actions"</SPAN
+>
+ to <SPAN
+CLASS="QUOTE"
+>"google.com"</SPAN
+>:</P
+><TABLE
+BORDER="0"
+BGCOLOR="#E0E0E0"
+WIDTH="100%"
+><TR
+><TD
+><PRE
+CLASS="SCREEN"
+> Final results:
  
   -add-header
   -block
@@ -493,51 +1541,132 @@ In file: user.action <span class="GUIBUTTON">[ View ]</span> <span class="GUIBUT
   -server-header-filter{xml-to-html}
   -server-header-filter{html-to-xml}
   -session-cookies-only
- +set-image-blocker {pattern} </pre>
-          </td>
-        </tr>
-      </table>
-      <p>Notice the only difference here to the previous listing, is to <span class="QUOTE">"fast-redirects"</span> and
-      <span class="QUOTE">"session-cookies-only"</span>, which are activated specifically for this site in our
-      configuration, and thus show in the <span class="QUOTE">"Final Results"</span>.</p>
-      <p>Now another example, <span class="QUOTE">"ad.doubleclick.net"</span>:</p>
-      <table border="0" bgcolor="#E0E0E0" width="100%">
-        <tr>
-          <td>
-            <pre class="SCREEN">
- { +block{Domains starts with "ad"} }
+ +set-image-blocker {pattern} </PRE
+></TD
+></TR
+></TABLE
+><P
+> Notice that the only differences from the previous listing are in
+ <SPAN
+CLASS="QUOTE"
+>"fast-redirects"</SPAN
+> and <SPAN
+CLASS="QUOTE"
+>"session-cookies-only"</SPAN
+>,
+ which are overridden specifically for this site in our configuration,
+ and thus show in the <SPAN
+CLASS="QUOTE"
+>"Final Results"</SPAN
+>.</P
+><P
+> Now another example, <SPAN
+CLASS="QUOTE"
+>"ad.doubleclick.net"</SPAN
+>:</P
+><TABLE
+BORDER="0"
+BGCOLOR="#E0E0E0"
+WIDTH="100%"
+><TR
+><TD
+><PRE
+CLASS="SCREEN"
+> { +block{Domains starts with "ad"} }
    ad*.
  
   { +block{Domain contains "ad"} }
    .ad.
  
   { +block{Doubleclick banner server} +handle-as-image }
-  .[a-vx-z]*.doubleclick.net</pre>
-          </td>
-        </tr>
-      </table>
-      <p>We'll just show the interesting part here - the explicit matches. It is matched three different times. Two
-      <span class="QUOTE">"+block{}"</span> sections, and a <span class="QUOTE">"+block{} +handle-as-image"</span>,
-      which is the expanded form of one of our aliases that had been defined as: <span class=
-      "QUOTE">"+block-as-image"</span>. (<a href="actions-file.html#ALIASES"><span class="QUOTE">"Aliases"</span></a>
-      are defined in the first section of the actions file and typically used to combine more than one action.)</p>
-      <p>Any one of these would have done the trick and blocked this as an unwanted image. This is unnecessarily
-      redundant since the last case effectively would also cover the first. No point in taking chances with these guys
-      though ;-) Note that if you want an ad or obnoxious URL to be invisible, it should be defined as <span class=
-      "QUOTE">"ad.doubleclick.net"</span> is done here -- as both a <a href="actions-file.html#BLOCK"><span class=
-      "QUOTE">"+block{}"</span></a> <span class="emphasis"><i class="EMPHASIS">and</i></span> an <a href=
-      "actions-file.html#HANDLE-AS-IMAGE"><span class="QUOTE">"+handle-as-image"</span></a>. The custom alias
-      <span class="QUOTE">"<tt class="LITERAL">+block-as-image</tt>"</span> just simplifies the process and make it
-      more readable.</p>
-      <p>One last example. Let's try <span class="QUOTE">"http://www.example.net/adsl/HOWTO/"</span>. This one is
-      giving us problems. We are getting a blank page. Hmmm ...</p>
-      <table border="0" bgcolor="#E0E0E0" width="100%">
-        <tr>
-          <td>
-            <pre class="SCREEN">
- Matches for http://www.example.net/adsl/HOWTO/:
+  .[a-vx-z]*.doubleclick.net</PRE
+></TD
+></TR
+></TABLE
+><P
+> We'll just show the interesting part here: the explicit matches. The URL is
+ matched three different times: by two <SPAN
+CLASS="QUOTE"
+>"+block{}"</SPAN
+> sections,
+ and a <SPAN
+CLASS="QUOTE"
+>"+block{} +handle-as-image"</SPAN
+>,
+ which is the expanded form of one of our aliases that had been defined as:
+ <SPAN
+CLASS="QUOTE"
+>"+block-as-image"</SPAN
+>. (<A
+HREF="actions-file.html#ALIASES"
+><SPAN
+CLASS="QUOTE"
+>"Aliases"</SPAN
+></A
+> are defined in
+ the first section of the actions file and typically used to combine more
+ than one action.)</P
+><P
+> Any one of these would have done the trick and blocked this as an unwanted
+ image. This is unnecessarily redundant since the last case effectively
+ would also cover the first. No point in taking chances with these guys
+ though ;-) Note that if you want an ad or obnoxious
+ URL to be invisible, it should be defined as <SPAN
+CLASS="QUOTE"
+>"ad.doubleclick.net"</SPAN
+>
+ is done here -- as both a <A
+HREF="actions-file.html#BLOCK"
+><SPAN
+CLASS="QUOTE"
+>"+block{}"</SPAN
+></A
+>
+ <SPAN
+CLASS="emphasis"
+><I
+CLASS="EMPHASIS"
+>and</I
+></SPAN
+> an
+ <A
+HREF="actions-file.html#HANDLE-AS-IMAGE"
+><SPAN
+CLASS="QUOTE"
+>"+handle-as-image"</SPAN
+></A
+>.
+ The custom alias <SPAN
+CLASS="QUOTE"
+>"<TT
+CLASS="LITERAL"
+>+block-as-image</TT
+>"</SPAN
+> just
+ simplifies the process and makes it more readable.</P
+><P
+> One last example. Let's try <SPAN
+CLASS="QUOTE"
+>"http://www.example.net/adsl/HOWTO/"</SPAN
+>.
+ This one is giving us problems. We are getting a blank page. Hmmm ...</P
+><TABLE
+BORDER="0"
+BGCOLOR="#E0E0E0"
+WIDTH="100%"
+><TR
+><TD
+><PRE
+CLASS="SCREEN"
+> Matches for http://www.example.net/adsl/HOWTO/:
  
- In file: default.action <span class="GUIBUTTON">[ View ]</span> <span class="GUIBUTTON">[ Edit ]</span>
+ In file: default.action <SPAN
+CLASS="GUIBUTTON"
+>[ View ]</SPAN
+> <SPAN
+CLASS="GUIBUTTON"
+>[ Edit ]</SPAN
+>
  
   {-add-header
    -block
@@ -596,117 +1725,298 @@ In file: user.action <span class="GUIBUTTON">[ View ]</span> <span class="GUIBUT
     /
  
   { +block{Path contains "ads".} +handle-as-image }
-  /ads</pre>
-          </td>
-        </tr>
-      </table>
-      <p>Ooops, the <span class="QUOTE">"/adsl/"</span> is matching <span class="QUOTE">"/ads"</span> in our
-      configuration! But we did not want this at all! Now we see why we get the blank page. It is actually triggering
-      two different actions here, and the effects are aggregated so that the URL is blocked, and <span class=
-      "APPLICATION">Privoxy</span> is told to treat the block as if it were an image. But this is, of course, all
-      wrong. We could now add a new action below this (or better in our own <tt class="FILENAME">user.action</tt> file)
-      that explicitly <span class="emphasis"><i class="EMPHASIS">un</i></span> blocks ( <a href=
-      "actions-file.html#BLOCK"><span class="QUOTE">"{-block}"</span></a>) paths with <span class="QUOTE">"adsl"</span>
-      in them (remember, last match in the configuration wins). There are various ways to handle such exceptions.
-      Example:</p>
-      <table border="0" bgcolor="#E0E0E0" width="100%">
-        <tr>
-          <td>
-            <pre class="SCREEN">
- { -block }
-  /adsl</pre>
-          </td>
-        </tr>
-      </table>
-      <p>Now the page displays ;-) Remember to flush your browser's caches when making these kinds of changes to your
-      configuration to insure that you get a freshly delivered page! Or, try using <tt class=
-      "LITERAL">Shift+Reload</tt>.</p>
-      <p>But now what about a situation where we get no explicit matches like we did with:</p>
-      <table border="0" bgcolor="#E0E0E0" width="100%">
-        <tr>
-          <td>
-            <pre class="SCREEN">
- { +block{Path starts with "ads".} +handle-as-image }
- /ads</pre>
-          </td>
-        </tr>
-      </table>
-      <p>That actually was very helpful and pointed us quickly to where the problem was. If you don't get this kind of
-      match, then it means one of the default rules in the first section of <tt class="FILENAME">default.action</tt> is
-      causing the problem. This would require some guesswork, and maybe a little trial and error to isolate the
-      offending rule. One likely cause would be one of the <a href="actions-file.html#FILTER"><span class=
-      "QUOTE">"+filter"</span></a> actions. These tend to be harder to troubleshoot. Try adding the URL for the site to
-      one of aliases that turn off <a href="actions-file.html#FILTER"><span class="QUOTE">"+filter"</span></a>:</p>
-      <table border="0" bgcolor="#E0E0E0" width="100%">
-        <tr>
-          <td>
-            <pre class="SCREEN">
- { shop }
+  /ads</PRE
+></TD
+></TR
+></TABLE
+><P
+> Oops, the <SPAN
+CLASS="QUOTE"
+>"/adsl/"</SPAN
+> is matching <SPAN
+CLASS="QUOTE"
+>"/ads"</SPAN
+> in our
+ configuration! But we did not want this at all! Now we see why we get the
+ blank page. It is actually triggering two different actions here, and
+ the effects are aggregated so that the URL is blocked, and <SPAN
+CLASS="APPLICATION"
+>Privoxy</SPAN
+> is told
+ to treat the block as if it were an image. But this is, of course, all wrong.
+  We could now add a new action below this (or better in our own
+  <TT
+CLASS="FILENAME"
+>user.action</TT
+> file) that explicitly
+  <SPAN
+CLASS="emphasis"
+><I
+CLASS="EMPHASIS"
+>un</I
+></SPAN
+> blocks (
+  <A
+HREF="actions-file.html#BLOCK"
+><SPAN
+CLASS="QUOTE"
+>"{-block}"</SPAN
+></A
+>) paths with
+  <SPAN
+CLASS="QUOTE"
+>"adsl"</SPAN
+> in them (remember, last match in the configuration
+  wins). There are various ways to handle such exceptions. Example:</P
+><TABLE
+BORDER="0"
+BGCOLOR="#E0E0E0"
+WIDTH="100%"
+><TR
+><TD
+><PRE
+CLASS="SCREEN"
+> { -block }
+  /adsl</PRE
+></TD
+></TR
+></TABLE
+><P
+> Now the page displays ;-)
+ Remember to flush your browser's caches when making these kinds of changes to
+ your configuration to ensure that you get a freshly delivered page! Or, try
+ using <TT
+CLASS="LITERAL"
+>Shift+Reload</TT
+>.</P
+><P
+> But now what about a situation where we get no explicit matches like
+ we did with:</P
+><TABLE
+BORDER="0"
+BGCOLOR="#E0E0E0"
+WIDTH="100%"
+><TR
+><TD
+><PRE
+CLASS="SCREEN"
+> { +block{Path starts with "ads".} +handle-as-image }
+ /ads</PRE
+></TD
+></TR
+></TABLE
+><P
+> That actually was very helpful and pointed us quickly to where the problem
+ was. If you don't get this kind of match, then it means one of the default
+ rules in the first section of <TT
+CLASS="FILENAME"
+>default.action</TT
+> is causing
+ the problem. This would require some guesswork, and maybe a little trial and
+ error to isolate the offending rule. One likely cause would be one of the
+ <A
+HREF="actions-file.html#FILTER"
+><SPAN
+CLASS="QUOTE"
+>"+filter"</SPAN
+></A
+> actions.
+ These tend to be harder to troubleshoot.
+ Try adding the URL for the site to one of the aliases that turn off
+ <A
+HREF="actions-file.html#FILTER"
+><SPAN
+CLASS="QUOTE"
+>"+filter"</SPAN
+></A
+>:</P
+><TABLE
+BORDER="0"
+BGCOLOR="#E0E0E0"
+WIDTH="100%"
+><TR
+><TD
+><PRE
+CLASS="SCREEN"
+> { shop }
   .quietpc.com
   .worldpay.com   # for quietpc.com
   .jungle.com
   .scan.co.uk
- .forbes.com</pre>
-          </td>
-        </tr>
-      </table>
-      <p><span class="QUOTE">"<tt class="LITERAL">{ shop }</tt>"</span> is an <span class="QUOTE">"alias"</span> that
-      expands to <span class="QUOTE">"<tt class="LITERAL">{ -filter -session-cookies-only }</tt>"</span>. Or you could
-      do your own exception to negate filtering:</p>
-      <table border="0" bgcolor="#E0E0E0" width="100%">
-        <tr>
-          <td>
-            <pre class="SCREEN">
- { -filter }
+ .forbes.com</PRE
+></TD
+></TR
+></TABLE
+><P
+> <SPAN
+CLASS="QUOTE"
+>"<TT
+CLASS="LITERAL"
+>{ shop }</TT
+>"</SPAN
+> is an <SPAN
+CLASS="QUOTE"
+>"alias"</SPAN
+> that expands to
+ <SPAN
+CLASS="QUOTE"
+>"<TT
+CLASS="LITERAL"
+>{ -filter -session-cookies-only }</TT
+>"</SPAN
+>.
+ Or you could write your own exception to negate filtering:</P
+><TABLE
+BORDER="0"
+BGCOLOR="#E0E0E0"
+WIDTH="100%"
+><TR
+><TD
+><PRE
+CLASS="SCREEN"
+> { -filter }
   # Disable ALL filter actions for sites in this section
   .forbes.com
   developer.ibm.com
- localhost</pre>
-          </td>
-        </tr>
-      </table>
-      <p>This would turn off all filtering for these sites. This is best put in <tt class="FILENAME">user.action</tt>,
-      for local site exceptions. Note that when a simple domain pattern is used by itself (without the subsequent path
-      portion), all sub-pages within that domain are included automatically in the scope of the action.</p>
-      <p>Images that are inexplicably being blocked, may well be hitting the <a href=
-      "actions-file.html#FILTER-BANNERS-BY-SIZE"><span class="QUOTE">"+filter{banners-by-size}"</span></a> rule, which
-      assumes that images of certain sizes are ad banners (works well <span class="emphasis"><i class="EMPHASIS">most
-      of the time</i></span> since these tend to be standardized).</p>
-      <p><span class="QUOTE">"<tt class="LITERAL">{ fragile }</tt>"</span> is an alias that disables most actions that
-      are the most likely to cause trouble. This can be used as a last resort for problem sites.</p>
-      <table border="0" bgcolor="#E0E0E0" width="100%">
-        <tr>
-          <td>
-            <pre class="SCREEN">
- { fragile }
+ localhost</PRE
+></TD
+></TR
+></TABLE
+><P
+> This would turn off all filtering for these sites. This is best
+ put in <TT
+CLASS="FILENAME"
+>user.action</TT
+>, for local site
+ exceptions. Note that when a simple domain pattern is used by itself (without
+ the subsequent path portion), all sub-pages within that domain are included
+ automatically in the scope of the action.</P
+><P
+> Images that are inexplicably being blocked may well be hitting the
+<A
+HREF="actions-file.html#FILTER-BANNERS-BY-SIZE"
+><SPAN
+CLASS="QUOTE"
+>"+filter{banners-by-size}"</SPAN
+></A
+>
+ rule, which assumes
+ that images of certain sizes are ad banners (works well
+ <SPAN
+CLASS="emphasis"
+><I
+CLASS="EMPHASIS"
+>most of the time</I
+></SPAN
+>  since these tend to be standardized).</P
+><P
+> <SPAN
+CLASS="QUOTE"
+>"<TT
+CLASS="LITERAL"
+>{ fragile }</TT
+>"</SPAN
+> is an alias that disables most
+ actions that are most likely to cause trouble. This can be used as a
+ last resort for problem sites.</P
+><TABLE
+BORDER="0"
+BGCOLOR="#E0E0E0"
+WIDTH="100%"
+><TR
+><TD
+><PRE
+CLASS="SCREEN"
+> { fragile }
   # Handle with care: easy to break
   mail.google.
- mybank.example.com</pre>
-          </td>
-        </tr>
-      </table>
-      <p><span class="emphasis"><i class="EMPHASIS">Remember to flush caches!</i></span> Note that the <tt class=
-      "LITERAL">mail.google</tt> reference lacks the TLD portion (e.g. <span class="QUOTE">".com"</span>). This will
-      effectively match any TLD with <tt class="LITERAL">google</tt> in it, such as <tt class=
-      "LITERAL">mail.google.de.</tt>, just as an example.</p>
-      <p>If this still does not work, you will have to go through the remaining actions one by one to find which one(s)
-      is causing the problem.</p>
-    </div>
-  </div>
-  <div class="NAVFOOTER">
-    <hr align="left" width="100%">
-    <table summary="Footer navigation table" width="100%" border="0" cellpadding="0" cellspacing="0">
-      <tr>
-        <td width="33%" align="left" valign="top"><a href="seealso.html" accesskey="P">Prev</a></td>
-        <td width="34%" align="center" valign="top"><a href="index.html" accesskey="H">Home</a></td>
-        <td width="33%" align="right" valign="top">&nbsp;</td>
-      </tr>
-      <tr>
-        <td width="33%" align="left" valign="top">See Also</td>
-        <td width="34%" align="center" valign="top">&nbsp;</td>
-        <td width="33%" align="right" valign="top">&nbsp;</td>
-      </tr>
-    </table>
-  </div>
-</body>
-</html>
+ mybank.example.com</PRE
+></TD
+></TR
+></TABLE
+><P
+> <SPAN
+CLASS="emphasis"
+><I
+CLASS="EMPHASIS"
+>Remember to flush caches!</I
+></SPAN
+> Note that the
+ <TT
+CLASS="LITERAL"
+>mail.google</TT
+> reference lacks the TLD portion (e.g.
+ <SPAN
+CLASS="QUOTE"
+>".com"</SPAN
+>). This will effectively match
+ <TT
+CLASS="LITERAL"
+>mail.google</TT
+> under any TLD, such as <TT
+CLASS="LITERAL"
+>mail.google.de.</TT
+>,
+ just as an example.</P
+><P
+> If this still does not work, you will have to go through the remaining
+ actions one by one to find which one is causing the problem.</P
+></DIV
+></DIV
+><DIV
+CLASS="NAVFOOTER"
+><HR
+ALIGN="LEFT"
+WIDTH="100%"><TABLE
+SUMMARY="Footer navigation table"
+WIDTH="100%"
+BORDER="0"
+CELLPADDING="0"
+CELLSPACING="0"
+><TR
+><TD
+WIDTH="33%"
+ALIGN="left"
+VALIGN="top"
+><A
+HREF="seealso.html"
+ACCESSKEY="P"
+>Prev</A
+></TD
+><TD
+WIDTH="34%"
+ALIGN="center"
+VALIGN="top"
+><A
+HREF="index.html"
+ACCESSKEY="H"
+>Home</A
+></TD
+><TD
+WIDTH="33%"
+ALIGN="right"
+VALIGN="top"
+>&nbsp;</TD
+></TR
+><TR
+><TD
+WIDTH="33%"
+ALIGN="left"
+VALIGN="top"
+>See Also</TD
+><TD
+WIDTH="34%"
+ALIGN="center"
+VALIGN="top"
+>&nbsp;</TD
+><TD
+WIDTH="33%"
+ALIGN="right"
+VALIGN="top"
+>&nbsp;</TD
+></TR
+></TABLE
+></DIV
+></BODY
+></HTML
+>
+\ No newline at end of file