- </dl>
- </div>
- <div class="SECT3">
- <h3 class="SECT3">
- <a name="AEN2843">8.4.1. The Domain Pattern</a>
- </h3>
- <p>
- The matching of the domain part offers some flexible options: if
- the domain starts or ends with a dot, it becomes unanchored at
- that end. For example:
- </p>
- <div class="VARIABLELIST">
- <dl>
- <dt>
- <tt class="LITERAL">.example.com</tt>
- </dt>
- <dd>
- <p>
- matches any domain with first-level domain <tt class=
- "LITERAL">com</tt> and second-level domain <tt class=
- "LITERAL">example</tt>. For example <tt class=
- "LITERAL">www.example.com</tt>, <tt class=
- "LITERAL">example.com</tt> and <tt class=
- "LITERAL">foo.bar.baz.example.com</tt>. Note that it
- wouldn't match if the second-level domain was <tt class=
- "LITERAL">another-example</tt>.
- </p>
- </dd>
- <dt>
- <tt class="LITERAL">www.</tt>
- </dt>
- <dd>
- <p>
- matches any domain that <span class="emphasis"><i class=
- "EMPHASIS">STARTS</i></span> with <tt class=
- "LITERAL">www.</tt> (It also matches the domain <tt class=
- "LITERAL">www</tt> but most of the time that doesn't
- matter.)
- </p>
- </dd>
- <dt>
- <tt class="LITERAL">.example.</tt>
- </dt>
- <dd>
- <p>
- matches any domain that <span class="emphasis"><i class=
- "EMPHASIS">CONTAINS</i></span> <tt class=
- "LITERAL">.example.</tt>. Since no path limitations are
- specified, any files or documents within such a domain are
- matched as well. (Strictly speaking, it matches any FQDN that
- contains <tt class="LITERAL">example</tt> as a domain.)
- Examples are <tt class="LITERAL">www.example.com</tt>, <tt
- class="LITERAL">news.example.de</tt> and <tt class=
- "LITERAL">www.example.net/cgi/testing.pl</tt>.
- </p>
- </dd>
- </dl>
- </div>
- <p>
- Additionally, there are wild-cards that you can use in the domain
- names themselves. These work similarly to shell globbing type
- wild-cards: <span class="QUOTE">"*"</span> represents zero or
- more arbitrary characters (this is equivalent to the <a href=
- "http://en.wikipedia.org/wiki/Regular_expressions" target=
- "_top"><span class="QUOTE">"Regular Expression"</span></a> based
- syntax of <span class="QUOTE">".*"</span>), <span class=
- "QUOTE">"?"</span> represents any single character (this is
- equivalent to the regular expression syntax of a simple <span
- class="QUOTE">"."</span>), and you can define <span class=
- "QUOTE">"character classes"</span> in square brackets which is
- similar to the same regular expression technique. All of this can
- be freely mixed:
- </p>
- <div class="VARIABLELIST">
- <dl>
- <dt>
- <tt class="LITERAL">ad*.example.com</tt>
- </dt>
- <dd>
- <p>
- matches <span class="QUOTE">"adserver.example.com"</span>,
- <span class="QUOTE">"ads.example.com"</span>, etc., but not
- <span class="QUOTE">"sfads.example.com"</span>.
- </p>
- </dd>
- <dt>
- <tt class="LITERAL">*ad*.example.com</tt>
- </dt>
- <dd>
- <p>
- matches all of the above, and then some.
- </p>
- </dd>
- <dt>
- <tt class="LITERAL">.?pix.com</tt>
- </dt>
- <dd>
- <p>
- matches <tt class="LITERAL">www.ipix.com</tt>, <tt class=
- "LITERAL">pictures.epix.com</tt>, <tt class=
- "LITERAL">a.b.c.d.e.upix.com</tt> etc.
- </p>
- </dd>
- <dt>
- <tt class="LITERAL">www[1-9a-ez].example.c*</tt>
- </dt>
- <dd>
- <p>
- matches <tt class="LITERAL">www1.example.com</tt>, <tt
- class="LITERAL">www4.example.cc</tt>, <tt class=
- "LITERAL">wwwd.example.cy</tt>, <tt class=
- "LITERAL">wwwz.example.com</tt> etc., but <span class=
- "emphasis"><i class="EMPHASIS">not</i></span> <tt class=
- "LITERAL">wwww.example.com</tt>.
- </p>
- </dd>
- </dl>
- </div>
- <p>
- While flexible, this syntax is less powerful than full regular
- expressions.
- </p>
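- <p>
- As a rough sketch, the domain patterns described above might be
- used in an actions file section like this. The block reason and
- the host names are illustrative assumptions, not taken from a
- shipped configuration:
- </p>
- <table border="0" bgcolor="#E0E0E0" width="90%">
- <tr>
- <td>
-<pre class="SCREEN">
-{+block{Illustrative domain pattern examples.}}
-# Leading dot: matches example.com and all of its subdomains
-.example.com
-# "*" wildcard: matches ads.example.com and adserver.example.com,
-# but not sfads.example.com
-ad*.example.com
-# Character class plus wildcard: matches www1.example.com and
-# wwwd.example.cy, but not wwww.example.com
-www[1-9a-ez].example.c*
-</pre>
- </td>
- </tr>
- </table>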
- </div>
- <div class="SECT3">
- <h3 class="SECT3">
- <a name="AEN2919">8.4.2. The Path Pattern</a>
- </h3>
- <p>
- <span class="APPLICATION">Privoxy</span> uses <span class=
- "QUOTE">"modern"</span> POSIX 1003.2 <a href=
- "http://en.wikipedia.org/wiki/Regular_expressions" target=
- "_top"><span class="QUOTE">"Regular Expressions"</span></a> for
- matching the path portion (after the slash), and is thus more
- flexible.
- </p>
- <p>
- There is an <a href="appendix.html#REGEX">Appendix</a> with a
- brief quick-start into regular expressions. You might also want
- to have a look at your operating system's documentation on
- regular expressions (try <tt class="LITERAL">man re_format</tt>).
- </p>
- <p>
- Note that the path pattern is automatically left-anchored at the
- <span class="QUOTE">"/"</span>, i.e. it matches as if it
- started with a <span class="QUOTE">"^"</span> (regular expression
- speak for the beginning of a line).
- </p>
- <p>
- Please also note that matching in the path is <span class=
- "emphasis"><i class="EMPHASIS">CASE INSENSITIVE</i></span> by
- default, but you can switch to case sensitive at any point in the
- pattern by using the <span class="QUOTE">"(?-i)"</span> switch:
- <tt class="LITERAL">www.example.com/(?-i)PaTtErN.*</tt> will
- match only documents whose path starts with <tt class=
- "LITERAL">PaTtErN</tt> in <span class="emphasis"><i class=
- "EMPHASIS">exactly</i></span> this capitalization.
- </p>
- <div class="VARIABLELIST">
- <dl>
- <dt>
- <tt class="LITERAL">.example.com/.*</tt>
- </dt>
- <dd>
- <p>
- Is equivalent to just <span class=
- "QUOTE">".example.com"</span>, since any documents within
- that domain are matched with or without the <span class=
- "QUOTE">".*"</span> regular expression. The path part is redundant.
- </p>
- </dd>
- <dt>
- <tt class="LITERAL">.example.com/.*/index.html$</tt>
- </dt>
- <dd>
- <p>
- Will match any page in the domain of <span class=
- "QUOTE">"example.com"</span> that is named <span class=
- "QUOTE">"index.html"</span>, and that is part of some path.
- For example, it matches <span class=
- "QUOTE">"www.example.com/testing/index.html"</span> but NOT
- <span class="QUOTE">"www.example.com/index.html"</span>
- because the regular expression called for at least two
- <span class="QUOTE">"/'s"</span>, thus the path
- requirement. It also would match <span class=
- "QUOTE">"www.example.com/testing/index_html"</span>,
- because of the special meta-character <span class=
- "QUOTE">"."</span>.
- </p>
- </dd>
- <dt>
- <tt class="LITERAL">.example.com/(.*/)?index\.html$</tt>
- </dt>
- <dd>
- <p>
- This regular expression is conditional so it will match any
- page named <span class="QUOTE">"index.html"</span>
- regardless of path, which in this case can have one or more
- <span class="QUOTE">"/'s"</span>. Because the dot is escaped,
- the name must contain exactly <span class=
- "QUOTE">".html"</span>, and because of the trailing <span
- class="QUOTE">"$"</span> the path has to end with it.
- </p>
- </dd>
- <dt>
- <tt class=
- "LITERAL">.example.com/(.*/)(ads|banners?|junk)</tt>
- </dt>
- <dd>
- <p>
- This regular expression will match any path of <span class=
- "QUOTE">"example.com"</span> that contains any of the words
- <span class="QUOTE">"ads"</span>, <span class=
- "QUOTE">"banner"</span>, <span class=
- "QUOTE">"banners"</span> (because of the <span class=
- "QUOTE">"?"</span>) or <span class="QUOTE">"junk"</span>.
- The path does not have to end in these words, just contain
- them.
- </p>
- </dd>
- <dt>
- <tt class=
- "LITERAL">.example.com/(.*/)(ads|banners?|junk)/.*\.(jpe?g|gif|png)$</tt>
- </dt>
- <dd>
- <p>
- This is very much the same as above, except now it must end
- in either <span class="QUOTE">".jpg"</span>, <span class=
- "QUOTE">".jpeg"</span>, <span class="QUOTE">".gif"</span>
- or <span class="QUOTE">".png"</span>. So this one is
- limited to common image formats.
- </p>
- </dd>
- </dl>
- </div>
- <p>
- There are many, many good examples to be found in <tt class=
- "FILENAME">default.action</tt>, and more tutorials below in <a
- href="appendix.html#REGEX">Appendix on regular expressions</a>.
- </p>
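- <p>
- A minimal sketch of a section that combines domain and path
- patterns, reusing the patterns discussed above. The block reason
- is a made-up placeholder:
- </p>
- <table border="0" bgcolor="#E0E0E0" width="90%">
- <tr>
- <td>
-<pre class="SCREEN">
-{+block{Illustrative path pattern examples.}}
-# Pages named index.html, with or without a preceding directory
-.example.com/(.*/)?index\.html$
-# Common image formats in an ads, banner(s) or junk directory
-.example.com/(.*/)(ads|banners?|junk)/.*\.(jpe?g|gif|png)$
-# Case sensitive from the (?-i) switch onwards
-www.example.com/(?-i)PaTtErN.*
-</pre>
- </td>
- </tr>
- </table>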
- </div>
- <div class="SECT3">
- <h3 class="SECT3">
- <a name="TAG-PATTERN">8.4.3. The Tag Pattern</a>
- </h3>
- <p>
- Tag patterns are used to change the applying actions based on the
- request's tags. Tags can be created with either the <a href=
- "actions-file.html#CLIENT-HEADER-TAGGER">client-header-tagger</a>
- or the <a href=
- "actions-file.html#SERVER-HEADER-TAGGER">server-header-tagger</a>
- action.
- </p>
- <p>
- Tag patterns have to start with <span class=
- "QUOTE">"TAG:"</span>, so <span class=
- "APPLICATION">Privoxy</span> can tell them apart from URL
- patterns. Everything after the colon including white space, is
- interpreted as a regular expression with path pattern syntax,
- except that tag patterns aren't left-anchored automatically
- (<span class="APPLICATION">Privoxy</span> doesn't silently add a
- <span class="QUOTE">"^"</span>, you have to do it yourself if you
- need it).
- </p>
- <p>
- To match all requests that are tagged with <span class=
- "QUOTE">"foo"</span> your pattern line should be <span class=
- "QUOTE">"TAG:^foo$"</span>, <span class="QUOTE">"TAG:foo"</span>
- would work as well, but it would also match requests whose tags
- contain <span class="QUOTE">"foo"</span> somewhere. <span class=
- "QUOTE">"TAG: foo"</span> wouldn't work as it requires white
- space.
- </p>
- <p>
- Sections can contain URL and tag patterns at the same time, but
- tag patterns are checked after the URL patterns and thus always
- overrule them, even if they are located before the URL patterns.
- </p>
- <p>
- Once a new tag is added, Privoxy checks right away if it's
- matched by one of the tag patterns and updates the action
- settings accordingly. As a result tags can be used to activate
- other tagger actions, as long as these other taggers look for
- headers that haven't already been parsed.
- </p>
- <p>
- For example you could tag client requests which use the <tt
- class="LITERAL">POST</tt> method, then use this tag to activate
- another tagger that adds a tag if cookies are sent, and then use
- a block action based on the cookie tag. This allows the outcome
- of one action to be used as input for a subsequent action.
- However, if you reversed the position of the described taggers
- and activated the method tagger based on the cookie tagger, no
- method tags would be created. The method tagger would look for the
- request line, but at the time the cookie tag is created, the
- request line has already been parsed.
- </p>
- <p>
- While this is a limitation you should be aware of, this kind of
- indirection is seldom needed anyway and even the example doesn't
- make too much sense.
- </p>
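- <p>
- The difference between anchored and unanchored tag patterns
- could be sketched as follows. This assumes that a tagger named
- <tt class="LITERAL">user-agent</tt> is defined in one of the
- filter files, and the actions chosen are only for illustration:
- </p>
- <table border="0" bgcolor="#E0E0E0" width="90%">
- <tr>
- <td>
-<pre class="SCREEN">
-# Create tags from the User-Agent header of every request
-{+client-header-tagger{user-agent}}
-/
-
-# Anchored: applies only to tags that start with this string
-{-hide-user-agent}
-TAG:^User-Agent: MPlayer/
-
-# Unanchored: applies to any tag that contains "MPlayer" somewhere
-{-filter}
-TAG:MPlayer
-</pre>
- </td>
- </tr>
- </table>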
- </div>
- </div>
- <div class="SECT2">
- <h2 class="SECT2">
- <a name="ACTIONS">8.5. Actions</a>
- </h2>
- <p>
- All actions are disabled by default, until they are explicitly
- enabled somewhere in an actions file. Actions are turned on if
- preceded with a <span class="QUOTE">"+"</span>, and turned off if
- preceded with a <span class="QUOTE">"-"</span>. So a <tt class=
- "LITERAL">+action</tt> means <span class="QUOTE">"do that
- action"</span>, e.g. <tt class="LITERAL">+block</tt> means <span
- class="QUOTE">"please block URLs that match the following
- patterns"</span>, and <tt class="LITERAL">-block</tt> means <span
- class="QUOTE">"don't block URLs that match the following patterns,
- even if <tt class="LITERAL">+block</tt> previously
- applied."</span>
- </p>
- <p>
- Again, actions are invoked by placing them on a line, enclosed in
- curly braces and separated by whitespace, like in <tt class=
- "LITERAL">{+some-action -some-other-action{some-parameter}}</tt>,
- followed by a list of URL patterns, one per line, to which they
- apply. Together, the actions line and the following pattern lines
- make up a section of the actions file.
- </p>
- <p>
- Actions fall into three categories:
- </p>
- <p>
- </p>
- <ul>
- <li>
- <p>
- Boolean, i.e. the action can only be <span class=
- "QUOTE">"enabled"</span> or <span class=
- "QUOTE">"disabled"</span>. Syntax:
- </p>
- <p>
- </p>
- <table border="0" bgcolor="#E0E0E0" width="90%">
- <tr>
- <td>
-<pre class="SCREEN">
- +<tt class="REPLACEABLE"><i>name</i></tt> # enable action <tt class=
-"REPLACEABLE"><i>name</i></tt>
- -<tt class="REPLACEABLE"><i>name</i></tt> # disable action <tt
-class="REPLACEABLE"><i>name</i></tt>
-</pre>
- </td>
- </tr>
- </table>
-
- <p>
- Example: <tt class="LITERAL">+handle-as-image</tt>
- </p>
- </li>
- <li>
- <p>
- Parameterized, where some value is required in order to enable
- this type of action. Syntax:
- </p>
- <p>
- </p>
- <table border="0" bgcolor="#E0E0E0" width="90%">
- <tr>
- <td>
-<pre class="SCREEN">
- +<tt class="REPLACEABLE"><i>name</i></tt>{<tt class=
-"REPLACEABLE"><i>param</i></tt>} # enable action and set parameter to <tt
-class="REPLACEABLE"><i>param</i></tt>,
- # overwriting parameter from previous match if necessary
- -<tt class=
-"REPLACEABLE"><i>name</i></tt> # disable action. The parameter can be omitted
-</pre>
- </td>
- </tr>
- </table>
-
- <p>
- Note that if the URL matches multiple positive forms of a
- parameterized action, the last match wins, i.e. the params from
- earlier matches are simply ignored.
- </p>
- <p>
- Example: <tt class="LITERAL">+hide-user-agent{Mozilla/5.0 (X11;
- U; FreeBSD i386; en-US; rv:1.8.1.4) Gecko/20070602
- Firefox/2.0.0.4}</tt>
- </p>
- </li>
- <li>
- <p>
- Multi-value. These look exactly like parameterized actions, but
- they behave differently: If the action applies multiple times
- to the same URL, but with different parameters, <span class=
- "emphasis"><i class="EMPHASIS">all</i></span> the parameters
- from <span class="emphasis"><i class="EMPHASIS">all</i></span>
- matches are remembered. This is used for actions that can be
- executed for the same request repeatedly, like adding multiple
- headers, or filtering through multiple filters. Syntax:
- </p>
- <p>
- </p>
- <table border="0" bgcolor="#E0E0E0" width="90%">
- <tr>
- <td>
-<pre class="SCREEN">
- +<tt class="REPLACEABLE"><i>name</i></tt>{<tt class=
-"REPLACEABLE"><i>param</i></tt>} # enable action and add <tt class=
-"REPLACEABLE"><i>param</i></tt> to the list of parameters
- -<tt class="REPLACEABLE"><i>name</i></tt>{<tt class=
-"REPLACEABLE"><i>param</i></tt>} # remove the parameter <tt class=
-"REPLACEABLE"><i>param</i></tt> from the list of parameters
- # If it was the last one left, disable the action.
- -<tt class=
-"REPLACEABLE"><i>name</i></tt> # disable this action completely and remove all parameters from the list
-</pre>
- </td>
- </tr>
- </table>
-
- <p>
- Examples: <tt class="LITERAL">+add-header{X-Fun-Header: Some
- text}</tt> and <tt class=
- "LITERAL">+filter{html-annoyances}</tt>
- </p>
- </li>
- </ul>
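- <p>
- A small sketch that uses one action of each category in the same
- file. The User-Agent value and the host names are invented, and
- the <tt class="LITERAL">js-annoyances</tt> filter is assumed to
- be defined in one of the filter files:
- </p>
- <table border="0" bgcolor="#E0E0E0" width="90%">
- <tr>
- <td>
-<pre class="SCREEN">
-# Boolean, parameterized and multi-value actions in one section
-{+handle-as-image \
- +hide-user-agent{Example-Agent/1.0} \
- +filter{html-annoyances} \
- +filter{js-annoyances}}
-.example.com
-
-# For one host: switch the boolean action off and remove one
-# filter from the list, the other filter stays enabled
-{-handle-as-image -filter{js-annoyances}}
-images.example.com
-</pre>
- </td>
- </tr>
- </table>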
-
- <p>
- If nothing is specified in any actions file, no <span class=
- "QUOTE">"actions"</span> are taken. So in this case <span class=
- "APPLICATION">Privoxy</span> would just be a normal, non-blocking,
- non-filtering proxy. You must specifically enable the privacy and
- blocking features you need (although the provided default actions
- files will give a good starting point).
- </p>
- <p>
- Later defined action sections always override earlier ones of the
- same type, so exceptions to any rules you make should come in the
- latter part of the file (or in a file that is processed later when
- using multiple actions files such as <tt class=
- "FILENAME">user.action</tt>). For multi-valued actions, the actions
- are applied in the order they are specified. Actions files are
- processed in the order they are defined in <tt class=
- "FILENAME">config</tt> (the default installation has three actions
- files). It is also quite possible for any given URL to match more than
- one <span class="QUOTE">"pattern"</span> (because of wildcards and
- regular expressions), and thus to trigger more than one set of
- actions! Last match wins.
- </p>
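- <p>
- For instance, a general rule followed by a later exception might
- look like this (host names, path and block reason are only
- placeholders):
- </p>
- <table border="0" bgcolor="#E0E0E0" width="90%">
- <tr>
- <td>
-<pre class="SCREEN">
-# Earlier section: block everything below example.com
-{+block{Illustrative blanket rule.}}
-.example.com
-
-# Later section: the exception, which wins for matching URLs
-{-block}
-www.example.com/public/
-</pre>
- </td>
- </tr>
- </table>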
- <p>
- The valid <span class="APPLICATION">Privoxy</span> actions
- are:
- </p>
- <div class="SECT3">
- <h4 class="SECT3">
- <a name="ADD-HEADER">8.5.1. add-header</a>
- </h4>
- <div class="VARIABLELIST">
- <dl>
- <dt>
- Typical use:
- </dt>
- <dd>
- <p>
- Confuse log analysis, custom applications
- </p>
- </dd>
- <dt>
- Effect:
- </dt>
- <dd>
- <p>
- Sends a user defined HTTP header to the web server.
- </p>
- </dd>
- <dt>
- Type:
- </dt>
- <dd>
- <p>
- Multi-value.
- </p>
- </dd>
- <dt>
- Parameter:
- </dt>
- <dd>
- <p>
- Any string value is possible. Validity of the defined HTTP
- headers is not checked. It is recommended that you use the
- <span class="QUOTE">"<tt class="LITERAL">X-</tt>"</span>
- prefix for custom headers.
- </p>
- </dd>
- <dt>
- Notes:
- </dt>
- <dd>
- <p>
- This action may be specified multiple times, in order to
- define multiple headers. This is rarely needed for the
- typical user. If you don't know what <span class=
- "QUOTE">"HTTP headers"</span> are, you definitely don't
- need to worry about this one.
- </p>
- <p>
- Headers added by this action are not modified by other
- actions.
- </p>
- </dd>
- <dt>
- Example usage:
- </dt>
- <dd>
- <p>
- </p>
- <table border="0" bgcolor="#E0E0E0" width="90%">
- <tr>
- <td>
-<pre class="SCREEN">
-+add-header{X-User-Tracking: sucks}
-</pre>
- </td>
- </tr>
- </table>
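- <p>
- Since the action is multi-value, several headers can be defined
- in one section. A sketch with invented header names and values:
- </p>
- <table border="0" bgcolor="#E0E0E0" width="90%">
- <tr>
- <td>
-<pre class="SCREEN">
-{+add-header{X-Example-One: first value} \
- +add-header{X-Example-Two: second value}}
-.example.com
-</pre>
- </td>
- </tr>
- </table>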
- </dd>
- </dl>
- </div>
- </div>
- <div class="SECT3">
- <h4 class="SECT3">
- <a name="BLOCK">8.5.2. block</a>
- </h4>
- <div class="VARIABLELIST">
- <dl>
- <dt>
- Typical use:
- </dt>
- <dd>
- <p>
- Block ads or other unwanted content
- </p>
- </dd>
- <dt>
- Effect:
- </dt>
- <dd>
- <p>
- Requests for URLs to which this action applies are blocked,
- i.e. the requests are trapped by <span class=
- "APPLICATION">Privoxy</span> and the requested URL is never
- retrieved, but is answered locally with a substitute page
- or image, as determined by the <tt class="LITERAL"><a href=
- "actions-file.html#HANDLE-AS-IMAGE">handle-as-image</a></tt>,
- <tt class="LITERAL"><a href=
- "actions-file.html#SET-IMAGE-BLOCKER">set-image-blocker</a></tt>,
- and <tt class="LITERAL"><a href=
- "actions-file.html#HANDLE-AS-EMPTY-DOCUMENT">handle-as-empty-document</a></tt>
- actions.
- </p>
- </dd>
- <dt>
- Type:
- </dt>
- <dd>
- <p>
- Parameterized.
- </p>
- </dd>
- <dt>
- Parameter:
- </dt>
- <dd>
- <p>
- A block reason that should be given to the user.
- </p>
- </dd>
- <dt>
- Notes:
- </dt>
- <dd>
- <p>
- <span class="APPLICATION">Privoxy</span> sends a special
- <span class="QUOTE">"BLOCKED"</span> page for requests to
- blocked pages. This page contains the block reason given as
- parameter, a link to find out why the block action applies,
- and a click-through to the blocked content (the latter only
- if the force feature is available and enabled).
- </p>
- <p>
- A very important exception occurs if <span class=
- "emphasis"><i class="EMPHASIS">both</i></span> <tt class=
- "LITERAL">block</tt> and <tt class="LITERAL"><a href=
- "actions-file.html#HANDLE-AS-IMAGE">handle-as-image</a></tt>,
- apply to the same request: it will then be replaced by an
- image. If <tt class="LITERAL"><a href=
- "actions-file.html#SET-IMAGE-BLOCKER">set-image-blocker</a></tt>
- (see below) also applies, the type of image will be
- determined by its parameter, if not, the standard
- checkerboard pattern is sent.
- </p>
- <p>
- It is important to understand this process, in order to
- understand how <span class="APPLICATION">Privoxy</span>
- deals with ads and other unwanted content. Blocking is a
- core feature, and one upon which various other features
- depend.
- </p>
- <p>
- The <tt class="LITERAL"><a href=
- "actions-file.html#FILTER">filter</a></tt> action can
- perform a very similar task, by <span class=
- "QUOTE">"blocking"</span> banner images and other content
- through rewriting the relevant URLs in the document's HTML
- source, so they don't get requested in the first place.
- Note that this is a totally different technique, and it's
- easy to confuse the two.
- </p>
- </dd>
- <dt>
- Example usage (section):
- </dt>
- <dd>
- <p>
- </p>
- <table border="0" bgcolor="#E0E0E0" width="90%">
- <tr>
- <td>
-<pre class="SCREEN">
-{+block{No nasty stuff for you.}}
-# Block and replace with "blocked" page
- .nasty-stuff.example.com
-
-{+block{Doubleclick banners.} +handle-as-image}
-# Block and replace with image
- .ad.doubleclick.net
- .ads.r.us/banners/
-
-{+block{Layered ads.} +handle-as-empty-document}
-# Block and then ignore
- adserver.example.net/.*\.js$
-</pre>
- </td>
- </tr>
- </table>
- </dd>
- </dl>
- </div>
- </div>
- <div class="SECT3">
- <h4 class="SECT3">
- <a name="CHANGE-X-FORWARDED-FOR">8.5.3.
- change-x-forwarded-for</a>
- </h4>
- <div class="VARIABLELIST">
- <dl>
- <dt>
- Typical use:
- </dt>
- <dd>
- <p>
- Improve privacy by not forwarding the source of the request
- in the HTTP headers.
- </p>
- </dd>
- <dt>
- Effect:
- </dt>
- <dd>
- <p>
- Deletes the <span class="QUOTE">"X-Forwarded-For:"</span>
- HTTP header from the client request, or adds a new one.
- </p>
- </dd>
- <dt>
- Type:
- </dt>
- <dd>
- <p>
- Parameterized.
- </p>
- </dd>
- <dt>
- Parameter:
- </dt>
- <dd>
- <ul>
- <li>
- <p>
- <span class="QUOTE">"block"</span> to delete the
- header.
- </p>
- </li>
- <li>
- <p>
- <span class="QUOTE">"add"</span> to create the header
- (or append the client's IP address to an already
- existing one).
- </p>
- </li>
- </ul>
- </dd>
- <dt>
- Notes:
- </dt>
- <dd>
- <p>
- It is safe and recommended to use <tt class=
- "LITERAL">block</tt>.
- </p>
- <p>
- Forwarding the source address of the request may make sense
- in some multi-user setups but is also a privacy risk.
- </p>
- </dd>
- <dt>
- Example usage:
- </dt>
- <dd>
- <p>
- </p>
- <table border="0" bgcolor="#E0E0E0" width="90%">
- <tr>
- <td>
-<pre class="SCREEN">
-+change-x-forwarded-for{block}
-</pre>
- </td>
- </tr>
- </table>
- </dd>
- </dl>
- </div>
- </div>
- <div class="SECT3">
- <h4 class="SECT3">
- <a name="CLIENT-HEADER-FILTER">8.5.4. client-header-filter</a>
- </h4>
- <div class="VARIABLELIST">
- <dl>
- <dt>
- Typical use:
- </dt>
- <dd>
- <p>
- Rewrite or remove single client headers.
- </p>
- </dd>
- <dt>
- Effect:
- </dt>
- <dd>
- <p>
- All client headers to which this action applies are
- filtered on-the-fly through the specified regular
- expression based substitutions.
- </p>
- </dd>
- <dt>
- Type:
- </dt>
- <dd>
- <p>
- Parameterized.
- </p>
- </dd>
- <dt>
- Parameter:
- </dt>
- <dd>
- <p>
- The name of a client-header filter, as defined in one of
- the <a href="filter-file.html">filter files</a>.
- </p>
- </dd>
- <dt>
- Notes:
- </dt>
- <dd>
- <p>
- Client-header filters are applied to each header on its
- own, not to all at once. This makes it easier to diagnose
- problems, but on the downside you can't write filters that
- only change header x if header y's value is z. You can do
- that by using tags though.
- </p>
- <p>
- Client-header filters are executed after the other header
- actions have finished and use their output as input.
- </p>
- <p>
- If the request URL gets changed, <span class=
- "APPLICATION">Privoxy</span> will detect that and use the
- new one. This can be used to rewrite the request
- destination behind the client's back, for example to
- specify a Tor exit relay for certain requests.
- </p>
- <p>
- Please refer to the <a href="filter-file.html">filter file
- chapter</a> to learn which client-header filters are
- available by default, and how to create your own.
- </p>
- </dd>
- <dt>
- Example usage (section):
- </dt>
- <dd>
- <p>
- </p>
- <table border="0" bgcolor="#E0E0E0" width="90%">
- <tr>
- <td>
-<pre class="SCREEN">
-# Hide Tor exit notation in Host and Referer Headers
-{+client-header-filter{hide-tor-exit-notation}}
-/
-
-</pre>
- </td>
- </tr>
- </table>
- </dd>
- </dl>
- </div>
- </div>
- <div class="SECT3">
- <h4 class="SECT3">
- <a name="CLIENT-HEADER-TAGGER">8.5.5. client-header-tagger</a>
- </h4>
- <div class="VARIABLELIST">
- <dl>
- <dt>
- Typical use:
- </dt>
- <dd>
- <p>
- Block requests based on their headers.
- </p>
- </dd>
- <dt>
- Effect:
- </dt>
- <dd>
- <p>
- Client headers to which this action applies are filtered
- on-the-fly through the specified regular expression based
- substitutions; the result is used as a tag.
- </p>
- </dd>
- <dt>
- Type:
- </dt>
- <dd>
- <p>
- Parameterized.
- </p>
- </dd>
- <dt>
- Parameter:
- </dt>
- <dd>
- <p>
- The name of a client-header tagger, as defined in one of
- the <a href="filter-file.html">filter files</a>.
- </p>
- </dd>
- <dt>
- Notes:
- </dt>
- <dd>
- <p>
- Client-header taggers are applied to each header on its
- own, and as the header isn't modified, each tagger <span
- class="QUOTE">"sees"</span> the original.
- </p>
- <p>
- Client-header taggers are the first actions that are
- executed and their tags can be used to control every other
- action.
- </p>
- </dd>
- <dt>
- Example usage (section):
- </dt>
- <dd>
- <p>
- </p>
- <table border="0" bgcolor="#E0E0E0" width="90%">
- <tr>
- <td>
-<pre class="SCREEN">
-# Tag every request with the User-Agent header
-{+client-header-tagger{user-agent}}
-/
-
-# Tagging itself doesn't change the action
-# settings, sections with TAG patterns do:
-#
-# If it's a download agent, use a different forwarding proxy,
-# show the real User-Agent and make sure resume works.
-{+forward-override{forward-socks5 10.0.0.2:2222 .} \
- -hide-if-modified-since \
- -overwrite-last-modified \
- -hide-user-agent \
- -filter \
- -deanimate-gifs \
-}
-TAG:^User-Agent: NetBSD-ftp/
-TAG:^User-Agent: Novell ZYPP Installer
-TAG:^User-Agent: RPM APT-HTTP/
-TAG:^User-Agent: fetch libfetch/
-TAG:^User-Agent: Ubuntu APT-HTTP/
-TAG:^User-Agent: MPlayer/
-
-</pre>
- </td>
- </tr>
- </table>
- </dd>
- </dl>
- </div>
- </div>
- <div class="SECT3">
- <h4 class="SECT3">
- <a name="CONTENT-TYPE-OVERWRITE">8.5.6.
- content-type-overwrite</a>
- </h4>
- <div class="VARIABLELIST">
- <dl>
- <dt>
- Typical use:
- </dt>
- <dd>
- <p>
- Stop useless download menus from popping up, or change the
- browser's rendering mode
- </p>
- </dd>
- <dt>
- Effect:
- </dt>
- <dd>
- <p>
- Replaces the <span class="QUOTE">"Content-Type:"</span>
- HTTP server header.
- </p>
- </dd>
- <dt>
- Type:
- </dt>
- <dd>
- <p>
- Parameterized.
- </p>
- </dd>
- <dt>
- Parameter:
- </dt>
- <dd>
- <p>
- Any string.
- </p>
- </dd>
- <dt>
- Notes:
- </dt>
- <dd>
- <p>
- The <span class="QUOTE">"Content-Type:"</span> HTTP server
- header is used by the browser to decide what to do with the
- document. The value of this header can cause the browser to
- open a download menu instead of displaying the document by
- itself, even if the document's format is supported by the
- browser.
- </p>
- <p>
- The declared content type can also affect which rendering
- mode the browser chooses. If XHTML is delivered as <span
- class="QUOTE">"text/html"</span>, many browsers treat it as
- yet another broken HTML document. If it is sent as <span
- class="QUOTE">"application/xml"</span>, browsers with XHTML
- support will only display it if the syntax is correct.
- </p>
- <p>
- If you see a web site that proudly uses XHTML buttons, but
- sets <span class="QUOTE">"Content-Type: text/html"</span>,
- you can use <span class="APPLICATION">Privoxy</span> to
- overwrite it with <span class=
- "QUOTE">"application/xml"</span> and validate the web
- master's claim inside your XHTML-supporting browser. If the
- syntax is incorrect, the browser will complain loudly.
- </p>
- <p>
- You can also go the opposite direction: if your browser
- prints error messages instead of rendering a document
- falsely declared as XHTML, you can overwrite the content
- type with <span class="QUOTE">"text/html"</span> and have
- it rendered as a broken HTML document.
- </p>
- <p>
- By default <tt class="LITERAL">content-type-overwrite</tt>
- only replaces <span class="QUOTE">"Content-Type:"</span>
- headers that look like some kind of text. If you want to
- overwrite it unconditionally, you have to combine it with
- <tt class="LITERAL"><a href=
- "actions-file.html#FORCE-TEXT-MODE">force-text-mode</a></tt>.
- This limitation exists for a reason; think twice before
- circumventing it.
- </p>
- <p>
- Most of the time it's easier to replace this action with a
- custom <tt class="LITERAL"><a href=
- "actions-file.html#SERVER-HEADER-FILTER">server-header
- filter</a></tt>. It allows you to activate it for every
- document of a certain site and it will still only replace
- the content types you aimed at.
- </p>
- <p>
- Of course you can apply <tt class=
- "LITERAL">content-type-overwrite</tt> to a whole site and
- then make URL based exceptions, but it's a lot more work to
- get the same precision.
- </p>
- </dd>
- <dt>
- Example usage (sections):
- </dt>
- <dd>
- <p>
- </p>
- <table border="0" bgcolor="#E0E0E0" width="90%">
- <tr>
- <td>
-<pre class="SCREEN">
-# Check if www.example.net/ really uses valid XHTML
-{ +content-type-overwrite{application/xml} }
-www.example.net/
-
-# but leave the content type unmodified if the URL looks like a style sheet
-{-content-type-overwrite}
-www.example.net/.*\.css$
-www.example.net/.*style
-</pre>
- </td>
- </tr>
- </table>
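- <p>
- If the unconditional behaviour described in the notes is really
- wanted, the combination could be sketched like this. The path is
- a made-up example, and the caveat above still applies:
- </p>
- <table border="0" bgcolor="#E0E0E0" width="90%">
- <tr>
- <td>
-<pre class="SCREEN">
-# Overwrite the content type even if it doesn't look like text
-{+content-type-overwrite{text/html} +force-text-mode}
-www.example.net/mislabelled/
-</pre>
- </td>
- </tr>
- </table>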
- </dd>
- </dl>
- </div>
- </div>
- <div class="SECT3">
- <h4 class="SECT3">
- <a name="CRUNCH-CLIENT-HEADER">8.5.7. crunch-client-header</a>
- </h4>
- <div class="VARIABLELIST">
- <dl>
- <dt>
- Typical use:
- </dt>
- <dd>
- <p>
- Remove a client header <span class=
- "APPLICATION">Privoxy</span> has no dedicated action for.
- </p>
- </dd>
- <dt>
- Effect:
- </dt>
- <dd>
- <p>
- Deletes every header sent by the client that contains the
- string the user supplied as parameter.
- </p>
- </dd>
- <dt>
- Type:
- </dt>
- <dd>
- <p>
- Parameterized.
- </p>
- </dd>
- <dt>
- Parameter:
- </dt>
- <dd>
- <p>
- Any string.
- </p>
- </dd>
- <dt>
- Notes:
- </dt>
- <dd>
- <p>
- This action allows you to block client headers for which no
- dedicated <span class="APPLICATION">Privoxy</span> action
- exists. <span class="APPLICATION">Privoxy</span> will
- remove every client header that contains the string you
- supplied as parameter.
- </p>
- <p>
- Regular expressions are <span class="emphasis"><i class=
- "EMPHASIS">not supported</i></span> and you can't use this
- action to block different headers in the same request,
- unless they contain the same string.
- </p>
- <p>
- <tt class="LITERAL">crunch-client-header</tt> is only meant
- for quick tests. If you have to block several different
- headers, or only want to modify parts of them, you should
- use a <tt class="LITERAL"><a href=
- "actions-file.html#CLIENT-HEADER-FILTER">client-header
- filter</a></tt>.
- </p>
- <div class="WARNING">
- <table class="WARNING" border="1" width="90%">
- <tr>
- <td align="CENTER">
- <b>Warning</b>
- </td>
- </tr>
- <tr>
- <td align="LEFT">
- <p>
- Don't block any header without understanding the
- consequences.
- </p>
- </td>
- </tr>
- </table>
- </div>
- </dd>
- <dt>
- Example usage (section):
- </dt>
- <dd>
- <p>
- </p>
- <table border="0" bgcolor="#E0E0E0" width="90%">
- <tr>
- <td>
-<pre class="SCREEN">
-# Block the non-existent "Privacy-Violation:" client header
-{ +crunch-client-header{Privacy-Violation:} }
-/
-
-</pre>
- </td>
- </tr>
- </table>
- </dd>
- </dl>
- </div>
- </div>
- <div class="SECT3">
- <h4 class="SECT3">
- <a name="CRUNCH-IF-NONE-MATCH">8.5.8. crunch-if-none-match</a>
- </h4>
- <div class="VARIABLELIST">
- <dl>
- <dt>
- Typical use:
- </dt>
- <dd>
- <p>
- Prevent yet another way to track the user's steps between
- sessions.
- </p>
- </dd>
- <dt>
- Effect:
- </dt>
- <dd>
- <p>
- Deletes the <span class="QUOTE">"If-None-Match:"</span>
- HTTP client header.
- </p>
- </dd>
- <dt>
- Type:
- </dt>
- <dd>
- <p>
- Boolean.
- </p>
- </dd>
- <dt>
- Parameter:
- </dt>
- <dd>
- <p>
- N/A
- </p>
- </dd>
- <dt>
- Notes:
- </dt>
- <dd>
- <p>
- Removing the <span class="QUOTE">"If-None-Match:"</span>
- HTTP client header is useful for filter testing, where you
- want to force a real reload instead of getting status code
- <span class="QUOTE">"304"</span> which would cause the
- browser to use a cached copy of the page.
- </p>
- <p>
- It is also useful to make sure the header isn't used as a
- cookie replacement (unlikely but possible).
- </p>
- <p>
- Blocking the <span class="QUOTE">"If-None-Match:"</span>
- header shouldn't cause any caching problems, as long as the
- <span class="QUOTE">"If-Modified-Since:"</span> header
- isn't blocked or missing as well.
- </p>
- <p>
- It is recommended to use this action together with <tt
- class="LITERAL"><a href=
- "actions-file.html#HIDE-IF-MODIFIED-SINCE">hide-if-modified-since</a></tt>
- and <tt class="LITERAL"><a href=
- "actions-file.html#OVERWRITE-LAST-MODIFIED">overwrite-last-modified</a></tt>.
- </p>
- </dd>
- <dt>
- Example usage (section):
- </dt>
- <dd>
- <p>
- </p>
- <table border="0" bgcolor="#E0E0E0" width="90%">
- <tr>
- <td>
-<pre class="SCREEN">
-# Let the browser revalidate cached documents but don't
-# allow the server to use the revalidation headers for user tracking.
-{+hide-if-modified-since{-60} \
- +overwrite-last-modified{randomize} \
- +crunch-if-none-match}
-/
-</pre>
- </td>
- </tr>
- </table>
- </dd>
- </dl>
- </div>
- </div>
- <div class="SECT3">
- <h4 class="SECT3">
- <a name="CRUNCH-INCOMING-COOKIES">8.5.9.
- crunch-incoming-cookies</a>
- </h4>
- <div class="VARIABLELIST">
- <dl>
- <dt>
- Typical use:
- </dt>
- <dd>
- <p>
- Prevent the web server from setting HTTP cookies on your
- system.
- </p>
- </dd>
- <dt>
- Effect:
- </dt>
- <dd>
- <p>
- Deletes any <span class="QUOTE">"Set-Cookie:"</span> HTTP
- headers from server replies.
- </p>
- </dd>
- <dt>
- Type:
- </dt>
- <dd>
- <p>
- Boolean.
- </p>
- </dd>
- <dt>
- Parameter:
- </dt>
- <dd>
- <p>
- N/A
- </p>
- </dd>
- <dt>
- Notes:
- </dt>
- <dd>
- <p>
- This action is only concerned with <span class=
- "emphasis"><i class="EMPHASIS">incoming</i></span> HTTP
- cookies. For <span class="emphasis"><i class=
- "EMPHASIS">outgoing</i></span> HTTP cookies, use <tt class=
- "LITERAL"><a href=
- "actions-file.html#CRUNCH-OUTGOING-COOKIES">crunch-outgoing-cookies</a></tt>.
- Use <span class="emphasis"><i class=
- "EMPHASIS">both</i></span> to disable HTTP cookies
- completely.
- </p>
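- <p>
- A minimal sketch of the combination described above,
- assuming you want it to apply to every site (the <tt class=
- "LITERAL">/</tt> pattern); narrow the pattern if that is
- too broad for your setup:
- </p>
- <table border="0" bgcolor="#E0E0E0" width="90%">
- <tr>
- <td>
-<pre class="SCREEN">
-# Sketch: disable HTTP cookies in both directions for all sites
-{ +crunch-incoming-cookies \
- +crunch-outgoing-cookies }
-/
-</pre>
- </td>
- </tr>
- </table>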
- <p>
- It makes <span class="emphasis"><i class="EMPHASIS">no
- sense at all</i></span> to use this action in conjunction
- with the <tt class="LITERAL"><a href=
- "actions-file.html#SESSION-COOKIES-ONLY">session-cookies-only</a></tt>
- action, since it would prevent the session cookies from
- being set. See also <tt class="LITERAL"><a href=
- "actions-file.html#FILTER-CONTENT-COOKIES">filter-content-cookies</a></tt>.
- </p>
- </dd>
- <dt>
- Example usage:
- </dt>
- <dd>
- <p>
- </p>
- <table border="0" bgcolor="#E0E0E0" width="90%">
- <tr>
- <td>
-<pre class="SCREEN">
-+crunch-incoming-cookies
-</pre>
- </td>
- </tr>
- </table>
- </dd>
- </dl>
- </div>
- </div>
- <div class="SECT3">
- <h4 class="SECT3">
- <a name="CRUNCH-SERVER-HEADER">8.5.10. crunch-server-header</a>
- </h4>
- <div class="VARIABLELIST">
- <dl>
- <dt>
- Typical use:
- </dt>
- <dd>
- <p>
- Remove a server header <span class=
- "APPLICATION">Privoxy</span> has no dedicated action for.
- </p>
- </dd>
- <dt>
- Effect:
- </dt>
- <dd>
- <p>
- Deletes every header sent by the server that contains the
- string the user supplied as parameter.
- </p>
- </dd>
- <dt>
- Type:
- </dt>
- <dd>
- <p>
- Parameterized.
- </p>
- </dd>
- <dt>
- Parameter:
- </dt>
- <dd>
- <p>
- Any string.
- </p>
- </dd>
- <dt>
- Notes:
- </dt>
- <dd>
- <p>
- This action allows you to block server headers for which no
- dedicated <span class="APPLICATION">Privoxy</span> action
- exists. <span class="APPLICATION">Privoxy</span> will
- remove every server header that contains the string you
- supplied as parameter.
- </p>
- <p>
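- Regular expressions are <span class="emphasis"><i class=
- "EMPHASIS">not supported</i></span> and you can't use this
- action to block different headers in the same request,
- unless they contain the same string. For example, the
- parameter <tt class="LITERAL">no-cache</tt> removes both
- <span class="QUOTE">"Pragma: no-cache"</span> and <span
- class="QUOTE">"Cache-Control: no-cache"</span>, because
- both contain that string.
- </p>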
- <p>
- <tt class="LITERAL">crunch-server-header</tt> is only meant
- for quick tests. If you have to block several different
- headers, or only want to modify parts of them, you should
- use a custom <tt class="LITERAL"><a href=
- "actions-file.html#SERVER-HEADER-FILTER">server-header
- filter</a></tt>.
- </p>
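- <p>
- As a rough sketch of that alternative, a filter that removes
- one specific header might look like the following. The
- filter name <tt class="LITERAL">crunch-x-example</tt> and
- the header <tt class="LITERAL">X-Example:</tt> are
- hypothetical. The sketch assumes that the first part goes
- into a filter file, that the second part goes into an
- actions file, and that a header left empty by the filter
- is dropped:
- </p>
- <table border="0" bgcolor="#E0E0E0" width="90%">
- <tr>
- <td>
-<pre class="SCREEN">
-# In a filter file (hypothetical filter name and header)
-SERVER-HEADER-FILTER: crunch-x-example Remove the X-Example: header
-s@^X-Example:.*@@i
-
-# In an actions file: enable the filter for all sites
-{ +server-header-filter{crunch-x-example} }
-/
-</pre>
- </td>
- </tr>
- </table>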
- <div class="WARNING">
- <table class="WARNING" border="1" width="90%">
- <tr>
- <td align="CENTER">
- <b>Warning</b>
- </td>
- </tr>
- <tr>
- <td align="LEFT">
- <p>
- Don't block any header without understanding the
- consequences.
- </p>
- </td>
- </tr>
- </table>
- </div>
- </dd>
- <dt>
- Example usage (section):
- </dt>
- <dd>
- <p>
- </p>
- <table border="0" bgcolor="#E0E0E0" width="90%">
- <tr>
- <td>
-<pre class="SCREEN">
-# Crunch server headers that try to prevent caching
-{ +crunch-server-header{no-cache} }
-/
-</pre>
- </td>
- </tr>
- </table>
- </dd>
- </dl>
- </div>
- </div>
- <div class="SECT3">
- <h4 class="SECT3">
- <a name="CRUNCH-OUTGOING-COOKIES">8.5.11.
- crunch-outgoing-cookies</a>
- </h4>
- <div class="VARIABLELIST">
- <dl>
- <dt>
- Typical use:
- </dt>
- <dd>
- <p>
- Prevent the web server from reading any HTTP cookies from
- your system.
- </p>
- </dd>
- <dt>
- Effect:
- </dt>
- <dd>
- <p>
- Deletes any <span class="QUOTE">"Cookie:"</span> HTTP
- headers from client requests.
- </p>
- </dd>
- <dt>
- Type:
- </dt>
- <dd>
- <p>
- Boolean.
- </p>
- </dd>
- <dt>
- Parameter:
- </dt>
- <dd>
- <p>
- N/A
- </p>
- </dd>
- <dt>
- Notes:
- </dt>
- <dd>
- <p>
- This action is only concerned with <span class=
- "emphasis"><i class="EMPHASIS">outgoing</i></span> HTTP
- cookies. For <span class="emphasis"><i class=
- "EMPHASIS">incoming</i></span> HTTP cookies, use <tt class=
- "LITERAL"><a href=
- "actions-file.html#CRUNCH-INCOMING-COOKIES">crunch-incoming-cookies</a></tt>.
- Use <span class="emphasis"><i class=
- "EMPHASIS">both</i></span> to disable HTTP cookies
- completely.
- </p>
- <p>
- It makes <span class="emphasis"><i class="EMPHASIS">no
- sense at all</i></span> to use this action in conjunction
- with the <tt class="LITERAL"><a href=
- "actions-file.html#SESSION-COOKIES-ONLY">session-cookies-only</a></tt>
- action, since it would prevent the session cookies from
- being read.
- </p>
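- <p>
- If a site needs session cookies, one option is to switch
- both crunch actions off for that site only. The sketch
- below assumes that cookies are crunched by an earlier, more
- general section, and the host <tt class=
- "LITERAL">.shop.example.com</tt> is a hypothetical
- placeholder:
- </p>
- <table border="0" bgcolor="#E0E0E0" width="90%">
- <tr>
- <td>
-<pre class="SCREEN">
-# Sketch: allow session cookies on one (hypothetical) site
-{ -crunch-incoming-cookies \
- -crunch-outgoing-cookies \
- +session-cookies-only }
-.shop.example.com
-</pre>
- </td>
- </tr>
- </table>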
- </dd>
- <dt>
- Example usage:
- </dt>
- <dd>
- <p>
- </p>
- <table border="0" bgcolor="#E0E0E0" width="90%">
- <tr>
- <td>
-<pre class="SCREEN">
-+crunch-outgoing-cookies
-</pre>
- </td>
- </tr>
- </table>
- </dd>
- </dl>
- </div>
- </div>
- <div class="SECT3">
- <h4 class="SECT3">
- <a name="DEANIMATE-GIFS">8.5.12. deanimate-gifs</a>
- </h4>
- <div class="VARIABLELIST">
- <dl>
- <dt>
- Typical use:
- </dt>
- <dd>
- <p>
- Stop those annoying, distracting animated GIF images.
- </p>
- </dd>
- <dt>
- Effect:
- </dt>
- <dd>
- <p>
- De-animate GIF animations, i.e. reduce them to their first
- or last image.
- </p>
- </dd>
- <dt>
- Type:
- </dt>
- <dd>
- <p>
- Parameterized.
- </p>
- </dd>
- <dt>
- Parameter:
- </dt>
- <dd>
- <p>
- <span class="QUOTE">"last"</span> or <span class=
- "QUOTE">"first"</span>
- </p>
- </dd>
- <dt>
- Notes:
- </dt>
- <dd>
- <p>
- This will also shrink the images considerably (in bytes,
- not pixels!). If the option <span class=
- "QUOTE">"first"</span> is given, the first frame of the
- animation is used as the replacement. If <span class=
- "QUOTE">"last"</span> is given, the last frame of the
- animation is used instead, which probably makes more sense
- for most banner animations, but risks showing an incomplete
- image if the last frame is only a delta to an earlier frame.
- </p>
- <p>
- You can safely use this action with patterns that will also
- match non-GIF objects, because <span class=
- "APPLICATION">Privoxy</span> will not try to de-animate
- anything that doesn't look like a GIF.
- </p>
- </dd>
- <dt>
- Example usage:
- </dt>
- <dd>
- <p>
- </p>
- <table border="0" bgcolor="#E0E0E0" width="90%">