X-Git-Url: http://www.privoxy.org/gitweb/?p=privoxy.git;a=blobdiff_plain;f=doc%2Fwebserver%2Fuser-manual%2Ffilter-file.html;h=c1c08a43da8eae25987786e8ad526a0fbdfc9ce7;hp=d73b79dc189f72b9b2ab1969147e805baf22d2a3;hb=3db7a58b2bbed7b6356b2a0600e93ec4f2846499;hpb=03472355cc98c0a5f3e65deb0e4569bd14e0fb54

diff --git a/doc/webserver/user-manual/filter-file.html b/doc/webserver/user-manual/filter-file.html
index d73b79dc..c1c08a43 100644
--- a/doc/webserver/user-manual/filter-file.html
+++ b/doc/webserver/user-manual/filter-file.html
@@ -1,13 +1,13 @@
+ Filter Files
Privoxy 3.0.6 User Manual
Privoxy 3.0.17 User Manual
9. Filter Files

On-the-fly text substitutions that can be invoked through the
- filter action need
+ On-the-fly text substitutions need
to be defined in a "filter file"
"action". Multiple filter files can be
- defined through the .

Privoxy supports three different filter actions:
+ filter to
+ rewrite the content that is sent to the client,
+ client-header-filter
+ to rewrite headers that are sent by the client, and
+ server-header-filter
+ to rewrite headers that are sent by the server.

Privoxy also supports two tagger actions:
+ client-header-tagger
+ and
+ server-header-tagger.
+ Taggers and filters use the same syntax in the filter files; the difference
+ is that taggers don't modify the text they are filtering, but use a rewritten
+ version of the filtered text as a tag. The tags can then be used to change the
+ actions that apply through sections with tag-patterns.
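The difference between a filter and a tagger can be sketched in a few lines of Python. This is only an illustration of the idea described above, not Privoxy's implementation: the function names `apply_filter` and `apply_tagger`, the example header, and the choice to return no tag when the pattern does not match are all assumptions made for this sketch.

```python
import re


def apply_filter(text, pattern, replacement):
    """Filter: the rewritten text replaces the original text."""
    return re.sub(pattern, replacement, text)


def apply_tagger(text, pattern, replacement):
    """Tagger: the text is passed on unchanged; the rewritten
    version is only used as a tag.

    Assumption for this sketch: no tag is produced when the
    pattern does not match at all.
    """
    rewritten, hits = re.subn(pattern, replacement, text)
    tag = rewritten if hits else None
    return text, tag


# Hypothetical header, used for both cases.
header = "User-Agent: ExampleBrowser/1.0"

# The filter changes what is sent on.
filtered = apply_filter(header, r"ExampleBrowser/[\d.]+", "Generic/0.0")

# The tagger leaves the header alone and derives a tag from it.
unchanged, tag = apply_tagger(header, r"^User-Agent:\s*", "")
```

Here `filtered` holds the modified header, while `unchanged` is still the original header and `tag` carries the rewritten text that a tag-pattern section could match against.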

Multiple filter files can be defined through the filterfile config directive. The filters
- as supplied by the developers will be found in
+ as supplied by the developers are located in
default.filter.

Typical reasons for doing these kinds of substitutions are to eliminate
- common annoyances in HTML and JavaScript, such as pop-up windows,
+ Common tasks for content filters are to eliminate common annoyances in
+ HTML and JavaScript, such as pop-up windows,
exit consoles, crippled windows without navigation tools, the
infamous <BLINK> tag etc, to suppress images with certain
width and height attributes (standard banner sizes or web-bugs),
- or just to have fun. The possibilities are endless.

Filtering works on any text-based document type, including
- HTML, JavaScript, CSS etc. (all text/*
- MIME types, except
Enabled content filters are applied to any content whose
+ "Content Type" header is recognised as a sign
+ of text-based content, with the exception of
text/plain).
- Substitutions are made at the source level, so if you want to
.
+ Use the force-text-mode action
+ to also filter other content.

Substitutions are made at the source level, so if you want to
"roll your own" filters, you should first be familiar with HTML syntax,
- and, of course, regular expressions. By default, filters are only applied
- to the raw document content, but can be extended to the HTTP headers with
- the supplemental actions:
- filter-client-headers and
- filter-server-headers.

Just like the filters
- here. Each filter consists of a heading line, that starts with the
+ here. Each filter consists of a heading line, that starts with one of the
keyword
keywords
FILTER:, followed by
- the filter's ,
+ CLIENT-HEADER-FILTER: or SERVER-HEADER-FILTER:
+ followed by the filter's
actions file.
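As a rough illustration of the heading-line layout (keyword, then the filter's name, then a description), the following Python sketch splits such a line into its parts. It is not Privoxy's parser: the function name `parse_heading`, the whitespace-based splitting, and the sample line are assumptions, and only the three keywords named above are recognised.

```python
# Keywords named in the text above; anything else is not treated as a heading.
KEYWORDS = ("FILTER:", "CLIENT-HEADER-FILTER:", "SERVER-HEADER-FILTER:")


def parse_heading(line):
    """Split a heading line into (filter type, name, description).

    Assumption for this sketch: the three parts are separated by
    whitespace and the description runs to the end of the line.
    Returns None for lines that are not heading lines.
    """
    parts = line.strip().split(None, 2)
    if len(parts) < 2 or parts[0] not in KEYWORDS:
        return None
    filter_type = parts[0].rstrip(":")
    name = parts[1]
    description = parts[2] if len(parts) > 2 else ""
    return filter_type, name, description


# Hypothetical heading for a content filter called "foo".
heading = parse_heading('FILTER: foo Replace all "foo" with "bar"')

# A line that does not start with one of the keywords is not a heading.
not_a_heading = parse_heading("s/foo/bar/g")
```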

A filter header line for a filter called
Filter definitions start with a header line that contains the filter
+ type, the filter name and the filter description.
+ A content filter header line for a filter called
"foo" could look
@@ -314,14 +366,14 @@

9.1. Filter File Tutorial


Now, let's complete our "foo" filter. We have already defined
+ content filter. We have already defined
the heading, but the jobs are still missing. Since all it does is to replace

9.2. The Pre-defined Filters

The distribution

Header filter to change the Content-Type from xml to html.
+ Server-header filter to change the Content-Type from xml to html.

Header filter to change the Content-Type from html to xml.
+ Server-header filter to change the Content-Type from html to xml.

no-ping

Removes the non-standard ping attribute from
+ anchor and area HTML tags.
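To illustrate what removing the ping attribute means at the source level, here is a small Python sketch. It is not the filter shipped in default.filter: the regular expression and the name `strip_ping` are assumptions made for this example, and a regex like this can be fooled by unusual markup (for example the text "ping=" inside another attribute's value).

```python
import re

# Matches a ping attribute inside an <a ...> or <area ...> start tag.
# Group 1 keeps everything in the tag before the attribute.
PING_ATTR = re.compile(
    r'(<(?:a|area)\b[^>]*?)\s+ping\s*=\s*(?:"[^"]*"|\'[^\']*\'|[^\s>]+)',
    re.IGNORECASE,
)


def strip_ping(html):
    """Return html with ping attributes removed from anchor and area tags."""
    return PING_ATTR.sub(r"\1", html)


# Hypothetical markup; t.example is a placeholder host.
cleaned_anchor = strip_ping('<a href="/x" ping="http://t.example/p">go</a>')
cleaned_area = strip_ping("<area ping=/t shape=rect>")
untouched = strip_ping('<abbr ping="x">n/a</abbr>')
```

Only the attribute is dropped; the rest of the tag and the link target stay as they were.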

hide-tor-exit-notation

Client-header filter to remove the Tor exit node notation
+ found in Host and Referer headers.

If Privoxy and Tor are chained and Privoxy
+ is configured to use socks4a, one can use "http://www.example.org.foobar.exit/"
+ to access the host "www.example.org" through the
+ Tor exit node "foobar".

As the HTTP client isn't aware of this notation, it treats the
+ whole string "www.example.org.foobar.exit" as host and uses it
+ for the "Host" and "Referer" headers. From the
+ server's point of view the resulting headers are invalid and can cause problems.

An invalid "Referer" header can trigger "hot-linking"
+ protections; an invalid "Host" header will make it impossible for
+ the server to find the right vhost (several domains hosted on the same IP address).

This client-header filter removes the "foo.exit" part in those headers
+ to prevent the mentioned problems. Note that it only modifies
+ the HTTP headers; it does not prevent the server
+ from detecting your Tor exit node based on the IP address
+ the request is coming from.
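The effect on the two headers can be sketched in Python as follows. This is an illustration of the behaviour described above under stated assumptions, not the actual filter from default.filter: the regular expression and the function name `hide_exit_notation` are made up for this example.

```python
import re

# Matches ".<node>.exit" at the end of a host name, i.e. when it is
# followed by "/", ":", whitespace or the end of the value.
EXIT_NOTATION = re.compile(r"\.[^./\s:]+\.exit(?=[/:\s]|$)", re.IGNORECASE)


def hide_exit_notation(header_line):
    """Drop the exit node notation from Host and Referer header lines.

    All other header lines are returned unchanged.
    """
    name, sep, value = header_line.partition(":")
    if name.strip().lower() not in ("host", "referer"):
        return header_line
    return name + sep + EXIT_NOTATION.sub("", value)


host = hide_exit_notation("Host: www.example.org.foobar.exit")
referer = hide_exit_notation("Referer: http://www.example.org.foobar.exit/page")
other = hide_exit_notation("Accept: text/html")
```

As in the description, only the header text changes; nothing in this sketch affects which IP address the request comes from.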