X-Git-Url: http://www.privoxy.org/gitweb/?p=privoxy.git;a=blobdiff_plain;f=doc%2Fwebserver%2Fuser-manual%2Ffilter-file.html;h=e0de605f812edb6c1748feec413339289d56e5e3;hp=678c28629fef0e0a8bfc668754cd333a7350f3f5;hb=473cfd051580edfa1e2a3f6beeb9a0d09a8253fd;hpb=51dd3416173631d3cdbd51bd35d8cf6a349e13c2

diff --git a/doc/webserver/user-manual/filter-file.html b/doc/webserver/user-manual/filter-file.html
index 678c2862..e0de605f 100644
--- a/doc/webserver/user-manual/filter-file.html
+++ b/doc/webserver/user-manual/filter-file.html
@@ -1,13 +1,13 @@
+ Filter Files
+HREF="../p_doc.css">
Privoxy 3.0.6 User Manual
Privoxy 3.0.9 User Manual

9. Filter Files

On-the-fly text substitutions that can be invoked through the - filter action need +> On-the-fly text substitutions need to be defined in a "filter file""action". Multiple filter files can be - defined through the .

Privoxy supports three different filter actions: + filter to + rewrite the content that is sent to the client, + client-header-filter + to rewrite headers that are sent by the client, and + server-header-filter + to rewrite headers that are sent by the server.

Privoxy also supports two tagger actions: + client-header-tagger + and + server-header-tagger. + Taggers and filters use the same syntax in the filter files; the difference + is that taggers don't modify the text they are filtering, but use a rewritten + version of the filtered text as a tag. The tags can then be used to change the + actions that apply through sections with tag-patterns.
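The difference between a filter and a tagger can be sketched in a few lines of Python. This is only an illustration of the idea described above, not Privoxy's implementation: the function names `apply_filter` and `apply_tagger`, the sample header and the patterns are all hypothetical, and returning no tag when the pattern does not match is an assumption made for this sketch.

```python
import re

def apply_filter(header, pattern, replacement):
    """A filter rewrites the text it is applied to."""
    return re.sub(pattern, replacement, header)

def apply_tagger(header, pattern, replacement):
    """A tagger leaves the text alone and uses the rewritten version as a tag.

    Assumption for this sketch: no tag is produced if the pattern does not match.
    """
    rewritten, hits = re.subn(pattern, replacement, header)
    tag = rewritten if hits else None
    return header, tag

# Hypothetical client header and pattern, for illustration only.
header = "User-Agent: ExampleBrowser/1.0"
pattern = r"^User-Agent:\s*(.*)$"

filtered = apply_filter(header, pattern, "User-Agent: hidden")
unchanged, tag = apply_tagger(header, pattern, r"\1")
```

Here `filtered` is the modified header that would be passed on, while `unchanged` is still the original header and `tag` holds the extracted text that a tag-pattern section could match against.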

Multiple filter files can be defined through the filterfile config directive. The filters - as supplied by the developers will be found in + as supplied by the developers are located in default.filter.

Typical reasons for doing these kinds of substitutions are to eliminate - common annoyances in HTML and JavaScript, such as pop-up windows, +> Common tasks for content filters are to eliminate common annoyances in + HTML and JavaScript, such as pop-up windows, exit consoles, crippled windows without navigation tools, the infamous <BLINK> tag etc, to suppress images with certain width and height attributes (standard banner sizes or web-bugs), - or just to have fun. The possibilities are endless.

Filtering works on any text-based document type, including - HTML, JavaScript, CSS etc. (all text/* - MIME types, except Enabled content filters are applied to any content whose + "Content Type" header is recognised as a sign + of text-based content, with the exception of text/plain). - Substitutions are made at the source level, so if you want to . + Use the force-text-mode action + to also filter other content.

Substitutions are made at the source level, so if you want to "roll your own" filters, you should first be familiar with HTML syntax, - and, of course, regular expressions. By default, filters are only applied - to the raw document content, but can be extended to the HTTP headers with - the supplemental actions: - filter-client-headers and - filter-server-headers.

Just like the filters - here. Each filter consists of a heading line, that starts with the + here. Each filter consists of a heading line, that starts with one of the keywordkeywords FILTER:, followed by - the filter's , + CLIENT-HEADER-FILTER: or SERVER-HEADER-FILTER: + followed by the filter's actions file.

A filter header line for a filter called Filter definitions start with a header line that contains the filter + type, the filter name and the filter description. + A content filter header line for a filter called "foo" could look
@@ -314,14 +366,14 @@ CLASS="SECT2" >

9.1. Filter File Tutorial


Now, let's complete our "foo" filter. We have already defined +> content filter. We have already defined the heading, but the jobs are still missing. Since all it does is to replace

9.2. The Pre-defined Filters

The distribution

Header filter to change the Content-Type from xml to html. +> Server-header filter to change the Content-Type from xml to html.

Header filter to change the Content-Type from html to xml. +> Server-header filter to change the Content-Type from html to xml.
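As a rough illustration of what the two Content-Type filters described above do, the following Python sketch rewrites a single server header line. It is not the definition shipped in default.filter: the function names are hypothetical, and the exact media types (`application/xml` and `text/xml` on the XML side, `text/html` on the HTML side) are assumptions made for this example.

```python
import re

def content_type_xml_to_html(header_line):
    """Rewrite an XML Content-Type header to text/html (assumed media types)."""
    return re.sub(r"^(Content-Type:\s*)(?:application|text)/xml\b",
                  r"\1text/html", header_line, flags=re.IGNORECASE)

def content_type_html_to_xml(header_line):
    """Rewrite a text/html Content-Type header to application/xml (assumed)."""
    return re.sub(r"^(Content-Type:\s*)text/html\b",
                  r"\1application/xml", header_line, flags=re.IGNORECASE)

as_html = content_type_xml_to_html("Content-Type: application/xml; charset=utf-8")
as_xml = content_type_html_to_xml("Content-Type: text/html")
```

Other headers, and any parameters after the media type such as the charset, pass through untouched.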

Header filter to remove the Client-header filter to remove the Tor exit node notation found in Host and Referer headers.

If Privoxy and Tor are chained and Privoxy + is configured to use socks4a, one can use "http://www.example.org.foobar.exit/" + to access the host "www.example.org" through the + Tor exit node "foobar". +

As the HTTP client isn't aware of this notation, it treats the + whole string "www.example.org.foobar.exit" as host and uses it + for the "Host" and "Referer" headers. From the + server's point of view the resulting headers are invalid and can cause problems. +

An invalid "Referer" header can trigger "hot-linking" + protections, and an invalid "Host" header makes it impossible for + the server to find the right vhost (one of several domains hosted on the same IP address). +

This client-header filter removes the "foo.exit" part in those headers + to prevent the mentioned problems. Note that it only modifies + the HTTP headers; it doesn't prevent the server + from detecting your Tor exit node based on the IP address + the request is coming from. +
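The effect of removing the exit node notation can be sketched in Python as follows. This is a minimal illustration using the example host from the text, not the pattern used by the pre-defined filter: the regular expression and the function name `hide_tor_exit_notation` are assumptions made for this example.

```python
import re

# Hypothetical pattern: in Host and Referer headers, drop ".<node>.exit"
# when it ends the host name (followed by "/", ":" or the end of the line).
EXIT_NOTATION = re.compile(
    r"^((?:Host|Referer):.*?)\.[^./:\s]+\.exit(?=[/:]|$)",
    re.IGNORECASE,
)

def hide_tor_exit_notation(header_line):
    """Return the header line with the Tor exit node notation removed."""
    return EXIT_NOTATION.sub(r"\1", header_line)

host = hide_tor_exit_notation("Host: www.example.org.foobar.exit")
referer = hide_tor_exit_notation(
    "Referer: http://www.example.org.foobar.exit/index.html")
```

With these inputs, `host` becomes `Host: www.example.org` and `referer` becomes `Referer: http://www.example.org/index.html`. Headers without the notation are left unchanged.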