X-Git-Url: http://www.privoxy.org/gitweb/?a=blobdiff_plain;f=doc%2Fwebserver%2Fuser-manual%2Ffilter-file.html;h=d225df1ed015e0443ec8c79cb550a2f7e352f7cf;hb=3b3e93244ab9a04daf15de964593063779e382ed;hp=d73b79dc189f72b9b2ab1969147e805baf22d2a3;hpb=03472355cc98c0a5f3e65deb0e4569bd14e0fb54;p=privoxy.git
diff --git a/doc/webserver/user-manual/filter-file.html b/doc/webserver/user-manual/filter-file.html
index d73b79dc..d225df1e 100644
--- a/doc/webserver/user-manual/filter-file.html
+++ b/doc/webserver/user-manual/filter-file.html
@@ -1,13 +1,13 @@
+
On-the-fly text substitutions that can be invoked through the - filter action need + On-the-fly text substitutions need to be defined in a "filter file". Multiple filter files can be - defined through the .
Privoxy supports three different filter actions: + filter to + rewrite the content that is sent to the client, + client-header-filter + to rewrite headers that are sent by the client, and + server-header-filter + to rewrite headers that are sent by the server.
Privoxy also supports two tagger actions: + client-header-tagger + and + server-header-tagger. + Taggers and filters use the same syntax in the filter files; the difference + is that taggers don't modify the text they are filtering, but use a rewritten + version of the filtered text as a tag. The tags can then be used to change + which actions apply through sections with tag-patterns.
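The difference between a filter and a tagger can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not Privoxy's implementation: the function names `apply_filter` and `apply_tagger`, the sample header, and the regular expressions are all hypothetical.

```python
import re


def apply_filter(pattern: str, replacement: str, text: str) -> str:
    """Filter: the rewritten text replaces the original text."""
    return re.sub(pattern, replacement, text)


def apply_tagger(pattern: str, replacement: str, text: str) -> tuple[str, str]:
    """Tagger: the text passes through unchanged; the rewritten
    version is only used as a tag (hypothetical helper)."""
    tag = re.sub(pattern, replacement, text)
    return text, tag


# Hypothetical client header used for illustration only.
header = "User-Agent: ExampleBrowser/1.0 (X11)"

# A filter changes what is forwarded.
filtered = apply_filter(r"ExampleBrowser/\S+", "Generic/0.0", header)

# A tagger forwards the header untouched and derives a tag from it.
forwarded, tag = apply_tagger(r"^User-Agent:\s*(\S+).*$", r"\1", header)
```

In this sketch `forwarded` is identical to `header`, while `tag` holds the rewritten text that a tag-pattern section could then match against.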
Multiple filter files can be defined through the filterfile config directive. The filters - as supplied by the developers will be found in + as supplied by the developers are located in default.filter.
Typical reasons for doing these kinds of substitutions are to eliminate - common annoyances in HTML and JavaScript, such as pop-up windows, + Common tasks for content filters are to eliminate common annoyances in + HTML and JavaScript, such as pop-up windows, exit consoles, crippled windows without navigation tools, the infamous <BLINK> tag, etc., to suppress images with certain width and height attributes (standard banner sizes or web-bugs), - or just to have fun. The possibilities are endless.
Filtering works on any text-based document type, including - HTML, JavaScript, CSS etc. (all text/* - MIME types, except + Enabled content filters are applied to any content whose + "Content-Type" header is recognised as a sign + of text-based content, with the exception of text/plain). - Substitutions are made at the source level, so if you want to . + Use the force-text-mode action + to also filter other content.
Substitutions are made at the source level, so if you want to "roll your own" filters, you should first be familiar with HTML syntax, - and, of course, regular expressions. By default, filters are only applied - to the raw document content, but can be extended to the HTTP headers with - the supplemental actions: - filter-client-headers and - filter-server-headers.
- A filter header line for a filter called + Filter definitions start with a header line that contains the filter
+ type, the filter name and the filter description.
+ A content filter header line for a filter called "foo" could look
@@ -314,14 +363,14 @@ CLASS="SECT2"
Now, let's complete our "foo" filter. We have already defined
+ content filter. We have already defined
the heading, but the jobs are still missing. Since all it does is to replace
9.2. The Pre-defined Filters
The distribution
- Header filter to change the Content-Type from xml to html.
+ Server-header filter to change the Content-Type from xml to html.
- Header filter to change the Content-Type from html to xml.
+ Server-header filter to change the Content-Type from html to xml.
+ Removes the non-standard ping attribute from
+ anchor and area HTML tags.
+ Client-header filter to remove the Tor exit node notation
+ found in Host and Referer headers.
+ If Privoxy and Tor are chained and Privoxy
+ is configured to use socks4a, one can use "http://www.example.org.foobar.exit/"
+ to access the host "www.example.org" through the
+ Tor exit node "foobar".
+ As the HTTP client isn't aware of this notation, it treats the
+ whole string "www.example.org.foobar.exit" as the host and uses it
+ for the "Host" and "Referer" headers. From the
+ server's point of view the resulting headers are invalid and can cause problems.
+ An invalid "Referer" header can trigger "hot-linking"
+ protections; an invalid "Host" header will make it impossible for
+ the server to find the right vhost (several domains hosted on the same IP address).
+ This client-header filter removes the "foobar.exit" part in those headers
+ to prevent the mentioned problems. Note that it only modifies
+ the HTTP headers; it doesn't make it impossible for the server
+ to detect your Tor exit node based on the IP address
+ the request is coming from.
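The header rewrite described above can be sketched in Python as follows. This is a minimal illustration, not the filter shipped in default.filter: the pattern `EXIT_NOTATION` and the helper `strip_exit_notation` are assumptions made for this example.

```python
import re

# Hypothetical pattern: a dot, an exit-node name, then ".exit",
# followed by a port separator, a path separator, or the end of the value.
EXIT_NOTATION = re.compile(r"\.[A-Za-z0-9-]+\.exit(?=[:/]|$)")


def strip_exit_notation(header_value: str) -> str:
    """Remove the Tor exit node notation from a Host or Referer value."""
    return EXIT_NOTATION.sub("", header_value)


host = strip_exit_notation("www.example.org.foobar.exit")
referer = strip_exit_notation("http://www.example.org.foobar.exit/")
unchanged = strip_exit_notation("www.example.org")
```

As in the text, only the header values change; the IP address the request arrives from is unaffected.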
9.1. Filter File Tutorial