X-Git-Url: http://www.privoxy.org/gitweb/?p=privoxy.git;a=blobdiff_plain;f=doc%2Fwebserver%2Fuser-manual%2Ffilter-file.html;h=fa278a0a210744ed5202649ff82494ce56229bc6;hp=289d606c10a8a6d57992b7d778f2910111712513;hb=107c84d0c43b24ad437933c75774276f67165959;hpb=5fd77903894c0798908743d90ce72b9bdf2cce7d

diff --git a/doc/webserver/user-manual/filter-file.html b/doc/webserver/user-manual/filter-file.html
index 289d606c..fa278a0a 100644
--- a/doc/webserver/user-manual/filter-file.html
+++ b/doc/webserver/user-manual/filter-file.html
@@ -1,403 +1,484 @@
- - + -
-On-the-fly text substitutions need to be defined in a "filter file". Once defined, they can then be invoked as - an "action".
- -Privoxy supports three different - filter actions: filter to rewrite the content that is - sent to the client, client-header-filter to - rewrite headers that are sent by the client, and server-header-filter to - rewrite headers that are sent by the server.
- -Privoxy also supports two tagger - actions: client-header-tagger - and server-header-tagger. - Taggers and filters use the same syntax in the filter files; the - difference is that taggers don't modify the text they are filtering, but - use a rewritten version of the filtered text as a tag. The tags can then be - used to change the applying actions through sections with tag-patterns.
- -Multiple filter files can be defined through the filterfile config - directive. The filters as supplied by the developers are located in - default.filter. It is recommended that any - locally defined or modified filters go in a separately defined file such - as user.filter.
- -Common tasks for content filters are to eliminate common annoyances in - HTML and JavaScript, such as pop-up windows, exit consoles, crippled - windows without navigation tools, the infamous <BLINK> tag, etc., to - suppress images with certain width and height attributes (standard banner - sizes or web-bugs), or just to have fun.
- -Enabled content filters are applied to any content whose "Content-Type" header is recognised as a sign of - text-based content, with the exception of text/plain. Use the force-text-mode action to also - filter other content.
- -Substitutions are made at the source level, so if you want to - "roll your own" filters, you should first be - familiar with HTML syntax, and, of course, regular expressions.
- -Just like the actions files, the - filter file is organized in sections, which are called filters here. Each filter consists of a - heading line that starts with one of the keywords FILTER:, - CLIENT-HEADER-FILTER: or SERVER-HEADER-FILTER: followed by the filter's - name, and a short (one line) - description of what it does. - Below that line come the jobs, - i.e. lines that define the actual text substitutions. By convention, the - name of a filter should describe what the filter eliminates. The description is used in the - web-based user - interface.
- -Once a filter called name has been - defined in the filter file, it can be invoked by using an action of the - form +filter{name} in any actions file.
- -Filter definitions start with a header line that contains the filter - type, the filter name and the filter description. A content filter header - line for a filter called "foo" could look like - this:
-
-FILTER: foo Replace all "foo" with "bar"
-
Below that line, and up to the next header line, come the jobs that - define what text replacements the filter executes. They are specified in - a syntax that imitates Perl's s/// operator. If you are - familiar with Perl, you will find this to be quite intuitive, and may - want to look at the PCRS documentation for the subtle differences to Perl - behaviour. Most notably, the non-standard option letter U is supported, which turns the default to ungreedy - matching.
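As a rough illustration of what ungreedy matching means, the Python sketch below compares a greedy and a lazy quantifier. Python's re module has no U option, so the lazy form `.*?` stands in for what U would make of `.*`. The sample string and variable names are made up for the example, and PCRS may differ from Python in other details.

```python
import re

# Made-up sample text; not taken from any real page.
text = "<b>one</b> and <b>two</b>"

# Greedy: ".*" grabs as much as possible, so one match spans both tag pairs.
greedy = re.sub(r"<b>.*</b>", "X", text)

# Lazy: ".*?" grabs as little as possible, so each tag pair matches on its own.
# This only approximates how the U option would treat ".*"; it is not PCRS.
ungreedy = re.sub(r"<b>.*?</b>", "X", text)

print(greedy)
print(ungreedy)
```

The greedy pattern swallows everything between the first opening tag and the last closing tag, which is rarely what a content filter wants.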
- -If you are new to "Regular Expressions", you might - want to take a look at the Appendix on - regular expressions, and see the Perl manual for - the - s/// operator's syntax and Perl-style regular - expressions in general. The examples below might also help to get you - started.
- -Now, let's complete our "foo" content - filter. We have already defined the heading, but the jobs are still - missing. Since all it does is to replace "foo" with "bar", there is - only one (trivial) job needed:
-
-s/foo/bar/
+Privoxy 3.0.25 User Manual
But wait! Didn't the comment say that all occurrences of "foo" should be replaced? Our current job will only take - care of the first "foo" on each page. For - global substitution, we'll need to add the g - option:
-
-s/foo/bar/g
+Prev
+Next
Our complete filter now looks like this:
-
- +FILTER: foo Replace all "foo" with "bar"
-s/foo/bar/g
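To see the effect of the g option outside Privoxy, the following Python sketch mimics the two jobs with `re.sub`. It is only an analogy: the `page` string and the variable names are invented for the example, and Python's `count` argument plays the role that the presence or absence of g plays in the job.

```python
import re

# Made-up sample page text for illustration.
page = "foo, foo and more foo"

# Like s/foo/bar/ : only the first match is replaced.
first_only = re.sub(r"foo", "bar", page, count=1)

# Like s/foo/bar/g : every match is replaced (count=0 is Python's default).
replace_all = re.sub(r"foo", "bar", page)

print(first_only)
print(replace_all)
```

Only the second call matches what the filter's description promises, which is why the finished filter carries the g option.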
Let's look at some real filters for more interesting examples. Here - you see a filter that protects against some common annoyances that - arise from JavaScript abuse. Let's look at its jobs one after the - other:
- -
- +