1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
2 "http://www.w3.org/TR/html4/loose.dtd">
6 <meta name="generator" content=
7 "HTML Tidy for Linux/x86 (vers 7 December 2008), see www.w3.org">
9 <title>Filter Files</title>
10 <meta name="GENERATOR" content=
11 "Modular DocBook HTML Stylesheet Version 1.79">
12 <link rel="HOME" title="Privoxy 3.0.18 User Manual" href="index.html">
13 <link rel="PREVIOUS" title="Actions Files" href="actions-file.html">
14 <link rel="NEXT" title="Privoxy's Template Files" href="templates.html">
15 <link rel="STYLESHEET" type="text/css" href="../p_doc.css">
16 <meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
17 <link rel="STYLESHEET" type="text/css" href="p_doc.css">
18 <style type="text/css">
20 background-color: #EEEEEE;
23 :link { color: #0000FF }
24 :visited { color: #840084 }
25 :active { color: #0000FF }
26 table.c4 {background-color: #E0E0E0}
27 tt.c3 {font-style: italic}
28 span.c2 {font-style: italic}
29 hr.c1 {text-align: left}
34 <div class="NAVHEADER">
35 <table summary="Header navigation table" width="100%" border="0"
36 cellpadding="0" cellspacing="0">
38 <th colspan="3" align="center">Privoxy 3.0.18 User Manual</th>
42 <td width="10%" align="left" valign="bottom"><a href=
43 "actions-file.html" accesskey="P">Prev</a></td>
45 <td width="80%" align="center" valign="bottom"></td>
47 <td width="10%" align="right" valign="bottom"><a href=
48 "templates.html" accesskey="N">Next</a></td>
51 <hr class="c1" width="100%">
<h1 class="SECT1"><a name="FILTER-FILE" id="FILTER-FILE">9. Filter
Files</a></h1>
58 <p>On-the-fly text substitutions need to be defined in a <span class=
59 "QUOTE">"filter file"</span>. Once defined, they can then be invoked as
60 an <span class="QUOTE">"action"</span>.</p>
<p><span class="APPLICATION">Privoxy</span> supports three different
filter actions: <tt class="LITERAL"><a href=
"actions-file.html#FILTER">filter</a></tt> to rewrite the content that is
sent to the client, <tt class="LITERAL"><a href=
"actions-file.html#CLIENT-HEADER-FILTER">client-header-filter</a></tt> to
rewrite headers that are sent by the client, and <tt class=
"LITERAL"><a href=
"actions-file.html#SERVER-HEADER-FILTER">server-header-filter</a></tt> to
rewrite headers that are sent by the server.</p>
72 <p><span class="APPLICATION">Privoxy</span> also supports two tagger
73 actions: <tt class="LITERAL"><a href=
74 "actions-file.html#CLIENT-HEADER-TAGGER">client-header-tagger</a></tt>
75 and <tt class="LITERAL"><a href=
76 "actions-file.html#SERVER-HEADER-TAGGER">server-header-tagger</a></tt>.
Taggers and filters use the same syntax in the filter files; the
difference is that taggers don't modify the text they are filtering, but
use a rewritten version of the filtered text as a tag. The tags can then be
used to change which actions apply through sections with <a href=
"actions-file.html#TAG-PATTERN">tag-patterns</a>.</p>
83 <p>Multiple filter files can be defined through the <tt class=
84 "LITERAL"><a href="config.html#FILTERFILE">filterfile</a></tt> config
85 directive. The filters as supplied by the developers are located in
86 <tt class="FILENAME">default.filter</tt>. It is recommended that any
87 locally defined or modified filters go in a separately defined file such
88 as <tt class="FILENAME">user.filter</tt>.</p>
<p>Common tasks for content filters are to eliminate common annoyances in
HTML and JavaScript (pop-up windows, exit consoles, crippled windows
without navigation tools, the infamous &lt;BLINK&gt; tag, etc.), to
suppress images with certain width and height attributes (standard banner
sizes or web-bugs), or just to have fun.</p>
<p>Enabled content filters are applied to any content whose <span class=
"QUOTE">"Content-Type"</span> header is recognised as a sign of
98 text-based content, with the exception of <tt class=
99 "LITERAL">text/plain</tt>. Use the <a href=
100 "actions-file.html#FORCE-TEXT-MODE">force-text-mode</a> action to also
101 filter other content.</p>
103 <p>Substitutions are made at the source level, so if you want to
104 <span class="QUOTE">"roll your own"</span> filters, you should first be
105 familiar with HTML syntax, and, of course, regular expressions.</p>
107 <p>Just like the <a href="actions-file.html">actions files</a>, the
filter file is organized in sections, which are called <span class=
"emphasis EMPHASIS c2">filters</span> here. Each filter consists of a
heading line that starts with one of the <span class=
111 "emphasis EMPHASIS c2">keywords</span> <tt class="LITERAL">FILTER:</tt>,
112 <tt class="LITERAL">CLIENT-HEADER-FILTER:</tt> or <tt class=
113 "LITERAL">SERVER-HEADER-FILTER:</tt> followed by the filter's
114 <span class="emphasis EMPHASIS c2">name</span>, and a short (one line)
115 <span class="emphasis EMPHASIS c2">description</span> of what it does.
116 Below that line come the <span class="emphasis EMPHASIS c2">jobs</span>,
117 i.e. lines that define the actual text substitutions. By convention, the
118 name of a filter should describe what the filter <span class=
119 "emphasis EMPHASIS c2">eliminates</span>. The comment is used in the
120 <a href="http://config.privoxy.org/" target="_top">web-based user
123 <p>Once a filter called <tt class="REPLACEABLE c3">name</tt> has been
124 defined in the filter file, it can be invoked by using an action of the
125 form +<tt class="LITERAL"><a href=
126 "actions-file.html#FILTER">filter</a>{<tt class=
127 "REPLACEABLE c3">name</tt>}</tt> in any <a href=
128 "actions-file.html">actions file</a>.</p>
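<p>The layout described above can be sketched in a few lines of Python.
This is an illustration only, not Privoxy's own parser: the function name
<tt class="LITERAL">parse_filter_file</tt> and the dictionary layout are
hypothetical, only the three keywords named above are recognised, and the
sketch assumes that blank lines and lines starting with
<tt class="LITERAL">#</tt> can be skipped and that no job is continued
across several lines.</p>

```python
import re

# Heading line: keyword, filter name, one-line description.
# Only the three keywords named in this section are recognised here.
HEADER = re.compile(
    r"^(FILTER|CLIENT-HEADER-FILTER|SERVER-HEADER-FILTER):\s*(\S+)\s*(.*)$"
)


def parse_filter_file(text):
    """Split filter-file text into filters (hypothetical helper).

    Returns a list of dicts with the keys type, name, description, jobs.
    Assumption: comments and blank lines are ignored, and every other
    line below a heading is one complete job.
    """
    filters = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = HEADER.match(line)
        if match:
            filters.append({
                "type": match.group(1),
                "name": match.group(2),
                "description": match.group(3),
                "jobs": [],
            })
        elif filters:
            # Any other line belongs to the most recent heading.
            filters[-1]["jobs"].append(line)
    return filters


SAMPLE = 'FILTER: foo Replace all "foo" with "bar"\n\n# a comment\ns/foo/bar/g\n'
parsed = parse_filter_file(SAMPLE)
```

<p>For the sample text, <tt class="LITERAL">parsed</tt> holds a single
filter named <tt class="LITERAL">foo</tt> with one job.</p>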
130 <p>Filter definitions start with a header line that contains the filter
131 type, the filter name and the filter description. A content filter header
132 line for a filter called <span class="QUOTE">"foo"</span> could look like
135 <table class="c4" border="0" width="100%">
139 FILTER: foo Replace all "foo" with "bar"
145 <p>Below that line, and up to the next header line, come the jobs that
146 define what text replacements the filter executes. They are specified in
147 a syntax that imitates <a href="http://www.perl.org/" target=
148 "_top">Perl</a>'s <tt class="LITERAL">s///</tt> operator. If you are
149 familiar with Perl, you will find this to be quite intuitive, and may
150 want to look at the PCRS documentation for the subtle differences to Perl
151 behaviour. Most notably, the non-standard option letter <tt class=
152 "LITERAL">U</tt> is supported, which turns the default to ungreedy
155 <p>If you are new to <a href=
156 "http://en.wikipedia.org/wiki/Regular_expressions" target=
157 "_top"><span class="QUOTE">"Regular Expressions"</span></a>, you might
158 want to take a look at the <a href="appendix.html#REGEX">Appendix on
159 regular expressions</a>, and see the <a href=
160 "http://perldoc.perl.org/perlre.html" target="_top">Perl manual</a> for
161 <a href="http://perldoc.perl.org/perlop.html" target="_top">the
162 <tt class="LITERAL">s///</tt> operator's syntax</a> and <a href=
163 "http://perldoc.perl.org/perlre.html" target="_top">Perl-style regular
expressions</a> in general. The examples below might also help to get you
168 <h2 class="SECT2"><a name="AEN5041" id="AEN5041">9.1. Filter File
171 <p>Now, let's complete our <span class="QUOTE">"foo"</span> content
172 filter. We have already defined the heading, but the jobs are still
173 missing. Since all it does is to replace <span class=
174 "QUOTE">"foo"</span> with <span class="QUOTE">"bar"</span>, there is
175 only one (trivial) job needed:</p>
177 <table class="c4" border="0" width="100%">
187 <p>But wait! Didn't the comment say that <span class=
188 "emphasis EMPHASIS c2">all</span> occurrences of <span class=
189 "QUOTE">"foo"</span> should be replaced? Our current job will only take
190 care of the first <span class="QUOTE">"foo"</span> on each page. For
191 global substitution, we'll need to add the <tt class="LITERAL">g</tt>
194 <table class="c4" border="0" width="100%">
204 <p>Our complete filter now looks like this:</p>
206 <table class="c4" border="0" width="100%">
210 FILTER: foo Replace all "foo" with "bar"
217 <p>Let's look at some real filters for more interesting examples. Here
218 you see a filter that protects against some common annoyances that
219 arise from JavaScript abuse. Let's look at its jobs one after the
222 <table class="c4" border="0" width="100%">
226 FILTER: js-annoyances Get rid of particularly annoying JavaScript abuse
228 # Get rid of JavaScript referrer tracking. Test page: http://www.randomoddness.com/untitled.htm
230 s|(<script.*)document\.referrer(.*</script>)|$1"Not Your Business!"$2|Usg
236 <p>Following the header line and a comment, you see the job. Note that
237 it uses <tt class="LITERAL">|</tt> as the delimiter instead of
238 <tt class="LITERAL">/</tt>, because the pattern contains a forward
239 slash, which would otherwise have to be escaped by a backslash
240 (<tt class="LITERAL">\</tt>).</p>
<p>Now, let's examine the pattern: it starts with the text <tt class=
"LITERAL">&lt;script.*</tt> enclosed in parentheses. Since the dot
matches any character, and <tt class="LITERAL">*</tt> means:
<span class="QUOTE">"Match an arbitrary number of the element left of
myself"</span>, this matches <span class="QUOTE">"&lt;script"</span>,
followed by <span class="emphasis EMPHASIS c2">any</span> text, i.e. it
matches the whole page, from the start of the first &lt;script&gt;
<p>That's more than we want, but the pattern continues: <tt class=
"LITERAL">document\.referrer</tt> matches only the exact string
<span class="QUOTE">"document.referrer"</span>. The dot needed to be
<span class="emphasis EMPHASIS c2">escaped</span>, i.e. preceded by a
backslash, to take away its special meaning as a wildcard, and make it
just a regular dot. So far, the meaning is: Match from the start of the
first &lt;script&gt; tag in the page, up to, and including, the text
<span class="QUOTE">"document.referrer"</span>, if <span class=
"emphasis EMPHASIS c2">both</span> are present in the page (and appear
<p>But there's still more pattern to go. The next element, again
enclosed in parentheses, is <tt class="LITERAL">.*&lt;/script&gt;</tt>.
You already know what <tt class="LITERAL">.*</tt> means, so the whole
pattern translates to: Match from the start of the first &lt;script&gt;
tag in a page to the end of the last &lt;/script&gt; tag, provided that
the text <span class="QUOTE">"document.referrer"</span> appears
somewhere in between.</p>
<p>This is still not the whole story, since we have ignored the options
and the parentheses: The portions of the page matched by sub-patterns
that are enclosed in parentheses will be remembered and be available
through the variables <tt class="LITERAL">$1, $2, ...</tt> in the
substitute. The <tt class="LITERAL">U</tt> option switches to ungreedy
matching, which means that the first <tt class="LITERAL">.*</tt> in the
pattern will only <span class="QUOTE">"eat up"</span> all text in
between <span class="QUOTE">"&lt;script"</span> and the <span class=
"emphasis EMPHASIS c2">first</span> occurrence of <span class=
"QUOTE">"document.referrer"</span>, and that the second <tt class=
"LITERAL">.*</tt> will only span the text up to the <span class=
"emphasis EMPHASIS c2">first</span> <span class=
"QUOTE">"&lt;/script&gt;"</span> tag. Furthermore, the <tt class=
"LITERAL">s</tt> option says that the match may span multiple lines in
the page, and the <tt class="LITERAL">g</tt> option again means that
the substitution is global.</p>
287 <p>So, to summarize, the pattern means: Match all scripts that contain
288 the text <span class="QUOTE">"document.referrer"</span>. Remember the
289 parts of the script from (and including) the start tag up to (and
290 excluding) the string <span class="QUOTE">"document.referrer"</span> as
291 <tt class="LITERAL">$1</tt>, and the part following that string, up to
292 and including the closing tag, as <tt class="LITERAL">$2</tt>.</p>
<p>Now the pattern is deciphered, but wasn't this about substituting
things? So let's look at the substitute: <tt class="LITERAL">$1"Not Your
296 Business!"$2</tt> is easy to read: The text remembered as <tt class=
297 "LITERAL">$1</tt>, followed by <tt class="LITERAL">"Not Your
298 Business!"</tt> (<span class="emphasis EMPHASIS c2">including</span>
299 the quotation marks!), followed by the text remembered as <tt class=
300 "LITERAL">$2</tt>. This produces an exact copy of the original string,
301 with the middle part (the <span class=
302 "QUOTE">"document.referrer"</span>) replaced by <tt class=
303 "LITERAL">"Not Your Business!"</tt>.</p>
305 <p>The whole job now reads: Replace <span class=
306 "QUOTE">"document.referrer"</span> by <tt class="LITERAL">"Not Your
Business!"</tt> wherever it appears inside a &lt;script&gt; tag. Note
308 that this job won't break JavaScript syntax, since both the original
309 and the replacement are syntactically valid string objects. The script
310 just won't have access to the referrer information anymore.</p>
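<p>To experiment with this job outside of
<span class="APPLICATION">Privoxy</span>, it can be approximated with
Python's <tt class="LITERAL">re</tt> module. This is only a rough
equivalent under stated assumptions: Python has no
<tt class="LITERAL">U</tt> option, so the sketch writes the ungreedy
quantifiers as <tt class="LITERAL">.*?</tt>, uses
<tt class="LITERAL">re.DOTALL</tt> in place of the
<tt class="LITERAL">s</tt> option, and relies on
<tt class="LITERAL">re.sub</tt> replacing every match, as the
<tt class="LITERAL">g</tt> option does. The function name is
hypothetical.</p>

```python
import re

# Rough Python counterpart of the job
#   s|(<script.*)document\.referrer(.*</script>)|$1"Not Your Business!"$2|Usg
# Assumption: ".*?" stands in for ".*" under the U option.
REFERRER_JOB = re.compile(
    r"(<script.*?)document\.referrer(.*?</script>)",
    re.DOTALL,  # like the s option: "." also matches newlines
)


def hide_referrer(page):
    """Replace document.referrer inside scripts (hypothetical helper)."""
    # \1 and \2 play the role of $1 and $2 in the substitute.
    return REFERRER_JOB.sub(r'\1"Not Your Business!"\2', page)


page = (
    "<script>var a = 1;</script>\n"
    "<script>\nvar r = document.referrer;\n</script>"
)
filtered = hide_referrer(page)
```

<p>Text outside of the matched span, and pages that never mention
<tt class="LITERAL">document.referrer</tt>, pass through unchanged.</p>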
312 <p>We'll show you two other jobs from the JavaScript taming department,
313 but this time only point out the constructs of special interest:</p>
315 <table class="c4" border="0" width="100%">
319 # The status bar is for displaying link targets, not pointless blahblah
321 s/window\.status\s*=\s*(['"]).*?\1/dUmMy=1/ig
327 <p><tt class="LITERAL">\s</tt> stands for whitespace characters (space,
328 tab, newline, carriage return, form feed), so that <tt class=
329 "LITERAL">\s*</tt> means: <span class="QUOTE">"zero or more
330 whitespace"</span>. The <tt class="LITERAL">?</tt> in <tt class=
331 "LITERAL">.*?</tt> makes this matching of arbitrary text ungreedy.
332 (Note that the <tt class="LITERAL">U</tt> option is not set). The
333 <tt class="LITERAL">['"]</tt> construct means: <span class="QUOTE">"a
334 single <span class="emphasis EMPHASIS c2">or</span> a double
335 quote"</span>. Finally, <tt class="LITERAL">\1</tt> is a back-reference
336 to the first parenthesis just like <tt class="LITERAL">$1</tt> above,
337 with the difference that in the <span class=
338 "emphasis EMPHASIS c2">pattern</span>, a backslash indicates a
339 back-reference, whereas in the <span class=
340 "emphasis EMPHASIS c2">substitute</span>, it's the dollar.</p>
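<p>The constructs discussed above behave the same way in Python's
<tt class="LITERAL">re</tt> module, which makes it convenient for trying
the pattern out. The sketch below is an approximation, not part of
<span class="APPLICATION">Privoxy</span>:
<tt class="LITERAL">re.IGNORECASE</tt> stands in for the
<tt class="LITERAL">i</tt> option, <tt class="LITERAL">re.sub</tt>
replaces all matches like the <tt class="LITERAL">g</tt> option, and the
function name is hypothetical.</p>

```python
import re

# Rough Python counterpart of  s/window\.status\s*=\s*(['"]).*?\1/dUmMy=1/ig
STATUS_JOB = re.compile(
    r"""window\.status\s*=\s*(['"]).*?\1""",
    re.IGNORECASE,
)


def tame_status_bar(script):
    """Replace string assignments to window.status (hypothetical helper)."""
    return STATUS_JOB.sub("dUmMy=1", script)


single = tame_status_bar("window.status = 'Click here!'; x = 1;")
double = tame_status_bar('Window.Status="a \'quoted\' word";')
mixed = tame_status_bar("window.status = 'unterminated\";")
```

<p>The back-reference <tt class="LITERAL">\1</tt> forces the closing
quote to be of the same kind as the opening one, which is why the
mismatched quotes in the last call are left alone.</p>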
342 <p>So what does this job do? It replaces assignments of single- or
343 double-quoted strings to the <span class="QUOTE">"window.status"</span>
property with a dummy assignment (using a variable name that is hopefully
345 odd enough not to conflict with real variables in scripts). Thus, it
346 catches many cases where e.g. pointless descriptions are displayed in
347 the status bar instead of the link target when you move your mouse over
350 <table class="c4" border="0" width="100%">
354 # Kill OnUnload popups. Yummy. Test: http://www.zdnet.com/zdsubs/yahoo/tree/yfs.html
356 s/(<body [^>]*)onunload(.*>)/$1never$2/iU
362 <p>Including the <a href=
363 "http://www.w3.org/TR/2000/REC-DOM-Level-2-Events-20001113/events.html#Events-eventgroupings-htmlevents"
364 target="_top">OnUnload event binding</a> in the HTML DOM was a
365 <span class="emphasis EMPHASIS c2">CRIME</span>. When I close a browser
366 window, I want it to close and die. Basta. This job replaces the
<span class="QUOTE">"onunload"</span> attribute in <span class=
"QUOTE">"&lt;body&gt;"</span> tags with the dummy word <tt class=
"LITERAL">never</tt>. Note that the <tt class="LITERAL">i</tt> option
makes the pattern matching case-insensitive. Also note that ungreedy
matching alone doesn't always guarantee a minimal match: In the first
parenthesis, we had to use <tt class="LITERAL">[^&gt;]*</tt> instead of
<tt class="LITERAL">.*</tt> to prevent the match from exceeding the
&lt;body&gt; tag if it doesn't contain <span class=
"QUOTE">"OnUnload"</span>, but the page's content does.</p>
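<p>The effect of the character class can be checked with a small Python
approximation of this job. As before, this is a sketch under
assumptions: the ungreedy quantifiers of the <tt class="LITERAL">U</tt>
option are written out as <tt class="LITERAL">*?</tt>,
<tt class="LITERAL">count=1</tt> mirrors the absence of the
<tt class="LITERAL">g</tt> option, and the function name is
hypothetical.</p>

```python
import re

# Rough Python counterpart of  s/(<body [^>]*)onunload(.*>)/$1never$2/iU
ONUNLOAD_JOB = re.compile(r"(<body [^>]*?)onunload(.*?>)", re.IGNORECASE)


def kill_onunload(page):
    """Rename the onunload attribute of a body tag (hypothetical helper)."""
    # count=1: without the g option only the first match is replaced.
    return ONUNLOAD_JOB.sub(r"\1never\2", page, count=1)


with_handler = kill_onunload('<body bgcolor="white" OnUnload="popup()">')
# "[^>]" cannot cross the end of the body tag, so text later in the
# page that merely mentions onunload does not trigger a match.
without_handler = kill_onunload(
    '<body bgcolor="white"><p>The onunload event</p>'
)
```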
377 <p>The last example is from the fun department:</p>
379 <table class="c4" border="0" width="100%">
383 FILTER: fun Fun text replacements
385 # Spice the daily news:
387 s/microsoft(?!\.com)/MicroSuck/ig
393 <p>Note the <tt class="LITERAL">(?!\.com)</tt> part (a so-called
394 negative lookahead) in the job's pattern, which means: Don't match, if
395 the string <span class="QUOTE">".com"</span> appears directly following
396 <span class="QUOTE">"microsoft"</span> in the page. This prevents links
397 to microsoft.com from being trashed, while still replacing the word
400 <table class="c4" border="0" width="100%">
404 # Buzzword Bingo (example for extended regex syntax)
406 s* industry[ -]leading \
408 | customer[ -]focused \
410 | award[ -]winning # Comments are OK, too! \
411 | high[ -]performance \
412 | solutions[ -]based \
416 *<font color="red"><b>BINGO!</b></font> \
423 <p>The <tt class="LITERAL">x</tt> option in this job turns on extended
424 syntax, and allows for e.g. the liberal use of (non-interpreted!)
425 whitespace for nicer formatting.</p>
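<p>Both constructs from the fun filter, the negative lookahead and the
extended syntax, also exist in Python's <tt class="LITERAL">re</tt>
module, so they can be tried out there. The sketch below is an
approximation: <tt class="LITERAL">re.VERBOSE</tt> stands in for the
<tt class="LITERAL">x</tt> option, only the buzzwords shown above are
listed, case-insensitive matching for the bingo pattern is an assumption,
and the function names are hypothetical.</p>

```python
import re

# Negative lookahead: match "microsoft" unless ".com" follows directly.
SPICE_JOB = re.compile(r"microsoft(?!\.com)", re.IGNORECASE)

# Verbose mode ignores whitespace in the pattern (except inside
# character classes such as "[ -]") and allows comments.
BINGO_JOB = re.compile(
    r"""
      industry[ -]leading
    | customer[ -]focused
    | award[ -]winning        # comments are OK here, too
    | high[ -]performance
    | solutions[ -]based
    """,
    re.IGNORECASE | re.VERBOSE,
)

BINGO = '<font color="red"><b>BINGO!</b></font>'


def spice(text):
    """Apply the word replacement (hypothetical helper)."""
    return SPICE_JOB.sub("MicroSuck", text)


def bingo(text):
    """Mark every buzzword (hypothetical helper)."""
    return BINGO_JOB.sub(BINGO, text)


news = spice("Microsoft says: visit www.microsoft.com today")
pitch = bingo("An award-winning, industry leading tool")
```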
427 <p>You get the idea?</p>
431 <h2 class="SECT2"><a name="PREDEFINED-FILTERS" id=
432 "PREDEFINED-FILTERS">9.2. The Pre-defined Filters</a></h2>
434 <p>The distribution <tt class="FILENAME">default.filter</tt> file
435 contains a selection of pre-defined filters for your convenience:</p>
437 <div class="VARIABLELIST">
439 <dt><span class="emphasis EMPHASIS c2">js-annoyances</span></dt>
442 <p>The purpose of this filter is to get rid of particularly
443 annoying JavaScript abuse. To that end, it</p>
447 <p>replaces JavaScript references to the browser's referrer
448 information with the string "Not Your Business!". This
complements the <tt class="LITERAL"><a href=
450 "actions-file.html#HIDE-REFERRER">hide-referrer</a></tt>
451 action on the content level.</p>
455 <p>removes the bindings to the DOM's <a href=
456 "http://www.w3.org/TR/2000/REC-DOM-Level-2-Events-20001113/events.html#Events-eventgroupings-htmlevents"
457 target="_top">unload event</a> which we feel has no right to
458 exist and is responsible for most <span class="QUOTE">"exit
459 consoles"</span>, i.e. nasty windows that pop up when you
460 close another one.</p>
464 <p>removes code that causes new windows to be opened with
465 undesired properties, such as being full-screen,
466 non-resizeable, without location, status or menu bar etc.</p>
470 <p>Use with caution. This is an aggressive filter, and can break
471 sites that rely heavily on JavaScript.</p>
474 <dt><span class="emphasis EMPHASIS c2">js-events</span></dt>
<p>This is a very radical measure. It removes virtually all
JavaScript event bindings, which means that scripts can no longer
react to user actions such as mouse movements or clicks, window
resizing, etc. Use with caution!</p>
482 <p>We <span class="emphasis EMPHASIS c2">strongly
483 discourage</span> using this filter as a default since it breaks
484 many legitimate scripts. It is meant for use only on extra-nasty
485 sites (should you really need to go there).</p>
488 <dt><span class="emphasis EMPHASIS c2">html-annoyances</span></dt>
491 <p>This filter will undo many common instances of HTML based
494 <p>The <tt class="LITERAL">BLINK</tt> and <tt class=
495 "LITERAL">MARQUEE</tt> tags are neutralized (yeah baby!), and
496 browser windows will be created as resizeable (as of course they
497 should be!), and will have location, scroll and menu bars -- even
498 if specified otherwise.</p>
501 <dt><span class="emphasis EMPHASIS c2">content-cookies</span></dt>
504 <p>Most cookies are set in the HTTP dialog, where they can be
505 intercepted by the <tt class="LITERAL"><a href=
506 "actions-file.html#CRUNCH-INCOMING-COOKIES">crunch-incoming-cookies</a></tt>
507 and <tt class="LITERAL"><a href=
508 "actions-file.html#CRUNCH-OUTGOING-COOKIES">crunch-outgoing-cookies</a></tt>
509 actions. But web sites increasingly make use of HTML meta tags
510 and JavaScript to sneak cookies to the browser on the content
513 <p>This filter disables most HTML and JavaScript code that reads
514 or sets cookies. It cannot detect all clever uses of these types
515 of code, so it should not be relied on as an absolute fix. Use it
516 wherever you would also use the cookie crunch actions.</p>
519 <dt><span class="emphasis EMPHASIS c2">refresh tags</span></dt>
522 <p>Disable any refresh tags if the interval is greater than nine
523 seconds (so that redirections done via refresh tags are not
524 destroyed). This is useful for dial-on-demand setups, or for
525 those who find this HTML feature annoying.</p>
<dt><span class=
"emphasis EMPHASIS c2">unsolicited-popups</span></dt>
532 <p>This filter attempts to prevent only <span class=
533 "QUOTE">"unsolicited"</span> pop-up windows from opening, yet
534 still allow pop-up windows that the user has explicitly chosen to
535 open. It was added in version 3.0.1, as an improvement over
536 earlier such filters.</p>
538 <p>Technical note: The filter works by redefining the window.open
539 JavaScript function to a dummy function, <tt class=
540 "LITERAL">PrivoxyWindowOpen()</tt>, during the loading and
541 rendering phase of each HTML page access, and restoring the
542 function afterward.</p>
544 <p>This is recommended only for browsers that cannot perform this
545 function reliably themselves. And be aware that some sites
546 require such windows in order to function normally. Use with
550 <dt><span class="emphasis EMPHASIS c2">all-popups</span></dt>
553 <p>Attempt to prevent <span class=
554 "emphasis EMPHASIS c2">all</span> pop-up windows from opening.
555 Note this should be used with even more discretion than the
556 above, since it is more likely to break some sites that require
557 pop-ups for normal usage. Use with caution.</p>
560 <dt><span class="emphasis EMPHASIS c2">img-reorder</span></dt>
563 <p>This is a helper filter that has no value if used alone. It
564 makes the <tt class="LITERAL">banners-by-size</tt> and <tt class=
565 "LITERAL">banners-by-link</tt> (see below) filters more effective
566 and should be enabled together with them.</p>
569 <dt><span class="emphasis EMPHASIS c2">banners-by-size</span></dt>
572 <p>This filter removes image tags purely based on what size they
573 are. Fortunately for us, many ads and banner images tend to
574 conform to certain standardized sizes, which makes this filter
575 quite effective for ad stripping purposes.</p>
577 <p>Occasionally this filter will cause false positives on images
578 that are not ads, but just happen to be of one of the standard
581 <p>Recommended only for those who require extreme ad blocking.
582 The default block rules should catch 95+% of all ads <span class=
583 "emphasis EMPHASIS c2">without</span> this filter enabled.</p>
586 <dt><span class="emphasis EMPHASIS c2">banners-by-link</span></dt>
589 <p>This is an experimental filter that attempts to kill any
590 banners if their URLs seem to point to known or suspected click
591 trackers. It is currently not of much value and is not
592 recommended for use by default.</p>
595 <dt><span class="emphasis EMPHASIS c2">webbugs</span></dt>
<p>Webbugs are small, invisible images (typically 1x1 GIF
599 images), that are used to track users across websites, and
600 collect information on them. As an HTML page is loaded by the
601 browser, an embedded image tag causes the browser to contact a
602 third-party site, disclosing the tracking information through the
603 requested URL and/or cookies for that third-party domain, without
604 the user ever becoming aware of the interaction with the
605 third-party site. HTML-ized spam also uses a similar technique to
606 verify email addresses.</p>
608 <p>This filter removes the HTML code that loads such <span class=
609 "QUOTE">"webbugs"</span>.</p>
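<p>To illustrate the idea, the following Python sketch removes image
tags that declare both a width and a height of exactly one pixel. It is
not the filter shipped in <tt class="FILENAME">default.filter</tt>; the
pattern, the helper name and the choice to look only at
<tt class="LITERAL">width</tt> and <tt class="LITERAL">height</tt>
attributes are assumptions made for the example.</p>

```python
import re

# An attribute value of exactly 1, quoted or unquoted, not followed by
# further digits, letters or a percent sign (so 10, 1px and 1% are kept).
_ONE = r"""\s*=\s*["']?1["']?(?![\w%])"""

# Two lookaheads require both attributes, in any order, inside the tag.
WEBBUG = re.compile(
    r"<img\b(?=[^>]*\bwidth" + _ONE + r")(?=[^>]*\bheight" + _ONE + r")[^>]*>",
    re.IGNORECASE,
)


def strip_webbugs(page):
    """Remove 1x1 image tags (hypothetical helper, not Privoxy's filter)."""
    return WEBBUG.sub("", page)


bugged = strip_webbugs(
    '<p>Hi<img src="http://tracker.example/t.gif" width="1" height="1"></p>'
)
logo = strip_webbugs('<img src="logo.png" width="100" height="10">')
```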
612 <dt><span class="emphasis EMPHASIS c2">tiny-textforms</span></dt>
615 <p>A rather special-purpose filter that can be used to enlarge
616 textareas (those multi-line text boxes in web forms) and turn off
617 hard word wrap in them. It was written for the sourceforge.net
618 tracker system where such boxes are a nuisance, but it can be
619 handy on other sites, too.</p>
621 <p>It is not recommended to use this filter as a default.</p>
624 <dt><span class="emphasis EMPHASIS c2">jumping-windows</span></dt>
<p>Many consider windows that move or resize themselves to be
628 abusive. This filter neutralizes the related JavaScript code.
629 Note that some sites might not display or behave as intended when
630 using this filter. Use with caution.</p>
633 <dt><span class="emphasis EMPHASIS c2">frameset-borders</span></dt>
636 <p>Some web designers seem to assume that everyone in the world
637 will view their web sites using the same browser brand and
638 version, screen resolution etc, because only that assumption
639 could explain why they'd use static frame sizes, yet prevent
640 their frames from being resized by the user, should they be too
641 small to show their whole content.</p>
643 <p>This filter removes the related HTML code. It should only be
644 applied to sites which need it.</p>
647 <dt><span class="emphasis EMPHASIS c2">demoronizer</span></dt>
650 <p>Many Microsoft products that generate HTML use non-standard
651 extensions (read: violations) of the ISO 8859-1 aka Latin-1
652 character set. This can cause those HTML documents to display
653 with errors on standard-compliant platforms.</p>
655 <p>This filter translates the MS-only characters into Latin-1
656 equivalents. It is not necessary when using MS products, and will
657 cause corruption of all documents that use 8-bit character sets
658 other than Latin-1. It's mostly worthwhile for Europeans on
659 non-MS platforms, if weird garbage characters sometimes appear on
660 some pages, or user agents that don't correct for this on the
664 <dt><span class="emphasis EMPHASIS c2">shockwave-flash</span></dt>
<p>A filter for Shockwave haters. As the name suggests, this
filter strips code out of web pages that is used to embed
Shockwave Flash objects.</p>
<dt><span class=
"emphasis EMPHASIS c2">quicktime-kioskmode</span></dt>
<p>Changes HTML code that embeds QuickTime objects so that
kiosk mode, which prevents saving, is disabled.</p>
680 <dt><span class="emphasis EMPHASIS c2">fun</span></dt>
683 <p>Text replacements for subversive browsing fun. Make fun of
684 your favorite Monopolist or play buzzword bingo.</p>
687 <dt><span class="emphasis EMPHASIS c2">crude-parental</span></dt>
690 <p>A demonstration-only filter that shows how <span class=
691 "APPLICATION">Privoxy</span> can be used to delete web content on
695 <dt><span class="emphasis EMPHASIS c2">ie-exploits</span></dt>
698 <p>An experimental collection of text replacements to disable
699 malicious HTML and JavaScript code that exploits known security
700 holes in Internet Explorer.</p>
702 <p>Presently, it only protects against Nimda and a cross-site
703 scripting bug, and would need active maintenance to provide more
704 substantial protection.</p>
707 <dt><span class="emphasis EMPHASIS c2">site-specifics</span></dt>
710 <p>Some web sites have very specific problems, the cure for which
711 doesn't apply anywhere else, or could even cause damage on other
714 <p>This is a collection of such site-specific cures which should
715 only be applied to the sites they were intended for, which is
716 what the supplied <tt class="FILENAME">default.action</tt> file
717 does. Users shouldn't need to change anything regarding this
721 <dt><span class="emphasis EMPHASIS c2">google</span></dt>
<p>A CSS-based block for Google text ads. It also removes a width
limitation and the toolbar advertisement.</p>
728 <dt><span class="emphasis EMPHASIS c2">yahoo</span></dt>
<p>Another CSS-based block, this time for Yahoo text ads. It also
removes a width limitation.</p>
735 <dt><span class="emphasis EMPHASIS c2">msn</span></dt>
<p>Another CSS-based block, this time for MSN text ads. It also
removes tracking URLs and a width limitation.</p>
742 <dt><span class="emphasis EMPHASIS c2">blogspot</span></dt>
745 <p>Cleans up some Blogspot blogs. Read the fine print before
<p>This filter also intentionally removes some navigation elements
and sets the page width to 100%. As a result, some rounded
<span class="QUOTE">"corners"</span> would appear too early or not
at all. Fixing this would require a browser that understands
background-size (CSS3), so they are removed instead.</p>
755 <dt><span class="emphasis EMPHASIS c2">xml-to-html</span></dt>
758 <p>Server-header filter to change the Content-Type from xml to
762 <dt><span class="emphasis EMPHASIS c2">html-to-xml</span></dt>
765 <p>Server-header filter to change the Content-Type from html to
769 <dt><span class="emphasis EMPHASIS c2">no-ping</span></dt>
772 <p>Removes the non-standard <tt class="LITERAL">ping</tt>
773 attribute from anchor and area HTML tags.</p>
<dt><span class=
"emphasis EMPHASIS c2">hide-tor-exit-notation</span></dt>
780 <p>Client-header filter to remove the <b class="COMMAND">Tor</b>
781 exit node notation found in Host and Referer headers.</p>
<p>If <span class="APPLICATION">Privoxy</span> and <b class=
"COMMAND">Tor</b> are chained and <span class=
"APPLICATION">Privoxy</span> is configured to use socks4a, one
can use <span class=
"QUOTE">"http://www.example.org.foobar.exit/"</span> to access
788 the host <span class="QUOTE">"www.example.org"</span> through the
789 <b class="COMMAND">Tor</b> exit node <span class=
790 "QUOTE">"foobar"</span>.</p>
792 <p>As the HTTP client isn't aware of this notation, it treats the
793 whole string <span class=
794 "QUOTE">"www.example.org.foobar.exit"</span> as host and uses it
795 for the <span class="QUOTE">"Host"</span> and <span class=
796 "QUOTE">"Referer"</span> headers. From the server's point of view
797 the resulting headers are invalid and can cause problems.</p>
<p>An invalid <span class="QUOTE">"Referer"</span> header can
trigger <span class="QUOTE">"hot-linking"</span> protections, and an
invalid <span class="QUOTE">"Host"</span> header will make it
impossible for the server to find the right vhost (several
domains hosted on the same IP address).</p>
<p>This client-header filter removes the <span class=
"QUOTE">"foobar.exit"</span> part in those headers to prevent the
mentioned problems. Note that it only modifies the HTTP headers;
it doesn't make it impossible for the server to detect your
<b class="COMMAND">Tor</b> exit node based on the IP address the
request is coming from.</p>
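<p>The header rewrite described above can be sketched in Python as
follows. This is an illustration of the idea, not the job shipped in
<tt class="FILENAME">default.filter</tt>; the pattern and the helper
name are assumptions, and the sketch handles one header line at a
time.</p>

```python
import re

# Strip ".<node>.exit" from the host part of a Host or Referer header.
# Group 1 keeps the header name, an optional scheme and the real host.
EXIT_NOTATION = re.compile(
    r"^((?:Host|Referer):\s*(?:https?://)?[^/\s]*?)"
    r"\.[^./\s]+\.exit(?=[/:\s]|$)",
    re.IGNORECASE,
)


def strip_exit_notation(header_line):
    """Remove the Tor exit node notation (hypothetical helper)."""
    return EXIT_NOTATION.sub(r"\1", header_line)


host = strip_exit_notation("Host: www.example.org.foobar.exit")
referer = strip_exit_notation(
    "Referer: http://www.example.org.foobar.exit/index.html"
)
other = strip_exit_notation("User-Agent: foo.bar.exit")
```

<p>Headers other than <span class="QUOTE">"Host"</span> and
<span class="QUOTE">"Referer"</span> are left untouched, as are header
values that do not end their host part in
<span class="QUOTE">".exit"</span>.</p>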
817 <div class="NAVFOOTER">
818 <hr class="c1" width="100%">
820 <table summary="Footer navigation table" width="100%" border="0"
821 cellpadding="0" cellspacing="0">
823 <td width="33%" align="left" valign="top"><a href="actions-file.html"
824 accesskey="P">Prev</a></td>
826 <td width="34%" align="center" valign="top"><a href="index.html"
827 accesskey="H">Home</a></td>
829 <td width="33%" align="right" valign="top"><a href="templates.html"
830 accesskey="N">Next</a></td>
834 <td width="33%" align="left" valign="top">Actions Files</td>
836 <td width="34%" align="center" valign="top"> </td>
838 <td width="33%" align="right" valign="top">Privoxy's Template