From: hal9 <hal9@users.sourceforge.net> Date: Mon, 2 Oct 2006 22:43:53 +0000 (+0000) Subject: Contains new filter definitions from Fabian, and few other miscellaneous X-Git-Tag: v_3_0_6~68 X-Git-Url: http://www.privoxy.org/gitweb/%22https:/faq/@default-cgi@/@default-cgi@send-stylesheet?a=commitdiff_plain;h=368c902fdd59356f93f21202797695664decef6b;p=privoxy.git Contains new filter definitions from Fabian, and few other miscellaneous touch-ups. --- diff --git a/doc/source/user-manual.sgml b/doc/source/user-manual.sgml index 0cadea86..be15be01 100644 --- a/doc/source/user-manual.sgml +++ b/doc/source/user-manual.sgml @@ -11,8 +11,8 @@ <!entity license SYSTEM "license.sgml"> <!entity p-authors SYSTEM "p-authors.sgml"> <!entity config SYSTEM "p-config.sgml"> -<!entity p-version "3.0.5"> -<!entity p-status "BETA"> +<!entity p-version "3.0.6"> +<!entity p-status "UNRELEASED"> <!entity % p-authors-formal "INCLUDE"> <!-- include additional text, etc --> <!entity % p-not-stable "INCLUDE"> <!entity % p-stable "IGNORE"> @@ -33,7 +33,7 @@ This file belongs into ijbswa.sourceforge.net:/home/groups/i/ij/ijbswa/htdocs/ - $Id: user-manual.sgml,v 2.21 2006/09/20 03:21:36 david__schmidt Exp $ + $Id: user-manual.sgml,v 2.22 2006/09/22 01:27:55 hal9 Exp $ Copyright (C) 2001- 2006 Privoxy Developers http://www.privoxy.org See LICENSE. @@ -59,7 +59,7 @@ </subscript> </pubdate> -<pubdate>$Id: user-manual.sgml,v 2.21 2006/09/20 03:21:36 david__schmidt Exp $</pubdate> +<pubdate>$Id: user-manual.sgml,v 2.22 2006/09/22 01:27:55 hal9 Exp $</pubdate> <!-- @@ -746,7 +746,7 @@ How to install the binary packages depends on your operating system: helpful. You can also view and edit the actions files through the <ulink url="http://config.privoxy.org">web-based user interface</ulink>. 
The Appendix <quote><link linkend="actionsanat">Troubleshooting: Anatomy of an - Action</link></quote> has hints how to understand and debug actions that + Action</link></quote> has hints on how to understand and debug actions that <quote>misbehave</quote>. </para> </listitem> @@ -1764,7 +1764,9 @@ for details. editor</emphasis>. A default installation should be pre-set to <literal>Cautious</literal> (versions prior to 3.0.5 were set to <literal>Medium</literal>). New users should try this for a while before - adjusting the settings to more aggressive levels. + adjusting the settings to more aggressive levels. The more aggressive + the settings, then the more likelihood there is of problems such as sites + not working as they should. </para> <para> The <guibutton>Edit</guibutton> button allows you to turn each @@ -1836,7 +1838,7 @@ for details. <entry>Pop-up killing</entry> <entry>blocks only</entry> <entry>blocks only</entry> - <entry>all</entry> + <entry>blocks only</entry> </row> <row> @@ -1879,14 +1881,14 @@ for details. <row> <entry>HTML taming</entry> <entry>no</entry> - <entry>yes</entry> + <entry>no</entry> <entry>yes</entry> </row> <row> <entry>JavaScript taming</entry> <entry>no</entry> - <entry>yes</entry> + <entry>no</entry> <entry>yes</entry> </row> @@ -2450,8 +2452,9 @@ for details. are applied in the order they are specified. Actions files are processed in the order they are defined in <filename>config</filename> (the default installation has three actions files). It also quite possible for any given - URL pattern to match more than one pattern and thus more than one set of - actions! Last match wins. + URL to match more than one <quote>pattern</quote> (because of wildcards and + regular expressions), and thus to trigger more than one set of actions! Last + match wins. 
</para> <!-- start actions listing --> @@ -3407,7 +3410,8 @@ problem-host.example.com</screen> <varlistentry> <term>Typical use:</term> <listitem> - <para>Get rid of HTML and JavaScript annoyances, banner advertisements (by size), do fun text replacements, etc.</para> + <para>Get rid of HTML and JavaScript annoyances, banner advertisements (by size), + do fun text replacements, add personalized effects, etc.</para> </listitem> </varlistentry> @@ -3415,13 +3419,14 @@ problem-host.example.com</screen> <term>Effect:</term> <listitem> <para> - All files of text-based type, most notably HTML and JavaScript, to which this - action applies, are filtered on-the-fly through the specified regular expression - based substitutions. (Note: as of version 3.0.3 plain text documents - are exempted from filtering, because web servers often use the - <literal>text/plain</literal> MIME type for all files whose type they - don't know.) By default, filtering works only on the raw document content - itself (that which can be seen with <literal>View Source</literal>), + All files of text-based type, most notably HTML and + JavaScript, to which this action applies, can be filtered on-the-fly + through the specified regular expression based substitutions. (Note: as of + version 3.0.3 plain text documents are exempted from filtering, because + web servers often use the <literal>text/plain</literal> MIME type for all + files whose type they don't know.) By default, filtering works only on the + raw document content itself (that which can be seen with <literal>View + Source</literal>), not the headers. </para> </listitem> @@ -3475,8 +3480,9 @@ problem-host.example.com</screen> <ulink url="http://en.wikipedia.org/wiki/Regular_expressions"><quote>Regular Expressions</quote></ulink> and <ulink url="http://en.wikipedia.org/wiki/Html"><quote>HTML</quote></ulink>. - This is very powerful feature, and potentially very intrusive. Use - with caution. 
+ This is very powerful feature, and potentially very intrusive. + Filters should be used with caution, and where an equivalent + <quote>action</quote> is not available. </para> <para> The amount of data that can be filtered is limited to the @@ -3496,7 +3502,7 @@ problem-host.example.com</screen> <para> At this time, <application>Privoxy</application> cannot uncompress compressed documents. If you want filtering to work on all documents, even those that - would normally be sent compressed, use the + would normally be sent compressed, you must use the <literal><link linkend="prevent-compression">prevent-compression</link></literal> action in conjunction with <literal>filter</literal>. </para> @@ -3548,11 +3554,11 @@ problem-host.example.com</screen> </para> <para> <anchor id="filter-unsolicited-popups"> - <screen>+filter{unsolicited-popups} # Disable only unsolicited pop-up windows</screen> + <screen>+filter{unsolicited-popups} # Disable only unsolicited pop-up windows. Useful if your browser lacks this ability.</screen> </para> <para> <anchor id="filter-all-popups"> - <screen>+filter{all-popups} # Kill all popups in JavaScript and HTML</screen> + <screen>+filter{all-popups} # Kill all popups in JavaScript and HTML. 
Useful if your browser lacks this ability.</screen> </para> <para> <anchor id="filter-img-reorder"> @@ -3606,6 +3612,34 @@ problem-host.example.com</screen> <anchor id="filter-ie-exploits"> <screen>+filter{ie-exploits} # Disable some known Internet Explorer bug exploits</screen> </para> + <para> + <anchor id="filter-site-specifics"> + <screen>+filter{site-specifics} # Custom filters for specific site related problems</screen> + </para> + <para> + <anchor id="filter-google"> + <screen>+filter{google} # Removes text ads and other Google specific improvements</screen> + </para> + <para> + <anchor id="filter-yahoo"> + <screen>+filter{yahoo} # Removes text ads and other Yahoo specific improvements</screen> + </para> + <para> + <anchor id="filter-msn"> + <screen>+filter{msn} # Removes text ads and other MSN specific improvements</screen> + </para> + <para> + <anchor id="filter-blogspot"> + <screen>+filter{blogspot} # Cleans up Blogspot blogs</screen> + </para> + <para> + <anchor id="filter-html-to-xml"> + <screen>+filter{html-to-xml} # Header filter to change the Content-Type from html to xml</screen> + </para> + <para> + <anchor id="filter-xml-to-html"> + <screen>+filter{xml-to-html} # Header filter to change the Content-Type from xml to html</screen> + </para> </listitem> </varlistentry> </variablelist> @@ -4760,7 +4794,7 @@ new action Killing all pop-ups unconditionally is problematic. Many shops and banks rely on pop-ups to display forms, shopping carts etc, and the <literal><link linkend="FILTER-UNSOLICITED-POPUPS">filter{<replaceable>unsolicited-popups</replaceable>}</link> - </literal> does a fairly good job of catching only the unwanted ones. + </literal> does a better job of catching only the unwanted ones. </para> <para> If the only kind of pop-ups that you want to kill are exit consoles (those @@ -4770,6 +4804,10 @@ new action linkend="filter">filter</link>{<replaceable>js-annoyances</replaceable>}</literal> instead. 
</para> + <para> + This action is most appropriate for browsers that don't have any controls + for unwanted pop-ups. Not recommended for general usage. + </para> <!-- <para> @@ -5810,20 +5848,20 @@ that also explains why and how aliases are used: -<link linkend="CRUNCH-OUTGOING-COOKIES">crunch-outgoing-cookies</link> \ +<link linkend="DEANIMATE-GIFS">deanimate-gifs</link> \ -<link linkend="DOWNGRADE-HTTP-VERSION">downgrade-http-version</link> \ - +<link linkend="FAST-REDIRECTS">fast-redirects{check-decoded-url}</link> \ - +<link linkend="FILTER-JS-ANNOYANCES">filter{js-annoyances}</link> \ + -<link linkend="FAST-REDIRECTS">fast-redirects{check-decoded-url}</link> \ + -<link linkend="FILTER-JS-ANNOYANCES">filter{js-annoyances}</link> \ -<link linkend="FILTER-JS-EVENTS">filter{js-events}</link> \ +<link linkend="FILTER-HTML-ANNOYANCES">filter{html-annoyances}</link> \ -<link linkend="FILTER-CONTENT-COOKIES">filter{content-cookies}</link> \ +<link linkend="FILTER-REFRESH-TAGS">filter{refresh-tags}</link> \ - +<link linkend="FILTER-UNSOLICITED-POPUPS">filter{unsolicited-popups}</link> \ + -<link linkend="FILTER-UNSOLICITED-POPUPS">filter{unsolicited-popups}</link> \ -<link linkend="FILTER-ALL-POPUPS">filter{all-popups}</link> \ - +<link linkend="FILTER-IMG-REORDER">filter{img-reorder}</link> \ - +<link linkend="FILTER-BANNERS-BY-SIZE">filter{banners-by-size}</link> \ + -<link linkend="FILTER-IMG-REORDER">filter{img-reorder}</link> \ + -<link linkend="FILTER-BANNERS-BY-SIZE">filter{banners-by-size}</link> \ -<link linkend="FILTER-BANNERS-BY-LINK">filter{banners-by-link}</link> \ +<link linkend="FILTER-WEBBUGS">filter{webbugs}</link> \ -<link linkend="FILTER-TINY-TEXTFORMS">filter{tiny-textforms}</link> \ - +<link linkend="FILTER-JUMPING-WINDOWS">filter{jumping-windows}</link> \ + -<link linkend="FILTER-JUMPING-WINDOWS">filter{jumping-windows}</link> \ -<link linkend="FILTER-FRAMESET-BORDERS">filter{frameset-borders}</link> \ -<link 
linkend="FILTER-DEMORONIZER">filter{demoronizer}</link> \ -<link linkend="FILTER-SHOCKWAVE-FLASH">filter{shockwave-flash}</link> \ @@ -5833,6 +5871,12 @@ that also explains why and how aliases are used: +<link linkend="FILTER-IE-EXPLOITS">filter{ie-exploits}</link> \ -<link linkend="FILTER-CLIENT-HEADERS">filter-client-headers</link> \ -<link linkend="FILTER-SERVER-HEADERS">filter-server-headers</link> \ + -<link linkend="FILTER-GOOGLE">filter-google</link> \ + -<link linkend="FILTER-YAHOO">filter-yahoo</link> \ + -<link linkend="FILTER-MSN">filter-msn</link> \ + -<link linkend="FILTER-BLOGSPOT">filter-blogspot</link> \ + -<link linkend="FILTER-XML-TO-HTML">filter-xml-to-html</link> \ + -<link linkend="FILTER-HTML-TO-XML">filter-html-to-xml</link> \ -<link linkend="FORCE-TEXT-MODE">force-text-mode</link> \ -<link linkend="HANDLE-AS-EMPTY-DOCUMENT">handle-as-empty-document</link> \ -<link linkend="HANDLE-AS-IMAGE">handle-as-image</link> \ @@ -5886,7 +5930,8 @@ that also explains why and how aliases are used: # { fragile } .office.microsoft.com # surprise, surprise! -.windowsupdate.microsoft.com</screen> +.windowsupdate.microsoft.com +mail.google.com</screen> </para> <para> @@ -6002,13 +6047,12 @@ ar.atwola.com .a.yimg.com/(?:(?!/i/).)*$ .a[0-9].yimg.com/(?:(?!/i/).)*$ bs*.gsanet.com -bs*.einets.com .qkimg.net</screen> </para> <para> One of the most important jobs of <application>Privoxy</application> - is to block banners. A huge bunch of them can be <quote>blocked</quote> + is to block banners. Many of these can be <quote>blocked</quote> by the <literal><link linkend="filter">filter</link>{banners-by-size}</literal> action, which we enabled above, and which deletes the references to banner images from the pages while they are loaded, so the browser doesn't request @@ -6018,7 +6062,7 @@ bs*.einets.com <literal><link linkend="block">block</link></literal> action to them. 
</para> <para> - First comes a bunch of generic patterns, which do most of the work, by + First comes many generic patterns, which do most of the work, by matching typical domain and path name components of banners. Then comes a list of individual patterns for specific sites, which is omitted here to keep the example short: @@ -6046,7 +6090,7 @@ count*. </para> <para> - You wouldn't believe how many advertisers actually call their banner + It's quite remarkable how many advertisers actually call their banner servers ads.<replaceable>company</replaceable>.com, or call the directory in which the banners are stored simply <quote>banners</quote>. So the above generic patterns are surprisingly effective. @@ -6084,6 +6128,7 @@ count*. { -<link linkend="BLOCK">block</link> } adv[io]*. # (for advogato.org and advice.*) adsl. # (has nothing to do with ads) +adobe. # (has nothing to do with ads either) ad[ud]*. # (adult.* and add.*) .edu # (universities don't host banners (yet!)) .*loads. # (downloads, uploads etc) @@ -6111,7 +6156,10 @@ www.ugu.com/sui/ugu/adv</screen> # Don't filter code! # { -<link linkend="FILTER">filter</link> } -/.*cvs +/(.*/)?cvs +bugzilla. +developer. +wiki. .sourceforge.net</screen> </para> @@ -6292,7 +6340,7 @@ stupid-server.example.com/</screen> <screen> { fragile } .forbes.com - mail.example.com + webmail.example.com .mybank.com</screen> </para> @@ -6749,6 +6797,10 @@ pre-defined filters for your convenience: </listitem> </itemizedlist> </para> + <para> + Use with caution. This is an aggressive filter, and can break sites that + rely heavily on JavaScript. + </para> </listitem> </varlistentry> @@ -6758,7 +6810,7 @@ pre-defined filters for your convenience: <para> This is a very radical measure. It removes virtually all JavaScript event bindings, which means that scripts can not react to user actions such as mouse movements or clicks, window - resizing etc, anymore. + resizing etc, anymore. Use with caution! 
</para> <para> We <emphasis>strongly discourage</emphasis> using this filter as a default since it breaks @@ -6795,8 +6847,10 @@ pre-defined filters for your convenience: to sneak cookies to the browser on the content level. </para> <para> - This filter disables HTML and JavaScript code that reads or sets cookies. Use - it wherever you would also use the cookie crunch actions. + This filter disables most HTML and JavaScript code that reads or sets + cookies. It cannot detect all clever uses of these types of code, so it + should not be relied on as an absolute fix. Use it wherever you would also + use the cookie crunch actions. </para> </listitem> </varlistentry> @@ -6824,8 +6878,14 @@ pre-defined filters for your convenience: </para> <para> Technical note: The filter works by redefining the window.open JavaScript - function to a dummy function during the loading and rendering phase of each - HTML page access, and restoring the function afterward. + function to a dummy function, <literal>PrivoxyWindowOpen()</literal>, + during the loading and rendering phase of each HTML page access, and + restoring the function afterward. + </para> + <para> + This is recommended only for browsers that cannot perform this function + reliably themselves. And be aware that some sites require such windows + in order to function normally. Use with caution. </para> </listitem> </varlistentry> @@ -6835,9 +6895,9 @@ pre-defined filters for your convenience: <listitem> <para> Attempt to prevent <emphasis>all</emphasis> pop-up windows from opening. - Note this should be used with more discretion than the above, since it is - more likely to break some sites that require pop-ups for normal usage. Use - with caution. + Note this should be used with even more discretion than the above, since + it is more likely to break some sites that require pop-ups for normal + usage. Use with caution. 
</para> </listitem> </varlistentry> @@ -6865,6 +6925,10 @@ pre-defined filters for your convenience: Occasionally this filter will cause false positives on images that are not ads, but just happen to be of one of the standard banner sizes. </para> + <para> + Recommended only for those who require extreme ad blocking. The default + block rules should catch 95+% of all ads <emphasis>without</emphasis> this filter enabled. + </para> </listitem> </varlistentry> @@ -6888,7 +6952,7 @@ pre-defined filters for your convenience: As an HTML page is loaded by the browser, an embedded image tag causes the browser to contact a third-party site, disclosing the tracking information through the requested URL and/or cookies for that third-party domain, without - the use ever becoming aware of the interaction with the third-party site. + the user ever becoming aware of the interaction with the third-party site. HTML-ized spam also uses a similar technique to verify email addresses. </para> <para> @@ -6918,7 +6982,7 @@ pre-defined filters for your convenience: <para> Many consider windows that move, or resize themselves to be abusive. This filter neutralizes the related JavaScript code. Note that some sites might not display - or behave as intended when using this filter. + or behave as intended when using this filter. Use with caution. </para> </listitem> </varlistentry> @@ -7010,7 +7074,7 @@ pre-defined filters for your convenience: <term><emphasis>ie-exploits</emphasis></term> <listitem> <para> - A collection of text replacements to disable malicious HTML and JavaScript + An experimental collection of text replacements to disable malicious HTML and JavaScript code that exploits known security holes in Internet Explorer. </para> <para> @@ -7036,6 +7100,68 @@ pre-defined filters for your convenience: </listitem> </varlistentry> + <varlistentry> + <term><emphasis>google</emphasis></term> + <listitem> + <para> + A CSS based block for Google text ads. 
Also removes a width limitation + and the toolbar advertisement. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><emphasis>yahoo</emphasis></term> + <listitem> + <para> + Another CSS based block, this time for Yahoo text ads. And removes + a width limitation as well. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><emphasis>msn</emphasis></term> + <listitem> + <para> + Another CSS based block, this time for MSN text ads. And removes + tracking URLs, as well as a width limitation. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><emphasis>blogspot</emphasis></term> + <listitem> + <para> + Cleans up some Blogspot blogs. Read the fine print before using this one! + </para> + <para> + This filter also intentionally removes some navigation stuff and sets the + page width to 100%. As a result, some rounded <quote>corners</quote> would + appear too early or not at all and as fixing this would require a browser + that understands background-size (CSS3), they are removed instead. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><emphasis>xml-to-html</emphasis></term> + <listitem> + <para> + Header filter to change the Content-Type from xml to html. + </para> + </listitem> + </varlistentry> + <varlistentry> + <term><emphasis>html-to-xml</emphasis></term> + <listitem> + <para> + Header filter to change the Content-Type from html to xml.
+ </para> + </listitem> + </varlistentry> + <!-- <varlistentry> <term><emphasis> </emphasis></term> @@ -7846,15 +7972,21 @@ Requests</title> -filter {fun} -filter {crude-parental} -filter {site-specifics} - +filter {js-annoyances} - +filter {html-annoyances} + -filter {js-annoyances} + -filter {html-annoyances} +filter {refresh-tags} - +filter {unsolicited-popups} + -filter {unsolicited-popups} +filter {img-reorder} +filter {banners-by-size} +filter {webbugs} +filter {jumping-windows} +filter {ie-exploits} + -filter {google} + -filter {yahoo} + -filter {msn} + -filter {blogspot} + -filter {xml-to-html} + -filter {html-to-xml} -filter-client-headers -filter-server-headers -force-text-mode @@ -7963,16 +8095,34 @@ In file: user.action <guibutton>[ View ]</guibutton> <guibutton>[ Edit ]</guibut -crunch-server-header +deanimate-gifs {last} -downgrade-http-version - -fast-redirects - +filter {js-annoyances} - +filter {html-annoyances} + +fast-redirects {check-decoded-url} + -filter {js-events} + -filter {content-cookies} + -filter {all-popups} + -filter {banners-by-link} + -filter {tiny-textforms} + -filter {frameset-borders} + -filter {demoronizer} + -filter {shockwave-flash} + -filter {quicktime-kioskmode} + -filter {fun} + -filter {crude-parental} + -filter {site-specifics} + -filter {js-annoyances} + -filter {html-annoyances} +filter {refresh-tags} - +filter {unsolicited-popups} + -filter {unsolicited-popups} +filter {img-reorder} +filter {banners-by-size} +filter {webbugs} +filter {jumping-windows} +filter {ie-exploits} + -filter {google} + -filter {yahoo} + -filter {msn} + -filter {blogspot} + -filter {xml-to-html} + -filter {html-to-xml} -filter-client-headers -filter-server-headers -force-text-mode @@ -8070,15 +8220,34 @@ In file: user.action <guibutton>[ View ]</guibutton> <guibutton>[ Edit ]</guibut -crunch-server-header +deanimate-gifs -downgrade-http-version - +fast-redirects{check-decoded-url} - +filter{html-annoyances} - +filter{js-annoyances} - 
+filter{kill-popups} - +filter{webbugs} - +filter{nimda} - +filter{banners-by-size} - +filter{hal} - +filter{fun} + +fast-redirects {check-decoded-url} + -filter {js-events} + -filter {content-cookies} + -filter {all-popups} + -filter {banners-by-link} + -filter {tiny-textforms} + -filter {frameset-borders} + -filter {demoronizer} + -filter {shockwave-flash} + -filter {quicktime-kioskmode} + -filter {fun} + -filter {crude-parental} + -filter {site-specifics} + -filter {js-annoyances} + -filter {html-annoyances} + +filter {refresh-tags} + -filter {unsolicited-popups} + +filter {img-reorder} + +filter {banners-by-size} + +filter {webbugs} + +filter {jumping-windows} + +filter {ie-exploits} + -filter {google} + -filter {yahoo} + -filter {msn} + -filter {blogspot} + -filter {xml-to-html} + -filter {html-to-xml} -filter-client-headers -filter-server-headers -force-text-mode @@ -8091,7 +8260,7 @@ In file: user.action <guibutton>[ View ]</guibutton> <guibutton>[ Edit ]</guibut +hide-referer{forge} -hide-user-agent -inspect-jpegs - +kill-popups + -kill-popups -overwrite-last-modified +prevent-compression -redirect @@ -8260,6 +8429,10 @@ In file: user.action <guibutton>[ View ]</guibutton> <guibutton>[ Edit ]</guibut USA $Log: user-manual.sgml,v $ + Revision 2.22 2006/09/22 01:27:55 hal9 + Final commit of probably various minor changes here and there. Unless + something changes this should be ready for pending release. + Revision 2.21 2006/09/20 03:21:36 david__schmidt Just the tiniest tweak. Wafer thin!
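The manual text touched by this patch states that a URL may match more than one pattern and that the last match wins. The following is a minimal sketch of that rule only, not Privoxy's implementation. The function name `effective_actions`, the `SECTIONS` data, and the use of shell-style globbing via `fnmatch` are all assumptions made for illustration; Privoxy's real pattern syntax is different and richer.

```python
from fnmatch import fnmatch


def effective_actions(url, sections):
    """Return the final on/off state of each action for `url`.

    `sections` is an ordered list of (patterns, actions) pairs, where
    `actions` maps an action name to True (+action) or False (-action).
    Sections are applied in order, so a later match overrides an earlier one.
    """
    state = {}
    for patterns, actions in sections:
        # Hypothetical matching: shell-style globs stand in for URL patterns.
        if any(fnmatch(url, pattern) for pattern in patterns):
            state.update(actions)  # last match wins
    return state


# Hypothetical rule set, loosely modelled on the examples in the diff.
SECTIONS = [
    (["*"], {"filter{banners-by-size}": True, "filter{webbugs}": True}),
    (["*.sourceforge.net*", "wiki.*"],
     {"filter{banners-by-size}": False, "filter{webbugs}": False}),
]

print(effective_actions("www.example.com/index.html", SECTIONS))
print(effective_actions("privoxy.sourceforge.net/user-manual/", SECTIONS))
```

In this sketch the catch-all section enables both filters, and the later, narrower section switches them off again for the hosts it matches, which mirrors how the "Don't filter code!" section in the patched example overrides the defaults.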