X-Git-Url: http://www.privoxy.org/gitweb/?a=blobdiff_plain;f=doc%2Fsource%2Fuser-manual.sgml;h=5df5080cafe3944529099504f556b3fb31bcf22c;hb=ae931e26de71fe1d1bd124ed66e98dbd3f11deaf;hp=da22d3bff281b08ed5a6c3698b93d6b31744d099;hpb=b7cfc9285d9e3f0e002a1275fdffb0ee67802d14;p=privoxy.git diff --git a/doc/source/user-manual.sgml b/doc/source/user-manual.sgml index da22d3bf..5df5080c 100644 --- a/doc/source/user-manual.sgml +++ b/doc/source/user-manual.sgml @@ -11,11 +11,11 @@ - - + + - - + + @@ -24,6 +24,7 @@ + Privoxy"> ]> - Copyright &my-copy; 2001 - 2008 by + Copyright &my-copy; 2001-2010 by Privoxy Developers -$Id: user-manual.sgml,v 2.89 2008/09/21 15:38:56 fabiankeil Exp $ +$Id: user-manual.sgml,v 2.130 2010/12/01 19:28:28 fabiankeil Exp $ @@ -436,44 +437,521 @@ How to install the binary packages depends on your operating system: What's New in this Release - There are only a few improvements and new features since - Privoxy 3.0.10, the last stable release: + Privoxy 3.0.17 is a stable release. + The changes since 3.0.16 stable are: - The mingw32 version uses mutex locks now which prevents - log message corruption under load. As a side effect, - the "no thread-safe PRNG" warning could be removed as well. + Fixed last-chunk-detection for responses where the content was small + enough to be read with the body, causing Privoxy to wait for the + end of the content until the server closed the connection or the + request timed out. Reported by "Karsten" in #3028326. - Support for remote toggling is controlled by the configure - option --disable-toggle only. In previous versions it also - depended on the action editor and thus configuring with the - --disable-editor option would disable remote toggling support - as well. + Responses with status code 204 weren't properly detected as body-less + like RFC2616 mandates. Like the previous bug, this caused Privoxy to + wait for the end of the content until the server closed the connection + or the request timed out. 
Fixes #3022042 and #3025553, reported by a + user with no visible name. Most likely also fixes a bunch of other + AJAX-related problem reports that got closed in the past due to + insufficient information and lack of feedback. - The hide-forwarded-for-headers action has been replaced with - the change-x-forwarded-for{} action which can also be used to - add X-Forwarded-For headers. The latter functionality already - existed in Privoxy versions prior to 3.0.7 but has been removed - as it was often used unintentionally (by not using the - hide-forwarded-for-headers action). + Fixed an ACL bug that made it impossible to build a blacklist. + Usually the ACL directives are used in a whitelist, which worked + as expected, but blacklisting is still useful for public proxies + where one only needs to deny known abusers access. + + + + + Added LOG_LEVEL_RECEIVED to log the not-yet-parsed data read from the + network. This should make debugging various parsing issues a lot easier. + + + + + The IPv6 code is enabled by default on Windows versions that support it. + Patch submitted by oCameLo in #2942729. + + + + + In mingw32 versions, the user.filter file is reachable through the + GUI, just like default.filter is. Feature request 3040263. + + + + + Added the configure option --enable-large-file-support to set a few + defines that are required by platforms like GNU/Linux to support files + larger than 2GB. Mainly interesting for users without proper logfile + management. + + + + + Logging with "debug 16" no longer stops at the first nul byte which is + pretty useless. Non-printable characters are replaced with their hex value + so the result can't span multiple lines making parsing them harder than + necessary. + + + + + Privoxy logs when reading an action, filter or trust file. + + + + + Fixed incorrect regression test markup which caused a test in + 3.0.16 to fail while Privoxy itself was working correctly.
+ While Privoxy accepts hide-referer, too, the action name is actually + hide-referrer which is also the name used on the final results page, + where the test expected the alias. + + + + + CGI interface improvements: + + + + In finish_http_response(), continue to add the 'Connection: close' + header if the client connection will not be kept alive. + Anonymously pointed out in #2987454. + + + + + Apostrophes in block messages no longer cause parse errors + when the blocked page is viewed with JavaScript enabled. + Reported by dg1727 in #3062296. + + + + + Fix a bunch of anchors that used underscores instead of dashes. + + + + + Allow the client connection to be kept alive after crunching the previous request. + Already opened server connections can be kept alive, too. + + + + + In cgi_show_url_info(), don't forget to prefix URLs that only contain + http:// or https:// in the path. Fixes #2975765 reported by Adam Piggott. + + + + + Show the 404 CGI page if cgi_send_user_manual() is called while + local user manual delivery is disabled. + + + + + + + + Action file improvements: + + + + Enable user.filter by default. Suggested by David White in #3001830. + + + + + Block .sitestat.com/. Reported by johnd16 in #3002725. + + + + + Block .atemda.com/. Reported by johnd16 in #3002723. + + + + + Block js.adlink.net/. Reported by johnd16 in #3002720. + + + + + Block .analytics.yahoo.com/. Reported by johnd16 in #3002713. + + + + + Block sb.scorecardresearch.com, too. Reported by dg1727 in #2992652. + + + + + Fix problems noticed on Yahoo mail and news pages. + + + + + Remove the too broad yahoo section, only keeping the + fast-redirects exception as discussed on ijbswa-devel@. + + + + + Don't block adesklets.sourceforge.net. Reported in #2974204. + + + + + Block chartbeat ping tracking. Reported in #2975895. + + + + + Tag CSS and image requests with cautious and medium settings, too. + + + + + Don't handle view.atdmt.com as image.
It's used for click-throughs + so users should be able to "go there anyway". + Reported by Adam Piggott in #2975927. + + + + + Also let the refresh-tags filter remove invalid refresh tags where + the 'url=' part is missing. Anonymously reported in #2986382. + While at it, update the description to mention the fact that only + refresh tags with refresh times above 9 seconds are covered. + + + + + JavaScript needs to be blocked with +handle-as-empty-document to + work around Firefox bug 492459. So move .js blockers from + +block{Might be a web-bug.} -handle-as-empty-document to + +block{Might be a web-bug.} +handle-as-empty-document. + + + + + ijbswa-Feature Requests-3006719 - Block 160x578 Banners. + + + + + Block another omniture tracking domain. + + + + + Added a range-requests tagger. + + + + + Added two sections to get Flickr's Ajax interface working with + default pre-settings. If you change the configuration to block + cookies by default, you'll need additional exceptions. + Reported by Mathias Homann in #3101419 and by Patrick on ijbswa-users@. + + + + + + + + Documentation improvements: + + + + Explicitly mention how to match all URLs. + + + + + Consistently recommend socks5 in the Tor FAQ entry and mention + its advantage compared to socks4a. Reported by David in #2960129. + + + + + Slightly improve the explanation of why filtering may appear + slower than it is. + + + + + Grammar fixes for the ACL section. + + + + + Fixed a link to the 'intercepting' entry and added another one. + + + + + Rename the 'Other' section to 'Mailing Lists' and reword it + to make it clear that nobody is forced to use the trackers. + + + + + Note that 'anonymously' posting on the trackers may not always + be possible. + + + + + Suggest enabling debug 32768 when suspecting parsing problems. + + + + + + + + Privoxy-Log-Parser improvements: + + + + Gather statistics for resources, methods, and HTTP versions + used by the client.
+ + + + + Also gather statistics for blocked and redirected requests. + + + + + Provide the percentage of keep-alive offers the client accepted. + + + + + Add a --url-statistics-threshold option. + + + + + Add a --host-statistics-threshold option to also gather + statistics about how many requests were made per host. + + + + + Fix a bug in handle_loglevel_header() where a 'scan: ' got lost. + + + + + Add a --shorten-thread-ids option to replace the thread id with + a decimal number. + + + + + Accept and ignore: Looks like we got the last chunk together + with the server headers. We better stop reading. + + + + + Accept and ignore: Continue hack in da house. + + + + + Accept and highlight: Rejecting connection from 10.0.0.2. + Maximum number of connections reached. + + + + + Accept and highlight: Loading actions file: /usr/local/etc/privoxy/default.action + + + + + Accept and highlight: Loading filter file: /usr/local/etc/privoxy/default.filter + + + + + Accept and highlight: Killed all-caps Host header line: HOST: bestproxydb.com + + + + + Accept and highlight: Reducing expected bytes to 0. Marking + the server socket tainted after throwing 4 bytes away. + + + + + Accept: Merged multiple header lines to: 'X-FORWARDED-PROTO: http X-HOST: 127.0.0.1' + + + + + + + + Code cleanups: + + + + Remove the next member from the client_state struct. Only the main + thread needs access to all client states so give it its own struct. + + + + + Garbage-collect request_contains_null_bytes(). + + + + + Ditch redundant code in unload_configfile(). + + + + + Ditch LogGetURLUnderCursor() which doesn't seem to be used anywhere. + + + + + In write_socket(), remove the write-only variable write_len in + an ifdef __OS2__ block. Spotted by cppcheck. + + + + + In connect_to(), don't declare the variable 'flags' on OS/2 where + it isn't used. Spotted by cppcheck. + + + + + Limit the scope of various variables. Spotted by cppcheck.
+ + + + + In add_to_iob(), turn an interestingly looking for loop into a + boring while loop. + + + + + Code cleanup in preparation for external filters. + + + + + In listen_loop(), mention the socket on which we accepted the + connection, not just the source IP address. + + + + + In write_socket(), also log the socket we're writing to. + + + + + In log_error(), assert that escaped characters get logged + completely or not at all. + + + + + In log_error(), assert that ival and sval have reasonable values. + There's no reason not to abort() if they don't. + + + + + Remove an incorrect cgi_error_unknown() call in a + cannot-happen-situation in send_crunch_response(). + + + + + Clean up white-space in http_response definition and + move the crunch_reason to the beginning. + + + + + Turn http_response.reason into an enum and rename it + to http_response.crunch_reason. + + + + + Silence a 'gcc (Debian 4.3.2-1.1) 4.3.2' warning on i686 GNU/Linux. + + + + + Fix white-space in a log message in remove_chunked_transfer_coding(). + While at it, add a note that the message doesn't seem to + be entirely correct and should be improved later on. + + + + + + + + GNUmakefile improvements: + + + + Use $(SSH) instead of ssh, so one only needs to specify a username once. + + + + + Removed references to the action feedback thingy that hasn't been + working for years. + + + + + Consistently use shell.sourceforge.net instead of shell.sf.net so + one doesn't need to check server fingerprints twice. + + + + + Removed GNUisms in the webserver and webactions targets so they + work with standard tar. + + + - - For a more detailed list of changes please have a look at the ChangeLog. - @@ -518,8 +996,8 @@ How to install the binary packages depends on your operating system: - standard.action now only includes the enabled actions. - Not all actions as before. + standard.action has been merged into + the default.action file. 
@@ -543,18 +1021,6 @@ How to install the binary packages depends on your operating system: - - - The filter-client-headers and - filter-server-headers actions that were introduced with - Privoxy 3.0.5 to apply content filters to - the headers have been removed and replaced with new actions. - See the What's New section above. - - - - The actions files are used to define what actions Privoxy takes for which URLs, and thus determines @@ -1759,77 +2231,71 @@ for details. There are three action files included with Privoxy with differing purposes: - - - - - - - default.action - is the primary action file - that sets the initial values for all actions. It is intended to - provide a base level of functionality for - Privoxy's array of features. So it is - a set of broad rules that should work reasonably well as-is for most users. - This is the file that the developers are keeping updated, and making available to users. - The user's preferences as set in standard.action, - e.g. either Cautious (the default), - Medium, or Advanced (see - below). - - - - - user.action - is intended to be for local site - preferences and exceptions. As an example, if your ISP or your bank - has specific requirements, and need special handling, this kind of - thing should go here. This file will not be upgraded. - + + + + + + match-all.action - is used to define which + actions relating to banner-blocking, images, pop-ups, + content modification, cookie handling etc should be applied by default. + It should be the first actions file loaded + - - - standard.action - is used only by the web based editor - at - http://config.privoxy.org/edit-actions-list?f=default, - to set various pre-defined sets of rules for the default actions section - in default.action. - - - Edit Set to Cautious Set to Medium Set to Advanced - - - These have increasing levels of aggressiveness and have no - influence on your browsing unless you select them explicitly in the - editor. 
A default installation should be pre-set to - Cautious (versions prior to 3.0.5 were set to - Medium). New users should try this for a while before - adjusting the settings to more aggressive levels. The more aggressive - the settings, then the more likelihood there is of problems such as sites - not working as they should. - - - The Edit button allows you to turn each - action on/off individually for fine-tuning. The Cautious - button changes the actions list to low/safe settings which will activate - ad blocking and a minimal set of &my-app;'s features, and subsequently - there will be less of a chance for accidental problems. The - Medium button sets the list to a medium level of - other features and a low level set of privacy features. The - Advanced button sets the list to a high level of - ad blocking and medium level of privacy. See the chart below. The latter - three buttons over-ride any changes via with the - Edit button. More fine-tuning can be done in the - lower sections of this internal page. - - - It is not recommend to edit the standard.action file - itself. - - - The default profiles, and their associated actions, as pre-defined in - standard.action are: - - + + + default.action - defines many exceptions (both + positive and negative) from the default set of actions that's configured + in match-all.action. It is a set of rules that should + work reasonably well as-is for most users. This file is only supposed to + be edited by the developers. It should be the second actions file loaded. + + + + + user.action - is intended to be for local site + preferences and exceptions. As an example, if your ISP or your bank + has specific requirements, and need special handling, this kind of + thing should go here. This file will not be upgraded. + + + + + Edit Set to Cautious Set to Medium Set to Advanced + + + These have increasing levels of aggressiveness and have no + influence on your browsing unless you select them explicitly in the + editor. 
A default installation should be pre-set to + Cautious. New users should try this for a while before + adjusting the settings to more aggressive levels. The more aggressive + the settings, then the more likelihood there is of problems such as sites + not working as they should. + + + The Edit button allows you to turn each + action on/off individually for fine-tuning. The Cautious + button changes the actions list to low/safe settings which will activate + ad blocking and a minimal set of &my-app;'s features, and subsequently + there will be less of a chance for accidental problems. The + Medium button sets the list to a medium level of + other features and a low level set of privacy features. The + Advanced button sets the list to a high level of + ad blocking and medium level of privacy. See the chart below. The latter + three buttons over-ride any changes via with the + Edit button. More fine-tuning can be done in the + lower sections of this internal page. + + + While the actions file editor allows to enable these settings in all + actions files, they are only supposed to be enabled in the first one + to make sure you don't unintentionally overrule earlier rules. + + + The default profiles, and their associated actions, as pre-defined in + default.action are: + + Default Configurations @@ -1902,7 +2368,6 @@ for details. yes - GIF de-animation no @@ -1910,7 +2375,6 @@ for details. yes - Fast redirects no @@ -1951,9 +2415,9 @@ for details.
-
-
-
+ + + The list of actions files to be used are defined in the main configuration @@ -2111,12 +2575,12 @@ for details. Generally, an URL pattern has the form - <domain>/<path>, where both the - <domain> and <path> are - optional. (This is why the special / pattern matches all - URLs). Note that the protocol portion of the URL pattern (e.g. - http://) should not be included in - the pattern. This is assumed already! + <domain><port>/<path>, where the + <domain>, the <port> + and the <path> are optional. (This is why the special + / pattern matches all URLs). Note that the protocol + portion of the URL pattern (e.g. http://) should + not be included in the pattern. This is assumed already! The pattern matching syntax is different for the domain and path parts of @@ -2125,6 +2589,12 @@ for details. Regular Expressions (POSIX 1003.2). + + The port part of a pattern is a decimal port number preceded by a colon + (:). If the domain part contains a numerical IPv6 address, + it has to be put into angle brackets + (<, >). + @@ -2174,6 +2644,32 @@ for details. + + / + + + Matches any URL because there's no requirement for either the + domain or the path to match anything. + + + + + :8000/ + + + Matches any URL pointing to TCP port 8000. + + + + + <2001:db8::1>/ + + + Matches any URL with the host address 2001:db8::1. + (Note that the real URL uses plain brackets, not angle brackets.) + + + index.html @@ -2639,6 +3135,9 @@ for details. HTTP headers are, you definitely don't need to worry about this one. + + Headers added by this action are not modified by other actions. + @@ -3826,9 +4325,10 @@ problem-host.example.com Filtering requires buffering the page content, which may appear to slow down page rendering since nothing is displayed until all content has - passed the filters. (It does not really take longer, but seems that way - since the page is not incrementally displayed.) This effect will be more - noticeable on slower connections. + passed the filters. 
(The total time until the page is completely rendered + doesn't change much, but it may be perceived as slower since the page is + not incrementally displayed.) + This effect will be more noticeable on slower connections. Rolling your own @@ -5135,7 +5635,7 @@ new action reset-to-request-time overwrites the value of the Last-Modified: header with the current time. You could use this option together with - hided-if-modified-since + hide-if-modified-since to further customize your random range. @@ -5793,24 +6293,71 @@ hal stop here linkend="actions">specified and applied to URLs, how patterns work, and how to define and use aliases. Now, let's look at an - example default.action and user.action - file and see how all these pieces come together: + example match-all.action, default.action + and user.action file and see how all these pieces come together: + + + +match-all.action + + Remember all actions are disabled when matching starts, + so we have to explicitly enable the ones we want. + + + + While the match-all.action file only contains a + single section, it is probably the most important one. It has only one + pattern, /, but this pattern + matches all URLs. Therefore, the set of + actions used in this default section will + be applied to all requests as a start. It can be partly or + wholly overridden by other actions files like default.action + and user.action, but it will still be largely responsible + for your overall browsing experience. + + + + Again, at the start of matching, all actions are disabled, so there is + no need to disable any actions here. (Remember: a + + preceding the action name enables the action, a - disables!). + Also note how this long line has been made more readable by splitting it into + multiple lines with line continuation. + + + + +{ \ + +change-x-forwarded-for{block} \ + +hide-from-header{block} \ + +set-image-blocker{pattern} \ +} +/ # Match all URLs + + + + + The default behavior is now set. 
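As a rough illustration of how a later actions file can override part of this default set, here is a minimal sketch of a section that could go into user.action. The host names are placeholders chosen for illustration only (they are assumptions, not taken from the shipped files), and the sketch only switches off actions that the match-all.action example above enables:

```
# Hypothetical user.action section; the host names below are placeholders.
# Switch off two of the defaults from match-all.action for these sites only.
{ \
 -change-x-forwarded-for \
 -hide-from-header \
}
.example.com
problem-host.example.org
```

Assuming user.action is loaded after match-all.action, requests to the listed hosts would lose these two actions, while all other URLs keep the defaults set by the / pattern.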
+ -default.action + +default.action -Every config file should start with a short comment stating its purpose: + If you aren't a developer, there's no need for you to edit the + default.action file. It is maintained by + the &my-app; developers and if you disagree with some of the + sections, you should overrule them in your user.action. - # Sample default.action file <ijbswa-developers@lists.sourceforge.net> + Understanding the default.action file can + help you with your user.action, though. -Then, since this is the default.action file, the -first section is a special section for internal use that you needn't -change or worry about: + The first section in this file is a special section for internal use + that prevents older &my-app; versions from reading the file: @@ -5818,15 +6365,14 @@ change or worry about: ########################################################################## # Settings -- Don't change! For internal Privoxy use ONLY. ########################################################################## - {{settings}} -for-privoxy-version=3.0 +for-privoxy-version=3.0.11 -After that comes the (optional) alias section. We'll use the example -section from the above chapter on aliases, -that also explains why and how aliases are used: + After that comes the (optional) alias section. We'll use the example + section from the above chapter on aliases, + that also explains why and how aliases are used: @@ -5851,68 +6397,6 @@ that also explains why and how aliases are used: shop = -crunch-all-cookies -filter{all-popups} - - Now come the regular sections, i.e. sets of actions, accompanied - by URL patterns to which they apply. Remember all actions - are disabled when matching starts, so we have to explicitly - enable the ones we want. - - - - The first regular section is probably the most important. It has only - one pattern, /, but this pattern - matches all URLs. 
Therefore, the - set of actions used in this default section will - be applied to all requests as a start. It can be partly or - wholly overridden by later matches further down this file, or in user.action, - but it will still be largely responsible for your overall browsing - experience. - - - - Again, at the start of matching, all actions are disabled, so there is - no need to disable any actions here. (Remember: a + - preceding the action name enables the action, a - disables!). - Also note how this long line has been made more readable by splitting it into - multiple lines with line continuation. - - - - -########################################################################## -# "Defaults" section: -########################################################################## - { \ - +change-x-forwarded-for{block} \ - +deanimate-gifs \ - +filter{html-annoyances} \ - +filter{refresh-tags} \ - +filter{webbugs} \ - +filter{ie-exploits} \ - +hide-from-header{block} \ - +hide-referrer{forge} \ - +prevent-compression \ - +session-cookies-only \ - +set-image-blocker{pattern} \ - } - / # forward slash will match *all* potential URL patterns. - - - - The default behavior is now set. - - - The first of our specialized sections is concerned with fragile sites, i.e. sites that require minimum interference, because they are either @@ -5953,36 +6437,10 @@ mail.google.com .scan.co.uk - - The fast-redirects - action, which we enabled per default above, breaks some sites. So disable - it for popular sites where we know it misbehaves: + action, which may have been enabled in match-all.action, + breaks some sites. So disable it for popular sites where we know it misbehaves: @@ -6002,8 +6460,8 @@ edit.*.yahoo.com be blocked, a substitute image can be sent, rather than an HTML page. Contacting the remote site to find out is not an option, since it would destroy the loading time advantage of banner blocking, and it - would feed the advertisers (in terms of money and - information). 
We can mark any URL as an image with the handle-as-image action, and marking all URLs that end in a known image file extension is a good start: @@ -8459,6 +8917,134 @@ In file: user.action [ View ] [ Edit ]