X-Git-Url: http://www.privoxy.org/gitweb/?a=blobdiff_plain;f=doc%2Fwebserver%2Fuser-manual%2Factions-file.html;h=17f6e835725614bf6c13482f25ea90e2eec84800;hb=1c4bd7276a5f733e283c0484803bfca670f76654;hp=3c39d976d4e007b6b12fe32cf627193458eee1be;hpb=23be303a582b85ccac7592d0135f0beb9cf170f6;p=privoxy.git diff --git a/doc/webserver/user-manual/actions-file.html b/doc/webserver/user-manual/actions-file.html index 3c39d976..17f6e835 100644 --- a/doc/webserver/user-manual/actions-file.html +++ b/doc/webserver/user-manual/actions-file.html @@ -1,3637 +1,4648 @@ - - + -
-The actions files are used to define what actions Privoxy takes for which URLs, and thus determine how ad images, cookies and various other aspects of HTTP content and transactions are handled, and on which sites (or even parts thereof). There are a number of such actions, with a wide range of functionality. Each action does something a little different. These actions give us a veritable arsenal of tools with which to exert our control, preferences and independence. Actions can be combined so that their effects are aggregated when applied against a given set of URLs.
- -There are three action files included with Privoxy with differing purposes:
-match-all.action - is used to define which "actions" relating to banner-blocking, images, pop-ups, content modification, cookie handling etc. should be applied by default. It should be the first actions file loaded.
-default.action - defines many exceptions - (both positive and negative) from the default set of actions that's - configured in match-all.action. It is a set - of rules that should work reasonably well as-is for most users. This - file is only supposed to be edited by the developers. It should be - the second actions file loaded.
-user.action - is intended to be for local site preferences and exceptions. As an example, if your ISP or your bank has specific requirements and needs special handling, this kind of thing should go here. This file will not be upgraded.
-Edit Set to - Cautious Set to Medium - Set to Advanced
- -These have increasing levels of aggressiveness and have no influence on your browsing - unless you select them explicitly in the editor. A default - installation should be pre-set to Cautious. - New users should try this for a while before adjusting the settings - to more aggressive levels. The more aggressive the settings, then the - more likelihood there is of problems such as sites not working as - they should.
-The Edit button allows you to turn each action on/off individually for fine-tuning. The Cautious button changes the actions list to low/safe settings which will activate ad blocking and a minimal set of Privoxy's features, and subsequently there will be less of a chance for accidental problems. The Medium button sets the list to a medium level of other features and a low level set of privacy features. The Advanced button sets the list to a high level of ad blocking and medium level of privacy. See the chart below. The latter three buttons override any changes made with the Edit button. More fine-tuning can be done in the lower sections of this internal page.
-While the actions file editor allows you to enable these settings in all actions files, they are only supposed to be enabled in the first one to make sure you don't unintentionally overrule earlier rules.
- -The default profiles, and their associated actions, as pre-defined - in default.action are:
- -Table 1. Default Configurations
Feature | Cautious | Medium | Advanced
--|---|---|---
Ad-blocking Aggressiveness | medium | high | high
Ad-filtering by size | no | yes | yes
Ad-filtering by link | no | no | yes
Pop-up killing | blocks only | blocks only | blocks only
Privacy Features | low | medium | medium/high
Cookie handling | none | session-only | kill
Referer forging | no | yes | yes
GIF de-animation | no | yes | yes
Fast redirects | no | no | yes
HTML taming | no | no | yes
JavaScript taming | no | no | yes
Web-bug killing | no | yes | yes
Image tag reordering | no | yes | yes
-The list of actions files to be used is defined in the main configuration file, and the files are processed in the order they are defined (e.g. default.action is typically processed before user.action). The content of these can all be viewed and edited from http://config.privoxy.org/show-status. The overriding principle when applying actions is that the last action that matches a given URL wins. The broadest, most general rules go first (defined in default.action), followed by any exceptions (typically also in default.action), which are then followed lastly by any local preferences (typically in user.action). Generally, user.action has the last word.
- -An actions file typically has multiple sections. If you want to use - "aliases" in an actions file, you have to - place the (optional) alias - section at the top of that file. Then comes the default set of rules - which will apply universally to all sites and pages (be very careful with using such a - universal set in user.action or any other - actions file after default.action, because it - will override the result from consulting any previous file). And then - below that, exceptions to the defined universal policies. You can regard - user.action as an appendix to default.action, with the advantage that it is a separate - file, which makes preserving your personal settings across Privoxy upgrades easier.
- -Actions can be used to block anything you want, including ads, - banners, or just some obnoxious URL whose content you would rather not - see. Cookies can be accepted or rejected, or accepted only during the - current browser session (i.e. not written to disk), content can be - modified, some JavaScripts tamed, user-tracking fooled, and much more. - See below for a complete list of - actions.
- -Note that some actions, like - cookie suppression or script disabling, may render some sites unusable - that rely on these techniques to work properly. Finding the right mix - of actions is not always easy and certainly a matter of personal taste. - And, things can always change, requiring refinements in the - configuration. In general, it can be said that the more "aggressive" your default settings (in the top section - of the actions file) are, the more exceptions for "trusted" sites you will have to make later. If, for - example, you want to crunch all cookies per default, you'll have to - make exceptions from that rule for sites that you regularly use and - that require cookies for actually useful purposes, like maybe your - bank, favorite shop, or newspaper.
- -We have tried to provide you with reasonable rules to start from in - the distribution actions files. But there is no general rule of thumb - on these things. There just are too many variables, and sites are - constantly changing. Sooner or later you will want to change the rules - (and read this chapter again :).
-The easiest way to edit the actions files is with a browser by using - our browser-based editor, which can be reached from http://config.privoxy.org/show-status. Note: the config file - option enable-edit-actions must be - enabled for this to work. The editor allows both fine-grained control - over every single feature on a per-URL basis, and easy choosing from - wholesale sets of defaults like "Cautious", - "Medium" or "Advanced". Warning: the "Advanced" setting is more aggressive, and will be more - likely to cause problems for some sites. Experienced users only!
-If you prefer plain text editing to GUIs, you can of course also directly edit the actions files with your favorite text editor. Look at default.action, which is richly commented with many good examples.
-Actions files are divided into sections. There are special sections, like the "alias" sections which will be discussed later. For now let's concentrate on regular sections: they have a heading line (often split up into multiple lines for readability) which consists of a list of actions, separated by whitespace and enclosed in curly braces. Below that, there is a list of URL and tag patterns, each on a separate line.
- -To determine which actions apply to a request, the URL of the - request is compared to all URL patterns in each "action file". Every time it matches, the list of - applicable actions for the request is incrementally updated, using the - heading of the section in which the pattern is located. The same is - done again for tags and tag patterns later on.
- -If multiple applying sections set the same action differently, the - last match wins. If not, the effects are aggregated. E.g. a URL might - match a regular section with a heading line of { - +handle-as-image - }, then later another one with just { - +block }, resulting in - both actions to - apply. And there may well be cases where you will want to combine - actions together. Such a section then might look like:
-  { +handle-as-image +block{Banner ads.} }
-  # Block these as if they were images. Send no block page.
-   banners.example.com
-   media.example.com/.*banners
-   .example.com/images/ads/
+Privoxy 3.0.26 User Manual
You can trace this process for URL patterns and any given URL by - visiting http://config.privoxy.org/show-url-info.
-Examples and more detail on this are provided in the Appendix, Troubleshooting: Anatomy of an Action section.
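As a minimal sketch of the matching rules described above (the host name and block reason are made up for illustration): a request for a GIF on ads.example.com would match the first two sections, so both +block and +handle-as-image end up applying, while a request below /legal/ also matches the third section, whose -block is the last match for that action and therefore wins.

```
# Hypothetical sections; the host and the block reason are assumptions.
{ +block{Example ad host.} }
ads.example.com

# Aggregated with the section above for URLs that match both.
{ +handle-as-image }
ads.example.com/.*\.gif$

# Matches later, so -block overrides the earlier +block here.
{ -block }
ads.example.com/legal/
```

Whether a given URL really ends up with the settings you expect can be checked on the show-url-info page mentioned above.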
-As mentioned, Privoxy uses - "patterns" to determine what actions might apply to which - sites and pages your browser attempts to access. These "patterns" use wild card type pattern matching to achieve a - high degree of flexibility. This allows one expression to be expanded - and potentially match against many similar patterns.
-Generally, a URL pattern has the form <domain><port>/<path>, where the <domain>, the <port> and the <path> are optional. (This is why the special / pattern matches all URLs.) Note that the protocol portion of the URL (e.g. http://) should not be included in the pattern. This is assumed already!
- -The pattern matching syntax is different for the domain and path - parts of the URL. The domain part uses a simple globbing type matching - technique, while the path part uses more flexible "Regular Expressions" (POSIX - 1003.2).
- -The port part of a pattern is a decimal port number preceded by a - colon (:). If the domain part contains a - numerical IPv6 address, it has to be put into angle brackets - (<, >).
- -is a domain-only pattern and will match any request to - www.example.com, regardless of which - document on that server is requested. So ALL pages in this domain - would be covered by the scope of this action. Note that a simple - example.com is different and would NOT - match.
-means exactly the same. For domain-only patterns, the trailing - / may be omitted.
-matches all the documents on www.example.com whose name starts with /index.html.
-matches only the single document /index.html on www.example.com.
-matches the document /index.html, - regardless of the domain, i.e. on any web server - anywhere.
-Matches any URL because there's no requirement for either the - domain or the path to match anything.
-Matches any URL pointing to TCP port 8000.
-Matches any URL with the host address 2001:db8::1. (Note that the real URL uses plain - brackets, not angle brackets.)
-matches nothing, since it would be interpreted as a domain name and there is no top-level domain called .html. So it's a mistake.
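The pattern forms described above might look like this inside a section. This is only a sketch: the pattern spellings are inferred from the descriptions, and the +block action and its reason are there just to give the patterns a section to live in.

```
# Illustrative only; patterns are inferred from the descriptions above.
{ +block{Pattern syntax demonstration.} }
# Domain-only pattern; the trailing slash may be omitted.
www.example.com/
# Every document whose path starts with /index.html
www.example.com/index.html
# Only the single document /index.html
www.example.com/index.html$
# /index.html on any host
/index.html$
# Any URL on TCP port 8000
:8000/
# IPv6 host address, written in angle brackets
<2001:db8::1>/
```

The bare / pattern is deliberately left out of the sketch, since combined with +block it would match every URL.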
-The matching of the domain part offers some flexible options: if - the domain starts or ends with a dot, it becomes unanchored at that - end. For example:
- -matches any domain with first-level domain com and second-level domain example. For example www.example.com, example.com and foo.bar.baz.example.com. Note that it wouldn't - match if the second-level domain was another-example.
-matches any domain that STARTS with www. - (It also matches the domain www but - most of the time that doesn't matter.)
-matches any domain that CONTAINS .example.. And, by the way, also included would - be any files or documents that exist within that domain since - no path limitations are specified. (Correctly speaking: It - matches any FQDN that contains example - as a domain.) This might be www.example.com, news.example.de, or www.example.net/cgi/testing.pl for instance. All - these cases are matched.
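A short sketch of the three anchoring cases, assuming the pattern spellings that the descriptions above imply. The action and its reason are placeholders, and such broad patterns are rarely what you want together with +block in practice.

```
# Illustrative only; the action and reason are placeholders.
{ +block{Domain anchoring demonstration.} }
# Leading dot: unanchored on the left, so example.com and www.example.com match.
.example.com
# Trailing dot: unanchored on the right, so any domain starting with www. matches.
www.
# Dots on both ends: any domain containing .example. matches.
.example.
```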
-Additionally, there are wild-cards that you can use in the domain - names themselves. These work similarly to shell globbing type - wild-cards: "*" represents zero or more - arbitrary characters (this is equivalent to the "Regular Expression" based - syntax of ".*"), "?" represents any single character (this is - equivalent to the regular expression syntax of a simple "."), and you can define "character classes" in square brackets which is - similar to the same regular expression technique. All of this can be - freely mixed:
- -matches "adserver.example.com", - "ads.example.com", etc but not - "sfads.example.com"
-matches all of the above, and then some.
-matches www.ipix.com, pictures.epix.com, a.b.c.d.e.upix.com etc.
-matches www1.example.com, - www4.example.cc, wwwd.example.cy, wwwz.example.com etc., but not wwww.example.com.
-While flexible, this is not the sophistication of full regular - expression based syntax.
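The wildcards could be combined as in the following sketch. The first two patterns are inferred from the matches listed above; the character class in the third one is a hypothetical choice that merely behaves like the description (www1 and wwwd match, wwww does not).

```
# Illustrative only; the action and reason are placeholders.
{ +block{Domain wildcard demonstration.} }
# * is zero or more characters: adserver.example.com, ads.example.com
ad*.example.com
# ? is exactly one character: www.ipix.com, pictures.epix.com
.?pix.com
# Character class (hypothetical): www1.example.com, wwwd.example.com
www[1-9a-d].example.com
```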
-Privoxy uses "modern" POSIX 1003.2 "Regular Expressions" for - matching the path portion (after the slash), and is thus more - flexible.
-There is an Appendix with a brief quick-start into regular expressions. You might also want to have a look at your operating system's documentation on regular expressions (try man re_format).
- -Note that the path pattern is automatically left-anchored at the - "/", i.e. it matches as if it would start - with a "^" (regular expression speak for - the beginning of a line).
- -Please also note that matching in the path is CASE INSENSITIVE by - default, but you can switch to case sensitive at any point in the - pattern by using the "(?-i)" switch: - www.example.com/(?-i)PaTtErN.* will match - only documents whose path starts with PaTtErN in exactly this capitalization.
-Is equivalent to just ".example.com", since any documents within that domain are matched with or without the ".*" regular expression. This is redundant.
-Will match any page in the domain of "example.com" that is named "index.html", and that is part of some path. For - example, it matches "www.example.com/testing/index.html" but NOT - "www.example.com/index.html" because - the regular expression called for at least two "/'s", thus the path requirement. It also would - match "www.example.com/testing/index_html", because of - the special meta-character ".".
-This regular expression is conditional so it will match any - page named "index.html" regardless - of path which in this case can have one or more "/'s". And this one must contain exactly - ".html" (but does not have to end - with that!).
-This regular expression will match any path of "example.com" that contains any of the words - "ads", "banner", "banners" - (because of the "?") or "junk". The path does not have to end in these - words, just contain them.
-This is very much the same as above, except now it must end - in either ".jpg", ".jpeg", ".gif" or - ".png". So this one is limited to - common image formats.
-There are many, many good examples to be found in default.action, and more tutorials below in Appendix on regular expressions.
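To make the anchoring and case rules for paths concrete, here is a small sketch. The hosts, paths and block reason are made up; only the last pattern is taken directly from the text above.

```
# Illustrative only; the action and reason are placeholders.
{ +block{Path regex demonstration.} }
# The path is left-anchored at "/": /ads/top.gif matches, /img/ads/top.gif does not.
www.example.com/ads/
# Optional directory prefix, escaped dot, anchored at the end.
.example.com/(.*/)?index\.html$
# Case sensitive from the (?-i) switch onward.
www.example.com/(?-i)PaTtErN.*
```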
-Tag patterns are used to change the applying actions based on the - request's tags. Tags can be created with either the client-header-tagger or - the server-header-tagger - action.
- -Tag patterns have to start with "TAG:", - so Privoxy can tell them apart from - URL patterns. Everything after the colon including white space, is - interpreted as a regular expression with path pattern syntax, except - that tag patterns aren't left-anchored automatically (Privoxy doesn't silently add a "^", you have to do it yourself if you need it).
-To match all requests that are tagged with "foo", your pattern line should be "TAG:^foo$". "TAG:foo" would work as well, but it would also match requests whose tags contain "foo" somewhere. "TAG: foo" wouldn't work as it requires white space.
- -Sections can contain URL and tag patterns at the same time, but - tag patterns are checked after the URL patterns and thus always - overrule them, even if they are located before the URL patterns.
-Once a new tag is added, Privoxy checks right away if it's matched by one of the tag patterns and updates the action settings accordingly. As a result tags can be used to activate other tagger actions, as long as these other taggers look for headers that haven't already been parsed.
- -For example you could tag client requests which use the POST method, then use this tag to activate another - tagger that adds a tag if cookies are sent, and then use a block - action based on the cookie tag. This allows the outcome of one - action, to be input into a subsequent action. However if you'd - reverse the position of the described taggers, and activated the - method tagger based on the cookie tagger, no method tags would be - created. The method tagger would look for the request line, but at - the time the cookie tag is created, the request line has already been - parsed.
- -While this is a limitation you should be aware of, this kind of - indirection is seldom needed anyway and even the example doesn't make - too much sense.
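A minimal sketch of a tagger section followed by a section that reacts to the resulting tag. It assumes the user-agent tagger used in the client-header-tagger example of this chapter; the client name in the tag pattern is hypothetical.

```
# Create a tag from the User-Agent header of every request.
{ +client-header-tagger{user-agent} }
/

# Hypothetical client name. The ^ is written out because tag patterns
# are not left-anchored automatically.
{ -filter -deanimate-gifs }
TAG:^User-Agent: ExampleDownloader/
```

Because tag patterns are checked after URL patterns, the second section applies to tagged requests no matter which URL they are for.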
-All actions are disabled by default, until they are explicitly - enabled somewhere in an actions file. Actions are turned on if preceded - with a "+", and turned off if preceded with - a "-". So a +action - means "do that action", e.g. +block means "please block URLs that - match the following patterns", and -block means "don't block URLs that - match the following patterns, even if +block - previously applied."
- -Again, actions are invoked by placing them on a line, enclosed in - curly braces and separated by whitespace, like in {+some-action -some-other-action{some-parameter}}, - followed by a list of URL patterns, one per line, to which they apply. - Together, the actions line and the following pattern lines make up a - section of the actions file.
- -Actions fall into three categories:
+The actions files are used to define what actions Privoxy takes for which URLs, and thus determine how ad images, cookies and various other aspects of HTTP content and transactions are handled, and on which sites (or even parts thereof). There are a number of such actions, with a wide range of functionality. Each action does something a little different. These actions give us a veritable arsenal of tools with which to exert our control, preferences and independence. Actions can be combined so that their effects are aggregated when applied against a given set of URLs.
++ There are three action files included with Privoxy with differing purposes: +
++
Boolean, i.e. the action can only be "enabled" or "disabled". Syntax:
-  +name        # enable action name
-  -name        # disable action name
Example: +handle-as-image
+match-all.action - is used to define which "actions" relating to banner-blocking, images, pop-ups, content modification, cookie handling etc. should be applied by default. It should be the first actions file loaded.
Parameterized, where some value is required in order to enable - this type of action. Syntax:
-  +name{param}  # enable action and set parameter to param,
-                # overwriting parameter from previous match if necessary
-  -name         # disable action. The parameter can be omitted
Note that if the URL matches multiple positive forms of a - parameterized action, the last match wins, i.e. the params from - earlier matches are simply ignored.
- -Example: +hide-user-agent{Mozilla/5.0 (X11; - U; FreeBSD i386; en-US; rv:1.8.1.4) Gecko/20070602 - Firefox/2.0.0.4}
++ default.action - defines many + exceptions (both positive and negative) from the default set of + actions that's configured in match-all.action. It is a set of rules that + should work reasonably well as-is for most users. This file is + only supposed to be edited by the developers. It should be the + second actions file loaded. +
Multi-value. These look exactly like parameterized actions, but - they behave differently: If the action applies multiple times to - the same URL, but with different parameters, all the parameters from - all matches - are remembered. This is used for actions that can be executed for - the same request repeatedly, like adding multiple headers, or - filtering through multiple filters. Syntax:
-  +name{param}  # enable action and add param to the list of parameters
-  -name{param}  # remove the parameter param from the list of parameters
-                # If it was the last one left, disable the action.
-  -name         # disable this action completely and remove all parameters from the list
Examples: +add-header{X-Fun-Header: Some - text} and +filter{html-annoyances}
+user.action - is intended to be for local site preferences and exceptions. As an example, if your ISP or your bank has specific requirements and needs special handling, this kind of thing should go here. This file will not be upgraded.
If nothing is specified in any actions file, no "actions" are taken. So in this case Privoxy would just be a normal, non-blocking, - non-filtering proxy. You must specifically enable the privacy and - blocking features you need (although the provided default actions files - will give a good starting point).
-Later defined action sections always override earlier ones of the same type. So exceptions to any rules you make should come in the latter part of the file (or in a file that is processed later when using multiple actions files such as user.action). For multi-valued actions, the actions are applied in the order they are specified. Actions files are processed in the order they are defined in config (the default installation has three actions files). It is also quite possible for any given URL to match more than one "pattern" (because of wildcards and regular expressions), and thus to trigger more than one set of actions! Last match wins.
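The difference between parameterized and multi-value actions can be sketched as follows. The header names, the User-Agent strings and the hosts are invented for illustration; the actions themselves are the ones used as examples above.

```
# Parameterized: for www.example.com the second parameter replaces the first.
{ +hide-user-agent{Hypothetical-Agent/1.0} }
.example.com
{ +hide-user-agent{Hypothetical-Agent/2.0} }
www.example.com

# Multi-value: both headers are added for www.example.com ...
{ +add-header{X-Example-One: first} }
.example.com
{ +add-header{X-Example-Two: second} }
www.example.com

# ... until one parameter is removed again for a narrower pattern.
{ -add-header{X-Example-One: first} }
www.example.com/plain/
```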
- -The list of valid Privoxy actions - are:
- -Confuse log analysis, custom applications
-Sends a user defined HTTP header to the web server.
-Multi-value.
-Any string value is possible. Validity of the defined HTTP - headers is not checked. It is recommended that you use the - "X-" prefix - for custom headers.
-This action may be specified multiple times, in order to - define multiple headers. This is rarely needed for the typical - user. If you don't know what "HTTP - headers" are, you definitely don't need to worry about - this one.
- -Headers added by this action are not modified by other - actions.
-+add-header{X-User-Tracking: sucks}
Block ads or other unwanted content
-Requests for URLs to which this action applies are blocked, - i.e. the requests are trapped by Privoxy and the requested URL is never - retrieved, but is answered locally with a substitute page or - image, as determined by the handle-as-image, - set-image-blocker, - and handle-as-empty-document - actions.
-Parameterized.
-A block reason that should be given to the user.
-Privoxy sends a special - "BLOCKED" page for requests to - blocked pages. This page contains the block reason given as - parameter, a link to find out why the block action applies, and - a click-through to the blocked content (the latter only if the - force feature is available and enabled).
- -A very important exception occurs if both block and handle-as-image, - apply to the same request: it will then be replaced by an - image. If set-image-blocker - (see below) also applies, the type of image will be determined - by its parameter, if not, the standard checkerboard pattern is - sent.
- -It is important to understand this process, in order to - understand how Privoxy deals - with ads and other unwanted content. Blocking is a core - feature, and one upon which various other features depend.
- -The filter action can perform a - very similar task, by "blocking" - banner images and other content through rewriting the relevant - URLs in the document's HTML source, so they don't get requested - in the first place. Note that this is a totally different - technique, and it's easy to confuse the two.
-{+block{No nasty stuff for you.}}
-# Block and replace with "blocked" page
- .nasty-stuff.example.com
-
-{+block{Doubleclick banners.} +handle-as-image}
-# Block and replace with image
- .ad.doubleclick.net
- .ads.r.us/banners/
-
-{+block{Layered ads.} +handle-as-empty-document}
-# Block and then ignore
- adserver.example.net/.*\.js$
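When a blocked request is also handled as an image, the replacement image can be chosen with set-image-blocker, as noted above. The following sketch assumes that blank is an accepted parameter value (see the section on set-image-blocker itself); the host and block reason are hypothetical.

```
# Assumed parameter value: blank, instead of the default checkerboard pattern.
{ +block{Hypothetical banner host.} +handle-as-image +set-image-blocker{blank} }
banners.example.net
```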
Improve privacy by not forwarding the source of the request - in the HTTP headers.
-Deletes the "X-Forwarded-For:" - HTTP header from the client request, or adds a new one.
-Parameterized.
-"block" to delete the - header.
-"add" to create the header - (or append the client's IP address to an already existing - one).
-It is safe and recommended to use block.
- -Forwarding the source address of the request may make sense - in some multi-user setups but is also a privacy risk.
-+change-x-forwarded-for{block}
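For the multi-user case mentioned above, the add parameter could be limited to the hosts that actually need the client address. A sketch with a hypothetical internal host:

```
# Remove the header everywhere by default ...
{ +change-x-forwarded-for{block} }
/

# ... but forward the client address to a hypothetical internal host.
{ +change-x-forwarded-for{add} }
intranet.example.org
```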
Rewrite or remove single client headers.
-All client headers to which this action applies are filtered - on-the-fly through the specified regular expression based - substitutions.
-Parameterized.
-The name of a client-header filter, as defined in one of the - filter files.
-Client-header filters are applied to each header on its own, - not to all at once. This makes it easier to diagnose problems, - but on the downside you can't write filters that only change - header x if header y's value is z. You can do that by using - tags though.
- -Client-header filters are executed after the other header - actions have finished and use their output as input.
- -If the request URL gets changed, Privoxy will detect that and use the new - one. This can be used to rewrite the request destination behind - the client's back, for example to specify a Tor exit relay for - certain requests.
- -Please refer to the filter file - chapter to learn which client-header filters are available - by default, and how to create your own.
-# Hide Tor exit notation in Host and Referer Headers
-{+client-header-filter{hide-tor-exit-notation}}
-/
Block requests based on their headers.
-Client headers to which this action applies are filtered - on-the-fly through the specified regular expression based - substitutions, the result is used as tag.
-Parameterized.
-The name of a client-header tagger, as defined in one of the - filter files.
-Client-header taggers are applied to each header on its own, - and as the header isn't modified, each tagger "sees" the original.
- -Client-header taggers are the first actions that are - executed and their tags can be used to control every other - action.
-# Tag every request with the User-Agent header
-{+client-header-tagger{user-agent}}
-/
-
-# Tagging itself doesn't change the action
-# settings, sections with TAG patterns do:
-#
-# If it's a download agent, use a different forwarding proxy,
-# show the real User-Agent and make sure resume works.
-{+forward-override{forward-socks5 10.0.0.2:2222 .} \
- -hide-if-modified-since \
- -overwrite-last-modified \
- -hide-user-agent \
- -filter \
- -deanimate-gifs \
-}
-TAG:^User-Agent: NetBSD-ftp/
-TAG:^User-Agent: Novell ZYPP Installer
-TAG:^User-Agent: RPM APT-HTTP/
-TAG:^User-Agent: fetch libfetch/
-TAG:^User-Agent: Ubuntu APT-HTTP/
-TAG:^User-Agent: MPlayer/
-
-# Tag all requests with the Range header set
-{+client-header-tagger{range-requests}}
-/
-
-# Disable filtering for the tagged requests.
-#
-# With filtering enabled Privoxy would remove the Range headers
-# to be able to filter the whole response. The downside is that
-# it prevents clients from resuming downloads or skipping over
-# parts of multimedia files.
-{-filter -deanimate-gifs}
-TAG:^RANGE-REQUEST$
Stop useless download menus from popping up, or change the - browser's rendering mode
-Replaces the "Content-Type:" HTTP - server header.
-Parameterized.
-Any string.
-The "Content-Type:" HTTP server - header is used by the browser to decide what to do with the - document. The value of this header can cause the browser to - open a download menu instead of displaying the document by - itself, even if the document's format is supported by the - browser.
-The declared content type can also affect which rendering mode the browser chooses. If XHTML is delivered as "text/html", many browsers treat it as yet another broken HTML document. If it is sent as "application/xml", browsers with XHTML support will only display it if the syntax is correct.
- -If you see a web site that proudly uses XHTML buttons, but - sets "Content-Type: text/html", you - can use Privoxy to overwrite - it with "application/xml" and - validate the web master's claim inside your XHTML-supporting - browser. If the syntax is incorrect, the browser will complain - loudly.
-You can also go the opposite direction: if your browser prints error messages instead of rendering a document falsely declared as XHTML, you can overwrite the content type with "text/html" and have it rendered as a broken HTML document.
- -By default content-type-overwrite - only replaces "Content-Type:" - headers that look like some kind of text. If you want to - overwrite it unconditionally, you have to combine it with - force-text-mode. - This limitation exists for a reason, think twice before - circumventing it.
- -Most of the time it's easier to replace this action with a - custom server-header - filter. It allows you to activate it for every - document of a certain site and it will still only replace the - content types you aimed at.
- -Of course you can apply content-type-overwrite to a whole site and then - make URL based exceptions, but it's a lot more work to get the - same precision.
-# Check if www.example.net/ really uses valid XHTML
-{ +content-type-overwrite{application/xml} }
-www.example.net/
-
-# but leave the content type unmodified if the URL looks like a style sheet
-{-content-type-overwrite}
-www.example.net/.*\.css$
-www.example.net/.*style
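The opposite direction described above, and the combination with force-text-mode, might look like this. The site name and path are hypothetical, and the second section is only meant for the rare case where the original content type does not look like text at all.

```
# Hypothetical site that declares broken markup as XHTML.
{ +content-type-overwrite{text/html} }
broken-xhtml.example.org/

# Overwrite unconditionally; think twice before using this.
{ +content-type-overwrite{text/html} +force-text-mode }
broken-xhtml.example.org/mislabeled/
```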
Remove a client header Privoxy has no dedicated action for.
-Deletes every header sent by the client that contains the - string the user supplied as parameter.
-Parameterized.
-Any string.
-This action allows you to block client headers for which no - dedicated Privoxy action - exists. Privoxy will remove - every client header that contains the string you supplied as - parameter.
- -Regular expressions are not supported and you can't use this - action to block different headers in the same request, unless - they contain the same string.
- -crunch-client-header is only meant - for quick tests. If you have to block several different - headers, or only want to modify parts of them, you should use a - client-header - filter.
-Warning: Don't block any header without understanding the consequences.
-
-# Block the non-existent "Privacy-Violation:" client header
-{ +crunch-client-header{Privacy-Violation:} }
-/
Prevent yet another way to track the user's steps between - sessions.
-Deletes the "If-None-Match:" HTTP - client header.
-Boolean.
-N/A
-Removing the "If-None-Match:" - HTTP client header is useful for filter testing, where you want - to force a real reload instead of getting status code - "304" which would cause the browser - to use a cached copy of the page.
- -It is also useful to make sure the header isn't used as a - cookie replacement (unlikely but possible).
- -Blocking the "If-None-Match:" - header shouldn't cause any caching problems, as long as the - "If-Modified-Since:" header isn't - blocked or missing as well.
- -It is recommended to use this action together with - hide-if-modified-since - and overwrite-last-modified.
-
- -# Let the browser revalidate cached documents but don't -# allow the server to use the revalidation headers for user tracking. -{+hide-if-modified-since{-60} \ - +overwrite-last-modified{randomize} \ - +crunch-if-none-match} -/ -- |
-
Prevent the web server from setting HTTP cookies on your - system
-Deletes any "Set-Cookie:" HTTP - headers from server replies.
-Boolean.
-N/A
-This action is only concerned with incoming HTTP - cookies. For outgoing HTTP cookies, use crunch-outgoing-cookies. - Use both - to disable HTTP cookies completely.
- -It makes no sense - at all to use this action in conjunction with the - session-cookies-only - action, since it would prevent the session cookies from being - set. See also filter-content-cookies.
-
- -+crunch-incoming-cookies -- |
+ + Feature + | ++ Cautious + | ++ Medium + | ++ Advanced + |
---|
Remove a server header Privoxy has no dedicated action for.
-Deletes every header sent by the server that contains the - string the user supplied as parameter.
-Parameterized.
-Any string.
-This action allows you to block server headers for which no - dedicated Privoxy action - exists. Privoxy will remove - every server header that contains the string you supplied as - parameter.
- -Regular expressions are not supported and you can't use this - action to block different headers in the same request, unless - they contain the same string.
- -crunch-server-header is only meant - for quick tests. If you have to block several different - headers, or only want to modify parts of them, you should use a - custom server-header - filter.
- -Warning | -
- Don't block any header without understanding the - consequences. - |
-
- -# Crunch server headers that try to prevent caching -{ +crunch-server-header{no-cache} } -/ -+ Ad-blocking Aggressiveness |
-
Prevent the web server from reading any HTTP cookies from - your system
-Deletes any "Cookie:" HTTP - headers from client requests.
-Boolean.
-N/A
-This action is only concerned with outgoing HTTP - cookies. For incoming HTTP cookies, use crunch-incoming-cookies. - Use both - to disable HTTP cookies completely.
- -It makes no sense - at all to use this action in conjunction with the - session-cookies-only - action, since it would prevent the session cookies from being - read.
-
- -+crunch-outgoing-cookies -+ medium + |
+ + high + | ++ high |
Stop those annoying, distracting animated GIF images.
-De-animate GIF animations, i.e. reduce them to their first - or last image.
-Parameterized.
-"last" or "first"
-This will also shrink the images considerably (in bytes, not - pixels!). If the option "first" is - given, the first frame of the animation is used as the - replacement. If "last" is given, the - last frame of the animation is used instead, which probably - makes more sense for most banner animations, but also has the - risk of not showing the entire last frame (if it is only a - delta to an earlier frame).
- -You can safely use this action with patterns that will also - match non-GIF objects, because no attempt will be made at - anything that doesn't look like a GIF.
-
- -+deanimate-gifs{last} -+ Ad-filtering by size + |
+ + no + | ++ yes + | ++ yes |
Work around (very rare) problems with HTTP/1.1
-Downgrades HTTP/1.1 client requests and server replies to - HTTP/1.0.
-Boolean.
-N/A
-This is a left-over from the time when Privoxy didn't support important HTTP/1.1 - features well. It is left here for the unlikely case that you - experience HTTP/1.1-related problems with some server out - there.
- -Note that enabling this action is only a workaround. It - should not be enabled for sites that work without it. While it - shouldn't break any pages, it has an (usually negative) - performance impact.
- -If you come across a site where enabling this action helps, - please report it, so the cause of the problem can be analyzed. - If the problem turns out to be caused by a bug in Privoxy it should be fixed so the - following release works without the work around.
-
- -{+downgrade-http-version} -problem-host.example.com -+ Ad-filtering by link + |
+ + no + | ++ no + | ++ yes |
Fool some click-tracking scripts and speed up indirect - links.
-Detects redirection URLs and redirects the browser without - contacting the redirection server first.
-Parameterized.
-"simple-check" to just search - for the string "http://" to - detect redirection URLs.
-"check-decoded-url" to decode - URLs (if necessary) before searching for redirection - URLs.
-Many sites, like yahoo.com, don't just link to other sites. - Instead, they will link to some script on their own servers, - giving the destination as a parameter, which will then redirect - you to the final target. URLs resulting from this scheme - typically look like: "http://www.example.org/click-tracker.cgi?target=http%3a//www.example.net/".
- -Sometimes, there are even multiple consecutive redirects - encoded in the URL. These redirections via scripts make your - web browsing more traceable, since the server from which you - follow such a link can see where you go to. Apart from that, - valuable bandwidth and time is wasted, while your browser asks - the server for one redirect after the other. Plus, it feeds the - advertisers.
- -This feature is currently not very smart and is scheduled - for improvement. If it is enabled by default, you will have to - create some exceptions to this action. It can lead to failures - in several ways:
- -Not every URLs with other URLs as parameters is evil. Some - sites offer a real service that requires this information to - work. For example a validation service needs to know, which - document to validate. fast-redirects - assumes that every URL parameter that looks like another URL is - a redirection target, and will always redirect to the last one. - Most of the time the assumption is correct, but if it isn't, - the user gets redirected anyway.
- -Another failure occurs if the URL contains other parameters - after the URL parameter. The URL: "http://www.example.org/?redirect=http%3a//www.example.net/&foo=bar". - contains the redirection URL "http://www.example.net/", followed by another - parameter. fast-redirects doesn't know - that and will cause a redirect to "http://www.example.net/&foo=bar". Depending - on the target server configuration, the parameter will be - silently ignored or lead to a "page not - found" error. You can prevent this problem by first - using the redirect action to remove - the last part of the URL, but it requires a little effort.
- -To detect a redirection URL, fast-redirects only looks for the string - "http://", either in plain text - (invalid but often used) or encoded as "http%3a//". Some sites use their own URL - encoding scheme, encrypt the address of the target server or - replace it with a database id. In theses cases fast-redirects is fooled and the request reaches - the redirection server where it probably gets logged.
-
- - { +fast-redirects{simple-check} } - one.example.com - - { +fast-redirects{check-decoded-url} } - another.example.com/testing -+ Pop-up killing + |
+ + blocks only + | ++ blocks only + | ++ blocks only |
Get rid of HTML and JavaScript annoyances, banner - advertisements (by size), do fun text replacements, add - personalized effects, etc.
-All instances of text-based type, most notably HTML and - JavaScript, to which this action applies, can be filtered - on-the-fly through the specified regular expression based - substitutions. (Note: as of version 3.0.3 plain text documents - are exempted from filtering, because web servers often use the - text/plain MIME type for all files - whose type they don't know.)
-Parameterized.
-The name of a content filter, as defined in the filter file. Filters can be defined in - one or more files as defined by the filterfile - option in the config file. default.filter is the collection of filters - supplied by the developers. Locally defined filters should go - in their own file, such as user.filter.
- -When used in its negative form, and without parameters, - all - filtering is completely disabled.
-For your convenience, there are a number of pre-defined - filters available in the distribution filter file that you can - use. See the examples below for a list.
- -Filtering requires buffering the page content, which may - appear to slow down page rendering since nothing is displayed - until all content has passed the filters. (The total time until - the page is completely rendered doesn't change much, but it may - be perceived as slower since the page is not incrementally - displayed.) This effect will be more noticeable on slower - connections.
- -"Rolling your own" filters - requires a knowledge of "Regular Expressions" and - "HTML". This is very - powerful feature, and potentially very intrusive. Filters - should be used with caution, and where an equivalent - "action" is not available.
- -The amount of data that can be filtered is limited to the - buffer-limit option in the - main config file. The default is 4096 - KB (4 Megs). Once this limit is exceeded, the buffered data, - and all pending data, is passed through unfiltered.
- -Inappropriate MIME types, such as zipped files, are not - filtered at all. (Again, only text-based types except plain - text). Encrypted SSL data (from HTTPS servers) cannot be - filtered either, since this would violate the integrity of the - secure transaction. In some situations it might be necessary to - protect certain text, like source code, from filtering by - defining appropriate -filter - exceptions.
- -Compressed content can't be filtered either, but if - Privoxy is compiled with zlib - support and a supported compression algorithm is used (gzip or - deflate), Privoxy can first - decompress the content and then filter it.
- -If you use a Privoxy - version without zlib support, but want filtering to work on as - much documents as possible, even those that would normally be - sent compressed, you must use the prevent-compression - action in conjunction with filter.
- -Content filtering can achieve some of the same effects as - the block action, i.e. it can be - used to block ads and banners. But the mechanism works quite - differently. One effective use, is to block ad banners based on - their size (see below), since many of these seem to be somewhat - standardized.
- -Feedback with suggestions for new - or improved filters is particularly welcome!
- -The below list has only the names and a one-line description - of each predefined filter. There are more verbose - explanations of what these filters do in the filter file chapter.
-
- -+filter{js-annoyances} # Get rid of particularly annoying JavaScript abuse. -+ Privacy Features + |
+ + low + | ++ medium + | ++ medium/high |
- -+filter{js-events} # Kill all JS event bindings and timers (Radically destructive! Only for extra nasty sites). -+ Cookie handling + |
+ + none + | ++ session-only + | ++ kill |
- -+filter{html-annoyances} # Get rid of particularly annoying HTML abuse. -+ Referer forging + |
+ + no + | ++ yes + | ++ yes |
- -+filter{content-cookies} # Kill cookies that come in the HTML or JS content. -+ GIF de-animation + |
+ + no + | ++ yes + | ++ yes |
- -+filter{refresh-tags} # Kill automatic refresh tags (for dial-on-demand setups). -+ Fast redirects + |
+ + no + | ++ no + | ++ yes |
- -+filter{unsolicited-popups} # Disable only unsolicited pop-up windows. Useful if your browser lacks this ability. -+ HTML taming + |
+ + no + | ++ no + | ++ yes |
- -+filter{all-popups} # Kill all popups in JavaScript and HTML. Useful if your browser lacks this ability. -+ JavaScript taming + |
+ + no + | ++ no + | ++ yes |
+ The list of actions files to be used is defined in the main + configuration file, and the files are processed in the order they are defined + (e.g. default.action is typically processed + before user.action). The content of these + can all be viewed and edited from http://config.privoxy.org/show-status. The over-riding + principle when applying actions is that the last action that matches + a given URL wins. The broadest, most general rules go first (defined + in default.action), followed by any + exceptions (typically also in default.action), which are then followed lastly by + any local preferences (typically in user.action). + Generally, user.action has the last word. +
++ An actions file typically has multiple sections. If you want to use + "aliases" in an actions file, you have to + place the (optional) alias + section at the top of that file. Then comes the default set of + rules which will apply universally to all sites and pages (be very careful with + using such a universal set in user.action + or any other actions file after default.action, because it will override the result + from consulting any previous file). And then below that, exceptions + to the defined universal policies. You can regard user.action as an appendix to default.action, with the advantage that it is a + separate file, which makes preserving your personal settings across + Privoxy upgrades easier. +
++ Actions can be used to block anything you want, including ads, + banners, or just some obnoxious URL whose content you would rather + not see. Cookies can be accepted or rejected, or accepted only during + the current browser session (i.e. not written to disk), content can + be modified, some JavaScripts tamed, user-tracking fooled, and much + more. See below for a complete + list of actions. +
++ Note that some actions, + like cookie suppression or script disabling, may render some sites + unusable that rely on these techniques to work properly. Finding + the right mix of actions is not always easy and certainly a matter + of personal taste. And, things can always change, requiring + refinements in the configuration. In general, it can be said that + the more "aggressive" your default + settings (in the top section of the actions file) are, the more + exceptions for "trusted" sites you will + have to make later. If, for example, you want to crunch all cookies + per default, you'll have to make exceptions from that rule for + sites that you regularly use and that require cookies for actually + useful purposes, like maybe your bank, favorite shop, or newspaper. +
++ We have tried to provide you with reasonable rules to start from in + the distribution actions files. But there is no general rule of + thumb on these things. There just are too many variables, and sites + are constantly changing. Sooner or later you will want to change + the rules (and read this chapter again :). +
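The cookie example above can be sketched as follows. This is only an illustration: the host names are placeholders and are not taken from the distribution actions files, and it assumes the two cookie-crunching actions are the ones you want as your default.

```
# Default: crunch cookies in both directions for every URL.
{ +crunch-incoming-cookies +crunch-outgoing-cookies }
/

# Exception: sites that need cookies to work (hypothetical hosts).
{ -crunch-incoming-cookies -crunch-outgoing-cookies }
.bank.example.com
shop.example.org
```

Because the exception section comes after the universal section, it has the last word for the listed hosts.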
++ The easiest way to edit the actions files is with a browser by + using our browser-based editor, which can be reached from http://config.privoxy.org/show-status. Note: the config + file option enable-edit-actions must be + enabled for this to work. The editor allows both fine-grained + control over every single feature on a per-URL basis, and easy + choosing from wholesale sets of defaults like "Cautious", "Medium" or + "Advanced". Warning: the "Advanced" setting is more aggressive, and will be + more likely to cause problems for some sites. Experienced users + only! +
+ If you prefer plain text editing to GUIs, you can of course also + directly edit the actions files with your favorite text editor. + Look at default.action, which is richly + commented with many good examples. +
+ Actions files are divided into sections. There are special + sections, like the "alias" sections which will + be discussed later. For now let's concentrate on regular sections: + They have a heading line (often split up to multiple lines for + readability) which consists of a list of actions, separated by + whitespace and enclosed in curly braces. Below that, there is a + list of URL and tag patterns, each on a separate line. +
++ To determine which actions apply to a request, the URL of the + request is compared to all URL patterns in each "action file". Every time it matches, the list of + applicable actions for the request is incrementally updated, using + the heading of the section in which the pattern is located. The + same is done again for tags and tag patterns later on. +
++ If multiple applying sections set the same action differently, the + last match wins. If not, the effects are aggregated. E.g. a URL + might match a regular section with a heading line of { +handle-as-image }, + then later another one with just { +block }, resulting in both actions to + apply. And there may well be cases where you will want to combine + actions together. Such a section then might look like: +
++
+
+ { +handle-as-image +block{Banner ads.} }
+ # Block these as if they were images. Send no block page.
+ banners.example.com
+ media.example.com/.*banners
+ .example.com/images/ads/
+ |
+
+ You can trace this process for URL patterns and any given URL by + visiting http://config.privoxy.org/show-url-info. +
+ Examples and more detail on this are provided in the Appendix, Troubleshooting: Anatomy of an + Action section. +
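To make the "last match wins, otherwise aggregate" rule concrete, here is a minimal sketch with made-up hosts and paths. For a request to www.example.com/ads/legal-notice.html both sections match: the later section switches block off again, while handle-as-image from the earlier section stays enabled because nothing later sets it differently.

```
# Earlier section: block everything below /ads/ and treat it as an image.
{ +block{Ad path.} +handle-as-image }
.example.com/ads/

# Later section: sets the block action differently, so this match wins
# for block. handle-as-image from the section above still applies.
{ -block }
.example.com/ads/legal-notice\.html$
```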
++ As mentioned, Privoxy uses "patterns" to determine what actions might apply to + which sites and pages your browser attempts to access. These "patterns" use wild card type pattern matching to + achieve a high degree of flexibility. This allows one expression to + be expanded and potentially match against many similar patterns. +
++ Generally, an URL pattern has the form <host><port>/<path>, where the <host>, the <port> and the <path> are optional. (This is why the special + / pattern matches all URLs). Note that the + protocol portion of the URL pattern (e.g. http://) should not be included in the pattern. This is + assumed already! +
++ The pattern matching syntax is different for the host and path + parts of the URL. The host part uses a simple globbing type + matching technique, while the path part uses more flexible "Regular Expressions" (POSIX + 1003.2). +
++ The port part of a pattern is a decimal port number preceded by a + colon (:). If the host part contains a + numerical IPv6 address, it has to be put into angle brackets (<, >). +
++ is a host-only pattern and will match any request to www.example.com, regardless of which + document on that server is requested. So ALL pages in this + domain would be covered by the scope of this action. Note + that a simple example.com is + different and would NOT match. +
++ means exactly the same. For host-only patterns, the trailing + / may be omitted. +
++ matches all the documents on www.example.com whose name starts with /index.html. +
++ matches only the single document /index.html on www.example.com. +
++ matches the document /index.html, + regardless of the domain, i.e. on any web server anywhere. +
++ Matches any URL because there's no requirement for either the + domain or the path to match anything. +
++ Matches any URL pointing to TCP port 8000. +
++ Matches any URL with the host address 10.0.0.1. (Note that the real URL uses plain + brackets, not angle brackets.) +
++ Matches any URL with the host address 2001:db8::1. (Note that the real URL uses + plain brackets, not angle brackets.) +
+ matches nothing, since it would be interpreted as a domain + name and there is no top-level domain called .html. So it's a mistake. +
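As a hedged illustration of the host, port and path parts described above, the following sections use one pattern of each kind. The block reasons are placeholders, and the patterns are examples that fit the rules stated in this section rather than lines copied from a shipped actions file.

```
# Host-only pattern: every document on www.example.com.
{ +block{Example: whole host.} }
www.example.com/

# Port-only pattern: any host, TCP port 8000.
{ +block{Example: port 8000.} }
:8000/

# Numerical IPv6 address, written in angle brackets in the pattern.
{ +block{Example: IPv6 host.} }
<2001:db8::1>/

# Path-only pattern: /index.html on any server.
{ +block{Example: path only.} }
/index.html$
```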
++ The matching of the host part offers some flexible options: if + the host pattern starts or ends with a dot, it becomes unanchored + at that end. The host pattern is often referred to as domain + pattern as it is usually used to match domain names and not IP + addresses. For example: +
++ matches any domain with first-level domain com and second-level domain example. For example www.example.com, example.com and foo.bar.baz.example.com. Note that it + wouldn't match if the second-level domain was another-example. +
++ matches any domain that STARTS with www. (It also matches the domain www but most of the time that doesn't + matter.) +
++ matches any domain that CONTAINS .example.. And, by the way, also included + would be any files or documents that exist within that + domain since no path limitations are specified. (Correctly + speaking: It matches any FQDN that contains example as a domain.) This might be www.example.com, news.example.de, or www.example.net/cgi/testing.pl for instance. + All these cases are matched. +
++ Additionally, there are wild-cards that you can use in the domain + names themselves. These work similarly to shell globbing type + wild-cards: "*" represents zero or + more arbitrary characters (this is equivalent to the "Regular Expression" based + syntax of ".*"), "?" represents any single character (this is + equivalent to the regular expression syntax of a simple "."), and you can define "character classes" in square brackets which is + similar to the same regular expression technique. All of this can + be freely mixed: +
++ matches "adserver.example.com", + "ads.example.com", etc but not + "sfads.example.com" +
++ matches all of the above, and then some. +
++ matches www.ipix.com, pictures.epix.com, a.b.c.d.e.upix.com etc. +
++ matches www1.example.com, www4.example.cc, wwwd.example.cy, wwwz.example.com etc., but not wwww.example.com. +
++ While flexible, this is not the sophistication of full regular + expression based syntax. +
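The wild-card descriptions above would fit host patterns like the ones in this sketch. The patterns are reconstructed from the matching behaviour described in the text and should be read as illustrations, not as an authoritative list.

```
{ +block{Example: host wild-cards.} }
# "*" as zero or more characters: adserver.example.com, ads.example.com,
# but not sfads.example.com.
ad*.example.com
# Unanchored on the left and "?" as exactly one character:
# www.ipix.com, pictures.epix.com, a.b.c.d.e.upix.com.
.?pix.com
# Character class plus trailing "*": www1.example.com, wwwd.example.cy,
# wwwz.example.com, but not wwww.example.com.
www[1-9a-ez].example.c*
```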
++ Privoxy uses "modern" POSIX 1003.2 "Regular Expressions" for + matching the path portion (after the slash), and is thus more + flexible. +
+ There is an Appendix with a + brief quick-start into regular expressions; you might also want + to have a look at your operating system's documentation on + regular expressions (try man re_format). +
++ Note that the path pattern is automatically left-anchored at the + "/", i.e. it matches as if it would + start with a "^" (regular expression + speak for the beginning of a line). +
++ Please also note that matching in the path is CASE INSENSITIVE by + default, but you can switch to case sensitive at any point in the + pattern by using the "(?-i)" switch: + www.example.com/(?-i)PaTtErN.* will + match only documents whose path starts with PaTtErN in exactly this capitalization. +
+ Is equivalent to just ".example.com", since any documents within + that domain are matched with or without the ".*" regular expression. This is redundant. +
++ Will match any page in the domain of "example.com" that is named "index.html", and that is part of some path. + For example, it matches "www.example.com/testing/index.html" but NOT + "www.example.com/index.html" + because the regular expression called for at least two + "/'s", thus the path + requirement. It also would match "www.example.com/testing/index_html", + because of the special meta-character ".". +
++ This regular expression is conditional so it will match any + page named "index.html" + regardless of path which in this case can have one or more + "/'s". And this one must contain + exactly ".html" (but does not + have to end with that!). +
++ This regular expression will match any path of "example.com" that contains any of the words + "ads", "banner", "banners" (because of the "?") or "junk". + The path does not have to end in these words, just contain + them. +
++ This is very much the same as above, except now it must end + in either ".jpg", ".jpeg", ".gif" + or ".png". So this one is + limited to common image formats. +
++ There are many, many good examples to be found in default.action, and more tutorials below in Appendix on regular expressions. +
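A short sketch of path patterns in the spirit of the descriptions above. The patterns are assumptions chosen to match the described behaviour (words in the path, image extensions, the "(?-i)" switch); the block reasons are placeholders.

```
# A path in which "ads", "banner", "banners" or "junk" follows a slash.
{ +block{Example: ad paths.} }
.example.com/.*/(ads|banners?|junk)

# The same, but only if the path ends in a common image extension.
{ +block{Example: ad images.} +handle-as-image }
.example.com/.*/(ads|banners?|junk)/.*\.(jpe?g|gif|png)$

# Case-sensitive matching from the (?-i) switch onwards.
{ +block{Example: case-sensitive path.} }
www.example.com/(?-i)PaTtErN.*
```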
++ Request tag patterns are used to change the applying actions + based on the request's tags. Tags can be created based on HTTP + headers with either the client-header-tagger + or the server-header-tagger + action. +
++ Request tag patterns have to start with "TAG:", so Privoxy can tell them apart from other + patterns. Everything after the colon including white space, is + interpreted as a regular expression with path pattern syntax, + except that tag patterns aren't left-anchored automatically + (Privoxy doesn't silently add a + "^", you have to do it yourself if you + need it). +
+ To match all requests that are tagged with "foo" your pattern line should be "TAG:^foo$". "TAG:foo" + would work as well, but it would also match requests whose tags + contain "foo" somewhere. "TAG: foo" wouldn't work as it requires white + space. +
++ Sections can contain URL and request tag patterns at the same + time, but request tag patterns are checked after the URL patterns + and thus always overrule them, even if they are located before + the URL patterns. +
+ Once a new request tag is added, Privoxy checks right away if + it's matched by one of the request tag patterns and updates the + action settings accordingly. As a result request tags can be used + to activate other tagger actions, as long as these other taggers + look for headers that haven't already been parsed. +
+ For example you could tag client requests which use the POST method, then use this tag to activate + another tagger that adds a tag if cookies are sent, and then use + a block action based on the cookie tag. This allows the outcome + of one action to be used as input for a subsequent action. However, if + you reversed the position of the described taggers and + activated the method tagger based on the cookie tagger, no method + tags would be created. The method tagger would look for the + request line, but at the time the cookie tag is created, the + request line has already been parsed. +
++ While this is a limitation you should be aware of, this kind of + indirection is seldom needed anyway and even the example doesn't + make too much sense. +
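The interplay of a tagger action and a request tag pattern can be sketched like this. It assumes that a tagger named "user-agent" is defined in one of the loaded filter files and that it produces tags of the form "User-Agent: value"; both are assumptions for illustration, so check your filter file before relying on them.

```
# Create a tag from the User-Agent header of every request.
# Assumption: a tagger called "user-agent" exists in a loaded filter file.
{ +client-header-tagger{user-agent} }
/

# Tagging alone changes nothing; a section with a TAG pattern does.
# Tag patterns are not left-anchored automatically, hence the "^".
{ -filter }
TAG:^User-Agent: fetch libfetch/
```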
++ To match requests that do not have a certain request tag, specify + a negative tag pattern by prefixing the tag pattern line with + either "NO-REQUEST-TAG:" or "NO-RESPONSE-TAG:" instead of "TAG:". +
++ Negative request tag patterns created with "NO-REQUEST-TAG:" are checked after all client + headers are scanned, the ones created with "NO-RESPONSE-TAG:" are checked after all server + headers are scanned. In both cases all the created tags are + considered. +
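A minimal sketch of a negative tag pattern, assuming some tagger would create a request tag "foo" (the tag name and the block reason are hypothetical). The section applies to requests that do not carry a tag matching the pattern once all client headers have been scanned.

```
# Applies only to requests WITHOUT a request tag matching ^foo$.
{ +block{Request tag "foo" is missing.} }
NO-REQUEST-TAG:^foo$
```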
++ Warning + | +
+ + This is an experimental feature. The syntax is likely to + change in future versions. + + |
+
+ Client tag patterns are not set based on HTTP headers but based + on the client's IP address. Users can enable them themselves, but + the Privoxy admin controls which tags are available and what + their effect is. +
+ After a client-specific tag has been defined with the client-specific-tag + directive, action sections can be activated based on the tag by + using a CLIENT-TAG pattern. The CLIENT-TAG pattern is evaluated + at the same priority as URL patterns; as a result, the last + matching pattern wins. Tags that are created based on client or + server headers are evaluated later on and can overrule CLIENT-TAG + and URL patterns! +
+ The tag is set for all requests that come from clients that + requested it to be set. Note that "clients" are differentiated by + IP address; if the IP address changes, the tag has to be requested + again. +
++ Clients can request tags to be set by using the CGI interface http://config.privoxy.org/client-tags. +
++ Example: +
++
+
+# If the admin defined the client-specific-tag circumvent-blocks,
+# and the request comes from a client that previously requested
+# the tag to be set, overrule all previous +block actions that
+# are enabled based on URL or CLIENT-TAG patterns.
+{-block}
+CLIENT-TAG:^circumvent-blocks$
+
+# This section is not overruled because it's located after
+# the previous one.
+{+block{Nobody is supposed to request this.}}
+example.org/blocked-example-page
+ |
+
+ All actions are disabled by default, until they are explicitly + enabled somewhere in an actions file. Actions are turned on if + preceded with a "+", and turned off if + preceded with a "-". So a +action means "do that + action", e.g. +block means "please block URLs that match the following + patterns", and -block means "don't block URLs that match the following patterns, + even if +block previously + applied." +
++ Again, actions are invoked by placing them on a line, enclosed in + curly braces and separated by whitespace, like in {+some-action -some-other-action{some-parameter}}, + followed by a list of URL patterns, one per line, to which they + apply. Together, the actions line and the following pattern lines + make up a section of the actions file. +
++ Actions fall into three categories: +
++
+ Boolean, i.e. the action can only be "enabled" or "disabled". Syntax: +
++
+
+ +name        # enable action name
+ -name        # disable action name
+ |
+
+ Example: +handle-as-image +
++ Parameterized, where some value is required in order to enable + this type of action. Syntax: +
++
+
+ +name{param}  # enable action and set parameter to param,
+               # overwriting parameter from previous match if necessary
+ -name         # disable action. The parameter can be omitted
+ |
+
+ Note that if the URL matches multiple positive forms of a + parameterized action, the last match wins, i.e. the params from + earlier matches are simply ignored. +
++ Example: +hide-user-agent{Mozilla/5.0 (X11; + U; FreeBSD i386; en-US; rv:1.8.1.4) Gecko/20070602 + Firefox/2.0.0.4} +
++ Multi-value. These look exactly like parameterized actions, but + they behave differently: If the action applies multiple times + to the same URL, but with different parameters, all the parameters + from all + matches are remembered. This is used for actions that can be + executed for the same request repeatedly, like adding multiple + headers, or filtering through multiple filters. Syntax: +
++
+
+ +name{param}  # enable action and add param to the list of parameters
+ -name{param}  # remove the parameter param from the list of parameters
+               # If it was the last one left, disable the action.
+ -name         # disable this action completely and remove all parameters from the list
+ |
+
+ Examples: +add-header{X-Fun-Header: Some + text} and +filter{html-annoyances} +
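The difference between parameterized and multi-value actions is easiest to see when one URL matches several sections. The following sketch uses the multi-value filter action with two filter names mentioned in this chapter; the host names are placeholders.

```
# Both filters end up in the parameter list for pages on example.com.
{ +filter{js-annoyances} }
.example.com

{ +filter{html-annoyances} }
.example.com

# Remove one parameter again for a sub-site; html-annoyances stays active.
{ -filter{js-annoyances} }
scripts.example.com
```

With a parameterized action in place of filter, the second section would simply have replaced the parameter set by the first one.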
++ If nothing is specified in any actions file, no "actions" are taken. So in this case Privoxy would just be a normal, non-blocking, + non-filtering proxy. You must specifically enable the privacy and + blocking features you need (although the provided default actions + files will give a good starting point). +
+ Later defined action sections always over-ride earlier ones of the + same type. So exceptions to any rules you make should come in the + latter part of the file (or in a file that is processed later when + using multiple actions files such as user.action). For multi-valued actions, the actions + are applied in the order they are specified. Actions files are + processed in the order they are defined in config (the default installation has three actions + files). It is also quite possible for any given URL to match more than + one "pattern" (because of wildcards and + regular expressions), and thus to trigger more than one set of + actions! Last match wins. +
++ The list of valid Privoxy actions + are: +
++ Confuse log analysis, custom applications +
++ Sends a user defined HTTP header to the web server. +
++ Multi-value. +
++ Any string value is possible. Validity of the defined HTTP + headers is not checked. It is recommended that you use the + "X-" + prefix for custom headers. +
++ This action may be specified multiple times, in order to + define multiple headers. This is rarely needed for the + typical user. If you don't know what "HTTP headers" are, you definitely don't + need to worry about this one. +
++ Headers added by this action are not modified by other + actions. +
++
+
+# Add a DNT ("Do not track") header to all requests,
+# even to those that already have one.
+#
+# This is just an example, not a recommendation.
+#
+# There is no reason to believe that user-tracking websites care
+# about the DNT header and depending on the User-Agent, adding the
+# header may make user-tracking easier.
+{+add-header{DNT: 1}}
+/
+ |
+
+ Block ads or other unwanted content +
++ Requests for URLs to which this action applies are blocked, + i.e. the requests are trapped by Privoxy and the requested URL is never + retrieved, but is answered locally with a substitute page + or image, as determined by the handle-as-image, + set-image-blocker, + and handle-as-empty-document + actions. +
++ Parameterized. +
++ A block reason that should be given to the user. +
++ Privoxy sends a special + "BLOCKED" page for requests to + blocked pages. This page contains the block reason given as + parameter, a link to find out why the block action applies, + and a click-through to the blocked content (the latter only + if the force feature is available and enabled). +
++ A very important exception occurs if both block and handle-as-image, + apply to the same request: it will then be replaced by an + image. If set-image-blocker + (see below) also applies, the type of image will be + determined by its parameter, if not, the standard + checkerboard pattern is sent. +
++ It is important to understand this process, in order to + understand how Privoxy + deals with ads and other unwanted content. Blocking is a + core feature, and one upon which various other features + depend. +
++ The filter action can + perform a very similar task, by "blocking" banner images and other content + through rewriting the relevant URLs in the document's HTML + source, so they don't get requested in the first place. + Note that this is a totally different technique, and it's + easy to confuse the two. +
++
+
++{+block{No nasty stuff for you.}} +# Block and replace with "blocked" page + .nasty-stuff.example.com + +{+block{Doubleclick banners.} +handle-as-image} +# Block and replace with image + .ad.doubleclick.net + .ads.r.us/banners/ + +{+block{Layered ads.} +handle-as-empty-document} +# Block and then ignore + adserver.example.net/.*\.js$ ++ |
+
+ Improve privacy by not forwarding the source of the request + in the HTTP headers. +
++ Deletes the "X-Forwarded-For:" + HTTP header from the client request, or adds a new one. +
++ Parameterized. +
++ "block" to delete the + header. +
++ "add" to create the header + (or append the client's IP address to an already + existing one). +
++ It is safe and recommended to use block. +
++ Forwarding the source address of the request may make sense + in some multi-user setups but is also a privacy risk. +
++
+
+++change-x-forwarded-for{block} ++ |
+
+ Rewrite or remove single client headers. +
++ All client headers to which this action applies are + filtered on-the-fly through the specified regular + expression based substitutions. +
++ Multi-value. +
++ The name of a client-header filter, as defined in one of + the filter files. +
++ Client-header filters are applied to each header on its + own, not to all at once. This makes it easier to diagnose + problems, but on the downside you can't write filters that + only change header x if header y's value is z. You can do + that by using tags though. +
++ Client-header filters are executed after the other header + actions have finished and use their output as input. +
++ If the request URI gets changed, Privoxy will detect that and use the + new one. This can be used to rewrite the request + destination behind the client's back, for example to + specify a Tor exit relay for certain requests. +
++ Please refer to the filter file + chapter to learn which client-header filters are + available by default, and how to create your own. +
++
+
++# Hide Tor exit notation in Host and Referer Headers +{+client-header-filter{hide-tor-exit-notation}} +/ + ++ |
+
+ Block requests based on their headers. +
++ Client headers to which this action applies are filtered + on-the-fly through the specified regular expression based + substitutions; the result is used as a tag. +
++ Multi-value. +
++ The name of a client-header tagger, as defined in one of + the filter files. +
++ Client-header taggers are applied to each header on its + own, and as the header isn't modified, each tagger "sees" the original. +
++ Client-header taggers are the first actions that are + executed and their tags can be used to control every other + action. +
++
+
++# Tag every request with the User-Agent header +{+client-header-tagger{user-agent}} +/ + +# Tagging itself doesn't change the action +# settings, sections with TAG patterns do: +# +# If it's a download agent, use a different forwarding proxy, +# show the real User-Agent and make sure resume works. +{+forward-override{forward-socks5 10.0.0.2:2222 .} \ + -hide-if-modified-since \ + -overwrite-last-modified \ + -hide-user-agent \ + -filter \ + -deanimate-gifs \ +} +TAG:^User-Agent: NetBSD-ftp/ +TAG:^User-Agent: Novell ZYPP Installer +TAG:^User-Agent: RPM APT-HTTP/ +TAG:^User-Agent: fetch libfetch/ +TAG:^User-Agent: Ubuntu APT-HTTP/ +TAG:^User-Agent: MPlayer/ + ++ |
+
+
+
++# Tag all requests with the Range header set +{+client-header-tagger{range-requests}} +/ + +# Disable filtering for the tagged requests. +# +# With filtering enabled Privoxy would remove the Range headers +# to be able to filter the whole response. The downside is that +# it prevents clients from resuming downloads or skipping over +# parts of multimedia files. +{-filter -deanimate-gifs} +TAG:^RANGE-REQUEST$ + ++ |
+
+ Stop useless download menus from popping up, or change the + browser's rendering mode +
++ Replaces the "Content-Type:" + HTTP server header. +
++ Parameterized. +
++ Any string. +
++ The "Content-Type:" HTTP server + header is used by the browser to decide what to do with the + document. The value of this header can cause the browser to + open a download menu instead of displaying the document by + itself, even if the document's format is supported by the + browser. +
++ The declared content type can also affect which rendering + mode the browser chooses. If XHTML is delivered as "text/html", many browsers treat it as + yet another broken HTML document. If it is sent as "application/xml", browsers with XHTML + support will only display it if the syntax is correct. +
++ If you see a web site that proudly uses XHTML buttons, but + sets "Content-Type: text/html", + you can use Privoxy to + overwrite it with "application/xml" and validate the web + master's claim inside your XHTML-supporting browser. If the + syntax is incorrect, the browser will complain loudly. +
++ You can also go the opposite direction: if your browser + prints error messages instead of rendering a document + falsely declared as XHTML, you can overwrite the content + type with "text/html" and have + it rendered as a broken HTML document. +
++ By default content-type-overwrite + only replaces "Content-Type:" + headers that look like some kind of text. If you want to + overwrite it unconditionally, you have to combine it with + force-text-mode. + This limitation exists for a reason, think twice before + circumventing it. +
++ Most of the time it's easier to replace this action with a + custom server-header + filter. It allows you to activate it for every + document of a certain site and it will still only replace + the content types you aimed at. +
++ Of course you can apply content-type-overwrite to a whole site and + then make URL based exceptions, but it's a lot more work to + get the same precision. +
++
+
++# Check if www.example.net/ really uses valid XHTML +{ +content-type-overwrite{application/xml} } +www.example.net/ - +# but leave the content type unmodified if the URL looks like a style sheet +{-content-type-overwrite} +www.example.net/.*\.css$ +www.example.net/.*style ++ |
+
+ Remove a client header Privoxy has no dedicated action for. +
++ Deletes every header sent by the client that contains the + string the user supplied as parameter. +
++ Parameterized. +
++ Any string. +
++ This action allows you to block client headers for which no + dedicated Privoxy action + exists. Privoxy will + remove every client header that contains the string you + supplied as parameter. +
++ Regular expressions are not supported and you can't use this + action to block different headers in the same request, + unless they contain the same string. +
++ crunch-client-header is only meant + for quick tests. If you have to block several different + headers, or only want to modify parts of them, you should + use a client-header + filter. +
++ Warning + | +
+ + Don't block any header without understanding the + consequences. + + |
+
+
+
++# Block the non-existent "Privacy-Violation:" client header +{ +crunch-client-header{Privacy-Violation:} } +/ -
+
+ + 8.5.8. crunch-if-none-match ++
+
+
+
+ + 8.5.9. + crunch-incoming-cookies ++
+
+
+
+ + 8.5.10. crunch-server-header ++
+
+
+
+ + 8.5.11. + crunch-outgoing-cookies ++
+
+
+
+ + 8.5.12. deanimate-gifs ++
+
+
+
+ + 8.5.13. + downgrade-http-version ++
+
+
+ + 8.5.14. external-filter ++
+
|
+
+ Fool some click-tracking scripts and speed up indirect + links. +
++ Detects redirection URLs and redirects the browser without + contacting the redirection server first. +
++ Parameterized. +
++ "simple-check" to just + search for the string "http://" to detect redirection URLs. +
++ "check-decoded-url" to + decode URLs (if necessary) before searching for + redirection URLs. +
++ Many sites, like yahoo.com, don't just link to other sites. + Instead, they will link to some script on their own + servers, giving the destination as a parameter, which will + then redirect you to the final target. URLs resulting from + this scheme typically look like: "http://www.example.org/click-tracker.cgi?target=http%3a//www.example.net/". +
++ Sometimes, there are even multiple consecutive redirects + encoded in the URL. These redirections via scripts make + your web browsing more traceable, since the server from + which you follow such a link can see where you go to. Apart + from that, valuable bandwidth and time are wasted while + your browser asks the server for one redirect after the + other. Plus, it feeds the advertisers. +
++ This feature is currently not very smart and is scheduled + for improvement. If it is enabled by default, you will have + to create some exceptions to this action. It can lead to + failures in several ways: +
++ Not every URL with other URLs as parameters is evil. Some + sites offer a real service that requires this information + to work. For example, a validation service needs to know + which document to validate. fast-redirects assumes that every URL + parameter that looks like another URL is a redirection + target, and will always redirect to the last one. Most of + the time the assumption is correct, but if it isn't, the + user gets redirected anyway. +
++ Another failure occurs if the URL contains other parameters + after the URL parameter. The URL "http://www.example.org/?redirect=http%3a//www.example.net/&foo=bar" + contains the redirection URL "http://www.example.net/", followed by + another parameter. fast-redirects + doesn't know that and will cause a redirect to "http://www.example.net/&foo=bar". + Depending on the target server configuration, the parameter + will be silently ignored or lead to a "page not found" error. You can prevent this + problem by first using the redirect action to + remove the last part of the URL, but it requires a little + effort. +
++ To detect a redirection URL, fast-redirects only looks for the string + "http://", either in plain text + (invalid but often used) or encoded as "http%3a//". Some sites use their own URL + encoding scheme, encrypt the address of the target server + or replace it with a database id. In these cases fast-redirects is fooled and the + request reaches the redirection server where it probably + gets logged. +
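The exceptions mentioned in the notes above could look roughly like the following sketch. It assumes fast-redirects is enabled for all URLs and that a (hypothetical) validation service at validator.example.org legitimately takes URLs as parameters.

```
# Enable fast-redirects everywhere (an assumption made for this sketch).
{+fast-redirects{check-decoded-url}}
/

# Exception for a hypothetical service that needs its URL parameter
# delivered intact instead of being treated as a redirection target.
{-fast-redirects}
validator.example.org
```

The exception has to come after the general section, because the last matching section wins.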
++
+
++ { +fast-redirects{simple-check} } + one.example.com -
+ + 8.5.16. filter ++
+
|
+
- -+filter{fun} # Text replacements for subversive browsing fun! + +
|
+
- -+filter{crude-parental} # Crude parental filtering. Note that this filter doesn't work reliably. + +
|
+
- -+filter{ie-exploits} # Disable some known Internet Explorer bug exploits. + +
|
+
- -+filter{site-specifics} # Cure for site-specific problems. Don't apply generally! + +
|
+
- -+filter{no-ping} # Removes non-standard ping attributes in <a> and <area> tags. + +
|
+
- -+filter{google} # CSS-based block for Google text ads. Also removes a width limitation and the toolbar advertisement. + +
|
+
- -+filter{yahoo} # CSS-based block for Yahoo text ads. Also removes a width limitation. + +
|
+
- -+filter{msn} # CSS-based block for MSN text ads. Also removes tracking URLs and a width limitation. + +
|
+
- -+filter{blogspot} # Cleans up some Blogspot blogs. Read the fine print before using this. + +
- 8.5.16. force-text-mode- -
-
|
+
Warning | +
+++filter{webbugs} # Squish WebBugs (1x1 invisible GIFs used for user tracking). ++ |
- Think twice before activating this action. Filtering - binary data with regular expressions can cause file - damage. + |
+++filter{tiny-textforms} # Extend those tiny textareas up to 40x80 and kill the hard wrap. + |
- -+force-text-mode + +
- 8.5.17. forward-override- -
-
|
+
+++filter{frameset-borders} # Give frames a border and make them resizable. ++ |
+
Multi-value.
-
+++filter{iframes} # Removes all detected iframes. Should only be enabled for individual sites. ++ |
+
+++filter{demoronizer} # Fix MS's non-standard use of standard charsets. ++ |
+
"forward ." to use a direct - connection without any additional proxies.
-
+++filter{shockwave-flash} # Kill embedded Shockwave Flash objects. ++ |
+
"forward 127.0.0.1:8123" to - use the HTTP proxy listening at 127.0.0.1 port 8123.
-
+++filter{quicktime-kioskmode} # Make Quicktime movies saveable. ++ |
+
"forward-socks4a 127.0.0.1:9050 - ." to use the socks4a proxy listening at 127.0.0.1 - port 9050. Replace "forward-socks4a" with "forward-socks4" to use a socks4 connection - (with local DNS resolution) instead, use "forward-socks5" for socks5 connections - (with remote DNS resolution).
-
+++filter{fun} # Text replacements for subversive browsing fun! ++ |
+
"forward-socks4a 127.0.0.1:9050 - proxy.example.org:8000" to use the socks4a proxy - listening at 127.0.0.1 port 9050 to reach the HTTP proxy - listening at proxy.example.org port 8000. Replace - "forward-socks4a" with - "forward-socks4" to use a socks4 - connection (with local DNS resolution) instead, use - "forward-socks5" for socks5 - connections (with remote DNS resolution).
-
+++filter{crude-parental} # Crude parental filtering. Note that this filter doesn't work reliably. ++ |
+
+++filter{ie-exploits} # Disable some known Internet Explorer bug exploits. ++ |
+
This action takes parameters similar to the forward directives in the - configuration file, but without the URL pattern. It can be used - as replacement, but normally it's only used in cases where - matching based on the request URL isn't sufficient.
+ +
+++filter{site-specifics} # Cure for site-specific problems. Don't apply generally! ++ |
+
Warning | +
+++filter{no-ping} # Removes non-standard ping attributes in <a> and <area> tags. ++ |
- Please read the description for the forward directives before - using this action. Forwarding to the wrong people will - reduce your privacy and increase the chances of - man-in-the-middle attacks. + |
+++filter{google} # CSS-based block for Google text ads. Also removes a width limitation and the toolbar advertisement. ++ |
+
If the ports are missing or invalid, default values - will be used. This might change in the future and you - shouldn't rely on it. Otherwise incorrect syntax causes - Privoxy to exit.
+ +
+++filter{yahoo} # CSS-based block for Yahoo text ads. Also removes a width limitation. ++ |
+
Use the show-url-info CGI page to verify that your - forward settings do what you thought they do.
+ +
+++filter{msn} # CSS-based block for MSN text ads. Also removes tracking URLs and a width limitation. + |
+++filter{blogspot} # Cleans up some Blogspot blogs. Read the fine print before using this. ++ |
+
+ Force Privoxy to treat a + document as if it was in some kind of text format. +
++ Declares a document as text, even if the "Content-Type:" isn't detected as such. +
++ Boolean. +
++ N/A +
++ As explained above, Privoxy tries to only filter files + that are in some kind of text format. The same restrictions + apply to content-type-overwrite. + force-text-mode declares a + document as text, without looking at the "Content-Type:" first. +
++ Warning + | +
+ + Think twice before activating this action. + Filtering binary data with regular expressions can + cause file damage. + + |
+
+
+
+++force-text-mode -
+
-
- + 8.5.18. forward-override ++
+
-
-
-
- 8.5.18. handle-as-empty-document- -
-
-
-
-
- 8.5.19. handle-as-image- -
-
-
- 8.5.20. hide-accept-language- -
-
|
+
+ Mark URLs that should be replaced by empty documents if they get + blocked +
++ This action alone doesn't do anything noticeable. It just + marks URLs. If the block action also + applies, the presence or absence of this mark + decides whether an HTML "BLOCKED" page, or an empty document will be + sent to the client as a substitute for the blocked content. + The empty document isn't literally empty, + but actually contains a single space. +
++ Boolean. +
++ N/A +
++ Some browsers complain about syntax errors if JavaScript + documents are blocked with Privoxy's default HTML page; this + option can be used to silence them. And of course this + action can also be used to eliminate the Privoxy BLOCKED message in frames. +
++ The content type for the empty document can be specified + with content-type-overwrite{}, + but usually this isn't necessary. +
++
+
++# Block all documents on example.org that end with ".js", +# but send an empty document instead of the usual HTML message. +{+block{Blocked JavaScript} +handle-as-empty-document} +example.org/.*\.js$ - |
+
+ Mark URLs as belonging to images (so they'll be replaced by + images if they + do get blocked, rather than HTML pages) +
++ This action alone doesn't do anything noticeable. It just + marks URLs as images. If the block action also + applies, the presence or absence of this mark + decides whether an HTML "blocked" page, or a replacement image (as + determined by the set-image-blocker + action) will be sent to the client as a substitute for the + blocked content. +
++ Boolean. +
++ N/A +
++ The below generic example section is actually part of default.action. It marks all URLs + with well-known image file name extensions as images and + should be left intact. +
++ Users will probably only want to use the handle-as-image + action in conjunction with block, to block sources + of banners, whose URLs don't reflect the file type, like in + the second example section. +
++ Note that you cannot treat HTML pages as images in most + cases. For instance, (in-line) ad frames require an HTML + page to be sent, or they won't display properly. Forcing + handle-as-image in this situation + will not replace the ad frame with an image, but lead to + error messages. +
++
+
++# Generic image extensions: +# +{+handle-as-image} +/.*\.(gif|jpg|jpeg|png|bmp|ico)$ -
+ + 8.5.21. hide-accept-language ++
+
|
+
Prevent download menus for content you prefer to view inside - the browser.
-Deletes or replaces the "Content-Disposition:" HTTP header set by some - servers.
-Parameterized.
-Keyword: "block", or any user - defined value.
-Some servers set the "Content-Disposition:" HTTP header for documents - they assume you want to save locally before viewing them. The - "Content-Disposition:" header - contains the file name the browser is supposed to use by - default.
- -In most browsers that understand this header, it makes it - impossible to just - view the document, without downloading it first, - even if it's just a simple text file or an image.
- -Removing the "Content-Disposition:" header helps to prevent - this annoyance, but some browsers additionally check the - "Content-Type:" header, before they - decide if they can display a document without saving it first. - In these cases, you have to change this header as well, before - the browser stops displaying download menus.
- -It is also possible to change the server's file name - suggestion to another one, but in most cases it isn't worth the - time to set it up.
- -This action will probably be removed in the future, use - server-header filters instead.
-
- + |
+
Prevent yet another way to track the user's steps between - sessions.
-Deletes the "If-Modified-Since:" - HTTP client header or modifies its value.
-Parameterized.
-Keyword: "block", or a user - defined value that specifies a range of hours.
-Removing this header is useful for filter testing, where you - want to force a real reload instead of getting status code - "304", which would cause the browser - to use a cached copy of the page.
- -Instead of removing the header, hide-if-modified-since can also add or subtract - a random amount of time to/from the header's value. You specify - a range of minutes where the random factor should be chosen - from and Privoxy does the - rest. A negative value means subtracting, a positive value - adding.
- -Randomizing the value of the "If-Modified-Since:" makes it less likely that - the server can use the time as a cookie replacement, but you - will run into caching problems if the random range is too - high.
- -It is a good idea to only use a small negative value and let - overwrite-last-modified - handle the greater changes.
- -It is also recommended to use this action together with - crunch-if-none-match, - otherwise it's more or less pointless.
-
- + |
+
Keep your (old and ill) browser from telling web servers - your email address
-Deletes any existing "From:" HTTP - header, or replaces it with the specified string.
-Parameterized.
-Keyword: "block", or any user - defined value.
-The keyword "block" will - completely remove the header (not to be confused with the - block action).
- -Alternately, you can specify any value you prefer to be sent - to the web server. If you do, it is a matter of fairness not to - use any address that is actually used by a real person.
- -This action is rarely needed, as modern web browsers don't - send "From:" headers anymore.
-
- + |
+
Conceal which link you followed to get to a particular - site
-Deletes the "Referer:" (sic) HTTP - header from the client request, or replaces it with a forged - one.
-Parameterized.
-"conditional-block" to delete - the header completely if the host has changed.
-"conditional-forge" to forge - the header if the host has changed.
-"block" to delete the header - unconditionally.
-"forge" to pretend to be - coming from the homepage of the server we are talking - to.
-Any other string to set a user defined referrer.
-conditional-block is the only - parameter, that isn't easily detected in the server's log file. - If it blocks the referrer, the request will look like the - visitor used a bookmark or typed in the address directly.
- -Leaving the referrer unmodified for requests on the same - host allows the server owner to see the visitor's "click path", but in most cases she could also - get that information by comparing other parts of the log file: - for example the User-Agent if it isn't a very common one, or - the user's IP address if it doesn't change between different - requests.
- -Always blocking the referrer, or using a custom one, can - lead to failures on servers that check the referrer before they - answer any requests, in an attempt to prevent their content - from being embedded or linked to elsewhere.
- -Both conditional-block and - forge will work with referrer checks, - as long as content and valid referring page are on the same - host. Most of the time that's the case.
hide-referer is an alternate - spelling of hide-referrer and the two - can be freely substituted with each other. ("referrer" is the correct English spelling, - however the HTTP specification has a bug - it requires it to be - spelled as "referer".)
-
- + |
+
Prevent abuse of Privoxy as - a TCP proxy relay or disable SSL for untrusted sites
-Specifies to which ports HTTP CONNECT requests are - allowable.
-Parameterized.
-A comma-separated list of ports or port ranges (the latter - using dashes, with the minimum defaulting to 0 and the maximum - to 65K).
-By default, i.e. if no limit-connect action applies, Privoxy allows HTTP CONNECT requests to - all ports. Use limit-connect if - fine-grained control is desired for some or all - destinations.
The CONNECT method exists in HTTP to allow access to secure - websites ("https://" URLs) through - proxies. It works very simply: the proxy connects to the server - on the specified port, and then short-circuits its connections - to the client and to the remote server. This means - CONNECT-enabled proxies can be used as TCP relays very - easily.
- -Privoxy relays HTTPS - traffic without seeing the decoded content. Websites can - leverage this limitation to circumvent Privoxy's filters. By specifying an - invalid port range you can disable HTTPS entirely.
-
- + |
+
Ensure that servers send the content uncompressed, so it can - be passed through filters.
-Removes the Accept-Encoding header which can be used to ask - for compressed transfer.
-Boolean.
-N/A
-More and more websites send their content compressed by - default, which is generally a good idea and saves bandwidth. - But the filter and deanimate-gifs - actions need access to the uncompressed data.
- -When compiled with zlib support (available since - Privoxy 3.0.7), content that - should be filtered is decompressed on-the-fly and you don't - have to worry about this action. If you are using an older - Privoxy version, or one that - hasn't been compiled with zlib support, this action can be used - to convince the server to send the content uncompressed.
- -Most text-based instances compress very well, the size is - seldom decreased by less than 50%, for markup-heavy instances - like news feeds saving more than 90% of the original size isn't - unusual.
- -Not using compression will therefore slow down the transfer, - and you should only enable this action if you really need it. - As of Privoxy 3.0.7 it's - disabled in all predefined action settings.
- -Note that some (rare) ill-configured sites don't handle - requests for uncompressed documents correctly. Broken PHP - applications tend to send an empty document body, some IIS - versions only send the beginning of the content. If you enable - prevent-compression per default, you - might want to add exceptions for those sites. See the example - for how to do that.
-+ Limit the lifetime of HTTP cookies to a couple of minutes + or hours. +
++ Overwrites the expires field in Set-Cookie server headers + if it's above the specified limit. +
++ Parameterized. +
++ The lifetime limit in minutes, or 0. +
++ This action reduces the lifetime of HTTP cookies coming + from the server to the specified number of minutes, + starting from the time the cookie passes Privoxy. +
++ Cookies with a lifetime below the limit are not modified. + The lifetime of session cookies is set to the specified + limit. +
++ The effect of this action depends on the server. +
++ In case of servers which refresh their cookies with each + response (or at least frequently), the lifetime limit set + by this action is updated as well. Thus, a session + associated with the cookie continues to work with this + action enabled, as long as a new request is made before the + last limit set is reached. +
++ However, some servers send their cookies once, with a + lifetime of several years (the year 2037 is a popular + choice), and do not refresh them until a certain event in + the future, for example the user logging out. In this case + this action may limit the absolute lifetime of the session, + even if requests are made frequently. +
++ If the parameter is "0", this + action behaves like session-cookies-only. +
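A minimal sketch of how the two parameter styles might be combined in one actions file. The 60-minute value and the host name are hypothetical choices for illustration, not recommendations.

```
# Cap cookie lifetimes at 60 minutes by default.
{+limit-cookie-lifetime{60}}
/

# For one (hypothetical) site, use 0 instead, which behaves like
# session-cookies-only according to the note above.
{+limit-cookie-lifetime{0}}
.shop.example.com
```

Since the second section comes later, its parameter replaces the default for matching requests.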
++
+
+++limit-cookie-lifetime{60} -
+ + 8.5.29. prevent-compression ++
+
|
+
Prevent yet another way to track the user's steps between - sessions.
-Deletes the "Last-Modified:" HTTP - server header or modifies its value.
-Parameterized.
-One of the keywords: "block", - "reset-to-request-time" and - "randomize"
-Removing the "Last-Modified:" - header is useful for filter testing, where you want to force a - real reload instead of getting status code "304", which would cause the browser to reuse - the old version of the page.
- -The "randomize" option overwrites - the value of the "Last-Modified:" - header with a randomly chosen time between the original value - and the current time. In theory the server could send each - document with a different "Last-Modified:" header to track visits without - using cookies. "Randomize" makes it - impossible and the browser can still revalidate cached - documents.
- -"reset-to-request-time" - overwrites the value of the "Last-Modified:" header with the current time. - You could use this option together with hide-if-modified-since - to further customize your random range.
The preferred parameter here is "randomize". It is safe to use, as long as the - time settings are more or less correct. If the server sets the - "Last-Modified:" header to the time - of the request, the random range becomes zero and the value - stays the same. Therefore you should later randomize it a - second time with hide-if-modified-since, - just to be sure.
- -It is also recommended to use this action together with - crunch-if-none-match.
-
- + |
+
Redirect requests to other sites.
-Convinces the browser that the requested document has been - moved to another location and the browser should get it from - there.
-Parameterized
-An absolute URL or a single pcrs command.
-Requests to which this action applies are answered with a - HTTP redirect to URLs of your choosing. The new URL is either - provided as parameter, or derived by applying a single pcrs - command to the original URL.
- -The syntax for pcrs commands is documented in the filter file section.
- -This action will be ignored if you use it together with - block. It can be combined - with fast-redirects{check-decoded-url} - to redirect to a decoded version of a rewritten URL.
- -Use this action carefully, make sure not to create - redirection loops and be aware that using your own redirects - might make it possible to fingerprint your requests.
- -In case of problems with your redirects, or simply to watch - them working, enable debug - 128.
-
- + |
+
+ Rewrite or remove single server headers. +
++ All server headers to which this action applies are + filtered on-the-fly through the specified regular + expression based substitutions. +
++ Multi-value. +
++ The name of a server-header filter, as defined in one of + the filter files. +
++ Server-header filters are applied to each header on its + own, not to all at once. This makes it easier to diagnose + problems, but on the downside you can't write filters that + only change header x if header y's value is z. You can do + that by using tags though. +
++ Server-header filters are executed after the other header + actions have finished and use their output as input. +
++ Please refer to the filter file + chapter to learn which server-header filters are + available by default, and how to create your own. +
++
+
++{+server-header-filter{html-to-xml}} +example.org/xml-instance-that-is-delivered-as-html -
+ + 8.5.33. server-header-tagger ++
+
|
+
Allow only temporary "session" - cookies (for the current browser session only).
-Deletes the "expires" field from - "Set-Cookie:" server headers. Most - browsers will not store such cookies permanently and forget - them in between sessions.
-Boolean.
-N/A
-This is less strict than crunch-incoming-cookies - / crunch-outgoing-cookies - and allows you to browse websites that insist or rely on - setting cookies, without compromising your privacy too - badly.
- -Most browsers will not permanently store cookies that have - been processed by session-cookies-only - and will forget about them between sessions. This makes - profiling cookies useless, but won't break sites which require - cookies so that you can log in for transactions. This is - generally turned on for all sites, and is the recommended - setting.
- -It makes no sense - at all to use session-cookies-only together with crunch-incoming-cookies - or crunch-outgoing-cookies. - If you do, cookies will be plainly killed.
- -Note that it is up to the browser how it handles such - cookies without an "expires" field. - If you use an exotic browser, you might want to try it out to - be sure.
- -This setting also has no effect on cookies that may have - been stored previously by the browser before starting - Privoxy. These would have to - be removed manually.
- -Privoxy also uses the - content-cookies - filter to block some types of cookies. Content cookies are - not affected by session-cookies-only.
-
- + |
+
Choose the replacement for blocked images
-This action alone doesn't do anything noticeable. If - both - block and handle-as-image - also - apply, i.e. if the request is to be blocked as an image, - then the - parameter of this action decides what will be sent as a - replacement.
-Parameterized.
-"pattern" to send a built-in - checkerboard pattern image. The image is visually decent, - scales very well, and makes it obvious where banners were - busted.
-"blank" to send a built-in - transparent image. This makes banners disappear completely, - but makes it hard to detect where Privoxy has blocked images on a given - page and complicates troubleshooting if Privoxy has blocked innocent images, - like navigation icons.
-"target-url" to send a - redirect to target-url. - You can redirect to any image anywhere, even in your local - filesystem via "file:///" URL. - (But note that not all browsers support redirecting to a - local file system).
- -A good application of redirects is to use special
- Privoxy-built-in URLs,
- which send the built-in images, as target-url. This has the same
- visual effect as specifying
+
+ Choose the replacement for blocked images
+
+ This action alone doesn't do anything noticeable. If both block and handle-as-image
+ also
+ apply, i.e. if the request is to be blocked as an image,
+ then
+ the parameter of this action decides what will be sent as a
+ replacement.
+
+ Parameterized.
+
+ "pattern" to send a built-in
+ checkerboard pattern image. The image is visually
+ decent, scales very well, and makes it obvious where
+ banners were busted.
+
+ "blank" to send a built-in
+ transparent image. This makes banners disappear
+ completely, but makes it hard to detect where Privoxy has blocked images
+ on a given page and complicates troubleshooting if
+ Privoxy has blocked
+ innocent images, like navigation icons.
+
+ "target-url" to send a
+ redirect to target-url. You can redirect
+ to any image anywhere, even in your local filesystem
+ via a "file:///" URL. (Note
+ that not all browsers support redirecting to a
+ local file system.)
+
+ A good application of redirects is to use special Privoxy-built-in URLs, which
+ send the built-in images, as target-url. This has the same
+ visual effect as specifying "blank" or "pattern" in the first place, but
+ enables your browser to cache the replacement image,
+ instead of requesting it over and over again.
+
+ The URLs for the built-in images are "http://config.privoxy.org/send-banner?type=type", where type is either "blank" or "pattern".
- "blank" or "pattern" in the first place, but enables
- your browser to cache the replacement image, instead of
- requesting it over and over again. The URLs for the built-in images are "http://config.privoxy.org/send-banner?type=type", where type is either "blank" or "pattern". There is a third (advanced) type, called "auto". It is NOT to be used in set-image-blocker, but meant for use from
- filters. Auto will select the
- type of image that would have applied to the referring page,
- had it been an image. Built-in pattern:
+ There is a third (advanced) type, called "auto". It is NOT to be used in set-image-blocker, but meant for use from filters. Auto will select the
+ type of image that would have applied to the referring
+ page, had it been an image.
+
+ Built-in pattern:
+
+ Redirect to the BSD daemon:
+ Redirect to the BSD daemon:
+
+ Redirect to the built-in pattern for better caching:
+ Redirect to the built-in pattern for better caching:
+
+
+ 8.5.35. set-image-blocker
+
+
+
+
+
+
-
-
-
+
-
+
+
+
+
-
-
+
-
+
-
+set-image-blocker{pattern}
-
-
-
-
+
-
+
+
+
-
-
+
-
+
-
+set-image-blocker{http://www.freebsd.org/gifs/dae_up3.gif}
-
-
+
-
+
-
+
+
+
-
-
+
+
-
+
-
+set-image-blocker{http://config.privoxy.org/send-banner?type=pattern}
-
+ Note that many of these actions have the potential to cause a + page to misbehave, possibly even not to display at all. Site + designers can build their sites in many different ways and may depend + on particular HTTP header content or other criteria, so there + is no way to have hard and fast rules for all sites. See the Appendix for a brief example + on troubleshooting actions. +
Note that many of these actions have the potential to cause a page - to misbehave, possibly even not to display at all. There are many - ways a site designer may choose to design his site, and what HTTP - header content, and other criteria, he may depend on. There is no way - to have hard and fast rules for all sites. See the Appendix for a brief example on - troubleshooting actions.
-Custom "actions", known to Privoxy as "aliases", - can be defined by combining other actions. These can in turn be invoked - just like the built-in actions. Currently, an alias name can contain - any character except space, tab, "=", - "{" and "}", but - we strongly - recommend that you only use "a" - to "z", "0" to - "9", "+", and - "-". Alias names are not case sensitive, and - are not required to start with a "+" or - "-" sign, since they are merely textually - expanded.
- -Aliases can be used throughout the actions file, but they - must be defined in a special - section at the top of the file! And there can only be one - such section per actions file. Each actions file may have its own alias - section, and the aliases defined in it are only visible within that - file.
- -There are two main reasons to use aliases: One is to save typing for - frequently used combinations of actions, the other one is a gain in - flexibility: If you decide once how you want to handle shops by - defining an alias called "shop", you can - later change your policy on shops in one place, and your changes will take effect - everywhere in the actions file where the "shop" alias is used. Calling aliases by their purpose - also makes your actions files more readable.
- -Currently, there is one big drawback to using aliases, though: - Privoxy's built-in web-based action - file editor honors aliases when reading the actions files, but it - expands them before writing. So the effects of your aliases are of - course preserved, but the aliases themselves are lost when you edit - sections that use aliases with it.
- -Now let's define some aliases...
- -
- + |
+
- +
Aliases like "shop" and "fragile" are typically used for "problem" sites that require more than one action to be - disabled in order to function properly. - - -
- 8.7. Actions - Files Tutorial- -The above chapters have shown which - actions files there are and how they are organized, how actions are - specified and applied to URLs, how patterns work, and how to define - and use aliases. Now, let's - look at an example match-all.action, - default.action and user.action file and see how all these pieces come - together: - -
- 8.7.1. - match-all.action- -Remember all actions - are disabled when matching starts, so we have to - explicitly enable the ones we want. - -While the match-all.action file only - contains a single section, it is probably the most important one. It - has only one pattern, "/", but this pattern matches all URLs. Therefore, the - set of actions used in this "default" - section will be applied to - all requests as a start. It can be partly or wholly - overridden by other actions files like default.action and user.action, but it will still be largely responsible - for your overall browsing experience. - -Again, at the start of matching, all actions are disabled, so - there is no need to disable any actions here. (Remember: a - "+" preceding the action name enables the - action, a "-" disables!). Also note how - this long line has been made more readable by splitting it into - multiple lines with line continuation. + |
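As a rough sketch, a match-all.action along these lines would fit the description above. The particular actions listed are an illustrative subset and an assumption on our part; the file shipped with your version may enable different ones:

```
# One section, one pattern. The backslashes continue the line,
# so the whole block is read as a single list of actions.
{ \
 +change-x-forwarded-for{block} \
 +hide-from-header{block} \
 +set-image-blocker{pattern} \
}
/ # Matches all URLs
```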
+
- + |
+
- + |
+
- +
The first of our specialized sections is concerned with - "fragile" sites, i.e. sites that require - minimum interference, because they are either very complex or very - keen on tracking you (and have mechanisms in place that make them - unusable for people who avoid being tracked). We will simply use our - pre-defined fragile alias instead of stating - the list of actions explicitly: + |
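A short sketch of such a section, assuming a fragile alias has been defined in the alias section of the same file. The host names are hypothetical placeholders:

```
# Apply the whole "fragile" policy to sites that need minimum interference.
{ fragile }
.office.example.com
mail.example.org
```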
+
- +
Shopping sites are not as fragile, but they typically require - cookies to log in, and pop-up windows for shopping carts or item - details. Again, we'll use a pre-defined alias: + |
+
- +
The fast-redirects action, - which may have been enabled in match-all.action, breaks some sites. So disable it - for popular sites where we know it misbehaves: + |
+
- +
It is important that Privoxy - knows which URLs belong to images, so that if they are to be blocked, - a substitute image can be sent, rather than an HTML page. Contacting - the remote site to find out is not an option, since it would destroy - the loading time advantage of banner blocking, and it would feed the - advertisers information about you. We can mark any URL as an image - with the handle-as-image action, - and marking all URLs that end in a known image file extension is a - good start: + |
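A sketch of how this could be written. The path part of a pattern is a regular expression, so one line can cover the common image extensions (the exact list of extensions is our choice for this example):

```
# Treat every URL whose path ends in a known image extension as an image.
{ +handle-as-image }
/.*\.(gif|jpe?g|png|bmp|ico)$
```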
+
- +
And then there are known banner sources. They often use scripts to - generate the banners, so it won't be visible from the URL that the - request is for an image. Hence we block them and mark them as images in - one go, with the help of our +block-as-image - alias defined above. (We could of course just as well use +block +handle-as-image here.) - Remember that the type of the replacement image is chosen by the - set-image-blocker - action. Since all URLs have matched the default section with its - +set-image-blocker{pattern} - action before, it still applies and needn't be repeated: + |
+
- +
One of the most important jobs of Privoxy is to block banners. Many of these can - be "blocked" by the filter{banners-by-size} action, - which we enabled above, and which deletes the references to banner - images from the pages while they are loaded, so the browser doesn't - request them anymore, and hence they don't need to be blocked here. - But this naturally doesn't catch all banners, and some people choose - not to use filters, so we need a comprehensive list of patterns for - banner URLs here, and apply the block action to them. - -First come many generic patterns, which do most of the work, by - matching typical domain and path name components of banners. Then - comes a list of individual patterns for specific sites, which is - omitted here to keep the example short: + |
+
- +
It's quite remarkable how many advertisers actually call their - banner servers ads.company.com, - or call the directory in which the banners are stored simply - "banners". So the above generic patterns - are surprisingly effective. - -But being very generic, they necessarily also catch URLs that we - don't want to block. The pattern .*ads., for example, - catches "nasty-ads.nasty-corp.com" as intended, but - also "downloads.sourceforge.net" or "adsl.some-provider.net". So here come - some well-known exceptions to the +block section above. - -Note that these are exceptions to exceptions from the default! - Consider the URL "downloads.sourceforge.net": Initially, all actions - are deactivated, so it wouldn't get blocked. Then comes the defaults - section, which matches the URL, but just deactivates the block action - once again. Then it matches .*ads., an - exception to the general non-blocking policy, and suddenly +block applies. - And now, it'll match .*loads., where - -block - applies, so (unless it matches again further down) it ends up with no - block - action applying. + |
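The interplay of the generic block patterns and their exceptions can be sketched as two consecutive sections. Only the patterns named in the text are shown, and the block reason is a made-up example string:

```
# Generic pattern: any host with a domain component ending in "ads".
{ +block{Generic ad host pattern.} }
.*ads.

# Exceptions, listed later in the file so that they win for matching URLs.
{ -block }
.*loads.   # downloads, uploads etc.
adsl.      # has nothing to do with ads
```

Sections are applied in file order, which is why the exceptions have to follow the section they correct.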
+
- +
Filtering source code can have nasty side effects, so make an - exception for our friends at sourceforge.net, and all paths with - "cvs" in them. Note that -filter - disables all - filters in one fell swoop! + |
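A sketch of such an exception section. The path pattern for "cvs" is one possible way to write it and should be read as an assumption:

```
# A bare -filter switches off every filter for the matching URLs.
{ -filter }
.sourceforge.net
/(.*/)?cvs
```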
+
- +
The actual default.action is of course - much more comprehensive, but we hope this example made clear how it - works. - - -
- 8.7.3. - user.action- -So far we are painting with a broad brush by setting general - policies, which would be a reasonable starting point for many people. - Now, you might want to be more specific and have customized rules - that are more suitable to your personal habits and preferences. These - would be for narrowly defined situations like your ISP or your bank, - and should be placed in user.action, which - is parsed after all other actions files and hence has the last word, - over-riding any previously defined actions. user.action is also a safe place for your - personal settings, since default.action is - actively maintained by the Privoxy - developers and you'll probably want to install updated versions from - time to time. - -So let's look at a few examples of things that one might typically - do in user.action: + |
+
- + |
+
- +
Say you have accounts on some sites that you visit regularly, and - you don't want to have to log in manually each time. So you'd like to - allow persistent cookies for these sites. The allow-all-cookies alias defined above does exactly - that, i.e. it disables crunching of cookies in any direction, and the - processing of cookies to make them only temporary. - -
+ + + Say you have accounts on some sites that you visit regularly, and + you don't want to have to log in manually each time. So you'd + like to allow persistent cookies for these sites. The allow-all-cookies alias defined above does exactly + that, i.e. it disables crunching of cookies in any direction, and + the processing of cookies to make them only temporary. + ++ +
Your bank is allergic to some filter, but you don't know which, so - you disable them all: + |
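For instance, with a placeholder host name standing in for the bank:

```
# Disable all filters for the bank, whichever one it objects to.
{ -filter }
.your-home-banking-site.example
```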
+
- +
Some file types you may not want to filter for various - reasons: + |
+
- +
Example of a simple block - action. Say you've seen an ad on your favourite page on example.com - that you want to get rid of. You have right-clicked the image, - selected "copy image location" and pasted - the URL below while removing the leading http://, into a { +block{} } section. Note that { - +handle-as-image } need not be specified, since all URLs ending - in .gif will be tagged as images by the - general rules as set in default.action anyway: + |
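A sketch of the resulting section. The path is a hypothetical example, and the text in braces is a free-form reason of our choosing:

```
# The pasted image URL, without the leading http://.
# The dot before "gif" is escaped because the path is a regular expression.
{ +block{Nasty ads.} }
www.example.com/nasty-ads/sponsor\.gif
```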
+
- +
The URLs of dynamically generated banners, especially from large - banner farms, often don't use the well-known image file name - extensions, which makes it impossible for Privoxy to guess the file type just by looking - at the URL. You can use the +block-as-image - alias defined above for these cases. Note that objects which match - this rule but then turn out NOT to be an image are typically rendered - as a "broken image" icon by the browser. - Use cautiously. + |
+
- +
Now you noticed that the default configuration breaks Forbes - Magazine, but you were too lazy to find out which action is the - culprit, and you were again too lazy to give feedback, so you just used the fragile alias on the site, and -- whoa! -- it worked. The - fragile alias disables those actions that - are most likely to break a site. It is also good for testing purposes, to - see whether it is Privoxy that is causing - the problem or not. We later find other regular sites that misbehave, - and add those to our personalized list of troublemakers: + |
+
- +
You like the "fun" text replacements in - default.filter, but they are disabled in the - distributed actions file. So you'd like to turn them on in your - private, update-safe config, once and for all: + |
+
- +
Note that the above is not really a good idea: There are - exceptions to the filters in default.action - for things that really shouldn't be filtered, like code on - CVS->Web interfaces. Since user.action - has the last word, these exceptions won't be valid for the - "fun" filtering specified here. - -You might also worry about how your favourite free websites are - funded, and find that they rely on displaying banner advertisements - to survive. So you might want to specifically allow banners for those - sites that you feel provide value to you: + |
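A sketch of how this might look, assuming an allow-ads alias has been defined in the alias section of user.action. The definition shown in the comment and the choice of site are illustrative:

```
# Assumed alias definition, placed in the {{alias}} section at the top:
#   allow-ads = -block -filter{banners-by-size} -filter{banners-by-link}

# Let banners through on sites we want to support.
{ allow-ads }
.sourceforge.net
```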
+
- +
Note that allow-ads has been aliased to - -block, -filter{banners-by-size}, - and -filter{banners-by-link} - above. - -Invoke another alias here to force an over-ride of the MIME type - application/x-sh which typically would open - a download type dialog. In my case, I want to look at the shell - script, and then I can save it should I choose to. + |
+
- +
user.action is generally the best place - to define exceptions and additions to the default policies of - default.action. Some actions are safe to - have their default policies set here though. So let's set a default - policy to have a "blank" image as opposed - to the checkerboard pattern for ALL sites. "/" of - course matches all URL paths and patterns: + |
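In its simplest form, this could look as follows:

```
# Send a transparent image instead of the checkerboard pattern.
{ +set-image-blocker{blank} }
/ # ALL sites
```

This only takes effect for requests that are also blocked and handled as images; on its own, set-image-blocker does nothing noticeable.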
+
- +
|
+