X-Git-Url: http://www.privoxy.org/gitweb/?p=privoxy.git;a=blobdiff_plain;f=doc%2Fwebserver%2Fuser-manual%2Factions-file.html;h=79744ea43efc3d0141900cd5da216ed9230b37f8;hp=17f6e835725614bf6c13482f25ea90e2eec84800;hb=6de6bda5b29cbb5a8aef6863c1b5ca999ab4887b;hpb=94a4fa45e28600ad0e1e0455bda3ee4fd0303ffe

diff --git a/doc/webserver/user-manual/actions-file.html b/doc/webserver/user-manual/actions-file.html
index 17f6e835..79744ea4 100644
--- a/doc/webserver/user-manual/actions-file.html
+++ b/doc/webserver/user-manual/actions-file.html
@@ -1,1660 +1,1116 @@
- + -
-- Privoxy 3.0.26 User Manual - | -
---|
Feature | +Cautious | +Medium | +Advanced | +
---|---|---|---|
Ad-blocking Aggressiveness | +medium | +high | +high | +
Ad-filtering by size | +no | +yes | +yes | +
Ad-filtering by link | +no | +no | +yes | +
Pop-up killing | +blocks only | +blocks only | +blocks only | +
Privacy Features | +low | +medium | +medium/high | +
Cookie handling | +none | +session-only | +kill | +
Referer forging | +no | +yes | +yes | +
GIF de-animation | +no | +yes | +yes | +
Fast redirects | +no | +no | +yes | +
HTML taming | +no | +no | +yes | +
JavaScript taming | +no | +no | +yes | +
Web-bug killing | +no | +yes | +yes | +
Image tag reordering | +no | +yes | +yes | +
The list of actions files to be used is defined in the main + configuration file, and the files are processed in the order they are defined (e.g. + default.action is typically processed before + user.action). The content of these can all be + viewed and edited from http://config.privoxy.org/show-status. The overriding + principle when applying actions is that the last action that matches a + given URL wins. The broadest, most general rules go first (defined in + default.action), followed by any exceptions + (typically also in default.action), which are + then followed lastly by any local preferences (typically in user.action). Generally, user.action has the last word.
+An actions file typically has multiple sections. If you want to use + "aliases" in an actions file, you have to + place the (optional) alias + section at the top of that file. Then comes the default set of rules + which will apply universally to all sites and pages (be very careful with using such a + universal set in user.action or any other + actions file after default.action, because it + will override the result from consulting any previous file). And then + below that, exceptions to the defined universal policies. You can regard + user.action as an appendix to default.action, with the advantage that it is a separate + file, which makes preserving your personal settings across Privoxy upgrades easier.
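As a rough sketch of the load order described above, the main configuration file might list the actions files like this. The file names are the ones this manual mentions (match-all.action is described further down in this diff); the exact set and order in a given installation is an assumption and should be checked against the local configuration file.

```
# Main Privoxy configuration file (sketch, not a complete config).
# Actions files are consulted in the order listed, so a matching
# setting in a later file overrules one from an earlier file.
actionsfile match-all.action
actionsfile default.action
actionsfile user.action
```

Keeping user.action last is what gives local preferences the final word.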
+Actions can be used to block anything you want, including ads, + banners, or just some obnoxious URL whose content you would rather not + see. Cookies can be accepted or rejected, or accepted only during the + current browser session (i.e. not written to disk), content can be + modified, some JavaScripts tamed, user-tracking fooled, and much more. + See below for a complete list of + actions.
+Note that some actions, like + cookie suppression or script disabling, may render some sites unusable + that rely on these techniques to work properly. Finding the right mix + of actions is not always easy and certainly a matter of personal taste. + And, things can always change, requiring refinements in the + configuration. In general, it can be said that the more "aggressive" your default settings (in the top section + of the actions file) are, the more exceptions for "trusted" sites you will have to make later. If, for + example, you want to crunch all cookies per default, you'll have to + make exceptions from that rule for sites that you regularly use and + that require cookies for actually useful purposes, like maybe your + bank, favorite shop, or newspaper.
+We have tried to provide you with reasonable rules to start from in + the distribution actions files. But there is no general rule of thumb + on these things. There just are too many variables, and sites are + constantly changing. Sooner or later you will want to change the rules + (and read this chapter again :).
+The easiest way to edit the actions files is with a browser by using + our browser-based editor, which can be reached from http://config.privoxy.org/show-status. Note: the config file + option enable-edit-actions must be + enabled for this to work. The editor allows both fine-grained control + over every single feature on a per-URL basis, and easy choosing from + wholesale sets of defaults like "Cautious", + "Medium" or "Advanced". Warning: the "Advanced" setting is more aggressive, and will be more + likely to cause problems for some sites. Experienced users only!
+If you prefer plain text editing to GUIs, you can of course also + directly edit the actions files with your favorite text editor. + Look at default.action, which is richly + commented with many good examples.
+Actions files are divided into sections. There are special sections, + like the "alias" sections which will be + discussed later. For now let's concentrate on regular sections: They + have a heading line (often split across multiple lines for readability) + which consists of a list of actions, separated by whitespace and + enclosed in curly braces. Below that, there is a list of URL and tag + patterns, each on a separate line.
+To determine which actions apply to a request, the URL of the + request is compared to all URL patterns in each "action file". Every time it matches, the list of + applicable actions for the request is incrementally updated, using the + heading of the section in which the pattern is located. The same is + done again for tags and tag patterns later on.
+If multiple applying sections set the same action differently, the + last match wins. If not, the effects are aggregated. E.g. a URL might + match a regular section with a heading line of { + +handle-as-image + }, then later another one with just { + +block }, with the result that + both actions + apply. And there may well be cases where you will want to combine + actions together. Such a section then might look like:
+- Prev - | -- | -- Next + |
+ { +handle-as-image +block{Banner ads.} } + # Block these as if they were images. Send no block page. + banners.example.com + media.example.com/.*banners + .example.com/images/ads/ |
You can trace this process for URL patterns and any given URL by + visiting http://config.privoxy.org/show-url-info.
+Examples and more detail on this is provided in the Appendix, + Troubleshooting: Anatomy of an + Action section.
+As mentioned, Privoxy uses + "patterns" to determine what actions might apply to which + sites and pages your browser attempts to access. These "patterns" use wild card type pattern matching to achieve a + high degree of flexibility. This allows one expression to be expanded + and potentially match against many similar patterns.
+Generally, a URL pattern has the form <host><port>/<path>, where the + <host>, the <port> and the <path> are optional. (This is why the special + / pattern matches all URLs). Note that the + protocol portion of the URL pattern (e.g. http://) should not be included in the pattern. This is assumed + already!
+The pattern matching syntax is different for the host and path parts + of the URL. The host part uses a simple globbing type matching + technique, while the path part uses more flexible "Regular Expressions" (POSIX + 1003.2).
+The port part of a pattern is a decimal port number preceded by a + colon (:). If the host part contains a + numerical IPv6 address, it has to be put into angle brackets + (<, >).
+is a host-only pattern and will match any request to + www.example.com, regardless of which + document on that server is requested. So ALL pages in this domain + would be covered by the scope of this action. Note that a simple + example.com is different and would NOT + match.
+means exactly the same. For host-only patterns, the trailing + / may be omitted.
+matches all the documents on www.example.com whose name starts with /index.html.
+matches only the single document /index.html on www.example.com.
+matches the document /index.html, + regardless of the domain, i.e. on any web server + anywhere.
+Matches any URL because there's no requirement for either the + domain or the path to match anything.
+Matches any URL pointing to TCP port 8000.
+Matches any URL with the host address 10.0.0.1. (Note that the real URL uses plain + brackets, not angle brackets.)
+Matches any URL with the host address 2001:db8::1. (Note that the real URL uses plain + brackets, not angle brackets.)
+matches nothing, since it would be interpreted as a domain + name and there is no top-level domain called .html. So it's a mistake.
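The pattern forms described above can be sketched as lines in an actions file. The patterns below are reconstructed from the descriptions, and the section heading with its block reason is purely illustrative; this is a hedged sketch, not a recommended rule set.

```
# Illustration only: every URL matched by a pattern in this section
# would be blocked.
{+block{Pattern illustration only.}}
# Host-only pattern: any document on www.example.com
www.example.com/
# Same meaning; for host-only patterns the trailing slash may be omitted
www.example.com
# Documents on www.example.com whose path starts with /index.html
www.example.com/index.html
# Only the single document /index.html on www.example.com
www.example.com/index.html$
# The document /index.html on any host
/index.html$
# Any URL pointing to TCP port 8000
:8000/
# Any URL with the host address 10.0.0.1
10.0.0.1/
# Numerical IPv6 addresses go into angle brackets
<2001:db8::1>/
# Not listed on purpose: a bare / would match every URL, and
# index.html without a slash would be read as a domain name.
```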
+The matching of the host part offers some flexible options: if the + host pattern starts or ends with a dot, it becomes unanchored at that + end. The host pattern is often referred to as domain pattern as it is + usually used to match domain names and not IP addresses. For + example:
+matches any domain with first-level domain com and second-level domain example. For example www.example.com, example.com and foo.bar.baz.example.com. Note that it wouldn't + match if the second-level domain was another-example.
+matches any domain that STARTS with www. + (It also matches the domain www but + most of the time that doesn't matter.)
+matches any domain that CONTAINS .example.. And, by the way, also included would + be any files or documents that exist within that domain since + no path limitations are specified. (Correctly speaking: It + matches any FQDN that contains example + as a domain.) This might be www.example.com, news.example.de, or www.example.net/cgi/testing.pl for instance. All + these cases are matched.
+Additionally, there are wild-cards that you can use in the domain + names themselves. These work similarly to shell globbing type + wild-cards: "*" represents zero or more + arbitrary characters (this is equivalent to the "Regular Expression" based + syntax of ".*"), "?" represents any single character (this is + equivalent to the regular expression syntax of a simple "."), and you can define "character classes" in square brackets which is + similar to the same regular expression technique. All of this can be + freely mixed:
+matches "adserver.example.com", + "ads.example.com", etc but not + "sfads.example.com"
+matches all of the above, and then some.
+matches www.ipix.com, pictures.epix.com, a.b.c.d.e.upix.com etc.
+matches www1.example.com, + www4.example.cc, wwwd.example.cy, wwwz.example.com etc., but not wwww.example.com.
+While flexible, this is not the sophistication of full regular + expression based syntax.
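The anchoring and wildcard rules for the host part can be sketched as follows. The patterns are reconstructed from the descriptions above and the block reason is a placeholder, so read this as an illustration under those assumptions.

```
{+block{Domain pattern illustration only.}}
# Leading dot, unanchored on the left:
# example.com, www.example.com, foo.bar.baz.example.com
.example.com
# Trailing dot, unanchored on the right: any domain starting with www.
www.
# Unanchored on both ends: any domain that contains .example.
.example.
# adserver.example.com and ads.example.com, but not sfads.example.com
ad*.example.com
# All of the above, plus names such as sfads.example.com
*ad*.example.com
# www.ipix.com, pictures.epix.com, a.b.c.d.e.upix.com
.?pix.com
# www1.example.com, www4.example.cc, wwwd.example.cy, wwwz.example.com,
# but not wwww.example.com
www[1-9a-ez].example.c*
```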
+Privoxy uses "modern" POSIX 1003.2 "Regular Expressions" for + matching the path portion (after the slash), and is thus more + flexible.
+There is an Appendix with a + brief quick-start into regular expressions. You might also want to + have a look at your operating system's documentation on regular + expressions (try man re_format).
+Note that the path pattern is automatically left-anchored at the + "/", i.e. it matches as if it started + with a "^" (regular expression speak for + the beginning of a line).
+Please also note that matching in the path is CASE INSENSITIVE by + default, but you can switch to case sensitive at any point in the + pattern by using the "(?-i)" switch: + www.example.com/(?-i)PaTtErN.* will match + only documents whose path starts with PaTtErN in exactly this capitalization.
+Is equivalent to just ".example.com", since any documents within that + domain are matched with or without the ".*" regular expression. The ".*" is therefore redundant.
+Will match any page in the domain of "example.com" that is named "index.html", and that is part of some path. For + example, it matches "www.example.com/testing/index.html" but NOT + "www.example.com/index.html" because + the regular expression called for at least two "/'s", thus the path requirement. It also would + match "www.example.com/testing/index_html", because of + the special meta-character ".".
+This regular expression is conditional so it will match any + page named "index.html" regardless + of path which in this case can have one or more "/'s". And this one must contain exactly + ".html" (but does not have to end + with that!).
+This regular expression will match any path of "example.com" that contains any of the words + "ads", "banner", "banners" + (because of the "?") or "junk". The path does not have to end in these + words, just contain them.
+This is very much the same as above, except now it must end + in either ".jpg", ".jpeg", ".gif" or + ".png". So this one is limited to + common image formats.
+There are many, many good examples to be found in default.action, and more tutorials below in Appendix on regular expressions.
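A hedged sketch of the path pattern examples discussed above, written out as an actions-file section. The patterns are reconstructed from the descriptions (including the statements about which ones have to end with the given text), and the block reason is a placeholder.

```
{+block{Path pattern illustration only.}}
# Same as .example.com alone; the .* adds nothing
.example.com/.*
# index.html below at least one directory; the unescaped dot also
# lets this match index_html
.example.com/.*/index.html
# index.html at any depth, with a literal dot before html;
# the path does not have to end there
.example.com/(.*/)?index\.html
# Paths that contain ads, banner, banners or junk
.example.com/(.*/)(ads|banners?|junk)
# As above, but the path must end in a common image extension
.example.com/(.*/)(ads|banners?|junk)/.*\.(jpe?g|gif|png)$
# Case-sensitive from the (?-i) switch onwards
www.example.com/(?-i)PaTtErN.*
```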
+Request tag patterns are used to change the applying actions based + on the request's tags. Tags can be created based on HTTP headers with + either the client-header-tagger or + the server-header-tagger + action.
+Request tag patterns have to start with "TAG:", so Privoxy + can tell them apart from other patterns. Everything after the colon + including white space, is interpreted as a regular expression with + path pattern syntax, except that tag patterns aren't left-anchored + automatically (Privoxy doesn't + silently add a "^", you have to do it + yourself if you need it).
+To match all requests that are tagged with "foo" your pattern line should be "TAG:^foo$". "TAG:foo" + would work as well, but it would also match requests whose tags + contain "foo" somewhere. "TAG: foo" wouldn't work as it requires white + space.
+Sections can contain URL and request tag patterns at the same + time, but request tag patterns are checked after the URL patterns and + thus always overrule them, even if they are located before the URL + patterns.
+Once a new request tag is added, Privoxy checks right away if it's + matched by one of the request tag patterns and updates the action + settings accordingly. As a result request tags can be used to + activate other tagger actions, as long as these other taggers look + for headers that haven't already been parsed.
+For example you could tag client requests which use the POST method, then use this tag to activate another + tagger that adds a tag if cookies are sent, and then use a block + action based on the cookie tag. This allows the outcome of one + action, to be input into a subsequent action. However if you'd + reverse the position of the described taggers, and activated the + method tagger based on the cookie tagger, no method tags would be + created. The method tagger would look for the request line, but at + the time the cookie tag is created, the request line has already been + parsed.
+While this is a limitation you should be aware of, this kind of + indirection is seldom needed anyway and even the example doesn't make + too much sense.
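To make the interplay of taggers and tag patterns more concrete, here is a minimal sketch. It assumes a client-header tagger named user-agent is defined in one of the filter files and that it uses the complete User-Agent header as the tag; both the tagger name and the tag text are assumptions for illustration.

```
# Tag every request with its User-Agent header.
# Assumes a client-header tagger called "user-agent" exists.
{+client-header-tagger{user-agent}}
/

# Tagging alone changes nothing; sections with TAG patterns do.
# Tag patterns are not left-anchored automatically, hence the ^.
{-hide-user-agent}
TAG:^User-Agent: fetch libfetch/
```

Because tag patterns are checked after URL patterns, the second section would overrule an earlier URL-based +hide-user-agent for requests carrying that tag.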
+To match requests that do not have a certain request tag, specify + a negative tag pattern by prefixing the tag pattern line with either + "NO-REQUEST-TAG:" or "NO-RESPONSE-TAG:" instead of "TAG:".
+Negative request tag patterns created with "NO-REQUEST-TAG:" are checked after all client headers + are scanned, the ones created with "NO-RESPONSE-TAG:" are checked after all server + headers are scanned. In both cases all the created tags are + considered.
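A negative tag pattern could be used along these lines. The sketch assumes a client-header tagger named user-agent is defined in a filter file and produces tags that start with "User-Agent:"; the block reason is a placeholder.

```
# Assumes the "user-agent" tagger exists and is enabled for all
# requests, so any request with a User-Agent header gets a tag.
{+client-header-tagger{user-agent}}
/

# Checked after all client headers are scanned: applies only if no
# tag starting with "User-Agent:" was created for the request.
{+block{Request carried no User-Agent header.}}
NO-REQUEST-TAG:^User-Agent:
```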
+Warning | +
+ This is an experimental feature. The syntax is likely to + change in future versions. + |
+
Client tag patterns are not set based on HTTP headers but based on + the client's IP address. Users can enable them themselves, but the + Privoxy admin controls which tags are available and what their effect + is.
+After a client-specific tag has been defined with the client-specific-tag directive, + action sections can be activated based on the tag by using a + CLIENT-TAG pattern. The CLIENT-TAG pattern is evaluated at the same + priority as URL patterns; as a result, the last matching pattern wins. + Tags that are created based on client or server headers are evaluated + later on and can overrule CLIENT-TAG and URL patterns!
+The tag is set for all requests that come from clients that + requested it to be set. Note that "clients" are differentiated by IP + address; if the IP address changes, the tag has to be requested + again.
+Clients can request tags to be set by using the CGI interface + http://config.privoxy.org/client-tags.
+Example:
+
+ + # If the admin defined the client-specific-tag circumvent-blocks, +# and the request comes from a client that previously requested +# the tag to be set, overrule all previous +block actions that +# are enabled based on URL to CLIENT-TAG patterns. +{-block} +CLIENT-TAG:^circumvent-blocks$ + +# This section is not overruled because it's located after +# the previous one. +{+block{Nobody is supposed to request this.}} +example.org/blocked-example-page+ |
+
- The actions files are used to define what actions Privoxy takes for which URLs, and thus - determines how ad images, cookies and various other aspects of HTTP - content and transactions are handled, and on which sites (or even - parts thereof). There are a number of such actions, with a wide range - of functionality. Each action does something a little different. - These actions give us a veritable arsenal of tools with which to - exert our control, preferences and independence. Actions can be - combined so that their effects are aggregated when applied against a - given set of URLs. -
-- There are three action files included with Privoxy with differing purposes: -
--
+All actions are disabled by default, until they are explicitly + enabled somewhere in an actions file. Actions are turned on if preceded + with a "+", and turned off if preceded with + a "-". So a +action + means "do that action", e.g. +block means "please block URLs that + match the following patterns", and -block means "don't block URLs that + match the following patterns, even if +block + previously applied."
+Again, actions are invoked by placing them on a line, enclosed in + curly braces and separated by whitespace, like in {+some-action -some-other-action{some-parameter}}, + followed by a list of URL patterns, one per line, to which they apply. + Together, the actions line and the following pattern lines make up a + section of the actions file.
+Actions fall into three categories:
- match-all.action - is used to define - which "actions" relating to - banner-blocking, images, pop-ups, content modification, cookie - handling etc should be applied by default. It should be the first - actions file loaded -
+Boolean, i.e. the action can only be "enabled" or "disabled". + Syntax:
+
+ +name # enable action name + -name # disable action name+ |
+
Example: +handle-as-image
- default.action - defines many - exceptions (both positive and negative) from the default set of - actions that's configured in match-all.action. It is a set of rules that - should work reasonably well as-is for most users. This file is - only supposed to be edited by the developers. It should be the - second actions file loaded. -
+Parameterized, where some value is required in order to enable + this type of action. Syntax:
+
+ +name{param} # enable action and set parameter to param, + # overwriting parameter from previous match if necessary + -name # disable action. The parameter can be omitted+ |
+
Note that if the URL matches multiple positive forms of a + parameterized action, the last match wins, i.e. the params from + earlier matches are simply ignored.
+Example: +hide-user-agent{Mozilla/5.0 (X11; + U; FreeBSD i386; en-US; rv:1.8.1.4) Gecko/20070602 + Firefox/2.0.0.4}
- user.action - is intended to be for - local site preferences and exceptions. As an example, if your ISP - or your bank has specific requirements, and need special - handling, this kind of thing should go here. This file will not - be upgraded. -
+Multi-value. These look exactly like parameterized actions, but + they behave differently: If the action applies multiple times to + the same URL, but with different parameters, all the parameters from + all matches + are remembered. This is used for actions that can be executed for + the same request repeatedly, like adding multiple headers, or + filtering through multiple filters. Syntax:
+
+ +name{param} # enable action and add param to the list of parameters + -name{param} # remove the parameter param from the list of parameters + # If it was the last one left, disable the action. + -name # disable this action completely and remove all parameters from the list+ |
+
Examples: +add-header{X-Fun-Header: Some + text} and +filter{html-annoyances}
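The difference between parameterized and multi-value actions can be sketched as follows. The User-Agent strings and host names are made up, and the filter name js-annoyances is assumed to be defined in a filter file alongside html-annoyances.

```
# Parameterized action: the last matching section wins.
{+hide-user-agent{Example-Agent/1.0}}
.example.com

{+hide-user-agent{Example-Agent/2.0}}
www.example.com
# Requests to www.example.com are sent with Example-Agent/2.0 only.

# Multi-value action: parameters from all matching sections accumulate.
{+filter{html-annoyances}}
.example.com

{+filter{js-annoyances}}
www.example.com
# Both filters now apply to www.example.com.

# Remove a single parameter again for part of the site.
{-filter{js-annoyances}}
www.example.com/plain/
```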
- Edit Set - to Cautious Set to Medium - Set to Advanced -
-- These have increasing levels of aggressiveness and have no influence on your - browsing unless you select them explicitly in the - editor. A default installation should be pre-set to - Cautious. New users should try this for - a while before adjusting the settings to more aggressive levels. - The more aggressive the settings, then the more likelihood there - is of problems such as sites not working as they should. -
-- The Edit button allows you to turn - each action on/off individually for fine-tuning. The Cautious button changes the actions list to - low/safe settings which will activate ad blocking and a minimal - set of Privoxy's features, and - subsequently there will be less of a chance for accidental - problems. The Medium button sets - the list to a medium level of other features and a low level set - of privacy features. The Advanced - button sets the list to a high level of ad blocking and medium - level of privacy. See the chart below. The latter three buttons - over-ride any changes via with the Edit button. More fine-tuning can be done in - the lower sections of this internal page. -
-- While the actions file editor allows to enable these settings in - all actions files, they are only supposed to be enabled in the - first one to make sure you don't unintentionally overrule earlier - rules. -
-- The default profiles, and their associated actions, as - pre-defined in default.action are: -
--
-- Table 1. Default Configurations -
-- Feature - | -- Cautious - | -- Medium - | -- Advanced - | -
---|
- Ad-blocking Aggressiveness - | -- medium - | -- high - | -
- high
+ + # Add a DNT ("Do not track") header to all requests, +# even to those that already have one. +# +# This is just an example, not a recommendation. +# +# There is no reason to believe that user-tracking websites care +# about the DNT header and depending on the User-Agent, adding the +# header may make user-tracking easier. +{+add-header{DNT: 1}} +/ |
Block ads or other unwanted content
+Requests for URLs to which this action applies are blocked, + i.e. the requests are trapped by Privoxy and the requested URL is never + retrieved, but is answered locally with a substitute page or + image, as determined by the handle-as-image, + set-image-blocker, + and handle-as-empty-document + actions.
+Parameterized.
+A block reason that should be given to the user.
+Privoxy sends a special + "BLOCKED" page for requests to + blocked pages. This page contains the block reason given as + parameter, a link to find out why the block action applies, and + a click-through to the blocked content (the latter only if the + force feature is available and enabled).
+A very important exception occurs if both block and handle-as-image + apply to the same request: the block page will then be replaced by an + image. If set-image-blocker + (see below) also applies, the type of image will be determined + by its parameter; if not, the standard checkerboard pattern is + sent.
+It is important to understand this process, in order to + understand how Privoxy deals + with ads and other unwanted content. Blocking is a core + feature, and one upon which various other features depend.
+The filter action can perform a + very similar task, by "blocking" + banner images and other content through rewriting the relevant + URLs in the document's HTML source, so they don't get requested + in the first place. Note that this is a totally different + technique, and it's easy to confuse the two.
+- Ad-filtering by size - | -- no - | -- yes - | -
- yes
+ {+block{No nasty stuff for you.}} +# Block and replace with "blocked" page + .nasty-stuff.example.com + +{+block{Doubleclick banners.} +handle-as-image} +# Block and replace with image + .ad.doubleclick.net + .ads.r.us/banners/ + +{+block{Layered ads.} +handle-as-empty-document} +# Block and then ignore + adserver.example.net/.*\.js$ |
Improve privacy by not forwarding the source of the request + in the HTTP headers.
+Deletes the "X-Forwarded-For:" + HTTP header from the client request, or adds a new one.
+Parameterized.
+"block" to delete the + header.
+"add" to create the header + (or append the client's IP address to an already existing + one).
+It is safe and recommended to use block.
+Forwarding the source address of the request may make sense + in some multi-user setups but is also a privacy risk.
+- Ad-filtering by link - | -- no - | -- no - | -
- yes
+ +change-x-forwarded-for{block} |
Rewrite or remove single client headers.
+All client headers to which this action applies are filtered + on-the-fly through the specified regular expression based + substitutions.
+Multi-value.
+The name of a client-header filter, as defined in one of the + filter files.
+Client-header filters are applied to each header on its own, + not to all at once. This makes it easier to diagnose problems, + but on the downside you can't write filters that only change + header x if header y's value is z. You can do that by using + tags though.
+Client-header filters are executed after the other header + actions have finished and use their output as input.
+If the request URI gets changed, Privoxy will detect that and use the new + one. This can be used to rewrite the request destination behind + the client's back, for example to specify a Tor exit relay for + certain requests.
+Please refer to the filter file + chapter to learn which client-header filters are available + by default, and how to create your own.
+- Pop-up killing - | -- blocks only - | -- blocks only - | -
- blocks only
+ + # Hide Tor exit notation in Host and Referer Headers +{+client-header-filter{hide-tor-exit-notation}} +/ + |
Block requests based on their headers.
+Client headers to which this action applies are filtered + on-the-fly through the specified regular expression based + substitutions, the result is used as tag.
+Multi-value.
+The name of a client-header tagger, as defined in one of the + filter files.
+Client-header taggers are applied to each header on its own, + and as the header isn't modified, each tagger "sees" the original.
+Client-header taggers are the first actions that are + executed and their tags can be used to control every other + action.
+- Privacy Features - | -- low - | -- medium - | -- medium/high - | -
- Cookie handling - | -- none - | -- session-only - | -- kill - | -
- Referer forging - | -- no - | -- yes - | -- yes - | -
- GIF de-animation - | -- no - | -- yes - | -- yes - | -
- Fast redirects - | -- no - | -- no - | -- yes - | -
- HTML taming - | -- no - | -- no - | -- yes - | -
- JavaScript taming - | -- no - | -- no - | -- yes - | -
- Web-bug killing - | -- no - | -- yes - | -- yes - | -
- Image tag reordering - | -- no - | -- yes - | -- yes - | -
- The list of actions files to be used are defined in the main - configuration file, and are processed in the order they are defined - (e.g. default.action is typically processed - before user.action). The content of these - can all be viewed and edited from http://config.privoxy.org/show-status. The over-riding - principle when applying actions, is that the last action that matches - a given URL wins. The broadest, most general rules go first (defined - in default.action), followed by any - exceptions (typically also in default.action), which are then followed lastly by - any local preferences (typically in user.action). - Generally, user.action has the last word. -
-- An actions file typically has multiple sections. If you want to use - "aliases" in an actions file, you have to - place the (optional) alias - section at the top of that file. Then comes the default set of - rules which will apply universally to all sites and pages (be very careful with - using such a universal set in user.action - or any other actions file after default.action, because it will override the result - from consulting any previous file). And then below that, exceptions - to the defined universal policies. You can regard user.action as an appendix to default.action, with the advantage that it is a - separate file, which makes preserving your personal settings across - Privoxy upgrades easier. -
-- Actions can be used to block anything you want, including ads, - banners, or just some obnoxious URL whose content you would rather - not see. Cookies can be accepted or rejected, or accepted only during - the current browser session (i.e. not written to disk), content can - be modified, some JavaScripts tamed, user-tracking fooled, and much - more. See below for a complete - list of actions. -
-- Note that some actions, - like cookie suppression or script disabling, may render some sites - unusable that rely on these techniques to work properly. Finding - the right mix of actions is not always easy and certainly a matter - of personal taste. And, things can always change, requiring - refinements in the configuration. In general, it can be said that - the more "aggressive" your default - settings (in the top section of the actions file) are, the more - exceptions for "trusted" sites you will - have to make later. If, for example, you want to crunch all cookies - per default, you'll have to make exceptions from that rule for - sites that you regularly use and that require cookies for actually - useful purposes, like maybe your bank, favorite shop, or newspaper. -
-- We have tried to provide you with reasonable rules to start from in - the distribution actions files. But there is no general rule of - thumb on these things. There just are too many variables, and sites - are constantly changing. Sooner or later you will want to change - the rules (and read this chapter again :). -
-- The easiest way to edit the actions files is with a browser by - using our browser-based editor, which can be reached from http://config.privoxy.org/show-status. Note: the config - file option enable-edit-actions must be - enabled for this to work. The editor allows both fine-grained - control over every single feature on a per-URL basis, and easy - choosing from wholesale sets of defaults like "Cautious", "Medium" or - "Advanced". Warning: the "Advanced" setting is more aggressive, and will be - more likely to cause problems for some sites. Experienced users - only! -
-- If you prefer plain text editing to GUIs, you can of course also - directly edit the the actions files with your favorite text editor. - Look at default.action which is richly - commented with many good examples. -
-- Actions files are divided into sections. There are special - sections, like the "alias" sections which will - be discussed later. For now let's concentrate on regular sections: - They have a heading line (often split up to multiple lines for - readability) which consist of a list of actions, separated by - whitespace and enclosed in curly braces. Below that, there is a - list of URL and tag patterns, each on a separate line. -
-- To determine which actions apply to a request, the URL of the - request is compared to all URL patterns in each "action file". Every time it matches, the list of - applicable actions for the request is incrementally updated, using - the heading of the section in which the pattern is located. The - same is done again for tags and tag patterns later on. -
-- If multiple applying sections set the same action differently, the - last match wins. If not, the effects are aggregated. E.g. a URL - might match a regular section with a heading line of { +handle-as-image }, - then later another one with just { +block }, resulting in both actions to - apply. And there may well be cases where you will want to combine - actions together. Such a section then might look like: -
--
-
-- { +handle-as-image +block{Banner ads.} } - # Block these as if they were images. Send no block page. - banners.example.com - media.example.com/.*banners - .example.com/images/ads/ -- |
-
- You can trace this process for URL patterns and any given URL by - visiting http://config.privoxy.org/show-url-info. -
-- Examples and more detail on this is provided in the Appendix, Troubleshooting: Anatomy of an - Action section. -
-- As mentioned, Privoxy uses "patterns" to determine what actions might apply to - which sites and pages your browser attempts to access. These "patterns" use wild card type pattern matching to - achieve a high degree of flexibility. This allows one expression to - be expanded and potentially match against many similar patterns. -
-- Generally, a URL pattern has the form <host><port>/<path>, where the <host>, the <port> and the <path> are optional. (This is why the special - / pattern matches all URLs.) Note that the - protocol portion of the URL (e.g. http://) should not be included in the pattern. This is - assumed already! -
-- The pattern matching syntax is different for the host and path - parts of the URL. The host part uses a simple globbing type - matching technique, while the path part uses more flexible "Regular Expressions" (POSIX - 1003.2). -
-- The port part of a pattern is a decimal port number preceded by a - colon (:). If the host part contains a - numerical IPv6 address, it has to be put into angle brackets (<, >). -
-- is a host-only pattern and will match any request to www.example.com, regardless of which - document on that server is requested. So ALL pages in this - domain would be covered by the scope of this action. Note - that a simple example.com is - different and would NOT match. -
-- means exactly the same. For host-only patterns, the trailing - / may be omitted. -
-- matches all the documents on www.example.com whose name starts with /index.html. -
-- matches only the single document /index.html on www.example.com. -
-- matches the document /index.html, - regardless of the domain, i.e. on any web server anywhere. -
-- Matches any URL because there's no requirement for either the - domain or the path to match anything. -
-- Matches any URL pointing to TCP port 8000. -
-- Matches any URL with the host address 10.0.0.1. (Note that the real URL uses plain - brackets, not angle brackets.) -
-- Matches any URL with the host address 2001:db8::1. (Note that the real URL uses - plain brackets, not angle brackets.) -
-- matches nothing, since it would be interpreted as a domain - name and there is no top-level domain called .html. So it's a mistake. -
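As a rough sketch of how the pattern forms described above could appear together in one regular section: the grouping and the choice of action are assumptions made for illustration only, while the host names, port and addresses are the ones used in the descriptions.

```
# Illustrative only. +deanimate-gifs is used here merely as a harmless
# action; this grouping is an assumption, not a recommended configuration.
{ +deanimate-gifs{last} }
# Host-only pattern: every document on www.example.com
www.example.com/
# Host and path: only the single document /index.html on that host
www.example.com/index.html$
# Path only: the document /index.html on any host
/index.html$
# Port only: any URL pointing to TCP port 8000
:8000/
# IPv4 address as host
10.0.0.1/
# IPv6 address as host, written in angle brackets inside the pattern
<2001:db8::1>/
```

None of the patterns includes the protocol, in line with the note that http:// is assumed already.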
-- The matching of the host part offers some flexible options: if - the host pattern starts or ends with a dot, it becomes unanchored - at that end. The host pattern is often referred to as domain - pattern as it is usually used to match domain names and not IP - addresses. For example: -
-- matches any domain with first-level domain com and second-level domain example. For example www.example.com, example.com and foo.bar.baz.example.com. Note that it - wouldn't match if the second-level domain was another-example. -
-- matches any domain that STARTS with www. (It also matches the domain www but most of the time that doesn't - matter.) -
-- matches any domain that CONTAINS .example.. And, by the way, also included - would be any files or documents that exist within that - domain since no path limitations are specified. (Correctly - speaking: It matches any FQDN that contains example as a domain.) This might be www.example.com, news.example.de, or www.example.net/cgi/testing.pl for instance. - All these cases are matched. -
-- Additionally, there are wild-cards that you can use in the domain - names themselves. These work similarly to shell globbing type - wild-cards: "*" represents zero or - more arbitrary characters (this is equivalent to the "Regular Expression" based - syntax of ".*"), "?" represents any single character (this is - equivalent to the regular expression syntax of a simple "."), and you can define "character classes" in square brackets which is - similar to the same regular expression technique. All of this can - be freely mixed: -
-- matches "adserver.example.com", - "ads.example.com", etc but not - "sfads.example.com" -
-- matches all of the above, and then some. -
-- matches www.ipix.com, pictures.epix.com, a.b.c.d.e.upix.com etc. -
-- matches www1.example.com, www4.example.cc, wwwd.example.cy, wwwz.example.com etc., but not wwww.example.com. -
-- While flexible, this is not the sophistication of full regular - expression based syntax. -
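The anchoring and wild-card rules for the host part can be sketched in a single section. The patterns below are chosen to agree with the matches listed above; the section heading is an assumption and only serves to make the fragment a complete section.

```
# Illustrative only; the action is a harmless placeholder.
{ +deanimate-gifs{last} }
# Leading dot: unanchored on the left (example.com, www.example.com, ...)
.example.com
# Trailing dot: unanchored on the right (any domain starting with www.)
www.
# Dots on both ends: any FQDN that contains example as a domain
.example.
# "*" stands for zero or more arbitrary characters
ad*.example.com
# "?" stands for exactly one character
.?pix.com
# A character class, followed by "*" in the top-level domain
www[1-9a-ez].example.c*
```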
-- Privoxy uses "modern" POSIX 1003.2 "Regular Expressions" for - matching the path portion (after the slash), and is thus more - flexible. -
-- There is an Appendix with a - brief quick-start into regular expressions. You also might want - to have a look at your operating system's documentation on - regular expressions (try man re_format). -
-- Note that the path pattern is automatically left-anchored at the - "/", i.e. it matches as if it would - start with a "^" (regular expression - speak for the beginning of a line). -
-- Please also note that matching in the path is CASE INSENSITIVE by - default, but you can switch to case sensitive at any point in the - pattern by using the "(?-i)" switch: - www.example.com/(?-i)PaTtErN.* will - match only documents whose path starts with PaTtErN in exactly this capitalization. -
-- Is equivalent to just ".example.com", since any documents within - that domain are matched with or without the ".*" regular expression. This is redundant. -
-- Will match any page in the domain of "example.com" that is named "index.html", and that is part of some path. - For example, it matches "www.example.com/testing/index.html" but NOT - "www.example.com/index.html" - because the regular expression called for at least two - "/'s", thus the path - requirement. It also would match "www.example.com/testing/index_html", - because of the special meta-character ".". -
-- This regular expression is conditional so it will match any - page named "index.html" - regardless of path which in this case can have one or more - "/'s". And this one must contain - exactly ".html" (but does not - have to end with that!). -
-- This regular expression will match any path of "example.com" that contains any of the words - "ads", "banner", "banners" (because of the "?") or "junk". - The path does not have to end in these words, just contain - them. -
-- This is very much the same as above, except now it must end - in either ".jpg", ".jpeg", ".gif" - or ".png". So this one is - limited to common image formats. -
-- There are many, many good examples to be found in default.action, and more tutorials below in Appendix on regular expressions. -
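A minimal sketch of path patterns in a section, assuming the regular expressions that the descriptions above refer to. The block reason is made up for illustration.

```
# Illustrative only; the block reason is an assumption.
{ +block{Path pattern demonstration.} }
# One of the words ads, banner, banners or junk after a "/" in the path
.example.com/(.*/)(ads|banners?|junk)
# The same, but the path must end in a common image file extension
.example.com/(.*/)(ads|banners?|junk)/.*\.(jpe?g|gif|png)$
# Case sensitive from the (?-i) switch onwards
www.example.com/(?-i)PaTtErN.*
```

Since the path is left-anchored automatically, none of these patterns needs a leading "^".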
-- Request tag patterns are used to change the applying actions - based on the request's tags. Tags can be created based on HTTP - headers with either the client-header-tagger - or the server-header-tagger - action. -
-- Request tag patterns have to start with "TAG:", so Privoxy can tell them apart from other - patterns. Everything after the colon including white space, is - interpreted as a regular expression with path pattern syntax, - except that tag patterns aren't left-anchored automatically - (Privoxy doesn't silently add a - "^", you have to do it yourself if you - need it). -
-- To match all requests that are tagged with "foo" your pattern line should be "TAG:^foo$". "TAG:foo" - would work as well, but it would also match requests whose tags - contain "foo" somewhere. "TAG: foo" wouldn't work as it requires white - space. -
-- Sections can contain URL and request tag patterns at the same - time, but request tag patterns are checked after the URL patterns - and thus always overrule them, even if they are located before - the URL patterns. -
-- Once a new request tag is added, Privoxy checks right away if - it's matched by one of the request tag patterns and updates the - action settings accordingly. As a result request tags can be used - to activate other tagger actions, as long as these other taggers - look for headers that haven't already been parsed. -
-- For example you could tag client requests which use the POST method, then use this tag to activate - another tagger that adds a tag if cookies are sent, and then use - a block action based on the cookie tag. This allows the outcome - of one action to be input into a subsequent action. However, if - you reversed the position of the described taggers and - activated the method tagger based on the cookie tagger, no method - tags would be created. The method tagger would look for the - request line, but at the time the cookie tag is created, the - request line has already been parsed. -
-- While this is a limitation you should be aware of, this kind of - indirection is seldom needed anyway and even the example doesn't - make too much sense. -
-- To match requests that do not have a certain request tag, specify - a negative tag pattern by prefixing the tag pattern line with - either "NO-REQUEST-TAG:" or "NO-RESPONSE-TAG:" instead of "TAG:". -
-- Negative request tag patterns created with "NO-REQUEST-TAG:" are checked after all client - headers are scanned, the ones created with "NO-RESPONSE-TAG:" are checked after all server - headers are scanned. In both cases all the created tags are - considered. -
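The tag pattern forms described above might be combined as in the following sketch. It assumes that a tagger named user-agent is defined in one of the filter files and that it creates tags of the form "User-Agent: ..."; the "ExampleBrowser" name is hypothetical.

```
# Tag every request with its User-Agent header
# (assumes a "user-agent" tagger exists in a filter file).
{ +client-header-tagger{user-agent} }
/

# Positive tag pattern, anchored by hand with "^" because tag patterns
# are not left-anchored automatically.
{ -filter }
TAG:^User-Agent: fetch libfetch/

# Negative tag pattern: applies to requests that carry no tag
# starting with the hypothetical "User-Agent: ExampleBrowser/".
{ +deanimate-gifs{last} }
NO-REQUEST-TAG:^User-Agent: ExampleBrowser/
```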
-- Warning - | -
- - This is an experimental feature. The syntax is likely to - change in future versions. - - |
-
- Client tag patterns are not set based on HTTP headers but based - on the client's IP address. Users can enable them themselves, but - the Privoxy admin controls which tags are available and what - their effect is. -
-- After a client-specific tag has been defined with the client-specific-tag - directive, action sections can be activated based on the tag by - using a CLIENT-TAG pattern. The CLIENT-TAG pattern is evaluated - at the same priority as URL patterns; as a result, the last - matching pattern wins. Tags that are created based on client or - server headers are evaluated later on and can overrule CLIENT-TAG - and URL patterns! -
-- The tag is set for all requests that come from clients that - requested it to be set. Note that "clients" are differentiated by - IP address, if the IP address changes the tag has to be requested - again. -
-- Clients can request tags to be set by using the CGI interface http://config.privoxy.org/client-tags. -
-- Example: -
--
-
--# If the admin defined the client-specific-tag circumvent-blocks, -# and the request comes from a client that previously requested -# the tag to be set, overrule all previous +block actions that -# are enabled based on URL to CLIENT-TAG patterns. -{-block} -CLIENT-TAG:^circumvent-blocks$ - -# This section is not overruled because it's located after -# the previous one. -{+block{Nobody is supposed to request this.}} -example.org/blocked-example-page -- |
-
- All actions are disabled by default, until they are explicitly - enabled somewhere in an actions file. Actions are turned on if - preceded with a "+", and turned off if - preceded with a "-". So a +action means "do that - action", e.g. +block means "please block URLs that match the following - patterns", and -block means "don't block URLs that match the following patterns, - even if +block previously - applied." -
-- Again, actions are invoked by placing them on a line, enclosed in - curly braces and separated by whitespace, like in {+some-action -some-other-action{some-parameter}}, - followed by a list of URL patterns, one per line, to which they - apply. Together, the actions line and the following pattern lines - make up a section of the actions file. -
-- Actions fall into three categories: -
--
-- Boolean, i.e. the action can only be "enabled" or "disabled". Syntax: -
--
-
-- +name # enable action name - -name # disable action name -- |
-
- Example: +handle-as-image -
-- Parameterized, where some value is required in order to enable - this type of action. Syntax: -
--
-
-- +name{param} # enable action and set parameter to param, - # overwriting parameter from previous match if necessary - -name # disable action. The parameter can be omitted -- |
-
- Note that if the URL matches multiple positive forms of a - parameterized action, the last match wins, i.e. the params from - earlier matches are simply ignored. -
-- Example: +hide-user-agent{Mozilla/5.0 (X11; - U; FreeBSD i386; en-US; rv:1.8.1.4) Gecko/20070602 - Firefox/2.0.0.4} -
-- Multi-value. These look exactly like parameterized actions, but - they behave differently: If the action applies multiple times - to the same URL, but with different parameters, all the parameters - from all - matches are remembered. This is used for actions that can be - executed for the same request repeatedly, like adding multiple - headers, or filtering through multiple filters. Syntax: -
--
-
-- +name{param} # enable action and add param to the list of parameters - -name{param} # remove the parameter param from the list of parameters - # If it was the last one left, disable the action. - -name # disable this action completely and remove all parameters from the list -- |
-
- Examples: +add-header{X-Fun-Header: Some - text} and +filter{html-annoyances} -
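The three categories can be sketched side by side as follows. The host names and the User-Agent value are placeholders, and the js-annoyances filter is assumed to be defined in one of the filter files.

```
# Boolean and parameterized actions in one heading
# (the User-Agent value is a made-up placeholder).
{ +handle-as-image +hide-user-agent{Example/1.0} }
.example.com

# Multi-value: both parameters are remembered, so both filters apply.
{ +filter{html-annoyances} +filter{js-annoyances} }
.example.com

# Removes only js-annoyances from the list for this host;
# html-annoyances stays enabled.
{ -filter{js-annoyances} }
www.example.com
```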
-- If nothing is specified in any actions file, no "actions" are taken. So in this case Privoxy would just be a normal, non-blocking, - non-filtering proxy. You must specifically enable the privacy and - blocking features you need (although the provided default actions - files will give a good starting point). -
-- Later defined action sections always over-ride earlier ones of the - same type. So exceptions to any rules you make should come in the - latter part of the file (or in a file that is processed later when - using multiple actions files such as user.action). For multi-valued actions, the actions - are applied in the order they are specified. Actions files are - processed in the order they are defined in config (the default installation has three actions - files). It is also quite possible for any given URL to match more than - one "pattern" (because of wildcards and - regular expressions), and thus to trigger more than one set of - actions! Last match wins. -
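A small sketch of the "last match wins" rule, assuming a general rule that is followed later by an exception. The host names, paths and block reason are placeholders.

```
# Earlier, general rule (for example in default.action)
{ +block{Banner ads.} }
.example.com/banners/

# Later exception (for example in user.action). A request for
# www.example.com/banners/ matches both sections, and the later
# -block wins.
{ -block }
www.example.com/banners/
```

If the two sections were swapped, the +block section would be the last match and the exception would have no effect.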
-- The list of valid Privoxy actions - is: -
-- Confuse log analysis, custom applications -
-- Sends a user defined HTTP header to the web server. -
-- Multi-value. -
-- Any string value is possible. Validity of the defined HTTP - headers is not checked. It is recommended that you use the - "X-" - prefix for custom headers. -
-- This action may be specified multiple times, in order to - define multiple headers. This is rarely needed for the - typical user. If you don't know what "HTTP headers" are, you definitely don't - need to worry about this one. -
-- Headers added by this action are not modified by other - actions. -
--
-
--# Add a DNT ("Do not track") header to all requests, -# even to those that already have one. -# -# This is just an example, not a recommendation. -# -# There is no reason to believe that user-tracking websites care -# about the DNT header and depending on the User-Agent, adding the -# header may make user-tracking easier. -{+add-header{DNT: 1}} -/ -- |
-
- Block ads or other unwanted content -
-- Requests for URLs to which this action applies are blocked, - i.e. the requests are trapped by Privoxy and the requested URL is never - retrieved, but is answered locally with a substitute page - or image, as determined by the handle-as-image, - set-image-blocker, - and handle-as-empty-document - actions. -
-- Parameterized. -
-- A block reason that should be given to the user. -
-- Privoxy sends a special - "BLOCKED" page for requests to - blocked pages. This page contains the block reason given as - parameter, a link to find out why the block action applies, - and a click-through to the blocked content (the latter only - if the force feature is available and enabled). -
-- A very important exception occurs if both block and handle-as-image, - apply to the same request: it will then be replaced by an - image. If set-image-blocker - (see below) also applies, the type of image will be - determined by its parameter, if not, the standard - checkerboard pattern is sent. -
-- It is important to understand this process, in order to - understand how Privoxy - deals with ads and other unwanted content. Blocking is a - core feature, and one upon which various other features - depend. -
-- The filter action can - perform a very similar task, by "blocking" banner images and other content - through rewriting the relevant URLs in the document's HTML - source, so they don't get requested in the first place. - Note that this is a totally different technique, and it's - easy to confuse the two. -
--
-
--{+block{No nasty stuff for you.}} -# Block and replace with "blocked" page - .nasty-stuff.example.com - -{+block{Doubleclick banners.} +handle-as-image} -# Block and replace with image - .ad.doubleclick.net - .ads.r.us/banners/ - -{+block{Layered ads.} +handle-as-empty-document} -# Block and then ignore - adserver.example.net/.*\.js$ -- |
-
- Improve privacy by not forwarding the source of the request - in the HTTP headers. -
-- Deletes the "X-Forwarded-For:" - HTTP header from the client request, or adds a new one. -
-- Parameterized. -
-- "block" to delete the - header. -
-- "add" to create the header - (or append the client's IP address to an already - existing one). -
-- It is safe and recommended to use block. -
-- Forwarding the source address of the request may make sense - in some multi-user setups but is also a privacy risk. -
--
-
--+change-x-forwarded-for{block} -- |
-
- Rewrite or remove single client headers. -
-- All client headers to which this action applies are - filtered on-the-fly through the specified regular - expression based substitutions. -
-- Multi-value. -
-- The name of a client-header filter, as defined in one of - the filter files. -
-- Client-header filters are applied to each header on its - own, not to all at once. This makes it easier to diagnose - problems, but on the downside you can't write filters that - only change header x if header y's value is z. You can do - that by using tags though. -
-- Client-header filters are executed after the other header - actions have finished and use their output as input. -
-- If the request URI gets changed, Privoxy will detect that and use the - new one. This can be used to rewrite the request - destination behind the client's back, for example to - specify a Tor exit relay for certain requests. -
-- Please refer to the filter file - chapter to learn which client-header filters are - available by default, and how to create your own. -
--
-
--# Hide Tor exit notation in Host and Referer Headers -{+client-header-filter{hide-tor-exit-notation}} -/ - -- |
-
- Block requests based on their headers. -
-- Client headers to which this action applies are filtered - on-the-fly through the specified regular expression based - substitutions, the result is used as tag. -
-- Multi-value. -
-- The name of a client-header tagger, as defined in one of - the filter files. -
-- Client-header taggers are applied to each header on its - own, and as the header isn't modified, each tagger "sees" the original. -
-- Client-header taggers are the first actions that are - executed and their tags can be used to control every other - action. -
--
-
--# Tag every request with the User-Agent header ++ # Tag every request with the User-Agent header {+client-header-tagger{user-agent}} / @@ -1676,19 +1132,15 @@ TAG:^User-Agent: RPM APT-HTTP/ TAG:^User-Agent: fetch libfetch/ TAG:^User-Agent: Ubuntu APT-HTTP/ TAG:^User-Agent: MPlayer/ - -- |
-
-
-
--# Tag all requests with the Range header set ++ |
+
+ + # Tag all requests with the Range header set {+client-header-tagger{range-requests}} / @@ -1700,1796 +1152,1240 @@ TAG:^User-Agent: MPlayer/ # parts of multimedia files. {-filter -deanimate-gifs} TAG:^RANGE-REQUEST$ - -- |
-
- Stop useless download menus from popping up, or change the - browser's rendering mode -
-- Replaces the "Content-Type:" - HTTP server header. -
-- Parameterized. -
-- Any string. -
-- The "Content-Type:" HTTP server - header is used by the browser to decide what to do with the - document. The value of this header can cause the browser to - open a download menu instead of displaying the document by - itself, even if the document's format is supported by the - browser. -
-- The declared content type can also affect which rendering - mode the browser chooses. If XHTML is delivered as "text/html", many browsers treat it as - yet another broken HTML document. If it is sent as "application/xml", browsers with XHTML - support will only display it if the syntax is correct. -
-- If you see a web site that proudly uses XHTML buttons, but - sets "Content-Type: text/html", - you can use Privoxy to - overwrite it with "application/xml" and validate the web - master's claim inside your XHTML-supporting browser. If the - syntax is incorrect, the browser will complain loudly. -
-- You can also go the opposite direction: if your browser - prints error messages instead of rendering a document - falsely declared as XHTML, you can overwrite the content - type with "text/html" and have - it rendered as a broken HTML document. -
-- By default content-type-overwrite - only replaces "Content-Type:" - headers that look like some kind of text. If you want to - overwrite it unconditionally, you have to combine it with - force-text-mode. - This limitation exists for a reason, think twice before - circumventing it. -
-- Most of the time it's easier to replace this action with a - custom server-header - filter. It allows you to activate it for every - document of a certain site and it will still only replace - the content types you aimed at. -
-- Of course you can apply content-type-overwrite to a whole site and - then make URL based exceptions, but it's a lot more work to - get the same precision. -
--
-
--# Check if www.example.net/ really uses valid XHTML + + |
+
- Remove a client header Privoxy has no dedicated action for. -
-- Deletes every header sent by the client that contains the - string the user supplied as parameter. -
-- Parameterized. -
-- Any string. -
-- This action allows you to block client headers for which no - dedicated Privoxy action - exists. Privoxy will - remove every client header that contains the string you - supplied as parameter. -
-- Regular expressions are not supported and you can't use this - action to block different headers in the same request, - unless they contain the same string. -
-- crunch-client-header is only meant - for quick tests. If you have to block several different - headers, or only want to modify parts of them, you should - use a client-header - filter. -
-- Warning - | -
- - Don't block any header without understanding the - consequences. - - |
-
-
-
--# Block the non-existent "Privacy-Violation:" client header -{ +crunch-client-header{Privacy-Violation:} } -/ - -+ | Warning | +
+ Don't block any header without understanding the + consequences. |
+ + # Block the non-existent "Privacy-Violation:" client header +{ +crunch-client-header{Privacy-Violation:} } +/ ++ |
+
- Prevent yet another way to track the user's steps between - sessions. -
-- Deletes the "If-None-Match:" - HTTP client header. -
-- Boolean. -
-- N/A -
-- Removing the "If-None-Match:" - HTTP client header is useful for filter testing, where you - want to force a real reload instead of getting status code - "304" which would cause the - browser to use a cached copy of the page. -
-- It is also useful to make sure the header isn't used as a - cookie replacement (unlikely but possible). -
-- Blocking the "If-None-Match:" - header shouldn't cause any caching problems, as long as the - "If-Modified-Since:" header - isn't blocked or missing as well. -
-- It is recommended to use this action together with hide-if-modified-since - and overwrite-last-modified. -
--
-
--# Let the browser revalidate cached documents but don't + + |
+
- Stop those annoying, distracting animated GIF images. -
-- De-animate GIF animations, i.e. reduce them to their first - or last image. -
-- Parameterized. -
-- "last" or "first" -
-- This will also shrink the images considerably (in bytes, - not pixels!). If the option "first" is given, the first frame of the - animation is used as the replacement. If "last" is given, the last frame of the - animation is used instead, which probably makes more sense - for most banner animations, but also has the risk of not - showing the entire last frame (if it is only a delta to an - earlier frame). -
-- You can safely use this action with patterns that will also - match non-GIF objects, because no attempt will be made at - anything that doesn't look like a GIF. -
--
-
--+deanimate-gifs{last} -- |
-
Prevent the web server from setting HTTP cookies on your + system
+Deletes any "Set-Cookie:" HTTP + headers from server replies.
+Boolean.
+N/A
+This action is only concerned with incoming HTTP + cookies. For outgoing HTTP cookies, use crunch-outgoing-cookies. + Use both + to disable HTTP cookies completely.
+It makes no sense + at all to use this action in conjunction with the + session-cookies-only + action, since it would prevent the session cookies from being + set. See also filter-content-cookies.
+
+ +crunch-incoming-cookies+ |
+
- Work around (very rare) problems with HTTP/1.1 -
-- Downgrades HTTP/1.1 client requests and server replies to - HTTP/1.0. -
-- Boolean. -
-- N/A -
-- This is a left-over from the time when Privoxy didn't support important - HTTP/1.1 features well. It is left here for the unlikely - case that you experience HTTP/1.1-related problems with - some server out there. -
-- Note that enabling this action is only a workaround. It - should not be enabled for sites that work without it. While - it shouldn't break any pages, it has a (usually negative) - performance impact. -
-- If you come across a site where enabling this action helps, - please report it, so the cause of the problem can be - analyzed. If the problem turns out to be caused by a bug in - Privoxy it should be fixed - so the following release works without the work around. -
--
-
--{+downgrade-http-version} -problem-host.example.com -- |
+ Warning |
- Modify content using a programming language of your choice. -
-- All instances of text-based type, most notably HTML and - JavaScript, to which this action applies, can be filtered - on-the-fly through the specified external filter. By - default plain text documents are exempted from filtering, - because web servers often use the text/plain MIME type for all files whose - type they don't know. -
-- Multi-value. -
-- The name of an external content filter, as defined in the - filter file. External - filters can be defined in one or more files as defined by - the filterfile option in the - config file. -
-- When used in its negative form, and without parameters, - all - filtering with external filters is completely disabled. -
-- External filters are scripts or programs that can modify - the content in case common filters aren't powerful - enough. With the exception that this action doesn't use - pcrs-based filters, the notes in the filter section - apply. -
-- Warning - | -
- - Currently external filters are executed with Privoxy's privileges. - Only use external filters you understand and trust. - - |
-
- This feature is experimental, the syntax - may change in the future. -
--
-
--+external-filter{fancy-filter} -+ |
+ Don't block any header without understanding the + consequences. |
+ + # Crunch server headers that try to prevent caching +{ +crunch-server-header{no-cache} } +/+ |
+
- Fool some click-tracking scripts and speed up indirect - links. -
-- Detects redirection URLs and redirects the browser without - contacting the redirection server first. -
-- Parameterized. -
-- "simple-check" to just - search for the string "http://" to detect redirection URLs. -
-- "check-decoded-url" to - decode URLs (if necessary) before searching for - redirection URLs. -
-- Many sites, like yahoo.com, don't just link to other sites. - Instead, they will link to some script on their own - servers, giving the destination as a parameter, which will - then redirect you to the final target. URLs resulting from - this scheme typically look like: "http://www.example.org/click-tracker.cgi?target=http%3a//www.example.net/". -
-- Sometimes, there are even multiple consecutive redirects - encoded in the URL. These redirections via scripts make - your web browsing more traceable, since the server from - which you follow such a link can see where you go to. Apart - from that, valuable bandwidth and time is wasted, while - your browser asks the server for one redirect after the - other. Plus, it feeds the advertisers. -
-- This feature is currently not very smart and is scheduled - for improvement. If it is enabled by default, you will have - to create some exceptions to this action. It can lead to - failures in several ways: -
-- Not every URL with other URLs as parameters is evil. Some - sites offer a real service that requires this information - to work. For example a validation service needs to know - which document to validate. fast-redirects assumes that every URL - parameter that looks like another URL is a redirection - target, and will always redirect to the last one. Most of - the time the assumption is correct, but if it isn't, the - user gets redirected anyway. -
-- Another failure occurs if the URL contains other parameters - after the URL parameter. The URL "http://www.example.org/?redirect=http%3a//www.example.net/&foo=bar" - contains the redirection URL "http://www.example.net/", followed by - another parameter. fast-redirects - doesn't know that and will cause a redirect to "http://www.example.net/&foo=bar". - Depending on the target server configuration, the parameter - will be silently ignored or lead to a "page not found" error. You can prevent this - problem by first using the redirect action to - remove the last part of the URL, but it requires a little - effort. -
-- To detect a redirection URL, fast-redirects only looks for the string - "http://", either in plain text - (invalid but often used) or encoded as "http%3a//". Some sites use their own URL - encoding scheme, encrypt the address of the target server - or replace it with a database id. In these cases fast-redirects is fooled and the - request reaches the redirection server where it probably - gets logged. -
--
-
-- { +fast-redirects{simple-check} } - one.example.com - - { +fast-redirects{check-decoded-url} } - another.example.com/testing -- |
-
Prevent the web server from reading any HTTP cookies from + your system
+Deletes any "Cookie:" HTTP + headers from client requests.
+Boolean.
+N/A
+This action is only concerned with outgoing HTTP + cookies. For incoming HTTP cookies, use crunch-incoming-cookies. + Use both + to disable HTTP cookies completely.
+It makes no sense + at all to use this action in conjunction with the + session-cookies-only + action, since it would prevent the session cookies from being + read.
+
+ +crunch-outgoing-cookies+ |
+
- Get rid of HTML and JavaScript annoyances, banner - advertisements (by size), do fun text replacements, add - personalized effects, etc. -
-- All instances of text-based type, most notably HTML and - JavaScript, to which this action applies, can be filtered - on-the-fly through the specified regular expression based - substitutions. (Note: as of version 3.0.3 plain text - documents are exempted from filtering, because web servers - often use the text/plain MIME type - for all files whose type they don't know.) -
-- Multi-value. -
-- The name of a content filter, as defined in the filter file. Filters can be defined - in one or more files as defined by the filterfile option in the - config file. default.filter is the collection of filters - supplied by the developers. Locally defined filters should - go in their own file, such as user.filter. -
-- When used in its negative form, and without parameters, - all - filtering is completely disabled. -
-- For your convenience, there are a number of pre-defined - filters available in the distribution filter file that you - can use. See the examples below for a list. -
-- Filtering requires buffering the page content, which may - appear to slow down page rendering since nothing is - displayed until all content has passed the filters. (The - total time until the page is completely rendered doesn't - change much, but it may be perceived as slower since the - page is not incrementally displayed.) This effect will be - more noticeable on slower connections. -
-- "Rolling your own" filters - requires knowledge of "Regular Expressions" - and "HTML". This is a very - powerful feature, and potentially very intrusive. Filters - should be used with caution, and only where an equivalent "action" is not available. -
-- The amount of data that can be filtered is limited to the - buffer-limit option in - the main config file. The default - is 4096 KB (4 Megs). Once this limit is exceeded, the - buffered data, and all pending data, is passed through - unfiltered. -
-- Inappropriate MIME types, such as zipped files, are not - filtered at all. (Again, only text-based types except plain - text). Encrypted SSL data (from HTTPS servers) cannot be - filtered either, since this would violate the integrity of - the secure transaction. In some situations it might be - necessary to protect certain text, like source code, from - filtering by defining appropriate -filter exceptions. -
-- Compressed content can't be filtered either, but if Privoxy is compiled with zlib - support and a supported compression algorithm is used (gzip - or deflate), Privoxy can - first decompress the content and then filter it. -
-- If you use a Privoxy - version without zlib support, but want filtering to work on - as many documents as possible, even those that would - normally be sent compressed, you must use the prevent-compression - action in conjunction with filter. -
-- Content filtering can achieve some of the same effects as - the block action, i.e. it - can be used to block ads and banners. But the mechanism - works quite differently. One effective use, is to block ad - banners based on their size (see below), since many of - these seem to be somewhat standardized. -
-- Feedback with suggestions for - new or improved filters is particularly welcome! -
-- The below list has only the names and a one-line - description of each predefined filter. There are more verbose - explanations of what these filters do in the filter file chapter. -
-
--+filter{js-annoyances} # Get rid of particularly annoying JavaScript abuse. -- |
-
--+filter{js-events} # Kill JavaScript event bindings and timers (Radically destructive! Only for extra nasty sites). -- |
-
--+filter{html-annoyances} # Get rid of particularly annoying HTML abuse. -- |
-
--+filter{content-cookies} # Kill cookies that come in the HTML or JS content. -- |
-
--+filter{refresh-tags} # Kill automatic refresh tags if refresh time is larger than 9 seconds. -- |
-
--+filter{unsolicited-popups} # Disable only unsolicited pop-up windows. -- |
-
--+filter{all-popups} # Kill all popups in JavaScript and HTML. -- |
-
--+filter{img-reorder} # Reorder attributes in <img> tags to make the banners-by-* filters more effective. -- |
-
--+filter{banners-by-size} # Kill banners by size. -- |
-
--+filter{banners-by-link} # Kill banners by their links to known clicktrackers. -- |
-
--+filter{webbugs} # Squish WebBugs (1x1 invisible GIFs used for user tracking). -- |
-
--+filter{tiny-textforms} # Extend those tiny textareas up to 40x80 and kill the hard wrap. -- |
-
--+filter{jumping-windows} # Prevent windows from resizing and moving themselves. -- |
-
--+filter{frameset-borders} # Give frames a border and make them resizable. -- |
-
--+filter{iframes} # Removes all detected iframes. Should only be enabled for individual sites. -- |
-
--+filter{demoronizer} # Fix MS's non-standard use of standard charsets. -- |
-
--+filter{shockwave-flash} # Kill embedded Shockwave Flash objects. -- |
-
--+filter{quicktime-kioskmode} # Make Quicktime movies saveable. -- |
-
--+filter{fun} # Text replacements for subversive browsing fun! -- |
-
--+filter{crude-parental} # Crude parental filtering. Note that this filter doesn't work reliably. -- |
-
--+filter{ie-exploits} # Disable some known Internet Explorer bug exploits. -- |
-
--+filter{site-specifics} # Cure for site-specific problems. Don't apply generally! -- |
-
+ +deanimate-gifs{last}+ |
+
Work around (very rare) problems with HTTP/1.1
+Downgrades HTTP/1.1 client requests and server replies to + HTTP/1.0.
+Boolean.
+N/A
+This is a left-over from the time when Privoxy didn't support important HTTP/1.1 + features well. It is left here for the unlikely case that you + experience HTTP/1.1-related problems with some server out + there.
+Note that enabling this action is only a workaround. It + should not be enabled for sites that work without it. While it + shouldn't break any pages, it has a (usually negative) + performance impact.
+If you come across a site where enabling this action helps, + please report it, so the cause of the problem can be analyzed. + If the problem turns out to be caused by a bug in Privoxy, it should be fixed so the + following release works without the workaround.
+
+ {+downgrade-http-version} +problem-host.example.com+ |
+
Modify content using a programming language of your + choice.
+All instances of text-based type, most notably HTML and + JavaScript, to which this action applies, can be filtered + on-the-fly through the specified external filter. By default, + plain text documents are exempted from filtering, because web + servers often use the text/plain MIME + type for all files whose type they don't know.
+Multi-value.
+The name of an external content filter, as defined in the + filter file. External filters + can be defined in one or more files as defined by the + filterfile option in the + config file.
+When used in its negative form, and without parameters, + all + filtering with external filters is completely disabled.
+External filters are scripts or programs that can modify the + content in case common filters aren't powerful + enough. With the exception that this action doesn't use + pcrs-based filters, the notes in the filter + section apply.
+
--+filter{no-ping} # Removes non-standard ping attributes in <a> and <area> tags. -- |
+ Warning |
--+filter{google} # CSS-based block for Google text ads. Also removes a width limitation and the toolbar advertisement. -+ |
+ Currently external filters are executed with + Privoxy's privileges. + Only use external filters you understand and trust. |
This feature is experimental; the syntax may + change in the future.
+
+ +external-filter{fancy-filter}+ |
+
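For the example above to work, a filter named fancy-filter has to be defined in one of the configured filter files. The following is a minimal sketch, assuming the filter file's EXTERNAL-FILTER: definition form (a name and description on one line, the command to run on the next). The file name, the description text and the script path are hypothetical, and the program is assumed to read the content on standard input and write the modified content to standard output.

```
# user.filter (hypothetical file name and script path)
EXTERNAL-FILTER: fancy-filter Hypothetical filter that rewrites the content
/usr/local/bin/fancy-filter.sh
```

Since external filters currently run with Privoxy's privileges, the referenced script should be one you have read and trust.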
Fool some click-tracking scripts and speed up indirect + links.
+Detects redirection URLs and redirects the browser without + contacting the redirection server first.
+Parameterized.
+"simple-check" to just search + for the string "http://" to + detect redirection URLs.
+"check-decoded-url" to decode + URLs (if necessary) before searching for redirection + URLs.
+Many sites, like yahoo.com, don't just link to other sites. + Instead, they will link to some script on their own servers, + giving the destination as a parameter, which will then redirect + you to the final target. URLs resulting from this scheme + typically look like: "http://www.example.org/click-tracker.cgi?target=http%3a//www.example.net/".
+Sometimes, there are even multiple consecutive redirects + encoded in the URL. These redirections via scripts make your + web browsing more traceable, since the server from which you + follow such a link can see where you go to. Apart from that, + valuable bandwidth and time are wasted while your browser asks + the server for one redirect after the other. Plus, it feeds the + advertisers.
+This feature is currently not very smart and is scheduled + for improvement. If it is enabled by default, you will have to + create some exceptions to this action. It can lead to failures + in several ways:
+Not every URL with other URLs as parameters is evil. Some + sites offer a real service that requires this information to + work. For example, a validation service needs to know which + document to validate. fast-redirects + assumes that every URL parameter that looks like another URL is + a redirection target, and will always redirect to the last one. + Most of the time the assumption is correct, but if it isn't, + the user gets redirected anyway.
+Another failure occurs if the URL contains other parameters + after the URL parameter. The URL "http://www.example.org/?redirect=http%3a//www.example.net/&foo=bar" + contains the redirection URL "http://www.example.net/", followed by another + parameter. fast-redirects doesn't know + that and will cause a redirect to "http://www.example.net/&foo=bar". Depending + on the target server configuration, the parameter will be + silently ignored or lead to a "page not + found" error. You can prevent this problem by first + using the redirect action to remove + the last part of the URL, but it requires a little effort.
+To detect a redirection URL, fast-redirects only looks for the string + "http://", either in plain text + (invalid but often used) or encoded as "http%3a//". Some sites use their own URL + encoding scheme, encrypt the address of the target server or + replace it with a database id. In these cases fast-redirects is fooled and the request reaches + the redirection server where it probably gets logged.
+
+ { +fast-redirects{simple-check} } + one.example.com - -
+
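The notes above mention that enabling fast-redirects broadly requires exceptions for sites that legitimately take a URL as a parameter. A minimal sketch of such an exception follows; the host pattern validator.example.org is a hypothetical stand-in for a validation service and is not taken from the shipped actions files.

```
# Hypothetical validation service: its URL parameter names the document
# to check and is not a redirect target, so leave such requests alone.
{ -fast-redirects }
validator.example.org
```

As the last matching action wins, this section has to come after the section that enables fast-redirects, for example in user.action.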
+ 8.5.16. + filter+
+
+
+ 8.5.17. force-text-mode+
+
|