X-Git-Url: http://www.privoxy.org/gitweb/?a=blobdiff_plain;f=doc%2Fwebserver%2Fuser-manual%2Factions-file.html;h=bc85832d2f6a9b98178ac8ef03de3f8b9ed634cf;hb=f8dbc81f51ddf04121644ad5da727f94f3ad11a5;hp=9998abf41c12f0f886e59aaabab2d2d5f4039067;hpb=0212c18282eaa5f73843cbbec12c9137ea596e1c;p=privoxy.git diff --git a/doc/webserver/user-manual/actions-file.html b/doc/webserver/user-manual/actions-file.html index 9998abf4..68cf2b69 100644 --- a/doc/webserver/user-manual/actions-file.html +++ b/doc/webserver/user-manual/actions-file.html @@ -1,3710 +1,4429 @@ -
The actions files are used to define what actions - Privoxy takes for which URLs, and thus determine - how ad images, cookies and various other aspects of HTTP content and - transactions are handled, and on which sites (or even parts thereof). There - are three such files included with Privoxy (as of - version 2.9.15), with differing purposes: -
standard.action - is used by the web based editor, - to set various pre-defined sets of rules for the default actions section - in default.action. These have increasing levels of - aggressiveness and have no influence on your browsing unless - you select them explicitly in the editor. It is not recommended - to edit this file. -
default.action - is the primary action file - that sets the initial values for all actions. It is intended to - provide a base level of functionality for - Privoxy's array of features. So it is - a set of broad rules that should work reasonably well for users everywhere. - This is the file that the developers are keeping updated, and making - available to users. -
user.action - is intended to be for local site - preferences and exceptions. As an example, if your ISP or your bank - has specific requirements and needs special handling, this kind of - thing should go here. This file will not be upgraded. -
The list of actions files to be used is defined in the main configuration - file, and the files are processed in the order they are defined. The content of these - can all be viewed and edited from http://config.privoxy.org/show-status.
An actions file typically has multiple sections. If you want to use - "aliases" in an actions file, you have to place the (optional) - alias section at the top of that file. - Then comes the default set of rules which will apply universally to all - sites and pages (be very careful with using such a - universal set in user.action or any other actions file after - default.action, because it will override the result - from consulting any previous file). And then below that, - exceptions to the defined universal policies. You can regard - user.action as an appendix to default.action, - with the advantage that it is a separate file, which makes preserving your - personal settings across Privoxy upgrades easier.
- Actions can be used to block anything you want, including ads, banners, or - just some obnoxious URL that you would rather not see. Cookies can be accepted - or rejected, or accepted only during the current browser session (i.e. not - written to disk), content can be modified, JavaScripts tamed, user-tracking - fooled, and much more. See below for a complete list - of actions.
Note that some actions, like cookie suppression - or script disabling, may render some sites unusable that rely on these - techniques to work properly. Finding the right mix of actions is not always easy and - certainly a matter of personal taste. In general, it can be said that the more - "aggressive" your default settings (in the top section of the - actions file) are, the more exceptions for "trusted" sites you - will have to make later. If, for example, you want to kill popup windows per - default, you'll have to make exceptions from that rule for sites that you - regularly use and that require popups for actually useful content, like maybe - your bank, favorite shop, or newspaper.
We have tried to provide you with reasonable rules to start from in the - distribution actions files. But there is no general rule of thumb on these - things. There just are too many variables, and sites are constantly changing. - Sooner or later you will want to change the rules (and read this chapter again :).
The easiest way to edit the actions files is with a browser by - using our browser-based editor, which can be reached from http://config.privoxy.org/show-status. - The editor allows both fine-grained control over every single feature on a - per-URL basis, and easy choosing from wholesale sets of defaults like - "Cautious", "Medium" or "Advanced".
If you prefer plain text editing to GUIs, you can of course also directly edit - the actions files. Look at default.action, which is richly - commented.
Actions files are divided into sections. There are special sections, - like the "alias" sections which will be discussed later. For now - let's concentrate on regular sections: They have a heading line (often split - up to multiple lines for readability) which consists of a list of actions, - separated by whitespace and enclosed in curly braces. Below that, there - is a list of URL patterns, each on a separate line.
To determine which actions apply to a request, the URL of the request is - compared to all patterns in each actions file. Every time it matches, the list of - applicable actions for the URL is incrementally updated, using the heading - of the section in which the pattern is located. If multiple matches for - the same URL set the same action differently, the last match wins. If not, - the effects are aggregated (e.g. a URL might match both the - "+handle-as-image" - and "+block" actions). -
You can trace this process for any given URL by visiting http://config.privoxy.org/show-url-info.
More detail on this is provided in the Appendix, Anatomy of an Action.
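The update rule described above can be illustrated with a minimal sketch. The host names are hypothetical and the two sections are only an assumed example, not rules from the distributed actions files; the action names are taken from this chapter.

```
# First section: applies to every host ending in .example.com
{+crunch-incoming-cookies +filter{html-annoyances}}
.example.com

# Later section: shop.example.com matches both patterns, and this one
# sets crunch-incoming-cookies differently, so the last match wins.
# +filter{html-annoyances} from the first section is not touched here
# and therefore still applies.
{-crunch-incoming-cookies}
shop.example.com
```

For a request to shop.example.com, the aggregated result would be filtering with html-annoyances enabled and cookie crunching disabled. The show-url-info page mentioned above lets you check this for a real URL.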
Generally, a pattern has the form <domain>/<path>, - where both the <domain> and <path> - are optional. (This is why the pattern / matches all URLs).
www.example.com/ - is a domain-only pattern and will match any request to www.example.com, - regardless of which document on that server is requested. -
www.example.com - means exactly the same. For domain-only patterns, the trailing / may - be omitted. -
www.example.com/index.html - matches only the single document /index.html - on www.example.com. -
/index.html - matches the document /index.html, regardless of the domain, - i.e. on any web server. -
index.html - matches nothing, since it would be interpreted as a domain name and - there is no top-level domain called .html. -
The matching of the domain part offers some flexible options: if the - domain starts or ends with a dot, it becomes unanchored at that end. - For example:
.example.com - matches any domain that ENDS in - .example.com -
www. - matches any domain that STARTS with - www. -
.example. - matches any domain that CONTAINS .example. - (Correctly speaking: It matches any FQDN that contains example as a domain.) -
Additionally, there are wild-cards that you can use in the domain names - themselves. They work much like shell wild-cards: "*" - stands for zero or more arbitrary characters, "?" stands for - any single character, and you can define character classes in square - brackets; all of that can be freely mixed:
ad*.example.com - matches "adserver.example.com", - "ads.example.com", etc but not "sfads.example.com" -
*ad*.example.com - matches all of the above, and then some. -
.?pix.com - matches www.ipix.com, - pictures.epix.com, a.b.c.d.e.upix.com etc. -
www[1-9a-ez].example.c* - matches www1.example.com, - www4.example.cc, wwwd.example.cy, - wwwz.example.com etc., but not - wwww.example.com. -
Privoxy uses Perl compatible regular expressions - (through the PCRE library) for - matching the path.
There is an Appendix with a brief quick-start into regular - expressions, and full (very technical) documentation on PCRE regex syntax is available on-line - at http://www.pcre.org/man.txt. - You might also find the Perl man page on regular expressions (man perlre) - useful, which is available on-line at http://www.perldoc.com/perl5.6/pod/perlre.html.
Note that the path pattern is automatically left-anchored at the "/", - i.e. it matches as if it would start with a "^" (regular expression speak - for the beginning of a line).
Please also note that matching in the path is case - INSENSITIVE by default, but you can switch to case - sensitive at any point in the pattern by using the - "(?-i)" switch: - www.example.com/(?-i)PaTtErN.* will match only - documents whose path starts with PaTtErN in - exactly this capitalization.
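As a rough sketch of how the domain and path rules combine in practice, the following section uses hypothetical host names, and +handle-as-image serves only as a placeholder action. It is an assumed illustration, not taken from the distributed actions files.

```
{+handle-as-image}
# Leading dot: the domain part is unanchored on the left. The path part
# is a regular expression, left-anchored at the "/" and matched
# case-insensitively by default.
.example.com/banners/.*\.gif$
# From the (?-i) switch onwards the path is matched case-sensitively.
www.example.com/(?-i)PaTtErN.*
```

Under these assumptions the first pattern should match /banners/Top.GIF on any host ending in .example.com, while the second should match /PaTtErN.html but not /pattern.html.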
All actions are disabled by default, until they are explicitly enabled - somewhere in an actions file. Actions are turned on if preceded with a - "+", and turned off if preceded with a "-". So a - +action means "do that action", e.g. - +block means "please block URLs that match the - following patterns", and -block means "don't - block URLs that match the following patterns, even if +block - previously applied."
- Again, actions are invoked by placing them on a line, enclosed in curly braces and - separated by whitespace, like in - {+some-action -some-other-action{some-parameter}}, - followed by a list of URL patterns, one per line, to which they apply. - Together, the actions line and the following pattern lines make up a section - of the actions file.
- There are three classes of actions:
- Boolean, i.e. the action can only be "enabled" or - "disabled". Syntax: -
+name # enable action name - -name # disable action name |
- Example: +block -
- Parameterized, where some value is required in order to enable this type of action. - Syntax: -
+name{param} # enable action and set parameter to param, + + + ++ |
+
Examples: +add-header{X-Fun-Header: Some + text} and +filter{html-annoyances}
+If nothing is specified in any actions file, no "actions" are taken. So in this case Privoxy would just be a normal, non-blocking, + non-filtering proxy. You must specifically enable the privacy and + blocking features you need (although the provided default actions files + will give a good starting point).
+Later defined action sections always over-ride earlier ones of the + same type. So exceptions to any rules you make should come in the + latter part of the file (or in a file that is processed later when + using multiple actions files such as user.action). For multi-valued actions, the actions are + applied in the order they are specified. Actions files are processed in + the order they are defined in config (the + default installation has three actions files). It is also quite possible + for any given URL to match more than one "pattern" (because of wildcards and regular + expressions), and thus to trigger more than one set of actions! Last + match wins.
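A minimal sketch of such an exception, split across two files, might look like this. The host name is hypothetical, and the general rule is only assumed to be present in default.action for the sake of the example.

```
# default.action (processed first): a general rule for all URLs
{+filter{banners-by-size}}
/

# user.action (processed later): an exception for one trusted site.
# Because this match comes last, it wins for URLs on that site.
{-filter{banners-by-size}}
.trusted.example.org
```

Keeping the exception in user.action rather than editing default.action means it is not affected when the distributed file is updated.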
+The list of valid Privoxy actions + are:
+Confuse log analysis, custom applications
+Sends a user defined HTTP header to the web server.
+Multi-value.
+Any string value is possible. Validity of the defined HTTP + headers is not checked. It is recommended that you use the + "X-" prefix + for custom headers.
+This action may be specified multiple times, in order to + define multiple headers. This is rarely needed for the typical + user. If you don't know what "HTTP + headers" are, you definitely don't need to worry about + this one.
+Headers added by this action are not modified by other + actions.
+
+ + # Add a DNT ("Do not track") header to all requests, +# even to those that already have one. +# +# This is just an example, not a recommendation. +# +# There is no reason to believe that user-tracking websites care +# about the DNT header and depending on the User-Agent, adding the +# header may make user-tracking easier. +{+add-header{DNT: 1}} +/+ |
+
Block ads or other unwanted content
+Requests for URLs to which this action applies are blocked, + i.e. the requests are trapped by Privoxy and the requested URL is never + retrieved, but is answered locally with a substitute page or + image, as determined by the handle-as-image, + set-image-blocker, + and handle-as-empty-document + actions.
+Parameterized.
+A block reason that should be given to the user.
+Privoxy sends a special + "BLOCKED" page for requests to + blocked pages. This page contains the block reason given as + parameter, a link to find out why the block action applies, and + a click-through to the blocked content (the latter only if the + force feature is available and enabled).
+A very important exception occurs if both block and handle-as-image, + apply to the same request: it will then be replaced by an + image. If set-image-blocker + (see below) also applies, the type of image will be determined + by its parameter, if not, the standard checkerboard pattern is + sent.
+It is important to understand this process, in order to + understand how Privoxy deals + with ads and other unwanted content. Blocking is a core + feature, and one upon which various other features depend.
+The filter action can perform a + very similar task, by "blocking" + banner images and other content through rewriting the relevant + URLs in the document's HTML source, so they don't get requested + in the first place. Note that this is a totally different + technique, and it's easy to confuse the two.
+
+ {+block{No nasty stuff for you.}} +# Block and replace with "blocked" page + .nasty-stuff.example.com + +{+block{Doubleclick banners.} +handle-as-image} +# Block and replace with image + .ad.doubleclick.net + .ads.r.us/banners/ + +{+block{Layered ads.} +handle-as-empty-document} +# Block and then ignore + adserver.example.net/.*\.js$+ |
+
Improve privacy by not forwarding the source of the request + in the HTTP headers.
+Deletes the "X-Forwarded-For:" + HTTP header from the client request, or adds a new one.
+Parameterized.
+"block" to delete the + header.
+"add" to create the header + (or append the client's IP address to an already existing + one).
+It is safe and recommended to use block.
+Forwarding the source address of the request may make sense + in some multi-user setups but is also a privacy risk.
+
+ +change-x-forwarded-for{block}+ |
+
Rewrite or remove single client headers.
+All client headers to which this action applies are filtered + on-the-fly through the specified regular expression based + substitutions.
+Multi-value.
+The name of a client-header filter, as defined in one of the + filter files.
+Client-header filters are applied to each header on its own, + not to all at once. This makes it easier to diagnose problems, + but on the downside you can't write filters that only change + header x if header y's value is z. You can do that by using + tags though.
+Client-header filters are executed after the other header + actions have finished and use their output as input.
+If the request URI gets changed, Privoxy will detect that and use the new + one. This can be used to rewrite the request destination behind + the client's back, for example to specify a Tor exit relay for + certain requests.
+Please refer to the filter file + chapter to learn which client-header filters are available + by default, and how to create your own.
+
+ + # Hide Tor exit notation in Host and Referer Headers +{+client-header-filter{hide-tor-exit-notation}} +/ ++ |
+
Block requests based on their headers.
+Client headers to which this action applies are filtered + on-the-fly through the specified regular expression based + substitutions, the result is used as tag.
+Multi-value.
+The name of a client-header tagger, as defined in one of the + filter files.
+Client-header taggers are applied to each header on its own, + and as the header isn't modified, each tagger "sees" the original.
+Client-header taggers are the first actions that are + executed and their tags can be used to control every other + action.
+
+ + # Tag every request with the User-Agent header +{+client-header-tagger{user-agent}} +/ + +# Tagging itself doesn't change the action +# settings, sections with TAG patterns do: +# +# If it's a download agent, use a different forwarding proxy, +# show the real User-Agent and make sure resume works. +{+forward-override{forward-socks5 10.0.0.2:2222 .} \ + -hide-if-modified-since \ + -overwrite-last-modified \ + -hide-user-agent \ + -filter \ + -deanimate-gifs \ +} +TAG:^User-Agent: NetBSD-ftp/ +TAG:^User-Agent: Novell ZYPP Installer +TAG:^User-Agent: RPM APT-HTTP/ +TAG:^User-Agent: fetch libfetch/ +TAG:^User-Agent: Ubuntu APT-HTTP/ +TAG:^User-Agent: MPlayer/ ++ |
+
+ + # Tag all requests with the Range header set +{+client-header-tagger{range-requests}} +/ + +# Disable filtering for the tagged requests. +# +# With filtering enabled Privoxy would remove the Range headers +# to be able to filter the whole response. The downside is that +# it prevents clients from resuming downloads or skipping over +# parts of multimedia files. +{-filter -deanimate-gifs} +TAG:^RANGE-REQUEST$ ++ |
+
Stop useless download menus from popping up, or change the + browser's rendering mode
+Replaces the "Content-Type:" HTTP + server header.
+Parameterized.
+Any string.
+The "Content-Type:" HTTP server + header is used by the browser to decide what to do with the + document. The value of this header can cause the browser to + open a download menu instead of displaying the document by + itself, even if the document's format is supported by the + browser.
+The declared content type can also affect which rendering + mode the browser chooses. If XHTML is delivered as "text/html", many browsers treat it as yet + another broken HTML document. If it is sent as "application/xml", browsers with XHTML support + will only display it if the syntax is correct.
+If you see a web site that proudly uses XHTML buttons, but + sets "Content-Type: text/html", you + can use Privoxy to overwrite + it with "application/xml" and + validate the web master's claim inside your XHTML-supporting + browser. If the syntax is incorrect, the browser will complain + loudly.
+You can also go the opposite direction: if your browser + prints error messages instead of rendering a document falsely + declared as XHTML, you can overwrite the content type with + "text/html" and have it rendered as + broken HTML document.
+By default content-type-overwrite + only replaces "Content-Type:" + headers that look like some kind of text. If you want to + overwrite it unconditionally, you have to combine it with + force-text-mode. + This limitation exists for a reason, think twice before + circumventing it.
+Most of the time it's easier to replace this action with a + custom server-header + filter. It allows you to activate it for every + document of a certain site and it will still only replace the + content types you aimed at.
+Of course you can apply content-type-overwrite to a whole site and then + make URL based exceptions, but it's a lot more work to get the + same precision.
+
+ + # Check if www.example.net/ really uses valid XHTML +{ +content-type-overwrite{application/xml} } +www.example.net/ + +# but leave the content type unmodified if the URL looks like a style sheet +{-content-type-overwrite} +www.example.net/.*\.css$ +www.example.net/.*style+ |
+
Remove a client header Privoxy has no dedicated action for.
+Deletes every header sent by the client that contains the + string the user supplied as parameter.
+Parameterized.
+Any string.
+This action allows you to block client headers for which no + dedicated Privoxy action + exists. Privoxy will remove + every client header that contains the string you supplied as + parameter.
+Regular expressions are not supported and you can't use this + action to block different headers in the same request, unless + they contain the same string.
+crunch-client-header is only meant + for quick tests. If you have to block several different + headers, or only want to modify parts of them, you should use a + client-header + filter.
+Warning | +
+ Don't block any header without understanding the + consequences. + |
+
+ + # Block the non-existent "Privacy-Violation:" client header +{ +crunch-client-header{Privacy-Violation:} } +/ ++ |
+
Prevent yet another way to track the user's steps between + sessions.
+Deletes the "If-None-Match:" HTTP + client header.
+Boolean.
+N/A
+Removing the "If-None-Match:" + HTTP client header is useful for filter testing, where you want + to force a real reload instead of getting status code + "304" which would cause the browser + to use a cached copy of the page.
+It is also useful to make sure the header isn't used as a + cookie replacement (unlikely but possible).
+Blocking the "If-None-Match:" + header shouldn't cause any caching problems, as long as the + "If-Modified-Since:" header isn't + blocked or missing as well.
+It is recommended to use this action together with + hide-if-modified-since + and overwrite-last-modified.
+
+ + # Let the browser revalidate cached documents but don't +# allow the server to use the revalidation headers for user tracking. +{+hide-if-modified-since{-60} \ + +overwrite-last-modified{randomize} \ + +crunch-if-none-match} +/+ |
+
Prevent the web server from setting HTTP cookies on your + system
+Deletes any "Set-Cookie:" HTTP + headers from server replies.
+Boolean.
+N/A
+This action is only concerned with incoming HTTP + cookies. For outgoing HTTP cookies, use crunch-outgoing-cookies. + Use both + to disable HTTP cookies completely.
+It makes no sense + at all to use this action in conjunction with the + session-cookies-only + action, since it would prevent the session cookies from being + set. See also filter-content-cookies.
+
+ +crunch-incoming-cookies+ |
+
Remove a server header Privoxy has no dedicated action for.
+Deletes every header sent by the server that contains the + string the user supplied as parameter.
+Parameterized.
+Any string.
+This action allows you to block server headers for which no + dedicated Privoxy action + exists. Privoxy will remove + every server header that contains the string you supplied as + parameter.
+Regular expressions are not supported and you can't use this + action to block different headers in the same request, unless + they contain the same string.
+crunch-server-header is only meant + for quick tests. If you have to block several different + headers, or only want to modify parts of them, you should use a + custom server-header + filter.
+Warning | +
+ Don't block any header without understanding the + consequences. + |
+
+ + # Crunch server headers that try to prevent caching +{ +crunch-server-header{no-cache} } +/+ |
+
Prevent the web server from reading any HTTP cookies from + your system
+Deletes any "Cookie:" HTTP + headers from client requests.
+Boolean.
+N/A
+This action is only concerned with outgoing HTTP + cookies. For incoming HTTP cookies, use crunch-incoming-cookies. + Use both + to disable HTTP cookies completely.
+It makes no sense + at all to use this action in conjunction with the + session-cookies-only + action, since it would prevent the session cookies from being + read.
+
+ +crunch-outgoing-cookies+ |
+
Stop those annoying, distracting animated GIF images.
+De-animate GIF animations, i.e. reduce them to their first + or last image.
+Parameterized.
+"last" or "first"
+This will also shrink the images considerably (in bytes, not + pixels!). If the option "first" is + given, the first frame of the animation is used as the + replacement. If "last" is given, the + last frame of the animation is used instead, which probably + makes more sense for most banner animations, but also has the + risk of not showing the entire last frame (if it is only a + delta to an earlier frame).
+You can safely use this action with patterns that will also + match non-GIF objects, because no attempt will be made at + anything that doesn't look like a GIF.
+
+ +deanimate-gifs{last}+ |
+
Work around (very rare) problems with HTTP/1.1
+Downgrades HTTP/1.1 client requests and server replies to + HTTP/1.0.
+Boolean.
+N/A
+This is a left-over from the time when Privoxy didn't support important HTTP/1.1 + features well. It is left here for the unlikely case that you + experience HTTP/1.1-related problems with some server out + there.
+Note that enabling this action is only a workaround. It + should not be enabled for sites that work without it. While it + shouldn't break any pages, it has a (usually negative) + performance impact.
+If you come across a site where enabling this action helps, + please report it, so the cause of the problem can be analyzed. + If the problem turns out to be caused by a bug in Privoxy it should be fixed so the + following release works without the work around.
+
+ {+downgrade-http-version} +problem-host.example.com+ |
+
Modify content using a programming language of your + choice.
+All instances of text-based type, most notably HTML and + JavaScript, to which this action applies, can be filtered + on-the-fly through the specified external filter. By default + plain text documents are exempted from filtering, because web + servers often use the text/plain MIME + type for all files whose type they don't know.
+Multi-value.
+The name of an external content filter, as defined in the + filter file. External filters + can be defined in one or more files as defined by the + filterfile option in the + config file.
+When used in its negative form, and without parameters, + all + filtering with external filters is completely disabled.
+External filters are scripts or programs that can modify the + content in case common filters aren't powerful + enough. With the exception that this action doesn't use + pcrs-based filters, the notes in the filter + section apply.
+Warning | +
+ Currently external filters are executed with + Privoxy's privileges. + Only use external filters you understand and trust. + |
+
This feature is experimental, the syntax may + change in the future.
+
+ +external-filter{fancy-filter}+ |
+
Fool some click-tracking scripts and speed up indirect + links.
+Detects redirection URLs and redirects the browser without + contacting the redirection server first.
+Parameterized.
+"simple-check" to just search + for the string "http://" to + detect redirection URLs.
+"check-decoded-url" to decode + URLs (if necessary) before searching for redirection + URLs.
+Many sites, like yahoo.com, don't just link to other sites. + Instead, they will link to some script on their own servers, + giving the destination as a parameter, which will then redirect + you to the final target. URLs resulting from this scheme + typically look like: "http://www.example.org/click-tracker.cgi?target=http%3a//www.example.net/".
+Sometimes, there are even multiple consecutive redirects + encoded in the URL. These redirections via scripts make your + web browsing more traceable, since the server from which you + follow such a link can see where you go to. Apart from that, + valuable bandwidth and time is wasted, while your browser asks + the server for one redirect after the other. Plus, it feeds the + advertisers.
+This feature is currently not very smart and is scheduled + for improvement. If it is enabled by default, you will have to + create some exceptions to this action. It can lead to failures + in several ways:
+Not every URL with other URLs as parameters is evil. Some + sites offer a real service that requires this information to + work. For example, a validation service needs to know which + document to validate. fast-redirects + assumes that every URL parameter that looks like another URL is + a redirection target, and will always redirect to the last one. + Most of the time the assumption is correct, but if it isn't, + the user gets redirected anyway.
+Another failure occurs if the URL contains other parameters + after the URL parameter. The URL "http://www.example.org/?redirect=http%3a//www.example.net/&foo=bar" + contains the redirection URL "http://www.example.net/", followed by another + parameter. fast-redirects doesn't know + that and will cause a redirect to "http://www.example.net/&foo=bar". Depending + on the target server configuration, the parameter will be + silently ignored or lead to a "page not + found" error. You can prevent this problem by first + using the redirect action to remove + the last part of the URL, but it requires a little effort.
+To detect a redirection URL, fast-redirects only looks for the string + "http://", either in plain text + (invalid but often used) or encoded as "http%3a//". Some sites use their own URL + encoding scheme, encrypt the address of the target server or + replace it with a database id. In these cases fast-redirects is fooled and the request reaches + the redirection server where it probably gets logged.
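If fast-redirects is enabled broadly, the exceptions mentioned above could be sketched as follows. The validator host is hypothetical and stands in for any service that legitimately takes a URL as a parameter; this is an assumed illustration, not a rule from the distributed actions files.

```
# Enable fast-redirects for all URLs
{+fast-redirects{check-decoded-url}}
/

# Later section: switch it off again for a service that needs
# the URL parameter to reach the server unchanged.
{-fast-redirects}
validator.example.org
```

Since the exception section comes after the general one, the last match wins for requests to the validator host.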
+
+ { +fast-redirects{simple-check} } + one.example.com + + { +fast-redirects{check-decoded-url} } + another.example.com/testing+ |
+
Get rid of HTML and JavaScript annoyances, banner + advertisements (by size), do fun text replacements, add + personalized effects, etc.
+All instances of text-based type, most notably HTML and + JavaScript, to which this action applies, can be filtered + on-the-fly through the specified regular expression based + substitutions. (Note: as of version 3.0.3 plain text documents + are exempted from filtering, because web servers often use the + text/plain MIME type for all files + whose type they don't know.)
+Multi-value.
+The name of a content filter, as defined in the filter file. Filters can be defined in + one or more files as defined by the filterfile + option in the config file. default.filter is the collection of filters + supplied by the developers. Locally defined filters should go + in their own file, such as user.filter.
+When used in its negative form, and without parameters, + all + filtering is completely disabled.
+For your convenience, there are a number of pre-defined + filters available in the distribution filter file that you can + use. See the examples below for a list.
+Filtering requires buffering the page content, which may + appear to slow down page rendering since nothing is displayed + until all content has passed the filters. (The total time until + the page is completely rendered doesn't change much, but it may + be perceived as slower since the page is not incrementally + displayed.) This effect will be more noticeable on slower + connections.
+"Rolling your own" filters + requires knowledge of "Regular Expressions" and + "HTML". This is a very + powerful feature, and potentially very intrusive. Filters + should be used with caution, and where an equivalent + "action" is not available.
+The amount of data that can be filtered is limited to the + buffer-limit option in the + main config file. The default is 4096 + KB (4 Megs). Once this limit is exceeded, the buffered data, + and all pending data, is passed through unfiltered.
+Inappropriate MIME types, such as zipped files, are not + filtered at all. (Again, only text-based types except plain + text). Encrypted SSL data (from HTTPS servers) cannot be + filtered either, since this would violate the integrity of the + secure transaction. In some situations it might be necessary to + protect certain text, like source code, from filtering by + defining appropriate -filter + exceptions.
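A -filter exception of the kind described above might be sketched like this. The host name and the file extension are assumptions chosen for illustration only.

```
# Leave everything on a (hypothetical) source code host unfiltered
{-filter}
code.example.org/
# Path-only pattern: patch files on any host
/.*\.patch$
```

Used without a parameter, -filter disables all content filters for the matching URLs, so source code shown on those pages is passed through unmodified.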
+Compressed content can't be filtered either, but if + Privoxy is compiled with zlib + support and a supported compression algorithm is used (gzip or + deflate), Privoxy can first + decompress the content and then filter it.
+If you use a Privoxy + version without zlib support, but want filtering to work on as + many documents as possible, even those that would normally be + sent compressed, you must use the prevent-compression + action in conjunction with filter.
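For a build without zlib support, the combination described above could look like the following sketch. The choice of filter and the catch-all pattern are assumptions made for the example.

```
# Ask servers not to compress, so that the filter can see the content
{+prevent-compression +filter{html-annoyances}}
/
```

With zlib support compiled in, this pairing should not be necessary for gzip or deflate content, since Privoxy can decompress it before filtering.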
+Content filtering can achieve some of the same effects as + the block action, i.e. it can be + used to block ads and banners. But the mechanism works quite + differently. One effective use, is to block ad banners based on + their size (see below), since many of these seem to be somewhat + standardized.
+Feedback with suggestions for new + or improved filters is particularly welcome!
+The below list has only the names and a one-line description + of each predefined filter. There are more verbose + explanations of what these filters do in the filter file chapter.
+
+ + +filter{js-annoyances} # Get rid of particularly annoying JavaScript abuse.+ |
+
+ + +filter{js-events} # Kill JavaScript event bindings and timers (Radically destructive! Only for extra nasty sites).+ |
+
+ + +filter{html-annoyances} # Get rid of particularly annoying HTML abuse.+ |
+
+ + +filter{content-cookies} # Kill cookies that come in the HTML or JS content.+ |
+
+ + +filter{refresh-tags} # Kill automatic refresh tags if refresh time is larger than 9 seconds.+ |
+
+ + +filter{unsolicited-popups} # Disable only unsolicited pop-up windows.+ |
+
+ + +filter{all-popups} # Kill all popups in JavaScript and HTML.+ |
+
+ + +filter{img-reorder} # Reorder attributes in <img> tags to make the banners-by-* filters more effective.+ |
+
+ + +filter{banners-by-size} # Kill banners by size.+ |
+
+ + +filter{banners-by-link} # Kill banners by their links to known clicktrackers.+ |
+
+ + +filter{webbugs} # Squish WebBugs (1x1 invisible GIFs used for user tracking).+ |
+
+ + +filter{tiny-textforms} # Extend those tiny textareas up to 40x80 and kill the hard wrap.+ |
+
+ + +filter{jumping-windows} # Prevent windows from resizing and moving themselves.+ |
+
+ + +filter{frameset-borders} # Give frames a border and make them resizable.+ |
+
+ + +filter{iframes} # Removes all detected iframes. Should only be enabled for individual sites.+ |
+
+ + +filter{demoronizer} # Fix MS's non-standard use of standard charsets.+ |
+
+ + +filter{shockwave-flash} # Kill embedded Shockwave Flash objects.+ |
+
+ + +filter{quicktime-kioskmode} # Make Quicktime movies saveable.+ |
+
+ + +filter{fun} # Text replacements for subversive browsing fun!+ |
+
+ + +filter{crude-parental} # Crude parental filtering. Note that this filter doesn't work reliably.+ |
+
+ + +filter{ie-exploits} # Disable some known Internet Explorer bug exploits.+ |
+
+ + +filter{site-specifics} # Cure for site-specific problems. Don't apply generally!+ |
+
+ + +filter{no-ping} # Removes non-standard ping attributes in <a> and <area> tags.+ |
+
+ + +filter{google} # CSS-based block for Google text ads. Also removes a width limitation and the toolbar advertisement.+ |
+
+ + +filter{yahoo} # CSS-based block for Yahoo text ads. Also removes a width limitation.+ |
+
+ + +filter{msn} # CSS-based block for MSN text ads. Also removes tracking URLs and a width limitation.+ |
+
+ + +filter{blogspot} # Cleans up some Blogspot blogs. Read the fine print before using this.+ |
+
Force Privoxy to treat a + document as if it was in some kind of text format.
+Declares a document as text, even if the "Content-Type:" isn't detected as such.
+Boolean.
+N/A
+As explained above, Privoxy tries to only filter files that + are in some kind of text format. The same restrictions apply to + content-type-overwrite. + force-text-mode declares a document as + text, without looking at the "Content-Type:" first.
+Warning | +
+ Think twice before activating this action. Filtering + binary data with regular expressions can cause file + damage. + |
+
+ +force-text-mode ++ |
+
Change the forwarding settings based on User-Agent or + request origin
+Overrules the forward directives in the configuration + file.
+Parameterized.
+"forward ." to use a direct + connection without any additional proxies.
+"forward 127.0.0.1:8123" to + use the HTTP proxy listening at 127.0.0.1 port 8123.
+"forward-socks4a 127.0.0.1:9050 + ." to use the socks4a proxy listening at 127.0.0.1 + port 9050. Replace "forward-socks4a" with "forward-socks4" to use a socks4 connection + (with local DNS resolution) instead, use "forward-socks5" for socks5 connections + (with remote DNS resolution).
+"forward-socks4a 127.0.0.1:9050 + proxy.example.org:8000" to use the socks4a proxy + listening at 127.0.0.1 port 9050 to reach the HTTP proxy + listening at proxy.example.org port 8000. Replace + "forward-socks4a" with + "forward-socks4" to use a socks4 + connection (with local DNS resolution) instead, use + "forward-socks5" for socks5 + connections (with remote DNS resolution).
+"forward-webserver + 127.0.0.1:80" to use the HTTP server listening at + 127.0.0.1 port 80 without adjusting the request + headers.
+This makes it more convenient to use Privoxy to make + existing websites available as onion services as well.
+Many websites serve content with hardcoded URLs and + can't be easily adjusted to change the domain based on the + one used by the client.
+Putting Privoxy between Tor and the webserver (or an stunnel that forwards to the webserver) makes it possible to rewrite headers and content to make client and server happy at the same time.
+Using Privoxy for webservers that are only reachable + through onion addresses and whose location is supposed to + be secret is not recommended and should not be necessary + anyway.
+This action takes parameters similar to the forward directives in the + configuration file, but without the URL pattern. It can be used + as replacement, but normally it's only used in cases where + matching based on the request URL isn't sufficient.
+Warning | +
+ Please read the description for the forward directives before using this action. Forwarding to the wrong people will reduce your privacy and increase the chances of man-in-the-middle attacks.
+If the ports are missing or invalid, default values will be used. This might change in the future and you shouldn't rely on it. Otherwise incorrect syntax causes Privoxy to exit. Due to design limitations, invalid parameter syntax isn't detected until the action is used the first time.
+Use the show-url-info CGI page to verify that your forward settings do what you thought they do.
+ |
+
+
+ # Use an ssh tunnel for requests previously tagged as
+# "User-Agent: fetch libfetch/2.0" and make sure
+# resuming downloads continues to work.
+#
+# This way you can continue to use Tor for your normal browsing,
+# without overloading the Tor network with your FreeBSD ports updates
+# or downloads of bigger files like ISOs.
+#
+# Note that HTTP headers are easy to fake and therefore their
+# values are as (un)trustworthy as your clients and users.
+{+forward-override{forward-socks5 10.0.0.2:2222 .} \
+ -hide-if-modified-since \
+ -overwrite-last-modified \
+}
+TAG:^User-Agent: fetch libfetch/2\.0$
+
+ |
+
Mark URLs that should be replaced by empty documents + if they get + blocked
+This action alone doesn't do anything noticeable. It just + marks URLs. If the block action also applies, the + presence or absence of this mark decides whether an HTML + "BLOCKED" page, or an empty document + will be sent to the client as a substitute for the blocked + content. The empty document isn't literally empty, but + actually contains a single space.
+Boolean.
+N/A
+Some browsers complain about syntax errors if JavaScript + documents are blocked with Privoxy's default HTML page; this option + can be used to silence them. And of course this action can also + be used to eliminate the Privoxy BLOCKED message in frames.
+The content type for the empty document can be specified + with content-type-overwrite{}, + but usually this isn't necessary.
+
+
+ # Block all documents on example.org that end with ".js",
+# but send an empty document instead of the usual HTML message.
+{+block{Blocked JavaScript} +handle-as-empty-document}
+example.org/.*\.js$
+
+ |
+
Mark URLs as belonging to images (so they'll be replaced by + images if they do + get blocked, rather than HTML pages)
+This action alone doesn't do anything noticeable. It just + marks URLs as images. If the block action also applies, the + presence or absence of this mark decides whether an HTML + "blocked" page, or a replacement + image (as determined by the set-image-blocker + action) will be sent to the client as a substitute for the + blocked content.
+Boolean.
+N/A
+The below generic example section is actually part of + default.action. It marks all URLs + with well-known image file name extensions as images and should + be left intact.
+Users will probably only want to use the handle-as-image + action in conjunction with block, to block sources of + banners, whose URLs don't reflect the file type, like in the + second example section.
+Note that you cannot treat HTML pages as images in most + cases. For instance, (in-line) ad frames require an HTML page + to be sent, or they won't display properly. Forcing handle-as-image in this situation will not + replace the ad frame with an image, but lead to error + messages.
+
+ # Generic image extensions:
+#
+{+handle-as-image}
+/.*\.(gif|jpg|jpeg|png|bmp|ico)$
+
+# These don't look like images, but they're banners and should be
+# blocked as images:
+#
+{+block{Nasty banners.} +handle-as-image}
+nasty-banner-server.example.com/junk.cgi\?output=trash
+ |
+
Pretend to use different language settings.
+Deletes or replaces the "Accept-Language:" HTTP header in client + requests.
+Parameterized.
+Keyword: "block", or any user + defined value.
+Faking the browser's language settings can be useful to make + a foreign User-Agent set with hide-user-agent + more believable.
+However, some sites with content in different languages check the "Accept-Language:" header to decide which one to use by default. Sometimes it isn't possible to later switch to another language without changing the "Accept-Language:" header first.
+Therefore it's a good idea to change the "Accept-Language:" header either only to languages you understand, or to languages that aren't widespread.
+Before setting the "Accept-Language:" header to a rare language, + you should consider that it helps to make your requests unique + and thus easier to trace. If you don't plan to change this + header frequently, you should stick to a common language.
+
+
+ # Pretend to use Canadian language settings.
+{+hide-accept-language{en-ca} \
++hide-user-agent{Mozilla/5.0 (X11; U; OpenBSD i386; en-CA; rv:1.8.0.4) Gecko/20060628 Firefox/1.5.0.4} \
+}
+/
+ |
+
Prevent download menus for content you prefer to view inside + the browser.
+Deletes or replaces the "Content-Disposition:" HTTP header set by some + servers.
+Parameterized.
+Keyword: "block", or any user + defined value.
+Some servers set the "Content-Disposition:" HTTP header for documents + they assume you want to save locally before viewing them. The + "Content-Disposition:" header + contains the file name the browser is supposed to use by + default.
+In most browsers that understand this header, it makes it + impossible to just + view the document, without downloading it first, + even if it's just a simple text file or an image.
+Removing the "Content-Disposition:" header helps to prevent + this annoyance, but some browsers additionally check the + "Content-Type:" header, before they + decide if they can display a document without saving it first. + In these cases, you have to change this header as well, before + the browser stops displaying download menus.
+It is also possible to change the server's file name + suggestion to another one, but in most cases it isn't worth the + time to set it up.
+This action will probably be removed in the future; use server-header filters instead.
+
+
+ # Disarm the download link in Sourceforge's patch tracker
+{ -filter \
+ +content-type-overwrite{text/plain}\
+ +hide-content-disposition{block} }
+ .sourceforge.net/tracker/download\.php
+ |
+
Prevent yet another way to track the user's steps between + sessions.
+Deletes the "If-Modified-Since:" + HTTP client header or modifies its value.
+Parameterized.
+Keyword: "block", or a user defined value that specifies a range of minutes.
+Removing this header is useful for filter testing, where you + want to force a real reload instead of getting status code + "304", which would cause the browser + to use a cached copy of the page.
+Instead of removing the header, hide-if-modified-since can also add or subtract + a random amount of time to/from the header's value. You specify + a range of minutes where the random factor should be chosen + from and Privoxy does the + rest. A negative value means subtracting, a positive value + adding.
+Randomizing the value of the "If-Modified-Since:" makes it less likely that + the server can use the time as a cookie replacement, but you + will run into caching problems if the random range is too + high.
+It is a good idea to only use a small negative value and let + overwrite-last-modified + handle the greater changes.
+It is also recommended to use this action together with + crunch-if-none-match, + otherwise it's more or less pointless.
+
+
+ # Let the browser revalidate but make tracking based on the time less likely.
+{+hide-if-modified-since{-60} \
+ +overwrite-last-modified{randomize} \
+ +crunch-if-none-match}
+/
+ |
+
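The randomization described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Privoxy's implementation: the function name `randomize_if_modified_since` is hypothetical, and the uniform choice of the offset within the range is an assumption.

```python
import random
from datetime import timedelta
from email.utils import format_datetime, parsedate_to_datetime


def randomize_if_modified_since(value: str, range_minutes: int, rng=random) -> str:
    """Shift an If-Modified-Since value by a random offset.

    A negative range subtracts time, a positive range adds time.
    The uniform distribution is an assumption made for this sketch.
    """
    original = parsedate_to_datetime(value)
    offset_seconds = rng.randint(0, abs(range_minutes) * 60)
    if range_minutes < 0:
        offset_seconds = -offset_seconds
    shifted = original + timedelta(seconds=offset_seconds)
    return format_datetime(shifted, usegmt=True)
```

With a parameter of `-60`, the value sent to the server lies somewhere in the hour before the original timestamp, so the server still sees a plausible revalidation request.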
Keep your (old and ill) browser from telling web servers + your email address
+Deletes any existing "From:" HTTP + header, or replaces it with the specified string.
+Parameterized.
+Keyword: "block", or any user + defined value.
+The keyword "block" will + completely remove the header (not to be confused with the + block action).
+Alternatively, you can specify any value you prefer to be sent to the web server. If you do, it is a matter of fairness not to use any address that is actually used by a real person.
+This action is rarely needed, as modern web browsers don't + send "From:" headers anymore.
+
+ +hide-from-header{block}+ |
+
+ + +hide-from-header{spam-me-senseless@sittingduck.example.com}+ |
+
Conceal which link you followed to get to a particular + site
+Deletes the "Referer:" (sic) HTTP + header from the client request, or replaces it with a forged + one.
+Parameterized.
+"conditional-block" to delete + the header completely if the host has changed.
+"conditional-forge" to forge + the header if the host has changed.
+"block" to delete the header + unconditionally.
+"forge" to pretend to be + coming from the homepage of the server we are talking + to.
+Any other string to set a user defined referrer.
+conditional-block is the only parameter that isn't easily detected in the server's log file. If it blocks the referrer, the request will look like the visitor used a bookmark or typed in the address directly.
+Leaving the referrer unmodified for requests on the same + host allows the server owner to see the visitor's "click path", but in most cases she could also + get that information by comparing other parts of the log file: + for example the User-Agent if it isn't a very common one, or + the user's IP address if it doesn't change between different + requests.
+Always blocking the referrer, or using a custom one, can + lead to failures on servers that check the referrer before they + answer any requests, in an attempt to prevent their content + from being embedded or linked to elsewhere.
+Both conditional-block and + forge will work with referrer checks, + as long as content and valid referring page are on the same + host. Most of the time that's the case.
+hide-referer is an alternate spelling of hide-referrer and the two can be freely substituted with each other. ("referrer" is the correct English spelling, however the HTTP specification has a bug - it requires it to be spelled as "referer".)
+
+ +hide-referrer{forge}+ |
+
+ + +hide-referrer{http://www.yahoo.com/}+ |
+
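The parameter list above amounts to a small decision table. The following Python sketch spells it out; it is not Privoxy's code. The function name `rewrite_referrer` is hypothetical, and the assumption that the forged value is `http://` followed by the request host and a slash is ours.

```python
from typing import Optional
from urllib.parse import urlsplit


def rewrite_referrer(mode: str, referrer: str, request_url: str) -> Optional[str]:
    """Return the Referer value to send, or None if the header is deleted."""
    request_host = urlsplit(request_url).hostname
    host_changed = urlsplit(referrer).hostname != request_host
    # Assumed shape of the forged value: the homepage of the target server.
    forged = f"http://{request_host}/"

    if mode == "block":
        return None
    if mode == "forge":
        return forged
    if mode == "conditional-block":
        return None if host_changed else referrer
    if mode == "conditional-forge":
        return forged if host_changed else referrer
    # Any other string is used as the referrer itself.
    return mode
```

The two conditional modes leave same-host referrers untouched, which is why they keep working on servers that check the referrer.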
Try to conceal your type of browser and client operating + system
+Replaces the value of the "User-Agent:" HTTP header in client requests + with the specified value.
+Parameterized.
+Any user-defined string.
+Warning | +
+ This can lead to problems on web sites that depend + on looking at this header in order to customize their + content for different browsers (which, by the way, is + NOT the right thing to do: good + web sites work browser-independently). + |
+
Using this action in multi-user setups or wherever different + types of browsers will access the same Privoxy is not recommended. In + single-user, single-browser setups, you might use it to delete + your OS version information from the headers, because it is an + invitation to exploit known bugs for your OS. It is also + occasionally useful to forge this in order to access sites that + won't let you in otherwise (though there may be a good reason + in some cases).
+More information on known user-agent strings can be found at + http://www.user-agents.org/ and http://en.wikipedia.org/wiki/User_agent.
+
+ + +hide-user-agent{Netscape 6.1 (X11; I; Linux 2.4.18 i686)}+ |
+
Prevent abuse of Privoxy as + a TCP proxy relay or disable SSL for untrusted sites
+Specifies the ports to which HTTP CONNECT requests are allowed.
+Parameterized.
+A comma-separated list of ports or port ranges (the latter + using dashes, with the minimum defaulting to 0 and the maximum + to 65K).
+By default, i.e. if no limit-connect action applies, Privoxy allows HTTP CONNECT requests to + all ports. Use limit-connect if + fine-grained control is desired for some or all + destinations.
+The CONNECT method exists in HTTP to allow access to secure websites ("https://" URLs) through proxies. It works very simply: the proxy connects to the server on the specified port, and then short-circuits its connections to the client and to the remote server. This means CONNECT-enabled proxies can be used as TCP relays very easily.
+Privoxy relays HTTPS + traffic without seeing the decoded content. Websites can + leverage this limitation to circumvent Privoxy's filters. By specifying an + invalid port range you can disable HTTPS entirely.
+
+
+ +limit-connect{443}  # Port 443 is OK.
++limit-connect{80,443}  # Ports 80 and 443 are OK.
++limit-connect{-3, 7, 20-100, 500-}  # Ports less than 3, 7, 20 to 100 and above 500 are OK.
++limit-connect{-}  # All ports are OK
++limit-connect{,}  # No HTTPS/SSL traffic is allowed
+ |
+
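To make the port-list syntax above concrete, here is a small Python sketch that evaluates such a list. It is an illustration rather than Privoxy's parser: the function names are hypothetical, "65K" is assumed to mean 65535, and range bounds are assumed to be inclusive.

```python
def parse_port_list(spec: str) -> list[tuple[int, int]]:
    """Turn a limit-connect parameter into inclusive (low, high) ranges."""
    ranges = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            # "," consists of empty items only, so no port is allowed.
            continue
        if "-" in item:
            low, high = item.split("-", 1)
            ranges.append((
                int(low) if low.strip() else 0,        # missing minimum
                int(high) if high.strip() else 65535,  # missing maximum (assumed)
            ))
        else:
            ranges.append((int(item), int(item)))
    return ranges


def connect_allowed(spec: str, port: int) -> bool:
    """Check whether a CONNECT request to the given port would be allowed."""
    return any(low <= port <= high for low, high in parse_port_list(spec))
```

For example, `connect_allowed("-3, 7, 20-100, 500-", 8080)` is true because 8080 falls into the open-ended range `500-`, while `connect_allowed(",", 443)` is false.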
Limit the lifetime of HTTP cookies to a couple of minutes or + hours.
+Overwrites the expires field in Set-Cookie server headers if + it's above the specified limit.
+Parameterized.
+The lifetime limit in minutes, or 0.
+This action reduces the lifetime of HTTP cookies coming from + the server to the specified number of minutes, starting from + the time the cookie passes Privoxy.
+Cookies with a lifetime below the limit are not modified. + The lifetime of session cookies is set to the specified + limit.
+The effect of this action depends on the server.
+In case of servers which refresh their cookies with each + response (or at least frequently), the lifetime limit set by + this action is updated as well. Thus, a session associated with + the cookie continues to work with this action enabled, as long + as a new request is made before the last limit set is + reached.
+However, some servers send their cookies once, with a + lifetime of several years (the year 2037 is a popular choice), + and do not refresh them until a certain event in the future, + for example the user logging out. In this case this action may + limit the absolute lifetime of the session, even if requests + are made frequently.
+If the parameter is "0", this + action behaves like session-cookies-only.
+
+ +limit-cookie-lifetime{60} ++ |
+
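The rules above (cap long lifetimes, leave short ones alone, give session cookies the limit, treat 0 like session-cookies-only) can be summarized in a short Python sketch. The function name `limit_cookie_expiry` and the use of `None` for "no expires field" are assumptions made for illustration only.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def limit_cookie_expiry(expires: Optional[datetime], now: datetime,
                        limit_minutes: int) -> Optional[datetime]:
    """Return the expiry time a cookie would carry after the action applied.

    None stands for a cookie without an expires field (a session cookie).
    """
    if limit_minutes == 0:
        # Behaves like session-cookies-only: the expires field is dropped.
        return None
    cap = now + timedelta(minutes=limit_minutes)
    if expires is None:
        # Session cookies get the specified limit as their lifetime.
        return cap
    # Lifetimes below the limit are kept, longer ones are cut down.
    return min(expires, cap)
```

Because the cap is computed from the time the cookie passes through the proxy, a server that re-sends its cookie with every response keeps pushing the expiry forward.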
Ensure that servers send the content uncompressed, so it can + be passed through filters.
+Removes the Accept-Encoding header which can be used to ask + for compressed transfer.
+Boolean.
+N/A
+More and more websites send their content compressed by + default, which is generally a good idea and saves bandwidth. + But the filter and deanimate-gifs + actions need access to the uncompressed data.
+When compiled with zlib support (available since + Privoxy 3.0.7), content that + should be filtered is decompressed on-the-fly and you don't + have to worry about this action. If you are using an older + Privoxy version, or one that + hasn't been compiled with zlib support, this action can be used + to convince the server to send the content uncompressed.
+Most text-based instances compress very well: the size is seldom decreased by less than 50%, and for markup-heavy instances like news feeds, saving more than 90% of the original size isn't unusual.
+Not using compression will therefore slow down the transfer, + and you should only enable this action if you really need it. + As of Privoxy 3.0.7 it's + disabled in all predefined action settings.
+Note that some (rare) ill-configured sites don't handle requests for uncompressed documents correctly. Broken PHP applications tend to send an empty document body, and some IIS versions only send the beginning of the content. If you enable prevent-compression by default, you might want to add exceptions for those sites. See the example for how to do that.
+
+
+ # Selectively turn off compression, and enable a filter
+#
+{ +filter{tiny-textforms} +prevent-compression }
+# Match only these sites
+ .google.
+ sourceforge.net
+ sf.net
+
+# Or instead, we could set a universal default:
+#
+{ +prevent-compression }
+ / # Match all sites
+
+# Then maybe make exceptions for broken sites:
+#
+{ -prevent-compression }
+.compusa.com/
+ |
+
Prevent yet another way to track the user's steps between + sessions.
+Deletes the "Last-Modified:" HTTP + server header or modifies its value.
+Parameterized.
+One of the keywords: "block", + "reset-to-request-time" and + "randomize"
+Removing the "Last-Modified:" + header is useful for filter testing, where you want to force a + real reload instead of getting status code "304", which would cause the browser to reuse + the old version of the page.
+The "randomize" option overwrites + the value of the "Last-Modified:" + header with a randomly chosen time between the original value + and the current time. In theory the server could send each + document with a different "Last-Modified:" header to track visits without + using cookies. "Randomize" makes it + impossible and the browser can still revalidate cached + documents.
+"reset-to-request-time" + overwrites the value of the "Last-Modified:" header with the current time. + You could use this option together with hide-if-modified-since + to further customize your random range.
+The preferred parameter here is "randomize". It is safe to use, as long as the time settings are more or less correct. If the server sets the "Last-Modified:" header to the time of the request, the random range becomes zero and the value stays the same. Therefore you should later randomize it a second time with hide-if-modified-since, just to be sure.
+It is also recommended to use this action together with + crunch-if-none-match.
+
+
+ # Let the browser revalidate without being tracked across sessions
+{ +hide-if-modified-since{-60} \
+ +overwrite-last-modified{randomize} \
+ +crunch-if-none-match}
+/
+ |
+
Redirect requests to other sites.
+Convinces the browser that the requested document has been + moved to another location and the browser should get it from + there.
+Parameterized
+An absolute URL or a single pcrs command.
+Requests to which this action applies are answered with an HTTP redirect to URLs of your choosing. The new URL is either provided as parameter, or derived by applying a single pcrs command to the original URL.
+The syntax for pcrs commands is documented in the filter file section.
+Requests can't be blocked and redirected at the same time; applying this action together with block is a configuration error. Currently the request is blocked and an error message is logged, but the behavior may change in the future and result in Privoxy rejecting the action file.
+This action can be combined with fast-redirects{check-decoded-url} + to redirect to a decoded version of a rewritten URL.
+Use this action carefully, make sure not to create + redirection loops and be aware that using your own redirects + might make it possible to fingerprint your requests.
+In case of problems with your redirects, or simply to watch + them working, enable debug + 128.
+
+
+ # Replace example.com's style sheet with another one
+{ +redirect{http://localhost/css-replacements/example.com.css} }
+ example.com/stylesheet\.css
+
+# Create a short, easy to remember nickname for a favorite site
+# (relies on the browser to accept and forward invalid URLs to Privoxy)
+{ +redirect{https://www.privoxy.org/user-manual/actions-file.html} }
+ a
+
+# Always use the expanded view for Undeadly.org articles
+# (Note the $ at the end of the URL pattern to make sure
+# the request for the rewritten URL isn't redirected as well)
+{+redirect{s@$@&mode=expanded@}}
+undeadly.org/cgi\?action=article&sid=\d*$
+
+# Redirect Google search requests to MSN
+{+redirect{s@^http://[^/]*/search\?q=([^&]*).*@http://search.msn.com/results.aspx?q=$1@}}
+.google.com/search
+
+# Redirect MSN search requests to Yahoo
+{+redirect{s@^http://[^/]*/results\.aspx\?q=([^&]*).*@http://search.yahoo.com/search?p=$1@}}
+search.msn.com//results\.aspx\?q=
+
+# Redirect http://example.com/&bla=fasel&toChange=foo (and any other value but "bar")
+# to http://example.com/&bla=fasel&toChange=bar
+#
+# The URL pattern makes sure that the following request isn't redirected again.
+{+redirect{s@toChange=[^&]+@toChange=bar@}}
+example.com/.*toChange=(?!bar)
+
+# Add a shortcut to look up illumos bugs
+{+redirect{s@^http://i([0-9]+)/.*@https://www.illumos.org/issues/$1@}}
+# Redirected URL = http://i4974/
+# Redirect Destination = https://www.illumos.org/issues/4974
+i[0-9][0-9][0-9][0-9]*/
+
+# Redirect remote requests for this manual
+# to the local version delivered by Privoxy
+{+redirect{s@^http://www@http://config@}}
+www.privoxy.org/user-manual/
+ |
+
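To get a feeling for how a single pcrs command turns the original URL into the redirect target, the substitution can be approximated with Python's `re` module. This is a rough emulation, not pcrs itself: the helper `apply_pcrs` is hypothetical, it assumes the delimiter never occurs inside the pattern or replacement, and it only understands `$1`-style references and the `g` and `i` flags.

```python
import re


def apply_pcrs(command: str, url: str) -> str:
    """Apply a simple s<d>pattern<d>replacement<d>flags command to a URL."""
    if len(command) < 2 or not command.startswith("s"):
        raise ValueError("only substitute commands are handled in this sketch")
    delimiter = command[1]
    parts = command.split(delimiter)
    if len(parts) != 4:
        raise ValueError("delimiter inside pattern or replacement is not supported")
    _, pattern, replacement, flags = parts

    def expand(match: re.Match) -> str:
        # Replace $1, $2, ... with the text of the matching capture group.
        return re.sub(r"\$(\d+)",
                      lambda ref: match.group(int(ref.group(1))) or "",
                      replacement)

    count = 0 if "g" in flags else 1  # without "g" only the first match is replaced
    re_flags = re.IGNORECASE if "i" in flags else 0
    return re.sub(pattern, expand, url, count=count, flags=re_flags)
```

Running the Google-to-MSN command from the examples against `http://www.google.com/search?q=privoxy&hl=en` yields `http://search.msn.com/results.aspx?q=privoxy`. Trying a command this way before putting it into an actions file can help to spot redirection loops early.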
Rewrite or remove single server headers.
+All server headers to which this action applies are filtered + on-the-fly through the specified regular expression based + substitutions.
+Multi-value.
+The name of a server-header filter, as defined in one of the + filter files.
+Server-header filters are applied to each header on its own, + not to all at once. This makes it easier to diagnose problems, + but on the downside you can't write filters that only change + header x if header y's value is z. You can do that by using + tags though.
+Server-header filters are executed after the other header + actions have finished and use their output as input.
+Please refer to the filter file + chapter to learn which server-header filters are available + by default, and how to create your own.
+
+ {+server-header-filter{html-to-xml}}
+example.org/xml-instance-that-is-delivered-as-html
+
+{+server-header-filter{xml-to-html}}
+example.org/instance-that-is-delivered-as-xml-but-is-not
+
+ |
+
Enable or disable filters based on the Content-Type + header.
+Server headers to which this action applies are filtered + on-the-fly through the specified regular expression based + substitutions, the result is used as tag.
+Multi-value.
+The name of a server-header tagger, as defined in one of the + filter files.
+Server-header taggers are applied to each header on its own, + and as the header isn't modified, each tagger "sees" the original.
+Server-header taggers are executed before all other header + actions that modify server headers. Their tags can be used to + control all of the other server-header actions, the content + filters and the crunch actions (redirect and block).
+Obviously crunching based on tags created by server-header + taggers doesn't prevent the request from showing up in the + server's log file.
+
+
+ # Tag every request with the content type declared by the server
+{+server-header-tagger{content-type}}
+/
+
+# If the response has a tag starting with 'image/' enable an external
+# filter that only applies to images.
+#
+# Note that the filter is not available by default, it's just a
+# silly example.
+{+external-filter{rotate-image} +force-text-mode}
+TAG:^image/
+
+ |
+
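The interplay between a tagger and a TAG: pattern can be pictured as two small steps: derive a tag from a header without changing the header, then match the tag against the pattern. The Python sketch below shows that idea only. Both function names and the regular expression are illustrative assumptions, not the definitions shipped in Privoxy's filter files.

```python
import re
from typing import Optional


def content_type_tag(header_line: str) -> Optional[str]:
    """Derive a tag from a Content-Type header; other headers produce no tag."""
    match = re.match(r"Content-Type:\s*([^;]+)", header_line, re.IGNORECASE)
    return match.group(1).strip() if match else None


def tag_section_applies(tag_pattern: str, tags: list[str]) -> bool:
    """Check whether a section with the given TAG: pattern matches any tag."""
    return any(re.search(tag_pattern, tag) for tag in tags)
```

A response carrying `Content-Type: image/png` would get the tag `image/png`, so a section with the pattern `TAG:^image/` applies to it, while a `text/html` response is left alone.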
Allow only temporary "session" + cookies (for the current browser session only).
+Deletes the "expires" field from + "Set-Cookie:" server headers. Most + browsers will not store such cookies permanently and forget + them in between sessions.
+Boolean.
+N/A
+This is less strict than crunch-incoming-cookies + / crunch-outgoing-cookies + and allows you to browse websites that insist or rely on + setting cookies, without compromising your privacy too + badly.
+Most browsers will not permanently store cookies that have + been processed by session-cookies-only + and will forget about them between sessions. This makes + profiling cookies useless, but won't break sites which require + cookies so that you can log in for transactions. This is + generally turned on for all sites, and is the recommended + setting.
+It makes no sense + at all to use session-cookies-only together with crunch-incoming-cookies + or crunch-outgoing-cookies. + If you do, cookies will be plainly killed.
+Note that it is up to the browser how it handles such + cookies without an "expires" field. + If you use an exotic browser, you might want to try it out to + be sure.
+This setting also has no effect on cookies that may have + been stored previously by the browser before starting + Privoxy. These would have to + be removed manually.
+Privoxy also uses the content-cookies filter to block some types of cookies. Content cookies are not affected by session-cookies-only.
+
+ +session-cookies-only+ |
+
Choose the replacement for blocked images
+This action alone doesn't do anything noticeable. If + both + block and handle-as-image + also + apply, i.e. if the request is to be blocked as an image, + then the + parameter of this action decides what will be sent as a + replacement.
+Parameterized.
+"pattern" to send a built-in + checkerboard pattern image. The image is visually decent, + scales very well, and makes it obvious where banners were + busted.
+"blank" to send a built-in + transparent image. This makes banners disappear completely, + but makes it hard to detect where Privoxy has blocked images on a given + page and complicates troubleshooting if Privoxy has blocked innocent images, + like navigation icons.
+"target-url" to send a + redirect to target-url. + You can redirect to any image anywhere, even in your local + filesystem via "file:///" URL. + (But note that not all browsers support redirecting to a + local file system).
+A good application of redirects is to use special + Privoxy-built-in URLs, + which send the built-in images, as target-url. This has the same + visual effect as specifying "blank" or "pattern" in the first place, but enables + your browser to cache the replacement image, instead of + requesting it over and over again.
+The URLs for the built-in images are "http://config.privoxy.org/send-banner?type=type", where type is either "blank" or "pattern".
+There is a third (advanced) type, called "auto". It is NOT to be used in set-image-blocker, but meant for use from + filters. Auto will select the + type of image that would have applied to the referring page, + had it been an image.
+Built-in pattern:
+
+ +set-image-blocker{pattern}+ |
+
Redirect to the BSD daemon:
+
+ + +set-image-blocker{http://www.freebsd.org/gifs/dae_up3.gif}+ |
+
Redirect to the built-in pattern for better caching:
+
+ + +set-image-blocker{http://config.privoxy.org/send-banner?type=pattern}+ |
+
Note that many of these actions have the potential to cause a page + to misbehave, possibly even not to display at all. There are many + ways a site designer may choose to design his site, and what HTTP + header content, and other criteria, he may depend on. There is no way + to have hard and fast rules for all sites. See the Appendix for a brief example on + troubleshooting actions.
+Custom "actions", known to Privoxy as "aliases", + can be defined by combining other actions. These can in turn be invoked + just like the built-in actions. Currently, an alias name can contain + any character except space, tab, "=", + "{" and "}", but + we strongly + recommend that you only use "a" + to "z", "0" to + "9", "+", and + "-". Alias names are not case sensitive, and + are not required to start with a "+" or + "-" sign, since they are merely textually + expanded.
+Aliases can be used throughout the actions file, but they + must be defined in a special + section at the top of the file! And there can only be one + such section per actions file. Each actions file may have its own alias + section, and the aliases defined in it are only visible within that + file.
+There are two main reasons to use aliases: One is to save typing for + frequently used combinations of actions, the other one is a gain in + flexibility: If you decide once how you want to handle shops by + defining an alias called "shop", you can + later change your policy on shops in one place, and your changes will take effect + everywhere in the actions file where the "shop" alias is used. Calling aliases by their purpose + also makes your actions files more readable.
+Currently, there is one big drawback to using aliases, though: + Privoxy's built-in web-based action + file editor honors aliases when reading the actions files, but it + expands them before writing. So the effects of your aliases are of + course preserved, but the aliases themselves are lost when you edit + sections that use aliases with it.
+Now let's define some aliases...
+
+ # Useful custom aliases we can use later.
+ #
+ # Note the (required!) section header line and that this section
+ # must be at the top of the actions file!
+ #
+ {{alias}}
+
+ # These aliases just save typing later:
+ # (Note that some already use other aliases!)
+ #
+ +crunch-all-cookies = +crunch-incoming-cookies +crunch-outgoing-cookies
+ -crunch-all-cookies = -crunch-incoming-cookies -crunch-outgoing-cookies
+ +block-as-image = +block{Blocked image.} +handle-as-image
+ allow-all-cookies = -crunch-all-cookies -session-cookies-only -filter{content-cookies}
+
+ # These aliases define combinations of actions
+ # that are useful for certain types of sites:
+ #
+ fragile = -block -filter -crunch-all-cookies -fast-redirects -hide-referrer -prevent-compression
+
+ shop = -crunch-all-cookies -filter{all-popups}
+
+ # Short names for other aliases, for really lazy people ;-)
+ #
+ c0 = +crunch-all-cookies
+ c1 = -crunch-all-cookies
+
...and put them to use. These sections would appear in the lower + part of an actions file and define exceptions to the default actions + (as specified further up for the "/" + pattern):
+
+ # These sites are either very complex or very keen on
+ # user data and require minimal interference to work:
+ #
+ {fragile}
+ .office.microsoft.com
+ .windowsupdate.microsoft.com
+ # Gmail is really mail.google.com, not gmail.com
+ mail.google.com
+
+ # Shopping sites:
+ # Allow cookies (for setting and retrieving your customer data)
+ #
+ {shop}
+ .quietpc.com
+ .worldpay.com # for quietpc.com
+ mybank.example.com
+
+ # These shops require pop-ups:
+ #
+ {-filter{all-popups} -filter{unsolicited-popups}}
+ .dabs.com
+ .overclockers.co.uk
+
Aliases like "shop" and "fragile" are typically used for "problem" sites that require more than one action to be + disabled in order to function properly.
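To make the idea of textual expansion concrete, here is a minimal Python sketch. It is a simplified model only, not Privoxy's implementation: the function name expand and the ALIASES table are hypothetical, tokens are split on whitespace (so parameters that contain spaces, such as +block{Blocked image.}, are not handled), and a self-referencing alias would recurse forever.

```python
# Hypothetical alias table, copied from the alias section shown above.
ALIASES = {
    "+crunch-all-cookies": "+crunch-incoming-cookies +crunch-outgoing-cookies",
    "-crunch-all-cookies": "-crunch-incoming-cookies -crunch-outgoing-cookies",
    "shop": "-crunch-all-cookies -filter{all-popups}",
}


def expand(action_line, aliases=ALIASES):
    """Replace alias names by their definitions until none are left."""
    expanded = []
    for token in action_line.split():
        key = token.lower()  # alias names are not case sensitive
        if key in aliases:
            # An alias may itself use other aliases, so expand recursively.
            expanded.extend(expand(aliases[key], aliases))
        else:
            expanded.append(token)
    return expanded


print(expand("shop"))
```

In this model, changing the "shop" entry in one place changes the result for every section that uses the alias, which is the flexibility gain described above.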
+The above chapters have shown which + actions files there are and how they are organized, how actions are + specified and applied to URLs, how patterns work, and how to define + and use aliases. Now, let's + look at an example match-all.action, + default.action and user.action file and see how all these pieces come + together:
+Remember all actions + are disabled when matching starts, so we have to + explicitly enable the ones we want.
+While the match-all.action file only + contains a single section, it is probably the most important one. It + has only one pattern, "/", but this pattern matches all URLs. Therefore, the + set of actions used in this "default" + section will be applied to + all requests as a start. It can be partly or wholly + overridden by other actions files like default.action and user.action, but it will still be largely responsible + for your overall browsing experience.
+Again, at the start of matching, all actions are disabled, so + there is no need to disable any actions here. (Remember: a + "+" preceding the action name enables the + action, a "-" disables!). Also note how + this long line has been made more readable by splitting it into + multiple lines with line continuation.
+
+{ \
+ +change-x-forwarded-for{block} \
+ +hide-from-header{block} \
+ +set-image-blocker{pattern} \
+}
+/ # Match all URLs
+
The default behavior is now set.
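The line continuation used in the match-all.action example can be pictured with a short Python sketch. This is an illustration under assumptions, not Privoxy's parser: the function name join_continued_lines is hypothetical, and the sketch simply drops a trailing backslash and glues the next line on, without handling comments or other details.

```python
def join_continued_lines(text):
    """Merge physical lines that end in a backslash into logical lines."""
    logical = []
    pending = ""
    for raw in text.splitlines():
        line = raw.rstrip()
        if line.endswith("\\"):
            pending += line[:-1]  # drop the backslash, keep collecting
        else:
            logical.append(pending + line)
            pending = ""
    if pending:  # file ended in the middle of a continuation
        logical.append(pending)
    return logical


# The section from the example above, one physical line per list entry.
SAMPLE = "\n".join([
    "{ \\",
    " +change-x-forwarded-for{block} \\",
    " +hide-from-header{block} \\",
    " +set-image-blocker{pattern} \\",
    "}",
    "/ # Match all URLs",
])

for logical_line in join_continued_lines(SAMPLE):
    print(logical_line.split())
```

The six physical lines collapse into two logical ones: the section header with its three actions, and the "/" pattern.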
+If you aren't a developer, there's no need for you to edit the + default.action file. It is maintained by + the Privoxy developers and if you + disagree with some of the sections, you should overrule them in your + user.action.
+Understanding the default.action file + can help you with your user.action, + though.
+The first section in this file is a special section for internal + use that prevents older Privoxy + versions from reading the file:
+
+##########################################################################
+# Settings -- Don't change! For internal Privoxy use ONLY.
+##########################################################################
+{{settings}}
+for-privoxy-version=3.0.11
+
After that comes the (optional) alias section. We'll use the + example section from the above chapter on aliases, that also + explains why and how aliases are used:
+
+##########################################################################
+# Aliases
+##########################################################################
+{{alias}}
+
+ # These aliases just save typing later:
+ # (Note that some already use other aliases!)
+ #
+ +crunch-all-cookies = +crunch-incoming-cookies +crunch-outgoing-cookies
+ -crunch-all-cookies = -crunch-incoming-cookies -crunch-outgoing-cookies
+ +block-as-image = +block{Blocked image.} +handle-as-image
+ mercy-for-cookies = -crunch-all-cookies -session-cookies-only -filter{content-cookies}
+
+ # These aliases define combinations of actions
+ # that are useful for certain types of sites:
+ #
+ fragile = -block -filter -crunch-all-cookies -fast-redirects -hide-referrer
+ shop = -crunch-all-cookies -filter{all-popups}
+
The first of our specialized sections is concerned with + "fragile" sites, i.e. sites that require + minimum interference, because they are either very complex or very + keen on tracking you (and have mechanisms in place that make them + unusable for people who avoid being tracked). We will use our + pre-defined fragile alias instead of stating + the list of actions explicitly:
+
+##########################################################################
+# Exceptions for sites that'll break under the default action set:
+##########################################################################
+
+# "Fragile" Use a minimum set of actions for these sites (see alias above):
+#
+{ fragile }
+.office.microsoft.com # surprise, surprise!
+.windowsupdate.microsoft.com
+mail.google.com
+
Shopping sites are not as fragile, but they typically require + cookies to log in, and pop-up windows for shopping carts or item + details. Again, we'll use a pre-defined alias:
+
+# Shopping sites:
+#
+{ shop }
+.quietpc.com
+.worldpay.com # for quietpc.com
+.jungle.com
+.scan.co.uk
+
The fast-redirects action, + which may have been enabled in match-all.action, breaks some sites. So disable it + for popular sites where we know it misbehaves:
+
+{ -fast-redirects }
+login.yahoo.com
+edit.*.yahoo.com
+.google.com
+.altavista.com/.*(like|url|link):http
+.altavista.com/trans.*urltext=http
+.nytimes.com
+
It is important that Privoxy + knows which URLs belong to images, so that if they are to be blocked, + a substitute image can be sent, rather than an HTML page. Contacting + the remote site to find out is not an option, since it would destroy + the loading time advantage of banner blocking, and it would feed the + advertisers information about you. We can mark any URL as an image + with the handle-as-image action, + and marking all URLs that end in a known image file extension is a + good start:
+
+##########################################################################
+# Images:
+##########################################################################
+
+# Define which file types will be treated as images, in case they get
+# blocked further down this file:
+#
+{ +handle-as-image }
+/.*\.(gif|jpe?g|png|bmp|ico)$
+
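The path pattern in the section above can be tried out with a small Python sketch. Python's re module is used here only as a stand-in: Privoxy's own regular expression handling may differ in details, the sketch assumes the pattern is anchored at the start of the path, and the function name looks_like_image and the sample paths are hypothetical.

```python
import re

# The path pattern from the { +handle-as-image } section above.
IMAGE_PATH = re.compile(r"/.*\.(gif|jpe?g|png|bmp|ico)$")


def looks_like_image(path):
    """Return True if the path ends in one of the known image extensions."""
    return IMAGE_PATH.match(path) is not None


for path in ("/images/banner.gif", "/pics/photo.jpeg",
             "/index.html", "/adserver/banner.cgi"):
    print(path, looks_like_image(path))
```

A URL whose path ends in a script name, like the last sample, is not recognized by its extension, however image-like the response may be.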
And then there are known banner sources. They often use scripts to + generate the banners, so it won't be visible from the URL that the + request is for an image. Hence we block them and mark them as images in + one go, with the help of our +block-as-image + alias defined above. (We could of course just as well use +block +handle-as-image here.) + Remember that the type of the replacement image is chosen by the + set-image-blocker + action. Since all URLs have matched the default section with its + +set-image-blocker{pattern} + action before, it still applies and needn't be repeated:
+
+# Known ad generators:
+#
+{ +block-as-image }
+ar.atwola.com
+.ad.doubleclick.net
+.ad.*.doubleclick.net
+.a.yimg.com/(?:(?!/i/).)*$
+.a[0-9].yimg.com/(?:(?!/i/).)*$
+bs*.gsanet.com
+.qkimg.net
+
One of the most important jobs of Privoxy is to block banners. Many of these can + be "blocked" by the filter{banners-by-size} action + (if it has been enabled), which deletes the references to banner + images from the pages while they are loaded, so the browser doesn't + request them anymore, and hence they don't need to be blocked here. + But this naturally doesn't catch all banners, and some people choose + not to use filters, so we need a comprehensive list of patterns for + banner URLs here, and apply the block action to them.
+First come many generic patterns, which do most of the work, by + matching typical domain and path name components of banners. Then + comes a list of individual patterns for specific sites, which is + omitted here to keep the example short:
+
+##########################################################################
+# Block these fine banners:
+##########################################################################
+{ +block{Banner ads.} }
+
+# Generic patterns:
+#
+ad*.
+.*ads.
+banner?.
+count*.
+/.*count(er)?\.(pl|cgi|exe|dll|asp|php[34]?)
+/(?:.*/)?(publicite|werbung|rekla(ma|me|am)|annonse|maino(kset|nta|s)?)/
+
+# Site-specific patterns (abbreviated):
+#
+.hitbox.com
+
It's quite remarkable how many advertisers actually call their + banner servers ads.company.com, + or call the directory in which the banners are stored literally + "banners". So the above generic patterns + are surprisingly effective.
+But being very generic, they necessarily also catch URLs that we + don't want to block. The pattern .*ads. e.g. + catches "nasty-ads.nasty-corp.com" as intended, but + also "downloads.sourceforge.net" or "adsl.some-provider.net". So here come + some well-known exceptions to the +block section above.
+Note that these are exceptions to exceptions from the default! + Consider the URL "downloads.sourceforge.net": Initially, all actions + are deactivated, so it wouldn't get blocked. Then comes the defaults + section, which matches the URL, but just deactivates the block action + once again. Then it matches .*ads., an + exception to the general non-blocking policy, and suddenly +block applies. + And now, it'll match .*loads., where + -block + applies, so (unless it matches again further down) it ends up with no + block + action applying.
+
+##########################################################################
+# Save some innocent victims of the above generic block patterns:
+##########################################################################
+
+# By domain:
+#
+{ -block }
+adv[io]*. # (for advogato.org and advice.*)
+adsl. # (has nothing to do with ads)
+adobe. # (has nothing to do with ads either)
+ad[ud]*. # (adult.* and add.*)
+.edu # (universities don't host banners (yet!))
+.*loads. # (downloads, uploads etc)
+
+# By path:
+#
+/.*loads/
+
+# Site-specific:
+#
+www.globalintersec.com/adv # (adv = advanced)
+www.ugu.com/sui/ugu/adv
+
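The "exceptions to exceptions" walk-through above can be modelled in a few lines of Python. This is a deliberately simplified sketch, not Privoxy's matching code: the regular expressions are hand-written stand-ins for the host patterns .*ads. and .*loads. rather than a translation of Privoxy's pattern syntax, and the names SECTIONS, final_actions and is_blocked are hypothetical. What it does show is the ordering rule: sections are applied top to bottom, and the last matching one decides.

```python
import re

# (actions, host regex) in file order. True enables an action, False
# disables it. The regexes only approximate the Privoxy patterns named
# in the comments.
SECTIONS = [
    ({"block": False}, re.compile(r"")),                    # "/" matches everything
    ({"block": True}, re.compile(r"(^|\.)[^.]*ads\.")),     # stands in for .*ads.
    ({"block": False}, re.compile(r"(^|\.)[^.]*loads\.")),  # stands in for .*loads.
]


def final_actions(host):
    """Apply every matching section in order; later sections override."""
    state = {}  # all actions start out disabled
    for actions, pattern in SECTIONS:
        if pattern.search(host):
            state.update(actions)
    return state


def is_blocked(host):
    return final_actions(host).get("block", False)


for host in ("nasty-ads.nasty-corp.com", "downloads.sourceforge.net"):
    print(host, is_blocked(host))
```

The first host matches only the stand-in for .*ads. and stays blocked; the second also matches the stand-in for .*loads. further down, which switches blocking off again.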
Filtering source code can have nasty side effects, so make an + exception for our friends at sourceforge.net, and all paths with + "cvs" in them. Note that -filter + disables all + filters in one fell swoop!
+
+# Don't filter code!
+#
+{ -filter }
+/(.*/)?cvs
+bugzilla.
+developer.
+wiki.
+.sourceforge.net
+
The actual default.action is of course + much more comprehensive, but we hope this example made clear how it + works.
+So far we are painting with a broad brush by setting general + policies, which would be a reasonable starting point for many people. + Now, you might want to be more specific and have customized rules + that are more suitable to your personal habits and preferences. These + would be for narrowly defined situations like your ISP or your bank, + and should be placed in user.action, which + is parsed after all other actions files and hence has the last word, + overriding any previously defined actions. user.action is also a safe place for your + personal settings, since default.action is + actively maintained by the Privoxy + developers and you'll probably want to install updated versions from + time to time.
+So let's look at a few examples of things that one might typically + do in user.action:
+
+# My user.action file. <fred@example.com>
+
As aliases are local to + the actions file that they are defined in, you can't use the ones + from default.action, unless you repeat them + here:
+
+# Aliases are local to the file they are defined in.
+# (Re-)define aliases for this file:
+#
+{{alias}}
+#
+# These aliases just save typing later, and the alias names should
+# be self explanatory.
+#
++crunch-all-cookies = +crunch-incoming-cookies +crunch-outgoing-cookies
+-crunch-all-cookies = -crunch-incoming-cookies -crunch-outgoing-cookies
+ allow-all-cookies = -crunch-all-cookies -session-cookies-only
+ allow-popups = -filter{all-popups}
++block-as-image = +block{Blocked as image.} +handle-as-image
+-block-as-image = -block
+
+# These aliases define combinations of actions that are useful for
+# certain types of sites:
+#
+fragile = -block -crunch-all-cookies -filter -fast-redirects -hide-referrer
+shop = -crunch-all-cookies allow-popups
+
+# Allow ads for selected useful free sites:
+#
+allow-ads = -block -filter{banners-by-size} -filter{banners-by-link}
+
+# Alias for specific file types that are text, but might have conflicting
+# MIME types. We want the browser to force these to be text documents.
+handle-as-text = -filter +-content-type-overwrite{text/plain} +-force-text-mode -hide-content-disposition
+
Say you have accounts on some sites that you visit regularly, and + you don't want to have to log in manually each time. So you'd like to + allow persistent cookies for these sites. The allow-all-cookies alias defined above does exactly + that, i.e. it disables crunching of cookies in any direction, and the + processing of cookies to make them only temporary.
+
+ { allow-all-cookies }
+ sourceforge.net
+ .yahoo.com
+ .msdn.microsoft.com
+ .redhat.com
+
Your bank is allergic to some filter, but you don't know which, so + you disable them all:
+
+ { -filter }
+ .your-home-banking-site.com
+
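Some file types you may not want to filter for various + reasons. The patterns below have no section header of their own, so + they extend the { -filter } section above: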
+
+# Technical documentation is likely to contain strings that might
+# erroneously get altered by the JavaScript-oriented filters:
+#
+.tldp.org
+/(.*/)?selfhtml/
+
+# And this stupid host sends streaming video with a wrong MIME type,
+# so that Privoxy thinks it is getting HTML and starts filtering:
+#
+stupid-server.example.com/
+
Example of a simple block + action. Say you've seen an ad on your favourite page on example.com + that you want to get rid of. You have right-clicked the image, + selected "copy image location" and pasted + the URL below while removing the leading http://, into a { +block{} } section. Note that { + +handle-as-image } need not be specified, since all URLs ending + in .gif will be tagged as images by the + general rules as set in default.action anyway:
+
+ { +block{Nasty ads.} }
+ www.example.com/nasty-ads/sponsor\.gif
+ another.example.net/more/junk/here/
+
The URLs of dynamically generated banners, especially from large + banner farms, often don't use the well-known image file name + extensions, which makes it impossible for Privoxy to guess the file type just by looking + at the URL. You can use the +block-as-image + alias defined above for these cases. Note that objects which match + this rule but then turn out NOT to be an image are typically rendered + as a "broken image" icon by the browser. + Use cautiously.
+
+ { +block-as-image }
+ .doubleclick.net
+ .fastclick.net
+ /Realmedia/ads/
+ ar.atwola.com/
+
Now you noticed that the default configuration breaks Forbes + Magazine, but you were too lazy to find out which action is the + culprit, and you were again too lazy to give feedback, so you just used the fragile alias on the site, and -- whoa! -- it worked. The + fragile alias disables those actions that + are most likely to break a site. It is also good for testing purposes, to + see whether it is Privoxy that is causing + the problem or not. We later find other regular sites that misbehave, + and add those to our personalized list of troublemakers:
+
+ { fragile }
+ .forbes.com
+ webmail.example.com
+ .mybank.com
+
You like the "fun" text replacements in + default.filter, but it is disabled in the + distributed actions file. So you'd like to turn it on in your + private, update-safe config, once and for all:
+
+ { +filter{fun} }
+ / # For ALL sites!
+
Note that the above is not really a good idea: There are + exceptions to the filters in default.action + for things that really shouldn't be filtered, like code on + CVS->Web interfaces. Since user.action + has the last word, these exceptions won't be valid for the + "fun" filtering specified here.
+You might also worry about how your favourite free websites are + funded, and find that they rely on displaying banner advertisements + to survive. So you might want to specifically allow banners for those + sites that you feel provide value to you:
+
+ { allow-ads }
+ .sourceforge.net
+ .slashdot.org
+ .osdn.net
+
Note that allow-ads has been aliased to + -block, -filter{banners-by-size}, + and -filter{banners-by-link} + above.
+Invoke another alias here to force an override of the MIME type + application/x-sh which typically would open + a download dialog. In my case, I want to look at the shell + script, and then I can save it should I choose to.
+
+ { handle-as-text }
+ /.*\.sh$
+
user.action is generally the best place + to define exceptions and additions to the default policies of + default.action. Some actions are safe to + have their default policies set here though. So let's set a default + policy to have a "blank" image as opposed + to the checkerboard pattern for ALL sites. "/" of + course matches all URL paths and patterns:
+
+{ +set-image-blocker{blank} }
+/ # ALL sites
+