+8.6. Aliases
+
+Custom "actions", known to Privoxy as "aliases", can be defined by combining
+other actions. These can in turn be invoked just like the built-in actions.
+Currently, an alias name can contain any character except space, tab, "=", "{"
+and "}", but we strongly recommend that you only use "a" to "z", "0" to "9",
+"+", and "-". Alias names are not case sensitive, and are not required to start
+with a "+" or "-" sign, since they are merely textually expanded.
+
+Aliases can be used throughout the actions file, but they must be defined in a
+special section at the top of the file! And there can only be one such section
+per actions file. Each actions file may have its own alias section, and the
+aliases defined in it are only visible within that file.
+
+There are two main reasons to use aliases: One is to save typing for frequently
+used combinations of actions, the other one is a gain in flexibility: If you
+decide once how you want to handle shops by defining an alias called "shop",
+you can later change your policy on shops in one place, and your changes will
+take effect everywhere in the actions file where the "shop" alias is used.
+Calling aliases by their purpose also makes your actions files more readable.
+
+Currently, there is one big drawback to using aliases, though: Privoxy's
+built-in web-based action file editor honors aliases when reading the actions
+files, but it expands them before writing. So the effects of your aliases are
+of course preserved, but the aliases themselves are lost when you edit sections
+that use aliases with it. This is likely to change in future versions of
+Privoxy.
+
+Now let's define some aliases...
+
+ # Useful custom aliases we can use later.
+ #
+ # Note the (required!) section header line and that this section
+ # must be at the top of the actions file!
+ #
+ {{alias}}
+
+ # These aliases just save typing later:
+ # (Note that some already use other aliases!)
+ #
+ +crunch-all-cookies = +crunch-incoming-cookies +crunch-outgoing-cookies
+ -crunch-all-cookies = -crunch-incoming-cookies -crunch-outgoing-cookies
+ block-as-image = +block +handle-as-image
+ mercy-for-cookies = -crunch-all-cookies -session-cookies-only -filter{content-cookies}
+
+ # These aliases define combinations of actions
+ # that are useful for certain types of sites:
+ #
+ fragile = -block -filter -crunch-all-cookies -fast-redirects -hide-referrer -kill-popups
+ shop = -crunch-all-cookies -filter{all-popups} -kill-popups
+
+ # Short names for other aliases, for really lazy people ;-)
+ #
+ c0 = +crunch-all-cookies
+ c1 = -crunch-all-cookies
+
+
+...and put them to use. These sections would appear in the lower part of an
+actions file and define exceptions to the default actions (as specified further
+up for the "/" pattern):
+
+ # These sites are either very complex or very keen on
+ # user data and require minimal interference to work:
+ #
+ {fragile}
+ .office.microsoft.com
+ .windowsupdate.microsoft.com
+ .nytimes.com
+
+ # Shopping sites:
+ # Allow cookies (for setting and retrieving your customer data)
+ #
+ {shop}
+ .quietpc.com
+ .worldpay.com # for quietpc.com
+ .scan.co.uk
+
+ # These shops require pop-ups:
+ #
+ {shop -kill-popups -filter{all-popups}}
+ .dabs.com
+ .overclockers.co.uk
+
+
+Aliases like "shop" and "fragile" are often used for "problem" sites that
+require some actions to be disabled in order to function properly.
+
+-------------------------------------------------------------------------------
+
+8.7. Actions Files Tutorial
+
+The above chapters have shown which actions files there are and how they are
+organized, how actions are specified and applied to URLs, how patterns work,
+and how to define and use aliases. Now, let's look at an example default.action
+and user.action file and see how all these pieces come together:
+
+-------------------------------------------------------------------------------
+
+8.7.1. default.action
+
+Every config file should start with a short comment stating its purpose:
+
+# Sample default.action file <developers@privoxy.org>
+
+
+Then, since this is the default.action file, the first section is a special
+section for internal use that you needn't change or worry about:
+
+##########################################################################
+# Settings -- Don't change! For internal Privoxy use ONLY.
+##########################################################################
+
+{{settings}}
+for-privoxy-version=3.0
+
+
+After that comes the (optional) alias section. We'll use the example section
+from the above chapter on aliases, that also explains why and how aliases are
+used:
+
+##########################################################################
+# Aliases
+##########################################################################
+{{alias}}
+
+ # These aliases just save typing later:
+ # (Note that some already use other aliases!)
+ #
+ +crunch-all-cookies = +crunch-incoming-cookies +crunch-outgoing-cookies
+ -crunch-all-cookies = -crunch-incoming-cookies -crunch-outgoing-cookies
+ block-as-image = +block +handle-as-image
+ mercy-for-cookies = -crunch-all-cookies -session-cookies-only -filter{content-cookies}
+
+ # These aliases define combinations of actions
+ # that are useful for certain types of sites:
+ #
+ fragile = -block -filter -crunch-all-cookies -fast-redirects -hide-referrer -kill-popups
+ shop = -crunch-all-cookies -filter{all-popups} -kill-popups
+
+
+Now come the regular sections, i.e. sets of actions, accompanied by URL
+patterns to which they apply. Remember all actions are disabled when matching
+starts, so we have to explicitly enable the ones we want.
+
+The first regular section is probably the most important. It has only one
+pattern, "/", but this pattern matches all URLs. Therefore, the set of actions
+used in this "default" section will be applied to all requests as a start. It
+can be partly or wholly overridden by later matches further down this file, or
+in user.action, but it will still be largely responsible for your overall
+browsing experience.
+
+Again, at the start of matching, all actions are disabled, so there is no real
+need to disable any actions here, but we will do that nonetheless, to have a
+complete listing for your reference. (Remember: a "+" preceding the action name
+enables the action, a "-" disables!). Also note how this long line has been
+made more readable by splitting it into multiple lines with line continuation.
+
+##########################################################################
+# "Defaults" section:
+##########################################################################
+ { \
+ -add-header \
+ -block \
+ -crunch-incoming-cookies \
+ -crunch-outgoing-cookies \
+ +deanimate-gifs \
+ -downgrade-http-version \
+ +fast-redirects \
+ +filter{js-annoyances} \
+ -filter{js-events} \
+ +filter{html-annoyances} \
+ -filter{content-cookies} \
+ +filter{refresh-tags} \
+ +filter{unsolicited-popups} \
+ -filter{all-popups} \
+ +filter{img-reorder} \
+ +filter{banners-by-size} \
+ -filter{banners-by-link} \
+ +filter{webbugs} \
+ -filter{tiny-textforms} \
+ +filter{jumping-windows} \
+ -filter{frameset-borders} \
+ -filter{demoronizer} \
+ -filter{shockwave-flash} \
+ -filter{quicktime-kioskmode} \
+ -filter{fun} \
+ -filter{crude-parental} \
+ +filter{ie-exploits} \
+ -handle-as-image \
+ +hide-forwarded-for-headers \
+ +hide-from-header{block} \
+ +hide-referrer{forge} \
+ -hide-user-agent \
+ -kill-popups \
+ -limit-connect \
+ +prevent-compression \
+ -send-vanilla-wafer \
+ -send-wafer \
+ +session-cookies-only \
+ +set-image-blocker{pattern} \
+ }
+ / # forward slash will match *all* potential URL patterns.
+
+
+The default behavior is now set. Note that some actions, like not hiding the
+user agent, are part of a "general policy" that applies universally and won't
+get any exceptions defined later. Other choices, like not blocking (which is
+understandably the default!) need exceptions, i.e. we need to specify
+explicitly what we want to block in later sections.
+
+The first of our specialized sections is concerned with "fragile" sites, i.e.
+sites that require minimum interference, because they are either very complex
+or very keen on tracking you (and have mechanisms in place that make them
+unusable for people who avoid being tracked). We will simply use our
+pre-defined fragile alias instead of stating the list of actions explicitly:
+
+##########################################################################
+# Exceptions for sites that'll break under the default action set:
+##########################################################################
+
+# "Fragile" Use a minimum set of actions for these sites (see alias above):
+#
+{ fragile }
+.office.microsoft.com # surprise, surprise!
+.windowsupdate.microsoft.com
+
+
+Shopping sites are not as fragile, but they typically require cookies to log
+in, and pop-up windows for shopping carts or item details. Again, we'll use a
+pre-defined alias:
+
+# Shopping sites:
+#
+{ shop }
+.quietpc.com
+.worldpay.com # for quietpc.com
+.jungle.com
+.scan.co.uk
+
+
+The fast-redirects action, which we enabled per default above, breaks some
+sites. So disable it for popular sites where we know it misbehaves:
+
+{ -fast-redirects }
+login.yahoo.com
+edit.*.yahoo.com
+.google.com
+.altavista.com/.*(like|url|link):http
+.altavista.com/trans.*urltext=http
+.nytimes.com
+
+
+It is important that Privoxy knows which URLs belong to images, so that if they
+are to be blocked, a substitute image can be sent, rather than an HTML page.
+Contacting the remote site to find out is not an option, since it would destroy
+the loading time advantage of banner blocking, and it would feed the
+advertisers (in terms of money and information). We can mark any URL as an
+image with the handle-as-image action, and marking all URLs that end in a known
+image file extension is a good start:
+
+##########################################################################
+# Images:
+##########################################################################
+
+# Define which file types will be treated as images, in case they get
+# blocked further down this file:
+#
+{ +handle-as-image }
+/.*\.(gif|jpe?g|png|bmp|ico)$
+
+
+And then there are known banner sources. They often use scripts to generate the
+banners, so it won't be visible from the URL that the request is for an image.
+Hence we block them and mark them as images in one go, with the help of our
+block-as-image alias defined above. (We could of course just as well use +block
++handle-as-image here.) Remember that the type of the replacement image is
+chosen by the set-image-blocker action. Since all URLs have matched the default
+section with its +set-image-blocker{pattern} action before, it still applies
+and needn't be repeated:
+
+# Known ad generators:
+#
+{ block-as-image }
+ar.atwola.com
+.ad.doubleclick.net
+.ad.*.doubleclick.net
+.a.yimg.com/(?:(?!/i/).)*$
+.a[0-9].yimg.com/(?:(?!/i/).)*$
+bs*.gsanet.com
+bs*.einets.com
+.qkimg.net
+
+
+One of the most important jobs of Privoxy is to block banners. A huge bunch of
+them are already "blocked" by the filter{banners-by-size} action, which we
+enabled above, and which deletes the references to banner images from the pages
+while they are loaded, so the browser doesn't request them anymore, and hence
+they don't need to be blocked here. But this naturally doesn't catch all
+banners, and some people choose not to use filters, so we need a comprehensive
+list of patterns for banner URLs here, and apply the block action to them.
+
+First comes a bunch of generic patterns, which do most of the work, by matching
+typical domain and path name components of banners. Then comes a list of
+individual patterns for specific sites, which is omitted here to keep the
+example short:
+
+##########################################################################
+# Block these fine banners:
+##########################################################################
+{ +block }
+
+# Generic patterns:
+#
+ad*.
+.*ads.
+banner?.
+count*.
+/.*count(er)?\.(pl|cgi|exe|dll|asp|php[34]?)
+/(?:.*/)?(publicite|werbung|rekla(ma|me|am)|annonse|maino(kset|nta|s)?)/
+
+# Site-specific patterns (abbreviated):
+#
+.hitbox.com
+
+
+You wouldn't believe how many advertisers actually call their banner servers
+ads.company.com, or call the directory in which the banners are stored simply
+"banners". So the above generic patterns are surprisingly effective.
+
+But being very generic, they necessarily also catch URLs that we don't want to
+block. The pattern .*ads. e.g. catches "nasty-ads.nasty-corp.com" as intended,
+but also "downloads.sourcefroge.net" or "adsl.some-provider.net." So here come
+some well-known exceptions to the +block section above.
+
+Note that these are exceptions to exceptions from the default! Consider the URL
+"downloads.sourcefroge.net": Initially, all actions are deactivated, so it
+wouldn't get blocked. Then comes the defaults section, which matches the URL,
+but just deactivates the block action once again. Then it matches .*ads., an
+exception to the general non-blocking policy, and suddenly +block applies. And
+now, it'll match .*loads., where -block applies, so (unless it matches again
+further down) it ends up with no block action applying.
+
+##########################################################################
+# Save some innocent victims of the above generic block patterns:
+##########################################################################
+
+# By domain:
+#
+{ -block }
+adv[io]*. # (for advogato.org and advice.*)
+adsl. # (has nothing to do with ads)
+ad[ud]*. # (adult.* and add.*)
+.edu # (universities don't host banners (yet!))
+.*loads. # (downloads, uploads etc)
+
+# By path:
+#
+/.*loads/
+
+# Site-specific:
+#
+www.globalintersec.com/adv # (adv = advanced)
+www.ugu.com/sui/ugu/adv
+
+
+Filtering source code can have nasty side effects, so make an exception for our
+friends at sourceforge.net, and all paths with "cvs" in them. Note that -filter
+disables all filters in one fell swoop!
+
+# Don't filter code!
+#
+{ -filter }
+/.*cvs
+.sourceforge.net
+
+
+The actual default.action is of course more comprehensive, but we hope this
+example made clear how it works.
+
+-------------------------------------------------------------------------------
+
+8.7.2. user.action
+
+So far we are painting with a broad brush by setting general policies, which
+would be a reasonable starting point for many people. Now, you might want to be
+more specific and have customized rules that are more suitable to your personal
+habits and preferences. These would be for narrowly defined situations like
+your ISP or your bank, and should be placed in user.action, which is parsed
+after all other actions files and hence has the last word, over-riding any
+previously defined actions. user.action is also a safe place for your personal
+settings, since default.action is actively maintained by the Privoxy developers
+and you'll probably want to install updated versions from time to time.
+
+So let's look at a few examples of things that one might typically do in
+user.action:
+
+# My user.action file. <fred@foobar.com>
+
+
+As aliases are local to the actions file that they are defined in, you can't
+use the ones from default.action, unless you repeat them here:
+
+# Aliases are local to the file they are defined in.
+# (Re-)define aliases for this file:
+#
+{{alias}}
+#
+# These aliases just save typing later, and the alias names should
+# be self explanatory.
+#
++crunch-all-cookies = +crunch-incoming-cookies +crunch-outgoing-cookies
+-crunch-all-cookies = -crunch-incoming-cookies -crunch-outgoing-cookies
+ allow-all-cookies = -crunch-all-cookies -session-cookies-only
+ allow-popups = -filter{all-popups} -kill-popups
++block-as-image = +block +handle-as-image
+-block-as-image = -block
+
+# These aliases define combinations of actions that are useful for
+# certain types of sites:
+#
+fragile = -block -crunch-all-cookies -filter -fast-redirects -hide-referrer -kill-popups
+shop = -crunch-all-cookies allow-popups
+
+# Allow ads for selected useful free sites:
+#
+allow-ads = -block -filter{banners-by-size} -filter{banners-by-link}
+
+
+Say you have accounts on some sites that you visit regularly, and you don't
+want to have to log in manually each time. So you'd like to allow persistent
+cookies for these sites. The allow-all-cookies alias defined above does exactly
+that, i.e. it disables crunching of cookies in any direction, and the
+processing of cookies to make them only temporary.
+
+{ allow-all-cookies }
+sourceforge.net
+sunsolve.sun.com
+.slashdot.org
+.yahoo.com
+.msdn.microsoft.com
+.redhat.com
+
+
+Your bank is allergic to some filter, but you don't know which, so you disable
+them all:
+
+{ -filter }
+.your-home-banking-site.com
+
+
+Some file types you may not want to filter for various reasons:
+
+# Technical documentation is likely to contain strings that might
+# erroneously get altered by the JavaScript-oriented filters:
+#
+.tldp.org
+/(.*/)?selfhtml/
+
+# And this stupid host sends streaming video with a wrong MIME type,
+# so that Privoxy thinks it is getting HTML and starts filtering:
+#
+stupid-server.example.com/
+
+
+Example of a simple block action. Say you've seen an ad on your favourite page
+on example.com that you want to get rid of. You have right-clicked the image,
+selected "copy image location" and pasted the URL below while removing the
+leading http://, into a { +block } section. Note that { +handle-as-image } need
+not be specified, since all URLs ending in .gif will be tagged as images by the
+general rules as set in default.action anyway:
+
+{ +block }
+www.example.com/nasty-ads/sponsor.gif
+another.popular.site.net/more/junk/here/
+
+
+The URLs of dynamically generated banners, especially from large banner farms,
+often don't use the well-known image file name extensions, which makes it
+impossible for Privoxy to guess the file type just by looking at the URL. You
+can use the +block-as-image alias defined above for these cases. Note that
+objects which match this rule but then turn out NOT to be an image are
+typically rendered as a "broken image" icon by the browser. Use cautiously.
+
+{ +block-as-image }
+.doubleclick.net
+/Realmedia/ads/
+ar.atwola.com/
+
+
+Now you noticed that the default configuration breaks Forbes Magazine, but you
+were too lazy to find out which action is the culprit, and you were again too
+lazy to give feedback, so you just used the fragile alias on the site, and --
+whoa! -- it worked. The fragile aliases disables those actions that are most
+likely to break a site. Also, good for testing purposes to see if it is Privoxy
+that is causing the problem or not.
+
+{ fragile }
+.forbes.com
+
+
+You like the "fun" text replacements in default.filter, but it is disabled in
+the distributed actions file. (My colleagues on the team just don't have a
+sense of humour, that's why! ;-). So you'd like to turn it on in your private,
+update-safe config, once and for all:
+
+{ +filter{fun} }
+/ # For ALL sites!
+
+
+Note that the above is not really a good idea: There are exceptions to the
+filters in default.action for things that really shouldn't be filtered, like
+code on CVS->Web interfaces. Since user.action has the last word, these
+exceptions won't be valid for the "fun" filtering specified here.
+
+You might also worry about how your favourite free websites are funded, and
+find that they rely on displaying banner advertisements to survive. So you
+might want to specifically allow banners for those sites that you feel provide
+value to you:
+
+{ allow-ads }
+.sourceforge.net
+.slashdot.org
+.osdn.net
+
+
+Note that allow-ads has been aliased to -block, -filter{banners-by-size}, and -
+filter{banners-by-link} above.
+
+user.action is generally the best place to define exceptions and additions to
+the default policies of default.action. Some actions are safe to have their
+default policies set here though. So let's set a default policy to have a
+"blank" image as opposed to the checkerboard pattern for ALL sites. "/" of
+course matches all URL paths and patterns:
+
+{ +set-image-blocker{blank} }
+/ # ALL sites
+
+
+-------------------------------------------------------------------------------
+
+9. The Filter File
+
+All text substitutions that can be invoked through the filter action must first
+be defined in the filter file, which is typically called default.filter and
+which can be selected through the filterfile config option.
+
+Typical reasons for doing such substitutions are to eliminate common annoyances
+in HTML and JavaScript, such as pop-up windows, exit consoles, crippled windows
+without navigation tools, the infamous <BLINK> tag etc, to suppress images with
+certain width and height attributes (standard banner sizes or web-bugs), or
+just to have fun. The possibilities are endless.
+
+Filtering works on any text-based document type, including HTML, JavaScript,
+CSS etc. (all text/* MIME types, except text/plain). Substitutions are made at
+the source level, so if you want to "roll your own" filters, you should be
+familiar with HTML syntax.
+
+Just like the actions files, the filter file is organized in sections, which
+are called filters here. Each filter consists of a heading line, that starts
+with the keyword FILTER:, followed by the filter's name, and a short (one line)
+description of what it does. Below that line come the jobs, i.e. lines that
+define the actual text substitutions. By convention, the name of a filter
+should describe what the filter eliminates. The comment is used in the
+web-based user interface.
+
+Once a filter called name has been defined in the filter file, it can be
+invoked by using an action of the form +filter{name} in any actions file.
+
+A filter header line for a filter called "foo" could look like this:
+
+FILTER: foo Replace all "foo" with "bar"
+
+
+Below that line, and up to the next header line, come the jobs that define what
+text replacements the filter executes. They are specified in a syntax that
+imitates Perl's s/// operator. If you are familiar with Perl, you will find
+this to be quite intuitive, and may want to look at the PCRS man page for the
+subtle differences to Perl behaviour. Most notably, the non-standard option
+letter U is supported, which turns the default to ungreedy matching.
+
+If you are new to regular expressions, you might want to take a look at the
+Appendix on regular expressions, and see the Perl manual for the s///
+operator's syntax and Perl-style regular expressions in general. The below
+examples might also help to get you started.
+
+-------------------------------------------------------------------------------
+
+9.1. Filter File Tutorial
+
+Now, let's complete our "foo" filter. We have already defined the heading, but
+the jobs are still missing. Since all it does is to replace "foo" with "bar",
+there is only one (trivial) job needed:
+
+s/foo/bar/
+
+
+But wait! Didn't the comment say that all occurrences of "foo" should be
+replaced? Our current job will only take care of the first "foo" on each page.
+For global substitution, we'll need to add the g option:
+
+s/foo/bar/g
+
+
+Our complete filter now looks like this:
+
+FILTER: foo Replace all "foo" with "bar"
+s/foo/bar/g
+
+
+Let's look at some real filters for more interesting examples. Here you see a
+filter that protects against some common annoyances that arise from JavaScript
+abuse. Let's look at its jobs one after the other:
+
+FILTER: js-annoyances Get rid of particularly annoying JavaScript abuse
+
+# Get rid of JavaScript referrer tracking. Test page: http://www.randomoddness.com/untitled.htm
+#
+s|(<script.*)document\.referrer(.*</script>)|$1"Not Your Business!"$2|Usg
+
+
+Following the header line and a comment, you see the job. Note that it uses |
+as the delimiter instead of /, because the pattern contains a forward slash,
+which would otherwise have to be escaped by a backslash (\).
+
+Now, let's examine the pattern: it starts with the text <script.* enclosed in
+parentheses. Since the dot matches any character, and * means: "Match an
+arbitrary number of the element left of myself", this matches "<script",
+followed by any text, i.e. it matches the whole page, from the start of the
+first <script> tag.
+
+That's more than we want, but the pattern continues: document\.referrer matches
+only the exact string "document.referrer". The dot needed to be escaped, i.e.
+preceded by a backslash, to take away its special meaning as a joker, and make
+it just a regular dot. So far, the meaning is: Match from the start of the
+first <script> tag in a the page, up to, and including, the text
+"document.referrer", if both are present in the page (and appear in that
+order).
+
+But there's still more pattern to go. The next element, again enclosed in
+parentheses, is .*</script>. You already know what .* means, so the whole
+pattern translates to: Match from the start of the first <script> tag in a page
+to the end of the last <script> tag, provided that the text "document.referrer"
+appears somewhere in between.
+
+This is still not the whole story, since we have ignored the options and the
+parentheses: The portions of the page matched by sub-patterns that are enclosed
+in parentheses, will be remembered and be available through the variables $1,
+$2, ... in the substitute. The U option switches to ungreedy matching, which
+means that the first .* in the pattern will only "eat up" all text in between "
+<script" and the first occurrence of "document.referrer", and that the second
+.* will only span the text up to the first "</script>" tag. Furthermore, the s
+option says that the match may span multiple lines in the page, and the g
+option again means that the substitution is global.
+
+So, to summarize, the pattern means: Match all scripts that contain the text
+"document.referrer". Remember the parts of the script from (and including) the
+start tag up to (and excluding) the string "document.referrer" as $1, and the
+part following that string, up to and including the closing tag, as $2.
+
+Now the pattern is deciphered, but wasn't this about substituting things? So
+lets look at the substitute: $1"Not Your Business!"$2 is easy to read: The text
+remembered as $1, followed by "Not Your Business!" (including the quotation
+marks!), followed by the text remembered as $2. This produces an exact copy of
+the original string, with the middle part (the "document.referrer") replaced by
+"Not Your Business!".
+
+The whole job now reads: Replace "document.referrer" by "Not Your Business!"
+wherever it appears inside a <script> tag. Note that this job won't break
+JavaScript syntax, since both the original and the replacement are
+syntactically valid string objects. The script just won't have access to the
+referrer information anymore.
+
+We'll show you two other jobs from the JavaScript taming department, but this
+time only point out the constructs of special interest:
+
+# The status bar is for displaying link targets, not pointless blahblah
+#
+s/window\.status\s*=\s*(['"]).*?\1/dUmMy=1/ig
+
+
+\s stands for whitespace characters (space, tab, newline, carriage return, form
+feed), so that \s* means: "zero or more whitespace". The ? in .*? makes this
+matching of arbitrary text ungreedy. (Note that the U option is not set). The
+['"] construct means: "a single or a double quote". Finally, \1 is a
+backreference to the first parenthesis just like $1 above, with the difference
+that in the pattern, a backslash indicates a backreference, whereas in the
+substitute, it's the dollar.