From 6f113c5cca4a173f76c1000a093fc4a8618e3668 Mon Sep 17 00:00:00 2001 From: oes Date: Fri, 17 May 2002 14:08:27 +0000 Subject: [PATCH] generated --- doc/webserver/user-manual/actions-file.html | 4722 +++++++++++------- doc/webserver/user-manual/appendix.html | 234 +- doc/webserver/user-manual/config.html | 128 +- doc/webserver/user-manual/configuration.html | 7 +- doc/webserver/user-manual/contact.html | 164 +- doc/webserver/user-manual/copyright.html | 154 +- doc/webserver/user-manual/filter-file.html | 717 ++- doc/webserver/user-manual/index.html | 158 +- doc/webserver/user-manual/installation.html | 24 +- doc/webserver/user-manual/introduction.html | 5 +- doc/webserver/user-manual/quickstart.html | 113 +- doc/webserver/user-manual/seealso.html | 115 +- doc/webserver/user-manual/startup.html | 11 +- doc/webserver/user-manual/templates.html | 171 +- doc/webserver/user-manual/upgradersnote.html | 10 +- 15 files changed, 4416 insertions(+), 2317 deletions(-) diff --git a/doc/webserver/user-manual/actions-file.html b/doc/webserver/user-manual/actions-file.html index 9998abf4..fb9d76b3 100644 --- a/doc/webserver/user-manual/actions-file.html +++ b/doc/webserver/user-manual/actions-file.html @@ -4,8 +4,7 @@ >Actions Files

8.1. Finding the Right Mix

8.2. How to Edit

8.3. How Actions are Applied to URLs

8.4. Patterns

8.4.1. The Domain Pattern

8.4.2. The Path Pattern

8.5.1. +add-headeradd-header

Type:
Typical use:

Multi-value.

Confuse log analysis, custom applications

Purpose and typical uses:
Effect:

Send a user defined HTTP header to the web server. Can be used to confuse log analysis. +> Sends a user defined HTTP header to the web server.

Possible values:
Type:

Multi-value.

Parameter:

Any value is possible. Validity of the defined HTTP headers is not checked. +> Any string value is possible. Validity of the defined HTTP headers is not checked. It is recommended that you use the "

Example usage:

     {+add-header{X-User-Tracking: sucks}}
-     .example.com

Notes:

Example usage:

+add-header{X-User-Tracking: sucks}
+

8.5.2. +blockblock

Type:
Typical use:

Boolean.

Block ads or other obnoxious content

Purpose and typical uses:
Effect:

Requests for URLs to which this action applies are blocked, i.e. the requests are not forwarded to the remote server, but answered locally with a substitute page or image, - as determined by the handle-as-image and - + and set-image-blocker actions. - It is typically used to block ads or other obnoxious content.

Possible values:
Type:

N/A

Boolean.

Example usage:
Parameter:

     {+block}
-     .banners.example.com
-     .ads.r.us
-    

N/A

Notes:

If a URL matches one of the blocked patterns, Privoxy - will intercept the URL and display its special sends a special "BLOCKED" page - instead. If there is sufficient space, a large red banner will appear with - a friendly message about why the page was blocked, and a way to go there - anyway. If there is insufficient space a smaller "BLOCKED" page adapts to the available + screen space -- it displays full-blown if space allows, or miniaturized and text-only + if loaded into a small frame or window. If you are using Privoxy - page will appear without the red banner. + right now, you can take a look at the Click here"BLOCKED" - to view the default blocked HTML page (Privoxy must be running - for this to work as intended!). + page.

- A very important exception is if the URL matches both - "+block" and both + block and "+handle-as-image"handle-as-image, - then it will be handled by - "+set-image-blocker"set-image-blocker - (see below). It is important to understand this process, in order + (see below) also applies, the type of image will be determined by its parameter, + if not, the standard checkerboard pattern is sent. +

It is important to understand this process, in order to understand how Privoxy is able to deal with - ads and other objectionable content. +> deals with + ads and other unwanted content.

The The "+filter"filter - action can also perform some of the - same functionality as "+block", but by virtue of very - different programming techniques, and is most often used for different - reasons. +>"blocking" + banner images and other content through rewriting the relevant URLs in the + document's HTML source, so they don't get requested in the first place. + Note that this is a totally different technique, and it's easy to confuse the two.

Example usage (section):

{+block}      # Block and replace with "blocked" page
+.nasty-stuff.example.com
+
+{+block +handle-as-image} # Block and replace with image
+.ad.doubleclick.net
+.ads.r.us
+

8.5.3. +deanimate-gifscrunch-incoming-cookies

Type:
Typical use:

Parameterized.

Prevent the web server from setting any cookies on your system +

Typical uses:
Effect:

To stop those annoying, distracting animated GIF images. +> Deletes any "Set-Cookie:" HTTP headers from server replies.

Possible values:
Type:

"last" or "first" +>Boolean.

Parameter:

N/A

Example usage:
Notes:

       This action is only concerned with incoming cookies. For + outgoing cookies, use + crunch-outgoing-cookies. + Use {+deanimate-gifs{last}}
-      both to disable cookies completely. +

It makes .example.com
-    

no sense at all to use this action in conjunction + with the session-cookies-only action, + since it would prevent the session cookies from being set. +

Notes:
Example usage:

De-animate all animated GIF images, i.e. reduce them to their last frame. - This will also shrink the images considerably (in bytes, not pixels!). If - the option "first" is given, the first frame of the animation - is used as the replacement. If "last" is given, the last - frame of the animation is used instead, which probably makes more sense for - most banner animations, but also has the risk of not showing the entire - last frame (if it is only a delta to an earlier frame). +>
+crunch-incoming-cookies

8.5.4. +downgrade-http-versioncrunch-outgoing-cookies

Type:
Typical use:

Boolean.

Prevent the web server from reading any cookies from your system +

Typical uses:
Effect:

Deletes any "+downgrade-http-version" will downgrade HTTP/1.1 client requests to - HTTP/1.0 and downgrade the responses as well. +>"Cookie:" HTTP headers from client requests.

Possible values:
Type:

Boolean.

Parameter:

N/A

Example usage:
Notes:

      This action is only concerned with outgoing cookies. For + incoming cookies, use + crunch-incoming-cookies. + Use {+downgrade-http-version}
-     both to disable cookies completely. +

It makes .example.com
-    

no sense at all to use this action in conjunction + with the session-cookies-only action, + since it would prevent the session cookies from being read. +

Notes:
Example usage:

Use this action for servers that use HTTP/1.1 protocol features that - Privoxy doesn't handle well yet. HTTP/1.1 is - only partially implemented. Default is not to downgrade requests. This is - an infrequently needed action, and is used to help with rare problem sites only. +>
+crunch-outgoing-cookies

8.5.5. +fast-redirectsdeanimate-gifs

Type:
Typical use:

Boolean.

Stop those annoying, distracting animated GIF images.

Typical uses:
Effect:

The "+fast-redirects" action enables interception of - De-animate GIF animations, i.e. reduce them to their first or last image. +

Type:

Parameterized.

Parameter:

"redirect" requests from one server to another, which - are used to track users.Privoxy can cut off - all but the last valid URL in a redirect request and send a local redirect - back to your browser without contacting the intermediate site(s). +>"last" or "first"

Possible values:
Notes:

N/A +> This will also shrink the images considerably (in bytes, not pixels!). If + the option "first" is given, the first frame of the animation + is used as the replacement. If "last" is given, the last + frame of the animation is used instead, which probably makes more sense for + most banner animations, but also has the risk of not showing the entire + last frame (if it is only a delta to an earlier frame). +

You can safely use this action with patterns that will also match non-GIF + objects, because no attempt will be made at anything that doesn't look like + a GIF.

Example usage:

     
+deanimate-gifs{last}
+

8.5.6. {+fast-redirects}
-     downgrade-http-version

Typical use:

Work around (very rare) problems with HTTP/1.1

Effect:

Downgrades HTTP/1.1 client requests and server replies to HTTP/1.0. +

Type:

Boolean.

Parameter:

N/A +

Notes:

This is a left-over from the time when Privoxy + didn't support important HTTP/1.1 features well. It is left here for the + unlikely case that you experience HTTP/1.1 related problems with some server + out there. Not all (optional) HTTP/1.1 features are supported yet, so there + is a chance you might need this action. +

Example usage (section):

{+downgrade-http-version}
+problem-host.example.com
+

8.5.7. .example.com
-    

fast-redirects

Typical use:

Fool some click-tracking scripts and speed up indirect links

Effect:

Cut off all but the last valid URL from requests. +

Type:

Boolean.

Parameter:

N/A +

Notes:

Many sites, like yahoo.com, don't just link to other sites. Instead, they - will link to some script on their own server, giving the destination as a + will link to some script on their own servers, giving the destination as a parameter, which will then redirect you to the final target. URLs resulting from this scheme typically look like: http://some.place/some_script?http://some.where-elsehttp://some.place/click-tracker.cgi?target=http://some.where.else.

This is a normally "on" feature, and often requires exceptions - for sites that are sensitive to defeating this mechanism. +> This feature is currently not very smart and is scheduled for improvement. + It is likely to break some sites. You should expect to need possibly + many exceptions to this action, if it is enabled by default in + default.action. Some sites just don't work without + it.

Example usage:

{+fast-redirects}
+
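The cut-off described in the notes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Privoxy's implementation: the function name `fast_redirect_target` is made up for this example, and only `http://` and `https://` are treated as the start of an embedded URL.

```python
def fast_redirect_target(url: str) -> str:
    """Return the last URL embedded in `url`, or `url` itself if there is none.

    Hypothetical helper for illustration only.
    """
    # Start searching at index 1 so the request's own scheme is not matched.
    last = max(url.rfind("http://", 1), url.rfind("https://", 1))
    return url[last:] if last > 0 else url


if __name__ == "__main__":
    print(fast_redirect_target(
        "http://some.place/click-tracker.cgi?target=http://some.where.else"))
```

A URL without an embedded target is returned unchanged, which corresponds to the case where there is nothing to cut off.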

8.5.6. 8.5.8. +filterfilter

Typical use:

Get rid of HTML and JavaScript annoyances, banner advertisements (by size), do fun text replacements, etc.

Effect:

Text documents, including HTML and JavaScript, to which this action applies, are filtered on-the-fly + through the specified regular expression based substitutions. +

Type:

Parameterized.

Typical uses:
Parameter:

Apply page filtering as defined by named sections of the - The name of a filter, as defined in the filter file + (typically default.filter file to the specified site(s). - "Filtering" can be any modification of the raw - page content, including re-writing or deletion of content. +>, set by the + filterfile + option in the config file)

Possible values:
Notes:

For your convenience, there are a bunch of pre-defined filters available + in the distribution filter file that you can use. See the example below for + a list. +

This is potentially a very powerful feature! But "+filter" must include the name of one of the section identifiers - from default.filter (or whatever - filterfile is specified in config). +>"rolling your own" + filters requires a knowledge of regular expressions and HTML. +

Filtering requires buffering the page content, which may appear to + slow down page rendering since nothing is displayed until all content has + passed the filters. (It does not really take longer, but seems that way + since the page is not incrementally displayed.) This effect will be more + noticeable on slower connections. +

At this time, Privoxy cannot (yet!) uncompress compressed + documents. If you want filtering to work on all documents, even those that + would normally be sent compressed, use the + prevent-compression + action in conjunction with filter. +

Filtering can achieve some of the same effects as the + block + action, i.e. it can be used to block ads and banners. +

Feedback with suggestions for new or improved filters is particularly + welcome!

Example usage (from the current Example usage (with filters from the distribution default.filter):
file):

- +filter{html-annoyances}: Get rid of particularly annoying HTML abuse. -

+filter{html-annoyances}     # Get rid of particularly annoying HTML abuse.
- +filter{js-annoyances}: Get rid of particularly annoying JavaScript abuse -
+

+
+filter{js-annoyances}       # Get rid of particularly annoying JavaScript abuse
+

-

+filter{banners-by-size}     # Kill banners by size (+filter{content-cookies}:   Kill cookies that come in the HTML or JS content 
-   very efficient!)
+

+
+filter{content-cookies}     # Kill cookies that come sneaking in the HTML or JS content
+

- +filter{popups}: Kill all popups in JS and HTML -

+filter{popups}              # Kill all popups in JS and HTML
- +filter{frameset-borders}: Give frames a border and make them resizable -
+

- +filter{webbugs}: Squish WebBugs (1x1 invisible GIFs used for user tracking) -

+filter{webbugs}             # Squish WebBugs (1x1 invisible GIFs used for user tracking)
- +filter{refresh-tags}: Kill automatic refresh tags (for dial-on-demand setups) -
+

- +filter{fun}: Text replacements for subversive browsing fun! -

+filter{fun}                 # Text replacements for subversive browsing fun!
- +filter{nimda}: Remove Nimda (virus) code. -
+

+
+filter{frameset-borders}    # Give frames a border and make them resizable
+

- +filter{banners-by-size}: Kill banners by size (very efficient!) -

+filter{refresh-tags}        # Kill automatic refresh tags (for dial-on-demand setups)
+

+
+filter{nimda}               # Remove Nimda (virus) code.
+

- +filter{shockwave-flash}: Kill embedded Shockwave Flash objects -

+filter{shockwave-flash}     # Kill embedded Shockwave Flash objects
+

- +filter{crude-parental}: Kill all web pages that contain the words "sex" or "warez" -

+filter{crude-parental}      # Kill all web pages that contain the words "sex" or "warez"

Notes:

This is potentially a very powerful feature! And requires a knowledge - of regular expressions if you want to "roll your own". - Filtering operates on a line by line basis throughout the entire page. -

Filtering requires buffering the page content, which may appear to - slow down page rendering since nothing is displayed until all content has - passed the filters. (It does not really take longer, but seems that way - since the page is not incrementally displayed.) This effect will be more - noticeable on slower connections. -

Filtering can achieve some of the effects as the - "+block" - action, i.e. it can be used to block ads and banners. In the overall - scheme of things, filtering is one of the first things "Privoxy" - does with a web page. So other most other actions are applied to the - already "filtered" page.

8.5.7. 8.5.9. +hide-forwarded-for-headershandle-as-image

Type:

Boolean.

Typical uses:
Typical use:

Block any existing X-Forwarded-for HTTP header, and do not add a new one. -

Mark URLs as belonging to images (so they'll be replaced by images if they get blocked)

Possible values:
Effect:

This action alone doesn't do anything noticeable. It just marks URLs as images. + If the block action also applies, + the presence or absence of this mark decides whether an HTML "blocked" + page, or a replacement image (as determined by the set-image-blocker action) will be sent to the + client as a substitute for the blocked content. +

Type:

Boolean.

Parameter:

N/A

Example usage:
Notes:

     {+hide-forwarded-for-headers}
-      The below generic example section is actually part of default.action. + It marks all URLs with well-known image file name extensions as images and should + be left intact. +

Users will probably only want to use the handle-as-image action in conjunction with + block, to block sources of banners, whose URLs don't + reflect the file type, like in the second example section. +

Note that you cannot treat HTML pages as images in most cases. For instance, (inline) ad + frames require an HTML page to be sent, or they won't display properly. + Forcing handle-as-image in this situation will not replace the + ad frame with an image, but lead to error messages. +

Example usage (sections):

# Generic image extensions:
+#
+{+handle-as-image}
+/.*\.(gif|jpg|jpeg|png|bmp|ico)$
+
+# These don't look like images, but they're banners and should be
+# blocked as images:
+#
+{+block +handle-as-image}
+some.nasty-banner-server.com/junk.cgi?output=trash
+
+# Banner source! Who cares if they also have non-image content?
+ad.doubleclick.net 
+
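How block, handle-as-image and set-image-blocker interact, as described above, can be sketched in Python. This is a sketch under assumptions rather than Privoxy's own logic: the function names and the return labels `"blocked-page"` and `"pattern"` are hypothetical, and Python's `re` module only approximates the path pattern syntax.

```python
import re
from typing import Optional

# The generic image pattern from the example section, applied to the URL path.
IMAGE_PATH = re.compile(r"/.*\.(gif|jpg|jpeg|png|bmp|ico)$")


def is_image_path(path: str) -> bool:
    """True if the path ends in one of the well-known image extensions."""
    return IMAGE_PATH.match(path) is not None


def blocked_response(block: bool, handle_as_image: bool,
                     image_blocker: Optional[str] = None) -> Optional[str]:
    """Pick the substitute for a request (labels are hypothetical)."""
    if not block:
        return None                # not blocked: forwarded as usual
    if not handle_as_image:
        return "blocked-page"      # HTML "BLOCKED" page
    # Blocked and marked as image: set-image-blocker decides,
    # otherwise the standard checkerboard pattern is used.
    return image_blocker if image_blocker is not None else "pattern"
```

For a URL such as `/junk.cgi?output=trash`, `is_image_path` returns `False`, which is why the second example section has to apply handle-as-image explicitly.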

8.5.10. .example.com
-    

hide-forwarded-for-headers

Typical use:

Improve privacy by hiding the true source of the request

Effect:

Deletes any existing "X-Forwarded-for:" HTTP header from client requests, + and prevents adding a new one. +

Type:

Boolean.

Parameter:

N/A +

Notes:

It is fairly safe to leave this on. It does not seem to break many sites. +> It is fairly safe to leave this on. +

This action is scheduled for improvement: It should be able to generate forged + "X-Forwarded-for:" headers using random IP addresses from a specified network, + to make successive requests from the same client look like requests from a pool of different + users sharing the same proxy. +

Example usage:

+hide-forwarded-for-headers

8.5.8. 8.5.11. +hide-from-headerhide-from-header

Type:
Typical use:

Parameterized.

Keep your (old and ill) browser from telling web servers your email address

Typical uses:
Effect:

To block the browser from sending your email address in a Deletes any existing "From:" - header. +> HTTP header, or replaces it with the + specified string.

Possible values:
Type:

Parameterized.

Parameter:

Keyword:

Example usage:

     {+hide-from-header{block}}
-     .example.com
-    

Notes:

"block" will completely remove the header - (not to be confused with the block + action). +

Alternately, you can specify any value you prefer to be sent to the web + server. If you do, it is a matter of fairness not to use any address that + is actually used by a real person. +

This action is rarely needed, as modern web browsers don't send + "+block" action). - Alternately, you can specify any value you prefer to send to the web - server. +>"From:" headers anymore. +

Example usage:

+hide-from-header{block}
or +
+hide-from-header{spam-me-senseless@sittingduck.example.com}

8.5.9. 8.5.12. +hide-refererhide-referrer

Type:
Typical use:

Parameterized.

Conceal which link you followed to get to a particular site

Typical uses:
Effect:

Don't send the Deletes the "Referer:" (sic) HTTP header to the web site. - Or, alternately send a forged header instead. +> (sic) HTTP header from the client request, + or replaces it with a forged one.

Possible values:
Type:

Parameterized.

Parameter:

Prevent the header from being sent with the keyword,

  • "block". - Or, to delete the header completely.

  • "forge" a URL to one from the same server as the request. - Or, set to user defined value of your choice. -

Example usage:
to pretend to be coming from the homepage of the server we are talking to.

  •      {+hide-referer{forge}}
    -     .example.com
    -    

    Any other string to set a user defined referrer.

  • Notes:
    "forge" is the preferred option here, since some servers will - not send images back otherwise. + not send images back otherwise, in an attempt to prevent their valuable + content from being embedded elsewhere (and hence, without being surrounded + by their banners).

    - "+hide-referrer"hide-referer is an alternate spelling of - "+hide-referer". It has the exact same parameters, and can be freely - mixed with, "+hide-referer". (hide-referrer and the two can be can be freely + substituted with each other. ("referrer" is the @@ -2038,6 +2495,38 @@ CLASS="QUOTE" >.)

    Example usage:

    +hide-referrer{forge}
    or +
    +hide-referrer{http://www.yahoo.com/}
    +
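The three parameter forms listed above can be sketched in Python as follows. The function name `rewrite_referrer` is made up for this illustration, and building the forged value from the scheme and host of the requested URL is an assumption about what "the homepage of the server" means.

```python
from typing import Optional
from urllib.parse import urlsplit


def rewrite_referrer(param: str, request_url: str) -> Optional[str]:
    """Return the new "Referer:" value, or None if the header is deleted.

    Hypothetical helper for illustration only.
    """
    if param == "block":
        return None                       # delete the header completely
    if param == "forge":
        parts = urlsplit(request_url)     # pretend to come from the homepage
        return f"{parts.scheme}://{parts.netloc}/"
    return param                          # any other string: user-defined referrer
```

With `forge`, a request for `http://www.example.com/a/b.html` would carry `http://www.example.com/` as its referrer.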

    8.5.10. 8.5.13. +hide-user-agenthide-user-agent

    Type:
    Typical use:

    Parameterized.

    Conceal your type of browser and client operating system

    Typical uses:
    Effect:

    To change the Replaces the value of the "User-Agent:" header so web servers can't tell - your browser type. Who's business is it anyway? +> HTTP header + in client requests with the specified value.

    Possible values:
    Type:

    Any user defined string. +>Parameterized.

    Parameter:

    Any user-defined string.

    Example usage:
    Notes:

    Warning

          This breaks many web sites that depend on looking at this header in order + to customize their content for different browsers (which, by the + way, is {+hide-user-agent{Netscape 6.1 (X11; I; Linux 2.4.18 i686)}}
    -     NOT a smart way to do + that!). +

    Using this action in multi-user setups or wherever different types of + browsers will access the same Privoxy is + .msn.com
    -    

    not recommended. In single-user, single-browser + setups, you might use it to delete your OS version information from + the headers, because it is an invitation to exploit known bugs for your + OS. It is also occasionally useful to forge this in order to access + sites that won't let you in otherwise (though there may be a good + reason in some cases). Example of this: some MSN sites will not + let Mozilla enter, yet forging to a + Netscape 6.1 user-agent works just fine. + (Must be just a silly MS goof, I'm sure :-). +

    This action is scheduled for improvement. +

    Notes:
    Example usage:

    Warning! This breaks many web sites that depend on this in order - to determine how the target browser will respond to various - requests. Use with caution. +>
    +hide-user-agent{Netscape 6.1 (X11; I; Linux 2.4.18 i686)}

    8.5.11. 8.5.14. +handle-as-imagekill-popups

    Type:
    Typical use:

    Boolean.

    Eliminate those annoying pop-up windows

    Typical uses:
    Effect:

    To define what Privoxy should treat - automatically as an image, and is an important ingredient of how - ads are handled. +> While loading the document, replace JavaScript code that opens + pop-up windows with (syntactically neutral) dummy code on the fly.

    Possible values:
    Type:

    N/A -

    Boolean.

    Example usage:
    Parameter:

         {+handle-as-image}
    -     /.*\.(gif|jpg|jpeg|png|bmp|ico)
    -    

    N/A +

    Notes:

    This only has meaning if the URL (or pattern) also is - "+block"ed, in which case a user definable image can - be sent rather than a HTML page. This is integral to the whole concept of - ad blocking: the URL must match both a "+block" rule, +> This action is easily confused with the built-in, hardwired filter + action, but there are important differences: For kill-popups, + the document need not be buffered, so it can be incrementally rendered while + downloading. But kill-popups doesn't catch as many pop-ups as + filter{popups} + does. +

    Think of it as a fast and efficient replacement for a filter that you + can use if you don't want any filtering at all. Note that it doesn't make + sense to combine it with any filter action, + since as soon as one filter applies, + the whole document needs to be buffered anyway, which destroys the advantage of + the kill-popups action over its filter equivalent. +

    Killing all pop-ups is a dangerous business. Many shops and banks rely on + pop-ups to display forms, shopping carts etc., and killing only the unwanted pop-ups + would require artificial intelligence in Privoxy. + If the only kind of pop-ups that you want to kill are exit consoles (those and "+handle-as-image". - (See "+set-image-blocker"really nasty windows that appear when you close another + one), you might want to use + filter{js-annoyances} - below for control over what will actually be displayed by the browser.) + instead.

    Example usage:

    There is little reason to change the default definition for this action. -

    +kill-popups

    8.5.12. 8.5.15. +set-image-blockerlimit-connect

    Type:
    Typical use:

    Parameterized.

    Prevent abuse of Privoxy as a TCP proxy relay

    Typical uses:
    Effect:

    Decide what to do with URLs that end up tagged with both - "+block" - and "+handle-as-image", - e.g an advertisement. +> Specifies to which ports HTTP CONNECT requests are allowable.

    Possible values:
    Type:

    There are four available options: "-set-image-blocker" will send a HTML - "blocked" page, usually resulting in a "broken - image" icon. - "+set-image-blocker{blank}" will send a - 1x1 transparent GIF image. - "+set-image-blocker{pattern}" will send a - checkerboard type pattern (the default). And finally, - "+set-image-blocker{http://xyz.com}" will - send a HTTP temporary redirect to the specified image. This has the - advantage of the icon being being cached by the browser, which will speed - up the display. -

    Parameterized.

    Example usage:
    Parameter:

         {+set-image-blocker{blank}}
    -     .example.com
    -    

    A comma-separated list of ports or port ranges (the latter using dashes, with the minimum + defaulting to 0 and the maximum to 65K). +

    Notes:

    If you want invisible ads, they need to meet - criteria as matching both images and blocked - actions. And then, "image-blocker" should be set to +> By default, i.e. if no limit-connect action applies, Privoxy only allows HTTP CONNECT + requests to port 443 (the standard, secure HTTPS port). Use + limit-connect if more fine-grained control is desired + for some or all destinations. +

    The CONNECT method exists in HTTP to allow access to secure websites + ("blank" for invisibility. Note you cannot treat HTML pages as - images in most cases. For instance, frames require an HTML page to - display. So a frame that is an ad, typically cannot be treated as an image. - Forcing an "image" in this situation just will not work - reliably. +>"https://" URLs) through proxies. It works very simply: + the proxy connects to the server on the specified port, and then + short-circuits its connections to the client and to the remote server. + This can be a big security hole, since CONNECT-enabled proxies can be + abused as TCP relays very easily. +

    If you don't know what any of this means, there probably is no reason to + change this one, since the default is already very restrictive. +

    Example usages:

    +limit-connect{443}                   # This is the default and need not be specified.
    ++limit-connect{80,443}                # Ports 80 and 443 are OK.
    ++limit-connect{-3, 7, 20-100, 500-}   # Ports less than 3, 7, 20 to 100 and above 500 are OK.
    ++limit-connect{-}                     # All ports are OK (gaping security hole!)
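The parameter syntax shown in the examples above can be checked with a short Python sketch. It is an illustration only: the function name `port_allowed` is hypothetical, "65K" is assumed to mean 65535, and range bounds are assumed to be inclusive, which the text does not state.

```python
MAX_PORT = 65535  # assumption: "65K" read as the top of the TCP port range


def port_allowed(spec: str, port: int) -> bool:
    """Check `port` against a comma-separated list of ports and port ranges.

    Hypothetical helper; a missing minimum defaults to 0 and a missing
    maximum to MAX_PORT, as described for the parameter above.
    """
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        if "-" in item:
            low, _, high = item.partition("-")
            lo = int(low) if low.strip() else 0
            hi = int(high) if high.strip() else MAX_PORT
            if lo <= port <= hi:   # bounds treated as inclusive (assumption)
                return True
        elif int(item) == port:
            return True
    return False
```

Under these assumptions `port_allowed("-", p)` is true for every valid port `p`, which is the "gaping security hole" case from the last example.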

    8.5.13. 8.5.16. +limit-connectprevent-compression

    Type:
    Typical use:

    Parameterized.

    Ensure that servers send the content uncompressed, so it can be + passed through filters +

    Typical uses:
    Effect:

    By default, Privoxy only allows HTTP CONNECT - requests to port 443 (the standard, secure HTTPS port). Use - "+limit-connect" to disable this altogether, or to allow - more ports. +> Adds a header to the request that asks for uncompressed transfer.

    Possible values:
    Type:

    Any valid port number, or port number range. -

    Boolean.

    Example usages:
    Parameter:

         +limit-connect{443}                       # This is the default and need not be specified.
    -     +limit-connect{80,443}                  # Ports 80 and 443 are OK.
    -     +limit-connect{-3, 7, 20-100, 500-}   # Port less than 3, 7, 20 to 100 and above 500 are OK.
    -    

    N/A +

    Notes:

    The CONNECT methods exists in HTTP to allow access to secure websites - (https:// URLs) through proxies. It works very simply: the proxy connects - to the server on the specified port, and then short-circuits its - connections to the client and to the remote proxy. - This can be a big security hole, since CONNECT-enabled proxies can be - abused as TCP relays very easily. -

    More and more websites send their content compressed by default, which + is generally a good idea and saves bandwidth. But for the filter, deanimate-gifs + and kill-popups actions to work, + Privoxy needs access to the uncompressed data. + Unfortunately, Privoxy can't yet(!) uncompress, filter, and + re-compress the content on the fly. So if you want to ensure that all websites, including + those that normally compress, can be filtered, you need to use this action. +

    - If you want to allow CONNECT for more ports than this, or want to forbid - CONNECT altogether, you can specify a comma separated list of ports and - port ranges (the latter using dashes, with the minimum defaulting to 0 and - max to 65K). -

    This will slow down transfers from those websites, though. If you use any of the above-mentioned + actions, you will typically want to use prevent-compression in conjunction + with them. +

    If you don't know what any of this means, there probably is no reason to - change this one. -

    Note that some (rare) ill-configured sites don't handle requests for uncompressed + documents correctly (they send an empty document body). If you use prevent-compression + by default, you'll have to add exceptions for those sites. See the example for how to do that. +

    Example usage (sections):

    # Set default:
    +#
    +{+prevent-compression}
    +/ # Match all sites
    +
    +# Make exceptions for ill sites:
    +#
    +{-prevent-compression}
    +www.debianhelp.org
    +www.pclinuxonline.com
    +

    8.5.14. 8.5.17. +prevent-compressionsend-vanilla-wafer

    Type:
    Typical use:

    Boolean.

    Feed log analysis scripts with useless data. +

    Typical uses:
    Effect:

    Prevent the specified websites from compressing HTTP data. +> Sends a cookie with each request stating that you do not accept any copyright + on cookies sent to you, and asking the site operator not to track you.

    Possible values:
    Type:

    Boolean.

    Parameter:

    N/A

    Example usage:
    Notes:

         {+prevent-compression}
    -     .example.com
    -    

    The vanilla wafer is a (relatively) unique header and could conceivably be used to track you. +

    This action is rarely used and not enabled in the default configuration. +

    Notes:
    Example usage:

    Some websites do this, which can be a problem for - Privoxy, since - "+filter", - "+kill-popups" - and "+gif-deanimate"
    +send-vanilla-wafer
    - will not work on compressed data. This will slow down connections to those - websites, though. Default typically is to turn - "prevent-compression" on.

    8.5.15. 8.5.18. +session-cookies-onlysend-wafer

    Type:
    Typical use:

    Boolean.

    Send custom cookies or feed log analysis scripts with even more useless data. +

    Typical uses:
    Effect:

    Allow cookies for the current browser session only. +> Sends a custom, user-defined cookie with each request.

    Possible values:
    Type:

    N/A -

    Multi-value.

    Example usage (disabling):
    Parameter:

         {-session-cookies-only}
    -     .example.com
    -    

    A string of the form "name=value". +

    Notes:

    If websites set cookies, "+session-cookies-only" will make sure - they are erased when you exit and restart your web browser. This makes - profiling cookies useless, but won't break sites which require cookies so - that you can log in for transactions. This is generally turned on for all - sites, and is the recommended setting. +> Being multi-valued, multiple instances of this action can apply to the same request, + resulting in multiple cookies being sent.

    "+prevent-*-cookies" actions should be turned off as well (see - below), for "+session-cookies-only" to work. Or, else no cookies - will get through at all. For, "persistent" cookies that survive - across browser sessions, see below as well. +> This action is rarely used and not enabled in the default configuration. +

    Example usage (section):

    {+send-wafer{UsingPrivoxy=true}}
    +my-internal-testing-server.void

    8.5.16. 8.5.19. +prevent-reading-cookiessession-cookies-only

    Type:
    Typical use:

    Boolean.

    Allow only temporary "session" cookies (for the current browser session only).

    Typical uses:
    Effect:

    Explicitly prevent the web server from reading any cookies on your - system. +> Deletes the "expires" field from "Set-Cookie:" server headers. + Most browsers will not store such cookies permanently and forget them in between sessions.

    Possible values:
    Type:

    N/A -

    Boolean.

    Example usage:
    Parameter:

         {+prevent-reading-cookies}
    -     .example.com
    -    

    N/A +

    Notes:

    Often used in conjunction with "+prevent-setting-cookies" to - disable cookies completely. Note that - "+session-cookies-only" This is less strict than crunch-incoming-cookies / + crunch-outgoing-cookies and allows you to browse + websites that insist or rely on setting cookies, without compromising your privacy too badly. +

    Most browsers will not permanently store cookies that have been processed by + session-cookies-only and will forget about them between sessions. + This makes profiling cookies useless, but won't break sites which require cookies so + that you can log in for transactions. This is generally turned on for all + sites, and is the recommended setting. +

    It makes no sense at all to use session-cookies-only - requires these to both be disabled (or else it never gets any cookies to cache). + together with crunch-incoming-cookies or + crunch-outgoing-cookies. If you do, cookies + will be plainly killed.

    Note that it is up to the browser how it handles such cookies without an "expires" field. If you use an exotic browser, you might want to try it out to be sure.

    For "persistent" cookies to work (i.e. they survive across browser sessions and reboots), all three cookie settings should be "off" for the specified sites.

    Example usage:

    +session-cookies-only
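
    The effect described for session-cookies-only (deleting the "expires" field from "Set-Cookie:" server headers) can be pictured with a short Python sketch. This is only an illustration under simplifying assumptions, not Privoxy's actual implementation: the function name strip_expires and the sample header value are made up, and attribute parsing is reduced to splitting on semicolons.

```python
def strip_expires(set_cookie_value):
    """Drop the 'expires' attribute from a Set-Cookie header value.

    Simplified: attributes are assumed to be separated by ';' and the
    attribute name is compared case-insensitively.
    """
    parts = [p.strip() for p in set_cookie_value.split(";")]
    kept = [p for p in parts if p and not p.lower().startswith("expires=")]
    return "; ".join(kept)


# Hypothetical header value as a server might send it.
original = "sid=abc123; expires=Wed, 09-Jun-2021 10:18:14 GMT; path=/"
print(strip_expires(original))
```

    Without the "expires" attribute, most browsers treat the cookie as a session cookie, which is the behavior the notes above rely on.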

    8.5.17. 8.5.20. +prevent-setting-cookiesset-image-blocker

    Type:

    Boolean.

    Typical uses:

    Explicitly block the web server from storing cookies on your - system. -

    Possible values:
    Typical use:

    N/A -

    Choose the replacement for blocked images

    Example usage:
    Effect:

         {+prevent-setting-cookies}
    -      This action alone doesn't do anything noticeable. If .example.com
    -    

    Notes:

    Often used in conjunction with "+prevent-reading-cookies" to - disable cookies completely (see above). -

    both + 8.5.18. block +kill-popups

    Type:

    Boolean.

    Typical uses:

    Stop those annoying JavaScript pop-up windows! +>and handle-as-image also + apply, i.e. if the request is to be blocked as an image, + then the parameter of this action decides what will be + sent as a replacement.

    Possible values:
    Type:

    N/A -

    Parameterized.

    Example usage:
    Parameter:

         {+kill-popups}
    -     .example.com
    -    

    Notes:

    • "+kill-popups" uses a built in filter to disable pop-ups - that use the window.open() function, etc. This is - one of the first actions processed by "pattern" to send a built-in checkerboard pattern image. The image is visually + decent, scales very well, and makes it obvious where banners were busted. +

    • "blank" to send a built-in transparent image. This makes banners disappear + completely, but makes it hard to detect where Privoxy has blocked + images on a given page and complicates troubleshooting if Privoxy - as it contacts the remote web server. This action is not always 100% reliable, - and is supplemented by "+filter{popups}". -

    8.5.19. +send-vanilla-wafer

  • Type:
    "target-url" to + send a redirect to target-url. You can redirect + to any image anywhere, even in your local filesystem (via "file:///" URL). +

    Boolean.

    A good application of redirects is to use special Privoxy-built-in + URLs, which send the built-in images, as target-url. + This has the same visual effect as specifying "blank" or "pattern" in + the first place, but enables your browser to cache the replacement image, instead of requesting + it over and over again. +

  • Typical uses:
    Notes:

    Sends a cookie for every site stating that you do not accept any copyright - on cookies sent to you, and asking them not to track you. +> The URLs for the built-in images are "http://config.privoxy.org/send-banner?type=type", where type is + either "blank" or "pattern".

    Possible values:

    N/A +> There is a third (advanced) type, called "auto". It is NOT to be + used in set-image-blocker, but meant for use from filters. + Auto will select the type of image that would have applied to the referring page, had it been an image.

    Example usage:

         {+send-vanilla-wafer}
    -     .example.com
    -    

    Notes:
    Built-in pattern: +

    This action only applies if you are using a jarfile
    +set-image-blocker{pattern}
    - for saving cookies. Of course, this is a (relatively) unique header and - could conceivably be used to track you.

    8.5.20. +send-wafer

    Type:

    Multi-value.

    Typical uses:

    This allows you to send an arbitrary, user definable cookie. +> Redirect to the BSD devil:

    Possible values:

    User specified cookie name and corresponding value. +>
    +set-image-blocker{http://www.freebsd.org/gifs/dae_up3.gif}

    Example usage:

         {+send-wafer{name=value}}
    -     .example.com
    -    

    Notes:
    Redirect to the built-in pattern for better caching: +

    This can be specified multiple times in order to add as many cookies as you - like. +>
    +set-image-blocker{http://config.privoxy.org/send-banner?type=pattern}

    8.5.21. Summary

    for a brief example on troubleshooting actions.

    8.5.22. Sample Actions Files8.6. Aliases

    Remember that the meaning of any of the above references is reversed by preceding the action with a "-", in place of the "+". Also, that some actions are turned on in the default section of the actions file, and require little to no additional configuration. These are just "on".

    But, other actions that are turned on in the default section do typically require exceptions to be listed in the latter sections of one of our actions file. For instance, by default no URLs are "blocked" (i.e. in the default definitions of default.action). We need exceptions to this in order to enable ad blocking in the lower sections. But we need to be very selective about what we do block. Thus, the default is "off" for blocking.

    Custom "actions", known to Privoxy as "aliases", can be defined by combining other actions. These can in turn be invoked just like the built-in actions. Currently, an alias name can contain any character except space, tab, "=", "{" and "}", but we strongly recommend that you only use "a" to "z", "0" to "9", "+", and "-". Alias names are not case sensitive, and are not required to start with a "+" or "-" sign, since they are merely textually expanded.

    Aliases can be used throughout the actions file, but they must be defined in a special section at the top of the file!

    There are two main reasons to use aliases: One is to save typing for frequently used combinations of actions, the other one is a gain in flexibility: If you decide once how you want to handle shops by defining an alias called "shop", you can later change your policy on shops in one place, and your changes will take effect everywhere in the actions file where the "shop" alias is used. Calling aliases by their purpose also makes your actions files more readable.

    Currently, there is one big drawback to using aliases, though: Privoxy's built-in web-based action file editor honors aliases when reading the actions files, but it expands them before writing. So the effects of your aliases are of course preserved, but the aliases themselves are lost when you edit sections that use aliases with it. This is likely to change in future versions of Privoxy.

    Now let's define some aliases...

     # Useful custom aliases we can use later.
    + #
    + # Note the (required!) section header line and that this section
    + # must be at the top of the actions file!
    + #
    + {{alias}}
    +
    + # These aliases just save typing later:
    + # (Note that some already use other aliases!)
    + #
    + +crunch-all-cookies = +crunch-incoming-cookies +crunch-outgoing-cookies
    + -crunch-all-cookies = -crunch-incoming-cookies -crunch-outgoing-cookies
    + block-as-image      = +block +handle-as-image
    + mercy-for-cookies   = -crunch-all-cookies -session-cookies-only
    +
    + # These aliases define combinations of actions
    + # that are useful for certain types of sites:
    + #
    + fragile     = -block -crunch-all-cookies -filter -fast-redirects -hide-referer -kill-popups
    + shop        = -crunch-all-cookies -filter{popups} -kill-popups
    +
    + # Short names for other aliases, for really lazy people ;-)
    + #
    + c0 = +crunch-all-cookies
    + c1 = -crunch-all-cookies

    ...and put them to use. These sections would appear in the lower part of an actions file and define exceptions to the default actions (as specified further up for the "/" pattern):

     # These sites are either very complex or very keen on
    + # user data and require minimal interference to work:
    + #
    + {fragile}
    + .office.microsoft.com
    + .windowsupdate.microsoft.com
    + .nytimes.com
    +
    + # Shopping sites:
    + # Allow cookies (for setting and retrieving your customer data)
    + #           
    + {shop}
    + .quietpc.com
    + .worldpay.com   # for quietpc.com
    + .scan.co.uk
    +
    + # These shops require pop-ups:
    + #
    + {shop -kill-popups -filter{popups}}
    +  .dabs.com
    +  .overclockers.co.uk

    Aliases like "shop" and "fragile" are often used for "problem" sites that require some actions to be disabled in order to function properly.
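
    Since aliases are merely textually expanded, their effect can be pictured with a small Python sketch. It is an illustration only, not Privoxy's own code: the ALIASES table repeats a few definitions from the example alias section above, the function name expand is made up, and tokens are split on whitespace, which would not work for action parameters that contain spaces. It also does not guard against aliases that refer to themselves.

```python
# A few alias definitions copied from the example {{alias}} section.
ALIASES = {
    "+crunch-all-cookies": "+crunch-incoming-cookies +crunch-outgoing-cookies",
    "-crunch-all-cookies": "-crunch-incoming-cookies -crunch-outgoing-cookies",
    "shop": "-crunch-all-cookies -filter{popups} -kill-popups",
    "c1": "-crunch-all-cookies",
}


def expand(action_line, aliases=ALIASES):
    """Replace alias names by their definitions until only actions remain."""
    result = []
    for token in action_line.split():
        key = token.lower()  # alias names are not case sensitive
        if key in aliases:
            # An alias may itself use other aliases, so expand recursively.
            result.extend(expand(aliases[key], aliases))
        else:
            result.append(token)
    return result


print(expand("shop"))
```

    Here "shop" first expands to a line that still contains the alias "-crunch-all-cookies", which is then expanded in turn, so only built-in actions are left at the end.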

    8.7. Actions Files Tutorial

    Below is a liberally commented sample default.action file to demonstrate how all the pieces come together. And to show how exceptions to the default policies can be handled. This is followed by a brief user.action with similar examples.

    The above chapters have shown which actions files there are and how they are organized, how actions are specified and applied to URLs, how patterns work, and how to define and use aliases. Now, let's look at an example default.action and user.action file and see how all these pieces come together:

    8.7.1. default.action

    # Sample default.action file <developers@privoxy.org>
    -
    -# Settings -- Don't change! For internal Privoxy use ONLY.
    -{{settings}}
    -for-privoxy-version=3.0
    -
    -
    -##########################################################################
    -# Every config file should start with a short comment stating its purpose:

    # Sample default.action file <developers@privoxy.org>

    Then, since this is the default.action file, the +first section is a special section for internal use that you needn't +change or worry about:

    ##########################################################################
    +# Settings -- Don't change! For internal Privoxy use ONLY.
    +##########################################################################
    +
    +{{settings}}
    +for-privoxy-version=3.0

    After that comes the (optional) alias section. We'll use the example +section from the above Aliases must be defined *before* they are used. These are
    -# easier to remember, and can combine several actions into one. Once 
    -# defined they can be used just like any built-in action -- but within 
    -# this file only! Aliases do not require a + or - sign.
    -##########################################################################
    -
    -# Some useful aliases.
    -# Alias to turn off cookie handling, ie allow all cookies unmolested.
    - -prevent-cookies = -prevent-setting-cookies -prevent-reading-cookies \
    -                    -session-cookies-only
    -
    -# Alias to both block and treat as if an image for ad blocking
    -# purposes.
    - +imageblock      = +block +handle-as-image
    -
    -# Fragile sites should have the minimum changes:
    - fragile     = -block -deanimate-gifs -fast-redirects -filter -hide-referer \
    -               -prevent-cookies -kill-popups
    -
    -# Shops should be allowed to set persistent cookies
    - shop        = -filter -prevent-cookies -session-cookies-only
    -
    -
    -##########################################################################
    -# Begin default action settings. Anything in this section will match 
    -# all URLs -- UNLESS we have exceptions that also match, defined below this 
    -# section. We will show all potential actions here whether they are on 
    -# or off. We could omit any disabled action if we wanted, since all 
    -# actions are 'off' by default anyway. Shown for completeness only.
    -# Actions are enabled if preceded by a '+', otherwise they are disabled 
    -# (unless an alias has been defined without this).
    -##########################################################################
    - { \
    chapter on aliases, +that also explains why and how aliases are used:

    ##########################################################################
    +# Aliases
    +##########################################################################
    +{{alias}}
    +
    +# These aliases just save typing later:
    +# (Note that some already use other aliases!)
    +#
    ++crunch-all-cookies = +crunch-incoming-cookies +crunch-outgoing-cookies
    +-crunch-all-cookies = -crunch-incoming-cookies -crunch-outgoing-cookies
    +block-as-image      = +block +handle-as-image
    +mercy-for-cookies   = -crunch-all-cookies -session-cookies-only
    +
    +# These aliases define combinations of actions
    +# that are useful for certain types of sites:
    +#
    +fragile     = -block -crunch-all-cookies -filter -fast-redirects -hide-referer -kill-popups
    +shop        = mercy-for-cookies -filter{popups} -kill-popups

    Now come the regular sections, i.e. sets of actions, accompanied by URL patterns to which they apply. Remember all actions are disabled when matching starts, so we have to explicitly enable the ones we want.

    The first regular section is probably the most important. It has only one pattern, "/", but this pattern matches all URLs. Therefore, the set of actions used in this "default" section will be applied to all requests as a start. It can be partly or wholly overridden by later matches further down this file, or in user.action, but it will still be largely responsible for your overall browsing experience.

    Again, at the start of matching, all actions are disabled, so there is no real need to disable any actions here, but we will do that nonetheless, to have a complete listing for your reference. (Remember: a "+" preceding the action name enables the action, a "-" disables it.) Also note how this long line has been made more readable by splitting it into multiple lines with line continuation.

    ##########################################################################
    +# "Defaults" section:
    +##########################################################################
    + { \
    + -add-header \
    + -block \
    + -crunch-incoming-cookies \
    + -crunch-outgoing-cookies \
    + +deanimate-gifs \
    + -downgrade-http-version \
    + +fast-redirects \
    + +filter{html-annoyances} \
    + +filter{js-annoyances} \
    + -filter{content-cookies} \
    + +filter{popups} \
    + +filter{webbugs} \
    + -filter{refresh-tags} \
    + -filter{fun} \
    + +filter{nimda} \
    + +filter{banners-by-size} \
    + -filter{shockwave-flash} \
    + -filter{crude-parental} \
    + -handle-as-image \
    + +hide-forwarded-for-headers \
    + +hide-from-header{block} \
    + +hide-referrer{forge} \
    + -hide-user-agent \
    -handle-as-image \
    +set-image-blocker{pattern} \
    + -kill-popups \
    + -limit-connect \
    + +prevent-compression \
    -session-cookies-only \
    -prevent-reading-cookies \
    -prevent-setting-cookies \
    -kill-popups \
    + -send-vanilla-wafer \
    + -send-wafer \
    - }
    - / # forward slash will match *all* potential URL patterns. 
    -
    -##########################################################################
    -# Default behavior is now set. Now we will define some exceptions to our 
    -# default action policies.
    -##########################################################################
    -
    -# These sites are very complex and require very minimal interference.
    -# We'll disable most actions with our 'fragile' alias:
    - { fragile }
    - .office.microsoft.com           # surprise, surprise!
    - .windowsupdate.microsoft.com
    -
    -
    -# Shopping sites - not as fragile but require some special 
    -# handling. We still want to block ads, and we will allow 
    -# persistant cookies via the 'shop' alias:
    - { shop }
    - .quietpc.com 
    - .worldpay.com   # for quietpc.com
    - .jungle.com
    - .scan.co.uk
    -
    -
    -# These sites require pop-ups too :(  We'll combine our 'shop' 
    -# alias with two other actions into one rule to allow all popups.
    - { shop -kill-popups -filter{popups} }
    - .dabs.com
    - .overclockers.co.uk
    -
    -
    -# The 'Fast-redirects' action breaks some sites. Disable this action
    -# for these known sensitive sites:
    - { -fast-redirects }
    - login.yahoo.com
    - edit.europe.yahoo.com
    - .google.com
    - .altavista.com/.*(like|url|link):http
    - .altavista.com/trans.*urltext=http
    - .nytimes.com
    -
    -
    -# Define which file types will be treated as images. Important
    -# for ad blocking.
    - { +handle-as-image }
    - /.*\.(gif|jpe?g|png|bmp|ico)
    -
    -
    -# Now lets list some domains that are known ad generators. And
    -# our alias that we use here will block these as well as force 
    -# them to be treated as images. This combination of actions is 
    -# important for ad blocking. What the browser will show instead is 
    -# determined by the setting of
    + +session-cookies-only \
    + +set-image-blocker{pattern} \
    + }
    + / # forward slash will match *all* potential URL patterns.

    The default behavior is now set. Note that some actions, like not hiding + the user agent, are part of a "+set-image-blocker"
    - { +imageblock }
    - ar.atwola.com 
    - .ad.doubleclick.net
    - .a.yimg.com/(?:(?!/i/).)*$
    - .a[0-9].yimg.com/(?:(?!/i/).)*$
    - bs*.gsanet.com
    - bs*.einets.com
    - .qkimg.net
    - ad.*.doubleclick.net
    -
    -
    -# These will just simply be blocked. They will generate the BLOCKED
    -# banner page, if matched. Heavy use of wildcards and regular 
    -# expressions in this example. Enable block action:
    - { +block }
    - ad*.
    - .*ads.
    - banner?.
    - count*.
    - /.*count(er)?\.(pl|cgi|exe|dll|asp|php[34]?)
    - /(?:.*/)?(publicite|werbung|rekla(ma|me|am)|annonse|maino(kset|nta|s)?)/
    - .hitbox.com 
    -
    -
    -# The above block section will probably inadvertantly catch some 
    -# sites we DO NOT want blocked via the wildcards and regular expressions. 
    -# Now let's set exceptions to the exceptions so the good guys get better 
    -# treatment. Disable block action:
    - { -block }
    - advogato.org
    - adsl.
    - ad[ud]*.
    - advice.
    -# Let's just trust all .edu top level domains.
    - .edu
    - www.ugu.com/sui/ugu/adv
    -# We'll need to access to path names containing 'download' 
    - .*downloads.
    - /downloads/
    -# 'adv' is for globalintersec and means advanced, not advertisement
    - www.globalintersec.com/adv
    -
    -
    -# Don't filter *anything* from our friends at sourceforge.
    -# Notice we don't have to name the individual filter 
    -# identifiers -- we just turn them all off in one fell swoop.
    -# Disable all filters for this one site:
    - { -filter }
    - .sourceforge.net
    -   

    - "general policy" that applies + universally and won't get any exceptions defined later. Other choices, + like not blocking (which is understandably the + default!) need exceptions, i.e. we need to specify explicitly what we + want to block in later sections. + We will also want to make exceptions from our general pop-up-killing, + and use our defined aliases for that.

    The first of our specialized sections is concerned with "fragile" sites, i.e. sites that require minimum interference, because they are either very complex or very keen on tracking you (and have mechanisms in place that make them unusable for people who avoid being tracked). We will simply use our pre-defined fragile alias instead of stating the list of actions explicitly:

    ##########################################################################
    +# Exceptions for sites that'll break under the default action set:
    +##########################################################################
    +
    +# "Fragile" Use a minimum set of actions for these sites (see alias above):
    +#
    +{ fragile }
    +.office.microsoft.com           # surprise, surprise!
    +.windowsupdate.microsoft.com

    So far we are painting with a broad brush by setting general policies. - The above would be a reasonable starting point for many situations. Now, - we want to be more specific and have customized rules that are more suitable - to our personal habits and preferences. These would be for narrowly defined - situations like your ISP or your bank, and should be placed in - user.action, which is parsed after all other - actions files and should not be clobbered by upgrades. So any settings here, - will have the last word and over-ride any previously defined actions.

    Shopping sites are not as fragile, but they typically require cookies to log in, and pop-up windows for shopping carts or item details. Again, we'll use a pre-defined alias:

    Now a few examples of some things that one might do with a - user.action file.

    # Shopping sites:
    +#
    +{ shop }
    +.quietpc.com 
    +.worldpay.com   # for quietpc.com
    +.jungle.com
    +.scan.co.uk

    # Sample user.action file.
    -
    -# Any aliases you want to use need to be re-defined here.
    -# Alias to turn off cookie handling, ie allow all cookies unmolested.
    - -prevent-cookies = -prevent-setting-cookies -prevent-reading-cookies \
    -                    -session-cookies-only
    -
    -# Fragile sites should have the minimum changes:
    - fragile     = -block -deanimate-gifs -fast-redirects -filter -hide-referer \
    -               -prevent-cookies -kill-popups
    -
    -# Allow persistent cookies for a few regular sites that we 
    -# trust via our above alias. These will be saved from one browser session 
    -# to the next. We are explicity turning off any and all cookie handling, 
    -# even though the prevent-*-cookie settings were disabled in our above 
    -# default.action anyway. So cookies from these domains will come through 
    -# unmolested.
    - { -prevent-cookies }
    - .sun.com
    - .yahoo.com
    - .msdn.microsoft.com
    - .redhat.com
    -
    -
    -# My ISP uses obnoxious self promoting images on many pages.
    -# Nuke them :) Note that "+handle-as-image" need not be specified,
    -# since all URLs ending in .gif will be tagged as images by the
    -# general rules in default.action anyway.
    - { +block }
    - www.my-isp-example.com/logo[0-9].gif
    -
    -
    -# Say the site where you do your homebanking needs to open
    -# popup windows, but you have chosen to kill popups by
    -# default. This will allow it for your-example-bank.com:
    -#
    - { -filter{popups} -kill-popups }
    - .my-example-bank.com
    -
    -
    -# This site is delicate, and requires kid-glove 
    -# treatment.
    - { fragile }
    - .forbes.com
    -   

    Then, there are sites which rely on pop-up windows (yuck!) to work. Since we made pop-up-killing our default above, we need to make exceptions now. Mozilla users, who can turn on smart handling of unwanted pop-ups in their browsers, can safely choose -filter{popups} (and -kill-popups) above and hence don't need this section. Anyway, disabling an already disabled action doesn't hurt, so we'll define our exceptions regardless of what was chosen in the defaults section:

    # These sites require pop-ups too :( 
    +#
    +{ -kill-popups -filter{popups} }
    +.dabs.com
    +.overclockers.co.uk
    +.deutsche-bank-24.de

    The fast-redirects action, which we enabled per default above, breaks some sites. So disable it for popular sites where we know it misbehaves:
    { -fast-redirects }
    +login.yahoo.com
    +edit.*.yahoo.com
    +.google.com
    +.altavista.com/.*(like|url|link):http
    +.altavista.com/trans.*urltext=http
    +.nytimes.com

    It is important that Privoxy knows which + URLs belong to images, so that if they are to + be blocked, a substitute image can be sent, rather than an HTML page. + Contacting the remote site to find out is not an option, since it + would destroy the loading time advantage of banner blocking, and it + would feed the advertisers (in terms of money and - as "aliases", can be defined by combining other "actions". - These can in turn be invoked just like the built-in handle-as-image action, + and marking all URLs that end in a known image file extension is a + good start:

    ##########################################################################
    +# Images:
    +##########################################################################
    +
    +# Define which file types will be treated as images, in case they get
    +# blocked further down this file:
    +#
    +{ +handle-as-image }
    +/.*\.(gif|jpe?g|png|bmp|ico)$
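
    As a rough illustration of what this path pattern does and does not catch, here is a small Python sketch. It uses Python's re module as a stand-in for Privoxy's own pattern matcher, so details such as anchoring and case handling are assumptions, and the sample paths and the function name looks_like_image are made up.

```python
import re

# The path pattern from the { +handle-as-image } section, tried with Python's
# re module. Privoxy's own matcher may differ in details; this is only an
# approximation for illustration.
IMAGE_PATH = re.compile(r"/.*\.(gif|jpe?g|png|bmp|ico)$")


def looks_like_image(path):
    """Return True if the (hypothetical) request path matches the pattern."""
    return IMAGE_PATH.match(path) is not None


for path in ("/images/banner.gif", "/photo.jpeg", "/cgi-bin/ad.pl?id=42"):
    print(path, looks_like_image(path))
```

    A URL whose path ends in a script name is not recognized this way, even if the script returns an image.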

    And then there are known banner sources. They often use scripts to generate the banners, so it won't be visible from the URL that the request is for an image. Hence we block them and mark them as images in one go, with the help of our block-as-image alias defined above. (We could of course just as well use +block +handle-as-image here.) Remember that the type of the replacement image is chosen by the set-image-blocker action. Since all URLs have matched the default section with its +set-image-blocker{pattern} action before, it still applies and needn't be repeated:

    # Known ad generators:
    +#
    +{ block-as-image }
    +ar.atwola.com 
    +.ad.doubleclick.net
    +.ad.*.doubleclick.net
    +.a.yimg.com/(?:(?!/i/).)*$
    +.a[0-9].yimg.com/(?:(?!/i/).)*$
    +bs*.gsanet.com
    +bs*.einets.com
    +.qkimg.net

    One of the most important jobs of Privoxy is to block banners. A huge bunch of them are already "blocked" by the filter{banners-by-size} action, which we enabled above, and which deletes the references to banner images from the pages while they are loaded, so the browser doesn't request them anymore, and hence they don't need to be blocked here. But this naturally doesn't catch all banners, and some people choose not to use filters, so we need a comprehensive list of patterns for banner URLs here, and apply the block action to them.

    First comes a bunch of generic patterns, which do most of the work, by matching typical domain and path name components of banners. Then comes a list of individual patterns for specific sites, which is omitted here to keep the example short:

    ##########################################################################
    +# Block these fine banners:
    +##########################################################################
    +{ +block }
    +
    +# Generic patterns:
    +# 
    +ad*.
    +.*ads.
    +banner?.
    +count*.
    +/.*count(er)?\.(pl|cgi|exe|dll|asp|php[34]?)
    +/(?:.*/)?(publicite|werbung|rekla(ma|me|am)|annonse|maino(kset|nta|s)?)/
    +
    +# Site-specific patterns (abbreviated):
    +#
    +.hitbox.com

    You wouldn't believe how many advertisers actually call their banner servers ads.company.com, or call the directory in which the banners are stored simply "banners". So the above generic patterns are surprisingly effective.

    But being very generic, they necessarily also catch URLs that we don't want to block. The pattern .*ads., for example, catches "nasty-ads.nasty-corp.com" as intended, but also "downloads.sourceforge.net" or "adsl.some-provider.net". So here come some well-known exceptions to the +block section above.

    Note that these are exceptions to exceptions from the default! Consider the URL "downloads.sourceforge.net": Initially, all actions are deactivated, so it wouldn't get blocked. Then comes the defaults section, which matches the URL, but just deactivates the block action once again. Then it matches .*ads., an exception to the general non-blocking policy, and suddenly +block applies. And now, it'll match .*loads., where -block applies, so (unless it matches again further down) it ends up with no block action applying.

    ##########################################################################
    +# Save some innocent victims of the above generic block patterns:
    +##########################################################################
    +
    +# By domain:
    +# 
    +{ -block }
    +adv[io]*.  # (for advogato.org and advice.*)
    +adsl.      # (has nothing to do with ads)
    +ad[ud]*.   # (adult.* and add.*)
    +.edu       # (universities don't host banners (yet!))
    +.*loads.   # (downloads, uploads etc)
    +
    +# By path:
    +#
    +/.*loads/
    +
    +# Site-specific:
    +#
    +www.globalintersec.com/adv # (adv = advanced)
    +www.ugu.com/sui/ugu/adv
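
    The way later sections override earlier ones for the block action can be traced with a short Python sketch. It is a simplified model, not Privoxy's matching code: the three sections are reduced to the block action only, and each pattern is replaced by a hand-written predicate (the helper any_label_endswith is made up) that merely approximates what the domain pattern is meant to catch.

```python
def any_label_endswith(suffix):
    """Stand-in for a domain pattern: some host label ends with `suffix`."""
    return lambda host: any(label.endswith(suffix) for label in host.split("."))


# (description, predicate on the host, changes to the action set), in file order.
SECTIONS = [
    ("defaults: /", lambda host: True, {"block": False}),
    ("{ +block } .*ads.", any_label_endswith("ads"), {"block": True}),
    ("{ -block } .*loads.", any_label_endswith("loads"), {"block": False}),
]


def resolve(host):
    """Apply every matching section in order; later matches win."""
    actions = {"block": False}  # all actions are disabled when matching starts
    trace = []
    for description, matches, changes in SECTIONS:
        if matches(host):
            actions.update(changes)
            trace.append(description)
    return actions, trace


for host in ("downloads.sourceforge.net", "nasty-ads.nasty-corp.com"):
    print(host, resolve(host))
```

    For the first host all three sections match and the last one switches block off again; for the second host only the first two match, so block stays on.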

    Filtering source code can have nasty side effects, so make an exception for our friends at sourceforge.net, and all paths with "cvs" in them. Note that -filter disables all filters in one fell swoop!

    # Don't filter code!
    +#
    +{ -filter }
    +/.*cvs
    +.sourceforge.net

    The actual default.action is of course more comprehensive, but we hope this example made clear how it works.

    8.7.2. user.action

    So far we are painting with a broad brush by setting general policies, which would be a reasonable starting point for many people. Now, you may want to be more specific and have customized rules that are more suitable to your personal habits and preferences. These would be for narrowly defined situations like your ISP or your bank, and should be placed in user.action, which is parsed after all other actions files and hence has the last word, over-riding any previously defined actions. user.action is also a safe place for your personal settings, since default.action is actively maintained by the Privoxy developers and you'll probably want to install updated versions from time to time.

    Now let's define a few aliases:

    So let's look at a few examples of things that one might typically do in + user.action:

    # My user.action file. <fred@foobar.com>

    As aliases are local to the actions + file that they are defined in, you can't use the ones from + default.action, unless you repeat them here:

    # (Re-)define aliases for this file:
    +#
    +{{alias}}
    +-crunch-all-cookies = -crunch-incoming-cookies -crunch-outgoing-cookies
    +mercy-for-cookies   = -crunch-all-cookies -session-cookies-only
    +fragile     = -block -crunch-all-cookies -filter -fast-redirects -hide-referer -kill-popups
    +shop        = mercy-for-cookies -filter{popups} -kill-popups
    +allow-ads   = -block -filter{banners-by-size} # (see below)
    +
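
Since aliases may be defined in terms of other aliases, it can help to see what a name finally expands to. The following Python sketch is a hypothetical illustration only: the `ALIASES` dictionary and the `expand` helper are made up for this example and do not reflect Privoxy's internals.

```python
# Two of the aliases from the example above, as a plain dictionary
# (hypothetical representation, chosen only for this sketch).
ALIASES = {
    "-crunch-all-cookies": ["-crunch-incoming-cookies", "-crunch-outgoing-cookies"],
    "mercy-for-cookies": ["-crunch-all-cookies", "-session-cookies-only"],
}


def expand(name, aliases):
    """Recursively replace alias names by the actions they stand for."""
    if name not in aliases:
        return [name]  # a real action, nothing to expand
    actions = []
    for part in aliases[name]:
        actions.extend(expand(part, aliases))
    return actions


print(expand("mercy-for-cookies", ALIASES))
# ['-crunch-incoming-cookies', '-crunch-outgoing-cookies', '-session-cookies-only']
```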

    Say you have accounts on some sites that you visit regularly, and you don't want to have to log in manually each time. So you'd like to allow persistent cookies for these sites. The

     # Useful custom aliases we can use later. These must come first!
    - {{alias}}
    - +prevent-cookies = +prevent-setting-cookies +prevent-reading-cookies
    - -prevent-cookies = -prevent-setting-cookies -prevent-reading-cookies
    - fragile     = -block -prevent-cookies -filter -fast-redirects -hide-referer -kill-popups
    - shop        = -prevent-cookies -filter -fast-redirects
    - +imageblock = +block +handle-as-image
    -
    - # Aliases defined from other aliases, for people who don't like to type 
    - # too much:  ;-)
    - c0 = +prevent-cookies
    - c1 = -prevent-cookies
    - #... etc.  Customize to your heart's content.
    -   

    mercy-for-cookies alias defined above does exactly that, i.e. it disables crunching of cookies in any direction, and processing of cookies to make them temporary.

    { mercy-for-cookies }
    +sunsolve.sun.com
    +slashdot.org
    +.yahoo.com
    +.msdn.microsoft.com
    +.redhat.com

    Your bank needs popups and is allergic to some filter, but you don't know which, so you disable them all:

    { -filter -kill-popups }
    +.your-home-banking-site.com

    While browsing the web with Privoxy you + noticed some ads that sneaked through, but you were too lazy to + report them through our fine and easy feedback - aliases from above. These would appear in the lower sections of an - actions file as exceptions to the default actions (as defined in the - upper section):

    { +block }
    +www.a-popular-site.com/some/unobvious/path
    +another.popular.site.net/more/junk/here/

    Note that, assuming the banners in the above example have regular image extensions (most do),

     # These sites are very complex and require
    - # minimal interference.
    - {fragile}
    -  .office.microsoft.com
    -  .windowsupdate.microsoft.com
    -  .nytimes.com
    -
    - # Shopping sites - but we still want to block ads.
    - {shop}
    -  .quietpc.com
    -  .worldpay.com   # for quietpc.com
    -  .scan.co.uk
    -
    - # These shops require pop-ups also 
    - {shop -kill-popups}
    -  .dabs.com
    -  .overclockers.co.uk
    -   

    +handle-as-image need not be specified, since all URLs ending in these extensions will already have been tagged as images in the relevant section of default.action by now.

    Then you noticed that the default configuration breaks Forbes Magazine, but you were too lazy to find out which action is the culprit, and you were again too lazy to give feedback, so you just used the fragile alias on the site, and -- whoa! -- it worked:

    { fragile }
    +.forbes.com

    You like the "fun" text replacements in default.filter, but it is disabled in the distributed actions file. (My colleagues on the team just don't have a sense of humour, that's why! ;-). So you'd like to turn it on in your private, update-safe config, once and for all:

    { +filter{fun} }
    +/ # For ALL sites!

    Note that the above is not really a good idea: There are exceptions to the filters in default.action for things that really shouldn't be filtered, like code on CVS->Web interfaces. Since user.action has the last word, these exceptions won't be valid for the "fun" filtering specified here.

    Finally, you might think about how your favourite free websites are funded, and find that they rely on displaying banner advertisements to survive. So you might want to specifically allow banners for those sites that you feel provide value to you:

    { allow-ads }
    +.sourceforge.net
    +.slashdot.org
    +.osdn.net

    Note that allow-ads has been aliased to -block -filter{banners-by-size} above.

    Privoxy uses Perl-style "regular expressions" in its actions files and filter file, through the PCRE and PCRS libraries.

    If you are reading this, you probably don't understand what "regular expressions" are, or what they can do. So this will be a very brief introduction only. A full explanation would require a book ;-)

    Regular expressions provide a language to describe patterns that can be run against strings of characters (letters, numbers, etc), to see if they match the string or not. The patterns are themselves (sometimes complex) strings of literal characters, combined with wild-cards, and other special characters, called meta-characters. The "meta-characters" have special meanings and are used to build complex patterns to be matched against. Perl Compatible Regular Expressions are an especially convenient "dialect" of the regular expression language.

    To make a simple analogy, we do something similar when we use wild-card characters when listing files with the

    s/string1/string2/g - This is used to rewrite strings of text. - "string1" is replaced by "string2" in this - example. There must of course be a match on "string1" first. -

    These are just some of the ones you are likely to use when matching URLs with is not in the expression anywhere).

    s/microsoft(?!.com)/MicroSuck/i - This is - a substitution. "MicroSuck" will replace any occurrence of - "microsoft". The "i" at the end of the expression - means ignore case. The "(?!.com)" means - the match should fail if "microsoft" is followed by - ".com". In other words, this acts like a "NOT" - modifier. In case this is a hyperlink, we don't want to break it ;-).

    We are barely scratching the surface of regular expressions here so that you can understand the default http://www.perldoc.com/perl5.6/pod/perlre.html

    For information on regular expression based substitutions and their applications in filters, please see the filter file tutorial in this manual.

    14.2. Privoxy

    There is a shortcut: http://p.p/ (But it doesn't provide a fallback to a real page, in case the request is not sent through Privoxy)

  • Short cuts. Turn off, then on:

  • Privoxy - Submit Actions File Feedback

  • Credit: The site which gave me the general idea for these bookmarklets is +> Credit: The site which gave us the general idea for these bookmarklets is checks to see if the URL matches any "+block""+handle-as-image" page is sent back. Otherwise, if it does match, an image is returned. The type of image depends on the setting of "+set-image-blocker"

    If the URL pattern matches the "+fast-redirects" Now the rest of the client browser's request headers are processed. If any of these match any of the relevant actions (e.g. "+hide-user-agent""+prevent-setting-cookies""+crunch-incoming-cookies", "+session-cookies-only", and "+downgrade-http-version"

    If the "+kill-popups"

    If a "+filter" or "+deanimate-gifs"

    If neither "+filter" or "+deanimate-gifs" applies "actions" - and actions and "filters"filters to any given URL can be complex, and not always so easy to understand what is happening. And sometimes we need to be able to @@ -1352,11 +1285,11 @@ CLASS="APPLICATION" > is doing is causing us a problem inadvertently. It can be a little daunting to look at the actions and filters files themselves, since they tend to be filled with - "regular expressions" whose consequences are not always - so obvious.

    regular expressions whose consequences are not + always so obvious.

    One quick test to see if "+filter" This tells us how we have defined our "actions". The first is negating our previous cookie setting, which was for "+session-cookies-only" any "+fast-redirects""+imageblock". ("Aliases" is done here -- as both a "+block" an "+handle-as-image"The Main Configuration File suffix

    Default value:
    Default values:

    The filter file to use

    No textual content filtering takes place, i.e. all +filter{+filter{nameNotes:

    The filter file contains content modification rules that use regular expressions. These rules permit powerful changes on the content of Web pages, e.g., you could disable your favorite JavaScript annoyances, re-write the actual displayed text, or just have some fun replacing wherever it appears on a Web page.

    The +filter{name} actions rely on the relevant filter (name) to be defined in the filter file!

    A pre-defined filter file called default.filter that contains a bunch of handy filters for common problems is included in the distribution. See the section on the filter action for a list.

  • Privoxy for more users than just yourself, it might be a good idea to let them know how to reach you, what you block and why you do that, your policies, etc.

    The User Manual URI is used for help links from some of the internal CGI pages. The manual itself is normally packaged with the binary distributions, so you probably want to set this to a locally installed copy. For multi-user setups, you could provide a copy on a local webserver for all your users and use the corresponding URL here.

    The value of this option only matters if the experimental trust mechanism has been activated. (See trustfile above.)

    Specifies:

    Key values that determine what information gets logged to the logfile.

    Default value:

    localhost:8118

    127.0.0.1:8118

    Effect if unset:

    Bind to localhost (127.0.0.1), port 8118. This is suitable and recommended for +> Bind to 127.0.0.1 (localhost), port 8118. This is suitable and recommended for home users who run Privoxy"toggled off" mode, i.e. behave like a normal, content-neutral - proxy. See enable-remote-toggle - below. This is not really useful anymore, since toggling is much easier - via below. This is not really useful + anymore, since toggling is much easier via the web - interface than via editing the the web interface than via + editing the conf file. @@ -1797,11 +1846,16 @@ CLASS="EMPHASIS" Privoxy only listens on the localhost or internal (home) - network address by means of the listen-address option. +> only listens on the localhost + (127.0.0.1) or internal (home) network address by means of the + listen-address + option.

    Please see the warnings in the FAQ that this proxy is not intended to be a substitute diff --git a/doc/webserver/user-manual/configuration.html b/doc/webserver/user-manual/configuration.html index 50825391..3b4256fd 100644 --- a/doc/webserver/user-manual/configuration.html +++ b/doc/webserver/user-manual/configuration.html @@ -4,8 +4,7 @@ >Privoxy Configuration

    6.1. Controlling Privoxydefault.action (which you will most propably want +> (which you will most probably want to define sooner or later) are probably best applied in

    We value your feedback. However, to provide you with the best support, please - note the following sections.

    We value your feedback. In fact, we rely on it to improve Privoxy and its configuration. However, please note the following hints, so we can provide you with the best support:

    11.1. Get Support

    For casual users, our support forum at SourceForge is probably best suited: http://sourceforge.net/tracker/?group_id=11118&atid=211118

    All users are of course welcome to discuss their issues on the users mailing list, where the developers also hang around.

    11.2. Report Bugs

    Please report all bugs only through our bug tracker: http://sourceforge.net/tracker/?group_id=11118&atid=111118.

    Before doing so, please make sure that the bug has not already been submitted and observe the additional hints at the top of the submit form.

    Please try to verify that it is a Privoxy bug, and not a browser or site bug first. If unsure, try toggling off Privoxy, and see if the problem persists. The appendix of the user manual also has helpful information on action debugging. If you are using your own custom configuration, please try the stock configs to see if the problem is configuration related.

    If not using the latest version, chances are that the bug has been found and fixed in the meantime. We would appreciate it if you could take the time to upgrade to the latest version (or even the latest CVS snapshot) and verify your bug, but this is not required for reporting.

    11.3. Request New Features

    You are welcome to submit ideas on new features or other proposals for improvement through our feature request tracker at http://sourceforge.net/tracker/?atid=361118&group_id=11118.

    11.4. Report Ads or Other Actions-Related Problems

    Please send feedback on ads that slipped through, innocent images that were blocked, and any other problems relating to the default.action file through our actions feedback mechanism located at http://www.privoxy.org/actions/. On this page, you will also find a bookmark which will take you back there from any troubled site and even pre-fill the form!

    New, improved ijbswa-announce - list.

    project page.

    11.5. Other

    For any other issues, feel free to use the mailing lists. Technically interested users and people who wish to contribute to the project are also welcome on the developers list! You can find an overview of all Privoxy-related mailing lists, including list archives, at: http://sourceforge.net/mail/?group_id=11118.

    Anyone interested in actively participating in development and related - discussions can also join the appropriate mailing list. Archives are - available, too. See the page on Sourceforge. -

    Privoxy Copyright, License and History

    12.1. License

    12.2. History

    In the beginning, there was the Internet Junkbuster, by Anonymous Coders and Junkbusters Corporation. It saved many users a lot of pain in the early days of web advertising and user tracking.

    But the web, its protocols and standards, and with it, the techniques for forcing users to consume ads, give up autonomy over their browsing, and for spying on them, kept evolving. Unfortunately, the Internet Junkbuster did not. Version 2.0.2, published in 1998, was (and is) the last official release available from Junkbusters Corporation. Fortunately, it had been released under the GNU GPL, which allowed further development by others.

    So Stefan Waldherr started maintaining an improved version of the software, to which eventually a number of people contributed patches. It could already replace banners with a transparent image, and had a first version of pop-up killing, but it was still very closely based on the original, with all its limitations, such as the lack of HTTP/1.1 support, flexible per-site configuration, or content modification. The last release from this effort was version 2.0.2-10, published in 2000.

    Then, some developers picked up the thread, and started turning the software inside out, upside down, and then reassembled it, adding many new features along the way.

    The result of this is Privoxy, whose first stable release, 3.0, is due in May 2002.

    12.3. Authors

    Current Project Developers:

     Jon Foster
    + Andreas Oesterhelt
    + Stefan Waldherr

    + Thomas Steudten
    + Rodney Stromlund

    Current Project Contributors:

     Rodrigo Barbosa (RPM specfiles)
    + Hal Burgiss (docs)
    + Alexander Lazic
    + Gábor Lipták
    + Guy
    + Haroon Rafique
    + David Schmidt (OS/2, Mac OSX ports)
    + Joerg Strohmayer
    + Sarantis Paskalis

    Originally developed by:

     Junkbusters Corp.
    + Anonymous Coders

    Thanks to the many people who have tested Privoxy, reported bugs, or made + suggestions. These include (in alphabetical order):

     Ken Arromdee
    + Reiner Buehl
    + Andrew J. Caines
    + Clifford Caoile
    + Peter E
    + Aaron Hamid
    + Magnus Holmgren
    + Paul Lieverse
    + Roberto Ragusa
    + Bart Schelstraete
    + Darren Wiebe

    9. The Filter File

    All text substitutions that can be invoked through the filter action must first be defined in the filter file, which is typically called default.filter, and which can be selected through the filterfile config option.

    Typical reasons for doing such substitutions are to eliminate common annoyances in HTML and JavaScript, such as pop-up windows, exit consoles, crippled windows without navigation tools, the infamous <BLINK> tag etc, to suppress images with certain width and height attributes (standard banner sizes or web-bugs), or just to have fun. The possibilities are endless.

    Filtering works on any text-based document type, including plain text, HTML, JavaScript, CSS etc. (all text/* MIME types). Substitutions are made at the source level, so if you want to "roll your own" filters, you should be familiar with HTML syntax.

    Just like the actions files, the filter file is organized in sections, which are called filters here. Each filter consists of a heading line that starts with the keyword FILTER:, followed by the filter's name, and a short (one line) description of what it does. Below that line come the jobs, i.e. lines that define the actual text substitutions. By convention, the name of a filter should describe what the filter eliminates. The description is used in the web-based user interface.

    Once a filter called name has been defined in the filter file, it can be invoked by using an action of the form +filter{name} in any actions file.

    A filter header line for a filter called "foo" could look like this:

    FILTER: foo Replace all "foo" with "bar"
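
To make the three parts of such a header line concrete, here is a small Python sketch that splits one apart. The function name `parse_filter_header` is hypothetical and the code only mirrors the layout described above (keyword, name, one-line description); it is not Privoxy's own parser.

```python
def parse_filter_header(line):
    """Split a 'FILTER: name description' line into (name, description)."""
    # Split on whitespace at most twice, so the description stays intact.
    keyword, name, description = line.split(None, 2)
    if keyword != "FILTER:":
        raise ValueError("not a filter header line")
    return name, description


name, description = parse_filter_header('FILTER: foo Replace all "foo" with "bar"')
print(name)         # foo
print(description)  # Replace all "foo" with "bar"
```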

    Below that line, and up to the next header line, come the jobs that define what text replacements the filter executes. They are specified in a syntax that imitates Perl's s/// operator. If you are familiar with Perl, you will find this to be quite intuitive, and may want to look at the PCRS man page for the subtle differences to Perl behaviour. Most notably, the non-standard option letter U is supported, which turns the default to ungreedy matching.

    If you are new to regular expressions, you might want to take a look at the Appendix on regular expressions, and see the Perl manual for the s/// operator's syntax and Perl-style regular expressions in general. The examples below might also help to get you started.

    9.1. Filter File Tutorial

    Now, let's complete our "foo" filter. We have already defined the heading, but the jobs are still missing. Since all it does is to replace "foo" with "bar", there is only one (trivial) job needed:

    s/foo/bar/

    But wait! Didn't the comment say that all occurrences of "foo" should be replaced? Our current job will only take care of the first "foo" on each page. For global substitution, we'll need to add the g option:

    s/foo/bar/g

    Our complete filter now looks like this:

    FILTER: foo Replace all "foo" with "bar"
    +s/foo/bar/g
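
The difference the g option makes can be tried out with Python's `re` module, which behaves analogously here: `count=1` roughly corresponds to a job without g, and the default corresponds to a job with g. This is a sketch for illustration; Privoxy itself uses the PCRS library, not Python.

```python
import re

page = "foo and foo and foo"

# Like s/foo/bar/ : only the first occurrence is replaced.
first_only = re.sub(r"foo", "bar", page, count=1)

# Like s/foo/bar/g : every occurrence is replaced.
everywhere = re.sub(r"foo", "bar", page)

print(first_only)  # bar and foo and foo
print(everywhere)  # bar and bar and bar
```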

    Let's look at some real filters for more interesting examples. Here you see a filter that protects against some common annoyances that arise from JavaScript abuse. Let's look at its jobs one after the other:

    FILTER: js-annoyances Get rid of particularly annoying JavaScript abuse
    +
    +# Get rid of JavaScript referrer tracking. Test page: http://www.randomoddness.com/untitled.htm
    +#
    +s|(<script.*)document\.referrer(.*</script>)|$1"Not Your Business!"$2|Usg

    Following the header line and a comment, you see the job. Note that it uses | as the delimiter instead of /, because the pattern contains a forward slash, which would otherwise have to be escaped by a backslash (\).

    Now, let's examine the pattern: it starts with the text <script.* enclosed in parentheses. Since the dot matches any character, and * means: "Match an arbitrary number of the element left of myself", this matches "<script", followed by any text, i.e. it matches the whole page, from the start of the first <script> tag.

    This file uses regular expressions to alter or remove any string in the - target page. The expressions can only operate on one line at a time. Some - examples from the included default default.filter:

    That's more than we want, but the pattern continues: document\.referrer matches only the exact string "document.referrer". The dot needed to be escaped, i.e. preceded by a backslash, to take away its special meaning as a joker, and make it just a regular dot. So far, the meaning is: Match from the start of the first <script> tag in the page, up to, and including, the text "document.referrer", if both are present in the page (and appear in that order).

    Stop web pages from displaying annoying messages in the status bar by - deleting such references:

    But there's still more pattern to go. The next element, again enclosed in parentheses, is .*</script>. You already know what .* means, so the whole pattern translates to: Match from the start of the first <script> tag in a page to the end of the last </script> tag, provided that the text "document.referrer" appears somewhere in between.

    This is still not the whole story, since we have ignored the options and the parentheses: The portions of the page matched by sub-patterns that are enclosed in parentheses will be remembered and be available through the variables

     FILTER: html-annoyances
    -
    - # New browser windows should be resizeable and have a location and status
    - # bar. Make it so.
    - #
    - s/resizable="?(no|0)"?/resizable=1/ig s/noresize/yesresize/ig
    - s/location="?(no|0)"?/location=1/ig s/status="?(no|0)"?/status=1/ig
    - s/scrolling="?(no|0|Auto)"?/scrolling=1/ig
    - s/menubar="?(no|0)"?/menubar=1/ig 
    -
    - # The <BLINK> tag was a crime!
    - #
    - s*<blink>|</blink>**ig
    -
    - # Is this evil? 
    - #
    - #s/framespacing="?(no|0)"?//ig
    - #s/margin(height|width)=[0-9]*//gi
    -   

    -

    $1, $2, ... in the substitute. The U option switches to ungreedy matching, which means that the first .* in the pattern will only "eat up" all text in between "<script" and the first occurrence of "document.referrer", and that the second .* will only span the text up to the first "</script>" tag. Furthermore, the s option says that the match may span multiple lines in the page, and the g option again means that the substitution is global.

    So, to summarize, the pattern means: Match all scripts that contain the text "document.referrer". Remember the parts of the script from (and including) the start tag up to (and excluding) the string "document.referrer" as $1, and the part following that string, up to and including the closing tag, as $2.

    Now the pattern is deciphered, but wasn't this about substituting things? So let's look at the substitute:

     FILTER: fun
    -
    - s/microsoft(?!.com)/MicroSuck/ig
    -
    - # Buzzword Bingo:
    - #
    - s/industry-leading|cutting-edge|award-winning/<font color=red><b>BINGO!</b></font>/ig
    -   

    -

    $1"Not Your Business!"$2 is easy to read: The text remembered as $1, followed by "Not Your Business!" (including the quotation marks!), followed by the text remembered as $2. This produces an exact copy of the original string, with the middle part (the "document.referrer") replaced by "Not Your Business!".

    Kill those pesky little web-bugs:

    The whole job now reads: Replace "document.referrer" by "Not Your Business!" wherever it appears inside a <script> tag. Note that this job won't break JavaScript syntax, since both the original and the replacement are syntactically valid string objects. The script just won't have access to the referrer information anymore.
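
The effect of ungreedy matching on this job can be reproduced in Python. Note the assumptions: Python's `re` module has no U option, so the sketch writes the ungreedy quantifiers explicitly as `.*?`, uses `re.DOTALL` in place of the s option, and relies on `re.subn` replacing all matches (like g). The sample page is made up for the example.

```python
import re

# A made-up page with two scripts that both read the referrer.
page = (
    "<script>a = document.referrer;</script>"
    "<p>hi</p>"
    "<script>b = document.referrer;</script>"
)
substitute = r'\1"Not Your Business!"\2'

# Greedy: the first .* runs up to the LAST document.referrer on the page.
greedy = r"(<script.*)document\.referrer(.*</script>)"
# Ungreedy (what the U option selects): stop at the first opportunity.
ungreedy = r"(<script.*?)document\.referrer(.*?</script>)"

greedy_result, greedy_count = re.subn(greedy, substitute, page, flags=re.DOTALL)
lazy_result, lazy_count = re.subn(ungreedy, substitute, page, flags=re.DOTALL)

print(greedy_count)  # 1
print(lazy_count)    # 2
print(lazy_result)
```

With the greedy pattern the whole page is consumed by a single match, so only the last occurrence is rewritten; with the ungreedy pattern each script is matched and rewritten separately.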

    We'll show you two other jobs from the JavaScript taming department, but this time only point out the constructs of special interest:

    # The status bar is for displaying link targets, not pointless blahblah
    +#
    +s/window\.status\s*=\s*['"].*?['"]/dUmMy=1/ig

     # webbugs: Squish WebBugs (1x1 invisible GIFs used for user tracking)
    - FILTER: webbugs
    -
    - s/<img\s+[^>]*?(width|height)\s*=\s*['"]?1\D[^>]*?(width|height)\s*=\s*['"]?1(\D[^>]*?)?>/<!-- Squished WebBug -->/sig
    -   

    -

    The \s stands for whitespace characters (space, tab, newline, carriage return, form feed), so that \s* means: "zero or more whitespace". The ? in .*? makes this matching of arbitrary text ungreedy. (Note that the U option is not set). The ['"] construct means: "a single or a double quote".

    So what does this job do? It replaces assignments of single- or double-quoted strings to the "window.status" object with a dummy assignment (using a variable name that is hopefully odd enough not to conflict with real variables in scripts). Thus, it catches many cases where e.g. pointless descriptions are displayed in the status bar instead of the link target when you move your mouse over links.
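
The same pattern works unchanged in Python's `re` module, which makes it easy to experiment with. This is a sketch with made-up input strings; `re.IGNORECASE` stands in for the i option, and `sub` replaces all matches like g.

```python
import re

# The pattern of the status bar job; ['"] accepts either kind of quote.
status_job = re.compile(r"""window\.status\s*=\s*['"].*?['"]""", re.IGNORECASE)

snippet = "window.status = 'Click here!'; return true;"
shouting = 'WINDOW.STATUS="hi"'

print(status_job.sub("dUmMy=1", snippet))   # dUmMy=1; return true;
print(status_job.sub("dUmMy=1", shouting))  # dUmMy=1
```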

    # Kill OnUnload popups. Yummy. Test: http://www.zdnet.com/zdsubs/yahoo/tree/yfs.html
    +#
    +s/(<body .*)onunload(.*>)/$1never$2/iU

    Including the OnUnload event binding in the HTML DOM was a CRIME. When I close a browser window, I want it to close and die. Basta. This job replaces the "onunload" attribute in "<body>" tags with the dummy word never. Note that the i option makes the pattern matching case-insensitive.

    The last example is from the fun department:

      +filter{html-annoyances}
    FILTER: fun Fun text replacements
    +
    +# Spice the daily news:
    +#
    +s/microsoft(?!\.com)/MicroSuck/ig

    Note the (?!\.com) part (a so-called negative lookahead) in the job's pattern, which means: Don't match, if the string ".com" appears directly following "microsoft" in the page. This prevents links to microsoft.com from being messed up, while still replacing the word everywhere else.

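
Negative lookahead is also available in Python's `re` module with the same syntax, so the job from the fun filter can be tried out directly. The sample text is made up for the example.

```python
import re

text = "Microsoft news at www.microsoft.com"

# (?!\.com) refuses the match when ".com" follows immediately.
result = re.sub(r"microsoft(?!\.com)", "MicroSuck", text, flags=re.IGNORECASE)

print(result)  # MicroSuck news at www.microsoft.com
```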
    # Buzzword Bingo (example for extended regex syntax)
    +#
    +s* industry[ -]leading \
    +|  cutting[ -]edge \
    +|  award[ -]winning # Comments are OK, too! \
    +|  high[ -]performance \
    +|  solutions[ -]based \
    +|  unmatched \
    +|  unparalleled \
    +|  unrivalled \
    +*<font color="red"><b>BINGO!</b></font> \
    +*igx

    +The x option in this job turns on extended syntax, and allows for e.g. the liberal use of (non-interpreted!) whitespace for nicer formatting.
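Python offers the same extended syntax through re.VERBOSE, so the Buzzword Bingo job can be sketched as follows. This is an approximation rather than the filter file's own syntax: whitespace outside character classes is ignored, the space inside [ -] still counts, and # starts a comment. The function name bingo is hypothetical.

```python
import re

# re.VERBOSE corresponds to the x option, re.IGNORECASE to the i option.
BUZZWORDS = re.compile(
    r"""
      industry[ -]leading
    | cutting[ -]edge
    | award[ -]winning      # comments are OK here, too
    | high[ -]performance
    | solutions[ -]based
    | unmatched
    | unparalleled
    | unrivalled
    """,
    re.IGNORECASE | re.VERBOSE,
)

REPLACEMENT = '<font color="red"><b>BINGO!</b></font>'


def bingo(text: str) -> str:
    """Replace every buzzword with the BINGO! marker."""
    return BUZZWORDS.sub(REPLACEMENT, text)


print(bingo("Our Industry-Leading, cutting edge product"))
```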

    You get the idea?

    Privoxy User Manual

    -$Id: user-manual.sgml,v 1.105 2002/05/05 20:26:02 hal9 Exp $
    +$Id: user-manual.sgml,v 1.117 2002/05/17 13:56:16 oes Exp $

    -2.1.1. Red Hat and SuSE RPMs
    +2.1.1. Red Hat, SuSE RPMs and Conectiva
    2.1.2.
    -5.1. RedHat and Debian
    +5.1. RedHat, Conectiva and Debian
    5.2.
    6.1. Controlling Privoxy
    8.1. Finding the Right Mix
    8.2. How to Edit
    8.3. How Actions are Applied to URLs
    8.4. Patterns
    8.4.1. The Domain Pattern
    8.4.2. The Path Pattern
    -8.5.1. +add-header
    +8.5.1. add-header
    -8.5.2. +block
    +8.5.2. block
    -8.5.3. +deanimate-gifs
    +8.5.3. crunch-incoming-cookies
    -8.5.4. +downgrade-http-version
    +8.5.4. crunch-outgoing-cookies
    -8.5.5. +fast-redirects
    +8.5.5. deanimate-gifs
    -8.5.6. +filter
    +8.5.6. downgrade-http-version
    -8.5.7. +hide-forwarded-for-headers
    +8.5.7. fast-redirects
    -8.5.8. +hide-from-header
    +8.5.8. filter
    -8.5.9. +hide-referer
    +8.5.9. handle-as-image
    -8.5.10. +hide-user-agent
    +8.5.10. hide-forwarded-for-headers
    -8.5.11. +handle-as-image
    +8.5.11. hide-from-header
    -8.5.12. +set-image-blocker
    +8.5.12. hide-referrer
    -8.5.13. +limit-connect
    +8.5.13. hide-user-agent
    -8.5.14. +prevent-compression
    +8.5.14. kill-popups
    -8.5.15. +session-cookies-only
    +8.5.15. limit-connect
    -8.5.16. +prevent-reading-cookies
    +8.5.16. prevent-compression
    -8.5.17. +prevent-setting-cookies
    +8.5.17. send-vanilla-wafer
    -8.5.18. +kill-popups
    +8.5.18. send-wafer
    -8.5.19. +send-vanilla-wafer
    +8.5.19. session-cookies-only
    -8.5.20. +send-wafer
    +8.5.20. set-image-blocker
    8.5.21. Summary
    8.5.22. Sample Actions Files
    Aliases
    8.7. Actions Files Tutorial
    8.7.1. default.action
    8.7.2. user.action
    -9.1. The +filter Action
    +9.1. Filter File Tutorial
    -11.2. Report bugs
    +11.2. Report Bugs
    -11.3. Request new features
    +11.3. Request New Features
    -11.4. Report ads or other filter problems
    +11.4. Report Ads or Other Actions-Related Problems
    11.5.
    12.1. License
    History
    12.3. Authors
    14.2. Privoxy

    Installation

    -Privoxy installation on your system, you will need to remove it. Some platforms do this for you as part of their installation procedure. (See below for your platform). In any case be sure to backup your old configuration if it is valuable to you. See the note to upgraders section below.
    +be sure to backup your old configuration if it is valuable to you. See the note to upgraders section below.

    -2.1.1. Red Hat and SuSE RPMs
    +2.1.1. Red Hat, SuSE RPMs and Conectiva

    RPMs can be installed with

    Introduction

    Modularized configuration that allows for standard settings and user settings to reside in separate files, so that installing updated
    -actions files won't overwrite idividual user settings.
    +actions files won't overwrite individual user settings.

  • Quickstart to Using Privoxy

  • See Also

    http://www.privoxy.org/,
    -The Privoxy Home page.
    +http://www.privoxy.org/faq/, the Privoxy FAQ.

    -http://sourceforge.net/projects/ijbswa, the Project Page for Sourceforge.
    +http://sourceforge.net/projects/ijbswa/, the Project Page for SourceForge.
    -http://p.p/, access Privoxy from your browser. Alternately, http://config.privoxy.org may work in some situations where the first does not.
    +http://config.privoxy.org/, the web-based user interface. Privoxy must be running for this to work. Shortcut: http://p.p/
    -http://p.p/, and select "Privoxy - Submit Filter Feedback" to submit "misses" to the developers.
    +http://www.privoxy.org/actions/, to submit "misses" to the developers.
    +http://www.junkbusters.com/ht/en/cookies.html, an explanation how cookies are used to track web users.

    +http://www.squid-cache.org/, a very popular caching proxy, which is often used together with Privoxy.
    +http://www.junkbusters.com/ijb.html, the original Internet Junkbuster.
    +http://www.waldherr.org/junkbuster/, Stefan Waldherr's version of Junkbuster, from which Privoxy was derived.
    +http://privacy.net/analyze/, a useful site to check what information about you is leaked while you browse the web.