X-Git-Url: http://www.privoxy.org/gitweb/?p=privoxy.git;a=blobdiff_plain;f=doc%2Fwebserver%2Fuser-manual%2Fconfiguration.html;h=0dc23c0d94e5c09801a6f889e193a277507ace23;hp=abfbabb402b556fcb95348af528619f59fa92b59;hb=75397537c3986cafa8dc1b4b37af40f93ab9372f;hpb=56d03106907472899fa6e8933e81058744ce0fed diff --git a/doc/webserver/user-manual/configuration.html b/doc/webserver/user-manual/configuration.html index abfbabb4..0dc23c0d 100644 --- a/doc/webserver/user-manual/configuration.html +++ b/doc/webserver/user-manual/configuration.html @@ -1,5864 +1,166 @@ -
All Privoxy configuration is stored - in text files. These files can be edited with a text editor. - Many important aspects of Privoxy can - also be controlled easily with a web browser. - -
Privoxy's user interface can be reached through the special - URL http://config.privoxy.org/ - (shortcut: http://p.p/), - which is a built-in page and works without Internet access. - You will see the following section:
Please choose from the following options: - - * Privoxy main page - * Show information about the current configuration - * Show the source code version numbers - * Show the request headers. - * Show which actions apply to a URL and why - * Toggle Privoxy on or off - * Edit the actions list - - |
This should be self-explanatory. Note the last item is an editor for the - "actions list", which is where much of the ad, banner, cookie, - and URL blocking magic is configured as well as other advanced features of - Privoxy. This is an easy way to adjust various - aspects of Privoxy configuration. The actions - file, and other configuration files, are explained in detail below.
"Toggle Privoxy On or Off" is handy for sites that might - have problems with your current actions and filters. You can in fact use - it as a test to see whether it is Privoxy - causing the problem or not. Privoxy continues - to run as a proxy in this case, but all filtering is disabled. There - is even a toggle Bookmarklet offered, so - that you can toggle Privoxy with one click from - your browser.
For Unix, *BSD and Linux, all configuration files are located in - /etc/privoxy/ by default. For MS Windows, OS/2, and - AmigaOS these are all in the same directory as the - Privoxy executable. The names - and number of configuration files have changed from previous versions, and are - subject to change as development progresses.
The installed defaults provide a reasonable starting point, though possibly - aggressive by some standards. For the time being, there are only three - default configuration files (this may change in time):
The main configuration file is named config - on Linux, Unix, BSD, OS/2, and AmigaOS and config.txt - on Windows. -
default.action (the actions file) is used to define - which of a set of various "actions" relating to images, banners, - pop-ups, access restrictions and cookies are to be applied, and where. - There is a web based editor for this file that can be accessed at http://config.privoxy.org/edit-actions/ - (Shortcut: http://p.p/edit-actions/). - (Other actions files are included as well with differing levels of filtering - and blocking, e.g. basic.action.) -
default.filter (the filter file) can be used to re-write the raw - page content, including viewable text as well as embedded HTML and JavaScript, - and whatever else lurks on any given web page. The filtering jobs are only - pre-defined here; whether to apply them or not is up to the actions file. -
All files use the "#" character to denote a - comment (the rest of the line will be ignored) and understand line continuation - through placing a backslash ("\") as the very last character - in a line. If the # is preceded by a backslash, it loses - its special function. Placing a # in front of an otherwise - valid configuration line to prevent it from being interpreted is called "commenting - out" that line.
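The comment and continuation rules above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not Privoxy's actual parser: it joins continued lines first and strips comments afterwards, and the helper names strip_comment and parse_config are invented for this example.

```python
def strip_comment(line):
    """Cut the line at the first '#' that is not preceded by a backslash."""
    out = []
    i = 0
    while i < len(line):
        if line.startswith("\\#", i):
            out.append("#")      # escaped: keep a literal '#'
            i += 2
        elif line[i] == "#":
            break                # unescaped: the rest is a comment
        else:
            out.append(line[i])
            i += 1
    return "".join(out)


def parse_config(text):
    """Return one [keyword, value, ...] list per logical line."""
    entries = []
    pending = ""
    for raw in text.splitlines():
        if raw.endswith("\\"):
            pending += raw[:-1]  # backslash as the very last character
            continue
        fields = strip_comment(pending + raw).split()
        pending = ""
        if fields:
            entries.append(fields)
    # A dangling continuation at end of input is still processed.
    fields = strip_comment(pending).split()
    if fields:
        entries.append(fields)
    return entries


SAMPLE = (
    "confdir /etc/privoxy   # configuration directory\n"
    "#logdir /var/log/privoxy\n"
    "listen-address \\\n"
    "    localhost:8118\n"
    "proxy-info-url http://www.example.com/proxy.html\\#policy\n"
)

for entry in parse_config(SAMPLE):
    print(entry)
```

The second sample line is "commented out" and therefore produces no entry, while the escaped "#" in the last line survives as part of the URL.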
default.action and default.filter - can use Perl style regular expressions for - maximum flexibility.
After making any changes, there is no need to restart - Privoxy in order for the changes to take - effect. Privoxy detects such changes - automatically. Note, however, that it may take one or two additional - requests for the change to take effect. When changing the listening address - of Privoxy, these "wake up" requests - must obviously be sent to the old listening address.
While under development, the configuration content is subject to change. - The below documentation may not be accurate by the time you read this. - Also, what constitutes a "default" setting, may change, so - please check all your configuration files on important issues.
Again, the main configuration file is named config on - Linux/Unix/BSD and OS/2, and config.txt on Windows. - Configuration lines consist of an initial keyword followed by a list of - values, all separated by whitespace (any number of spaces or tabs). For - example:
confdir /etc/privoxy
-
Assigns the value /etc/privoxy to the option - confdir and thus indicates that the configuration - directory is named "/etc/privoxy/".
All options in the config file except for confdir and - logdir are optional. Watch out in the below description - for what happens if you leave them unset.
The main config file controls all aspects of Privoxy's - operation that are not location dependent (i.e. they apply universally, no matter - where you may be surfing).
Privoxy can (and normally does) use a number of - other files for additional configuration and logging. - This section of the configuration file tells Privoxy - where to find those other files.
The directory where the other configuration files are located
Path name
/etc/privoxy (Unix) or Privoxy installation dir (Windows)
Mandatory
No trailing "/", please -
When development goes modular and multi-user, the blocker, filter, and - per-user config will be stored in subdirectories of "confdir". - For now, the configuration directory structure is flat, except for - confdir/templates, where the HTML templates for CGI - output reside (e.g. Privoxy's 404 error page). -
The directory where all logging takes place (i.e. where logfile and - jarfile are located) -
Path name
/var/log/privoxy (Unix) or Privoxy installation dir (Windows)
Mandatory
No trailing "/", please -
The actions file to use -
File name, relative to confdir
default.action (Unix) or default.action.txt (Windows)
No action is taken at all. Simple neutral proxying. -
There is no point in using Privoxy without - an actions file. There are three different actions files included in the - distribution, with varying degrees of aggressiveness: - default.action, intermediate.action and - advanced.action. -
The filter file to use -
File name, relative to confdir
default.filter (Unix) or default.filter.txt (Windows)
No textual content filtering takes place, i.e. all - +filter{name} - actions in the actions file are turned off -
The "default.filter" file contains content modification rules - that use "regular expressions". These rules permit powerful - changes on the content of Web pages, e.g., you could disable your favorite - JavaScript annoyances, re-write the actual displayed text, or just have some - fun replacing "Microsoft" with "MicroSuck" wherever - it appears on a Web page. -
The log file to use -
File name, relative to logdir
logfile (Unix) or privoxy.log (Windows)
No log file is used, all log messages go to the console (stderr). -
The windows version will additionally log to the console. -
The logfile is where all logging and error messages are written. The level - of detail and number of messages are set with the debug - option (see below). The logfile can be useful for tracking down a problem with - Privoxy (e.g., it's not blocking an ad you - think it should block) but in most cases you probably will never look at it. -
Your logfile will grow indefinitely, and you will probably want to - periodically remove it. On Unix systems, you can do this with a cron job - (see "man cron"). For Red Hat, a logrotate - script has been included. -
On SuSE Linux systems, you can place a line like "/var/log/privoxy.* - +1024k 644 nobody.nogroup" in /etc/logfiles, with - the effect that cron.daily will automatically archive, gzip, and empty the - log, when it exceeds 1M size. -
The file to store intercepted cookies in -
File name, relative to logdir
jarfile (Unix) or privoxy.jar (Windows)
Intercepted cookies are not stored at all. -
The jarfile may grow to ridiculous sizes over time. -
The trust file to use -
File name, relative to confdir
Unset (commented out). When activated: trust (Unix) or trust.txt (Windows)
The whole trust mechanism is turned off. -
The trust mechanism is an experimental feature for building white-lists and should - be used with care. It is NOT recommended for the casual user. -
If you specify a trust file, Privoxy will only allow - access to sites that are named in the trustfile. - You can also mark sites as trusted referrers (with +), with - the effect that access to untrusted sites will be granted, if a link from a - trusted referrer was used. - The link target will then be added to the "trustfile". - Possible applications include limiting Internet access for children. -
If you use the + operator in the trust file, it may grow considerably over time. -
If you intend to operate Privoxy for more users - than just yourself, it might be a good idea to let them know how to reach - you, what you block and why you do that, your policies etc. -
A URL to be displayed in the error page that users will see if access to an untrusted page is denied. -
URL
Two example URLs are provided
No links are displayed on the "untrusted" error page. -
The value of this option only matters if the experimental trust mechanism has been - activated. (See trustfile above.) -
If you use the trust mechanism, it is a good idea to write up some on-line - documentation about your trust policy and to specify the URL(s) here. - Use multiple times for multiple URLs. -
The URL(s) should be added to the trustfile as well, so users don't end up - locked out from the information on why they were locked out in the first place! -
An email address to reach the proxy administrator. -
Email address
Unset
No email address is displayed on error pages and the CGI user interface. -
If both admin-address and proxy-info-url - are unset, the whole "Local Privoxy Support" box on all generated pages will - not be shown. -
A URL to documentation about the local Privoxy setup, - configuration or policies. -
URL
Unset
No link to local documentation is displayed on error pages and the CGI user interface. -
If both admin-address and proxy-info-url - are unset, the whole "Local Privoxy Support" box on all generated pages will - not be shown. -
This URL shouldn't be blocked ;-) -
These options are mainly useful when tracing a problem. - Note that you might also want to invoke - Privoxy with the --no-daemon - command line option when debugging. -
Key values that determine what information gets logged. -
Integer values
12289 (i.e.: URLs plus informational and warning messages)
Nothing gets logged. -
The available debug levels are: -
debug 1 # show each GET/POST/CONNECT request -
debug 2 # show each connection status -
debug 4 # show I/O status -
debug 8 # show header parsing -
debug 16 # log all data into the logfile -
debug 32 # debug force feature -
debug 64 # debug regular expression filter -
debug 128 # debug fast redirects -
debug 256 # debug GIF de-animation -
debug 512 # Common Log Format -
debug 1024 # debug kill pop-ups -
debug 4096 # Startup banner and warnings. -
debug 8192 # Non-fatal errors - |
To select multiple debug levels, you can either add them or use - multiple debug lines. -
A debug level of 1 is informative because it will show you each request - as it happens. 1, 4096 and 8192 are highly recommended - so that you will notice when things go wrong. The other levels are probably - only of interest if you are hunting down a specific problem. They can produce - a hell of an output (especially 16). - -
The reporting of fatal errors (i.e. ones which crash - Privoxy) is always on and cannot be disabled. -
If you want to use CLF (Common Log Format), you should set "debug - 512" ONLY and not enable anything else. -
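Because every debug level is a distinct power of two, adding levels is the same as setting bits. The following Python sketch (the function names are made up for illustration) shows how the default of 12289 decomposes into the levels listed above:

```python
# Debug levels as listed in this section.
DEBUG_LEVELS = {
    1: "show each GET/POST/CONNECT request",
    2: "show each connection status",
    4: "show I/O status",
    8: "show header parsing",
    16: "log all data into the logfile",
    32: "debug force feature",
    64: "debug regular expression filter",
    128: "debug fast redirects",
    256: "debug GIF de-animation",
    512: "Common Log Format",
    1024: "debug kill pop-ups",
    4096: "Startup banner and warnings",
    8192: "Non-fatal errors",
}


def combine(*levels):
    """Combine several levels into the single value for one debug line."""
    value = 0
    for level in levels:
        value |= level
    return value


def explain(value):
    """List the known levels contained in a combined debug value."""
    return [level for level in sorted(DEBUG_LEVELS) if value & level]


print(combine(1, 4096, 8192))
for level in explain(12289):
    print(level, DEBUG_LEVELS[level])
```

Writing "debug 12289" on one line is therefore equivalent to the three lines "debug 1", "debug 4096" and "debug 8192".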
Whether to run only one server thread -
None
Unset
Multi-threaded (or, where unavailable: forked) operation, i.e. the ability to - serve multiple requests simultaneously. -
This option is only there for debug purposes and you should never - need to use it. It will drastically reduce performance. -
This section of the config file controls the security-relevant aspects - of Privoxy's configuration. -
The IP address and TCP port on which Privoxy will - listen for client requests. -
[IP-Address]:Port
localhost:8118
Bind to localhost (127.0.0.1), port 8118. This is suitable and recommended for - home users who run Privoxy on the same machine as - their browser. -
You will need to configure your browser(s) to this proxy address and port. -
If you already have another service running on port 8118, or if you want to - serve requests from other machines (e.g. on your local network) as well, you - will need to override the default. -
If you leave out the IP address, Privoxy will - bind to all interfaces (addresses) on your machine and may become reachable - from the Internet. In that case, consider using access control lists (acl's) - (see "ACLs" below), or a firewall. -
Suppose you are running Privoxy on - a machine which has the address 192.168.0.1 on your local private network - (192.168.0.0) and has another outside connection with a different address. - You want it to serve requests from inside only: -
listen-address 192.168.0.1:8118 - |
Initial state of "toggle" status -
1 or 0
1
Act as if toggled on -
If set to 0, Privoxy will start in - "toggled off" mode, i.e. behave like a normal, content-neutral - proxy. See enable-remote-toggle - below. This is not really useful anymore, since toggling is much easier - via the web - interface than via editing the conf file. -
The windows version will only display the toggle icon in the system tray - if this option is present. -
Whether or not the web-based toggle - feature may be used -
0 or 1
1
The web-based toggle feature is disabled. -
When toggled off, Privoxy acts like a normal, - content-neutral proxy, i.e. it acts as if none of the actions applied to - any URL. -
For the time being, access to the toggle feature can not be - controlled separately by "ACLs" or HTTP authentication, - so that everybody who can access Privoxy (see - "ACLs" and listen-address above) can - toggle it for all users. So this option is not recommended - for multi-user environments with untrusted users. -
Note that you must have compiled Privoxy with - support for this feature, otherwise this option has no effect. -
Whether or not the web-based actions - file editor may be used -
0 or 1
1
The web-based actions file editor is disabled. -
For the time being, access to the editor can not be - controlled separately by "ACLs" or HTTP authentication, - so that everybody who can access Privoxy (see - "ACLs" and listen-address above) can - modify its configuration for all users. So this option is not - recommended for multi-user environments with untrusted users. -
Note that you must have compiled Privoxy with - support for this feature, otherwise this option has no effect. -
Who can access what. -
src_addr[/src_masklen] - [dst_addr[/dst_masklen]] -
Where src_addr and - dst_addr are IP addresses in dotted decimal notation or valid - DNS names, and src_masklen and - dst_masklen are subnet masks in CIDR notation, i.e. integer - values from 2 to 30 representing the length (in bits) of the network address. The masks and the whole - destination part are optional. -
Unset
Don't restrict access further than implied by listen-address -
Access controls are included at the request of ISPs and systems - administrators, and are not usually needed by individual users. - For a typical home user, it will normally suffice to ensure that - Privoxy only listens on the localhost or internal (home) - network address by means of the listen-address option. -
Please see the warnings in the FAQ that this proxy is not intended to be a substitute - for a firewall or to encourage anyone to defer addressing basic security - weaknesses. -
Multiple ACL lines are OK. - If any ACLs are specified, then Privoxy - talks only to IP addresses that match at least one permit-access line - and don't match any subsequent deny-access line. In other words, the - last match wins, with the default being deny-access. -
If Privoxy is using a forwarder (see forward below) - for a particular destination URL, the dst_addr - that is examined is the address of the forwarder and NOT the address - of the ultimate target. This is necessary because it may be impossible for the local - Privoxy to determine the IP address of the - ultimate target (that's often what gateways are used for). -
You should prefer using IP addresses over DNS names, because the address lookups take - time. All DNS names must resolve! You can not use domain patterns - like "*.org" or partial domain names. If a DNS name resolves to multiple - IP addresses, only the first one is used. -
Denying access to particular sites by ACL may have undesired side effects - if the site in question is hosted on a machine which also hosts other sites. -
Explicitly define the default behavior if no ACL and - listen-address are set: "localhost" - is OK. The absence of a dst_addr implies that - all destination addresses are OK: -
permit-access localhost - |
Allow any host on the same class C subnet as www.privoxy.org access to - nothing but www.example.com: -
permit-access www.privoxy.org/24 www.example.com/32 - |
Allow access from any host on the 26-bit subnet 192.168.45.64 to anywhere, - with the exception that 192.168.45.73 may not access www.dirty-stuff.example.com: -
permit-access 192.168.45.64/26 -
deny-access 192.168.45.73 www.dirty-stuff.example.com - |
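The "last match wins, default deny" rule can be sketched with Python's standard ipaddress module. This is a simplified model, not Privoxy's code: it accepts only IP addresses (no DNS names), the function names are invented for this example, and 203.0.113.10 is a made-up address standing in for www.dirty-stuff.example.com.

```python
import ipaddress


def parse_rule(line):
    """Parse 'permit-access SRC[/LEN] [DST[/LEN]]' or the deny-access form."""
    keyword, *args = line.split()
    if keyword not in ("permit-access", "deny-access") or not 1 <= len(args) <= 2:
        raise ValueError("not an ACL line: " + line)
    src = ipaddress.ip_network(args[0], strict=False)
    dst = ipaddress.ip_network(args[1], strict=False) if len(args) == 2 else None
    return keyword == "permit-access", src, dst


def allowed(rules, client, destination):
    """Apply all rules in order; the last matching rule decides."""
    if not rules:
        return True      # no ACLs: nothing beyond listen-address restricts access
    client = ipaddress.ip_address(client)
    destination = ipaddress.ip_address(destination)
    verdict = False      # default once any ACL is present: deny
    for permit, src, dst in rules:
        if client in src and (dst is None or destination in dst):
            verdict = permit
    return verdict


rules = [
    parse_rule("permit-access 192.168.45.64/26"),
    parse_rule("deny-access 192.168.45.73 203.0.113.10"),
]
print(allowed(rules, "192.168.45.70", "203.0.113.10"))
print(allowed(rules, "192.168.45.73", "203.0.113.10"))
```

A missing destination is treated as "any destination", matching the description of dst_addr above.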
Maximum size of the buffer for content filtering. -
Size in Kbytes
4096
Use a 4MB (4096 KB) limit. -
For content filtering, i.e. the +filter and - +deanimate-gif actions, it is necessary that - Privoxy buffers the entire document body. - This can be potentially dangerous, since a server could just keep sending - data indefinitely and wait for your RAM to exhaust -- with nasty consequences. - Hence this option. -
When a document buffer size reaches the buffer-limit, it is - flushed to the client unfiltered and no further attempt to - filter the rest of the document is made. Remember that there may be multiple threads - running, which might require up to buffer-limit Kbytes - each, unless you have enabled "single-threaded" - above. -
This feature allows routing of HTTP requests through a chain of - multiple proxies. - It can be used to better protect privacy and confidentiality when - accessing specific domains by routing requests to those domains - through an anonymous public proxy (see e.g. http://www.multiproxy.org/anon_list.htm), - or to use a caching proxy to speed up browsing. Chaining to a parent - proxy may also be necessary because the machine that Privoxy - runs on has no direct Internet access.
Also specified here are SOCKS proxies. Privoxy - supports the SOCKS 4 and SOCKS 4A protocols.
To which parent HTTP proxy specific requests should be routed. -
target_domain[:port] - http_parent[:port] -
Where target_domain is a domain name pattern (see the - chapter on domain matching in the actions file), - http_parent is the address of the parent HTTP proxy - as an IP address in dotted decimal notation or as a valid DNS name (or "." to denote - "no forwarding"), and the optional - port parameters are TCP ports, i.e. integer - values from 1 to 65535 -
Unset
Don't use parent HTTP proxies. -
If http_parent is ".", then requests are not - forwarded to another HTTP proxy but are made directly to the web servers. -
Multiple lines are OK, they are checked in sequence, and the last match wins. -
Everything goes to an example anonymizing proxy, except SSL on port 443 (which it doesn't handle): -
forward .* anon-proxy.example.org:8080 -
forward :443 . - |
Everything goes to our example ISP's caching proxy, except for requests - to that ISP's sites: -
forward .*. caching-proxy.example-isp.net:8000 -
forward .example-isp.net . - |
Through which SOCKS proxy (and to which parent HTTP proxy) specific requests should be routed. -
target_domain[:port] - socks_proxy[:port] - http_parent[:port] -
Where target_domain is a domain name pattern (see the - chapter on domain matching in the actions file), - http_parent and socks_proxy - are IP addresses in dotted decimal notation or valid DNS names (http_parent - may be "." to denote "no HTTP forwarding"), and the optional - port parameters are TCP ports, i.e. integer values from 1 to 65535 -
Unset
Don't use SOCKS proxies. -
Multiple lines are OK, they are checked in sequence, and the last match wins. -
The difference between forward-socks4 and forward-socks4a - is that in the SOCKS 4A protocol, the DNS resolution of the target hostname happens on the SOCKS - server, while in SOCKS 4 it happens locally. -
If http_parent is ".", then requests are not - forwarded to another HTTP proxy but are made (HTTP-wise) directly to the web servers, albeit through - a SOCKS proxy. -
From the company example.com, direct connections are made to all - "internal" domains, but everything outbound goes through - their ISP's proxy by way of example.com's corporate SOCKS 4A gateway to - the Internet. -
forward-socks4a .*. socks-gw.example.com:1080 www-cache.example-isp.net:8080 -
forward .example.com . - |
A rule that uses a SOCKS 4 gateway for all destinations but no HTTP parent looks like this: -
forward-socks4 .*. socks-gw.example.com:1080 . - |
If you have links to multiple ISPs that provide various special content - only to their subscribers, you can configure multiple Privoxies - which have connections to the respective ISPs to act as forwarders to each other, so that - your users can see the internal content of all ISPs.
Assume that host-a has a PPP connection to isp-a.net. And host-b has a PPP connection to - isp-b.net. Both run Privoxy. Their forwarding - configuration can look like this:
host-a:
forward .*. . -
forward .isp-b.net host-b:8118 - |
host-b:
forward .*. . -
forward .isp-a.net host-a:8118 - |
Now, your users can set their browser's proxy to use either - host-a or host-b and be able to browse the internal content - of both isp-a and isp-b.
If you intend to chain Privoxy and - squid locally, then chaining as - browser -> squid -> privoxy is the recommended way.
Assuming that Privoxy and squid - run on the same box, your squid configuration could then look like this:
# Define Privoxy as parent proxy (without ICP) -
cache_peer 127.0.0.1 parent 8118 7 no-query -
-
# Define ACL for protocol FTP -
acl ftp proto FTP -
-
# Do not forward FTP requests to Privoxy -
always_direct allow ftp -
-
# Forward all the rest to Privoxy -
never_direct allow all - |
You would then need to change your browser's proxy settings to squid's address and port. - Squid normally uses port 3128. If unsure consult http_port in squid.conf.
Privoxy has a number of options specific to the - Windows GUI interface:
If "activity-animation" is set to 1, the - Privoxy icon will animate when - "Privoxy" is active. To turn off, set to 0.
activity-animation 1
-
If "log-messages" is set to 1, - Privoxy will log messages to the console - window:
log-messages 1
-
- If "log-buffer-size" is set to 1, the size of the log buffer, - i.e. the amount of memory used for the log messages displayed in the - console window, will be limited to "log-max-lines" (see below).
Warning: Setting this to 0 will cause the buffer to grow without limit and - eat up all your memory!
log-buffer-size 1
-
log-max-lines is the maximum number of lines held - in the log buffer. See above.
log-max-lines 200
-
If "log-highlight-messages" is set to 1, - Privoxy will highlight portions of the log - messages with a bold-faced font:
log-highlight-messages 1
-
The font used in the console window:
log-font-name Comic Sans MS
-
Font size used in the console window:
log-font-size 8
-
- "show-on-task-bar" controls whether or not - Privoxy will appear as a button on the Task bar - when minimized:
show-on-task-bar 0
-
If "close-button-minimizes" is set to 1, the Windows close - button will minimize Privoxy instead of closing - the program (close with the exit option on the File menu).
close-button-minimizes 1
-
The "hide-console" option is specific to the MS-Win console - version of Privoxy. If this option is used, - Privoxy will disconnect from and hide the - command console.
#hide-console
-
The actions file (default.action, formerly: - actionsfile or ijb.action) is used - to define what actions Privoxy takes for which - URLs, and thus determines how ad images, cookies and various other aspects - of HTTP content and transactions are handled on which sites (or even parts - thereof).
- Anything you want can be blocked, including ads, banners, or just some obnoxious - URL that you would rather not see. Cookies can be accepted or rejected, or - accepted only during the current browser session (i.e. not written to disk), - content can be modified, JavaScripts tamed, user-tracking fooled, and much more. - See below for a complete list of available actions.
An actions file typically has sections. At the top, "aliases" are - defined (discussed below), then the default set of rules which will apply - universally to all sites and pages. And then below that is generally a lengthy - set of exceptions to the defined universal policies.
Note that some actions, like cookie suppression or script disabling, may - render sites that rely on these techniques unusable. - Finding the right mix of actions is not easy and certainly a matter of personal - taste. In general, it can be said that the more "aggressive" - your default settings (in the top section of the actions file) are, - the more exceptions for "trusted" sites you will have to - make later. If, for example, you want to kill popup windows by default, you'll - have to make exceptions from that rule for sites that you regularly use - and that require popups for actually useful content, like maybe your bank, - favorite shop, or newspaper.
We have tried to provide you with reasonable rules to start from in the - distribution actions file. But there is no general rule of thumb on these - things. There just are too many variables, and sites are constantly changing. - Sooner or later you will want to change the rules (and read this chapter).
The easiest way to edit the "actions" file is with a browser by - using our browser-based editor, which is available at http://config.privoxy.org/edit-actions.
If you prefer plain text editing to GUIs, you can of course also directly edit the - default.action file.
The actions file is divided into sections. There are special sections, - like the "alias" sections which will be discussed later. For now - let's concentrate on regular sections: They have a heading line (often split - over multiple lines for readability) which consists of a list of actions, - separated by whitespace and enclosed in curly braces. Below that, there - is a list of URL patterns, each on a separate line.
To determine which actions apply to a request, the URL of the request is - compared to all patterns in this file. Every time it matches, the list of - applicable actions for the URL is incrementally updated, using the heading - of the section in which the pattern is located. If multiple matches for - the same URL set the same action differently, the last match wins.
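The incremental update described above can be modelled roughly as follows. This Python sketch handles only Boolean actions, and its matches function is a deliberately crude stand-in for real pattern matching (an assumption of this example); the section contents are invented for illustration.

```python
def parse_heading(heading):
    """'{+block -image}' -> {'block': True, 'image': False} (Boolean actions only)."""
    actions = {}
    for token in heading.strip("{} ").split():
        actions[token[1:]] = token[0] == "+"
    return actions


def matches(pattern, host):
    """Stand-in matcher: '/' matches everything, '.suffix' matches by suffix."""
    if pattern == "/":
        return True
    if pattern.startswith("."):
        return host.endswith(pattern)
    return host == pattern


def actions_for(sections, host):
    """Walk all sections top to bottom; later matches override earlier ones."""
    current = {}
    for heading, patterns in sections:
        if any(matches(pattern, host) for pattern in patterns):
            current.update(parse_heading(heading))
    return current


sections = [
    ("{-block -image}", ["/"]),                    # defaults for all URLs
    ("{+block}", [".example.com", ".ads.r.us"]),   # block these domains
    ("{-block}", ["www.example.com"]),             # exception, placed later
]

print(actions_for(sections, "ads.example.com"))
print(actions_for(sections, "www.example.com"))
```

Because the exception for www.example.com comes after the blocking section, it wins for that host, which is why exceptions belong in the later part of the file.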
You can trace this process by visiting http://config.privoxy.org/show-url-info.
More detail on this is provided in the Appendix, Anatomy of an Action.
Generally, a pattern has the form <domain>/<path>, - where both the <domain> and <path> - are optional. (This is why the pattern / matches all URLs).
www.example.com is a domain-only pattern and will match any request to www.example.com, - regardless of which document on that server is requested. -
www.example.com/ means exactly the same. For domain-only patterns, the trailing / may - be omitted. -
www.example.com/index.html matches only the single document /index.html - on www.example.com. -
/index.html matches the document /index.html, regardless of the domain, - i.e. on any web server. -
index.html matches nothing, since it would be interpreted as a domain name and - there is no top-level domain called .html. -
The matching of the domain part offers some flexible options: if the - domain starts or ends with a dot, it becomes unanchored at that end. - For example:
.example.com matches any domain that ENDS in - .example.com -
www. matches any domain that STARTS with - www. -
.example. matches any domain that CONTAINS .example. - (Correctly speaking: It matches any FQDN that contains example as a domain.) -
Additionally, there are wild-cards that you can use in the domain names - themselves. They work pretty similar to shell wild-cards: "*" - stands for zero or more arbitrary characters, "?" stands for - any single character, you can define character classes in square - brackets and all of that can be freely mixed:
matches "adserver.example.com", - "ads.example.com", etc but not "sfads.example.com" -
matches all of the above, and then some. -
matches www.ipix.com, - pictures.epix.com, a.b.c.d.e.upix.com etc. -
matches www1.example.com, - www4.example.cc, wwwd.example.cy, - wwwz.example.com etc., but not - wwww.example.com. -
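The anchoring and wild-card rules for the domain part can be approximated with Python's fnmatch module, whose "*", "?" and "[...]" behave like shell wild-cards. This is only a rough sketch and not how Privoxy matches internally: a leading or trailing dot is simply turned into an extra "*", the function name is invented, and the pattern "ad*.example.com" is an assumed example chosen to fit the hosts named above.

```python
from fnmatch import fnmatchcase


def domain_matches(pattern, host):
    """Approximate domain matching: a dot at either end removes that anchor."""
    glob = pattern
    if pattern.startswith("."):
        glob = "*" + glob        # unanchored at the start
    if pattern.endswith("."):
        glob = glob + "*"        # unanchored at the end
    return fnmatchcase(host, glob)


checks = [
    (".example.com", "www.example.com"),
    ("www.", "www.privoxy.org"),
    (".example.", "www.example.org"),
    ("ad*.example.com", "adserver.example.com"),
    ("ad*.example.com", "sfads.example.com"),
]
for pattern, host in checks:
    print(pattern, host, domain_matches(pattern, host))
```

Note that this approximation lets "*" run across dots, so it should be read as an illustration of the anchoring idea rather than as a reference for edge cases.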
Privoxy uses Perl compatible regular expressions - (through the PCRE library) for - matching the path.
There is an Appendix with a brief quick-start into regular - expressions, and full (very technical) documentation on PCRE regex syntax is available on-line - at http://www.pcre.org/man.txt. - You might also find the Perl man page on regular expressions (man perlre) - useful, which is available on-line at http://www.perldoc.com/perl5.6/pod/perlre.html.
Note that the path pattern is automatically left-anchored at the "/", - i.e. it matches as if it started with a "^".
Please also note that matching in the path is case - INSENSITIVE by default, but you can switch to case - sensitive at any point in the pattern by using the - "(?-i)" switch: - www.example.com/(?-i)PaTtErN.* will match only - documents whose path starts with PaTtErN in - exactly this capitalization.
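The left-anchoring and the case-sensitivity switch can be tried out with Python's re module. Be aware that this is an analogy, not PCRE itself: Python does not accept a bare "(?-i)" in the middle of a pattern, so the sketch uses the scoped form "(?-i:...)" instead, and path_matches is a name made up for this example.

```python
import re


def path_matches(path_pattern, path):
    """Case-insensitive match that is anchored at the start of the path."""
    # re.match only matches at the beginning, like an implicit "^".
    return re.match(path_pattern, path, re.IGNORECASE) is not None


# Case-insensitive by default.
print(path_matches(r"/index\.html", "/INDEX.HTML"))

# Left-anchored: the pattern must match from the leading "/".
print(path_matches(r"/index\.html", "/docs/index.html"))

# Case-sensitive part, written with Python's scoped flag syntax.
print(path_matches(r"/(?-i:PaTtErN).*", "/PaTtErN/page.html"))
print(path_matches(r"/(?-i:PaTtErN).*", "/pattern/page.html"))
```

Only the third pattern distinguishes capitalization; the first two behave the same for any mix of upper and lower case.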
Actions are enabled if preceded with a "+", and disabled if - preceded with a "-". So a "+action" means - "do that action", e.g. "+block" means please - "block the following URLs and/or patterns". All actions are - disabled by default, until they are explicitly enabled somewhere in an actions - file.
- Actions are invoked by enclosing the action name in curly braces (e.g. - {+some_action}), followed by a list of URLs (or patterns that match URLs) to - which the action applies. There are three classes of actions:
- Boolean, i.e the action can only be "on" or - "off". Examples: -
{+name} # enable this action
- {-name} # disable this action
-
- Parameterized, e.g. "+/-hide-user-agent{ Mozilla 1.0 }", - where some value is required in order to enable this type of action. - Examples: -
{+name{param}} # enable action and set parameter to "param"
- {-name} # disable action (the parameter can be omitted)
-
- - Multi-value, e.g. "{+/-add-header{Name: value}}" or - "{+/-wafer{name=value}}", where some value needs to be defined - in addition to simply enabling the action. Examples: -
{+name{param=value}} # enable action and set "param" to "value"
- {-name{param=value}} # remove the parameter "param" completely
- {-name} # disable this action totally and remove param too
-
If nothing is specified in this file, no "actions" are taken. - So in this case Privoxy would just be a - normal, non-blocking, non-anonymizing proxy. You must specifically - enable the privacy and blocking features you need (although the - provided default default.action file will - give a good starting point).
Later defined actions always over-ride earlier ones. So exceptions - to any rules you make, should come in the latter part of the file. For - multi-valued actions, the actions are applied in the order they are - specified.
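The "later overrides earlier" rule can be sketched as a loop over sections in file order. This is a simplified model for Boolean actions only, not Privoxy's actual code; `effective_actions`, the toy host matcher, and the host name `login.example.com` are assumptions made for illustration.

```python
def effective_actions(sections, url_matches):
    """Return the final on/off state of each Boolean action.

    sections: list of (actions, patterns) tuples in file order,
              e.g. (["+block"], [".example.com"]).
    url_matches: callable deciding whether a pattern matches the URL.
    """
    state = {}  # every action is off until explicitly enabled
    for actions, patterns in sections:
        if any(url_matches(p) for p in patterns):
            for action in actions:
                sign, name = action[0], action[1:]
                state[name] = (sign == "+")  # later sections overwrite earlier ones
    return state

host = "login.example.com"  # hypothetical request host
matcher = lambda pattern: host.endswith(pattern.lstrip("."))  # toy matcher, assumption

sections = [
    (["+block", "+fast-redirects"], [".example.com"]),
    (["-fast-redirects"], ["login.example.com"]),  # exception placed later in the file
]
print(effective_actions(sections, matcher))
```

Because the exception comes last, `fast-redirects` ends up disabled for this host while `block` stays enabled.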
The valid Privoxy "actions" are:
Multi-value.
Send a user defined HTTP header to the web server. -
Any value is possible. Validity of the defined HTTP headers is not checked. -
{+add-header{X-User-Tracking: sucks}}
- .example.com
-
This action may be specified multiple times, in order to define multiple - headers. This is rarely needed for the typical user. If you don't know what - "HTTP headers" are, you definitely don't need to worry about this - one. -
Boolean.
Used to block a URL from reaching your browser. The URL may be - anything, but is typically used to block ads or other obnoxious - content. -
N/A
{+block}
- .example.com
- .ads.r.us
-
Privoxy will display its - special "BLOCKED" page if a URL matches one of the - blocked patterns. If there is sufficient space, a large red - banner will appear with a friendly message about why the page - was blocked, and a way to go there anyway. If there is insufficient - space a smaller blocked page will appear without the red banner. - One exception is if the URL matches both "+block" - and "+image", then it can be handled by - "+image-blocker" (see below). -
The "+filter" action can also perform some of the - same functionality as "+block", but by virtue of very - different programming techniques, and is typically used for different - reasons. -
Parameterized.
To stop those annoying, distracting animated GIF images. -
"last" or "first" -
{+deanimate-gifs{last}}
- .example.com
-
De-animate all animated GIF images, i.e. reduce them to their last frame. - This will also shrink the images considerably (in bytes, not pixels!). If - the option "first" is given, the first frame of the animation - is used as the replacement. If "last" is given, the last - frame of the animation is used instead, which probably makes more sense for - most banner animations, but also has the risk of not showing the entire - last frame (if it is only a delta to an earlier frame). -
Boolean.
"+downgrade" will downgrade HTTP/1.1 client requests to - HTTP/1.0 and downgrade the responses as well. -
N/A -
{+downgrade}
- .example.com
-
Use this action for servers that use HTTP/1.1 protocol features that - Privoxy doesn't handle well yet. HTTP/1.1 is - only partially implemented. Default is not to downgrade requests. This is - an infrequently needed action, and is used to help with problem sites only. -
Boolean.
The "+fast-redirects" action enables interception of - "redirect" requests from one server to another, which - are used to track users. Privoxy can cut off - all but the last valid URL in a redirect request and send a local redirect - back to your browser without contacting the intermediate site(s). -
N/A -
{+fast-redirects}
- .example.com
-
- Many sites, like yahoo.com, don't just link to other sites. Instead, they - will link to some script on their own server, giving the destination as a - parameter, which will then redirect you to the final target. URLs - resulting from this scheme typically look like: - http://some.place/some_script?http://some.where-else. -
Sometimes, there are even multiple consecutive redirects encoded in the - URL. These redirections via scripts make your web browsing more traceable, - since the server from which you follow such a link can see where you go - to. Apart from that, valuable bandwidth and time are wasted while your - browser asks the server for one redirect after the other. Plus, it feeds - the advertisers. -
This is a normally on feature, and often requires exceptions for sites that - are sensitive to defeating this mechanism. -
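The idea of cutting off all but the last URL in a redirect request can be sketched as follows. This is a minimal illustration of the idea, not Privoxy's implementation; `last_embedded_url` is a hypothetical name, and the sketch ignores URL-encoded destinations.

```python
def last_embedded_url(url):
    """Return the last URL embedded in `url`, or None if there is none."""
    # Find the right-most scheme marker; position 0 is the request URL itself.
    idx = max(url.rfind("http://"), url.rfind("https://"))
    return url[idx:] if idx > 0 else None

print(last_embedded_url("http://some.place/some_script?http://some.where-else"))
print(last_embedded_url("http://some.place/plain_page"))
```

For the example URL from the text, the sketch yields the final target, so a local redirect could be sent without contacting the intermediate script.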
Parameterized.
Apply page filtering as defined by named sections of the - default.filter file to the specified site(s). - "Filtering" can be any modification of the raw - page content, including re-writing or deletion of content. -
"+filter" must include the name of one of the section identifiers - from default.filter (or whatever - filterfile is specified in config). -
+filter{html-annoyances}: Get rid of particularly annoying HTML abuse. - |
+filter{js-annoyances}: Get rid of particularly annoying JavaScript abuse - |
+filter{content-cookies}: Kill cookies that come in the HTML or JS content - |
+filter{popups}: Kill all popups in JS and HTML - |
+filter{frameset-borders}: Give frames a border and make them resizable - |
+filter{webbugs}: Squish WebBugs (1x1 invisible GIFs used for user tracking) - |
+filter{refresh-tags}: Kill automatic refresh tags (for dial-on-demand setups) - |
+filter{fun}: Text replacements for subversive browsing fun! - |
+filter{nimda}: Remove Nimda (virus) code. - |
+filter{banners-by-size}: Kill banners by size (very efficient!) - |
+filter{shockwave-flash}: Kill embedded Shockwave Flash objects - |
+filter{crude-parental}: Kill all web pages that contain the words "sex" or "warez" - |
This is potentially a very powerful feature, and it requires a knowledge - of regular expressions if you want to "roll your own". - Filtering operates on a line by line basis. -
Filtering requires buffering the page content, which may appear to - slow down page rendering since nothing is displayed until all content has - passed the filters. (It does not really take longer, but seems that way - since the page is not incrementally displayed.) This effect will be more - noticeable on slower connections. -
Filtering can achieve some of the same effects as the "+block" - action, i.e. it can be used to block ads and banners. In the overall - scheme of things, filtering is one of the last things "Privoxy" - does with a web page. So other actions are applied first. -
Boolean.
Block any existing X-Forwarded-for HTTP header, and do not add a new one. -
N/A -
{+hide-forwarded}
- .example.com
-
It is fairly safe to leave this on. It does not seem to break many sites. -
Parameterized.
To block the browser from sending your email address in a "From:" - header. -
Keyword: "block", or any user defined value. -
{+hide-from{block}}
- .example.com
-
The keyword "block" will completely remove the header. - Alternately, you can specify any value you prefer to send to the web - server. -
Parameterized.
Don't send the "Referer:" (sic) HTTP header to the web site. - Or, alternately send a forged header instead. -
Prevent the header from being sent with the keyword, "block". - Or, "forge" a URL to one from the same server as the request. - Or, set to user defined value of your choice. -
{+hide-referer{forge}}
- .example.com
-
"forge" is the preferred option here, since some servers will - not send images back otherwise. -
- "+hide-referrer" is an alternate spelling of - "+hide-referer". It has the exact same parameters, and can be freely - mixed with, "+hide-referer". ("referrer" is the - correct English spelling, however the HTTP specification has a bug - it - requires it to be spelled as "referer".) -
Parameterized.
To change the "User-Agent:" header so web servers can't tell - your browser type. Whose business is it anyway? -
Any user defined string. -
{+hide-user-agent{Netscape 6.1 (X11; I; Linux 2.4.18 i686)}}
- .msn.com
-
Warning! This breaks many web sites that depend on this in order - to determine how the target browser will respond to various - requests. Use with caution. -
Boolean.
To define what Privoxy should treat - automatically as an image. -
N/A -
{+image}
- /.*\.(gif|jpg|jpeg|png|bmp|ico)
-
This only has meaning if the URL (or pattern) also is - "+block"ed, in which case a "blocked" image can - be sent rather than a HTML page. (See "+image-blocker{}" below - for the control over what is actually sent.) -
There is little reason to change the default definition for this. -
Parameterized.
Decide what to do with URLs that end up tagged with both "{+block}" - and "{+image}", e.g. an advertisement. -
There are four available options: "-image-blocker" will send an HTML - "blocked" page, usually resulting in a "broken - image" icon. "+image-blocker{blank}" will send a 1x1 - transparent GIF image. "+image-blocker{pattern}" will send a - checkerboard type pattern (the default). And finally, - "+image-blocker{http://xyz.com}" will send an HTTP temporary - redirect to the specified image. This has the advantage of the icon - being cached by the browser, which will speed up the display. -
{+image-blocker{blank}}
- .example.com
-
If you want invisible ads, they need to be both - defined as images and blocked. - And then, "image-blocker" should be set to - "blank" for invisibility. Note you cannot treat HTML pages as - images in most cases. For instance, frames require an HTML page to display. - So a frame that is an ad, cannot be treated as an image. Forcing an - "image" in this situation just will not work. -
Parameterized.
By default, Privoxy only allows HTTP CONNECT - requests to port 443 (the standard, secure HTTPS port). Use - "+limit-connect" to disable this altogether, or to allow - more ports. -
Any valid port number, or port number range. -
+limit-connect{443} # This is the default and need not be specified.
- +limit-connect{80,443} # Ports 80 and 443 are OK.
- +limit-connect{-3, 7, 20-100, 500-} # Port less than 3, 7, 20 to 100 and above 500 are OK.
-
The CONNECT method exists in HTTP to allow access to secure websites - (https:// URLs) through proxies. It works very simply: the proxy connects - to the server on the specified port, and then short-circuits its - connections to the client and to the remote server. - This can be a big security hole, since CONNECT-enabled proxies can be - abused as TCP relays very easily. -
- If you want to allow CONNECT for more ports than this, or want to forbid - CONNECT altogether, you can specify a comma separated list of ports and - port ranges (the latter using dashes, with the minimum defaulting to 0 and - max to 65K). -
If you don't know what any of this means, there probably is no reason to - change this one. -
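The port list syntax described above (comma separated ports and dash ranges with open ends) could be parsed roughly as follows. This is a sketch, not Privoxy's parser: the function names are hypothetical, range bounds are assumed to be inclusive, and the "65K" maximum is assumed to mean 65535.

```python
def parse_port_list(spec, lo=0, hi=65535):
    """Turn a spec like "-3, 7, 20-100, 500-" into (min, max) tuples."""
    ranges = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        if "-" in item:
            start, _, end = item.partition("-")
            # A missing bound falls back to the default minimum or maximum.
            ranges.append((int(start) if start.strip() else lo,
                           int(end) if end.strip() else hi))
        else:
            ranges.append((int(item), int(item)))
    return ranges

def port_allowed(port, spec):
    """True if `port` falls into any range of the spec (inclusive bounds assumed)."""
    return any(a <= port <= b for a, b in parse_port_list(spec))

print(parse_port_list("-3, 7, 20-100, 500-"))
print(port_allowed(443, "80,443"))
```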
Boolean.
Prevent the specified websites from compressing HTTP data. -
N/A -
{+no-compression}
- .example.com
-
Some websites compress their HTTP data, which can be a problem for - Privoxy, since "+filter", - "+no-popup" and "+deanimate-gifs" will not work - on compressed data. Preventing compression will slow down connections to those websites, - though. Default typically is to turn "no-compression" on. -
Boolean.
Allow cookies for the current browser session only. -
N/A -
{+no-cookies-keep}
- .example.com
-
If websites set cookies, "no-cookies-keep" will make sure - they are erased when you exit and restart your web browser. This makes - profiling cookies useless, but won't break sites which require cookies so - that you can log in for transactions. This is generally turned on for all - sites. Sometimes referred to as "session cookies". -
Boolean.
Explicitly prevent the web server from reading any cookies on your - system. -
N/A -
{+no-cookies-read}
- .example.com
-
Often used in conjunction with "+no-cookies-set" to - disable persistent cookies completely. -
Boolean.
Explicitly block the web server from sending cookies to your - system. -
N/A -
{+no-cookies-set}
- .example.com
-
Often used in conjunction with "+no-cookies-read" to - disable persistent cookies completely. -
Boolean.
Stop those annoying JavaScript pop-up windows! -
N/A -
{+no-popup}
- .example.com
-
"+no-popup" uses a built in filter to disable pop-ups - that use the window.open() function, etc. -
An alternate spelling is "+no-popups", which is - interchangeable. -
Boolean.
Sends a cookie for every site stating that you do not accept any copyright - on cookies sent to you, and asking them not to track you. -
N/A -
{+vanilla-wafer}
- .example.com
-
This action only applies if you are using a jarfile - for saving cookies. Of course, this is a (relatively) unique header and - could be used to track you. -
Multi-value.
This allows you to send an arbitrary, user definable cookie. -
User specified cookie name and corresponding value. -
{+wafer{name=value}}
- .example.com
-
This can be specified multiple times in order to add as many cookies as you - like. -
Note that the meaning of any of the above examples is reversed by preceding - the action with a "-", in place of the "+". Also note - that some actions are turned on in the default section of the actions file, - and require little to no additional configuration. These are just "on". - Some actions that are turned on in the default section do typically require - exceptions to be listed in the lower sections of the actions file.
Some examples:
Turn off cookies by default, then allow a few through for specified sites:
# Turn off all persistent cookies
- { +no-cookies-read }
- { +no-cookies-set }
-
- # Allow cookies for this browser session ONLY
- { +no-cookies-keep }
-
- # Exceptions to the above, sites that benefit from persistent cookies
- # that are saved from one browser session to the next.
- { -no-cookies-read }
- { -no-cookies-set }
- { -no-cookies-keep }
- .javasoft.com
- .sun.com
- .yahoo.com
- .msdn.microsoft.com
- .redhat.com
-
- # Alternative way of saying the same thing
- {-no-cookies-set -no-cookies-read -no-cookies-keep}
- .sourceforge.net
- .sf.net
-
Now turn on "fast redirects", and then allow two exceptions:
# Turn them on!
- {+fast-redirects}
-
- # Reverse it for these two sites, which don't work right without it.
- {-fast-redirects}
- www.ukc.ac.uk/cgi-bin/wac\.cgi\?
- login.yahoo.com
-
Turn on page filtering according to rules in the defined sections - of default.filter, and make one exception for - Sourceforge: -
# Run everything through the filter file, using only the
- # specified sections:
- +filter{html-annoyances} +filter{js-annoyances} +filter{popups}\
- +filter{webbugs} +filter{nimda} +filter{banners-by-size}
-
- # Then disable filtering of code from sourceforge!
- {-filter}
- .cvs.sourceforge.net
-
Now some URLs that we want "blocked" (normally generates - the "blocked" banner). Many of these use - regular expressions that will expand to match - multiple URLs:
# Blocklist:
- {+block}
- /.*/(.*[-_.])?ads?[0-9]?(/|[-_.].*|\.(gif|jpe?g))
- /.*/(.*[-_.])?count(er)?(\.cgi|\.dll|\.exe|[?/])
- /.*/(ng)?adclient\.cgi
- /.*/(plain|live|rotate)[-_.]?ads?/
- /.*/(sponsor)s?[0-9]?/
- /.*/_?(plain|live)?ads?(-banners)?/
- /.*/abanners/
- /.*/ad(sdna_image|gifs?)/
- /.*/ad(server|stream|juggler)\.(cgi|pl|dll|exe)
- /.*/adbanners/
- /.*/adserver
- /.*/adstream\.cgi
- /.*/adv((er)?ts?|ertis(ing|ements?))?/
- /.*/banner_?ads/
- /.*/banners?/
- /.*/banners?\.cgi/
- /.*/cgi-bin/centralad/getimage
- /.*/images/addver\.gif
- /.*/images/marketing/.*\.(gif|jpe?g)
- /.*/popupads/
- /.*/siteads/
- /.*/sponsor.*\.gif
- /.*/sponsors?[0-9]?/
- /.*/advert[0-9]+\.jpg
- /Media/Images/Adds/
- /ad_images/
- /adimages/
- /.*/ads/
- /bannerfarm/
- /grafikk/annonse/
- /graphics/defaultAd/
- /image\.ng/AdType
- /image\.ng/transactionID
- /images/.*/.*_anim\.gif # alvin brattli
- /ip_img/.*\.(gif|jpe?g)
- /rotateads/
- /rotations/
- /worldnet/ad\.cgi
- /cgi-bin/nph-adclick.exe/
- /.*/Image/BannerAdvertising/
- /.*/ad-bin/
- /.*/adlib/server\.cgi
- /autoads/
-
Note that many of these actions have the potential to cause a page to - misbehave, possibly even not to display at all. There are many ways - a site designer may choose to design his site, and what HTTP header - content he may depend on. There is no way to have hard and fast rules - for all sites. See the Appendix - for a brief example on troubleshooting actions.
Custom "actions", known to Privoxy - as "aliases", can be defined by combining other "actions". - These can in turn be invoked just like the built-in "actions". - Currently, an alias can contain any character except space, tab, "=", - "{" or "}". But please use only "a"- - "z", "0"-"9", "+", and - "-". Alias names are not case sensitive, and - must be defined before anything else in the - default.action file! And there can only be one set of - "aliases" defined.
Now let's define a few aliases:
# Useful custom aliases we can use later. These must come first!
- {{alias}}
- +no-cookies = +no-cookies-set +no-cookies-read
- -no-cookies = -no-cookies-set -no-cookies-read
- fragile = -block -no-cookies -filter -fast-redirects -hide-referer -no-popups
- shop = -no-cookies -filter -fast-redirects
- +imageblock = +block +image
-
- #For people who don't like to type too much: ;-)
- c0 = +no-cookies
- c1 = -no-cookies
- c2 = -no-cookies-set +no-cookies-read
- c3 = +no-cookies-set -no-cookies-read
- #... etc. Customize to your heart's content.
-
Some examples using our "shop" and "fragile" - aliases from above:
# These sites are very complex and require
- # minimal interference.
- {fragile}
- .office.microsoft.com
- .windowsupdate.microsoft.com
- .nytimes.com
-
- # Shopping sites - but we still want to block ads.
- {shop}
- .quietpc.com
- .worldpay.com # for quietpc.com
- .jungle.com
- .scan.co.uk
-
- # These shops require pop-ups also
- {shop -no-popups}
- .dabs.com
- .overclockers.co.uk
-
The "shop" and "fragile" aliases are often used for - "problem" sites that require most actions to be disabled - in order to function properly.
Any web page can be dynamically modified with the filter file. This - modification can be removal, or re-writing, of any web page content, - including tags and non-visible content. The default filter file is - default.filter, located in the config directory.
This is potentially a very powerful feature, and requires knowledge of both - "regular expressions" and HTML in order to create custom - filters. But there are a number of useful filters included with - Privoxy for many common situations.
The included example file is divided into sections. Each section begins - with the FILTER keyword, followed by the identifier - for that section, e.g. "FILTER: webbugs". Each section performs - a similar type of filtering, such as "html-annoyances".
This file uses regular expressions to alter or remove any string in the - target page. The expressions can only operate on one line at a time. Some - examples from the included default default.filter:
Stop web pages from displaying annoying messages in the status bar by - deleting such references:
FILTER: html-annoyances
-
- # New browser windows should be resizeable and have a location and status
- # bar. Make it so.
- #
- s/resizable="?(no|0)"?/resizable=1/ig s/noresize/yesresize/ig
- s/location="?(no|0)"?/location=1/ig s/status="?(no|0)"?/status=1/ig
- s/scrolling="?(no|0|Auto)"?/scrolling=1/ig
- s/menubar="?(no|0)"?/menubar=1/ig
-
- # The <BLINK> tag was a crime!
- #
- s*<blink>|</blink>**ig
-
- # Is this evil?
- #
- #s/framespacing="?(no|0)"?//ig
- #s/margin(height|width)=[0-9]*//gi
-
Just for kicks, replace any occurrence of "Microsoft" with - "MicroSuck", and have a little fun with topical buzzwords:
FILTER: fun
-
- s/microsoft(?!\.com)/MicroSuck/ig
-
- # Buzzword Bingo:
- #
- s/industry-leading|cutting-edge|award-winning/<font color=red><b>BINGO!</b></font>/ig
-
Kill those pesky little web-bugs:
# webbugs: Squish WebBugs (1x1 invisible GIFs used for user tracking)
- FILTER: webbugs
-
- s/<img\s+[^>]*?(width|height)\s*=\s*['"]?1\D[^>]*?(width|height)\s*=\s*['"]?1(\D[^>]*?)?>/<!-- Squished WebBug -->/sig
-
When Privoxy displays one of its internal - pages, such as a 404 Not Found error page, it uses the appropriate template. - On Linux, BSD, and Unix, these are located in - /etc/privoxy/templates by default. These may be - customized, if desired. cgi-style.css is - used to control the HTML attributes (fonts, etc).
The default "Blocked" banner page with the bright red top - banner, is called just "blocked". This - may be customized or replaced with something else if desired.
All Privoxy configuration is stored in text files. These files can be edited + with a text editor. Many important aspects of Privoxy can also be controlled + easily with a web browser.
+Privoxy's user interface can be reached through the special URL http://config.privoxy.org/ (shortcut: http://p.p/), which is a built-in page and works without Internet access. You will see the + following section:
+Privoxy Menu
This should be self-explanatory. Note the first item leads to an editor for the actions files, which is where the ad, banner, cookie, and URL blocking magic is + configured as well as other advanced features of Privoxy. This is an easy way to + adjust various aspects of Privoxy configuration. The actions file, and other + configuration files, are explained in detail below.
+"Toggle Privoxy On or Off" is handy for sites that might have problems with your + current actions and filters. You can in fact use it as a test to see whether it is Privoxy causing the problem or not. Privoxy continues to + run as a proxy in this case, but all manipulation is disabled, i.e. Privoxy acts + like a normal forwarding proxy.
+Note that several of the features described above are disabled by default in Privoxy 3.0.7 beta and later. Check the configuration + file to learn why and in which cases it's safe to enable them again.
+For Unix, *BSD and GNU/Linux, all configuration files are located in /etc/privoxy/ + by default. For MS Windows and OS/2 these are all in the same directory as the Privoxy executable.
+The installed defaults provide a reasonable starting point, though some settings may be aggressive by some + standards. For the time being, the principal configuration files are:
+The main configuration file is named config on + GNU/Linux, Unix, BSD, and OS/2, and config.txt on Windows. This is a required + file.
+match-all.action is used to define which "actions" + relating to banner-blocking, images, pop-ups, content modification, cookie handling etc should be applied by + default. It should be the first actions file loaded.
+default.action defines many exceptions (both positive and negative) from the + default set of actions that's configured in match-all.action. It should be the + second actions file loaded and shouldn't be edited by the user.
+Multiple actions files may be defined in config. These are processed in the + order they are defined. Local customizations and locally preferred exceptions to the default policies as + defined in match-all.action (which you will most probably want to define sooner or + later) are best applied in user.action, where you can preserve them across + upgrades. The file isn't installed by all installers, but you can easily create it yourself with a text + editor.
+There is also a web based editor that can be accessed from http://config.privoxy.org/show-status (Shortcut: http://p.p/show-status) for the various actions files.
+"Filter files" (the filter file) can be used to + re-write the raw page content, including viewable text as well as embedded HTML and JavaScript, and whatever + else lurks on any given web page. The filtering jobs are only pre-defined here; whether to apply them or not + is up to the actions files. default.filter includes various filters made available + for use by the developers. Some are much more intrusive than others, and all should be used with caution. You + may define additional filter files in config as you can with actions files. We + suggest user.filter for any locally defined filters or customizations.
+The syntax of the configuration and filter files may change between different Privoxy versions; unfortunately, + some enhancements cost backwards compatibility.
+All files use the "#" character to denote a comment (the + rest of the line will be ignored) and understand line continuation through placing a backslash ("\") as the very last character in a line. If the # is preceded by a + backslash, it loses its special function. Placing a # in front of an otherwise valid + configuration line to prevent it from being interpreted is called "commenting out" that line. Blank lines are + ignored.
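The comment and continuation rules can be illustrated with a small preprocessing sketch. It is a simplification and not Privoxy's actual parser: it joins continued lines first and strips comments afterwards, which is an assumption about ordering. The function names and the sample option names are hypothetical.

```python
import re

def strip_comment(line):
    """Drop everything after the first "#" that is not escaped as "\\#"."""
    m = re.search(r"(?<!\\)#", line)
    if m:
        line = line[:m.start()]
    # An escaped "\#" loses the backslash and stays as a literal "#".
    return line.replace("\\#", "#").strip()

def logical_lines(text):
    """Join backslash-continued lines, strip comments, skip blank lines."""
    result, pending = [], ""
    for raw in text.splitlines():
        if raw.endswith("\\"):
            pending += raw[:-1]  # continuation: keep collecting
            continue
        line = strip_comment(pending + raw)
        pending = ""
        if line:
            result.append(line)
    if pending.strip():
        result.append(strip_comment(pending))
    return result

sample = "\n".join([
    "# a full-line comment",
    "",
    "option-a value  # trailing comment",
    "+filter{a} \\",
    "+filter{b}",
    r"color \#fff",
])
print(logical_lines(sample))
```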
+The actions files and filter files can use Perl style regular expressions + for maximum flexibility.
+After making any changes, there is no need to restart Privoxy in order for + the changes to take effect. Privoxy detects such changes automatically. Note, + however, that it may take one or two additional requests for the change to take effect. When changing the + listening address of Privoxy, these "wake up" + requests must obviously be sent to the old listening + address.
+