4 >Privoxy Configuration</TITLE
7 CONTENT="Modular DocBook HTML Stylesheet Version 1.64
10 TITLE="Privoxy User Manual"
11 HREF="index.html"><LINK
13 TITLE="Starting Privoxy"
14 HREF="startup.html"><LINK
16 TITLE="Contacting the Developers, Bug Reporting and Feature
18 HREF="contact.html"><LINK
21 HREF="../p_doc.css"></HEAD
40 >Privoxy User Manual</TH
84 > configuration is stored
85 in text files. These files can be edited with a text editor.
86 Many important aspects of <SPAN
90 also be controlled easily with a web browser.
98 >7.1. Controlling <SPAN
101 > with Your Web Browser</A
107 >'s user interface can be reached through the special
109 HREF="http://config.privoxy.org/"
111 >http://config.privoxy.org/</A
118 which is a built-in page and works without Internet access.
119 You will see the following section: </P
139 HREF="http://config.privoxy.org/show-status"
141 >View &amp; change the current configuration</A
148 HREF="http://config.privoxy.org/show-version"
150 >View the source code version numbers</A
157 HREF="http://config.privoxy.org/show-request"
159 >View the request headers.</A
166 HREF="http://config.privoxy.org/show-url-info"
168 >Look up which actions apply to a URL and why</A
175 HREF="http://config.privoxy.org/toggle"
177 >Toggle Privoxy on or off</A
190 > This should be self-explanatory. Note the first item leads to an editor for the
193 >"actions list"</SPAN
194 >, which is where the ad, banner, cookie,
195 and URL blocking magic is configured as well as other advanced features of
199 >. This is an easy way to adjust various
203 > configuration. The actions
204 file, and other configuration files, are explained in detail below. </P
208 >"Toggle Privoxy On or Off"</SPAN
209 > is handy for sites that might
210 have problems with your current actions and filters. You can in fact use
211 it as a test to see whether it is <SPAN
215 causing the problem or not. <SPAN
219 to run as a proxy in this case, but all filtering is disabled. There
221 HREF="appendix.html#BOOKMARKLETS"
224 that you can toggle <SPAN
227 > with one click from
236 >7.2. Configuration Files Overview</A
239 > For Unix, *BSD and Linux, all configuration files are located in
243 > by default. For MS Windows, OS/2, and
244 AmigaOS these are all in the same directory as the
248 > executable. The names
249 and number of configuration files have changed from previous versions, and are
250 subject to change as development progresses.</P
252 > The installed defaults provide a reasonable starting point, though
253 some settings may be aggressive by some standards. For the time being, the
254 principal configuration files are:</P
261 > The main configuration file is named <A
262 HREF="configuration.html#CONFIG"
265 on Linux, Unix, BSD, OS/2, and AmigaOS and <TT
269 on Windows. This is a required file.
278 HREF="configuration.html#ACTIONS-FILE"
281 the default settings for various <SPAN
284 > relating to images, banners,
285 pop-ups, access restrictions, and cookies.
288 > Multiple actions files may be defined in <TT
292 are processed in the order they are defined. Local customizations and locally
293 preferred exceptions to the default policies as defined in
297 > are probably best applied in
301 >, which should be preserved across
305 > is also included. This is mostly
313 There is also a web based editor that can be accessed from
315 HREF="http://config.privoxy.org/show-status/"
317 >http://config.privoxy.org/show-status/</A
320 HREF="http://p.p/show-status/"
322 >http://p.p/show-status/</A
324 various actions files.
333 HREF="configuration.html#FILTER-FILE"
336 >) can be used to re-write the raw page content, including
337 viewable text as well as embedded HTML and JavaScript, and whatever else
338 lurks on any given web page. The filtering jobs are only pre-defined here;
339 whether to apply them or not is up to the actions files.
345 > All files use the <SPAN
351 > character to denote a
352 comment (the rest of the line will be ignored) and understand line continuation
353 through placing a backslash ("<TT
356 >") as the very last character
357 in a line. If the <TT
360 > is preceded by a backslash, it loses
361 its special function. Placing a <TT
364 > in front of an otherwise
365 valid configuration line to prevent it from being interpreted is called "commenting
368 > The actions files and <TT
372 can use Perl style <A
373 HREF="appendix.html#REGEX"
374 >regular expressions</A
376 maximum flexibility. </P
378 > After making any changes, there is no need to restart
382 > in order for the changes to take
386 > detects such changes
387 automatically. Note, however, that it may take one or two additional
388 requests for the change to take effect. When changing the listening address
396 must obviously be sent to the <I
399 > listening address.</P
401 > While under development, the configuration content is subject to change.
402 The documentation below may not be accurate by the time you read this.
403 Also, what constitutes a <SPAN
406 > setting may change, so
407 please check all your configuration files on important issues.</P
415 >7.3. The Main Configuration File</A
418 > Again, the main configuration file is named <TT
422 Linux/Unix/BSD and OS/2, and <TT
426 Configuration lines consist of an initial keyword followed by a list of
427 values, all separated by whitespace (any number of spaces or tabs). For
433 CLASS="LITERALLAYOUT"
436 >confdir /etc/privoxy</I
438 </P
443 > Assigns the value <TT
450 > and thus indicates that the configuration
451 directory is named <SPAN
453 >"/etc/privoxy/"</SPAN
456 > All options in the config file except for <TT
463 > are optional. Watch out in the description below
464 for what happens if you leave them unset.</P
466 > The main config file controls all aspects of <SPAN
470 operation that are not location dependent (i.e. they apply universally, no matter
471 where you may be surfing).</P
478 >7.3.1. Configuration and Log File Locations</A
484 > can (and normally does) use a number of
485 other files for additional configuration and logging.
486 This section of the configuration file tells <SPAN
490 where to find those other files. </P
508 >The directory where the other configuration files are located</P
520 >/etc/privoxy (Unix) <I
526 > installation dir (Windows) </P
529 >Effect if unset:</DT
550 > When development goes modular and multi-user, the blocker, filter, and
551 per-user config will be stored in subdirectories of <SPAN
555 For now, the configuration directory structure is flat, except for
558 >confdir/templates</TT
559 >, where the HTML templates for CGI
560 output reside (e.g. <SPAN
586 > The directory where all logging takes place (i.e. where <TT
606 >/var/log/privoxy (Unix) <I
612 > installation dir (Windows) </P
615 >Effect if unset:</DT
646 NAME="DEFAULT.ACTION"
650 NAME="STANDARD.ACTION"
669 HREF="configuration.html#ACTIONS"
678 >File name, relative to <TT
694 CLASS="LITERALLAYOUT"
695 > standard # Internal purposes, editing not recommended</P
702 CLASS="LITERALLAYOUT"
703 > default # Main actions file</P
710 CLASS="LITERALLAYOUT"
711 > user # User customizations</P
721 >Effect if unset:</DT
724 > No actions are taken at all. Simple neutral proxying.
734 > lines are OK and are in fact recommended!
738 The default values include standard.action, which is used for internal
739 purposes and should be loaded, default.action, which is the
743 > actions file maintained by the developers, and
747 >, where you can make your personal additions.
751 There is no point in using <SPAN
754 > without an actions file.
767 NAME="DEFAULT.FILTER"
781 HREF="configuration.html#FILTER"
790 >File name, relative to <TT
799 >default.filter (Unix) <I
802 > default.filter.txt (Windows)</P
805 >Effect if unset:</DT
808 > No textual content filtering takes place, i.e. all
818 actions in the actions files are turned off
827 >"default.filter"</SPAN
828 > file contains content modification rules
831 >"regular expressions"</SPAN
832 >. These rules permit powerful
833 changes on the content of Web pages, e.g., you could disable your favorite
834 JavaScript annoyances, re-write the actual displayed text, or just have some
842 it appears on a Web page.
865 > The log file to use
872 >File name, relative to <TT
884 > privoxy.log (Windows)</P
887 >Effect if unset:</DT
890 > No log file is used, all log messages go to the console (<TT
900 > The Windows version will additionally log to the console.
903 > The logfile is where all logging and error messages are written. The level
904 of detail and number of messages are set with the <TT
908 option (see below). The logfile can be useful for tracking down a problem with
912 > (e.g., it's not blocking an ad you
913 think it should block) but in most cases you probably will never look at it.
916 > Your logfile will grow indefinitely, and you will probably want to
917 periodically remove it. On Unix systems, you can do this with a cron job
921 >). For Red Hat, a <B
925 script has been included.
928 > On SuSE Linux systems, you can place a line like <SPAN
931 +1024k 644 nobody.nogroup"</SPAN
936 the effect that cron.daily will automatically archive, gzip, and empty the
937 log when it exceeds 1 MB in size.
960 > The file to store intercepted cookies in
967 >File name, relative to <TT
979 > privoxy.jar (Windows)</P
982 >Effect if unset:</DT
985 > Intercepted cookies are not stored at all.
992 > The jarfile may grow to ridiculous sizes over time.
1004 >7.3.1.7. trustfile</A
1009 CLASS="VARIABLELIST"
1015 > The trust file to use
1022 >File name, relative to <TT
1033 >Unset (commented out)</I
1034 >. When activated: trust (Unix) <I
1037 > trust.txt (Windows)</P
1040 >Effect if unset:</DT
1043 > The whole trust mechanism is turned off.
1050 > The trust mechanism is an experimental feature for building white-lists and should
1051 be used with care. It is <I
1054 > recommended for the casual user.
1057 > If you specify a trust file, <SPAN
1061 access to sites that are named in the trustfile.
1062 You can also mark sites as trusted referrers (with <TT
1066 the effect that access to untrusted sites will be granted, if a link from a
1067 trusted referrer was used.
1068 The link target will then be added to the <SPAN
1072 Possible applications include limiting Internet access for children.
1078 > operator in the trust file, it may grow considerably over time.
1091 >7.3.2. Local Set-up Documentation</A
1094 > If you intend to operate <SPAN
1098 than just yourself, it might be a good idea to let them know how to reach
1099 you, what you block and why you do that, your policies etc.
1106 NAME="TRUST-INFO-URL"
1107 >7.3.2.1. trust-info-url</A
1112 CLASS="VARIABLELIST"
1118 > A URL to be displayed in the error page that users will see if access to an untrusted page is denied.
1131 >Two example URLs are provided</P
1134 >Effect if unset:</DT
1137 > No links are displayed on the "untrusted" error page.
1144 > The value of this option only matters if the experimental trust mechanism has been
1151 > If you use the trust mechanism, it is a good idea to write up some on-line
1152 documentation about your trust policy and to specify the URL(s) here.
1153 Use multiple times for multiple URLs.
1156 > The URL(s) should be added to the trustfile as well, so users don't end up
1157 locked out from the information on why they were locked out in the first place!
1168 NAME="ADMIN-ADDRESS"
1169 >7.3.2.2. admin-address</A
1174 CLASS="VARIABLELIST"
1180 > An email address to reach the proxy administrator.
1199 >Effect if unset:</DT
1202 > No email address is displayed on error pages and the CGI user interface.
1216 are unset, the whole "Local Privoxy Support" box on all generated pages will
1228 NAME="PROXY-INFO-URL"
1229 >7.3.2.3. proxy-info-url</A
1234 CLASS="VARIABLELIST"
1240 > A URL to documentation about the local <SPAN
1244 configuration or policies.
1263 >Effect if unset:</DT
1266 > No link to local documentation is displayed on error pages and the CGI user interface.
1280 are unset, the whole "Local Privoxy Support" box on all generated pages will
1284 > This URL shouldn't be blocked ;-)
1297 >7.3.3. Debugging</A
1300 > These options are mainly useful when tracing a problem.
1301 Note that you might also want to invoke
1309 command line option when debugging.
1322 CLASS="VARIABLELIST"
1328 > Key values that determine what information gets logged.
1341 >12289 (i.e.: URLs plus informational and warning messages)</P
1344 >Effect if unset:</DT
1347 > Nothing gets logged.
1354 > The available debug levels are:
1364 CLASS="PROGRAMLISTING"
1365 > debug 1 # show each GET/POST/CONNECT request
1366 debug 2 # show each connection status
1367 debug 4 # show I/O status
1368 debug 8 # show header parsing
1369 debug 16 # log all data into the logfile
1370 debug 32 # debug force feature
1371 debug 64 # debug regular expression filter
1372 debug 128 # debug fast redirects
1373 debug 256 # debug GIF de-animation
1374 debug 512 # Common Log Format
1375 debug 1024 # debug kill pop-ups
1376 debug 4096 # Startup banner and warnings.
1377 debug 8192 # Non-fatal errors</PRE
1384 > To select multiple debug levels, you can either add them or use
1391 > A debug level of 1 is informative because it will show you each request
1394 >1, 4096 and 8192 are highly recommended</I
1396 so that you will notice when things go wrong. The other levels are probably
1397 only of interest if you are hunting down a specific problem. They can produce
1398 a hell of an output (especially 16).
1402 > The reporting of <I
1405 > errors (i.e. ones which crash
1409 >) is always on and cannot be disabled.
1412 > If you want to use CLF (Common Log Format), you should set <SPAN
1419 > and not enable anything else.
1430 NAME="SINGLE-THREADED"
1431 >7.3.3.2. single-threaded</A
1436 CLASS="VARIABLELIST"
1442 > Whether to run only one server thread
1464 >Effect if unset:</DT
1467 > Multi-threaded (or, where unavailable: forked) operation, i.e. the ability to
1468 serve multiple requests simultaneously.
1475 > This option is only there for debug purposes and you should never
1478 >It will drastically reduce performance.</I
1491 NAME="ACCESS-CONTROL"
1492 >7.3.4. Access Control and Security</A
1495 > This section of the config file controls the security-relevant aspects
1506 NAME="LISTEN-ADDRESS"
1507 >7.3.4.1. listen-address</A
1512 CLASS="VARIABLELIST"
1518 > The IP address and TCP port on which <SPAN
1522 listen for client requests.
1548 >Effect if unset:</DT
1551 > Bind to localhost (127.0.0.1), port 8118. This is suitable and recommended for
1552 home users who run <SPAN
1555 > on the same machine as
1563 > You will need to configure your browser(s) to this proxy address and port.
1566 > If you already have another service running on port 8118, or if you want to
1567 serve requests from other machines (e.g. on your local network) as well, you
1568 will need to override the default.
1571 > If you leave out the IP address, <SPAN
1575 bind to all interfaces (addresses) on your machine and may become reachable
1576 from the Internet. In that case, consider using access control lists (ACLs)
1580 > below), or a firewall.
1587 > Suppose you are running <SPAN
1591 a machine which has the address 192.168.0.1 on your local private network
1592 (192.168.0.0) and has another outside connection with a different address.
1593 You want it to serve requests from inside only:
1603 CLASS="PROGRAMLISTING"
1604 > listen-address 192.168.0.1:8118</PRE
1625 CLASS="VARIABLELIST"
1631 > Initial state of "toggle" status
1647 >Effect if unset:</DT
1650 > Act as if toggled on
1657 > If set to 0, <SPAN
1663 >"toggled off"</SPAN
1664 > mode, i.e. behave like a normal, content-neutral
1667 >enable-remote-toggle</TT
1669 below. This is not really useful anymore, since toggling is much easier
1671 HREF="http://config.privoxy.org/toggle"
1675 > than via editing the <TT
1681 > The Windows version will only display the toggle icon in the system tray
1682 if this option is present.
1693 NAME="ENABLE-REMOTE-TOGGLE"
1694 >7.3.4.3. enable-remote-toggle</A
1699 CLASS="VARIABLELIST"
1705 > Whether or not the <A
1706 HREF="http://config.privoxy.org/toggle"
1726 >Effect if unset:</DT
1729 > The web-based toggle feature is disabled.
1736 > When toggled off, <SPAN
1739 > acts like a normal,
1740 content-neutral proxy, i.e. it acts as if none of the actions applied to
1744 > For the time being, access to the toggle feature can <I
1748 controlled separately by <SPAN
1751 > or HTTP authentication,
1752 so that everybody who can access <SPAN
1763 toggle it for all users. So this option is <I
1767 for multi-user environments with untrusted users.
1770 > Note that you must have compiled <SPAN
1774 support for this feature, otherwise this option has no effect.
1785 NAME="ENABLE-EDIT-ACTIONS"
1786 >7.3.4.4. enable-edit-actions</A
1791 CLASS="VARIABLELIST"
1797 > Whether or not the <A
1798 HREF="http://config.privoxy.org/show-status"
1818 >Effect if unset:</DT
1821 > The web-based actions file editor is disabled.
1828 > For the time being, access to the editor can <I
1832 controlled separately by <SPAN
1835 > or HTTP authentication,
1836 so that everybody who can access <SPAN
1847 modify its configuration for all users. So this option is <I
1851 > for multi-user environments with untrusted users.
1854 > Note that you must have compiled <SPAN
1858 support for this feature, otherwise this option has no effect.
1878 ACLs: permit-access and deny-access</A
1883 CLASS="VARIABLELIST"
1889 > Who can access what.
1931 > are IP addresses in dotted decimal notation or valid
1943 > are subnet masks in CIDR notation, i.e. integer
1944 values from 2 to 30 representing the length (in bits) of the network address. The masks and the whole
1945 destination part are optional.
1958 >Effect if unset:</DT
1961 > Don't restrict access further than implied by <TT
1971 > Access controls are included at the request of ISPs and systems
1972 administrators, and <I
1974 >are not usually needed by individual users</I
1976 For a typical home user, it will normally suffice to ensure that
1980 > only listens on the localhost or internal (home)
1981 network address by means of the <TT
1987 > Please see the warnings in the FAQ that this proxy is not intended to be a substitute
1988 for a firewall or to encourage anyone to defer addressing basic security
1992 > Multiple ACL lines are OK.
1993 If any ACLs are specified, then the <SPAN
1997 talks only to IP addresses that match at least one <TT
2001 and don't match any subsequent <TT
2004 > line. In other words, the
2005 last match wins, with the default being <TT
2014 > is using a forwarder (see <TT
2018 for a particular destination URL, the <TT
2024 that is examined is the address of the forwarder and <I
2028 of the ultimate target. This is necessary because it may be impossible for the local
2032 > to determine the IP address of the
2033 ultimate target (that's often what gateways are used for).
2036 > You should prefer using IP addresses over DNS names, because the address lookups take
2037 time. All DNS names must resolve! You can <I
2040 > use domain patterns
2044 > or partial domain names. If a DNS name resolves to multiple
2045 IP addresses, only the first one is used.
2048 > Denying access to particular sites by ACL may have undesired side effects
2049 if the site in question is hosted on a machine which also hosts other sites.
2056 > Explicitly define the default behavior if no ACL and
2064 is OK. The absence of a <TT
2073 > destination addresses are OK:
2084 > permit-access localhost</PRE
2091 > Allow any host on the same class C subnet as www.privoxy.org access to
2092 nothing but www.example.com:
2103 > permit-access www.privoxy.org/24 www.example.com/32</PRE
2110 > Allow access from any host on the 26-bit subnet 192.168.45.64 to anywhere,
2111 with the exception that 192.168.45.73 may not access www.dirty-stuff.example.com:
2122 > permit-access 192.168.45.64/26
2123 deny-access 192.168.45.73 www.dirty-stuff.example.com</PRE
2139 >7.3.4.6. buffer-limit</A
2144 CLASS="VARIABLELIST"
2150 > Maximum size of the buffer for content filtering.
2166 >Effect if unset:</DT
2169 > Use a 4MB (4096 KB) limit.
2176 > For content filtering, i.e. the <TT
2183 > actions, it is necessary that
2187 > buffers the entire document body.
2188 This can be potentially dangerous, since a server could just keep sending
2189 data indefinitely and wait for your RAM to exhaust -- with nasty consequences.
2193 > When a document buffer size reaches the <TT
2197 flushed to the client unfiltered and no further attempt to
2198 filter the rest of the document is made. Remember that there may be multiple threads
2199 running, which might require up to <TT
2206 >, unless you have enabled <SPAN
2208 >"single-threaded"</SPAN
2223 >7.3.5. Forwarding</A
2226 > This feature allows routing of HTTP requests through a chain of
2228 It can be used to better protect privacy and confidentiality when
2229 accessing specific domains by routing requests to those domains
2230 through an anonymous public proxy (see e.g. <A
2231 HREF="http://www.multiproxy.org/anon_list.htm"
2233 >http://www.multiproxy.org/anon_list.htm</A
2235 Forwarding can also route requests through a caching proxy to speed up browsing. Chaining to a parent
2236 proxy may also be necessary because the machine that <SPAN
2240 runs on has no direct Internet access.</P
2242 > Also specified here are SOCKS proxies. <SPAN
2246 supports the SOCKS 4 and SOCKS 4A protocols.</P
2253 >7.3.5.1. forward</A
2258 CLASS="VARIABLELIST"
2264 > To which parent HTTP proxy specific requests should be routed.
2300 > is a domain name pattern (see the
2301 chapter on domain matching in the <TT
2310 > is the address of the parent HTTP proxy
2311 as an IP address in dotted decimal notation or as a valid DNS name (or <SPAN
2317 >"no forwarding"</SPAN
2324 > parameters are TCP ports, i.e. integer
2325 values from 1 to 65535
2338 >Effect if unset:</DT
2341 > Don't use parent HTTP proxies.
2356 >, then requests are not
2357 forwarded to another HTTP proxy but are made directly to the web servers.
2360 > Multiple lines are OK; they are checked in sequence, and the last match wins.
2367 > Everything goes to an example anonymizing proxy, except SSL on port 443 (which it doesn't handle):
2378 > forward .* anon-proxy.example.org:8080
2386 > Everything goes to our example ISP's caching proxy, except for requests
2387 to that ISP's sites:
2398 > forward .*. caching-proxy.example-isp.net:8000
2399 forward .example-isp.net .</PRE
2416 NAME="FORWARD-SOCKS4"
2420 NAME="FORWARD-SOCKS4A"
2423 forward-socks4 and forward-socks4a</A
2428 CLASS="VARIABLELIST"
2434 > Through which SOCKS proxy (and to which parent HTTP proxy) specific requests should be routed.
2481 > is a domain name pattern (see the
2482 chapter on domain matching in the <TT
2497 are IP addresses in dotted decimal notation or valid DNS names (<TT
2508 >"no HTTP forwarding"</SPAN
2509 >), and the optional
2515 > parameters are TCP ports, i.e. integer values from 1 to 65535
2528 >Effect if unset:</DT
2531 > Don't use SOCKS proxies.
2538 > Multiple lines are OK; they are checked in sequence, and the last match wins.
2541 > The difference between <TT
2546 >forward-socks4a</TT
2548 is that in the SOCKS 4A protocol, the DNS resolution of the target hostname happens on the SOCKS
2549 server, while in SOCKS 4 it happens locally.
2560 >, then requests are not
2561 forwarded to another HTTP proxy but are made (HTTP-wise) directly to the web servers, albeit through
2569 > From the company example.com, direct connections are made to all
2573 > domains, but everything outbound goes through
2574 their ISP's proxy by way of example.com's corporate SOCKS 4A gateway to
2586 > forward-socks4a .*. socks-gw.example.com:1080 www-cache.example-isp.net:8080
2587 forward .example.com .</PRE
2594 > A rule that uses a SOCKS 4 gateway for all destinations but no HTTP parent looks like this:
2605 > forward-socks4 .*. socks-gw.example.com:1080 .</PRE
2620 NAME="ADVANCED-FORWARDING-EXAMPLES"
2621 >7.3.5.3. Advanced Forwarding Examples</A
2624 > If you have links to multiple ISPs that provide various special content
2625 only to their subscribers, you can configure multiple <SPAN
2629 which have connections to the respective ISPs to act as forwarders to each other, so that
2633 > users can see the internal content of all ISPs.</P
2635 > Assume that host-a has a PPP connection to isp-a.net and host-b has a PPP connection to
2636 isp-b.net. Both run <SPAN
2640 configuration can look like this:</P
2653 forward .isp-b.net host-b:8118</PRE
2670 forward .isp-a.net host-a:8118</PRE
2676 > Now, your users can set their browser's proxy to use either
2677 host-a or host-b and be able to browse the internal content
2678 of both isp-a and isp-b.</P
2680 > If you intend to chain <SPAN
2687 > locally, then chaining as
2690 >browser -&gt; squid -&gt; privoxy</TT
2691 > is the recommended way. </P
2693 > Assuming that <SPAN
2700 run on the same box, your squid configuration could then look like this:</P
2710 > # Define Privoxy as parent proxy (without ICP)
2711 cache_peer 127.0.0.1 parent 8118 7 no-query
2713 # Define ACL for protocol FTP
2716 # Do not forward FTP requests to Privoxy
2717 always_direct allow ftp
2719 # Forward all the rest to Privoxy
2720 never_direct allow all</PRE
2726 > You would then need to change your browser's proxy settings to <SPAN
2729 >'s address and port.
2730 Squid normally uses port 3128. If unsure consult <TT
2745 >7.3.6. Windows GUI Options</A
2751 > has a number of options specific to the
2752 Windows GUI:</P
2754 NAME="ACTIVITY-ANIMATION"
2759 >"activity-animation"</SPAN
2764 > icon will animate when
2768 > is active. To turn off, set to 0.</P
2773 CLASS="LITERALLAYOUT"
2776 >activity-animation 1</I
2778 </P
2788 >"log-messages"</SPAN
2793 > will log messages to the console
2799 CLASS="LITERALLAYOUT"
2804 </P
2809 NAME="LOG-BUFFER-SIZE"
2815 >"log-buffer-size"</SPAN
2816 > is set to 1, the size of the log buffer,
2817 i.e. the amount of memory used for the log messages displayed in the
2818 console window, will be limited to <SPAN
2820 >"log-max-lines"</SPAN
2823 > Warning: Setting this to 0 will cause the buffer to grow without limit and
2824 eat up all your memory!</P
2829 CLASS="LITERALLAYOUT"
2832 >log-buffer-size 1</I
2834 </P
2839 NAME="LOG-MAX-LINES"
2844 >log-max-lines</SPAN
2845 > is the maximum number of lines held
2846 in the log buffer. See above.</P
2851 CLASS="LITERALLAYOUT"
2854 >log-max-lines 200</I
2856 </P
2861 NAME="LOG-HIGHLIGHT-MESSAGES"
2866 >"log-highlight-messages"</SPAN
2871 > will highlight portions of the log
2872 messages with a bold-faced font:</P
2877 CLASS="LITERALLAYOUT"
2880 >log-highlight-messages 1</I
2882 </P
2887 NAME="LOG-FONT-NAME"
2890 > The font used in the console window:</P
2895 CLASS="LITERALLAYOUT"
2898 >log-font-name Comic Sans MS</I
2900 </P
2905 NAME="LOG-FONT-SIZE"
2908 > Font size used in the console window:</P
2913 CLASS="LITERALLAYOUT"
2918 </P
2923 NAME="SHOW-ON-TASK-BAR"
2929 >"show-on-task-bar"</SPAN
2930 > controls whether or not
2934 > will appear as a button on the Task bar
2940 CLASS="LITERALLAYOUT"
2943 >show-on-task-bar 0</I
2945 </P
2950 NAME="CLOSE-BUTTON-MINIMIZES"
2955 >"close-button-minimizes"</SPAN
2956 > is set to 1, the Windows close
2957 button will minimize <SPAN
2960 > instead of closing
2961 the program (close with the exit option on the File menu).</P
2966 CLASS="LITERALLAYOUT"
2969 >close-button-minimizes 1</I
2971 </P
2981 >"hide-console"</SPAN
2982 > option is specific to the MS-Win console
2986 >. If this option is used,
2990 > will disconnect from and hide the
2996 CLASS="LITERALLAYOUT"
2997 > #hide-console<br>
2998 </P
3010 >7.4. Actions Files</A
3013 > The actions files are used to define what actions
3017 > takes for which URLs, and thus determine
3018 how ad images, cookies and various other aspects of HTTP content and
3019 transactions are handled, and on which sites (or even parts thereof). There
3020 are three such files included with <SPAN
3024 with slightly different purposes. <TT
3028 the default policies. <TT
3030 >standard.action</TT
3035 > and the web based editor to set
3036 pre-defined values (and normally should not be edited). Local exceptions
3037 are best done in <TT
3040 >. The content of these
3041 can all be viewed and edited from <A
3042 HREF="http://config.privoxy.org/show-status"
3044 >http://config.privoxy.org/show-status</A
3049 Anything you want can be blocked here, including ads, banners, or just some obnoxious
3050 URL that you would rather not see. Cookies can be accepted or rejected, or
3051 accepted only during the current browser session (i.e. not written to disk),
3052 content can be modified, JavaScripts tamed, user-tracking fooled, and much more.
3053 See below for a complete list of available actions.</P
3055 > An actions file typically has sections. Near the top, <SPAN
3059 optionally defined (discussed <A
3060 HREF="configuration.html#ALIASES"
3063 >), then the default set of rules
3064 which will apply universally to all sites and pages. And then below that,
3065 exceptions to the defined universal policies. </P
3072 >7.4.1. Finding the Right Mix</A
3076 HREF="configuration.html#ACTIONS"
3078 > like cookie suppression
3079 or script disabling may render unusable those sites that rely on these
3080 techniques to work properly. Finding the right mix of actions is not easy and
3081 certainly a matter of personal taste. In general, it can be said that the more
3085 > your default settings (in the top section of the
3086 actions file) are, the more exceptions for <SPAN
3090 will have to make later. If, for example, you want to kill popup windows by
3091 default, you'll have to make exceptions to that rule for sites that you
3092 regularly use and that require popups for actually useful content, like maybe
3093 your bank, favorite shop, or newspaper.</P
3095 > We have tried to provide you with reasonable rules to start from in the
3096 distribution actions files. But there is no general rule of thumb on these
3097 things. There just are too many variables, and sites are constantly changing.
3098 Sooner or later you will want to change the rules (and read this chapter again :).</P
3106 >7.4.2. How to Edit</A
3109 > The easiest way to edit the <SPAN
3112 > files is with a browser by
3113 using our browser-based editor, which can be reached from <A
3114 HREF="http://config.privoxy.org/show-status"
3116 >http://config.privoxy.org/show-status</A
3119 > If you prefer plain text editing to GUIs, you can of course also directly edit the
3120 actions files.</P
3128 >7.4.3. How Actions are Applied to URLs</A
3131 > Actions files are divided into sections. There are special sections,
3135 HREF="configuration.html#ALIASES"
3138 > sections which will be discussed later. For now
3139 let's concentrate on regular sections: They have a heading line (often split
3140 up into multiple lines for readability) which consists of a list of actions,
3141 separated by whitespace and enclosed in curly braces. Below that, there
3142 is a list of URL patterns, each on a separate line.</P
3144 > To determine which actions apply to a request, the URL of the request is
3145 compared to all patterns in this file. Every time it matches, the list of
3146 applicable actions for the URL is incrementally updated, using the heading
3147 of the section in which the pattern is located. If multiple matches for
3148 the same URL set the same action differently, the last match wins. If not,
3149 the effects are aggregated (e.g. a URL might match both the
3151 HREF="configuration.html#HANDLE-AS-IMAGE"
3155 >"+handle-as-image"</SPAN
3159 HREF="configuration.html#BLOCK"
3168 > You can trace this process by visiting <A
3169 HREF="http://config.privoxy.org/show-url-info"
3171 >http://config.privoxy.org/show-url-info</A
3174 > More detail on this is provided in the Appendix, <A
3175 HREF="appendix.html#ACTIONSANAT"
3176 > Anatomy of an Action</A
3188 > Generally, a pattern has the form <TT
3190 >&lt;domain&gt;/&lt;path&gt;</TT
3194 >&lt;domain&gt;</TT
3199 are optional. (This is why the pattern <TT
3202 > matches all URLs).</P
3206 CLASS="VARIABLELIST"
3211 >www.example.com/</TT
3215 > is a domain-only pattern and will match any request to <TT
3217 >www.example.com</TT
3219 regardless of which document on that server is requested.
3225 >www.example.com</TT
3229 > means exactly the same. For domain-only patterns, the trailing <TT
3239 >www.example.com/index.html</TT
3243 > matches only the single document <TT
3249 >www.example.com</TT
3260 > matches the document <TT
3263 >, regardless of the domain,
3277 > matches nothing, since it would be interpreted as a domain name and
3278 there is no top-level domain called <TT
3292 >7.4.4.1. The Domain Pattern</A
3295 > The matching of the domain part offers some flexible options: if the
3296 domain starts or ends with a dot, it becomes unanchored at that end.
3301 CLASS="VARIABLELIST"
3310 > matches any domain that <I
3327 > matches any domain that <I
3344 > matches any domain that <I
3351 (Strictly speaking: it matches any FQDN that contains <TT
3360 > Additionally, there are wild-cards that you can use in the domain names
3361 themselves. They work much like shell wild-cards: <SPAN
3365 stands for zero or more arbitrary characters, <SPAN
3369 any single character, you can define character classes in square
3370 brackets and all of that can be freely mixed:</P
3374 CLASS="VARIABLELIST"
3379 >ad*.example.com</TT
3385 >"adserver.example.com"</SPAN
3389 >"ads.example.com"</SPAN
3390 >, etc., but not <SPAN
3392 >"sfads.example.com"</SPAN
3399 >*ad*.example.com</TT
3403 > matches all of the above, and then some.
3419 >pictures.epix.com</TT
3422 >a.b.c.d.e.upix.com</TT
3429 >www[1-9a-ez].example.c*</TT
3435 >www1.example.com</TT
3439 >www4.example.cc</TT
3442 >wwwd.example.cy</TT
3446 >wwwz.example.com</TT
3453 >wwww.example.com</TT
3466 >7.4.4.2. The Path Pattern</A
3472 > uses Perl compatible regular expressions
3474 HREF="http://www.pcre.org/"
3478 matching the path.</P
3481 HREF="appendix.html#REGEX"
3483 > with a brief quick-start into regular
3484 expressions, and full (very technical) documentation on PCRE regex syntax is available on-line
3486 HREF="http://www.pcre.org/man.txt"
3488 >http://www.pcre.org/man.txt</A
3490 You might also find the Perl man page on regular expressions (<TT
3494 useful, which is available on-line at <A
3495 HREF="http://www.perldoc.com/perl5.6/pod/perlre.html"
3497 >http://www.perldoc.com/perl5.6/pod/perlre.html</A
3500 > Note that the path pattern is automatically left-anchored at the <SPAN
3504 i.e. it matches as if it started with a <SPAN
3507 > (regular expression speak
3508 for the beginning of a line).</P
3510 > Please also note that matching in the path is case
3514 > by default, but you can switch to case
3515 sensitive at any point in the pattern by using the
3522 >www.example.com/(?-i)PaTtErN.*</TT
3524 documents whose path starts with <TT
3531 > this capitalization.</P
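><P
> A rough sketch of how these path rules combine (the hosts and paths are
hypothetical, and the block heading is only an assumption chosen for
illustration):</P
><P
CLASS="LITERALLAYOUT"
> {+block}<br>
# Any domain; path begins with /ads/, in any capitalization<br>
/ads/.*<br>
# Any domain; a /banners/ directory at any depth<br>
/(.*/)?banners/.*<br>
# One host; only this exact capitalization of "Banner"<br>
www.example.com/(?-i)Banner.*<br>
</P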
3543 > All actions are disabled by default, until they are explicitly enabled
3544 somewhere in an actions file. Actions are turned on if preceded with a
3548 >, and turned off if preceded with a <SPAN
3557 >"do that action"</SPAN
3562 > means please <SPAN
3564 >"block the following URL
3569 Actions are invoked by enclosing the action name in curly braces (e.g.
3570 {+some_action}), followed by a list of URLs (or patterns that match URLs) to
3571 which the action applies. There are three classes of actions: </P
3579 Boolean, i.e. the action can only be <SPAN
3592 CLASS="LITERALLAYOUT"
3596 > # enable this action<br>
3600 > # disable this action<br>
3601 </P
3610 Parameterized, e.g. <SPAN
3612 >"+/-hide-user-agent{ Mozilla 1.0 }"</SPAN
3614 where some value is required in order to enable this type of action.
3621 CLASS="LITERALLAYOUT"
3625 > # enable action and set parameter to <SPAN
3632 > # disable action (<SPAN
3635 >) can be omitted<br>
3636 </P
3646 Multi-value, e.g. <SPAN
3648 >"{+/-add-header{Name: value}}"</SPAN
3652 >"{+/-send-wafer{name=value}}"</SPAN
3653 >), where some value needs to be defined
3654 in addition to simply enabling the action. Examples:
3660 CLASS="LITERALLAYOUT"
3663 >{+name{param=value}}</I
3664 > # enable action and set <SPAN
3667 > to <SPAN
3673 >{-name{param=value}}</I
3674 > # remove the parameter <SPAN
3677 > completely<br>
3681 > # disable this action totally and remove <SPAN
3685 </P
3694 > If nothing is specified in any actions file, no <SPAN
3698 taken. So in this case <SPAN
3702 normal, non-blocking, non-anonymizing proxy. You must specifically enable the
3703 privacy and blocking features you need (although the provided default actions
3704 files will give a good starting point).</P
3706 > Later-defined actions always override earlier ones. So exceptions
3707 to any rules you make should come in the latter part of the file. For
3708 multi-valued actions, the actions are applied in the order they are
3709 specified. Actions files are processed in the order they are defined
3713 > (the default installation has three
3714 actions files). It is also quite possible for any given URL pattern to
3715 match more than one action!</P
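><P
> To illustrate the ordering rule with a small sketch (the host names are
hypothetical): a general rule comes first, and its exception follows later in
the file so that the later match wins:</P
><P
CLASS="LITERALLAYOUT"
> # General rule, near the top of the file<br>
{+block}<br>
.example.com<br>
# Exception, placed later so that it overrides the rule above<br>
{-block}<br>
downloads.example.com<br>
</P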
3717 > The list of valid <SPAN
3732 >+add-header{Name: value}</I
3738 CLASS="VARIABLELIST"
3750 > Send a user defined HTTP header to the web server.
3754 >Possible values:</DT
3757 > Any value is possible. Validity of the defined HTTP headers is not checked.
3764 CLASS="LITERALLAYOUT"
3765 > <I
3767 >{+add-header{X-User-Tracking: sucks}}</I
3769 <I
3773 </P
3779 > This action may be specified multiple times, in order to define multiple
3780 headers. This is rarely needed for the typical user. If you don't know what
3783 >"HTTP headers"</SPAN
3784 > are, you definitely don't need to worry about this
3805 CLASS="VARIABLELIST"
3817 > Used to block a URL from reaching your browser. The URL may be
3818 anything, but this action is typically used to block ads or other obnoxious
3823 >Possible values:</DT
3832 CLASS="LITERALLAYOUT"
3833 > <I
3837 <I
3839 >.banners.example.com</I
3841 <I
3845 </P
3851 > If a URL matches one of the blocked patterns, <SPAN
3855 will intercept the URL and display its special <SPAN
3859 instead. If there is sufficient space, a large red banner will appear with
3860 a friendly message about why the page was blocked, and a way to go there
3861 anyway. If there is insufficient space a smaller blocked page will appear
3862 without the red banner.
3864 HREF="http://ads.bannerserver.example.com/nasty-ads/sponsor.html"
3868 to view the default blocked HTML page (<SPAN
3872 for this to work as intended!).
3876 A very important exception is if the URL <I
3884 HREF="configuration.html#HANDLE-AS-IMAGE"
3888 >"+handle-as-image"</SPAN
3891 then it will be handled by
3893 HREF="configuration.html#SET-IMAGE-BLOCKER"
3897 >"+set-image-blocker"</SPAN
3900 (see below). It is important to understand this process, in order
3901 to understand how <SPAN
3904 > is able to deal with
3905 ads and other objectionable content.
3909 HREF="configuration.html#FILTER"
3916 action can also perform some of the
3917 same functionality as <SPAN
3920 >, but by virtue of very
3921 different programming techniques, and is most often used for different
3933 NAME="DEANIMATE-GIFS"
3942 CLASS="VARIABLELIST"
3954 > To stop those annoying, distracting animated GIF images.
3958 >Possible values:</DT
3974 CLASS="LITERALLAYOUT"
3975 > <I
3977 >{+deanimate-gifs{last}}</I
3979 <I
3983 </P
3989 > De-animate all animated GIF images, i.e. reduce them to their last frame.
3990 This will also shrink the images considerably (in bytes, not pixels!). If
3994 > is given, the first frame of the animation
3995 is used as the replacement. If <SPAN
3998 > is given, the last
3999 frame of the animation is used instead, which probably makes more sense for
4000 most banner animations, but also has the risk of not showing the entire
4001 last frame (if it is only a delta to an earlier frame).
4012 NAME="DOWNGRADE-HTTP-VERSION"
4015 >+downgrade-http-version</I
4021 CLASS="VARIABLELIST"
4035 >"+downgrade-http-version"</SPAN
4036 > will downgrade HTTP/1.1 client requests to
4037 HTTP/1.0 and downgrade the responses as well.
4041 >Possible values:</DT
4051 CLASS="LITERALLAYOUT"
4052 > <I
4054 >{+downgrade-http-version}</I
4056 <I
4060 </P
4066 > Use this action for servers that use HTTP/1.1 protocol features that
4070 > doesn't handle well yet. HTTP/1.1 is
4071 only partially implemented. Default is not to downgrade requests. This is
4072 an infrequently needed action, and is used to help with rare problem sites only.
4083 NAME="FAST-REDIRECTS"
4092 CLASS="VARIABLELIST"
4106 >"+fast-redirects"</SPAN
4107 > action enables interception of
4111 > requests from one server to another, which
4112 are used to track users.<SPAN
4116 all but the last valid URL in a redirect request and send a local redirect
4117 back to your browser without contacting the intermediate site(s).
4121 >Possible values:</DT
4131 CLASS="LITERALLAYOUT"
4132 > <I
4134 >{+fast-redirects}</I
4136 <I
4140 </P
4147 Many sites, like yahoo.com, don't just link to other sites. Instead, they
4148 will link to some script on their own server, giving the destination as a
4149 parameter, which will then redirect you to the final target. URLs
4150 resulting from this scheme typically look like:
4153 >http://some.place/some_script?http://some.where-else</I
4157 > Sometimes, there are even multiple consecutive redirects encoded in the
4158 URL. These redirections via scripts make your web browsing more traceable,
4159 since the server from which you follow such a link can see where you go
4160 to. Apart from that, valuable bandwidth and time are wasted while your
4161 browser asks the server for one redirect after the other. Plus, it feeds
4165 > This is a normally <SPAN
4168 > feature, and often requires exceptions
4169 for sites that are sensitive to defeating this mechanism.
4189 CLASS="VARIABLELIST"
4201 > Apply page filtering as defined by named sections of the
4205 > file to the specified site(s).
4209 > can be any modification of the raw
4210 page content, including re-writing or deletion of content.
4214 >Possible values:</DT
4220 > must include the name of one of the section identifiers
4228 > is specified in <TT
4235 >Example usage (from the current <TT
4249 >+filter{html-annoyances}</I
4250 >: Get rid of particularly annoying HTML abuse.
4266 >+filter{js-annoyances}</I
4267 >: Get rid of particularly annoying JavaScript abuse
4283 >+filter{content-cookies}</I
4284 >: Kill cookies that come in the HTML or JS content
4301 >: Kill all popups in JS and HTML
4317 >+filter{frameset-borders}</I
4318 >: Give frames a border and make them resizable
4334 >+filter{webbugs}</I
4335 >: Squish WebBugs (1x1 invisible GIFs used for user tracking)
4351 >+filter{refresh-tags}</I
4352 >: Kill automatic refresh tags (for dial-on-demand setups)
4369 >: Text replacements for subversive browsing fun!
4386 >: Remove Nimda (virus) code.
4402 >+filter{banners-by-size}</I
4403 >: Kill banners by size (<I
4422 >+filter{shockwave-flash}</I
4423 >: Kill embedded Shockwave Flash objects
4439 >+filter{crude-parental}</I
4440 >: Kill all web pages that contain the words "sex" or "warez"
4452 > This is potentially a very powerful feature, and it requires knowledge
4453 of regular expressions if you want to <SPAN
4455 >"roll your own"</SPAN
4457 Filtering operates on a line by line basis throughout the entire page.
4460 > Filtering requires buffering the page content, which may appear to
4461 slow down page rendering since nothing is displayed until all content has
4462 passed the filters. (It does not really take longer, but seems that way
4463 since the page is not incrementally displayed.) This effect will be more
4464 noticeable on slower connections.
4467 > Filtering can achieve some of the same effects as the
4469 HREF="configuration.html#BLOCK"
4476 action, i.e. it can be used to block ads and banners. In the overall
4477 scheme of things, filtering is one of the first things <SPAN
4481 does with a web page. So most other actions are applied to the
4496 NAME="HIDE-FORWARDED-FOR-HEADERS"
4499 >+hide-forwarded-for-headers</I
4505 CLASS="VARIABLELIST"
4517 > Block any existing X-Forwarded-for HTTP header, and do not add a new one.
4521 >Possible values:</DT
4531 CLASS="LITERALLAYOUT"
4532 > <I
4534 >{+hide-forwarded-for-headers}</I
4536 <I
4540 </P
4546 > It is fairly safe to leave this on. It does not seem to break many sites.
4557 NAME="HIDE-FROM-HEADER"
4560 >+hide-from-header</I
4566 CLASS="VARIABLELIST"
4578 > To block the browser from sending your email address in a <SPAN
4586 >Possible values:</DT
4592 >, or any user defined value.
4599 CLASS="LITERALLAYOUT"
4600 > <I
4602 >{+hide-from-header{block}}</I
4604 <I
4608 </P
4617 > will completely remove the header
4618 (not to be confused with the <A
4619 HREF="configuration.html#BLOCK"
4626 Alternatively, you can specify any value you prefer to send to the web
4642 NAME="HIDE-REFERRER"
4650 CLASS="VARIABLELIST"
4662 > Don't send the <SPAN
4665 > (sic) HTTP header to the web site.
4666 Or, alternatively, send a forged header instead.
4670 >Possible values:</DT
4673 > Prevent the header from being sent with the keyword, <SPAN
4680 > a URL to one from the same server as the request.
4681 Or, set to user defined value of your choice.
4688 CLASS="LITERALLAYOUT"
4689 > <I
4691 >{+hide-referer{forge}}</I
4693 <I
4697 </P
4706 > is the preferred option here, since some servers will
4707 not send images back otherwise.
4713 >"+hide-referrer"</SPAN
4714 > is an alternate spelling of
4717 >"+hide-referer"</SPAN
4718 >. It has the exact same parameters, and can be freely
4721 >"+hide-referer"</SPAN
4726 correct English spelling, however the HTTP specification has a bug - it
4727 requires it to be spelled as <SPAN
4741 NAME="HIDE-USER-AGENT"
4744 >+hide-user-agent</I
4750 CLASS="VARIABLELIST"
4762 > To change the <SPAN
4764 >"User-Agent:"</SPAN
4765 > header so web servers can't tell
4766 your browser type. Whose business is it anyway?
4770 >Possible values:</DT
4773 > Any user defined string.
4780 CLASS="LITERALLAYOUT"
4781 > <I
4783 >{+hide-user-agent{Netscape 6.1 (X11; I; Linux 2.4.18 i686)}}</I
4785 <I
4789 </P
4795 > Warning! This breaks many web sites that depend on this in order
4796 to determine how the target browser will respond to various
4797 requests. Use with caution.
4808 NAME="HANDLE-AS-IMAGE"
4811 >+handle-as-image</I
4817 CLASS="VARIABLELIST"
4829 > To define what <SPAN
4833 automatically as an image, and is an important ingredient of how
4838 >Possible values:</DT
4848 CLASS="LITERALLAYOUT"
4849 > <I
4851 >{+handle-as-image}</I
4853 <I
4855 >/.*\.(gif|jpg|jpeg|png|bmp|ico)</I
4857 </P
4863 > This only has meaning if the URL (or pattern) also is
4867 >ed, in which case a user definable image can
4868 be sent rather than an HTML page. This is integral to the whole concept of
4869 ad blocking: the URL must match <I
4873 HREF="configuration.html#BLOCK"
4885 >"+handle-as-image"</SPAN
4888 HREF="configuration.html#SET-IMAGE-BLOCKER"
4892 >"+set-image-blocker"</SPAN
4895 below for control over what will actually be displayed by the browser.)
4898 > There is little reason to change the default definition for this action.
4909 NAME="SET-IMAGE-BLOCKER"
4912 >+set-image-blocker</I
4918 CLASS="VARIABLELIST"
4930 > Decide what to do with URLs that end up tagged with <I
4935 HREF="configuration.html#BLOCK"
4943 HREF="configuration.html#HANDLE-AS-IMAGE"
4947 >"+handle-as-image"</SPAN
4950 e.g. an advertisement.
4954 >Possible values:</DT
4957 > There are four available options: <SPAN
4959 >"-set-image-blocker"</SPAN
4964 > page, usually resulting in a <SPAN
4971 >"+set-image-blocker{<I
4976 1x1 transparent GIF image.
4979 >"+set-image-blocker{<I
4984 checkerboard type pattern (the default). And finally,
4987 >"+set-image-blocker{<I
4992 send an HTTP temporary redirect to the specified image. This has the
4993 advantage of the icon being cached by the browser, which will speed
5001 CLASS="LITERALLAYOUT"
5002 > <I
5004 >{+set-image-blocker{blank}}</I
5006 <I
5010 </P
5019 > ads, they need to meet
5020 criteria as matching both <I
5027 actions. And then, <SPAN
5029 >"image-blocker"</SPAN
5034 > for invisibility. Note you cannot treat HTML pages as
5035 images in most cases. For instance, frames require an HTML page to
5036 display. So a frame that is an ad typically cannot be treated as an image.
5040 > in this situation just will not work
5052 NAME="LIMIT-CONNECT"
5061 CLASS="VARIABLELIST"
5076 > only allows HTTP CONNECT
5077 requests to port 443 (the standard, secure HTTPS port). Use
5080 >"+limit-connect"</SPAN
5081 > to disable this altogether, or to allow
5086 >Possible values:</DT
5089 > Any valid port number, or port number range.
5093 >Example usages:</DT
5096 CLASS="LITERALLAYOUT"
5097 > <I
5099 >+limit-connect{443}</I
5100 > # This is the default and need not be specified.<br>
5101 <I
5103 >+limit-connect{80,443}</I
5104 > # Ports 80 and 443 are OK.<br>
5105 <I
5107 >+limit-connect{-3, 7, 20-100, 500-}</I
5108 > # Ports less than 3, port 7, ports 20 to 100 and ports above 500 are OK.<br>
5109 </P
5115 > The CONNECT method exists in HTTP to allow access to secure websites
5116 (https:// URLs) through proxies. It works very simply: the proxy connects
5117 to the server on the specified port, and then short-circuits its
5118 connections to the client <I
5121 > to the remote proxy.
5122 This can be a big security hole, since CONNECT-enabled proxies can be
5123 abused as TCP relays very easily.
5127 If you want to allow CONNECT for more ports than this, or want to forbid
5128 CONNECT altogether, you can specify a comma separated list of ports and
5129 port ranges (the latter using dashes, with the minimum defaulting to 0 and
5133 > If you don't know what any of this means, there probably is no reason to
5145 NAME="PREVENT-COMPRESSION"
5148 >+prevent-compression</I
5154 CLASS="VARIABLELIST"
5166 > Prevent the specified websites from compressing HTTP data.
5170 >Possible values:</DT
5180 CLASS="LITERALLAYOUT"
5181 > <I
5183 >{+prevent-compression}</I
5185 <I
5189 </P
5195 > Some websites do this, which can be a problem for
5201 HREF="configuration.html#FILTER"
5209 HREF="configuration.html#KILL-POPUPS"
5213 >"+kill-popups"</SPAN
5217 HREF="configuration.html#DEANIMATE-GIFS"
5221 >"+deanimate-gifs"</SPAN
5224 will not work on compressed data. This will slow down connections to those
5225 websites, though. Default typically is to turn
5228 >"prevent-compression"</SPAN
5240 NAME="SESSION-COOKIES-ONLY"
5243 >+session-cookies-only</I
5249 CLASS="VARIABLELIST"
5261 > Allow cookies for the current browser session <I
5268 >Possible values:</DT
5275 >Example usage (disabling):</DT
5278 CLASS="LITERALLAYOUT"
5279 > <I
5281 >{-session-cookies-only}</I
5283 <I
5287 </P
5293 > If websites set cookies, <SPAN
5295 >"+session-cookies-only"</SPAN
5297 they are erased when you exit and restart your web browser. This makes
5298 profiling cookies useless, but won't break sites which require cookies so
5299 that you can log in for transactions. This is generally turned on for all
5300 sites, and is the recommended setting.
5305 >"+prevent-*-cookies"</SPAN
5306 > actions should be turned off as well (see
5309 >"+session-cookies-only"</SPAN
5310 > to work. Otherwise, no cookies
5311 will get through at all. For <SPAN
5314 > cookies that survive
5315 across browser sessions, see below as well.
5326 NAME="PREVENT-READING-COOKIES"
5329 >+prevent-reading-cookies</I
5335 CLASS="VARIABLELIST"
5347 > Explicitly prevent the web server from reading any cookies on your
5352 >Possible values:</DT
5362 CLASS="LITERALLAYOUT"
5363 > <I
5365 >{+prevent-reading-cookies}</I
5367 <I
5371 </P
5377 > Often used in conjunction with <SPAN
5379 >"+prevent-setting-cookies"</SPAN
5381 disable cookies completely. Note that
5383 HREF="configuration.html#SESSION-COOKIES-ONLY"
5387 >"+session-cookies-only"</SPAN
5390 requires these to both be disabled (or else it never gets any cookies to cache).
5396 > cookies to work (i.e. they survive across browser
5397 sessions and reboots), all three cookie settings should be <SPAN
5401 for the specified sites.
5412 NAME="PREVENT-SETTING-COOKIES"
5415 >+prevent-setting-cookies</I
5421 CLASS="VARIABLELIST"
5433 > Explicitly block the web server from storing cookies on your
5438 >Possible values:</DT
5448 CLASS="LITERALLAYOUT"
5449 > <I
5451 >{+prevent-setting-cookies}</I
5453 <I
5457 </P
5463 > Often used in conjunction with <SPAN
5465 >"+prevent-reading-cookies"</SPAN
5467 disable cookies completely (see above).
5490 CLASS="VARIABLELIST"
5502 > Stop those annoying JavaScript pop-up windows!
5506 >Possible values:</DT
5516 CLASS="LITERALLAYOUT"
5517 > <I
5521 <I
5525 </P
5533 >"+kill-popups"</SPAN
5534 > uses a built-in filter to disable pop-ups
5538 > function, etc. This is
5539 one of the first actions processed by <SPAN
5543 as it contacts the remote web server. This action is not always 100% reliable,
5544 and is supplemented by <SPAN
5561 NAME="SEND-VANILLA-WAFER"
5564 >+send-vanilla-wafer</I
5570 CLASS="VARIABLELIST"
5582 > Sends a cookie for every site stating that you do not accept any copyright
5583 on cookies sent to you, and asking them not to track you.
5587 >Possible values:</DT
5597 CLASS="LITERALLAYOUT"
5598 > <I
5600 >{+send-vanilla-wafer}</I
5602 <I
5606 </P
5612 > This action only applies if you are using a <TT
5616 for saving cookies. Of course, this is a (relatively) unique header and
5617 could conceivably be used to track you.
5637 CLASS="VARIABLELIST"
5649 > This allows you to send an arbitrary, user definable cookie.
5653 >Possible values:</DT
5656 > User specified cookie name and corresponding value.
5663 CLASS="LITERALLAYOUT"
5664 > <I
5666 >{+send-wafer{name=value}}</I
5668 <I
5672 </P
5678 > This can be specified multiple times in order to add as many cookies as you
5691 >7.4.5.21. Actions Examples</A
5694 > Note that the meaning of any of the above examples is reversed by preceding
5695 the action with a <SPAN
5698 >, in place of the <SPAN
5702 that some actions are turned on in the default section of the actions file,
5703 and require little to no additional configuration. These are just <SPAN
5707 But other actions that are turned on in the default section <I
5710 typically require</I
5711 > exceptions to be listed in the lower sections of
5712 the actions file. E.g. by default no URLs are <SPAN
5716 the default definitions of <TT
5720 exceptions to this in order to enable ad blocking.</P
5724 > Turn off cookies by default, then allow a few through for specified sites
5725 (showing an excerpt from the <SPAN
5728 > section of an actions
5734 CLASS="LITERALLAYOUT"
5735 > # Excerpt only:<br>
5736 # Allow cookies to and from the server, but<br>
5737 # for this browser session ONLY<br>
5739 # other actions normally listed here...<br>
5740 -prevent-setting-cookies \<br>
5741 -prevent-reading-cookies \<br>
5742 +session-cookies-only \ <br>
5744 / # match all URLs<br>
5746 # Exceptions to the above, sites that benefit from persistent cookies<br>
5747 # that are saved from one browser session to the next.<br>
5748 { -session-cookies-only }<br>
5749 .javasoft.com<br>
5750 .sun.com<br>
5751 .yahoo.com<br>
5752 .msdn.microsoft.com<br>
5753 .redhat.com<br>
5755 </P
5760 > Now turn off <SPAN
5762 >"fast redirects"</SPAN
5763 >, and then we allow two exceptions:</P
5768 CLASS="LITERALLAYOUT"
5769 > # Turn them off (excerpt only)!<br>
5771 # other actions normally listed here...<br>
5772 +fast-redirects<br>
5774 / # match all URLs<br>
5776 # Reverse it for these two sites, which don't work right without it.<br>
5777 {-fast-redirects}<br>
5778 www.ukc.ac.uk/cgi-bin/wac\.cgi\?<br>
5779 login.yahoo.com<br>
5780 </P
5785 > Turn on page filtering according to rules in the defined sections
5789 >, and make one exception for
5796 CLASS="LITERALLAYOUT"
5797 > # Run everything through the filter file, using only certain<br>
5798 # specified sections:<br>
5800 # other actions normally listed here...<br>
5801 +filter{html-annoyances} +filter{js-annoyances} +filter{kill-popups}\<br>
5802 +filter{webbugs} +filter{nimda} +filter{banners-by-size}<br>
5804 / #match all URLs<br>
5805 <br>
5806 # Then disable filtering of code from all sourceforge domains!<br>
5808 .sourceforge.net<br>
5809 </P
5814 > Now some URLs that we want <SPAN
5817 > (normally generates
5821 > banner). Typically, the <SPAN
5825 action is off by default in the upper section of an actions file, then enabled
5826 against certain URLs and patterns in the lower part of the file. Many of these use <A
5827 HREF="appendix.html#REGEX"
5828 >regular expressions</A
5829 > that will expand to match multiple
5835 CLASS="LITERALLAYOUT"
5836 > # Blocklist:<br>
5837 {+block}<br>
5838 ad*.<br>
5839 .*ads.<br>
5840 banner?.<br>
5841 count*.<br>
5842 /.*count(er)?\.(pl|cgi|exe|dll|asp|php[34]?)<br>
5843 /(?:.*/)?(publicite|werbung|rekla(ma|me|am)|annonse|maino(kset|nta|s)?)/<br>
5844 .hitbox.com <br>
5845 /.*/(ng)?adclient\.cgi<br>
5846 /.*/(plain|live|rotate)[-_.]?ads?/<br>
5847 /.*/abanners/<br>
5848 /autoads/<br>
5849 </P
5854 > Note that many of these actions have the potential to cause a page to
5855 misbehave, possibly even not to display at all. There are many ways
5856 a site designer may choose to design his site, and what HTTP header
5857 content, and other criteria, he may depend on. There is no way to have hard
5858 and fast rules for all sites. See the <A
5859 HREF="appendix.html#ACTIONSANAT"
5861 > for a brief example on troubleshooting
5884 >, can be defined by combining other <SPAN
5888 These can in turn be invoked just like the built-in <SPAN
5892 Currently, an alias can contain any character except space, tab, <SPAN
5902 >. But please use only <SPAN
5922 >. Alias names are not case sensitive, and
5925 >must be defined before other actions</I
5927 actions file! And there can only be one set of <SPAN
5931 defined per file. Each actions file may have its own aliases, but they are
5932 only visible within that file.</P
5934 > Now let's define a few aliases:</P
5939 CLASS="LITERALLAYOUT"
5940 > # Useful custom aliases we can use later. These must come first!<br>
5942 +prevent-cookies = +prevent-setting-cookies +prevent-reading-cookies<br>
5943 -prevent-cookies = -prevent-setting-cookies -prevent-reading-cookies<br>
5944 fragile = -block -prevent-cookies -filter -fast-redirects -hide-referer -kill-popups<br>
5945 shop = -prevent-cookies -filter -fast-redirects<br>
5946 +imageblock = +block +handle-as-image<br>
5948 # Aliases defined from other aliases, for people who don't like to type <br>
5949 # too much: ;-)<br>
5950 c0 = +prevent-cookies<br>
5951 c1 = -prevent-cookies<br>
5952 #... etc. Customize to your heart's content.<br>
5953 </P
5958 > Some examples using our <SPAN
5965 aliases from above. These would appear in the lower sections of an
5966 actions file as exceptions to the default actions (as defined in the
5972 CLASS="LITERALLAYOUT"
5973 > # These sites are very complex and require<br>
5974 # minimal interference.<br>
5976 .office.microsoft.com<br>
5977 .windowsupdate.microsoft.com<br>
5978 .nytimes.com<br>
5980 # Shopping sites - but we still want to block ads.<br>
5982 .quietpc.com<br>
5983 .worldpay.com # for quietpc.com<br>
5984 .scan.co.uk<br>
5986 # These shops require pop-ups also <br>
5987 {shop -kill-popups}<br>
5988 .dabs.com<br>
5989 .overclockers.co.uk<br>
5990 </P
6001 > aliases are often used for
6005 > sites that require most actions to be disabled
6006 in order to function properly. </P
6015 >7.5. The Filter File</A
6018 > Any web page can be dynamically modified with the filter file. This
6019 modification can be removal, or re-writing, of any web page content,
6020 including tags and non-visible content. The default filter file is
6024 >, located in the config directory. </P
6026 > This is potentially a very powerful feature, and requires knowledge of both
6029 >"regular expressions"</SPAN
6030 > and HTML in order to create custom
6031 filters. But, there are a number of useful filters included with
6035 > for many common situations.</P
6037 > The included example file is divided into sections. Each section begins
6041 > keyword, followed by the identifier
6042 for that section, e.g. <SPAN
6044 >"FILTER: webbugs"</SPAN
6045 >. Each section performs
6046 a similar type of filtering, such as <SPAN
6048 >"html-annoyances"</SPAN
6051 > This file uses regular expressions to alter or remove any string in the
6052 target page. The expressions can only operate on one line at a time. Some
6053 examples from the included default <TT
6058 > Stop web pages from displaying annoying messages in the status bar by
6059 deleting such references:</P
6064 CLASS="LITERALLAYOUT"
6065 > FILTER: html-annoyances<br>
6067 # New browser windows should be resizeable and have a location and status<br>
6068 # bar. Make it so.<br>
6070 s/resizable="?(no|0)"?/resizable=1/ig s/noresize/yesresize/ig<br>
6071 s/location="?(no|0)"?/location=1/ig s/status="?(no|0)"?/status=1/ig<br>
6072 s/scrolling="?(no|0|Auto)"?/scrolling=1/ig<br>
6073 s/menubar="?(no|0)"?/menubar=1/ig <br>
6075 # The &lt;BLINK&gt; tag was a crime!<br>
6077 s*&lt;blink&gt;|&lt;/blink&gt;**ig<br>
6079 # Is this evil? <br>
6081 #s/framespacing="?(no|0)"?//ig<br>
6082 #s/margin(height|width)=[0-9]*//gi<br>
6083 </P
6088 > Just for kicks, replace any occurrence of <SPAN
6095 >, and have a little fun with topical buzzwords: </P
6100 CLASS="LITERALLAYOUT"
6101 > FILTER: fun<br>
6103 s/microsoft(?!.com)/MicroSuck/ig<br>
6105 # Buzzword Bingo:<br>
6107 s/industry-leading|cutting-edge|award-winning/&lt;font color=red&gt;&lt;b&gt;BINGO!&lt;/b&gt;&lt;/font&gt;/ig<br>
6108 </P
6113 > Kill those pesky little web-bugs:</P
6118 CLASS="LITERALLAYOUT"
6119 > # webbugs: Squish WebBugs (1x1 invisible GIFs used for user tracking)<br>
6120 FILTER: webbugs<br>
6122 s/&lt;img\s+[^&gt;]*?(width|height)\s*=\s*['"]?1\D[^&gt;]*?(width|height)\s*=\s*['"]?1(\D[^&gt;]*?)?&gt;/&lt;!-- Squished WebBug --&gt;/sig<br>
6123 </P
6140 > displays one of its internal
6141 pages, such as a 404 Not Found error page, it uses the appropriate template.
6142 On Linux, BSD, and Unix, these are located in
6145 >/etc/privoxy/templates</TT
6146 > by default. These may be
6147 customized, if desired. <TT
6151 used to control the HTML attributes (fonts, etc).</P
6156 > banner page with the bright red top
6157 banner, is called just <SPAN
6164 may be customized or replaced with something else if desired. </P
6220 >Contacting the Developers, Bug Reporting and Feature