4 >Privoxy Configuration</TITLE
7 CONTENT="Modular DocBook HTML Stylesheet Version 1.64
10 TITLE="Privoxy User Manual"
11 HREF="index.html"><LINK
13 TITLE="Quickstart to Using Privoxy"
14 HREF="quickstart.html"><LINK
16 TITLE="Contacting the Developers, Bug Reporting and Feature
18 HREF="contact.html"><LINK
21 HREF="../p_doc.css"></HEAD
40 >Privoxy User Manual</TH
48 HREF="quickstart.html"
84 > configuration is stored
85 in text files. These files can be edited with a text editor.
86 Many important aspects of <SPAN
90 also be controlled easily with a web browser.
99 >5.1. Controlling <SPAN
102 > with Your Web Browser</A
108 >'s user interface can be reached through the special
110 HREF="http://config.privoxy.org/"
112 >http://config.privoxy.org/</A
119 which is a built-in page and works without Internet access.
120 You will see the following section: </P
130 > Please choose from the following options:
133 * Show information about the current configuration
134 * Show the source code version numbers
135 * Show the request headers.
136 * Show which actions apply to a URL and why
137 * Toggle Privoxy on or off
138 * Edit the actions list
146 > This should be self-explanatory. Note the last item is an editor for the
149 >"actions list"</SPAN
150 >, which is where much of the ad, banner, cookie,
151 and URL blocking magic is configured as well as other advanced features of
155 >. This is an easy way to adjust various
159 > configuration. The actions
160 file, and other configuration files, are explained in detail below. </P
164 >"Toggle Privoxy On or Off"</SPAN
165 > is handy for sites that might
166 have problems with your current actions and filters. You can in fact use
167 it as a test to see whether it is <SPAN
171 causing the problem or not. <SPAN
175 to run as a proxy in this case, but all filtering is disabled. There
177 HREF="appendix.html#BOOKMARKLETS"
180 that you can toggle <SPAN
183 > with one click from
192 >5.2. Configuration Files Overview</A
195 > For Unix, *BSD and Linux, all configuration files are located in
199 > by default. For MS Windows, OS/2, and
200 AmigaOS these are all in the same directory as the
204 > executable. The names
205 and number of configuration files have changed from previous versions, and are
206 subject to change as development progresses.</P
208 > The installed defaults provide a reasonable starting point, though possibly
209 aggressive by some standards. For the time being, there are only three
210 default configuration files (this may change in time):</P
217 > The main configuration file is named <TT
221 on Linux, Unix, BSD, OS/2, and AmigaOS and <TT
233 > (the actions file) is used to define
234 which of a set of various <SPAN
237 > relating to images, banners,
238 pop-ups, access restrictions, and cookies are to be applied, and where.
239 There is a web based editor for this file that can be accessed at <A
240 HREF="http://config.privoxy.org/edit-actions/"
242 >http://config.privoxy.org/edit-actions/</A
245 HREF="http://p.p/edit-actions/"
247 >http://p.p/edit-actions/</A
249 (Other actions files are included as well with differing levels of filtering
250 and blocking, e.g. <TT
261 > (the filter file) can be used to re-write the raw
262 page content, including viewable text as well as embedded HTML and JavaScript,
263 and whatever else lurks on any given web page. The filtering jobs are only
264 pre-defined here; whether to apply them or not is up to the actions file.
270 > All files use the <SPAN
276 > character to denote a
277 comment (the rest of the line will be ignored) and understand line continuation
278 through placing a backslash ("<TT
281 >") as the very last character
282 in a line. If the <TT
285 > is preceded by a backslash, it loses
286 its special function. Placing a <TT
289 > in front of an otherwise
290 valid configuration line to prevent it from being interpreted is called "commenting
300 can use Perl style <A
301 HREF="appendix.html#REGEX"
302 >regular expressions</A
304 maximum flexibility. </P
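The comment and line-continuation rules described above can be illustrated with a small Python sketch. This is not Privoxy's own parser: the function name logical_lines, the option name example-option, and the choice to join continued lines before stripping comments are assumptions made only for illustration.

```python
import re

# An unescaped "#" starts a comment; "\#" is a literal "#".
_COMMENT = re.compile(r"(?<!\\)#.*")


def logical_lines(text):
    """Return the effective configuration lines of `text` (illustrative only)."""
    result = []
    pending = ""
    for raw in text.splitlines():
        # A backslash as the very last character continues the line.
        if raw.endswith("\\"):
            pending += raw[:-1]
            continue
        line = pending + raw
        pending = ""
        # Drop the comment, then turn the escaped "\#" back into "#".
        line = _COMMENT.sub("", line).replace("\\#", "#")
        # Keyword and values are separated by any amount of whitespace.
        line = " ".join(line.split())
        if line:
            result.append(line)
    return result


SAMPLE = (
    "confdir /etc/privoxy   # where the other files live\n"
    "actionsfile \\\n"
    "    default.action\n"
    "# a whole-line comment\n"
    "example-option value\\#1\n"  # hypothetical option, shows the escaped '#'
)

if __name__ == "__main__":
    for entry in logical_lines(SAMPLE):
        print(entry)
```

Commenting out a line then simply means that the whole line is matched by the comment rule and nothing is left to interpret.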
306 > After making any changes, there is no need to restart
310 > in order for the changes to take
314 > detects such changes
315 automatically. Note, however, that it may take one or two additional
316 requests for the change to take effect. When changing the listening address
324 must obviously be sent to the <I
327 > listening address.</P
329 > While under development, the configuration content is subject to change.
330 The documentation below may not be accurate by the time you read this.
331 Also, what constitutes a <SPAN
334 > setting, may change, so
335 please check all your configuration files on important issues.</P
343 >5.3. The Main Configuration File</A
346 > Again, the main configuration file is named <TT
350 Linux/Unix/BSD and OS/2, and <TT
354 Configuration lines consist of an initial keyword followed by a list of
355 values, all separated by whitespace (any number of spaces or tabs). For
361 CLASS="LITERALLAYOUT"
364 >confdir /etc/privoxy</I
366 </P
371 > Assigns the value <TT
378 > and thus indicates that the configuration
379 directory is named <SPAN
381 >"/etc/privoxy/"</SPAN
384 > All options in the config file except for <TT
391 > are optional. Watch out in the description below
392 for what happens if you leave them unset.</P
394 > The main config file controls all aspects of <SPAN
398 operation that are not location dependent (i.e. they apply universally, no matter
399 where you may be surfing).</P
406 >5.3.1. Configuration and Log File Locations</A
412 > can (and normally does) use a number of
413 other files for additional configuration and logging.
414 This section of the configuration file tells <SPAN
418 where to find those other files. </P
436 >The directory where the other configuration files are located</P
448 >/etc/privoxy (Unix) <I
454 > installation dir (Windows) </P
457 >Effect if unset:</DT
478 > When development goes modular and multi-user, the blocker, filter, and
479 per-user config will be stored in subdirectories of <SPAN
483 For now, the configuration directory structure is flat, except for
486 >confdir/templates</TT
487 >, where the HTML templates for CGI
488 output reside (e.g. <SPAN
514 > The directory where all logging takes place (i.e. where <TT
534 >/var/log/privoxy (Unix) <I
540 > installation dir (Windows) </P
543 >Effect if unset:</DT
573 >5.3.1.3. actionsfile</A
584 > The actions file to use
591 >File name, relative to <TT
600 >default.action (Unix) <I
603 > default.action.txt (Windows)</P
606 >Effect if unset:</DT
609 > No action is taken at all. Simple neutral proxying.
616 > There is no point in using <SPAN
620 an actions file. There are three different actions files included in the
621 distribution, with varying degrees of aggressiveness:
627 >intermediate.action</TT
644 >5.3.1.4. filterfile</A
655 > The filter file to use
662 >File name, relative to <TT
671 >default.filter (Unix) <I
674 > default.filter.txt (Windows)</P
677 >Effect if unset:</DT
680 > No textual content filtering takes place, i.e. all
690 actions in the actions file are turned off
699 >"default.filter"</SPAN
700 > file contains content modification rules
703 >"regular expressions"</SPAN
704 >. These rules permit powerful
705 changes on the content of Web pages, e.g., you could disable your favorite
706 JavaScript annoyances, re-write the actual displayed text, or just have some
714 it appears on a Web page.
737 > The log file to use
744 >File name, relative to <TT
756 > privoxy.log (Windows)</P
759 >Effect if unset:</DT
762 > No log file is used, all log messages go to the console (<TT
772 > The Windows version will additionally log to the console.
775 > The logfile is where all logging and error messages are written. The level
776 of detail and number of messages are set with the <TT
780 option (see below). The logfile can be useful for tracking down a problem with
784 > (e.g., it's not blocking an ad you
785 think it should block) but in most cases you probably will never look at it.
788 > Your logfile will grow indefinitely, and you will probably want to
789 periodically remove it. On Unix systems, you can do this with a cron job
793 >). For Red Hat, a <B
797 script has been included.
800 > On SuSE Linux systems, you can place a line like <SPAN
803 +1024k 644 nobody.nogroup"</SPAN
808 the effect that cron.daily will automatically archive, gzip, and empty the
809 log, when it exceeds 1 MB in size.
832 > The file to store intercepted cookies in
839 >File name, relative to <TT
851 > privoxy.jar (Windows)</P
854 >Effect if unset:</DT
857 > Intercepted cookies are not stored at all.
864 > The jarfile may grow to ridiculous sizes over time.
876 >5.3.1.7. trustfile</A
887 > The trust file to use
894 >File name, relative to <TT
905 >Unset (commented out)</I
906 >. When activated: trust (Unix) <I
909 > trust.txt (Windows)</P
912 >Effect if unset:</DT
915 > The whole trust mechanism is turned off.
922 > The trust mechanism is an experimental feature for building white-lists and should
923 be used with care. It is <I
926 > recommended for the casual user.
929 > If you specify a trust file, <SPAN
933 access to sites that are named in the trustfile.
934 You can also mark sites as trusted referrers (with <TT
938 the effect that access to untrusted sites will be granted, if a link from a
939 trusted referrer was used.
940 The link target will then be added to the <SPAN
944 Possible applications include limiting Internet access for children.
950 > operator in the trust file, it may grow considerably over time.
963 >5.3.2. Local Set-up Documentation</A
966 > If you intend to operate <SPAN
970 than just yourself, it might be a good idea to let them know how to reach
971 you, what you block and why you do that, your policies etc.
979 >5.3.2.1. trust-info-url</A
990 > A URL to be displayed in the error page that users will see if access to an untrusted page is denied.
1003 >Two example URLs are provided</P
1006 >Effect if unset:</DT
1009 > No links are displayed on the "untrusted" error page.
1016 > The value of this option only matters if the experimental trust mechanism has been
1023 > If you use the trust mechanism, it is a good idea to write up some on-line
1024 documentation about your trust policy and to specify the URL(s) here.
1025 Use multiple times for multiple URLs.
1028 > The URL(s) should be added to the trustfile as well, so users don't end up
1029 locked out from the information on why they were locked out in the first place!
1041 >5.3.2.2. admin-address</A
1046 CLASS="VARIABLELIST"
1052 > An email address to reach the proxy administrator.
1071 >Effect if unset:</DT
1074 > No email address is displayed on error pages and the CGI user interface.
1088 are unset, the whole "Local Privoxy Support" box on all generated pages will
1101 >5.3.2.3. proxy-info-url</A
1106 CLASS="VARIABLELIST"
1112 > A URL to documentation about the local <SPAN
1116 configuration or policies.
1135 >Effect if unset:</DT
1138 > No link to local documentation is displayed on error pages and the CGI user interface.
1152 are unset, the whole "Local Privoxy Support" box on all generated pages will
1156 > This URL shouldn't be blocked ;-)
1169 >5.3.3. Debugging</A
1172 > These options are mainly useful when tracing a problem.
1173 Note that you might also want to invoke
1181 command line option when debugging.
1194 CLASS="VARIABLELIST"
1200 > Key values that determine what information gets logged.
1213 >12289 (i.e.: URLs plus informational and warning messages)</P
1216 >Effect if unset:</DT
1219 > Nothing gets logged.
1226 > The available debug levels are:
1236 CLASS="PROGRAMLISTING"
1237 > debug 1 # show each GET/POST/CONNECT request
1238 debug 2 # show each connection status
1239 debug 4 # show I/O status
1240 debug 8 # show header parsing
1241 debug 16 # log all data into the logfile
1242 debug 32 # debug force feature
1243 debug 64 # debug regular expression filter
1244 debug 128 # debug fast redirects
1245 debug 256 # debug GIF de-animation
1246 debug 512 # Common Log Format
1247 debug 1024 # debug kill pop-ups
1248 debug 4096 # Startup banner and warnings.
1249 debug 8192 # Non-fatal errors
1257 > To select multiple debug levels, you can either add them or use
1264 > A debug level of 1 is informative because it will show you each request
1267 >1, 4096 and 8192 are highly recommended</I
1269 so that you will notice when things go wrong. The other levels are probably
1270 only of interest if you are hunting down a specific problem. They can produce
1271 a huge amount of output (especially 16).
1275 > The reporting of <I
1278 > errors (i.e. ones which crash
1282 >) is always on and cannot be disabled.
1285 > If you want to use CLF (Common Log Format), you should set <SPAN
1292 > and not enable anything else.
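Because each debug level is a distinct power of two, adding levels together is the same as setting individual bits. The following Python sketch shows how the documented default of 12289 breaks down; the helper names combine and decode are made up for this illustration and are not part of Privoxy.

```python
# Debug levels as listed in the table above.
DEBUG_LEVELS = {
    1: "show each GET/POST/CONNECT request",
    2: "show each connection status",
    4: "show I/O status",
    8: "show header parsing",
    16: "log all data into the logfile",
    32: "debug force feature",
    64: "debug regular expression filter",
    128: "debug fast redirects",
    256: "debug GIF de-animation",
    512: "Common Log Format",
    1024: "debug kill pop-ups",
    4096: "Startup banner and warnings",
    8192: "Non-fatal errors",
}


def combine(*levels):
    """Add up individual levels into one value for a single debug line."""
    return sum(set(levels))


def decode(value):
    """List the individual levels contained in a combined value."""
    return [bit for bit in sorted(DEBUG_LEVELS) if value & bit]


if __name__ == "__main__":
    print(combine(1, 4096, 8192))
    for bit in decode(12289):
        print(bit, DEBUG_LEVELS[bit])
```

The default value 12289 is therefore 1 + 4096 + 8192, i.e. the three levels that are recommended above.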
1304 >5.3.3.2. single-threaded</A
1309 CLASS="VARIABLELIST"
1315 > Whether to run only one server thread
1337 >Effect if unset:</DT
1340 > Multi-threaded (or, where unavailable: forked) operation, i.e. the ability to
1341 serve multiple requests simultaneously.
1348 > This option is only there for debug purposes and you should never
1351 >It will drastically reduce performance.</I
1365 >5.3.4. Access Control and Security</A
1368 > This section of the config file controls the security-relevant aspects
1380 >5.3.4.1. listen-address</A
1385 CLASS="VARIABLELIST"
1391 > The IP address and TCP port on which <SPAN
1395 listen for client requests.
1421 >Effect if unset:</DT
1424 > Bind to localhost (127.0.0.1), port 8118. This is suitable and recommended for
1425 home users who run <SPAN
1428 > on the same machine as
1436 > You will need to configure your browser(s) to this proxy address and port.
1439 > If you already have another service running on port 8118, or if you want to
1440 serve requests from other machines (e.g. on your local network) as well, you
1441 will need to override the default.
1444 > If you leave out the IP address, <SPAN
1448 bind to all interfaces (addresses) on your machine and may become reachable
1449 from the Internet. In that case, consider using access control lists (ACLs)
1453 > below), or a firewall.
1460 > Suppose you are running <SPAN
1464 a machine which has the address 192.168.0.1 on your local private network
1465 (192.168.0.0) and has another outside connection with a different address.
1466 You want it to serve requests from inside only:
1476 CLASS="PROGRAMLISTING"
1477 > listen-address 192.168.0.1:8118
1499 CLASS="VARIABLELIST"
1505 > Initial state of "toggle" status
1521 >Effect if unset:</DT
1524 > Act as if toggled on
1531 > If set to 0, <SPAN
1537 >"toggled off"</SPAN
1538 > mode, i.e. behave like a normal, content-neutral
1541 >enable-remote-toggle</TT
1543 below. This is not really useful anymore, since toggling is much easier
1545 HREF="http://config.privoxy.org/toggle"
1549 > than via editing the <TT
1555 > The Windows version will only display the toggle icon in the system tray
1556 if this option is present.
1568 >5.3.4.3. enable-remote-toggle</A
1573 CLASS="VARIABLELIST"
1579 > Whether or not the <A
1580 HREF="http://config.privoxy.org/toggle"
1600 >Effect if unset:</DT
1603 > The web-based toggle feature is disabled.
1610 > When toggled off, <SPAN
1613 > acts like a normal,
1614 content-neutral proxy, i.e. it acts as if none of the actions applied to
1618 > For the time being, access to the toggle feature can <I
1622 controlled separately by <SPAN
1625 > or HTTP authentication,
1626 so that everybody who can access <SPAN
1637 toggle it for all users. So this option is <I
1641 for multi-user environments with untrusted users.
1644 > Note that you must have compiled <SPAN
1648 support for this feature, otherwise this option has no effect.
1660 >5.3.4.4. enable-edit-actions</A
1665 CLASS="VARIABLELIST"
1671 > Whether or not the <A
1672 HREF="http://config.privoxy.org/edit-actions"
1692 >Effect if unset:</DT
1695 > The web-based actions file editor is disabled.
1702 > For the time being, access to the editor can <I
1706 controlled separately by <SPAN
1709 > or HTTP authentication,
1710 so that everybody who can access <SPAN
1721 modify its configuration for all users. So this option is <I
1725 > for multi-user environments with untrusted users.
1728 > Note that you must have compiled <SPAN
1732 support for this feature, otherwise this option has no effect.
1744 >5.3.4.5. ACLs: permit-access and deny-access</A
1749 CLASS="VARIABLELIST"
1755 > Who can access what.
1797 > are IP addresses in dotted decimal notation or valid
1809 > are subnet masks in CIDR notation, i.e. integer
1810 values from 2 to 30 representing the length (in bits) of the network address. The masks and the whole
1811 destination part are optional.
1824 >Effect if unset:</DT
1827 > Don't restrict access further than implied by <TT
1837 > Access controls are included at the request of ISPs and systems
1838 administrators, and <I
1840 >are not usually needed by individual users</I
1842 For a typical home user, it will normally suffice to ensure that
1846 > only listens on the localhost or internal (home)
1847 network address by means of the <TT
1853 > Please see the warnings in the FAQ that this proxy is not intended to be a substitute
1854 for a firewall or to encourage anyone to defer addressing basic security
1858 > Multiple ACL lines are OK.
1859 If any ACLs are specified, then the <SPAN
1863 talks only to IP addresses that match at least one <TT
1867 and don't match any subsequent <TT
1870 > line. In other words, the
1871 last match wins, with the default being <TT
1880 > is using a forwarder (see <TT
1884 for a particular destination URL, the <TT
1890 that is examined is the address of the forwarder and <I
1894 of the ultimate target. This is necessary because it may be impossible for the local
1898 > to determine the IP address of the
1899 ultimate target (that's often what gateways are used for).
1902 > You should prefer using IP addresses over DNS names, because the address lookups take
1903 time. All DNS names must resolve! You can <I
1906 > use domain patterns
1910 > or partial domain names. If a DNS name resolves to multiple
1911 IP addresses, only the first one is used.
1914 > Denying access to particular sites by ACL may have undesired side effects
1915 if the site in question is hosted on a machine which also hosts other sites.
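The last-match-wins evaluation of ACL lines described above can be sketched in Python with the standard ipaddress module. This is only an illustration of the rule, not Privoxy's code: the tuple layout, the function name allowed, and the address 203.0.113.10 (a documentation address standing in for a blocked destination host) are assumptions.

```python
import ipaddress

# (action, source, destination); destination None means "any destination".
RULES = [
    ("permit", "192.168.45.64/26", None),
    ("deny", "192.168.45.73", "203.0.113.10"),  # hypothetical destination address
]


def _contains(spec, address):
    """True if `address` lies inside the network written as `spec`."""
    return address in ipaddress.ip_network(spec, strict=False)


def allowed(rules, src, dst):
    """Apply ACL lines in order; the last matching line decides."""
    if not rules:
        return True  # no ACLs at all: access is not restricted further
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    verdict = False  # once any ACL exists, the default is to deny
    for action, src_spec, dst_spec in rules:
        if not _contains(src_spec, src_ip):
            continue
        if dst_spec is not None and not _contains(dst_spec, dst_ip):
            continue
        verdict = action == "permit"
    return verdict


if __name__ == "__main__":
    print(allowed(RULES, "192.168.45.70", "203.0.113.10"))
    print(allowed(RULES, "192.168.45.73", "203.0.113.10"))
```

A client that matches no line at all is refused, which is why a single permit-access line is enough to lock out every other address.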
1922 > Explicitly define the default behavior if no ACL and
1930 is OK. The absence of a <TT
1939 > destination addresses are OK:
1950 > permit-access localhost
1958 > Allow any host on the same class C subnet as www.privoxy.org access to
1959 nothing but www.example.com:
1970 > permit-access www.privoxy.org/24 www.example.com/32
1978 > Allow access from any host on the 26-bit subnet 192.168.45.64 to anywhere,
1979 with the exception that 192.168.45.73 may not access www.dirty-stuff.example.com:
1990 > permit-access 192.168.45.64/26
1991 deny-access 192.168.45.73 www.dirty-stuff.example.com
2008 >5.3.4.6. buffer-limit</A
2013 CLASS="VARIABLELIST"
2019 > Maximum size of the buffer for content filtering.
2035 >Effect if unset:</DT
2038 > Use a 4MB (4096 KB) limit.
2045 > For content filtering, i.e. the <TT
2052 > actions, it is necessary that
2056 > buffers the entire document body.
2057 This can be potentially dangerous, since a server could just keep sending
2058 data indefinitely and wait for your RAM to be exhausted -- with nasty consequences.
2062 > When a document buffer size reaches the <TT
2066 flushed to the client unfiltered and no further attempt to
2067 filter the rest of the document is made. Remember that there may be multiple threads
2068 running, which might require up to <TT
2075 >, unless you have enabled <SPAN
2077 >"single-threaded"</SPAN
2092 >5.3.5. Forwarding</A
2095 > This feature allows routing of HTTP requests through a chain of
2097 It can be used to better protect privacy and confidentiality when
2098 accessing specific domains by routing requests to those domains
2099 through an anonymous public proxy (see e.g. <A
2100 HREF="http://www.multiproxy.org/anon_list.htm"
2102 >http://www.multiproxy.org/anon_list.htm</A
2104 It can also be used with a caching proxy to speed up browsing. Chaining to a parent
2105 proxy may be necessary because the machine that <SPAN
2109 runs on has no direct Internet access.</P
2111 > Also specified here are SOCKS proxies. <SPAN
2115 supports the SOCKS 4 and SOCKS 4A protocols.</P
2122 >5.3.5.1. forward</A
2127 CLASS="VARIABLELIST"
2133 > To which parent HTTP proxy specific requests should be routed.
2169 > is a domain name pattern (see the
2170 chapter on domain matching in the actions file),
2176 > is the address of the parent HTTP proxy
2177 as an IP address in dotted decimal notation or as a valid DNS name (or <SPAN
2183 >"no forwarding"</SPAN
2190 > parameters are TCP ports, i.e. integer
2191 values from 1 to 65535
2204 >Effect if unset:</DT
2207 > Don't use parent HTTP proxies.
2222 >, then requests are not
2223 forwarded to another HTTP proxy but are made directly to the web servers.
2226 > Multiple lines are OK; they are checked in sequence, and the last match wins.
2233 > Everything goes to an example anonymizing proxy, except SSL on port 443 (which it doesn't handle):
2244 > forward .* anon-proxy.example.org:8080
2253 > Everything goes to our example ISP's caching proxy, except for requests
2254 to that ISP's sites:
2265 > forward .*. caching-proxy.example-isp.net:8000
2266 forward .example-isp.net .
2283 >5.3.5.2. forward-socks4 and forward-socks4a</A
2288 CLASS="VARIABLELIST"
2294 > Through which SOCKS proxy (and to which parent HTTP proxy) specific requests should be routed.
2341 > is a domain name pattern (see the
2342 chapter on domain matching in the actions file),
2354 are IP addresses in dotted decimal notation or valid DNS names (<TT
2365 >"no HTTP forwarding"</SPAN
2366 >), and the optional
2372 > parameters are TCP ports, i.e. integer values from 1 to 65535
2385 >Effect if unset:</DT
2388 > Don't use SOCKS proxies.
2395 > Multiple lines are OK; they are checked in sequence, and the last match wins.
2398 > The difference between <TT
2403 >forward-socks4a</TT
2405 is that in the SOCKS 4A protocol, the DNS resolution of the target hostname happens on the SOCKS
2406 server, while in SOCKS 4 it happens locally.
2417 >, then requests are not
2418 forwarded to another HTTP proxy but are made (HTTP-wise) directly to the web servers, albeit through
2426 > From the company example.com, direct connections are made to all
2430 > domains, but everything outbound goes through
2431 their ISP's proxy by way of example.com's corporate SOCKS 4A gateway to
2443 > forward-socks4a .*. socks-gw.example.com:1080 www-cache.example-isp.net:8080
2444 forward .example.com .
2452 > A rule that uses a SOCKS 4 gateway for all destinations but no HTTP parent looks like this:
2463 > forward-socks4 .*. socks-gw.example.com:1080 .
2480 >5.3.5.3. Advanced Forwarding Examples</A
2483 > If you have links to multiple ISPs that provide various special content
2484 only to their subscribers, you can configure multiple <SPAN
2488 which have connections to the respective ISPs to act as forwarders to each other, so that
2492 > users can see the internal content of all ISPs.</P
2494 > Assume that host-a has a PPP connection to isp-a.net. And host-b has a PPP connection to
2495 isp-b.net. Both run <SPAN
2499 configuration can look like this:</P
2512 forward .isp-b.net host-b:8118
2530 forward .isp-a.net host-a:8118
2537 > Now, your users can set their browser's proxy to use either
2538 host-a or host-b and be able to browse the internal content
2539 of both isp-a and isp-b.</P
2541 > If you intend to chain <SPAN
2548 > locally, then chain as
2551 >browser -> squid -> privoxy</TT
2552 > is the recommended way. </P
2554 > Assuming that <SPAN
2561 run on the same box, your squid configuration could then look like this:</P
2571 > # Define Privoxy as parent proxy (without ICP)
2572 cache_peer 127.0.0.1 parent 8118 7 no-query
2574 # Define ACL for protocol FTP
2577 # Do not forward FTP requests to Privoxy
2578 always_direct allow ftp
2580 # Forward all the rest to Privoxy
2581 never_direct allow all
2588 > You would then need to change your browser's proxy settings to <SPAN
2591 >'s address and port.
2592 Squid normally uses port 3128. If unsure, consult <TT
2607 >5.3.6. Windows GUI Options</A
2613 > has a number of options specific to the
2614 Windows GUI:</P
2618 >"activity-animation"</SPAN
2623 > icon will animate when
2627 > is active. To turn off, set to 0.</P
2632 CLASS="LITERALLAYOUT"
2635 >activity-animation 1</I
2637 </P
2644 >"log-messages"</SPAN
2649 > will log messages to the console
2655 CLASS="LITERALLAYOUT"
2660 </P
2668 >"log-buffer-size"</SPAN
2669 > is set to 1, the size of the log buffer,
2670 i.e. the amount of memory used for the log messages displayed in the
2671 console window, will be limited to <SPAN
2673 >"log-max-lines"</SPAN
2676 > Warning: Setting this to 0 lets the buffer grow without limit and
2677 eat up all your memory!</P
2682 CLASS="LITERALLAYOUT"
2685 >log-buffer-size 1</I
2687 </P
2694 >log-max-lines</SPAN
2695 > is the maximum number of lines held
2696 in the log buffer. See above.</P
2701 CLASS="LITERALLAYOUT"
2704 >log-max-lines 200</I
2706 </P
2713 >"log-highlight-messages"</SPAN
2718 > will highlight portions of the log
2719 messages with a bold-faced font:</P
2724 CLASS="LITERALLAYOUT"
2727 >log-highlight-messages 1</I
2729 </P
2734 > The font used in the console window:</P
2739 CLASS="LITERALLAYOUT"
2742 >log-font-name Comic Sans MS</I
2744 </P
2749 > Font size used in the console window:</P
2754 CLASS="LITERALLAYOUT"
2759 </P
2767 >"show-on-task-bar"</SPAN
2768 > controls whether or not
2772 > will appear as a button on the Task bar
2778 CLASS="LITERALLAYOUT"
2781 >show-on-task-bar 0</I
2783 </P
2790 >"close-button-minimizes"</SPAN
2791 > is set to 1, the Windows close
2792 button will minimize <SPAN
2795 > instead of closing
2796 the program (close with the exit option on the File menu).</P
2801 CLASS="LITERALLAYOUT"
2804 >close-button-minimizes 1</I
2806 </P
2813 >"hide-console"</SPAN
2814 > option is specific to the MS-Win console
2818 >. If this option is used,
2822 > will disconnect from and hide the
2828 CLASS="LITERALLAYOUT"
2829 > #hide-console<br>
2830 </P
2842 >5.4. The Actions File</A
2845 > The actions file (<TT
2856 to define what actions <SPAN
2860 URLs, and thus determines how ad images, cookies and various other aspects
2861 of HTTP content and transactions are handled on which sites (or even parts
2865 Anything you want can be blocked, including ads, banners, or just some obnoxious
2866 URL that you would rather not see. Cookies can be accepted or rejected, or
2867 accepted only during the current browser session (i.e. not written to disk),
2868 content can be modified, JavaScripts tamed, user-tracking fooled, and much more.
2869 See below for a complete list of available actions.</P
2871 > An actions file typically has sections. At the top, <SPAN
2875 defined (discussed below), then the default set of rules which will apply
2876 universally to all sites and pages. And then below that is generally a lengthy
2877 set of exceptions to the defined universal policies.</P
2884 >5.4.1. Finding the Right Mix</A
2887 > Note that some actions, like cookie suppression or script disabling, may
2888 render unusable those sites that rely on these techniques to work properly.
2889 Finding the right mix of actions is not easy and certainly a matter of personal
2890 taste. In general, it can be said that the more <SPAN
2894 your default settings (in the top section of the actions file) are,
2895 the more exceptions for <SPAN
2898 > sites you will have to
2899 make later. If, for example, you want to kill popup windows by default, you'll
2900 have to make exceptions from that rule for sites that you regularly use
2901 and that require popups for actually useful content, like maybe your bank,
2902 favorite shop, or newspaper.</P
2904 > We have tried to provide you with reasonable rules to start from in the
2905 distribution actions file. But there is no general rule of thumb on these
2906 things. There just are too many variables, and sites are constantly changing.
2907 Sooner or later you will want to change the rules (and read this chapter).</P
2915 >5.4.2. How to Edit</A
2918 > The easiest way to edit the <SPAN
2921 > file is with a browser by
2922 using our browser-based editor, which is available at <A
2923 HREF="http://config.privoxy.org/edit-actions"
2925 >http://config.privoxy.org/edit-actions</A
2928 > If you prefer plain text editing to GUIs, you can of course also directly edit the
2940 >5.4.3. How Actions are Applied to URLs</A
2943 > The actions file is divided into sections. There are special sections,
2947 > sections which will be discussed later. For now
2948 let's concentrate on regular sections: They have a heading line (often split
2949 over multiple lines for readability) which consists of a list of actions,
2950 separated by whitespace and enclosed in curly braces. Below that, there
2951 is a list of URL patterns, each on a separate line.</P
2953 > To determine which actions apply to a request, the URL of the request is
2954 compared to all patterns in this file. Every time it matches, the list of
2955 applicable actions for the URL is incrementally updated, using the heading
2956 of the section in which the pattern is located. If multiple matches for
2957 the same URL set the same action differently, the last match wins.</P
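The incremental update described above can be sketched in a few lines of Python. The section contents, the action names used as dictionary keys, and the use of shell-style globs in place of the real URL pattern syntax are simplifying assumptions for this illustration only.

```python
from fnmatch import fnmatchcase

# Each section: (actions set by its heading, URL patterns listed below it).
# Globs stand in for the real pattern syntax; the settings are hypothetical.
SECTIONS = [
    ({"block": False, "filter": True}, ["*"]),
    ({"block": True}, ["ads.example.com*"]),
    ({"filter": False}, ["*.example.com/forum*"]),
]


def actions_for(url, sections):
    """Walk all sections in file order; later matches override earlier ones."""
    current = {}
    for actions, patterns in sections:
        for pattern in patterns:
            if fnmatchcase(url, pattern):
                current.update(actions)
    return current


if __name__ == "__main__":
    print(actions_for("ads.example.com/banner.gif", SECTIONS))
    print(actions_for("www.example.com/forum/1", SECTIONS))
```

Only the actions named in a matching section's heading are touched; everything set by earlier matches stays as it was.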
2959 > You can trace this process by visiting <A
2960 HREF="http://config.privoxy.org/show-url-info"
2962 >http://config.privoxy.org/show-url-info</A
2965 > More detail on this is provided in the Appendix, <A
2966 HREF="appendix.html#ACTIONSANAT"
2967 > Anatomy of an Action</A
2979 > Generally, a pattern has the form <TT
2981 ><domain>/<path></TT
2985 ><domain></TT
2990 are optional. (This is why the pattern <TT
2993 > matches all URLs).</P
2997 CLASS="VARIABLELIST"
3002 >www.example.com/</TT
3006 > is a domain-only pattern and will match any request to <TT
3008 >www.example.com</TT
3010 regardless of which document on that server is requested.
3016 >www.example.com</TT
3020 > means exactly the same. For domain-only patterns, the trailing <TT
3030 >www.example.com/index.html</TT
3034 > matches only the single document <TT
3040 >www.example.com</TT
3051 > matches the document <TT
3054 >, regardless of the domain,
3068 > matches nothing, since it would be interpreted as a domain name and
3069 there is no top-level domain called <TT
3083 >5.4.4.1. The Domain Pattern</A
3086 > The matching of the domain part offers some flexible options: if the
3087 domain starts or ends with a dot, it becomes unanchored at that end.
3092 CLASS="VARIABLELIST"
3101 > matches any domain that <I
3118 > matches any domain that <I
3135 > matches any domain that <I
3142 (Correctly speaking: It matches any FQDN that contains <TT
3151 > Additionally, there are wild-cards that you can use in the domain names
3152 themselves. They work much like shell wild-cards: <SPAN
3156 stands for zero or more arbitrary characters, <SPAN
3160 any single character, you can define character classes in square
3161 brackets and all of that can be freely mixed:</P
3165 CLASS="VARIABLELIST"
3170 >ad*.example.com</TT
3176 >"adserver.example.com"</SPAN
3180 >"ads.example.com"</SPAN
3181 >, etc but not <SPAN
3183 >"sfads.example.com"</SPAN
3190 >*ad*.example.com</TT
3194 > matches all of the above, and then some.
3210 >pictures.epix.com</TT
3213 >a.b.c.d.e.upix.com</TT
3220 >www[1-9a-ez].example.c*</TT
3226 >www1.example.com</TT
3230 >www4.example.cc</TT
3233 >wwwd.example.cy</TT
3237 >wwwz.example.com</TT
3244 >wwww.example.com</TT
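The domain matching rules above (anchoring controlled by leading and trailing dots, plus shell-like wild-cards) can be approximated in Python. This sketch compares the pattern label by label using fnmatch; that approach, the function name domain_matches, and the treatment of a leading-dot pattern as also matching the bare domain are assumptions of this illustration, not a description of Privoxy's implementation.

```python
from fnmatch import fnmatchcase


def domain_matches(pattern, host):
    """Approximate domain pattern matching, one dot-separated label at a time."""
    pattern, host = pattern.lower(), host.lower()
    open_left = pattern.startswith(".")   # unanchored at the left end
    open_right = pattern.endswith(".")    # unanchored at the right end
    p_labels = [label for label in pattern.split(".") if label]
    h_labels = host.split(".")
    n = len(p_labels)
    if n > len(h_labels):
        return False

    def window_ok(start):
        window = h_labels[start:start + n]
        return all(fnmatchcase(h, p) for h, p in zip(window, p_labels))

    if open_left and open_right:
        starts = range(len(h_labels) - n + 1)  # may match anywhere
    elif open_left:
        starts = [len(h_labels) - n]           # must match at the end
    elif open_right:
        starts = [0]                           # must match at the start
    else:
        if len(h_labels) != n:
            return False
        starts = [0]                           # must match the whole name
    return any(window_ok(start) for start in starts)


if __name__ == "__main__":
    for name in ("adserver.example.com", "ads.example.com", "sfads.example.com"):
        print(name, domain_matches("ad*.example.com", name))
```

Because the wild-cards are applied within a single label here, an asterisk never swallows a dot in this sketch.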
3257 >5.4.4.2. The Path Pattern</A
3263 > uses Perl compatible regular expressions
3265 HREF="http://www.pcre.org/"
3269 matching the path.</P
3272 HREF="appendix.html#REGEX"
3274 > with a brief quick-start into regular
3275 expressions, and full (very technical) documentation on PCRE regex syntax is available on-line
3277 HREF="http://www.pcre.org/man.txt"
3279 >http://www.pcre.org/man.txt</A
3281 You might also find the Perl man page on regular expressions (<TT
3285 useful, which is available on-line at <A
3286 HREF="http://www.perldoc.com/perl5.6/pod/perlre.html"
3288 >http://www.perldoc.com/perl5.6/pod/perlre.html</A
3291 > Note that the path pattern is automatically left-anchored at the <SPAN
3295 i.e. it matches as if it started with a <SPAN
3300 > Please also note that matching in the path is case
3304 > by default, but you can switch to case
3305 sensitive at any point in the pattern by using the
3312 >www.example.com/(?-i)PaTtErN.*</TT
3314 documents whose path starts with <TT
3321 > this capitalization.</P
3333 > Actions are enabled if preceded with a <SPAN
3337 preceded with a <SPAN
3346 >"do that action"</SPAN
3353 >"block the following URLs and/or patterns"</SPAN
3355 disabled by default, until they are explicitly enabled somewhere in an actions
3359 Actions are invoked by enclosing the action name in curly braces (e.g.
3360 {+some_action}), followed by a list of URLs (or patterns that match URLs) to
3361 which the action applies. There are three classes of actions: </P
3369 Boolean, i.e. the action can only be <SPAN
3382 CLASS="LITERALLAYOUT"
3386 > # enable this action<br>
3390 > # disable this action<br>
3391 </P
3400 Parameterized, e.g. <SPAN
3402 >"+/-hide-user-agent{ Mozilla 1.0 }"</SPAN
3404 where some value is required in order to enable this type of action.
3411 CLASS="LITERALLAYOUT"
3415 > # enable action and set parameter to <SPAN
3422 > # disable action (<SPAN
3425 >) can be omitted<br>
3426 </P
3436 Multi-value, e.g. <SPAN
3438 >"{+/-add-header{Name: value}}"</SPAN
3442 >"{+/-wafer{name=value}}"</SPAN
3443 >), where some value needs to be defined
3444 in addition to simply enabling the action. Examples:
3450 CLASS="LITERALLAYOUT"
3453 >{+name{param=value}}</I
3454 > # enable action and set <SPAN
3457 > to <SPAN
3463 >{-name{param=value}}</I
3464 > # remove the parameter <SPAN
3467 > completely<br>
3471 > # disable this action totally and remove <SPAN
3475 </P
3484 > If nothing is specified in this file, no <SPAN
3488 So in this case <SPAN
3492 normal, non-blocking, non-anonymizing proxy. You must specifically
3493 enable the privacy and blocking features you need (although the
3494 provided default <TT
3498 give a good starting point).</P
3500 > Actions defined later always override earlier ones, so exceptions
3501 to any rules you make should come in the latter part of the file. For
3502 multi-valued actions, the actions are applied in the order they are
3505 > The list of valid <SPAN
3520 >+add-header{Name: value}</I
3526 CLASS="VARIABLELIST"
3538 > Send a user defined HTTP header to the web server.
3542 >Possible values:</DT
3545 > Any value is possible. Validity of the defined HTTP headers is not checked.
3552 CLASS="LITERALLAYOUT"
3553 > <I
3555 >{+add-header{X-User-Tracking: sucks}}</I
3557 <I
3561 </P
3567 > This action may be specified multiple times, in order to define multiple
3568 headers. This is rarely needed for the typical user. If you don't know what
3571 >"HTTP headers"</SPAN
3572 > are, you definitely don't need to worry about this
3593 CLASS="VARIABLELIST"
3605 > Used to block a URL from reaching your browser. The URL may be
3606 anything, but this action is typically used to block ads or other obnoxious
3611 >Possible values:</DT
3620 CLASS="LITERALLAYOUT"
3621 > <I
3625 <I
3629 <I
3633 </P
3646 > page if a URL matches one of the
3647 blocked patterns. If there is sufficient space, a large red
3648 banner will appear with a friendly message about why the page
3649 was blocked, and a way to go there anyway. If there is insufficient
3650 space a smaller blocked page will appear without the red banner.
3651 One exception is if the URL matches both <SPAN
3658 >, then it can be handled by
3661 >"+image-blocker"</SPAN
3668 > action can also perform some of the
3669 same functionality as <SPAN
3672 >, but by means of very
3673 different programming techniques, and is typically used for different
3685 NAME="DEANIMATE-GIFS"
3694 CLASS="VARIABLELIST"
3706 > To stop those annoying, distracting animated GIF images.
3710 >Possible values:</DT
3726 CLASS="LITERALLAYOUT"
3727 > <I
3729 >{+deanimate-gifs{last}}</I
3731 <I
3735 </P
3741 > De-animate all animated GIF images, i.e. reduce them to a single frame.
3742 This will also shrink the images considerably (in bytes, not pixels!). If
3746 > is given, the first frame of the animation
3747 is used as the replacement. If <SPAN
3750 > is given, the last
3751 frame of the animation is used instead, which probably makes more sense for
3752 most banner animations, but also has the risk of not showing the entire
3753 last frame (if it is only a delta to an earlier frame).
3773 CLASS="VARIABLELIST"
3788 > will downgrade HTTP/1.1 client requests to
3789 HTTP/1.0 and downgrade the responses as well.
3793 >Possible values:</DT
3803 CLASS="LITERALLAYOUT"
3804 > <I
3808 <I
3812 </P
3818 > Use this action for servers that use HTTP/1.1 protocol features that
3822 > doesn't handle well yet. HTTP/1.1 is
3823 only partially implemented. Default is not to downgrade requests. This is
3824 an infrequently needed action, and is used to help with problem sites only.
3835 NAME="FAST-REDIRECTS"
3844 CLASS="VARIABLELIST"
3858 >"+fast-redirects"</SPAN
3859 > action enables interception of
3863 > requests from one server to another, which
3864 are used to track users.<SPAN
3868 all but the last valid URL in a redirect request and send a local redirect
3869 back to your browser without contacting the intermediate site(s).
3873 >Possible values:</DT
3883 CLASS="LITERALLAYOUT"
3884 > <I
3886 >{+fast-redirects}</I
3888 <I
3892 </P
3899 Many sites, like yahoo.com, don't just link to other sites. Instead, they
3900 will link to some script on their own server, giving the destination as a
3901 parameter, which will then redirect you to the final target. URLs
3902 resulting from this scheme typically look like:
3905 >http://some.place/some_script?http://some.where-else</I
3909 > Sometimes, there are even multiple consecutive redirects encoded in the
3910 URL. These redirections via scripts make your web browsing more traceable,
3911 since the server from which you follow such a link can see where you go
3912 to. Apart from that, valuable bandwidth and time are wasted while your
3913 browser asks the server for one redirect after the other. Plus, it feeds
3917 > This feature is normally on, and often requires exceptions for sites that
3918 break when this mechanism is defeated.
3938 CLASS="VARIABLELIST"
3950 > Apply page filtering as defined by named sections of the
3954 > file to the specified site(s).
3958 > can be any modification of the raw
3959 page content, including re-writing or deletion of content.
3963 >Possible values:</DT
3969 > must include the name of one of the section identifiers
3977 > is specified in <TT
3984 >Example usage (from the current <TT
3998 >+filter{html-annoyances}</I
3999 >: Get rid of particularly annoying HTML abuse.
4015 >+filter{js-annoyances}</I
4016 >: Get rid of particularly annoying JavaScript abuse
4032 >+filter{content-cookies}</I
4033 >: Kill cookies that come in the HTML or JS content
4050 >: Kill all popups in JS and HTML
4066 >+filter{frameset-borders}</I
4067 >: Give frames a border and make them resizable
4083 >+filter{webbugs}</I
4084 >: Squish WebBugs (1x1 invisible GIFs used for user tracking)
4100 >+filter{refresh-tags}</I
4101 >: Kill automatic refresh tags (for dial-on-demand setups)
4118 >: Text replacements for subversive browsing fun!
4135 >: Remove Nimda (virus) code.
4151 >+filter{banners-by-size}</I
4152 >: Kill banners by size (<I
4171 >+filter{shockwave-flash}</I
4172 >: Kill embedded Shockwave Flash objects
4188 >+filter{crude-parental}</I
4189 >: Kill all web pages that contain the words "sex" or "warez"
4201 > This is potentially a very powerful feature, and it requires knowledge
4202 of regular expressions if you want to <SPAN
4204 >"roll your own"</SPAN
4206 Filtering operates on a line by line basis.
4209 > Filtering requires buffering the page content, which may appear to
4210 slow down page rendering since nothing is displayed until all content has
4211 passed the filters. (It does not really take longer, but seems that way
4212 since the page is not incrementally displayed.) This effect will be more
4213 noticeable on slower connections.
4216 > Filtering can achieve some of the same effects as the <SPAN
4220 action, i.e. it can be used to block ads and banners. In the overall
4221 scheme of things, filtering is one of the last things <SPAN
4225 does with a web page. So other actions are applied first.
4236 NAME="HIDE-FORWARDED"
4245 CLASS="VARIABLELIST"
4257 > Block any existing X-Forwarded-for HTTP header, and do not add a new one.
4261 >Possible values:</DT
4271 CLASS="LITERALLAYOUT"
4272 > <I
4274 >{+hide-forwarded}</I
4276 <I
4280 </P
4286 > It is fairly safe to leave this on. It does not seem to break many sites.
4306 CLASS="VARIABLELIST"
4318 > To block the browser from sending your email address in a <SPAN
4326 >Possible values:</DT
4332 >, or any user defined value.
4339 CLASS="LITERALLAYOUT"
4340 > <I
4342 >{+hide-from{block}}</I
4344 <I
4348 </P
4357 > will completely remove the header.
4358 Alternatively, you can specify any value you prefer to send to the web
4377 NAME="HIDE-REFERRER"
4382 CLASS="VARIABLELIST"
4394 > Don't send the <SPAN
4397 > (sic) HTTP header to the web site.
4398 Alternatively, send a forged header instead.
4402 >Possible values:</DT
4405 > Prevent the header from being sent with the keyword, <SPAN
4412 > a URL to one from the same server as the request.
4413 Or, set to user defined value of your choice.
4420 CLASS="LITERALLAYOUT"
4421 > <I
4423 >{+hide-referer{forge}}</I
4425 <I
4429 </P
4438 > is the preferred option here, since some servers will
4439 not send images back otherwise.
4445 >"+hide-referrer"</SPAN
4446 > is an alternate spelling of
4449 >"+hide-referer"</SPAN
4450 >. It has the exact same parameters, and can be freely
4453 >"+hide-referer"</SPAN
4458 correct English spelling; however, the HTTP specification has a bug: it
4459 requires it to be spelled as <SPAN
4473 NAME="HIDE-USER-AGENT"
4476 >+hide-user-agent</I
4482 CLASS="VARIABLELIST"
4494 > To change the <SPAN
4496 >"User-Agent:"</SPAN
4497 > header so web servers can't tell
4498 your browser type. Whose business is it anyway?
4502 >Possible values:</DT
4505 > Any user defined string.
4512 CLASS="LITERALLAYOUT"
4513 > <I
4515 >{+hide-user-agent{Netscape 6.1 (X11; I; Linux 2.4.18 i686)}}</I
4517 <I
4521 </P
4527 > Warning! This breaks many web sites that depend on this in order
4528 to determine how the target browser will respond to various
4529 requests. Use with caution.
4549 CLASS="VARIABLELIST"
4561 > To define what <SPAN
4565 automatically as an image.
4569 >Possible values:</DT
4579 CLASS="LITERALLAYOUT"
4580 > <I
4584 <I
4586 >/.*\.(gif|jpg|jpeg|png|bmp|ico)</I
4588 </P
4594 > This only has meaning if the URL (or pattern) also is
4598 >ed, in which case a <SPAN
4602 be sent rather than an HTML page. (See <SPAN
4604 >"+image-blocker{}"</SPAN
4606 for the control over what is actually sent.)
4609 > There is little reason to change the default definition for this.
4620 NAME="IMAGE-BLOCKER"
4629 CLASS="VARIABLELIST"
4641 > Decide what to do with URLs that end up tagged with both <SPAN
4648 >, e.g. an advertisement.
4652 >Possible values:</DT
4655 > There are four available options: <SPAN
4657 >"-image-blocker"</SPAN
4662 > page, usually resulting in a <SPAN
4668 >"+image-blocker{blank}"</SPAN
4670 transparent GIF image. <SPAN
4672 >"+image-blocker{pattern}"</SPAN
4674 checkerboard type pattern (the default). And finally,
4677 >"+image-blocker{http://xyz.com}"</SPAN
4678 > will send an HTTP temporary
4679 redirect to the specified image. This has the advantage of the icon
4680 being cached by the browser, which will speed up the display.
4687 CLASS="LITERALLAYOUT"
4688 > <I
4690 >{+image-blocker{blank}}</I
4692 <I
4696 </P
4705 > ads, they need to be both
4715 >"image-blocker"</SPAN
4720 > for invisibility. Note you cannot treat HTML pages as
4721 images in most cases. For instance, frames require an HTML page to display.
4722 So a frame that is an ad cannot be treated as an image. Forcing an
4726 > in this situation just will not work.
4737 NAME="LIMIT-CONNECT"
4746 CLASS="VARIABLELIST"
4761 > only allows HTTP CONNECT
4762 requests to port 443 (the standard, secure HTTPS port). Use
4765 >"+limit-connect"</SPAN
4766 > to disable this altogether, or to allow
4771 >Possible values:</DT
4774 > Any valid port number, or port number range.
4778 >Example usages:</DT
4781 CLASS="LITERALLAYOUT"
4782 > <I
4784 >+limit-connect{443}</I
4785 > # This is the default and need not be specified.<br>
4786 <I
4788 >+limit-connect{80,443}</I
4789 > # Ports 80 and 443 are OK.<br>
4790 <I
4792 >+limit-connect{-3, 7, 20-100, 500-}</I
4793 > # Ports less than 3, port 7, ports 20 to 100, and ports above 500 are OK.<br>
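# Sketch of the same action in a full actions-file section. The host name and<br>
# the extra port 8443 are placeholder assumptions, not shipped defaults.<br>
{+limit-connect{443,8443}}<br>
secure.example.com<br>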
4794 </P
4800 > The CONNECT method exists in HTTP to allow access to secure websites
4801 (https:// URLs) through proxies. It works very simply: the proxy connects
4802 to the server on the specified port, and then short-circuits its
4803 connections to the client <I
4806 > to the remote proxy.
4807 This can be a big security hole, since CONNECT-enabled proxies can be
4808 abused as TCP relays very easily.
4812 If you want to allow CONNECT for more ports than this, or want to forbid
4813 CONNECT altogether, you can specify a comma separated list of ports and
4814 port ranges (the latter using dashes, with the minimum defaulting to 0 and
4818 > If you don't know what any of this means, there probably is no reason to
4830 NAME="NO-COMPRESSION"
4839 CLASS="VARIABLELIST"
4851 > Prevent the specified websites from compressing HTTP data.
4855 >Possible values:</DT
4865 CLASS="LITERALLAYOUT"
4866 > <I
4868 >{+no-compression}</I
4870 <I
4874 </P
4880 > Some websites do this, which can be a problem for
4893 >"+deanimate-gifs"</SPAN
4895 on compressed data. This will slow down connections to those websites,
4896 though. Default typically is to turn <SPAN
4898 >"no-compression"</SPAN
4910 NAME="NO-COOKIES-KEEP"
4913 >+no-cookies-keep</I
4919 CLASS="VARIABLELIST"
4931 > Allow cookies for the current browser session only.
4935 >Possible values:</DT
4945 CLASS="LITERALLAYOUT"
4946 > <I
4948 >{+no-cookies-keep}</I
4950 <I
4954 </P
4960 > If websites set cookies, <SPAN
4962 >"no-cookies-keep"</SPAN
4964 they are erased when you exit and restart your web browser. This makes
4965 profiling cookies useless, but won't break sites which require cookies so
4966 that you can log in for transactions. This is generally turned on for all
4967 sites. Sometimes referred to as <SPAN
4969 >"session cookies"</SPAN
4981 NAME="NO-COOKIES-READ"
4984 >+no-cookies-read</I
4990 CLASS="VARIABLELIST"
5002 > Explicitly prevent the web server from reading any cookies on your
5007 >Possible values:</DT
5017 CLASS="LITERALLAYOUT"
5018 > <I
5020 >{+no-cookies-read}</I
5022 <I
5026 </P
5032 > Often used in conjunction with <SPAN
5034 >"+no-cookies-set"</SPAN
5036 disable persistent cookies completely.
5047 NAME="NO-COOKIES-SET"
5056 CLASS="VARIABLELIST"
5068 > Explicitly block the web server from sending cookies to your
5073 >Possible values:</DT
5083 CLASS="LITERALLAYOUT"
5084 > <I
5086 >{+no-cookies-set}</I
5088 <I
5092 </P
5098 > Often used in conjunction with <SPAN
5100 >"+no-cookies-read"</SPAN
5102 disable persistent cookies completely.
5125 CLASS="VARIABLELIST"
5137 > Stop those annoying JavaScript pop-up windows!
5141 >Possible values:</DT
5151 CLASS="LITERALLAYOUT"
5152 > <I
5156 <I
5160 </P
5169 > uses a built in filter to disable pop-ups
5176 > An alternate spelling is <SPAN
5191 NAME="VANILLA-WAFER"
5200 CLASS="VARIABLELIST"
5212 > Sends a cookie to every site stating that you do not accept any copyright
5213 on cookies sent to you, and asking them not to track you.
5217 >Possible values:</DT
5227 CLASS="LITERALLAYOUT"
5228 > <I
5230 >{+vanilla-wafer}</I
5232 <I
5236 </P
5242 > This action only applies if you are using a <TT
5246 for saving cookies. Of course, this is a (relatively) unique header and
5247 could be used to track you.
5267 CLASS="VARIABLELIST"
5279 > This allows you to send an arbitrary, user definable cookie.
5283 >Possible values:</DT
5286 > User specified cookie name and corresponding value.
5293 CLASS="LITERALLAYOUT"
5294 > <I
5296 >{+wafer{name=value}}</I
5298 <I
5302 </P
5308 > This can be specified multiple times in order to add as many cookies as you
5321 >5.4.5.21. Actions Examples</A
5324 > Note that the meaning of any of the above examples is reversed by preceding
5325 the action with a <SPAN
5328 >, in place of the <SPAN
5332 that some actions are turned on in the default section of the actions file,
5333 and require little to no additional configuration. These are just <SPAN
5337 Some actions that are turned on in the default section typically require
5338 exceptions to be listed in the lower sections of the actions file.</P
5342 > Turn off cookies by default, then allow a few through for specified sites:</P
5347 CLASS="LITERALLAYOUT"
5348 > # Turn off all persistent cookies<br>
5349 { +no-cookies-read }<br>
5350 { +no-cookies-set }<br>
5352 # Allow cookies for this browser session ONLY<br>
5353 { +no-cookies-keep }<br>
5355 # Exceptions to the above, sites that benefit from persistent cookies<br>
5356 # that are saved from one browser session to the next.<br>
5357 { -no-cookies-read }<br>
5358 { -no-cookies-set }<br>
5359 { -no-cookies-keep }<br>
5360 .javasoft.com<br>
5362 .yahoo.com<br>
5363 .msdn.microsoft.com<br>
5364 .redhat.com<br>
5366 # Alternative way of saying the same thing<br>
5367 {-no-cookies-set -no-cookies-read -no-cookies-keep}<br>
5368 .sourceforge.net<br>
5370 </P
5375 > Now turn off <SPAN
5377 >"fast redirects"</SPAN
5378 >, and then we allow two exceptions:</P
5383 CLASS="LITERALLAYOUT"
5384 > # Turn them off by enabling the +fast-redirects action!<br>
5385 {+fast-redirects}<br>
5387 # Reverse it for these two sites, which don't work right without it.<br>
5388 {-fast-redirects}<br>
5389 www.ukc.ac.uk/cgi-bin/wac\.cgi\?<br>
5390 login.yahoo.com<br>
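<br>
# Sketch of one more exception. The host is a placeholder assumption. A leading<br>
# dot leaves the domain unanchored at that end, so this covers every host<br>
# whose name ends in .example.com.<br>
{-fast-redirects}<br>
.example.com<br>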
5391 </P
5396 > Turn on page filtering according to rules in the defined sections
5400 >, and make one exception for
5407 CLASS="LITERALLAYOUT"
5408 > # Run everything through the filter file, using only the<br>
5409 # specified sections:<br>
5410 +filter{html-annoyances} +filter{js-annoyances} +filter{no-popups}\<br>
5411 +filter{webbugs} +filter{nimda} +filter{banners-by-size}<br>
5412 <br>
5413 # Then disable filtering of code from sourceforge!<br>
5415 .cvs.sourceforge.net<br>
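<br>
# Sketch: drop a single filter section only, using the {-name{param}} form<br>
# described for multi-value actions. This assumes +filter behaves as a<br>
# multi-value action; the host is a placeholder.<br>
{-filter{banners-by-size}}<br>
.example.org<br>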
5416 </P
5421 > Now some URLs that we want <SPAN
5424 > (normally generates
5428 > banner). Many of these use
5430 HREF="appendix.html#REGEX"
5431 >regular expressions</A
5432 > that will expand to match
5438 CLASS="LITERALLAYOUT"
5439 > # Blocklist:<br>
5440 {+block}<br>
5441 /.*/(.*[-_.])?ads?[0-9]?(/|[-_.].*|\.(gif|jpe?g))<br>
5442 /.*/(.*[-_.])?count(er)?(\.cgi|\.dll|\.exe|[?/])<br>
5443 /.*/(ng)?adclient\.cgi<br>
5444 /.*/(plain|live|rotate)[-_.]?ads?/<br>
5445 /.*/(sponsor)s?[0-9]?/<br>
5446 /.*/_?(plain|live)?ads?(-banners)?/<br>
5447 /.*/abanners/<br>
5448 /.*/ad(sdna_image|gifs?)/<br>
5449 /.*/ad(server|stream|juggler)\.(cgi|pl|dll|exe)<br>
5450 /.*/adbanners/<br>
5451 /.*/adserver<br>
5452 /.*/adstream\.cgi<br>
5453 /.*/adv((er)?ts?|ertis(ing|ements?))?/<br>
5454 /.*/banner_?ads/<br>
5455 /.*/banners?/<br>
5456 /.*/banners?\.cgi/<br>
5457 /.*/cgi-bin/centralad/getimage<br>
5458 /.*/images/addver\.gif<br>
5459 /.*/images/marketing/.*\.(gif|jpe?g)<br>
5460 /.*/popupads/<br>
5461 /.*/siteads/<br>
5462 /.*/sponsor.*\.gif<br>
5463 /.*/sponsors?[0-9]?/<br>
5464 /.*/advert[0-9]+\.jpg<br>
5465 /Media/Images/Adds/<br>
5466 /ad_images/<br>
5467 /adimages/<br>
5468 /.*/ads/<br>
5469 /bannerfarm/<br>
5470 /grafikk/annonse/<br>
5471 /graphics/defaultAd/<br>
5472 /image\.ng/AdType<br>
5473 /image\.ng/transactionID<br>
5474 /images/.*/.*_anim\.gif # alvin brattli<br>
5475 /ip_img/.*\.(gif|jpe?g)<br>
5476 /rotateads/<br>
5477 /rotations/ <br>
5478 /worldnet/ad\.cgi<br>
5479 /cgi-bin/nph-adclick\.exe/<br>
5480 /.*/Image/BannerAdvertising/<br>
5481 /.*/ad-bin/<br>
5482 /.*/adlib/server\.cgi<br>
5483 /autoads/<br>
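<br>
# Domain patterns, with or without a path, can sit in the same {+block} list.<br>
# The two lines below are a sketch that reuses the placeholder example.com<br>
# names from the domain pattern section; they are not shipped defaults.<br>
ad*.example.com<br>
.example.com/.*/banners?/<br>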
5484 </P
5489 > Note that many of these actions have the potential to cause a page to
5490 misbehave, possibly even not to display at all. There are many ways
5491 a site designer may choose to design his site, and what HTTP header
5492 content he may depend on. There is no way to have hard and fast rules
5493 for all sites. See the <A
5494 HREF="appendix.html#ACTIONSANAT"
5497 for a brief example on troubleshooting actions.</P
5519 >, can be defined by combining other <SPAN
5523 These can in turn be invoked just like the built-in <SPAN
5527 Currently, an alias can contain any character except space, tab, <SPAN
5537 >. But please use only <SPAN
5557 >. Alias names are not case sensitive, and
5560 >must be defined before anything</I
5565 >file! And there can only be one set of
5571 > Now let's define a few aliases:</P
5576 CLASS="LITERALLAYOUT"
5577 > # Useful custom aliases we can use later. These must come first!<br>
5579 +no-cookies = +no-cookies-set +no-cookies-read<br>
5580 -no-cookies = -no-cookies-set -no-cookies-read<br>
5581 fragile = -block -no-cookies -filter -fast-redirects -hide-referer -no-popups<br>
5582 shop = -no-cookies -filter -fast-redirects<br>
5583 +imageblock = +block +image<br>
5585 #For people who don't like to type too much: ;-)<br>
5586 c0 = +no-cookies<br>
5587 c1 = -no-cookies<br>
5588 c2 = -no-cookies-set +no-cookies-read<br>
5589 c3 = +no-cookies-set -no-cookies-read<br>
5590 #... etc. Customize to your heart's content.<br>
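<br>
# Hypothetical extra alias (the name "+strict" is an assumption). It builds on<br>
# the +no-cookies alias defined above, just as "fragile" builds on -no-cookies.<br>
+strict = +no-cookies +fast-redirects +hide-forwarded<br>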
5591 </P
5596 > Some examples using our <SPAN
5603 aliases from above:</P
5608 CLASS="LITERALLAYOUT"
5609 > # These sites are very complex and require<br>
5610 # minimal interference.<br>
5612 .office.microsoft.com<br>
5613 .windowsupdate.microsoft.com<br>
5614 .nytimes.com<br>
5616 # Shopping sites - but we still want to block ads.<br>
5618 .quietpc.com<br>
5619 .worldpay.com # for quietpc.com<br>
5620 .jungle.com<br>
5621 .scan.co.uk<br>
5623 # These shops require pop-ups also <br>
5624 {shop -no-popups}<br>
5626 .overclockers.co.uk<br>
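<br>
# Sketch: the short alias c1 (= -no-cookies) defined above, applied to a<br>
# placeholder site that needs its cookies.<br>
{c1}<br>
.example.net<br>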
5627 </P
5638 > aliases are often used for
5642 > sites that require most actions to be disabled
5643 in order to function properly. </P
5652 >5.5. The Filter File</A
5655 > Any web page can be dynamically modified with the filter file. This
5656 modification can be removal, or re-writing, of any web page content,
5657 including tags and non-visible content. The default filter file is
5661 >, located in the config directory. </P
5663 > This is potentially a very powerful feature, and requires knowledge of both
5666 >"regular expression"</SPAN
5667 > and HTML in order to create custom
5668 filters. But there are a number of useful filters included with
5672 > for many common situations.</P
5674 > The included example file is divided into sections. Each section begins
5678 > keyword, followed by the identifier
5679 for that section, e.g. <SPAN
5681 >"FILTER: webbugs"</SPAN
5682 >. Each section performs
5683 a similar type of filtering, such as <SPAN
5685 >"html-annoyances"</SPAN
5688 > This file uses regular expressions to alter or remove any string in the
5689 target page. The expressions can only operate on one line at a time. Some
5690 examples from the included default <TT
5695 > Stop web pages from displaying annoying messages in the status bar by
5696 deleting such references:</P
5701 CLASS="LITERALLAYOUT"
5702 > FILTER: html-annoyances<br>
5704 # New browser windows should be resizable and have a location and status<br>
5705 # bar. Make it so.<br>
5707 s/resizable="?(no|0)"?/resizable=1/ig s/noresize/yesresize/ig<br>
5708 s/location="?(no|0)"?/location=1/ig s/status="?(no|0)"?/status=1/ig<br>
5709 s/scrolling="?(no|0|Auto)"?/scrolling=1/ig<br>
5710 s/menubar="?(no|0)"?/menubar=1/ig <br>
5712 # The <BLINK> tag was a crime!<br>
5714 s*<blink>|</blink>**ig<br>
5716 # Is this evil? <br>
5718 #s/framespacing="?(no|0)"?//ig<br>
5719 #s/margin(height|width)=[0-9]*//gi<br>
5720 </P
5725 > Just for kicks, replace any occurrence of <SPAN
5732 >, and have a little fun with topical buzzwords: </P
5737 CLASS="LITERALLAYOUT"
5738 > FILTER: fun<br>
5740 s/microsoft(?!\.com)/MicroSuck/ig<br>
5742 # Buzzword Bingo:<br>
5744 s/industry-leading|cutting-edge|award-winning/<font color=red><b>BINGO!</b></font>/ig<br>
5745 </P
5750 > Kill those pesky little web-bugs:</P
5755 CLASS="LITERALLAYOUT"
5756 > # webbugs: Squish WebBugs (1x1 invisible GIFs used for user tracking)<br>
5757 FILTER: webbugs<br>
5759 s/<img\s+[^>]*?(width|height)\s*=\s*['"]?1\D[^>]*?(width|height)\s*=\s*['"]?1(\D[^>]*?)?>/<!-- Squished WebBug -->/sig<br>
5760 </P
5777 > displays one of its internal
5778 pages, such as a 404 Not Found error page, it uses the appropriate template.
5779 On Linux, BSD, and Unix, these are located in
5782 >/etc/privoxy/templates</TT
5783 > by default. These may be
5784 customized, if desired. <TT
5788 used to control the HTML attributes (fonts, etc).</P
5793 > banner page with the bright red top
5794 banner, is called just <SPAN
5801 may be customized or replaced with something else if desired. </P
5819 HREF="quickstart.html"
5844 >Quickstart to Using <SPAN
5857 >Contacting the Developers, Bug Reporting and Feature