+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML
><HEAD
><TITLE
>Appendix</TITLE
><META
NAME="GENERATOR"
-CONTENT="Modular DocBook HTML Stylesheet Version 1.76b+
-"><LINK
+CONTENT="Modular DocBook HTML Stylesheet Version 1.79"><LINK
REL="HOME"
-TITLE="Privoxy 3.0.6 User Manual"
+TITLE="Privoxy 3.0.27 User Manual"
HREF="index.html"><LINK
REL="PREVIOUS"
TITLE="See Also"
HREF="seealso.html"><LINK
REL="STYLESHEET"
TYPE="text/css"
-HREF="../p_doc.css">
+HREF="../p_doc.css"><META
+HTTP-EQUIV="Content-Type"
+CONTENT="text/html;
+charset=ISO-8859-1">
<LINK REL="STYLESHEET" TYPE="text/css" HREF="p_doc.css">
</head
><BODY
><TH
COLSPAN="3"
ALIGN="center"
->Privoxy 3.0.6 User Manual</TH
+>Privoxy 3.0.27 User Manual</TH
></TR
><TR
><TD
CLASS="SECT1"
><A
NAME="APPENDIX"
-></A
->14. Appendix</H1
+>14. Appendix</A
+></H1
><DIV
CLASS="SECT2"
><H2
CLASS="SECT2"
><A
NAME="REGEX"
-></A
->14.1. Regular Expressions</H2
+>14.1. Regular Expressions</A
+></H2
><P
> <SPAN
CLASS="APPLICATION"
characters when listing files with the <B
CLASS="COMMAND"
>dir</B
-> command in DOS.
+> command in DOS.
<TT
CLASS="LITERAL"
>*.*</TT
powerful. There are many more <SPAN
CLASS="QUOTE"
>"special characters"</SPAN
-> and ways of
+> and ways of
building complex patterns however. Let's look at a few of the common ones,
and then some examples:</P
><P
-><P
></P
><TABLE
BORDER="0"
></TABLE
><P
></P
-></P
-><P
><P
></P
><TABLE
></TABLE
><P
></P
-></P
-><P
><P
></P
><TABLE
></TABLE
><P
></P
-></P
-><P
><P
></P
><TABLE
></TABLE
><P
></P
-></P
-><P
><P
></P
><TABLE
CLASS="QUOTE"
>"escape"</SPAN
> character denotes that
- the following character should be taken literally. This is used where one of the
+ the following character should be taken literally. This is used where one of the
special characters (e.g. <SPAN
CLASS="QUOTE"
>"."</SPAN
not as a special meta-character. Example: <SPAN
CLASS="QUOTE"
>"example\.com"</SPAN
->, makes
- sure the period is recognized only as a period (and not expanded to its
+>, makes
+ sure the period is recognized only as a period (and not expanded to its
meta-character meaning of any single character).
</TD
></TR
></TABLE
><P
></P
-></P
-><P
><P
></P
><TABLE
CLASS="QUOTE"
>"[0-9]"</SPAN
>
- matches any numeric digit (zero through nine). As an example, we can combine
+ matches any numeric digit (zero through nine). As an example, we can combine
this with <SPAN
CLASS="QUOTE"
>"+"</SPAN
></TABLE
><P
></P
-></P
-><P
><P
></P
><TABLE
></TABLE
><P
></P
-></P
-><P
><P
></P
><TABLE
<SPAN
CLASS="QUOTE"
>"/(this|that) example/"</SPAN
-> uses grouping and the bar character
+> uses grouping and the bar character
and would match either <SPAN
CLASS="QUOTE"
>"this example"</SPAN
></TABLE
><P
></P
-></P
><P
-> These are just some of the ones you are likely to use when matching URLs with
+> These are just some of the ones you are likely to use when matching URLs with
<SPAN
CLASS="APPLICATION"
>Privoxy</SPAN
> and <SPAN
CLASS="QUOTE"
>"*"</SPAN
-> to
+> to
denote any character, zero or more times. In other words, any string at all.
- So we start with a literal forward slash, then our regular expression pattern
+ So we start with a literal forward slash, then our regular expression pattern
(<SPAN
CLASS="QUOTE"
>".*"</SPAN
<SPAN
CLASS="QUOTE"
>".*"</SPAN
->. We are building
+>. We are building
a directory path here. This will match any file with the path that has a
directory named <SPAN
CLASS="QUOTE"
>/.*/adv((er)?ts?|ertis(ing|ements?))?/</TT
></I
></SPAN
-> -
+> -
We have several literal forward slashes again (<SPAN
CLASS="QUOTE"
>"/"</SPAN
>), so we are
- building another expression that is a file path statement. We have another
+ building another expression that is a file path statement. We have another
<SPAN
CLASS="QUOTE"
>".*"</SPAN
CLASS="QUOTE"
>"adv"</SPAN
> string is the
- interesting part. </P
+ interesting part.</P
><P
> Remember the <SPAN
CLASS="QUOTE"
means <SPAN
CLASS="QUOTE"
>"or"</SPAN
->. We have two of those. For instance,
+>. We have two of those. For instance,
<SPAN
CLASS="QUOTE"
>"(ing|ements?)"</SPAN
>, can expand to match either <SPAN
CLASS="QUOTE"
>"ing"</SPAN
->
+>
<SPAN
CLASS="emphasis"
><I
attempt at matching as many variations of <SPAN
CLASS="QUOTE"
>"advertisement"</SPAN
->, and
+>, and
similar, as possible. So this would expand to match just <SPAN
CLASS="QUOTE"
>"adv"</SPAN
<SPAN
CLASS="QUOTE"
>"advertisements"</SPAN
->. You get the idea. But it would not match
+>. You get the idea. But it would not match
<SPAN
CLASS="QUOTE"
>"advertizements"</SPAN
CLASS="QUOTE"
>"z"</SPAN
>). We could fix that by
- changing our regular expression to:
+ changing our regular expression to:
<SPAN
CLASS="QUOTE"
>"/.*/adv((er)?ts?|erti(s|z)(ing|ements?))?/"</SPAN
>/.*/advert[0-9]+\.(gif|jpe?g)</TT
></I
></SPAN
-> - Again
- another path statement with forward slashes. Anything in the square brackets
+> - Again
+ another path statement with forward slashes. Anything in the square brackets
<SPAN
CLASS="QUOTE"
>"[ ]"</SPAN
CLASS="QUOTE"
>"+"</SPAN
>
- means one or more of the preceding expression must be included. The preceding
- expression here is what is in the square brackets -- in this case, any digit
+ means one or more of the preceding expression must be included. The preceding
+ expression here is what is in the square brackets -- in this case, any digit
zero through nine. Then, at the end, we have a grouping: <SPAN
CLASS="QUOTE"
>"(gif|jpe?g)"</SPAN
->.
+>.
This includes a <SPAN
CLASS="QUOTE"
>"|"</SPAN
expressions. Now that you know enough to get started, you can learn more on
your own :/</P
><P
-> More reading on Perl Compatible Regular expressions:
+> More reading on Perl Compatible Regular Expressions:
<A
HREF="http://perldoc.perl.org/perlre.html"
TARGET="_top"
><H2
CLASS="SECT2"
><A
-NAME="AEN4992"
-></A
->14.2. Privoxy's Internal Pages</H2
+NAME="INTERNAL-PAGES"
+>14.2. Privoxy's Internal Pages</A
+></H2
><P
> Since <SPAN
CLASS="APPLICATION"
>Privoxy</SPAN
-> proxies each requested
+> proxies each requested
web page, it is easy for <SPAN
CLASS="APPLICATION"
>Privoxy</SPAN
-> to
+> to
trap certain special URLs. In this way, we can talk directly to
<SPAN
CLASS="APPLICATION"
>Privoxy</SPAN
->, and see how it is
- configured, see how our rules are being applied, change these
+>, and see how it is
+ configured, see how our rules are being applied, change these
rules and other configuration options, and even turn
<SPAN
CLASS="APPLICATION"
>Privoxy's</SPAN
-> filtering off, all with
- a web browser. </P
+> filtering off, all with
+ a web browser.</P
><P
-> The URLs listed below are the special ones that allow direct access
+> The URLs listed below are the special ones that allow direct access
to <SPAN
CLASS="APPLICATION"
>Privoxy</SPAN
<SPAN
CLASS="APPLICATION"
>Privoxy</SPAN
-> must be running to access these. If
- not, you will get a friendly error message. Internet access is not
+> must be running to access these. If
+ not, you will get a friendly error message. Internet access is not
necessary either.</P
><P
-> <P
></P
><UL
><LI
><P
->
- Privoxy main page:
+> Privoxy main page:
</P
><A
-NAME="AEN5006"
+NAME="AEN5956"
></A
><BLOCKQUOTE
CLASS="BLOCKQUOTE"
><P
->
- <A
+> <A
HREF="http://config.privoxy.org/"
TARGET="_top"
>http://config.privoxy.org/</A
></LI
><LI
><P
->
- Show information about the current configuration, including viewing and
+> Show information about the current configuration, including viewing and
editing of actions files:
</P
><A
-NAME="AEN5014"
+NAME="AEN5964"
></A
><BLOCKQUOTE
CLASS="BLOCKQUOTE"
><P
->
- <A
+> <A
HREF="http://config.privoxy.org/show-status"
TARGET="_top"
>http://config.privoxy.org/show-status</A
></LI
><LI
><P
->
- Show the source code version numbers:
+> Show the source code version numbers:
</P
><A
-NAME="AEN5019"
+NAME="AEN5969"
></A
><BLOCKQUOTE
CLASS="BLOCKQUOTE"
><P
->
- <A
+> <A
HREF="http://config.privoxy.org/show-version"
TARGET="_top"
>http://config.privoxy.org/show-version</A
></LI
><LI
><P
->
- Show the browser's request headers:
+> Show the browser's request headers:
</P
><A
-NAME="AEN5024"
+NAME="AEN5974"
></A
><BLOCKQUOTE
CLASS="BLOCKQUOTE"
><P
->
- <A
+> <A
HREF="http://config.privoxy.org/show-request"
TARGET="_top"
>http://config.privoxy.org/show-request</A
></LI
><LI
><P
->
- Show which actions apply to a URL and why:
+> Show which actions apply to a URL and why:
</P
><A
-NAME="AEN5029"
+NAME="AEN5979"
></A
><BLOCKQUOTE
CLASS="BLOCKQUOTE"
><P
->
- <A
+> <A
HREF="http://config.privoxy.org/show-url-info"
TARGET="_top"
>http://config.privoxy.org/show-url-info</A
></LI
><LI
><P
->
- Toggle Privoxy on or off. In this case, <SPAN
+> Toggle Privoxy on or off. This feature can be turned off/on in the main
+ <TT
+CLASS="FILENAME"
+>config</TT
+> file. When toggled <SPAN
+CLASS="QUOTE"
+>"off"</SPAN
+>, <SPAN
CLASS="QUOTE"
>"Privoxy"</SPAN
-> continues
- to run, but only as a pass-through proxy, with no actions taking place:
+>
+ continues to run, but only as a pass-through proxy, with no actions taking
+ place:
</P
><A
-NAME="AEN5035"
+NAME="AEN5987"
></A
><BLOCKQUOTE
CLASS="BLOCKQUOTE"
><P
->
- <A
+> <A
HREF="http://config.privoxy.org/toggle"
TARGET="_top"
>http://config.privoxy.org/toggle</A
</P
></BLOCKQUOTE
><P
-> Short cuts. Turn off, then on:
+> Shortcuts. Turn off, then on:
</P
><A
-NAME="AEN5039"
+NAME="AEN5991"
></A
><BLOCKQUOTE
CLASS="BLOCKQUOTE"
><P
->
- <A
+> <A
HREF="http://config.privoxy.org/toggle?set=disable"
TARGET="_top"
>http://config.privoxy.org/toggle?set=disable</A
</P
></BLOCKQUOTE
><A
-NAME="AEN5042"
+NAME="AEN5994"
></A
><BLOCKQUOTE
CLASS="BLOCKQUOTE"
><P
->
- <A
+> <A
HREF="http://config.privoxy.org/toggle?set=enable"
TARGET="_top"
>http://config.privoxy.org/toggle?set=enable</A
></BLOCKQUOTE
></LI
></UL
-></P
-><P
-> These may be bookmarked for quick reference. See next. </P
-><DIV
-CLASS="SECT3"
-><H3
-CLASS="SECT3"
-><A
-NAME="BOOKMARKLETS"
-></A
->14.2.1. Bookmarklets</H3
-><P
-> Below are some <SPAN
-CLASS="QUOTE"
->"bookmarklets"</SPAN
-> to allow you to easily access a
- <SPAN
-CLASS="QUOTE"
->"mini"</SPAN
-> version of some of <SPAN
-CLASS="APPLICATION"
->Privoxy's</SPAN
->
- special pages. They are designed for MS Internet Explorer, but should work
- equally well in Netscape, Mozilla, and other browsers which support
- JavaScript. They are designed to run directly from your bookmarks - not by
- clicking the links below (although that should work for testing).</P
-><P
-> To save them, right-click the link and choose <SPAN
-CLASS="QUOTE"
->"Add to Favorites"</SPAN
->
- (IE) or <SPAN
-CLASS="QUOTE"
->"Add Bookmark"</SPAN
-> (Netscape). You will get a warning that
- the bookmark <SPAN
-CLASS="QUOTE"
->"may not be safe"</SPAN
-> - just click OK. Then you can run the
- Bookmarklet directly from your favorites/bookmarks. For even faster access,
- you can put them on the <SPAN
-CLASS="QUOTE"
->"Links"</SPAN
-> bar (IE) or the <SPAN
-CLASS="QUOTE"
->"Personal
- Toolbar"</SPAN
-> (Netscape), and run them with a single click. </P
-><P
-> <P
-></P
-><UL
-><LI
-><P
-> <A
-HREF="javascript:void(window.open('http://config.privoxy.org/toggle?mini=y&set=enabled','ijbstatus','width=250,height=100,resizable=yes,scrollbars=no,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());"
-TARGET="_top"
->Privoxy - Enable</A
->
- </P
-></LI
-><LI
-><P
-> <A
-HREF="javascript:void(window.open('http://config.privoxy.org/toggle?mini=y&set=disabled','ijbstatus','width=250,height=100,resizable=yes,scrollbars=no,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());"
-TARGET="_top"
->Privoxy - Disable</A
->
- </P
-></LI
-><LI
-><P
-> <A
-HREF="javascript:void(window.open('http://config.privoxy.org/toggle?mini=y&set=toggle','ijbstatus','width=250,height=100,resizable=yes,scrollbars=no,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());"
-TARGET="_top"
->Privoxy - Toggle Privoxy</A
-> (Toggles between enabled and disabled)
- </P
-></LI
-><LI
-><P
-> <A
-HREF="javascript:void(window.open('http://config.privoxy.org/toggle?mini=y','ijbstatus','width=250,height=2,resizable=yes,scrollbars=no,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());"
-TARGET="_top"
->Privoxy- View Status</A
->
- </P
-></LI
-><LI
-><P
-> <A
-HREF="javascript:void(window.open('http://config.privoxy.org/show-url-info?url='+escape(location.href),'Why').focus());"
-TARGET="_top"
->Privoxy - Why?</A
->
- </P
-></LI
-></UL
-></P
-><P
-> Credit: The site which gave us the general idea for these bookmarklets is
- <A
-HREF="http://www.bookmarklets.com/"
-TARGET="_top"
->www.bookmarklets.com</A
->. They
- have more information about bookmarklets. </P
-></DIV
></DIV
><DIV
CLASS="SECT2"
CLASS="SECT2"
><A
NAME="CHAIN"
-></A
->14.3. Chain of Events</H2
+>14.3. Chain of Events</A
+></H2
><P
-> Let's take a quick look at the basic sequence of events when a web page is
- requested by your browser and <SPAN
+> Let's take a quick look at how some of <SPAN
CLASS="APPLICATION"
->Privoxy</SPAN
-> is on duty:</P
+>Privoxy's</SPAN
+>
+ core features are triggered, and the ensuing sequence of events when a web
+ page is requested by your browser:</P
><P
-> <P
></P
><UL
><LI
><P
-> First, your web browser requests a web page. The browser knows to send
+> First, your web browser requests a web page. The browser knows to send
the request to <SPAN
CLASS="APPLICATION"
>Privoxy</SPAN
->, which will in turn,
- relay the request to the remote web server after passing the following
- tests:
+>, which will in turn
+ relay the request to the remote web server, provided it passes the following
+ tests:
</P
></LI
><LI
> <SPAN
CLASS="APPLICATION"
>Privoxy</SPAN
-> traps any request for its own internal CGI
+> traps any request for its own internal CGI
pages (e.g. <A
HREF="http://p.p/"
TARGET="_top"
> Next, <SPAN
CLASS="APPLICATION"
>Privoxy</SPAN
-> checks to see if the URL
+> checks to see if the URL
matches any <A
HREF="actions-file.html#BLOCK"
><SPAN
CLASS="QUOTE"
>"+handle-as-image"</SPAN
></A
->
- is then checked and if it does not match, an
+>
+ and
+ <A
+HREF="actions-file.html#HANDLE-AS-EMPTY-DOCUMENT"
+><SPAN
+CLASS="QUOTE"
+>"+handle-as-empty-document"</SPAN
+></A
+>
+ are then checked, and if there is no match, an
HTML <SPAN
CLASS="QUOTE"
>"BLOCKED"</SPAN
-> page is sent back. Otherwise, if it does match,
- an image is returned. The type of image depends on the setting of <A
+> page is sent back to the browser. Otherwise, if
+ it does match, an image is returned for the former, and an empty text
+ document for the latter. The type of image would depend on the setting of
+ <A
HREF="actions-file.html#SET-IMAGE-BLOCKER"
><SPAN
CLASS="QUOTE"
></LI
><LI
><P
-> Now the web server starts sending its response back (i.e. typically a web page and related
- data).
+> Now the web server starts sending its response back (i.e. typically a web
+ page).
</P
></LI
><LI
><P
> First, the server headers are read and processed to determine, among other
things, the MIME type (document type) and encoding. The headers are then
- filtered as determined by the
+ filtered as determined by the
<A
HREF="actions-file.html#CRUNCH-INCOMING-COOKIES"
><SPAN
></LI
><LI
><P
-> If the <A
-HREF="actions-file.html#KILL-POPUPS"
-><SPAN
-CLASS="QUOTE"
->"+kill-popups"</SPAN
-></A
->
- action applies, and it is an HTML or JavaScript document, the popup-code in the
- response is filtered on-the-fly as it is received.
- </P
-></LI
-><LI
-><P
-> If a <A
+> If any <A
HREF="actions-file.html#FILTER"
><SPAN
CLASS="QUOTE"
>"+filter"</SPAN
></A
->
+> action
or <A
HREF="actions-file.html#DEANIMATE-GIFS"
><SPAN
<SPAN
CLASS="APPLICATION"
>Privoxy</SPAN
-> back to your browser.
+> back to your browser.
</P
><P
-> If neither <A
+> If neither a <A
HREF="actions-file.html#FILTER"
><SPAN
CLASS="QUOTE"
>"+filter"</SPAN
></A
->
+> action
or <A
HREF="actions-file.html#DEANIMATE-GIFS"
><SPAN
matches, then <SPAN
CLASS="APPLICATION"
>Privoxy</SPAN
-> passes the raw data through
+> passes the raw data through
to the client browser as it becomes available.
</P
></LI
><LI
><P
-> As the browser receives the now (possibly filtered) page content, it
+> As the browser receives the now (possibly filtered) page content, it
reads and then requests any URLs that may be embedded within the page
source, e.g. ad images, stylesheets, JavaScript, other HTML documents (e.g.
- frames), sounds, etc. For each of these objects, the browser issues a new
- request. And each such request is in turn processed as above. Note that a
- complex web page may have many such embedded URLs.
+ frames), sounds, etc. For each of these objects, the browser issues a
+ separate request (this is easily viewable in <SPAN
+CLASS="APPLICATION"
+>Privoxy's</SPAN
+>
+ logs). And each such request is in turn processed just as above. Note that a
+ complex web page will have many, many such embedded URLs. If these
+ secondary requests are to a different server, then quite possibly a very
+ different set of actions is triggered.
</P
></LI
></UL
-></P
+><P
+> NOTE: This is a somewhat simplistic overview of what happens with each URL
+ request. For the sake of brevity and simplicity, we have focused on
+ <SPAN
+CLASS="APPLICATION"
+>Privoxy's</SPAN
+> core features only.</P
></DIV
><DIV
CLASS="SECT2"
CLASS="SECT2"
><A
NAME="ACTIONSANAT"
-></A
->14.4. Troubleshooting: Anatomy of an Action</H2
+>14.4. Troubleshooting: Anatomy of an Action</A
+></H2
><P
> The way <SPAN
CLASS="APPLICATION"
>Privoxy</SPAN
-> applies
+> applies
<A
HREF="actions-file.html#ACTIONS"
>actions</A
HREF="appendix.html#REGEX"
>regular expressions</A
> whose consequences are not
- always so obvious. </P
+ always so obvious.</P
><P
> One quick test to see if <SPAN
CLASS="APPLICATION"
>Privoxy</SPAN
-> is causing a problem
- or not, is to disable it temporarily. This should be the first troubleshooting
- step. See <A
-HREF="appendix.html#BOOKMARKLETS"
->the Bookmarklets</A
-> section on a quick
- and easy way to do this (be sure to flush caches afterward!). Looking at the
- logs is a good idea too.</P
+> is causing a problem
+ or not, is to disable it temporarily. This should be the first troubleshooting
+ step (be sure to flush caches afterward!). Looking at the
+ logs is a good idea too. (Note that both the toggle feature and logging are
+ enabled via <TT
+CLASS="FILENAME"
+>config</TT
+> file settings, and may need to be
+ turned <SPAN
+CLASS="QUOTE"
+>"on"</SPAN
+>.)</P
><P
> Another easy troubleshooting step, if you have customized your
installation, is to revert to the installed
> <SPAN
CLASS="APPLICATION"
>Privoxy</SPAN
-> also provides the
+> also provides the
<A
HREF="http://config.privoxy.org/show-url-info"
TARGET="_top"
<SPAN
CLASS="APPLICATION"
>Privoxy</SPAN
-> will tell us
+> will tell us
how the current configuration will handle it. This will not
help with filtering effects (i.e. the <A
HREF="actions-file.html#FILTER"
HREF="http://google.com"
TARGET="_top"
>google.com</A
->,
- and look at it one section at a time in a sample configuration (your real
+>,
+ and look at it one section at a time in a sample configuration (your real
configuration may vary):</P
-><P
-> <TABLE
+><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TD
><PRE
CLASS="SCREEN"
-> Matches for http://google.com:
+> Matches for http://www.google.com:
In file: default.action <SPAN
CLASS="GUIBUTTON"
>[ Edit ]</SPAN
>
- {-add-header
- -block
- -content-type-overwrite
- -crunch-client-header
- -crunch-if-none-match
- -crunch-incoming-cookies
- -crunch-outgoing-cookies
- -crunch-server-header
+ {+change-x-forwarded-for{block}
+deanimate-gifs {last}
- -downgrade-http-version
+fast-redirects {check-decoded-url}
- -filter {js-events}
- -filter {content-cookies}
- -filter {all-popups}
- -filter {banners-by-link}
- -filter {tiny-textforms}
- -filter {frameset-borders}
- -filter {demoronizer}
- -filter {shockwave-flash}
- -filter {quicktime-kioskmode}
- -filter {fun}
- -filter {crude-parental}
- -filter {site-specifics}
- -filter {js-annoyances}
- -filter {html-annoyances}
+filter {refresh-tags}
- -filter {unsolicited-popups}
+filter {img-reorder}
+filter {banners-by-size}
+filter {webbugs}
+filter {jumping-windows}
+filter {ie-exploits}
- -filter {google}
- -filter {yahoo}
- -filter {msn}
- -filter {blogspot}
- -filter {xml-to-html}
- -filter {html-to-xml}
- -filter-client-headers
- -filter-server-headers
- -force-text-mode
- -handle-as-empty-document
- -handle-as-image
- -hide-accept-language
- -hide-content-disposition
- +hide-forwarded-for-headers
+hide-from-header {block}
- -hide-if-modified-since
+hide-referrer {forge}
- -hide-user-agent
- -inspect-jpegs
- -kill-popups
- -limit-connect
- -overwrite-last-modified
- +prevent-compression
- -redirect
- -send-vanilla-wafer
- -send-wafer
+session-cookies-only
+set-image-blocker {pattern}
- -treat-forbidden-connects-like-blocks }
/
-
+
{ -session-cookies-only }
.google.com
CLASS="GUIBUTTON"
>[ Edit ]</SPAN
>
-(no matches in this file) </PRE
+(no matches in this file)</PRE
></TD
></TR
></TABLE
-></P
><P
-> This is telling us how we have defined our
+> This is telling us how we have defined our
<A
HREF="actions-file.html#ACTIONS"
><SPAN
which ones match for our test case, <SPAN
CLASS="QUOTE"
>"google.com"</SPAN
->.
+>.
Displayed are all the actions that are available to us. Remember,
the <TT
CLASS="LITERAL"
>. So some are <SPAN
CLASS="QUOTE"
>"on"</SPAN
-> here, but many
+> here, but many
are <SPAN
CLASS="QUOTE"
>"off"</SPAN
> or <SPAN
CLASS="QUOTE"
>"mail.google.com"</SPAN
->. But it would not
+>. But it would not
match <SPAN
CLASS="QUOTE"
>"www.google.de"</SPAN
>user.action</TT
> file, we again have no hits.
So there is nothing google-specific that we might have added to our own, local
- configuration. If there was, those actions would over-rule any actions from
+ configuration. If there were, those actions would overrule any actions from
previously processed files, such as <TT
CLASS="FILENAME"
>default.action</TT
> is applying all its <SPAN
CLASS="QUOTE"
>"actions"</SPAN
->
+>
to <SPAN
CLASS="QUOTE"
>"google.com"</SPAN
->: </P
-><P
-> <TABLE
+>:</P
+><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TD
><PRE
CLASS="SCREEN"
-> Final results:
-
+> Final results:
+
-add-header
-block
+ +change-x-forwarded-for{block}
+ -client-header-filter{hide-tor-exit-notation}
-content-type-overwrite
-crunch-client-header
-crunch-if-none-match
-crunch-server-header
+deanimate-gifs {last}
-downgrade-http-version
- +fast-redirects {check-decoded-url}
+ -fast-redirects
-filter {js-events}
-filter {content-cookies}
-filter {all-popups}
-filter {yahoo}
-filter {msn}
-filter {blogspot}
- -filter {xml-to-html}
- -filter {html-to-xml}
- -filter-client-headers
- -filter-server-headers
+ -filter {no-ping}
-force-text-mode
-handle-as-empty-document
-handle-as-image
-hide-accept-language
-hide-content-disposition
- +hide-forwarded-for-headers
+hide-from-header {block}
-hide-if-modified-since
+hide-referrer {forge}
-hide-user-agent
- -inspect-jpegs
- -kill-popups
-limit-connect
-overwrite-last-modified
- +prevent-compression
+ -prevent-compression
-redirect
- -send-vanilla-wafer
- -send-wafer
+ -server-header-filter{xml-to-html}
+ -server-header-filter{html-to-xml}
-session-cookies-only
- +set-image-blocker {pattern}
- -treat-forbidden-connects-like-blocks </PRE
+ +set-image-blocker {pattern} </PRE
></TD
></TR
></TABLE
-></P
><P
-> Notice the only difference here to the previous listing, is to
+> Notice that the only difference from the previous listing is in
<SPAN
CLASS="QUOTE"
>"fast-redirects"</SPAN
CLASS="QUOTE"
>"session-cookies-only"</SPAN
>,
- which are activated specifically for this site in our configuration,
+ which are activated specifically for this site in our configuration,
and thus show in the <SPAN
CLASS="QUOTE"
>"Final Results"</SPAN
CLASS="QUOTE"
>"ad.doubleclick.net"</SPAN
>:</P
-><P
-> <TABLE
+><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TD
><PRE
CLASS="SCREEN"
-> { +block }
+> { +block{Domain starts with "ad"} }
ad*.
- { +block }
+ { +block{Domain contains "ad"} }
.ad.
- { +block +handle-as-image }
+ { +block{Doubleclick banner server} +handle-as-image }
.[a-vx-z]*.doubleclick.net</PRE
></TD
></TR
></TABLE
-></P
><P
-> We'll just show the interesting part here - the explicit matches. It is
+> We'll just show the interesting part here - the explicit matches. It is
matched three different times. Two <SPAN
CLASS="QUOTE"
->"+block"</SPAN
-> sections,
+>"+block{}"</SPAN
+> sections,
and a <SPAN
CLASS="QUOTE"
->"+block +handle-as-image"</SPAN
+>"+block{} +handle-as-image"</SPAN
>,
- which is the expanded form of one of our aliases that had been defined as:
+ which is the expanded form of one of our aliases that had been defined as:
<SPAN
CLASS="QUOTE"
>"+block-as-image"</SPAN
>"Aliases"</SPAN
></A
> are defined in
- the first section of the actions file and typically used to combine more
+ the first section of the actions file and typically used to combine more
than one action.)</P
><P
-> Any one of these would have done the trick and blocked this as an unwanted
- image. This is unnecessarily redundant since the last case effectively
- would also cover the first. No point in taking chances with these guys
- though ;-) Note that if you want an ad or obnoxious
+> Any one of these would have done the trick and blocked this as an unwanted
+ image. This is unnecessarily redundant since the last case effectively
+ would also cover the first. No point in taking chances with these guys
+ though ;-) Note that if you want an ad or obnoxious
URL to be invisible, it should be defined as <SPAN
CLASS="QUOTE"
>"ad.doubleclick.net"</SPAN
HREF="actions-file.html#BLOCK"
><SPAN
CLASS="QUOTE"
->"+block"</SPAN
+>"+block{}"</SPAN
></A
>
<SPAN
CLASS="EMPHASIS"
>and</I
></SPAN
-> an
+> an
<A
HREF="actions-file.html#HANDLE-AS-IMAGE"
><SPAN
>"http://www.example.net/adsl/HOWTO/"</SPAN
>.
This one is giving us problems. We are getting a blank page. Hmmm ...</P
-><P
-> <TABLE
+><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TD
><PRE
CLASS="SCREEN"
-> Matches for http://www.example.net/adsl/HOWTO/:
+> Matches for http://www.example.net/adsl/HOWTO/:
In file: default.action <SPAN
CLASS="GUIBUTTON"
>[ Edit ]</SPAN
>
- {-add-header
+ {-add-header
-block
+ +change-x-forwarded-for{block}
+ -client-header-filter{hide-tor-exit-notation}
-content-type-overwrite
-crunch-client-header
-crunch-if-none-match
-crunch-incoming-cookies
-crunch-outgoing-cookies
-crunch-server-header
- +deanimate-gifs
- -downgrade-http-version
+ +deanimate-gifs
+ -downgrade-http-version
+fast-redirects {check-decoded-url}
-filter {js-events}
-filter {content-cookies}
-filter {yahoo}
-filter {msn}
-filter {blogspot}
- -filter {xml-to-html}
- -filter {html-to-xml}
- -filter-client-headers
- -filter-server-headers
+ -filter {no-ping}
-force-text-mode
-handle-as-empty-document
- -handle-as-image
+ -handle-as-image
-hide-accept-language
- -hide-content-disposition
- +hide-forwarded-for-headers
- +hide-from-header{block}
- +hide-referer{forge}
- -hide-user-agent
- -inspect-jpegs
- -kill-popups
+ -hide-content-disposition
+ +hide-from-header{block}
+ +hide-referer{forge}
+ -hide-user-agent
-overwrite-last-modified
- +prevent-compression
+ +prevent-compression
-redirect
- -send-vanilla-wafer
- -send-wafer
- +session-cookies-only
- +set-image-blocker{blank}
- -treat-forbidden-connects-like-blocks }
+ -server-header-filter{xml-to-html}
+ -server-header-filter{html-to-xml}
+ +session-cookies-only
+ +set-image-blocker{blank} }
/
- { +block +handle-as-image }
+ { +block{Path starts with "ads".} +handle-as-image }
/ads</PRE
></TD
></TR
></TABLE
-></P
><P
> Ooops, the <SPAN
CLASS="QUOTE"
> is matching <SPAN
CLASS="QUOTE"
>"/ads"</SPAN
-> in our
+> in our
configuration! But we did not want this at all! Now we see why we get the
- blank page. It is actually triggering two different actions here, and
+ blank page. It is actually triggering two different actions here, and
the effects are aggregated so that the URL is blocked, and <SPAN
CLASS="APPLICATION"
>Privoxy</SPAN
-> is told
+> is told
to treat the block as if it were an image. But this is, of course, all wrong.
We could now add a new action below this (or better in our own
<TT
>"adsl"</SPAN
> in them (remember, last match in the configuration
wins). There are various ways to handle such exceptions. Example:</P
-><P
-> <TABLE
+><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TD
><PRE
CLASS="SCREEN"
-> { -block }
+> { -block }
/adsl</PRE
></TD
></TR
></TABLE
-></P
><P
-> Now the page displays ;-)
+> Now the page displays ;-)
Remember to flush your browser's caches when making these kinds of changes to
your configuration to ensure that you get a freshly delivered page! Or, try
using <TT
>Shift+Reload</TT
>.</P
><P
-> But now what about a situation where we get no explicit matches like
+> But now what about a situation where we get no explicit matches like
we did with:</P
-><P
-> <TABLE
+><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TD
><PRE
CLASS="SCREEN"
-> { +block +handle-as-image }
+> { +block{Path starts with "ads".} +handle-as-image }
/ads</PRE
></TD
></TR
></TABLE
-></P
><P
> That actually was very helpful and pointed us quickly to where the problem
- was. If you don't get this kind of match, then it means one of the default
+ was. If you don't get this kind of match, then it means one of the default
rules in the first section of <TT
CLASS="FILENAME"
>default.action</TT
>"+filter"</SPAN
></A
>:</P
-><P
-> <TABLE
+><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TD
><PRE
CLASS="SCREEN"
-> { shop }
+> { shop }
.quietpc.com
.worldpay.com # for quietpc.com
.jungle.com
></TD
></TR
></TABLE
-></P
><P
> <SPAN
CLASS="QUOTE"
> is an <SPAN
CLASS="QUOTE"
>"alias"</SPAN
-> that expands to
+> that expands to
<SPAN
CLASS="QUOTE"
>"<TT
>{ -filter -session-cookies-only }</TT
>"</SPAN
>.
- Or you could do your own exception to negate filtering: </P
-><P
-> <TABLE
+ Or you could do your own exception to negate filtering:</P
+><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TD
><PRE
CLASS="SCREEN"
-> { -filter }
+> { -filter }
# Disable ALL filter actions for sites in this section
.forbes.com
developer.ibm.com
></TD
></TR
></TABLE
-></P
><P
> This would turn off all filtering for these sites. This is best
put in <TT
>user.action</TT
>, for local site
exceptions. Note that when a simple domain pattern is used by itself (without
- the subsequent path portion), all sub-pages within that domain are included
- automatcially in the scope of the action.</P
+ the subsequent path portion), all sub-pages within that domain are included
+ automatically in the scope of the action.</P
><P
-> Images that are inexplicably being blocked, may well be hitting the
+> Images that are inexplicably being blocked may well be hitting the
<A
HREF="actions-file.html#FILTER-BANNERS-BY-SIZE"
><SPAN
>"+filter{banners-by-size}"</SPAN
></A
>
- rule, which assumes
- that images of certain sizes are ad banners (works well
+ rule, which assumes
+ that images of certain sizes are ad banners (works well
<SPAN
CLASS="emphasis"
><I
>"</SPAN
> is an alias that disables most
of the actions that are most likely to cause trouble. This can be used as a
- last resort for problem sites. </P
-><P
-> <TABLE
+ last resort for problem sites.</P
+><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TD
><PRE
CLASS="SCREEN"
-> { fragile }
+> { fragile }
# Handle with care: easy to break
mail.google.
mybank.example.com</PRE
></TD
></TR
></TABLE
-></P
><P
> <SPAN
CLASS="emphasis"
CLASS="EMPHASIS"
>Remember to flush caches!</I
></SPAN
-> Note that the
+> Note that the
<TT
CLASS="LITERAL"
>mail.google</TT
-> reference lacks the TLD portion (e.g.
+> reference lacks the TLD portion (e.g.
<SPAN
CLASS="QUOTE"
>".com"</SPAN
->. This will effectively match any TLD with
+>). This will effectively match any TLD with
<TT
CLASS="LITERAL"
>google</TT
> in it, such as <TT
CLASS="LITERAL"
->mail.google.de</TT
->,
+>mail.google.de</TT
+>,
just as an example.</P
><P
->
- If this still does not work, you will have to go through the remaining
+> If this still does not work, you will have to go through the remaining
actions one by one to find which one(s) is causing the problem.</P
></DIV
></DIV