This file belongs into
ijbswa.sourceforge.net:/home/groups/i/ij/ijbswa/htdocs/
- $Id: user-manual.sgml,v 2.30 2007/04/25 15:10:36 fabiankeil Exp $
+ $Id: user-manual.sgml,v 2.36 2007/08/26 16:47:14 fabiankeil Exp $
Copyright (C) 2001-2007 Privoxy Developers http://www.privoxy.org/
See LICENSE.
</subscript>
</pubdate>
-<pubdate>$Id: user-manual.sgml,v 2.30 2007/04/25 15:10:36 fabiankeil Exp $</pubdate>
+<pubdate>$Id: user-manual.sgml,v 2.36 2007/08/26 16:47:14 fabiankeil Exp $</pubdate>
<!--
<listitem>
<para>
Header filtering can be done with dedicated header filters now. As a result
- the actions <q>filter-client-headers</q> and <q>filter-server-headers</q>
+ the actions <quote>filter-client-headers</quote> and <quote>filter-server-headers</quote>
that were introduced with <application>Privoxy 3.0.5</application> to apply
the content filters to the headers as, well have been removed again.
</para>
</para>
<para>
- <application>Privoxy</application> is HTTP/1.1 compliant, but not all of
- the optional 1.1 features are as yet supported. In the unlikely event that
- you experience inexplicable problems with browsers that use HTTP/1.1 per default
+ <application>Privoxy</application> does not support all of the optional HTTP/1.1
+ features yet. In the unlikely event that you experience inexplicable problems
+ with browsers that use HTTP/1.1 per default
(like <application>Mozilla</application> or recent versions of I.E.), you might
try to force HTTP/1.0 compatibility. For Mozilla, look under <literal>Edit ->
Preferences -> Debug -> Networking</literal>.
<listitem>
<para>
<emphasis>--pidfile FILE</emphasis>
-
</para>
<para>
On startup, write the process ID to <emphasis>FILE</emphasis>. Delete the
<listitem>
<para>
<emphasis>--user USER[.GROUP]</emphasis>
-
</para>
<para>
After (optionally) writing the PID file, assume the user ID of
privileges are not sufficient to do so. Unix only.
</para>
</listitem>
- <listitem>
+ <listitem>
<para>
<emphasis>--chroot</emphasis>
-
</para>
<para>
Before changing to the user ID given in the <emphasis>--user</emphasis> option,
Unix only.
</para>
</listitem>
+ <listitem>
+ <para>
+ <emphasis>--pre-chroot-nslookup hostname</emphasis>
+ </para>
+ <para>
+ Specifies a hostname to look up before doing a chroot. On some systems, initializing the
+ resolver library involves reading config files from /etc and/or loading additional shared
+ libraries from /lib. On these systems, doing a hostname lookup before the chroot reduces
+ the number of files that must be copied into the chroot tree.
+ </para>
+ <para>
+ For fastest startup speed, a good value is a hostname that is not in /etc/hosts but that
+ your local name server (listed in /etc/resolv.conf) can resolve without recursion
+ (that is, without having to ask any other name servers). The hostname need not exist,
+ but if it doesn't, an error message (which can be ignored) will be output.
+ </para>
+ </listitem>
+
<listitem>
<para>
<emphasis>configfile</emphasis>
</para>
<para>
- The syntax of all configuration files has remained the same throughout the
- 3.x series. There have been enhancements, but no changes that would preclude
- the use of any configuration file from one version to the next. (There is
- one exception: <link linkend="FAST-REDIRECTS">+fast-redirects</link> which
- has enhanced syntax and will require updating any local configs from earlier
- versions.)
+ The syntax of the configuration and filter files may change between different
+ Privoxy versions; unfortunately, some enhancements cost backwards compatibility.
+ <!-- Add link to documentation-->
</para>
<para>
</para>
<para>
The default profiles, and their associated actions, as pre-defined in
- <filename>standard.action</filename> are:
+ <filename>standard.action</filename> are<!-- different than this table which is out of date -->:
</para>
<para>
<table frame=all><title>Default Configurations</title>
edited from <ulink
url="http://config.privoxy.org/show-status">http://config.privoxy.org/show-status</ulink>.
The over-riding principle when applying actions, is that the last action that
- matches a given URL, wins. The broadest, most general rules go first
+ matches a given URL wins. The broadest, most general rules go first
(defined in <filename>default.action</filename>),
followed by any exceptions (typically also in
<filename>default.action</filename>), which are then followed lastly by any
from consulting any previous file). And then below that,
exceptions to the defined universal policies. You can regard
<filename>user.action</filename> as an appendix to <filename>default.action</filename>,
- with the advantage that is a separate file, which makes preserving your
+ with the advantage that it is a separate file, which makes preserving your
personal settings across <application>Privoxy</application> upgrades easier.
</para>
<para>
Actions can be used to block anything you want, including ads, banners, or
- just some obnoxious URL that you would rather not see. Cookies can be accepted
+ just some obnoxious URL whose content you would rather not see. Cookies can be accepted
or rejected, or accepted only during the current browser session (i.e. not
- written to disk), content can be modified, JavaScripts tamed, user-tracking
+ written to disk), content can be modified, some JavaScripts tamed, user-tracking
fooled, and much more. See below for a <link linkend="actions">complete list
of actions</link>.
</para>
will have to make later. If, for example, you want to crunch all cookies per
default, you'll have to make exceptions from that rule for sites that you
regularly use and that require cookies for actually useful purposes, like maybe
- your bank, favorite shop, or newspaper.
+ your bank, favorite shop, or newspaper.
</para>
<para>
</para>
<para>
- Generally, a URL pattern has the form
+ Generally, an URL pattern has the form
<literal><domain>/<path></literal>, where both the
<literal><domain></literal> and <literal><path></literal> are
optional. (This is why the special <literal>/</literal> pattern matches all
</listitem>
</varlistentry>
<varlistentry>
- <term><literal>www.example.com/index.html</literal></term>
+ <term><literal>www.example.com/index.html</literal></term>
+ <listitem>
+ <para>
+ matches all the documents on <literal>www.example.com</literal>
+ whose name starts with <literal>/index.html</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term><literal>www.example.com/index.html$</literal></term>
<listitem>
<para>
matches only the single document <literal>/index.html</literal>
</listitem>
</varlistentry>
<varlistentry>
- <term><literal>/index.html</literal></term>
+ <term><literal>/index.html$</literal></term>
<listitem>
<para>
matches the document <literal>/index.html</literal>, regardless of the domain,
<term><literal>index.html</literal></term>
<listitem>
<para>
- matches nothing, since it would be interpreted as a domain name and
+ matches nothing, since it would be interpreted as a domain name and
there is no top-level domain called <literal>.html</literal>. So its
a mistake.
</para>
</listitem>
</varlistentry>
<varlistentry>
- <term><literal>.example.com/.*/index.html</literal></term>
+ <term><literal>.example.com/.*/index.html$</literal></term>
<listitem>
<para>
Will match any page in the domain of <quote>example.com</quote> that is
</listitem>
</varlistentry>
<varlistentry>
- <term><literal>.example.com/(.*/)?index\.html</literal></term>
+ <term><literal>.example.com/(.*/)?index\.html$</literal></term>
<listitem>
<para>
This regular expression is conditional so it will match any page
</sect3>
-</sect2>
-
<!-- ~ End section ~ -->
<para>
Tag patterns are used to change the applying actions based on the
request's tags. Tags can be created with either the
- <link linkend="CLIENT-HEADER-FILTER">client-header-tagger</link>
- or the <link linkend="SERVER-HEADER-FILTER">server-header-tagger</link> action.
+ <link linkend="CLIENT-HEADER-TAGGER">client-header-tagger</link>
+ or the <link linkend="SERVER-HEADER-TAGGER">server-header-tagger</link> action.
</para>
<para>
Tag patterns have to start with <quote>TAG:</quote>, so &my-app;
can tell them apart from URL patterns. Everything after the colon
including white space, is interpreted as a regular expression with
- path patterns syntax, except that tag patterns aren't left-anchored
+ path pattern syntax, except that tag patterns aren't left-anchored
automatically (Privoxy doesn't silently add a <quote>^</quote>,
you have to do it yourself if you need it).
</para>
your pattern line should be <quote>TAG:^foo$</quote>,
<quote>TAG:foo</quote> would work as well, but it would also
match requests whose tags contain <quote>foo</quote> somewhere.
+ <quote>TAG: foo</quote> wouldn't work, as it would require white space in front of <quote>foo</quote>.
</para>
<para>
the last match wins, i.e. the params from earlier matches are simply ignored.
</para>
<para>
- Example: <literal>+hide-user-agent{ Mozilla 1.0 }</literal>
+ Example: <literal>+hide-user-agent{Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.8.1.4) Gecko/20070602 Firefox/2.0.0.4}</literal>
</para>
</listitem>
<para>
If nothing is specified in any actions file, no <quote>actions</quote> are
taken. So in this case <application>Privoxy</application> would just be a
- normal, non-blocking, non-anonymizing proxy. You must specifically enable the
+ normal, non-blocking, non-filtering proxy. You must specifically enable the
privacy and blocking features you need (although the provided default actions
files will give a good starting point).
</para>
<para>
- Later defined actions always over-ride earlier ones. So exceptions
- to any rules you make, should come in the latter part of the file (or
+ Later defined action sections always over-ride earlier ones of the same type.
+ So exceptions to any rules you make should come in the latter part of the file (or
in a file that is processed later when using multiple actions files such
as <filename>user.action</filename>). For multi-valued actions, the actions
are applied in the order they are specified. Actions files are processed in
create your own.
</para>
+ </listitem>
</varlistentry>
<varlistentry>
Client-header taggers are the first actions that are executed
and their tags can be used to control every other action.
</para>
-
+ </listitem>
</varlistentry>
<varlistentry>
<!-- ~~~~~ New section ~~~~~ -->
<sect3 renderas="sect4" id="content-type-overwrite">
-<!--
-new action
--->
<title>content-type-overwrite</title>
<variablelist>
</para>
<para>
It is also useful to make sure the header isn't used as a cookie
- replacement.
+ replacement (unlikely but possible).
</para>
<para>
Blocking the <quote>If-None-Match:</quote> header shouldn't cause any
caching problems, as long as the <quote>If-Modified-Since:</quote> header
- isn't blocked as well.
+ isn't blocked or missing as well.
</para>
<para>
It is recommended to use this action together with
<term>Example usage (section):</term>
<listitem>
<para>
- <screen># Let the browser revalidate cached documents without being tracked across sessions
-{ +hide-if-modified-since{-60} \
+ <screen># Let the browser revalidate cached documents but don't
+# allow the server to use the revalidation headers for user tracking.
+{+hide-if-modified-since{-60} \
+overwrite-last-modified{randomize} \
+crunch-if-none-match}
/ </screen>
<term>Typical use:</term>
<listitem>
<para>
- Prevent the web server from setting any cookies on your system
+ Prevent the web server from setting HTTP cookies on your system
</para>
</listitem>
</varlistentry>
<term>Notes:</term>
<listitem>
<para>
- This action is only concerned with <emphasis>incoming</emphasis> cookies. For
- <emphasis>outgoing</emphasis> cookies, use
+ This action is only concerned with <emphasis>incoming</emphasis> HTTP cookies. For
+ <emphasis>outgoing</emphasis> HTTP cookies, use
<literal><link linkend="crunch-outgoing-cookies">crunch-outgoing-cookies</link></literal>.
- Use <emphasis>both</emphasis> to disable cookies completely.
+ Use <emphasis>both</emphasis> to disable HTTP cookies completely.
</para>
<para>
It makes <emphasis>no sense at all</emphasis> to use this action in conjunction
<term>Typical use:</term>
<listitem>
<para>
- Prevent the web server from reading any cookies from your system
+ Prevent the web server from reading any HTTP cookies from your system
</para>
</listitem>
</varlistentry>
<term>Notes:</term>
<listitem>
<para>
- This action is only concerned with <emphasis>outgoing</emphasis> cookies. For
- <emphasis>incoming</emphasis> cookies, use
+ This action is only concerned with <emphasis>outgoing</emphasis> HTTP cookies. For
+ <emphasis>incoming</emphasis> HTTP cookies, use
<literal><link linkend="crunch-incoming-cookies">crunch-incoming-cookies</link></literal>.
- Use <emphasis>both</emphasis> to disable cookies completely.
+ Use <emphasis>both</emphasis> to disable HTTP cookies completely.
</para>
<para>
It makes <emphasis>no sense at all</emphasis> to use this action in conjunction
This is a left-over from the time when <application>Privoxy</application>
didn't support important HTTP/1.1 features well. It is left here for the
unlikely case that you experience HTTP/1.1 related problems with some server
- out there. Not all (optional) HTTP/1.1 features are supported yet, so there
- is a chance you might need this action.
+ out there. Not all HTTP/1.1 features and requirements are supported yet,
+ so there is a chance you might need this action.
</para>
</listitem>
</varlistentry>
<para>
<screen>
{ +fast-redirects{simple-check} }
- .example.com
+ one.example.com
{ +fast-redirects{check-decoded-url} }
another.example.com/testing</screen>
</para>
<para>
<anchor id="filter-ie-exploits">
- <screen>+filter{ie-exploits} # Disable some known Internet Explorer bug exploits</screen>
+ <screen>+filter{ie-exploits} # Disable a known Internet Explorer bug exploit</screen>
</para>
<para>
<anchor id="filter-site-specifics">
<term>Effect:</term>
<listitem>
<para>
- Overrules the forward directives in the configuration files.
+ Overrules the forward directives in the configuration file.
</para>
</listitem>
</varlistentry>
</listitem>
<listitem>
<para>
- <quote>forward-socks4a 127.0.0.1:9050 .</quote> to use the socks4a proxy listening at 127.0.0.1 port 9050.
- Replace <q>forward-socks4a</q> with <q>forward-socks4</q> to use a socks4 connection (with local DNS
- resolution) instead.
+ <quote>forward-socks4a 127.0.0.1:9050 .</quote> to use the socks4a proxy listening at
+ 127.0.0.1 port 9050. Replace <quote>forward-socks4a</quote> with <quote>forward-socks4</quote>
+ to use a socks4 connection (with local DNS resolution) instead.
</para>
</listitem>
<listitem>
<para>
<quote>forward-socks4a 127.0.0.1:9050 proxy.example.org:8000</quote> to use the socks4a proxy
listening at 127.0.0.1 port 9050 to reach the HTTP proxy listening at proxy.example.org port 8000.
- Replace <q>forward-socks4a</q> with <q>forward-socks4</q> to use a socks4 connection (with local DNS
- resolution) instead.
+ Replace <quote>forward-socks4a</quote> with <quote>forward-socks4</quote> to use a socks4 connection
+ (with local DNS resolution) instead.
</para>
</listitem>
</itemizedlist>
to exit.
</para>
<para>
- Use the <link linkend="http://config.privoxy.org/show-url-info">show-url-info CGI page</link>
+ Use the <ulink url="http://config.privoxy.org/show-url-info">show-url-info CGI page</ulink>
to verify that your forward settings do what you thought the do.
</para>
</warning>
-hide-if-modified-since \
-overwrite-last-modified \
}
-TAG:^User-Agent: fetch libfetch/2.0$
+TAG:^User-Agent: fetch libfetch/2\.0$
</screen>
</para>
</listitem>
# blocked as images:
#
{+block +handle-as-image}
-some.nasty-banner-server.com/junk.cgi?output=trash
+some.nasty-banner-server.com/junk.cgi\?output=trash
# Banner source! Who cares if they also have non-image content?
ad.doubleclick.net
to another one, but in most cases it isn't worth the time to set
it up.
</para>
+ <para>
+ This action will probably be removed in the future;
+ use server-header filters instead.
+ </para>
</listitem>
</varlistentry>
<!-- ~~~~~ New section ~~~~~ -->
<sect3 renderas="sect4" id="hide-forwarded-for-headers">
<title>hide-forwarded-for-headers</title>
-<!--
-new action
--->
<variablelist>
<varlistentry>
<term>Typical use:</term>
<listitem>
- <para>Improve privacy by hiding the true source of the request</para>
+ <para>Improve privacy by not embedding the source of the request in the HTTP headers.</para>
</listitem>
</varlistentry>
<term>Notes:</term>
<listitem>
<para>
- It is fairly safe to leave this on.
- </para>
- <para>
- This action is scheduled for improvement: It should be able to generate forged
- <quote>X-Forwarded-for:</quote> headers using random IP addresses from a specified network,
- to make successive requests from the same client look like requests from a pool of different
- users sharing the same proxy.
+ It is safe to leave this on.
</para>
</listitem>
</varlistentry>
<listitem>
<para><quote>conditional-block</quote> to delete the header completely if the host has changed.</para>
</listitem>
+<!--
+ <listitem>
+ <para><quote>conditional-forge</quote> to forge the header if the host has changed.</para>
+ </listitem>
+-->
<listitem>
<para><quote>block</quote> to delete the header unconditionally.</para>
</listitem>
(Must be just a silly MS goof, I'm sure :-).
</para>
<para>
- This action is scheduled for improvement.
+ More information on known user-agent strings can be found at
+ <ulink url="http://www.user-agents.org/">http://www.user-agents.org/</ulink>
+ and
+ <ulink url="http://en.wikipedia.org/wiki/User_agent">http://en.wikipedia.org/wiki/User_agent</ulink>.
</para>
</listitem>
</varlistentry>
allow execution of code on the target system, giving an attacker access
to the system in question by merely planting an altered JPEG image, which
would have no obvious indications of what lurks inside. This action
- prevents unwanted intrusion.
+ prevents this exploit.
+ </para>
+ <para>
+ Note that the described exploit is only one of many;
+ using this action does not mean that you no longer
+ have to patch the client.
</para>
</listitem>
</sect3>
-
-
<!-- ~~~~~ New section ~~~~~ -->
<sect3 renderas="sect4" id="kill-popups">
<title>kill-popups<anchor id="kill-popup"></title>
to learn which server-header filters are available by default, and how to
create your own.
</para>
+ </listitem>
</varlistentry>
<varlistentry>
doesn't prevent the request from showing up in the server's log file.
</para>
+ </listitem>
</varlistentry>
<varlistentry>
USA
$Log: user-manual.sgml,v $
+ Revision 2.36 2007/08/26 16:47:14 fabiankeil
+ Add Stephen Gildea's --pre-chroot-nslookup patch [#1276666],
+ extensive comments moved to user manual.
+
+ Revision 2.35 2007/08/26 14:59:49 fabiankeil
+ Minor rewordings and fixes.
+
+ Revision 2.34 2007/08/05 15:19:50 fabiankeil
+ - Don't claim HTTP/1.1 compliance.
+ - Use $ in some of the path pattern examples.
+ - Use a hide-user-agent example argument without
+ leading and trailing space.
+ - Make it clear that the cookie actions work with
+ HTTP cookies only.
+ - Rephrase the inspect-jpegs text to underline
+ that it's only meant to protect against a single
+ exploit.
+
+ Revision 2.33 2007/07/27 10:57:35 hal9
+ Add references for user-agent strings for hide-user-agenet
+
+ Revision 2.32 2007/06/07 12:36:22 fabiankeil
+ Apply Roland's 29_usermanual.dpatch to fix a bunch
+ of syntax errors I collected over the last months.
+
+ Revision 2.31 2007/06/02 14:01:37 fabiankeil
+ Start to document forward-override{}.
+
Revision 2.30 2007/04/25 15:10:36 fabiankeil
- Describe installation for FreeBSD.
- Start to document taggers and tag patterns.