-
- <p><span class="emphasis EMPHASIS c2"><tt class=
- "LITERAL">/.*/adv((er)?ts?|ertis(ing|ements?))?/</tt></span> - We have
- several literal forward slashes again (<span class="QUOTE">"/"</span>),
- so we are building another expression that is a file path statement. We
- have another <span class="QUOTE">".*"</span>, so we are matching
- against any conceivable sub-path, just so it matches our expression.
- The only true literal that <span class="emphasis EMPHASIS c2">must
- match</span> our pattern is <span class="APPLICATION">adv</span>,
- together with the forward slashes. What comes after the <span class=
- "QUOTE">"adv"</span> string is the interesting part.</p>
-
- <p>Remember the <span class="QUOTE">"?"</span> means the preceding
- expression (either a literal character or anything grouped with
- <span class="QUOTE">"(...)"</span> in this case) can exist or not,
- since this means either zero or one match. So <span class=
- "QUOTE">"((er)?ts?|ertis(ing|ements?))"</span> is optional, as are the
- individual sub-expressions: <span class="QUOTE">"(er)"</span>,
- <span class="QUOTE">"(ing|ements?)"</span>, and the <span class=
- "QUOTE">"s"</span>. The <span class="QUOTE">"|"</span> means
- <span class="QUOTE">"or"</span>. We have two of those. For instance,
- <span class="QUOTE">"(ing|ements?)"</span>, can expand to match either
- <span class="QUOTE">"ing"</span> <span class=
- "emphasis EMPHASIS c2">OR</span> <span class="QUOTE">"ements?"</span>.
- What is being done here, is an attempt at matching as many variations
- of <span class="QUOTE">"advertisement"</span>, and similar, as
- possible. So this would expand to match just <span class=
- "QUOTE">"adv"</span>, or <span class="QUOTE">"advert"</span>, or
- <span class="QUOTE">"adverts"</span>, or <span class=
- "QUOTE">"advertising"</span>, or <span class=
- "QUOTE">"advertisement"</span>, or <span class=
- "QUOTE">"advertisements"</span>. You get the idea. But it would not
- match <span class="QUOTE">"advertizements"</span> (with a <span class=
- "QUOTE">"z"</span>). We could fix that by changing our regular
- expression to: <span class=
- "QUOTE">"/.*/adv((er)?ts?|erti(s|z)(ing|ements?))?/"</span>, which
- would then match either spelling.</p>
-
- <p><span class="emphasis EMPHASIS c2"><tt class=
- "LITERAL">/.*/advert[0-9]+\.(gif|jpe?g)</tt></span> - Again another
- path statement with forward slashes. Anything in the square brackets
- <span class="QUOTE">"[ ]"</span> can be matched. This is using
- <span class="QUOTE">"0-9"</span> as a shorthand expression to mean any
- digit one through nine. It is the same as saying <span class=
- "QUOTE">"0123456789"</span>. So any digit matches. The <span class=
- "QUOTE">"+"</span> means one or more of the preceding expression must
- be included. The preceding expression here is what is in the square
- brackets -- in this case, any digit one through nine. Then, at the end,
- we have a grouping: <span class="QUOTE">"(gif|jpe?g)"</span>. This
- includes a <span class="QUOTE">"|"</span>, so this needs to match the
- expression on either side of that bar character also. A simple
- <span class="QUOTE">"gif"</span> on one side, and the other side will
- in turn match either <span class="QUOTE">"jpeg"</span> or <span class=
- "QUOTE">"jpg"</span>, since the <span class="QUOTE">"?"</span> means
- the letter <span class="QUOTE">"e"</span> is optional and can be
- matched once or not at all. So we are building an expression here to
- match image GIF or JPEG type image file. It must include the literal
- string <span class="QUOTE">"advert"</span>, then one or more digits,
- and a <span class="QUOTE">"."</span> (which is now a literal, and not a
- special character, since it is escaped with <span class=
- "QUOTE">"\"</span>), and lastly either <span class=
- "QUOTE">"gif"</span>, or <span class="QUOTE">"jpeg"</span>, or
- <span class="QUOTE">"jpg"</span>. Some possible matches would include:
- <span class="QUOTE">"//advert1.jpg"</span>, <span class=
- "QUOTE">"/nasty/ads/advert1234.gif"</span>, <span class=
- "QUOTE">"/banners/from/hell/advert99.jpg"</span>. It would not match
- <span class="QUOTE">"advert1.gif"</span> (no leading slash), or
- <span class="QUOTE">"/adverts232.jpg"</span> (the expression does not
- include an <span class="QUOTE">"s"</span>), or <span class=
- "QUOTE">"/advert1.jsp"</span> (<span class="QUOTE">"jsp"</span> is not
- in the expression anywhere).</p>
-
- <p>We are barely scratching the surface of regular expressions here so
- that you can understand the default <span class=
- "APPLICATION">Privoxy</span> configuration files, and maybe use this
- knowledge to customize your own installation. There is much, much more
- that can be done with regular expressions. Now that you know enough to
- get started, you can learn more on your own :/</p>
-
- <p>More reading on Perl Compatible Regular expressions: <a href=
- "http://perldoc.perl.org/perlre.html" target=
+ <p><span class="emphasis"><i class="EMPHASIS"><tt class=
+ "LITERAL">/.*/adv((er)?ts?|ertis(ing|ements?))?/</tt></i></span> - We have several literal forward slashes again
+ (<span class="QUOTE">"/"</span>), so we are building another expression that is a file path statement. We have
+ another <span class="QUOTE">".*"</span>, so we are matching against any conceivable sub-path, just so it matches
+ our expression. The only true literal that <span class="emphasis"><i class="EMPHASIS">must match</i></span> our
+ pattern is <span class="APPLICATION">adv</span>, together with the forward slashes. What comes after the
+ <span class="QUOTE">"adv"</span> string is the interesting part.</p>
+ <p>Remember the <span class="QUOTE">"?"</span> means the preceding expression (either a literal character or
+ anything grouped with <span class="QUOTE">"(...)"</span> in this case) can exist or not, since this means either
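+ zero or one match. So <span class="QUOTE">"((er)?ts?|ertis(ing|ements?))"</span> is optional, as are the
+ sub-expression <span class="QUOTE">"(er)"</span> and each <span class="QUOTE">"s"</span> that is followed by a
+ <span class="QUOTE">"?"</span>. The group <span class="QUOTE">"(ing|ements?)"</span> is not optional by itself:
+ once <span class="QUOTE">"ertis"</span> has matched, one of its alternatives must follow. The
+ <span class="QUOTE">"|"</span> means <span class="QUOTE">"or"</span>. We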
+ have two of those. For instance, <span class="QUOTE">"(ing|ements?)"</span> can expand to match either
+ <span class="QUOTE">"ing"</span> <span class="emphasis"><i class="EMPHASIS">OR</i></span> <span class=
+ "QUOTE">"ements?"</span>. The aim here is to match as many variations of <span class=
+ "QUOTE">"advertisement"</span>, and similar words, as possible. So this would expand to match just <span class=
+ "QUOTE">"adv"</span>, or <span class="QUOTE">"advert"</span>, or <span class="QUOTE">"adverts"</span>, or
+ <span class="QUOTE">"advertising"</span>, or <span class="QUOTE">"advertisement"</span>, or <span class=
+ "QUOTE">"advertisements"</span>. You get the idea. But it would not match <span class=
+ "QUOTE">"advertizements"</span> (with a <span class="QUOTE">"z"</span>). We could fix that by changing our
+ regular expression to: <span class="QUOTE">"/.*/adv((er)?ts?|erti(s|z)(ing|ements?))?/"</span>, which would then
+ match either spelling.</p>
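+ <p>The expansions listed above can be checked with a short script. This is only a sketch: it uses Python's
+ <tt class="LITERAL">re</tt> module as a stand-in for PCRE, and it assumes the pattern is anchored at the start of
+ the path and left open at the end. The helper <tt class="LITERAL">path_matches</tt> and the
+ <span class="QUOTE">"/banners/"</span> prefix are made up for illustration.</p>

```python
import re

# The pattern from the text, unchanged. Python's "re" stands in for PCRE here.
PATTERN = re.compile(r"/.*/adv((er)?ts?|ertis(ing|ements?))?/")
# The variant suggested in the text that also accepts the "z" spelling.
FIXED = re.compile(r"/.*/adv((er)?ts?|erti(s|z)(ing|ements?))?/")


def path_matches(pattern, path):
    """Hypothetical helper: match anchored at the start, open at the end."""
    return pattern.match(path) is not None


words = ["adv", "advert", "adverts", "advertising", "advertisement",
         "advertisements", "advertizements", "advertis"]
for word in words:
    # Wrap each word in slashes so it forms a path segment.
    path = "/banners/" + word + "/"
    print(path, path_matches(PATTERN, path), path_matches(FIXED, path))
```

+ <p>The last word, <span class="QUOTE">"advertis"</span>, is included to show that <span class=
+ "QUOTE">"ertis"</span> alone is not enough: one of <span class="QUOTE">"ing"</span> or <span class=
+ "QUOTE">"ements?"</span> has to follow it.</p>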
+ <p><span class="emphasis"><i class="EMPHASIS"><tt class="LITERAL">/.*/advert[0-9]+\.(gif|jpe?g)</tt></i></span> -
+ Again, this is a path statement with forward slashes. Any one of the characters inside the square brackets
+ <span class="QUOTE">"[ ]"</span> can be matched. This is using <span class="QUOTE">"0-9"</span> as a shorthand
+ expression to mean any digit zero through nine. It is the same as saying <span class=
+ "QUOTE">"0123456789"</span>. So any digit matches. The <span class="QUOTE">"+"</span> means one or more of the
+ preceding expression must be included. The preceding expression here is what is in the square brackets, in this
+ case any digit zero through nine. Then, at the end,
+ we have a grouping: <span class="QUOTE">"(gif|jpe?g)"</span>. This includes a <span class="QUOTE">"|"</span>, so
+ the text must match the expression on one side or the other of that bar character. A simple <span class=
+ "QUOTE">"gif"</span> on one side, and the other side will in turn match either <span class="QUOTE">"jpeg"</span>
+ or <span class="QUOTE">"jpg"</span>, since the <span class="QUOTE">"?"</span> means the letter <span class=
+ "QUOTE">"e"</span> is optional and can be matched once or not at all. So we are building an expression here to
+ match GIF or JPEG image files. It must include the literal string <span class="QUOTE">"advert"</span>,
+ then one or more digits, and a <span class="QUOTE">"."</span> (which is now a literal, and not a special
+ character, since it is escaped with <span class="QUOTE">"\"</span>), and lastly either <span class=
+ "QUOTE">"gif"</span>, or <span class="QUOTE">"jpeg"</span>, or <span class="QUOTE">"jpg"</span>. Some possible
+ matches would include: <span class="QUOTE">"//advert1.jpg"</span>, <span class=
+ "QUOTE">"/nasty/ads/advert1234.gif"</span>, <span class="QUOTE">"/banners/from/hell/advert99.jpg"</span>. It
+ would not match <span class="QUOTE">"advert1.gif"</span> (no leading slash), or <span class=
+ "QUOTE">"/adverts232.jpg"</span> (the expression does not include an <span class="QUOTE">"s"</span>), or
+ <span class="QUOTE">"/advert1.jsp"</span> (<span class="QUOTE">"jsp"</span> is not in the expression
+ anywhere).</p>
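+ <p>The matching and non-matching paths above can be tried out the same way. Again this is a sketch under
+ assumptions: Python's <tt class="LITERAL">re</tt> module stands in for PCRE, the match is anchored at the start of
+ the path only, and the helper name <tt class="LITERAL">image_matches</tt> is hypothetical.</p>

```python
import re

# The pattern from the text, unchanged. Python's "re" stands in for PCRE here.
IMAGE_PATTERN = re.compile(r"/.*/advert[0-9]+\.(gif|jpe?g)")


def image_matches(path):
    """Hypothetical helper: match anchored at the start, open at the end."""
    return IMAGE_PATTERN.match(path) is not None


# Paths taken from the text, plus one with the digit zero and "jpeg".
candidates = [
    "//advert1.jpg",
    "/nasty/ads/advert1234.gif",
    "/banners/from/hell/advert99.jpg",
    "/ads/advert0.jpeg",
    "advert1.gif",        # no leading slash
    "/adverts232.jpg",    # the pattern has no "s"
    "/advert1.jsp",       # "jsp" is not in the pattern
]
for path in candidates:
    print(path, image_matches(path))
```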
+ <p>We are barely scratching the surface of regular expressions here so that you can understand the default
+ <span class="APPLICATION">Privoxy</span> configuration files, and maybe use this knowledge to customize your own
+ installation. There is much, much more that can be done with regular expressions. Now that you know enough to get
+ started, you can learn more on your own.</p>
+ <p>More reading on Perl Compatible Regular expressions: <a href="http://perldoc.perl.org/perlre.html" target=