X-Git-Url: http://www.privoxy.org/gitweb/?p=privoxy.git;a=blobdiff_plain;f=default.filter;h=8fa455b8f5176ae2940738a50c6551811af0c5cf;hp=b4904b831596418ba6d0eef98212961d17ba71ff;hb=f3616f2012ca291e81b33dfc716892d54ad8fde3;hpb=97110184221edacdb7d4dfc29de0e38d63f4d831
diff --git a/default.filter b/default.filter
index b4904b83..8fa455b8 100644
--- a/default.filter
+++ b/default.filter
@@ -2,237 +2,415 @@
 #
 # File : $Source: /cvsroot/ijbswa/current/default.filter,v $
 #
+# $Id: default.filter,v 1.67 2008/08/06 17:38:06 fabiankeil Exp $
+#
 # Purpose : Rules to process the content of web pages
 #
-# Copyright : Written by and Copyright (C) 2001 the SourceForge
+# Copyright : Written by and Copyright (C) 2001-2008 the
 # Privoxy team. http://www.privoxy.org/
+#
+# We value your feedback. However, to provide you with the best support,
+# please note:
+#
+# * Use the support forum to get help:
+# http://sourceforge.net/tracker/?group_id=11118&atid=211118
+# * Submit bugs only thru our bug forum:
+# http://sourceforge.net/tracker/?group_id=11118&atid=111118
+# Make sure that the bug has not already been submitted. Please try
+# to verify that it is a Privoxy bug, and not a browser or site
+# bug first. If you are using your own custom configuration, please
+# try the stock configs to see if the problem is a configuration
+# related bug. And if not using the latest development snapshot,
+# please try the latest one. Or even better, CVS sources.
+# * Submit feature requests only thru our feature request forum:
+# http://sourceforge.net/tracker/?atid=361118&group_id=11118&func=browse
+#
+# For any other issues, feel free to use the mailing lists:
+# http://sourceforge.net/mail/?group_id=11118
+#
+# Anyone interested in actively participating in development and related
+# discussions can join the appropriate mailing list here:
+# http://sourceforge.net/mail/?group_id=11118. Archives are available
+# here too.
 #
-# This program is free software; you can redistribute it
-# and/or modify it under the terms of the GNU General
-# Public License as published by the Free Software
-# Foundation; either version 2 of the License, or (at
-# your option) any later version.
-#
-# This program is distributed in the hope that it will
-# be useful, but WITHOUT ANY WARRANTY; without even the
-# implied warranty of MERCHANTABILITY or FITNESS FOR A
-# PARTICULAR PURPOSE. See the GNU General Public
-# License for more details.
-#
-# The GNU General Public License should be included with
-# this file. If not, you can view it at
-# http://www.gnu.org/copyleft/gpl.html
-# or write to the Free Software Foundation, Inc., 59
-# Temple Place - Suite 330, Boston, MA 02111-1307, USA.
-#
-# Revisions :
-# $Log: default.filter,v $
-# Revision 1.3 2002/03/24 16:08:03 jongfoster
-# Fixing banners-by-size for new config URLs
+#################################################################################
 #
-# Revision 1.2 2002/03/24 13:02:18 swa
-# name change related issues.
+# Syntax:
 #
-# Revision 1.1 2002/03/24 11:37:39 jongfoster
-# Name change
+# Generally filters start with a line like "FILTER: name description".
+# They are then referrable from the actionsfile with +filter{name}
 #
-# Revision 1.24 2002/03/16 20:39:54 oes
-# - Added descriptions to the filters so users will know what they select in the cgi editor
-# - Added content-cookies filter
-# - Bugfixed many jobs (Thanks to Al for some hints)
+# FILTER marks a filter as content filter, other filter
+# types are CLIENT-HEADER-FILTER, CLIENT-HEADER-TAGGER,
+# SERVER-HEADER-FILTER and SERVER-HEADER-TAGGER.
 #
-# Revision 1.22 2002/03/12 13:42:50 oes
-# Fixing & Optimizing REs
+# Inside the filters, write one Perl-Style substitution (job) per line.
+# Jobs that precede the first FILTER: line are ignored.
 #
-# Revision 1.21 2002/03/12 11:59:20 oes
-# Beefed up Buzzword Bingo
+# For Details see the pcrs manpage contained in this distribution.
+# (and the perlre, perlop and pcre manpages)
 #
-# Revision 1.20 2002/03/12 01:42:50 oes
-# Introduced modular filters
+# Note that you are free to choose the delimiter as you see fit.
 #
-# Revision 1.19 2002/03/10 19:49:24 oes
-# Added expression to kill referer tracking in JavaScripts
+# Note2: In addition to the Perl options gimsx, the following nonstandard
+# options are supported:
+#
+# 'U' turns the default to ungreedy matching. Add ? to quantifiers to
+# switch back to greedy.
 #
-# Revision 1.18 2002/03/08 17:14:12 oes
-# PNG -> image in comments
+# 'T' (trivial) prevents parsing for backreferences in the substitute.
+# Use if you want to include text like '$&' in your substitute without
+# quoting.
 #
-# Revision 1.17 2002/03/07 03:50:54 oes
-# Adapted comments to new built-in images
+# 'D' (Dynamic) allows the use of variables. Supported variables are:
+# $host, $origin (the IP address the request came from), $path and $url.
 #
-# Revision 1.16 2002/02/21 00:12:19 jongfoster
-# Modifying the banner regexps to use long URLS and to autodetect
-# whether to show a logo or a transparent GIF, based on actionsfile
-# setting.
+# Note that '$' is a bad choice as delimiter for dynamic filters as you
+# might end up with unintended variables if you use a variable name
+# directly after the delimiter. Variables will be resolved without
+# escaping anything, therefore you also have to be careful not to choose
+# delimiters that appear in the replacement text. For example '<' should
+# be safe, while '?' will sooner or later cause conflicts with $url.
+#
+#################################################################################
+
+
+#################################################################################
 #
-# Revision 1.15 2001/12/28 23:54:20 steudten
-# Fix for feature Req #495374: http-equiv problem
+# js-annoyances: Get rid of particularly annoying JavaScript abuse.
# -# Revision 1.14 2001/12/09 18:55:11 david__schmidt -# Updated CODE_STATUS to beta, commented out microsuck line in re_filterfile -# for 2.9.10 beta +################################################################################# +FILTER: js-annoyances Get rid of particularly annoying JavaScript abuse. + +# Note: Most of these jobs would be safer if restricted to a +# )|$1never|sigU + +# If we allow window.open, we want normal window features: +# Test: http://www.htmlgoodies.com/beyond/notitle.html +# +s/(open\s*\([^\)]+resizable=)(["']?)(?:no|0)\2/$1$2yes$2/sigU +s/(open\s*\([^\)]+location=)(["']?)(?:no|0)\2/$1$2yes$2/sigU +s/(open\s*\([^\)]+status=)(["']?)(?:no|0)\2/$1$2yes$2/sigU +s/(open\s*\([^\)]+scroll(?:ing|bars)=)(["']?)(?:no|0)\2/$1$2auto$2/sigU +s/(open\s*\([^\)]+menubar=)(["']?)(?:no|0)\2/$1$2yes$2/sigU +s/(open\s*\([^\)]+toolbar=)(["']?)(?:no|0)\2/$1$2yes$2/sigU +s/(open\s*\([^\)]+directories=)(["']?)(?:no|0)\2/$1$2yes$2/sigU +s/(open\s*\([^\)]+fullscreen=)(["']?)(?:yes|1)\2/$1$2no$2/sigU +s/(open\s*\([^\)]+always(?:raised|lowered)=)(["']?)(?:yes|1)\2/$1$2no$2/sigU +s/(open\s*\([^\)]+z-?lock=)(["']?)(?:yes|1)\2/$1$2no$2/sigU +s/(open\s*\([^\)]+hotkeys=)(["']?)(?:yes|1)\2/$1$2no$2/sigU +s/(open\s*\([^\)]+titlebar=)(["']?)(?:no|0)\2/$1$2yes$2/sigU +s/(open\s*\([^\)]+always(?:raised|lowered)=)(["']?)(?:yes|1)\2/$1$2no$2/sigU + + +################################################################################# # -# Revision 1.8 2001/06/29 13:34:00 oes -# - Added explanation for U and T options -# - Added hint on image replacement by CGI call -# - Fixed bug in banner-by-size jobs +# js-events: Kill all JS event bindings and timers (Radically destructive! Only for extra nasty sites). # -# Revision 1.7 2001/06/19 14:21:56 oes -# Fixed microsuck line +################################################################################# +FILTER: js-events Kill all JS event bindings and timers (Radically destructive! Only for extra nasty sites). 
+ +s/(on|event\.)((mouse(over|out|down|up|move))|(un)?load|contextmenu|selectstart)/never/ig +# Not events, but abused on the same type of sites: +s/(alert|confirm)\s*\(/concat(/ig +s/settimeout\(/concat(/ig + +################################################################################# # -# Revision 1.6 2001/06/09 14:01:57 swa -# header. cosmetics. default: no messing ala microsuck. +# html-annoyances: Get rid of particularly annoying HTML abuse. # +################################################################################# +FILTER: html-annoyances Get rid of particularly annoying HTML abuse. + +# New browser windows (if allowed -- see no-popups filter below) should be +# resizeable and have a location and status bar # -# +s/(]+resizable=)(['"]?)(?:no|0)\2/$1$2yes$2/igU +s/(]+location=)(['"]?)(?:no|0)\2/$1$2yes$2/igU +s/(]+status=)(['"]?)(?:no|0)\2/$1$2yes1$2/igU +s/(]+scrolling=)(['"]?)(?:no|0)\2/$1$2auto$2/igU +s/(]+menubar=)(['"]?)(?:no|0)\2/$1$2yes$2/igU + +# The and tags were crimes! +# +s---sigU + ################################################################################# # -# Syntax: +# content-cookies: Kill cookies that come in the HTML or JS content. # ################################################################################# +FILTER: content-cookies Kill cookies that come in the HTML or JS content. + +# JS cookies, except those used by antiadbuster.com to detect us: # -# Filters start with a line "FILTER: name description". They are then referrable -# from the actionsfile with +filter{name} -# -# Inside the filters, write one Perl-Style substitution (job) per line. -# Jobs that precede the first FILTER: line are ignored. +s|(\w+\.)+cookie(?=[ \t\r\n]*=)(?!='aab)|ZappedCookie|ig + +# HTML cookies: # -# For Details see the pcrs manpage contained in this distribution. 
-# (and the perlre, perlop and pcre manpages) +s|||igU + + +################################################################################# # -# Note that you are free to choose the delimter as you see fit. +# refresh-tags: Kill automatic refresh tags (for dial-on-demand setups). # -# Note2: In addidion to the Perl options gimsx, the following nonstandard -# options are supported: -# -# 'U' turns the default to ungreedy matching. Add ? to quantifiers to -# switch back to greedy. -# 'T' (trivial) prevents parsing for backreferences in the substitute. -# Use if you want to include text like '$&' in your substitute without -# quoting. -# ################################################################################# +FILTER: refresh-tags Kill automatic refresh tags (for dial-on-demand setups). + +# Note: Only deactivates refreshes with more than 9 seconds delay to +# preserve monster-stupid but common redirections via meta tags. +# +s/\2]*))?\2/)(?=\s*[^'"])+$1+isU +s@([^\w\s.]\s*)((?:map)?(window|this|parent)\.?)?open\s*\(@$1PrivoxyWindowOpen(@ig +s+([^'"]\s*)(?!\s*(\\n|'|"))+$1+iU + + +################################################################################## # -s/(]+)resizable=['"]?(no|0|false)['"]?(.*>)/$1resizable=1$3/igU -s/(]+)location=['"]?(no|0)['"]?(.*>)/$1location=1$3/igU -s/(]+)status=['"]?(no|0)['"]?(.*>)/$1status=1$3/igU -s/(]+)scrolling=['"]?(no|0|auto)['"]?(.*>)/$1scrolling=no$3/igU -s/(]+)menubar=['"]?(no|0)['"]?(.*>)/$1menubar=1$3/igU +# all-popups: Kill all popups in JavaScript and HTML. +# +################################################################################# +FILTER: all-popups Kill all popups in JavaScript and HTML. -# The tag was a crime! 
+s@((\W\s*)(?:map)?(window|this|parent)\.?)open\s*\\?\(@$1concat(@ig # JavaScript +#s/\starget\s*=\s*(['"]?)_?(blank|new)\1?/ notarget/ig # HTML +s/\starget\s*=\s*(['"]?)_?(blank|new)\1?/ /ig # (X)HTML + +################################################################################## # -s*|**ig +# img-reorder: Reorder attributes in tags to make the banners-by-* filters more effective. +# +################################################################################# +FILTER: img-reorder Reorder attributes in tags to make the banners-by-* filters more effective. -# Is this evil? +# In the first step src is moved to the start, then width is moved to the second +# place to guarantee an order of src, width, height. Also does some white-space +# normalization. # -#s/margin(height|width)=[0-9]*//gi -#s/noresize/yesresize/igU +# This makes banners-by-size more effective and allows both banners-by-size +# and banners-by-link to preserve the original image URL in the title attribute. + +s|]*)\ssrc\s*=\s*(['"])([^>\\\2]+)\2|]*)\ssrc\s*=\s*([^'">\\\s]+)|]+height)\s*=\s*|$1=|sig +s|\\\\2]*\2\|[^'">\\\s]+?))([^>]*)\s+width\s*=\s*((["']?)\d+?\5)(?=[\s>])|\\\1\s]+)\1)?[^>]*?(width=(['"]?)88\4)[^>]*?(height=(['"]?)31\6)[^>]*?(?=/?>)@\ + \\\1\s]+)\1)?[^>]*?(width=(['"]?)120\4)[^>]*?(height=(['"]?)(?:600?|90|240)\6)[^>]*?(?=/?>)@\ + \\\1\s]+)\1)?[^>]*?(width=(['"]?)125\4)[^>]*?(height=(['"]?)125\6)[^>]*?(?=/?>)@\ + \\\1\s]+)\1)?[^>]*?(width=(['"]?)160\4)[^>]*?(height=(['"]?)600\6)[^>]*?(?=/?>)@\ + \\\1\s]+)\1)?[^>]*?(width=(['"]?)180\4)[^>]*?(height=(['"]?)150\6)[^>]*?(?=/?>)@\ + \\\1\s]+)\1)?[^>]*?(width=(['"]?)(?:234|468)\4)[^>]*?(height=(['"]?)60\6)[^>]*?(?=/?>)@\ + \\\1\s]+)\1)?[^>]*?(width=(['"]?)240\4)[^>]*?(height=(['"]?)400\6)[^>]*?(?=/?>)@\ + \\\1\s]+)\1)?[^>]*?(width=(['"]?)(?:250|300)\4)[^>]*?(height=(['"]?)250\6)[^>]*?(?=/?>)@\ + \\\1\s]+)\1)?[^>]*?(width=(['"]?)336\4)[^>]*?(height=(['"]?)280\6)[^>]*?(?=/?>)@\ + )|$1"Not Your Business!"$2|Usg 
+#s@\\\1\s]+)\1)?[^>]*?(width=(['"]?)200\4)[^>]*?(height=(['"]?)50\6)[^>]*?(?=/?>)@\ +# \1\s]*?(?:\ + adclick # See www.dn.se \ +| advert # see dict.leo.org \ +| atwola\.com/(?:link|redir) # see www.cnn.com \ +| doubleclick\.net/jump/ # redirs for doublecklick.net ads \ +| counter # common \ +| (?\1\s]*)\1[^>]*>\s*\\\3\s]+)\3)?[^>]*((?:width|height)\s*=\s*(['"]?)\d+?\6)[^>]*((?:width|height)\s*=\s*(['"]?)\d+?\8)[^>]*?(?=/?>)\ +@)/$1never$2/iU +s@\1\s]*?(?:ad(?:click|vert)|atwola\.com/(?:link|redir)|doubleclick\.net/jump/|(?\1\s]*)\1[^>]*>\s*\\\3\s]+)\3)?[^>]*?(?=/?>)@]*\s(?:width|height)\s*=\s*['"]?[01](?=\D)[^>]*\s(?:width|height)\s*=\s*['"]?[01](?=\D)[^>]*?>@@siUg + + +################################################################################# +# +# tiny-textforms: Extend those tiny textareas up to 40x80 and kill the hard wrap. # -s|(document\.cookie)([ \t\r\n]*=)|documenZapCooky$2|g +################################################################################# +FILTER: tiny-textforms Extend those tiny textareas up to 40x80 and kill the hard wrap. -# HTML cookies: +s/(]*?)(?:\s*(?:rows|cols)=(['"]?)\d+\2)+/$1 rows=$2\40$2 cols=$2\80$2/ig +s/(]*?)wrap=(['"]?)hard\2/$1/ig + + +################################################################################# +# +# jumping-windows: Prevent windows from resizing and moving themselves. # -s|].*>||iUT +################################################################################# +FILTER: jumping-windows Prevent windows from resizing and moving themselves. +s/(?<=[\W])(?:window|this|self)\.(?:move|resize)(?:to|by)\(/''.concat(/ig -################################################################################## +################################################################################# # -# no-popups: Kill all popups in JS and HTML +# frameset-borders: Give frames a border, make them resizable and scrollable. 
# ################################################################################# -FILTER: no-popups Kill all popups in JS and HTML +FILTER: frameset-borders Give frames a border and make them resizable. + +s/(]*)framespacing=(['"]?)(no|0)\2/$1/igU +s/(]*)frameborder=(['"]?)(no|0)\2/$1/igU +s/(]*)border=(['"]?)(no|0)\2/$1/igU +s/(]*)noresize/$1/igU +s/(]*)frameborder=(['"]?)(no|0)\2/$1/igU +s/(]*)scrolling=(['"]?)(no|0)\2/$1/igU + -s/window\.open\(/1;''\.concat\(/ig # JavaScript -s/target=['"]?_blank['"]?/target_crunched/ig # HTML -s/target=['"]?_new['"]?/target_crunched/ig # HTML ################################################################################# # -# frameset-borders: Give frames a border and make them resizable +# demoronizer: Correct Microsoft's abuse of standardized character sets, which +# leave the browser to (mis)-interpret unknown characters, with +# sometimes bizarre results on non-MS platforms. +# +# credit: ripped from the demoroniser.pl script by: +# John Walker -- January 1998, http://www.fourmilab.ch/webtools/demoroniser # ################################################################################# -FILTER: frameset-borders Give frames a border and make them resizable +FILTER: demoronizer Fix MS's non-standard use of standard charsets. + +s/(&\#[0-2]\d\d)\s/$1; /g +# per Robert Lynch: http://slate.msn.com//?id=2067547, just a guess. +# Must come before x94 below. +s/\xE2\x80\x94/ -- /g +s/\x82/,/g +#s-\x83-f-g +s/\x84/,,/g +s/\x85/.../g +#s/\x88/^/g +#s-\x89- °/°°-g +s/\x8B/~-g +#s-\x99-TM-g +# per Robert Lynch. 
+s/\x9B/>/g # 155 -s/(]+)framespacing=['"]?(no|0)['"]?(.*>)/$1$3/igU -s/(]+)frameborder=['"]?(no|0)['"]?(.*>)/$1$3/igU -s/(]+)border=['"]?(no|0)['"]?(.*>)/$1$3/igU -s/(]+)resizable=['"]?(no|0|false)['"]?(.*>)/$1$3/igU ################################################################################# # -# webbugs: Squish WebBugs (1x1 invisible GIFs used for user tracking) +# shockwave-flash: Kill embedded Shockwave Flash objects. +# Note: Better just block "/.*\.swf$"! # ################################################################################# -FILTER: webbugs Squish WebBugs (1x1 invisible GIFs used for user tracking) +FILTER: shockwave-flash Kill embedded Shockwave Flash objects. -s/]*?(width|height)\s*=\s*['"]?1\D[^>]*?(width|height)\s*=\s*['"]?1(\D[^>]*?)?>//sig +s|]*macromedia.*||sigU +s|]*(application/x-shockwave-flash\|\.swf).*>(.*)?||sigU ################################################################################# # -# no-refresh: Kill automatic refresh tags (for dial-on-demand setups) +# quicktime-kioskmode: Make Quicktime movies saveable. # ################################################################################# -FILTER: no-refresh Kill automatic refresh tags (for dial-on-demand setups) +FILTER: quicktime-kioskmode Make Quicktime movies saveable. -s/]*)['"]?>//iU -s/].*>//iU +s/(]*)kioskmode\s*=\s*(["']?)true\2/$1/ig ################################################################################# @@ -242,79 +420,850 @@ s/].*>/$1@Us +s@(
([^\s]*).*?\.\.\.\s*(\1)@$2@ig
+
+#################################################################################
+#
+# x-httpd-php-to-html: Changes the Content-Type header from
+# x-httpd-php to html. "Content-Type: x-httpd-php"
+# is set by clueless PHP users and causes many
+# browsers to open a download menu instead of
+# rendering the page.
+#
+#################################################################################
+SERVER-HEADER-FILTER: x-httpd-php-to-html Changes the Content-Type header from x-httpd-php to html.
+
+s@^(Content-Type:)\s*application/x-httpd-php@$1 text/html@i
+
+#################################################################################
+#
+# html-to-xml: Changes the Content-Type header from html to xml.
+#
+#################################################################################
+SERVER-HEADER-FILTER: html-to-xml Changes the Content-Type header from html to xml.
+
+s@^(Content-Type:)\s*text/html(;.*)?$@$1 application/xhtml+xml$2@i
+
+#################################################################################
+#
+# xml-to-html: Changes the Content-Type header from xml to html.
+#
+#################################################################################
+SERVER-HEADER-FILTER: xml-to-html Changes the Content-Type header from xml to html.
+
+s@^(Content-Type:)\s*(?:application|text)/(?:xhtml\+)?xml(;.*)?$@$1 text/html$2@i
+
+#################################################################################
+#
+# hide-tor-exit-notation: Remove the Tor exit node notation in Host and Referer headers.
+#
+# Note: If Privoxy and Tor are chained and Privoxy is configured to
+# use socks4a, one can use http://www.example.org.foobar.exit/
+# to access the host www.example.org through Tor exit node foobar.
+#
+# As the HTTP client isn't aware of this notation, it treats the
+# whole string "www.example.org.foobar.exit" as host and uses it
+# for the "Host" and "Referer" headers. From the server's point of
+# view the resulting headers are invalid and can cause problems.
+#
+# An invalid "Referer" header can trigger "hot-linking" protections,
+# an invalid "Host" header will make it impossible for the server to
+# find the right vhost (several domains hosted on the same IP address).
+#
+# This filter removes the "foo.exit" part in those headers
+# to prevent the mentioned problems. Note that it only modifies
+# the HTTP headers, it doesn't make it impossible for the server
+# to detect your Tor exit node based on the IP address the request is
+# coming from.
+#
+#################################################################################
+CLIENT-HEADER-FILTER: hide-tor-exit-notation Removes the Tor exit node notation in Host and Referer headers.
+
+s@^((?:Referer|Host):\s*(?:https?://)?[^/]*)\.[^\./]*?\.exit@$1@i
+
+#################################################################################
+#
+# less-download-windows: Prevents annoying download windows for content types
+# the browser can handle itself.
+#
+#################################################################################
+SERVER-HEADER-FILTER: less-download-windows Prevent annoying download windows for content types the browser can handle itself.
+
+s@^Content-Disposition:.*filename=(["']?).*\.(png|gif|jpe?g|diff?|d?patch|c|h|pl|shar)\1.*$@@i
+s@^(Content-Type:)\s*(?:message/(?:news|rfc822)|text/x-.*|application/x-sh(?:\s|$))\s*@$1 text/plain@i
+
+#################################################################################
+#
+# image-requests: Tags detected image requests as "IMAGE-REQUEST". Whether
+# or not the detection actually works depends on the browser.
+#
+#################################################################################
+CLIENT-HEADER-TAGGER: image-requests Tags detected image requests as "IMAGE-REQUEST".
+
+s@^Accept:\s*image/.*@IMAGE-REQUEST@i
+
+#################################################################################
+#
+# css-requests: Tags detected CSS requests as "CSS-REQUEST". Whether
+# or not the detection actually works depends on the browser.
+#
+#################################################################################
+CLIENT-HEADER-TAGGER: css-requests Tags detected CSS requests as "CSS-REQUEST".
+
+s@^Accept:\s*text/css.*@CSS-REQUEST@i
+
+#################################################################################
+#
+# client-ip-address: Tags the request with the client's IP address.
+#
+#################################################################################
+CLIENT-HEADER-TAGGER: client-ip-address Tags the request with the client's IP address.
+
+s@^\w*\s+.*\s+HTTP/\d\.\d\s*@IP-ADDRESS: $origin@D
+
+#################################################################################
+#
+# http-method: Tags the request with its HTTP method.
+#
+#################################################################################
+CLIENT-HEADER-TAGGER: http-method Tags the request with its HTTP method.
+
+s@^(\w*).*HTTP/\d\.\d\s*$@$1@i
+
+#################################################################################
+#
+# allow-post: Tags POST requests as "ALLOWED-POST".
+#
+#################################################################################
+CLIENT-HEADER-TAGGER: allow-post Tags POST requests as "ALLOWED-POST".
+
+s@^(?:POST)\s+.*\s+HTTP/\d\.\d\s*@ALLOWED-POST@i
+
+#################################################################################
+#
+# complete-url: Tags the request with the whole request URL.
+#
+#################################################################################
+CLIENT-HEADER-TAGGER: complete-url Tags the request with the whole request URL.
+
+s@^\w*\s+(.*)\s+HTTP/\d\.\d\s*$@$1@i
+
+#################################################################################
+#
+# user-agent: Tags the request with the complete User-Agent header.
+#
+#################################################################################
+CLIENT-HEADER-TAGGER: user-agent Tags the request with the complete User-Agent header.
+
+s@^User-Agent:.*@$0@i
+
+#################################################################################
+#
+# content-type: Tags the request with the content type declared by the server.
+#
+#################################################################################
+SERVER-HEADER-TAGGER: content-type Tags the request with the content type declared by the server.
+
+s@^Content-Type:\s*([^;]+).*@$1@i
+
+#################################################################################
+#
+# privoxy-control: The taggers create tags with the content of X-Privoxy-Control
+# headers, the filters remove said headers.
+#
+#################################################################################
+CLIENT-HEADER-TAGGER: privoxy-control Creates tags with the content of X-Privoxy-Control headers.
+
+s@^X-Privoxy-Control:\s*@@i
+
+CLIENT-HEADER-FILTER: privoxy-control Removes X-Privoxy-Control headers.
+
+s@^X-Privoxy-Control:.*@@i
+
+SERVER-HEADER-TAGGER: privoxy-control Creates tags with the content of X-Privoxy-Control headers.
+
+s@^X-Privoxy-Control:\s*@@i
+
+SERVER-HEADER-FILTER: privoxy-control Removes X-Privoxy-Control headers.
+
+s@^X-Privoxy-Control:.*@@i
+
+
+##############################################################################
+#
+# Revisions :
+# $Log: default.filter,v $
+# Revision 1.67 2008/08/06 17:38:06 fabiankeil
+# In banners-by-size, make sure white-space around the height
+# attribute is removed as well and replace two spaces with
+# "\s" so we don't get fooled by tabs. Fixes #2036125.
+#
+# Revision 1.66 2008/08/03 17:27:47 fabiankeil
+# Teach msn filter to catch a few new ad classes.
+#
+# Revision 1.65 2008/07/21 13:43:44 fabiankeil
+# Fix img-reorder regression introduced with my last commit.
+# Some tags were terminated too soon, letting the browser render
+# some of their arguments as text. Oops.
+#
+# Revision 1.64 2008/07/12 15:49:09 fabiankeil
+# - Don't let img-reorder touch width attributes
+# that aren't followed by either whitespace or '>',
+# as those usually indicate onclick nonsense.
+# Problem and solution reported by Glenn Washburn in #2014552.
+# - While at it, don't use more groups than necessary.
+#
+# Revision 1.63 2008/06/27 12:53:41 fabiankeil
+# Make sure the taggers css-requests and image-requests
+# only match at the beginning of the header.
+#
+# Revision 1.62 2008/06/21 17:02:03 fabiankeil
+# Fix typo.
+#
+# Revision 1.61 2008/05/21 18:44:43 fabiankeil
+# - Let the content-type tagger ignore headers without value.
+# - Remove a few unused lines at the end of the file.
+#
+# Revision 1.60 2008/04/26 10:36:41 fabiankeil
+# Let the msn filter hide another class.
+#
+# Revision 1.59 2008/04/23 16:18:18 fabiankeil
+# s@declarded@declared@
+#
+# Revision 1.58 2008/02/02 15:27:19 fabiankeil
+# Yet another yahoo update to get the width limitation removal working again.
+#
+# Revision 1.57 2008/01/26 15:45:39 fabiankeil
+# Don't let the less-download-windows filter mess up
+# "Content-Type: application/x-shockwave-flash" headers.
+#
+# Revision 1.56 2008/01/25 19:12:40 fabiankeil
+# - Add yet another new yahoo ad id.
+# - Don't let the first banners-by-link job punish URLs for merely
+# containing the pattern "/jump/" when it should really look for
+# "doubleclick\.net/jump/".
+#
+# Revision 1.55 2007/12/31 19:53:59 fabiankeil
+# Let the msn filter remove the width limitation again.
+#
+# Revision 1.54 2007/12/31 19:11:31 fabiankeil
+# - Let the yahoo filter remove the width limitation again.
+# - Teach the blogspot filter to remove useless feed comment
+# titles that only contain the beginning of the actual comment.
+#
+# Revision 1.53 2007/12/23 15:48:12 fabiankeil
+# - Lo and behold, the CSS fix for the MSN buttons is no longer necessary.
+# - Add some new selectors the msn filter should hide.
+# - Add the two yahoo selectors Lee reported in #1856574.
+# - Add comments that the width limitation fixes stopped
+# working for the msn and yahoo filter.
+#
+# Revision 1.52 2007/11/27 18:35:48 fabiankeil
+# Update CSS for the yahoo filter.
+#
+# Revision 1.51 2007/11/04 16:15:11 fabiankeil
+# - Add client-header taggers: client-ip-address,
+# http-method, allow-post, complete-url and user-agent.
+# - Add server-header tagger: content-type.
+#
+# Revision 1.50 2007/11/03 15:05:30 fabiankeil
+# Consistently use an empty line between the description and the PCRS code
+# and end descriptions with dots. Patch submitted by Simon Ruderich.
+#
+# Revision 1.49 2007/11/03 14:29:41 fabiankeil
+# Spelling fixes mostly submitted by Simon Ruderich.
+#
+# Revision 1.48 2007/10/17 18:11:32 fabiankeil
+# Add privoxy-control header filters and taggers.
+#
+# Revision 1.47 2007/10/06 15:45:25 fabiankeil
+# Let msn hide sponsored links in #at divs.
+#
+# Revision 1.46 2007/10/06 09:54:13 fabiankeil
+# - Let msn hide sponsored links in #ar divs.
+# - Teach banners-by-link not to block the graphs for sf's tracker statistics.
+#
+# Revision 1.45 2007/08/11 16:54:12 fabiankeil
+# - Complete the changes from r1.42.
+# - Make crude-parental less sensitive to the amount of white-space,
+# add the note that it doesn't work too well again and replace the
+# DMOZ link with a less confusing explanation.
+#
+# Revision 1.44 2007/07/18 11:06:56 hal9
+# Replace notarget with '' in all popups filter to keep from breaking XHTML per
+# report from Siegfried Gipp.
+#
+# Revision 1.43 2007/06/01 14:17:04 fabiankeil
+# Mention possible delimiter conflicts with variables in dynamic pcrs commands.
+#
+# Revision 1.42 2007/05/17 15:55:36 fabiankeil
+# Undo an improperly tested last-minute change
+# and turn "text-requests" back into "css-requests".
+#
+# Revision 1.41 2007/05/17 15:45:41 fabiankeil
+# - Mention new filter types and the 'D' option.
+# - Header filters are now case-insensitive and accept a
+# varying amount of whitespace after the colon.
+# - Add another selector for yahoo ads.
+# - New server-header filter: less-download-windows
+# - New client-header taggers: text-requests and image-requests.
+#
+# Revision 1.40 2007/03/20 15:40:00 fabiankeil
+# Adjust to new world order with dedicated header-filter actions.
+#
+# Revision 1.39 2007/02/21 14:10:23 fabiankeil
+# - Fix a js-annoyances pcrs command that broke
+# evaluated code. (BR #1124071, thanks to Bor Gergely)
+# - Have unsolicited-popups and all-popups catch the
+# wheather.com popup reported in AF #1640173.
+#
+# Revision 1.38 2007/02/19 11:22:48 hal9
+# Adding back the original filter content to offset problems found by Fabian.
+#
+# Revision 1.37 2007/02/17 13:29:44 hal9
+# Updates to the crude parental filter per Feature Requests item #1648657.
+#
+# Revision 1.36 2007/02/05 16:47:31 fabiankeil
+# - Let banners-by-link look for "advert".
+# - Fix XML syntax problems with banners-by-link
+# and banners-by-size (AF#1651570).
+#
+# Revision 1.35 2006/12/21 12:28:12 fabiankeil
+# Escaping special characters in filter descriptions is no
+# longer necessary, it's done by Privoxy now.
+#
+# Revision 1.34 2006/12/12 17:32:23 fabiankeil
+# Added id mbEnd to google filter, it's now and then
+# used for the sponsored links.
+#
+# Have js-annoyances try to prevent status bar
+# modifications where the status bar text is
+# inside another variable. Fixes 1605710.
+#
+# Revision 1.33 2006/11/16 17:10:43 fabiankeil
+# Removed webbugs debugging comment again.
+# The apostrophe could break JavaScript and
+# the comment itself could mess up existing
+# comments.
+#
+# Revision 1.32 2006/11/10 18:04:04 fabiankeil
+# Have no-ping print the ping warning in red.
+#
+# Modified yahoo to keep in sync with recent
+# CSS changes and to suppress a useless horizontal
+# scrollbar.
+#
+# msn now makes sure that the continue-link boxes
+# act as links (the original CSS just changes the cursor).
+#
+# Changed fun filter regex to leave microsoft links alone.
+# Fixes BR 1019996.
+#
+# Revision 1.31 2006/10/21 13:12:28 fabiankeil
+# Added no-ping and hide-tor-exit-notation.
+#
+# Adjusted jumping-windows to break less.
+# Fixes BR 1146134.
+#
+# Revision 1.30 2006/10/18 12:36:50 fabiankeil
+# google filter now cleans Google groups as well.
+#
+# Revision 1.29 2006/10/11 14:03:17 fabiankeil
+# Changed img-reorder regex to only move width
+# attributes if they are following at least one
+# whitespace. Fixes BR 1328455.
+#
+# Revision 1.28 2006/10/11 13:31:13 fabiankeil
+# Added Anduin Withers' js-annoyances fix
+# for not messing up escaped quotes. Fixes BR 999765.
+#
+# Improved blogspot filter to make it less likely that
+# the blogspot banner at the top of the page is missed.
+#
+# Revision 1.27 2006/10/08 17:00:51 fabiankeil
+# Modified webbugs filter to create a comment around the offending
+# image instead of removing it entirely.
+#
+# Adjusted regex to only match if there's at least one whitespace
+# before the width and height attributes. Makes it more likely that
+# they are indeed attributes, and not part of the value of another attribute.
+# Solves BR 1035587.
+#
+# Thanks to Martin Thomas for diagnosing the cause of the problem.
+#
+# Revision 1.26 2006/10/06 18:06:16 fabiankeil
+# Added header filter x-httpd-php-to-html
+# and reverted another img-reorder whitespace
+# problem.
+#
+# Revision 1.25 2006/10/06 15:26:09 fabiankeil
+# Bumped copyright year.
+# +# Reverted parts of the last img-reorder change +# which were intended to remove superfluous whitespace +# but had the side effect of messing up some tags. +# +# Modified banners-by-size and banners-by-link to +# use border value "0" instead of "\0". Fixes BR 1100065. +# +# Revision 1.24 2006/10/06 11:25:31 fabiankeil +# Taught img-reorder not to break img tags +# with empty src attributes. Fixes BR 1089474. +# Thanks to Raphael Moll for reporting. +# +# Revision 1.23 2006/10/05 14:46:28 fabiankeil +# Replaced "<" in img-reorder's description with "&lt;". +# +# Modified msn filter to tag ads with classes instead +# of ids. There may be more than one ad per page, +# but ids are required to be unique. +# +# Revision 1.22 2006/10/04 19:17:14 fabiankeil +# Incorporated Frédéric Crozat's ie-exploits +# modification to make it less trigger-happy. +# +# Modified blogspot filter to make .post-body +# scrollable if necessary. +# +# Revision 1.21 2006/10/02 16:21:14 fabiankeil +# Adjusted yahoo filter to hide .yschspns as well. +# Added header filters: html-to-xml and xml-to-html. +# +# Revision 1.20 2006/10/01 21:00:22 fabiankeil +# New site-specific filters: google, yahoo, msn and blogspot. +# +# Revision 1.19 2006/07/18 14:48:45 david__schmidt +# Reorganizing the repository: swapping out what was HEAD (the old 3.1 branch) +# with what was really the latest development (the v_3_0_branch branch) +# +# Revision 1.11.2.23 2004/02/17 13:34:01 oes +# - Beefed up the protection of the unsolicited-popups +# filter against matching in JavaScript string constants.
+# - Extended the fun filter with a German joke +# - Extended the site-specifics filter with a convenience +# replacement for managing mailing lists at SourceForge +# +# Revision 1.11.2.22 2004/01/30 15:29:29 oes +# Updated the copyright note +# +# Revision 1.11.2.21 2004/01/20 15:15:01 oes +# Detail enhancement in all-popups +# +# Revision 1.11.2.20 2004/01/06 16:46:14 oes +# Fixed a JS syntax problem in jumping-windows +# +# Revision 1.11.2.19 2003/12/17 17:09:25 oes +# Added remedy against IE address bar spoofing +# +# Revision 1.11.2.18 2003/12/02 11:25:27 oes +# Fixed a line trashed in previous commit +# +# Revision 1.11.2.17 2003/12/01 21:58:46 oes +# Assorted tuning: +# +# - unsolicited-popups no longer matches at start or end of quoted +# strings, and is now activated earlier and deactivated later in +# the page. +# - replacement images in banners-by-* now without border +# - more effective shockwave flash flattening +# - Custom annoyance filtering for Yahoo Groups, Monster.com, NY Times. +# +# Revision 1.11.2.16 2003/05/08 09:44:56 oes +# Allow extra parameters in blink,marquee tags.
Fixes bug #734012 +# +# Revision 1.11.2.15 2003/03/30 13:57:08 oes +# Making unsolicited-popups safe for use on tags enclosed in JS strings +# +# Revision 1.11.2.14 2003/03/19 13:17:50 oes +# - Added filter "site-specifics" to address site specific problems +# - Fixed a small problem in the img-reorder filter +# +# Revision 1.11.2.13 2003/03/18 19:28:59 oes +# Fixed a minor problem in the img-reorder filter +# +# Revision 1.11.2.12 2003/03/15 14:06:58 oes +# - Assorted refinements, optimizations and fixes in the js-annoyances, +# img-reorder, banners-by-size, banners-by-link, webbugs, refresh-tags, +# html-annoyances, content-cookies and fun filters +# - Replaced filter "popups" by choice between two modes: +# - "unsolicited-popups" tries to catch only the unsolicited ones +# - "all-popups" tries to kill them all (as before) +# - New filter "tiny-textforms" Help those tiny or hard-wrap textareas. +# - New filter "jumping-windows" that prevents windows from resizing +# and moving themselves +# - Replaced "nimda" with more general "ie-exploits" filter in which +# all filters for exploits shall be collected +# +# Revision 1.11.2.11 2002/11/12 16:14:43 oes +# Exchanged js-annoyance filter against status bar rewrites with improved version by Don Libes +# +# Revision 1.11.2.10 2002/11/11 13:39:47 oes +# Make refresh-tags filter work even on incorrect refresh tags like found on usatoday.com +# +# Revision 1.11.2.9 2002/11/08 16:39:17 oes +# Made img-reorder more cautious. Fixes bug #632715 +# +# Revision 1.11.2.8 2002/10/13 21:56:52 hal9 +# Adding demoronizer filter. This should include all the common abuses. I have +# left a few of the rare cases commented out (never found these in the wild). +# +# Revision 1.11.2.7 2002/09/25 15:09:39 oes +# Preserve original quoting style in tags wherever possible. 
Fixes Bug #605956 +# +# Revision 1.11.2.6 2002/08/23 14:12:26 oes +# Proofed frameset-borders against "fremaborder=0 border=0" +# +# Revision 1.11.2.5 2002/08/22 15:05:20 oes +# Added Filter to make Quicktime movies saveable (thanks to aaron@linville.org for the idea) +# +# Revision 1.11.2.4 2002/08/10 11:32:29 oes +# Attribute values in replacement tags of banners-by-size filter now undelimited. (Fixes bug #592493) +# +# Revision 1.11.2.3 2002/08/05 11:43:56 oes +# Fixed a bug in the popups filter that was introduced with the last fix :-( +# +# Revision 1.11.2.2 2002/08/01 11:20:13 oes +# Fixed bugs 587802, 577802 and an unreported one +# +# Revision 1.11.2.1 2002/07/26 15:18:26 oes +# - All filters reviewed and many shortcomings fixed +# - New filters: img-reorder, banners-by-link and js-events +# - Jobs reordered because they are now executed in order of +# appearance +# +# Revision 1.11 2002/05/24 00:57:18 oes +# Made WebBugs job ungreedy; Fixes bug 559190 +# +# Revision 1.10 2002/04/18 10:14:19 oes +# renamed some filters +# +# Revision 1.9 2002/04/11 07:36:35 oes +# Generalized js-popup filter +# +# Revision 1.8 2002/04/10 17:07:21 oes +# Fixed potentially destructive jobs, added noflash filter +# +# Revision 1.7 2002/04/09 18:34:51 oes +# Fixed HTML syntax in replacements +# +# Revision 1.6 2002/04/03 19:49:52 swa +# name change +# +# Revision 1.5 2002/03/27 15:30:26 swa +# have a consistent appearance +# +# Revision 1.4 2002/03/26 22:29:54 swa +# we have a new homepage! +# +# Revision 1.3 2002/03/24 16:08:03 jongfoster +# Fixing banners-by-size for new config URLs +# +# Revision 1.2 2002/03/24 13:02:18 swa +# name change related issues.
+# +# Revision 1.1 2002/03/24 11:37:39 jongfoster +# Name change +# +# Revision 1.24 2002/03/16 20:39:54 oes +# - Added descriptions to the filters so users will know what they select in the cgi editor +# - Added content-cookies filter +# - Bugfixed many jobs (Thanks to Al for some hints) +# +# Revision 1.22 2002/03/12 13:42:50 oes +# Fixing & Optimizing REs +# +# Revision 1.21 2002/03/12 11:59:20 oes +# Beefed up Buzzword Bingo +# +# Revision 1.20 2002/03/12 01:42:50 oes +# Introduced modular filters +# +# Revision 1.19 2002/03/10 19:49:24 oes +# Added expression to kill referer tracking in JavaScripts +# +# Revision 1.18 2002/03/08 17:14:12 oes +# PNG -> image in comments +# +# Revision 1.17 2002/03/07 03:50:54 oes +# Adapted comments to new built-in images +# +# Revision 1.16 2002/02/21 00:12:19 jongfoster +# Modifying the banner regexps to use long URLS and to autodetect +# whether to show a logo or a transparent GIF, based on actionsfile +# setting. +# +# Revision 1.15 2001/12/28 23:54:20 steudten +# Fix for feature Req #495374: http-equiv problem +# +# Revision 1.14 2001/12/09 18:55:11 david__schmidt +# Updated CODE_STATUS to beta, commented out microsuck line in re_filterfile +# for 2.9.10 beta +# +# Revision 1.13 2001/10/13 13:11:20 joergs +# Fixed WebBug filter. +# +# Revision 1.12 2001/10/07 15:46:42 oes +# Followed Guy's proposal to change the document.cookie job +# +# Revision 1.11 2001/09/21 12:34:00 joergs +# Added filter to replace "Nimda" code by a warning. +# +# Revision 1.10 2001/07/20 11:04:26 oes +# Added Rodney's javascript cookie filter +# +# Revision 1.9 2001/07/13 14:03:48 oes +# Eliminated yet another bug in the banner-by-size jobs. Shame on me! +# +# Revision 1.8 2001/06/29 13:34:00 oes +# - Added explanation for U and T options +# - Added hint on image replacement by CGI call +# - Fixed bug in banner-by-size jobs +# +# Revision 1.7 2001/06/19 14:21:56 oes +# Fixed microsuck line +# +# Revision 1.6 2001/06/09 14:01:57 swa +# header.
cosmetics. default: no messing ala microsuck. +#