X-Git-Url: http://www.privoxy.org/gitweb/?p=privoxy.git;a=blobdiff_plain;f=default.filter;h=8fa455b8f5176ae2940738a50c6551811af0c5cf;hp=37e7faea50bc94b479bd69b9e2c7213e5c6ff76b;hb=34881c188c29c46b96e41918e0771a3a72873c45;hpb=247be70f6020f2af5cde4a66c89e90f26836a244 diff --git a/default.filter b/default.filter index 37e7faea..8fa455b8 100644 --- a/default.filter +++ b/default.filter @@ -2,11 +2,11 @@ # # File : $Source: /cvsroot/ijbswa/current/default.filter,v $ # -# $Id: default.filter,v 1.50 2007/11/03 15:05:30 fabiankeil Exp $ +# $Id: default.filter,v 1.67 2008/08/06 17:38:06 fabiankeil Exp $ # # Purpose : Rules to process the content of web pages # -# Copyright : Written by and Copyright (C) 2001 - 2007 the +# Copyright : Written by and Copyright (C) 2001-2008 the # Privoxy team. http://www.privoxy.org/ # # We value your feedback. However, to provide you with the best support, @@ -219,14 +219,17 @@ s/\starget\s*=\s*(['"]?)_?(blank|new)\1?/ /ig # (X)HTML FILTER: img-reorder Reorder attributes in tags to make the banners-by-* filters more effective. # In the first step src is moved to the start, then width is moved to the second -# place to guarantee an order of src, width, height. +# place to guarantee an order of src, width, height. Also does some white-space +# normalization. +# # This makes banners-by-size more effective and allows both banners-by-size # and banners-by-link to preserve the original image URL in the title attribute. 
-s|]*) src\s*=\s*(['"])([^>\\\2]+)\2|]*) src\s*=\s*([^'">\\\s]+)|]*)\ssrc\s*=\s*(['"])([^>\\\2]+)\2|]*)\ssrc\s*=\s*([^'">\\\s]+)|]+height)\s*=\s*|$1=|sig -s|\\\\2]*\2\|[^'">\\\s]+?))([^>]*)\s+width\s*=\s*(["']?)(\d+?)\4|\\\\2]*\2\|[^'">\\\s]+?))([^>]*)\s+width\s*=\s*((["']?)\d+?\5)(?=[\s>])|\1\s]*?(?:\ adclick # See www.dn.se \ | advert # see dict.leo.org \ | atwola\.com/(?:link|redir) # see www.cnn.com \ -| /jump/ # redirs for doublecklick.net ads \ +| doubleclick\.net/jump/ # redirs for doublecklick.net ads \ | counter # common \ | (?)/$1ädchen/Ug # Pages are "blocked" based on keyword matching. # ################################################################################# -FILTER: crude-parental Crude parental filtering. Note that this filter doesn't work reliable. +FILTER: crude-parental Crude parental filtering. Note that this filter doesn't work reliably. # (Note: Middlesex, Sussex and Essex are counties in the UK, not rude words) # (Note #2: Is 'sex' a rude word?!) @@ -458,7 +461,7 @@ s+^.*warez.*$+No Warez

You're not sea # Remove by description s/^.*\ -(?:(suck|lick|tounge|rub|fuck|fingering|finger|chicks?)\s*)?\ +(?:(suck|lick|tongue|rub|fuck|fingering|finger|chicks?)\s*)?\ (?:(her|your|my|hard|with|big|wet|tight|pink|hot|moist|young|teen)\s*)+\ (dicks?|penis|cocks?|balls?|tits?|pussy|cunt|clit|ass|mouth).*$\ /This page has been blocked by Privoxy's crude-parental content filter\ @@ -595,9 +598,10 @@ FILTER: yahoo CSS-based block for Yahoo text ads. Also removes a width limitatio s@@\n\n$0@ @@ -611,25 +615,21 @@ FILTER: msn CSS-based block for MSN text ads. Also removes tracking URLs and a w s@@\n$0@ +# Are these ids still in use? s@(]*) id=(["']?)ads_[^\2]*\2@$1 class="msn_ads"@Uig +s@(]*) class=(["']?)sb_ads[^\2]*\2@$1 class="msn_ads"@Uig s@(]*href=\")http://g.msn.com/.*\?(http://.*)(&&DI=.*)(\")@$1$2$4@Ug s@(]*)gping=\".*\"@$1 title="URL cleaned up by Privoxy's msn filter"@Ug -s@
(

Sponsored sites

)@
$1@ -s@(
|(
([^\s]*).*?\.\.\.\s*(\1)@$2@ig ################################################################################# # @@ -727,7 +732,7 @@ s@^((?:Referer|Host):\s*(?:https?://)?[^/]*)\.[^\./]*?\.exit@$1@i SERVER-HEADER-FILTER: less-download-windows Prevent annoying download windows for content types the browser can handle itself. s@^Content-Disposition:.*filename=(["']?).*\.(png|gif|jpe?g|diff?|d?patch|c|h|pl|shar)\1.*$@@i -s@^(Content-Type:)\s*(?:message/(?:news|rfc822)|text/x-.*|application/x-sh)\s*@$1 text/plain@i +s@^(Content-Type:)\s*(?:message/(?:news|rfc822)|text/x-.*|application/x-sh(?:\s|$))\s*@$1 text/plain@i ################################################################################# # @@ -737,7 +742,7 @@ s@^(Content-Type:)\s*(?:message/(?:news|rfc822)|text/x-.*|application/x-sh)\s*@$ ################################################################################# CLIENT-HEADER-TAGGER: image-requests Tags detected image requests as "IMAGE-REQUEST". -s@Accept:\s*image/.*@IMAGE-REQUEST@i +s@^Accept:\s*image/.*@IMAGE-REQUEST@i ################################################################################# # @@ -747,7 +752,7 @@ s@Accept:\s*image/.*@IMAGE-REQUEST@i ################################################################################# CLIENT-HEADER-TAGGER: css-requests Tags detected CSS requests as "CSS-REQUEST". -s@Accept:\s*text/css.*@CSS-REQUEST@i +s@^Accept:\s*text/css.*@CSS-REQUEST@i ################################################################################# # @@ -796,12 +801,12 @@ s@^User-Agent:.*@$0@i ################################################################################# # -# content-type: Tags the request with the content type declarded by the server. +# content-type: Tags the request with the content type declared by the server. # ################################################################################# -SERVER-HEADER-TAGGER: content-type Tags the request with the content type declarded by the server. 
+SERVER-HEADER-TAGGER: content-type Tags the request with the content type declared by the server. -s@^Content-Type:\s*([^;]*).*@$1@i +s@^Content-Type:\s*([^;]+).*@$1@i ################################################################################# # @@ -830,6 +835,79 @@ s@^X-Privoxy-Control:.*@@i # # Revisions : # $Log: default.filter,v $ +# Revision 1.67 2008/08/06 17:38:06 fabiankeil +# In banners-by-size, make sure white-space around the height +# attribute is removed as well and replace two spaces with +# "\s" so we don't get fooled by tabs. Fixes #2036125. +# +# Revision 1.66 2008/08/03 17:27:47 fabiankeil +# Teach msn filter to catch a few new ad classes. +# +# Revision 1.65 2008/07/21 13:43:44 fabiankeil +# Fix img-reorder regression introduced with my last commit. +# Some tags were terminated too soon, letting the browser render +# some of their arguments as text. Oops. +# +# Revision 1.64 2008/07/12 15:49:09 fabiankeil +# - Don't let img-reorder touch width attributes +# that aren't followed by either whitespace or '>', +# as those usually indicate onclick nonsense. +# Problem and solution reported by Glenn Washburn in #2014552. +# - While at it, don't use more groups than necessary. +# +# Revision 1.63 2008/06/27 12:53:41 fabiankeil +# Make sure the taggers css-requests and image-requests +# only match at the beginning of the header. +# +# Revision 1.62 2008/06/21 17:02:03 fabiankeil +# Fix typo. +# +# Revision 1.61 2008/05/21 18:44:43 fabiankeil +# - Let the content-type tagger ignore headers without value. +# - Remove a few unused lines at the end of the file. +# +# Revision 1.60 2008/04/26 10:36:41 fabiankeil +# Let the msn filter hide another class. +# +# Revision 1.59 2008/04/23 16:18:18 fabiankeil +# s@declarded@declared@ +# +# Revision 1.58 2008/02/02 15:27:19 fabiankeil +# Yet another yahoo update to get the width limitation removal working again. 
+# +# Revision 1.57 2008/01/26 15:45:39 fabiankeil +# Don't let the less-download-windows filter mess up +# "Content-Type: application/x-shockwave-flash" headers. +# +# Revision 1.56 2008/01/25 19:12:40 fabiankeil +# - Add yet another new yahoo ad id. +# - Don't let the first banners-by-link job punish URLs for merely +# containing the pattern "/jump/" when it should really look for +# "doubleclick\.net/jump/". +# +# Revision 1.55 2007/12/31 19:53:59 fabiankeil +# Let the msn filter remove the width limitation again. +# +# Revision 1.54 2007/12/31 19:11:31 fabiankeil +# - Let the yahoo filter remove the width limitation again. +# - Teach the blogspot filter to remove useless feed comment +# titles that only contain the beginning of the actual comment. +# +# Revision 1.53 2007/12/23 15:48:12 fabiankeil +# - Lo and behold, the CSS fix for the MSN buttons is no longer necessary. +# - Add some new selectors the msn filter should hide. +# - Add the two yahoo selectors Lee reported in #1856574. +# - Add comments that the width limitation fixes stopped +# working for the msn and yahoo filter. +# +# Revision 1.52 2007/11/27 18:35:48 fabiankeil +# Update CSS for the yahoo filter. +# +# Revision 1.51 2007/11/04 16:15:11 fabiankeil +# - Add client-header taggers: client-ip-address, +# http-method, allow-post, complete-url and user-agent. +# - Add server-header tagger: content-type. +# # Revision 1.50 2007/11/03 15:05:30 fabiankeil # Consistently use an empty line between the description and the PCRS code # and end descriptions with dots. Patch submitted by Simon Ruderich. @@ -1189,7 +1267,3 @@ s@^X-Privoxy-Control:.*@@i # Revision 1.6 2001/06/09 14:01:57 swa # header. cosmetics. default: no messing ala microsuck. # -# -# - -
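Note on the header-related hunks above (revisions 1.57, 1.61 and 1.63): the sketch below replays the old and new patterns against sample headers to show what each change fixes. It is not Privoxy code. Python's re module stands in for PCRS, "$1" is written as "\1", the helper name apply_job is made up for this illustration, and the "X-Accept" header is a hypothetical input chosen only to trigger the unanchored match.

```python
import re

# Patterns copied from the "-" (old) and "+" (new) lines of the diff.
# Assumption: Python's re is close enough to PCRS for these simple jobs.

# Revision 1.63: image-requests tagger, now anchored at the header start.
OLD_IMAGE = re.compile(r"Accept:\s*image/.*", re.I)
NEW_IMAGE = re.compile(r"^Accept:\s*image/.*", re.I)

# Revision 1.57: less-download-windows must not touch x-shockwave-flash.
OLD_PLAIN = re.compile(
    r"^(Content-Type:)\s*(?:message/(?:news|rfc822)|text/x-.*|application/x-sh)\s*",
    re.I,
)
NEW_PLAIN = re.compile(
    r"^(Content-Type:)\s*(?:message/(?:news|rfc822)|text/x-.*|application/x-sh(?:\s|$))\s*",
    re.I,
)

# Revision 1.61: content-type tagger ignores headers without a value.
OLD_CTYPE = re.compile(r"^Content-Type:\s*([^;]*).*", re.I)
NEW_CTYPE = re.compile(r"^Content-Type:\s*([^;]+).*", re.I)


def apply_job(pattern, replacement, header):
    """Hypothetical helper: return the rewritten header, or None if no match."""
    result, count = pattern.subn(replacement, header, count=1)
    return result if count else None


if __name__ == "__main__":
    samples = [
        (OLD_IMAGE, "IMAGE-REQUEST", "X-Accept: image/png"),
        (NEW_IMAGE, "IMAGE-REQUEST", "X-Accept: image/png"),
        (OLD_PLAIN, r"\1 text/plain", "Content-Type: application/x-shockwave-flash"),
        (NEW_PLAIN, r"\1 text/plain", "Content-Type: application/x-shockwave-flash"),
        (OLD_CTYPE, r"\1", "Content-Type:"),
        (NEW_CTYPE, r"\1", "Content-Type:"),
    ]
    for pattern, replacement, header in samples:
        print(repr(header), "->", repr(apply_job(pattern, replacement, header)))
```

In this sketch each old pattern rewrites a header it should leave alone: the unanchored tagger matches inside a longer header name, the unterminated "application/x-sh" alternative eats the start of "x-shockwave-flash", and "[^;]*" produces an empty result for a header without a value. Each new pattern returns no match for the same input.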