The Unofficial Homeland Security Filter
CastleCops -> Mailwasher - Troubleshooting / General

Author: dallas7 Posted: Wed Oct 29, 2003 7:25 pm    Post subject: The Unofficial Homeland Security Filter

My previous anti-spam tool was NovaSoft's SpamKiller, which had a screen where country codes could be selected for filtering. After checking off all countries except the USA, I was impressed by how much spam was filtered out.

Not having a similar option in MailWasher, I created the following filter based on Gary Partain's model:

[enabled],"Foreign Routing","Foreign Routing",33023,OR,Delete,EntireHeader,containsRE,(?m)^Received:.+ (\.af|\.al|\.dz|\.as|\.ad|\.ao|\.ai|\.ag|\.ar|\.am|\.aw|\.au|\.at|\.az|\.bs| \.bh|\.bd|\.bb|\.by|\.be|\.bz|\.bj|\.bm|\.bt|\.bo|\.ba|\.bw|\.bv|\.br|\.io| \.bn|\.bg|\.bf|\.bi|\.kh|\.cm|\.ca|\.cv|\.ky|\.cf|\.td|\.cl|\.cn|\.cx|\.cc|\.co| \.km|\.cg|\.cd|\.ck\.cr|\.ci|\.hr|\.cu|\.cy|\.cz|\.dk|\.dj|\.dm|\.do|\.tp| \.ec|\.eg|\.sv|\.gq|\.er|\.ee|\.et|\.fk|\.fo|\.fj|\.fi|\.fr|\.fx|\.gf|\.pf|\.tf| \.ga|\.gm|\.ge|\.de|\.gh|\.gi|\.gr|\.gl|\.gd|\.gp|\.gu|\.gt|\.gg|\.gn|\.gw| \.gy|\.ht|\.hm|\.va|\.hn|\.hk|\.hu\.is|\.in|\.id|\.int|\.ir|\.iq|\.im|\.il|\.it| \.jm|\.jp|\.je|\.jo|\.kz|\.ke|\.ki|\.kp|\.kr|\.kw|\.kg|\.la|\.lv|\.lb|\.ls|\.lr| \.ly|\.li|\.lt|\.lu|\.mo|\.mk|\.mg|\.mw|\.my|\.mv|\.ml|\.mt|\.mh|\.mq| \.mr|\.mu|\.yt|\.mx|\.fm|\.md|\.mc|\.mn|\.ms|\.ma|\.mz|\.mm|\.na| \.nr|\.np|\.nl\.an|\.nc|\.nz|\.ni|\.ne|\.ng|\.nu|\.nf|\.mp|\.no|\.om|\.pk| \.pw|\.pa|\.pg|\.py|\.pe|\.ph|\.pn|\.pl|\.pt|\.pr|\.qa|\.re|\.ro|\.ru|\.rw| \.kn|\.lc|\.vc|\.ws|\.sm|\.st|\.sa|\.sn|\.sc|\.sl|\.sg|\.sk|\.si|\.sb|\.so|\.za| \.gs|\.es|\.lk|\.sh|\.pm|\.sd|\.sr|\.sj\.sz|\.se|\.ch|\.sy|\.tw|\.tj|\.tz|\.th| \.tg|\.tk|\.to|\.tt|\.tn|\.tr|\.tm|\.tc|\.tv|\.ug|\.ua|\.ae|\.gb|\.uk|\.uy|\.uz| \.vu|\.ve|\.vn|\.vg|\.wf|\.eh|\.ye|\yu|\.zm|\.zw)[^\.\w]

OCT 31: SPACES WERE ADDED TO ALLOW WORD WRAP WITHIN THIS FORUM. USE THE FILTER(S) BELOW AS MINE WAS CORRECTED AND OTHERS HAVE BEEN SUGGESTED AS ALTERNATIVES.

This one filters everything except the USA and the Antarctic. I'd hate to miss the first spam routed through the Antarctic!

You can find a country list in http://www.w5hq.com/MailWasher/MailWasherFilters.txt
and remove any countries you need unfiltered.

This filter, along with a few good DNS blacklist servers in MWP, catches over 95% of the 150-200 spams I get every day.


Last edited by dallas7 on Fri Oct 31, 2003 7:08 pm, edited 2 times in total

Author: Jarron Posted: Wed Oct 29, 2003 9:25 pm    Post subject:

Dallas,

I had just downloaded the country codes from Gary's filters and was fixin' to work with them when I happened to find your post. Talk about timing, you saved me a lot of work! Thanks!

Working with your filter, I found four places where you seem to have left out the "|": between hu\is, ck\cr, nl\an, and sj\sz.

Here are the corrections I made (I hope they're corrections, I don't quite trust myself working with filters).

[enabled],"Foreign Routing","Foreign Routing",33023,OR,Delete,EntireHeader,containsRE,(?m)^Received:.+
(\.af|\.al|\.dz|\.as|\.ad|\.ao|\.ai|\.ag|\.ar|\.am|\.aw|\.au|\.at|\.az|\.bs|
\.bh|\.bd|\.bb|\.by|\.be|\.bz|\.bj|\.bm|\.bt|\.bo|\.ba|\.bw|\.bv|\.br|\.io|
\.bn|\.bg|\.bf|\.bi|\.kh|\.cm|\.ca|\.cv|\.ky|\.cf|\.td|\.cl|\.cn|\.cx|\.cc|\.co|
\.km|\.cg|\.cd|\.ck|\.cr|\.ci|\.hr|\.cu|\.cy|\.cz|\.dk|\.dj|\.dm|\.do|\.tp|
\.ec|\.eg|\.sv|\.gq|\.er|\.ee|\.et|\.fk|\.fo|\.fj|\.fi|\.fr|\.fx|\.gf|\.pf|\.tf|
\.ga|\.gm|\.ge|\.de|\.gh|\.gi|\.gr|\.gl|\.gd|\.gp|\.gu|\.gt|\.gg|\.gn|\.gw|
\.gy|\.ht|\.hm|\.va|\.hn|\.hk|\.hu|\.is|\.in|\.id|\.int|\.ir|\.iq|\.im|\.il|\.it|
\.jm|\.jp|\.je|\.jo|\.kz|\.ke|\.ki|\.kp|\.kr|\.kw|\.kg|\.la|\.lv|\.lb|\.ls|\.lr|
\.ly|\.li|\.lt|\.lu|\.mo|\.mk|\.mg|\.mw|\.my|\.mv|\.ml|\.mt|\.mh|\.mq|
\.mr|\.mu|\.yt|\.mx|\.fm|\.md|\.mc|\.mn|\.ms|\.ma|\.mz|\.mm|\.na|
\.nr|\.np|\.nl|\.an|\.nc|\.nz|\.ni|\.ne|\.ng|\.nu|\.nf|\.mp|\.no|\.om|\.pk|
\.pw|\.pa|\.pg|\.py|\.pe|\.ph|\.pn|\.pl|\.pt|\.pr|\.qa|\.re|\.ro|\.ru|\.rw|
\.kn|\.lc|\.vc|\.ws|\.sm|\.st|\.sa|\.sn|\.sc|\.sl|\.sg|\.sk|\.si|\.sb|\.so|\.za|
\.gs|\.es|\.lk|\.sh|\.pm|\.sd|\.sr|\.sj|\.sz|\.se|\.ch|\.sy|\.tw|\.tj|\.tz|\.th|
\.tg|\.tk|\.to|\.tt|\.tn|\.tr|\.tm|\.tc|\.tv|\.ug|\.ua|\.ae|\.gb|\.uk|\.uy|\.uz|
\.vu|\.ve|\.vn|\.vg|\.wf|\.eh|\.ye|\.yu|\.zm|\.zw)[^\.\w]

Thanks again,
Jarron

Edited for forum formatting purposes. Remove CR's before pasting into filter list (text should be one line). Not responsible for false positives or other problems; if you lose out on $1,000,000.00 because you didn't get your letter from Nigeria it's not my fault.
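
A quick way to undo the forum line wrapping and to check a pasted filter for a missing "|" is a short script. This is only a sketch in Python, not something MailWasher provides; the sample text and the helper names `join_filter` and `missing_separators` are made up for illustration. It should be applied only to the parenthesized country-code part of the filter, since the rest of the filter line contains spaces that matter.

```python
import re

# Hypothetical excerpt of a wrapped filter as copied from the forum:
# line breaks and a stray space included, and one "|" deliberately missing.
wrapped = r"""(\.af|\.al|\.dz|
 \.ck\.cr|\.ci|
\.zm|\.zw)[^\.\w]"""

def join_filter(text):
    """Remove the line breaks and spaces that were added for forum word wrap."""
    return re.sub(r"\s+", "", text)

def missing_separators(pattern):
    """Return the spots where two escaped codes touch with no "|" between them."""
    return re.findall(r"\\\.[a-z]{2,3}\\\.[a-z]{2,3}", pattern)

one_line = join_filter(wrapped)
print(one_line)
print(missing_separators(one_line))  # lists \.ck\.cr as the suspicious spot
```

An empty list from `missing_separators` suggests that every code is separated by a "|"; it does not prove the filter is otherwise correct.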


Last edited by Jarron on Fri Oct 31, 2003 7:17 am, edited 4 times in total

Author: dallas7 Posted: Wed Oct 29, 2003 11:41 pm    Post subject:

It worked. I was wondering how I could get this filter error corrected and I thought, "Post it in the forum!" ;-)

Really, thanks for catching those. Looks like you got all of 'em. Must've happened those four times my eyeballs popped out when I was working on it.

I hope you find it as effective as I do.

Author: TalonTSi Location: Canada Posted: Thu Oct 30, 2003 12:11 am    Post subject:

Instead of a huge filter listing all the country codes you want to exclude, could you not add a blacklist entry like *@*.?? and then write a filter to override the blacklist for the countries you'd like to receive email from (e.g. *@*.us or whatever)?
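
The same "block every two-letter code, then allow a few" idea can also be pictured as a single regex on the Received line. The sketch below is in Python and assumes an engine with negative lookahead; whether MailWasher's engine supports that is not confirmed here. The `ALLOWED` tuple and the function name `is_foreign_routed` are illustrative only.

```python
import re

# Codes to let through; "us" and "aq" mirror the exceptions of the original filter.
ALLOWED = ("us", "aq")

# Match a dot, then any two lowercase letters that are NOT an allowed code,
# then a character that cannot continue a host name.
pattern = re.compile(
    r"(?m)^Received: from.+?\.(?!(?:%s)[^.\w-])[a-z]{2}[^.\w-]" % "|".join(ALLOWED)
)

def is_foreign_routed(header):
    """True if a Received line names a host under a non-allowed two-letter code."""
    return pattern.search(header) is not None

print(is_foreign_routed("Received: from mail.example.de (mail.example.de [192.0.2.1])"))
print(is_foreign_routed("Received: from mail.example.us (mail.example.us [192.0.2.1])"))
```

Unlike the long list, this version also matches two-letter endings that are not real country codes, and it does not cover three-letter entries such as .int.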

Author: dallas7 Posted: Thu Oct 30, 2003 1:12 am    Post subject:

Quote:
Instead of a huge filter listing all the country codes you want to exclude, could you not instead...


I have no clue. I know nothing of regular expressions and have no interest in learning about them.

Like I said above, I just expanded on Mr. Partain's filter.

If there's a better way to do this, it'll never pop out of my head!

Author: IP: 66.44.*.* Posted: Thu Oct 30, 2003 1:26 am    Post subject:

I was looking at your regex and I just wanted to make a couple suggestions.

The first quantifier in the expression (the ".+") might be better as a non-greedy one (".+?"). It might also help to specify the "from" that normally follows "Received:" in the line.

Instead of:
^Received:.+(\.af|\.al|\.dz|\.as|\.ad|\.ao|\.ai|\.ag|\.ar|\.am....

change it to:

^Received: from.+?(\.af|\.al|\.dz|\.as|\.ad|\.ao|\.ai|\.ag|\.ar|\.am....
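
To see what the non-greedy change does, here is a small sketch in Python with a made-up header and a two-code list. It assumes the filter engine treats ".+" and ".+?" the way Python's `re` module does. Both forms find a match on this line; the difference is which code is found first, and so how much of the line has to be scanned and backtracked over.

```python
import re

# Hypothetical Received line that mentions two listed country codes.
header = "Received: from mail.example.de (relay.example.fr [192.0.2.7])"
codes = r"(\.de|\.fr)[^.\w]"

# Greedy: ".+" runs to the end of the line and backs up until a code fits.
greedy = re.search(r"(?m)^Received:.+" + codes, header)

# Non-greedy: ".+?" grows one character at a time and stops at the first code.
lazy = re.search(r"(?m)^Received: from.+?" + codes, header)

print(greedy.group(1))  # .fr, the last code on the line
print(lazy.group(1))    # .de, the first code on the line
```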


You might be able to speed it up and shorten the expression if you remove the "\." from before each country code and place a single "\." before the Mega-OR group.

Instead of:

^Received: from.+?(\.af|\.al|\.dz|\.as|\.ad|\.ao|\.ai|\.ag|\.ar|\.am....


change it to:

^Received: from.+?\.(af|al|dz|as|ad|ao|ai|ag|ar|am....

Once that is done, you might be able to shorten it some more by grouping together the codes that share a first character:

Change it to:

^Received: from.+?\.(a[defgilmnorstuwz]|b[abdefghijmnorstvwyz]|c[acdfghiklmnoruvxyz]|d[ejkmoz]|e[ceghrst]|f[ijkmorx]|g[abdefghilmqrstuwy]|h[kmnrtu]|i[dlmnoqrst]|j[emop]|k[eghimnprwyz]|l[abcikrstuvy]|m[acdghklmnopqrstuvwxyz]|n[acefgilopruz]|om|p[aefghklmnrtwy]|qa|r[eouw]|s[abcdeghijklmnortvyz]|t[cdfghjkmnoprtvwz]|u[agkyz]|v[acegnu]|ws|y[etu]|z[amw])[^\.\w]


Finally, the terminator at the end of the regex looks slightly off.

[^\.\w] tells the regex interpreter that any character following the two-letter country code, other than a "." or a word character (letter, digit, or underscore), makes the result true. Depending on the interpreter, the "\" inside the brackets is either redundant or is itself treated as one of the excluded characters.

It is good practice to skip the "\" before the dot when it is enclosed in square brackets "[]".

Change it to read [^.\w]
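
Grouping the codes by first character does not have to be done by hand. The sketch below, in Python, builds the grouped form from a plain list. The short code list and the function name `group_by_first_letter` are hypothetical, and it handles two-letter codes only.

```python
import re
from collections import defaultdict

# A short, made-up subset of the country codes; the real list is much longer.
codes = ["af", "al", "dz", "as", "de", "dk", "dj", "om", "qa"]

def group_by_first_letter(codes):
    """Collapse two-letter codes into alternatives such as a[fls]|d[ejkz]|om|qa."""
    buckets = defaultdict(set)
    for code in codes:
        buckets[code[0]].add(code[1])
    parts = []
    for first in sorted(buckets):
        seconds = "".join(sorted(buckets[first]))
        # A lone code stays literal; several codes share one character class.
        parts.append(first + (seconds if len(seconds) == 1 else "[" + seconds + "]"))
    return "|".join(parts)

grouped = group_by_first_letter(codes)
print(grouped)  # a[fls]|d[ejkz]|om|qa

# One "\." in front of the group replaces the "\." before every code.
rx = re.compile(r"(?m)^Received: from.+?\.(" + grouped + r")[^.\w]")
print(rx.search("Received: from mail.example.dj by mx.example.net") is not None)
```

Keeping the plain list as the source and regenerating the pattern from it also makes it easy to drop a single code later.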

Author: dallas7 Posted: Thu Oct 30, 2003 4:51 am    Post subject:

Shocked

Author: Ikeb Posted: Thu Oct 30, 2003 6:46 am    Post subject:

Homeland Security indeed. The term "isolationist" comes to mind......

BTW Dallas, would it be possible to edit your posts and insert a space every once in a while in your rather longish lines? These forums need some sort of autowrap after 200 or so characters.......

Author: IP: 210.49.*.* Posted: Thu Oct 30, 2003 9:03 am    Post subject:

Dallas, I just emailed you the solution to do exactly the same thing, but 1/10 the size.

You may not have received it though, I'm in Australia. Oh well.

Author: TimeGhost Location: USA Posted: Thu Oct 30, 2003 3:19 pm    Post subject:

I agree with the suggestion to use a non-greedy wildcard.
Also I think the expression should be terminated with \s.
Otherwise, you'll get false positives on servers with dashes and numbers in them:

Received: from mx.af-lala.com

So just replace [^.\w] with \s.

BTW, I appreciate the sarcastic title of the post.
Even if it wasn't intended to be so.

Author: IP: 66.44.*.* Posted: Thu Oct 30, 2003 5:17 pm    Post subject:

TimeGhost,

I had considered that, but was concerned that sometimes there is no space between the end of the domain and a bracket.

This is usually indicative of a malformed line, but if you write the regex to insist on "\s" as a terminator you will miss those malformed Received lines.

Might as well use the filter to catch a few forgeries here and there. What do you think??

Author: IP: 66.44.*.* Posted: Thu Oct 30, 2003 5:23 pm    Post subject:

TimeGhost wrote:
Received: from mx.af-lala.com


oooops....missed that on your post before.


I see exactly what you mean now....

perhaps this would work:

[^.\w-] The \w already takes care of numbers
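
The three terminators discussed so far can be compared side by side. This is a sketch in Python with made-up header lines and a two-code list, assuming the filter engine handles these character classes the way Python's `re` module does.

```python
import re

# Hypothetical Received lines covering the cases raised in this thread.
samples = [
    "Received: from mx.af-lala.com by mx.example.net",               # .com host, "af-" label
    "Received: from mail.example.af by mx.example.net",              # well-formed .af host
    "Received: from mail.example.af[192.0.2.10] by mx.example.net",  # malformed, no space
]

def hits(terminator):
    """For each sample, report whether the filter would fire with this terminator."""
    rx = re.compile(r"(?m)^Received: from.+?\.(af|de)" + terminator)
    return [rx.search(line) is not None for line in samples]

print(hits(r"[^.\w]"))   # [True, True, True]   false positive on af-lala.com
print(hits(r"\s"))       # [False, True, False] misses the malformed line
print(hits(r"[^.\w-]"))  # [False, True, True]  handles all three as intended
```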

Author: dallas7 Posted: Thu Oct 30, 2003 7:45 pm    Post subject:

Quote:
...I just emailed you the solution to do exactly the same thing,
but 1/10 the size. You may not have received it though, I'm in Australia.



Thank you for appealing to my isolationism by sending it directly to me.
I haven't received it as of yet (OCT 30, 1900 UTC). However, the
globalist in me requests you post it here for everyone.


Quote:
Oh well.



Tsk. I expect more determination from one of our buds Down Under!

Quote:
BTW, I appreciate the sarcastic title of the post.
Even if it wasn't intended to be so.



It was absolutely tongue-in-cheek. Had I wanted it to be offensive,
I would have hidden behind a guest login and stated the purpose as
effective at filtering out those damn globalists.

Quote:
Homeland Security indeed. The term "isolationist" comes to mind......



I was wondering how long it would take to see a comment about my
choice of subject wording! Actually, I think it's smack dab in the middle
between "isolationist" and "globalist." And that's all the attention I'm
gonna pay to this.

Quote:
...would it be possible to edit your posts and insert a space every once in a while...



Do you mean a line feed? And I didn't actually notice there was a
formatting issue until I re-read my posting the next day. Preview reveals
nothing amiss. This could have happened because I composed the post in
a text editor (I run a spell checker in it) and then pasted it into the
edit window here, although I do this A LOT in several other forums without
any problems (I've got over 4,000 posts in one of my favorites). I don't
know why my 14th posting here got boned up; maybe it was the rather
long string of filter text. It won't happen again. I hope...

AND NOW BACK TO A DISCUSSION ABOUT THE SPAM FILTER:

I would prefer to leave Gary's filtering structure as is. First, it works.
Second, I didn't notice any performance hits. Third, should one day I
observe I'm filtering tons of really really cool email from Djibouti, I can
simply go in and zap "dj" and revel in the comfort that the rest of the filter will still work.
Recall, please, one of my previous posts: I'm regular
expression illiterate. I am pleased, though, my posting became such an
inspiration to all you great folks who aren't!


Last edited by dallas7 on Fri Oct 31, 2003 7:28 pm, edited 1 time in total

Author: IP: 66.44.*.* Posted: Thu Oct 30, 2003 10:38 pm    Post subject:

Dallas7 wrote:
Do you mean a line feed? And I didn't actually notice there was a
formatting issue until I re-read my posting the next day. Preview reveals nothing amiss.


A space will do.....

The word wrapping algorithm does not know where to wrap without some kind of 'whitespace'.

If you have a long line of text, a space every 50-60 characters will be sufficient to trigger the word wrapping.

Author: dallas7 Posted: Thu Oct 30, 2003 11:51 pm    Post subject:

Noted. Thanks for the heads-up on that.




All times are GMT
