CastleCops, Internet Crime Fighters
Filtering for URLs and other text in body

 
Whisperer

Sergeant
Sergeant


Joined: Mar 29, 2003
Posts: 134
Location: USA

PostPosted: Wed Dec 24, 2003 6:55 am    Post subject: Filtering for URLs and other text in body

I'm determined to avoid false positives and have decided to try making a lot of filters that contain links found in the body of spams I get, on the assumption that there are a lot of repeats coming in. Initial testing shows that within a few days this is catching about 20-50% of the spam I get.

I'm setting these as hide-and-delete filters, so I can't leave room for error. I figure that as long as I'm only using the URLs that the spams want me to click through to, or just their main domains where that seems appropriate, I'm safe on that issue.

My question is this. Sometimes I see two URLs in sequence, separated by an asterisk, such as:

Code:

<a href=http://drs.yahoo.com/dogma/admiringbiggerslouchy/*http://
www.morosophunrivalled.goosadsjsfjb.biz/cf329/ target=_blank>

On the assumption that spammers will change things frequently, but also that they're less likely to change the URL they want you to click through to (yes, I realize that's a hypothetical assumption and that they can use redirects to keep us guessing until the cows come home), I'm just wondering which of the two you'd expect to remain as the more permanent one, the first or the second in the sequence.

I also note that usually when I see this, the first one is at a major ISP like Yahoo, while the latter is an unknown domain.

On a separate note, sometimes the body contains nothing but pure gibberish in a long string of characters making up a sizable paragraph. I'm just wondering if it might not be somewhat useful to take the first dozen or two dozen characters and put them into a filter on the chance that the spammers will use the same gibberish at least for a while before changing it.

Thanks.



Last edited by Whisperer on Mon Dec 29, 2003 4:15 am, edited 1 time in total

Guest






PostPosted: Wed Dec 24, 2003 7:35 pm    Post subject: Re: Filtering for URLs and other text in body

Whisperer,

Ikeb posted a filter that is designed to look for such tricks. I don't have the time to do a search for it right now...but perhaps Ikeb will post a link to it.

In the meantime....I have a RegExpr that could help for such a filter.

To prevent it from scrolling, I am posting it in smaller sections. Just cut and paste it together in the order posted. You can add more domains to it as required.

Code:

http://[^"<>]*?[.]?

Code:
(aqmp\.net|asphost\.com|carlz\.us|carriespickspreview\.com|
Code:
cherrypickedofferz\.com|controlz\.us|dkldk\.com|dock1\.com|
Code:
dubnh\.us|ero-roots\.com|ecom-universe\.net|emedorders\.com|faithweb\.com|
Code:
ff545zz\.com|figure7v\.com|ghkp\.us|goandbuyit\.com|
Code:
gono\.us|hostgym\.com|imgehost\.com|kiffergly\.net|
Code:
klinelenderspress\.com|linkcounter\.com|re55steel4\.com|
Code:
remote-cars1\.com|rx359\.net|mdwebdoctor\.com|medsfactory\.com|
Code:
netidcuh\.com|oldcactus\.com|paylesscanadiandrugs\.com|
Code:
prefer\d+f\.com|pills\d+as\.com|rhinoceros\.us|
Code:
sacrosanctraindrop\.net|shopnsavecentral\.com|
Code:
savvypurchaser\.com|seeingnoone\.com|swena\.net|
Code:
tashabo\.com|theholmesgroup\.com|unone\.us|webrxonline\.com|\.biz)

Code:
[ /\?"]


You will note that at the end of the above RegExp...any link to a '.biz' address will trigger the expression. You can remove that, or any other domain name if you would like. You
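
For anyone who wants to try the pasted-together expression before trusting it with hide-and-delete, here is a minimal Python sketch. It is only an approximation: Python's `re` module stands in for MailWasher's RegExpr engine, which may behave differently, the domain list is cut down to a handful of the entries above, and the names `DOMAINS`, `SPAM_LINK` and `is_spam_link` are made up for the illustration.

```python
import re

# A few of the alternatives from the posted list; add the rest the same way.
DOMAINS = [
    r"aqmp\.net",
    r"gono\.us",
    r"prefer\d+f\.com",
    r"pills\d+as\.com",
    r"\.biz",  # catches every .biz link; remove it if that is too broad
]

# The three posted sections pasted together: prefix, alternation, terminator.
SPAM_LINK = re.compile(r'http://[^"<>]*?[.]?(' + "|".join(DOMAINS) + r')[ /\?"]')


def is_spam_link(body: str) -> bool:
    """Return True if the body contains a link to one of the listed domains."""
    return SPAM_LINK.search(body) is not None


# The redirected link from the first post, joined back into one line.
sample = ('<a href=http://drs.yahoo.com/dogma/admiringbiggerslouchy/*http://'
          'www.morosophunrivalled.goosadsjsfjb.biz/cf329/ target=_blank>')
plain = '<a href="http://www.example.com/index.html">'

print(is_spam_link(sample))
print(is_spam_link(plain))
```

The first link is caught only because of the `.biz` alternative, which shows how broad that entry is. Also note that because the `[.]?` in front of the list is optional, each listed name matches as the tail end of a longer name too.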


Guest






PostPosted: Sun Dec 28, 2003 6:37 am    Post subject: Not too shabby

I want to report my results so far with creating hide-and-delete filters based mainly on the URLs in the body but also using some key phrases in the body, Subject, and From fields.

It's been about a week to ten days and I have to say that, while it's been quite a lot of work so far to create -- whoa! -- almost 70 filters with anywhere from one or two to up to the max number of expressions each (I'm not using RegExpr), the work is decreasing dramatically day by day.

In the past few days, out of about fifty spams, I'd say the filter caught as much as 75% of them overall, and in the past 24-48 hours it's caught, well, about 12 out of 18, and then, just now, seven out of seven. And because of the way I'm doing it, I feel the chance of false positives is slim to none.

I'd say I've had to create maybe a dozen new entries in the past two days.

That's starting to become considerably less time than it takes me to visually scan through many dozens of spams before deleting them.

The question is how long these will continue to be valid and how fast new ones will keep appearing.

I'll try to post a follow-up... or, feel free to remind me to do so.

Whisperer

Sergeant
Sergeant


Joined: Mar 29, 2003
Posts: 134
Location: USA

PostPosted: Sun Dec 28, 2003 6:41 am    Post subject:

That double-post, above, was a mistake -- obviously.

And I thought I was logged in but apparently I wasn't.

My bad.

Whisperer

Ikeb

Special Response Team
Forums Admin

Joined: Apr 20, 2003
Posts: 16515

Forums Admin Moderators MVP Premium SRT Team CC Committee Team F@H

PostPosted: Sun Dec 28, 2003 7:34 am    Post subject:

Yes indeed. But you were logged in for your OP. That's the one that's screwing up this page's width. Just edit that post and break up that one long line that insists on posting without wrapping....

The filter for redirected HTTP links you were inquiring about:

Code:
If the Body contains the RegExpr "(?i)<\s*a[\s\w=]+(?s)href=(3D)??"?http://[\d\w\./]+(@|&#0{0,5}64;|\*|&#0{0,5}42;).+>" then hide the message from the messages list, and mark the message as mail to be deleted. This filter takes priority over the friends list.
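
A rough way to try this expression outside MailWasher is sketched below in Python. Several things here are assumptions rather than anything stated in the post: Python's `re` module stands in for the RegExpr engine, the unreadable characters in the expression are taken to be `&#0` (so the entity alternatives match `&#64;` for @ and `&#42;` for *), and the name `has_redirected_link` is invented for the example. The `(?i)` and `(?s)` switches are passed as flags because recent Python versions do not accept such a switch in the middle of a pattern; since the only unescaped dot comes after the `(?s)`, the behaviour should be the same.

```python
import re

# Ikeb's expression with the (?i) and (?s) switches moved into flags.
# Assumption: the garbled characters in the post stand for "&#0".
REDIRECT = re.compile(
    r'<\s*a[\s\w=]+href=(3D)??"?http://[\d\w\./]+'
    r'(@|&#0{0,5}64;|\*|&#0{0,5}42;).+>',
    re.IGNORECASE | re.DOTALL,
)


def has_redirected_link(body: str) -> bool:
    """Return True if an <a href=...> URL is followed by an @ or * marker."""
    return REDIRECT.search(body) is not None


yahoo_style = ('<a href=http://drs.yahoo.com/dogma/admiringbiggerslouchy/*http://'
               'www.morosophunrivalled.goosadsjsfjb.biz/cf329/ target=_blank>')
entity_style = '<A HREF=3D"http://www.example.com&#064;spam.example/">'
ordinary = '<a href="http://www.example.com/index.html">'

for link in (yahoo_style, entity_style, ordinary):
    print(has_redirected_link(link))
```

The first two links contain a redirect marker right after the visible host and path, so they should be flagged; the ordinary link should not.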


BTW, I'm finding that Denn988's Banned Dialup filter strategy is just the ticket for me! The strategy focuses on the header, doesn't involve endless filter additions, and can't be easily bypassed by a SPAMer. It really is brilliant ... as the Eggman stated right after Denn988 posted it but which took me a day or so -- along with some help from Denn988 -- to fully appreciate.

Whisperer

Sergeant
Sergeant


Joined: Mar 29, 2003
Posts: 134
Location: USA

PostPosted: Mon Dec 29, 2003 4:21 am    Post subject:

Ikeb wrote:
Yes indeed. But you were logged in for your OP. That's the one that's screwing up this page's width. Just edit that post and break up that one long line that insists on posting without wrapping....

Gotcha. Done.

Ikeb wrote:
The filter for redirected HTTP links you were inquiring about:
Code:
If the Body contains the RegExpr "(?i)<\s*a[\s\w=]+(?s)href=(3D)??"?http://[\d\w\./]+(@|&#0{0,5}64;|\*|&#0{0,5}42;).+>" then hide the message from the messages list, and mark the message as mail to be deleted. This filter takes priority over the friends list.

Thanks. But how good is it for reliably avoiding false positives?

Ikeb wrote:
BTW, I'm finding that Denn988's Banned Dialup filter strategy is just the ticket for me! The strategy focuses on the header, doesn't involve endless filter additions, and can't be easily bypassed by a SPAMer. It really is brilliant ... as the Eggman stated right after Denn988 posted it but which took me a day or so -- along with some help from Denn988 -- to fully appreciate.

Cool... but, uh, how good is it for reliably avoiding false positives?

Thanks!

Ikeb

Special Response Team
Forums Admin

Joined: Apr 20, 2003
Posts: 16515

Forums Admin Moderators MVP Premium SRT Team CC Committee Team F@H

PostPosted: Mon Dec 29, 2003 5:30 am    Post subject:

Whisperer wrote:
Ikeb wrote:
The filter for redirected HTTP links you were inquiring about:

Thanks. But how good is it for reliably avoiding false positives?

Just within the last week, out of the 183 SPAM messages my filters detected, this filter detected 19 of them. Since I've been using it, I have not found a single false positive.

BTW the "Buried Email address" regex I also included in that same post has detected 30 SPAM messages over the last week. I fully expected that one to get false positives from my emag subscriptions but since I combine that one with another SPAM indicator, I haven't had a single false positive with that one either.

Whisperer wrote:
Ikeb wrote:
BTW, I'm finding that Denn988's Banned Dialup filter strategy is just the ticket for me!

Cool... but, uh, how good is it for reliably avoiding false positives?

Too early to be certain, but so far so good! In the last couple of days the filters I've set up to trigger on suspect "Received: from" field SPAM indicators have caught almost ALL the SPAM detected! (I.e. other filters might well have detected the same message based on other indicators, but the higher priority of the "Received: from" filters means they trigger first if SPAM is detected that way.) In fact the only other filter that I can recall detecting SPAM is the "Redirected HTML" filter.

I have had false positives on a couple I'm still testing and which may yet prove to be dead end ideas but the one posted by Denn988, as well as a couple more I've developed from his, have yet to trigger falsely!

AlphaCentauri

SIRT Handler
Premium Member

Joined: Nov 20, 2003
Posts: 2763

Premium

PostPosted: Mon Dec 29, 2003 5:12 pm    Post subject: autodeleting by url's in spam

I also primarily screen by URL's. As far as the autodelete, there is one very big caveat: If you are using Reg Exp's so you can have lots of URL's in one filter, you may make a mistake and have two dividers next to each other (||) or end a line with a divider. It won't be easy to see because they look like l's. And it won't be easy to notice its effect because it will label everything that isn't on the Friends list as spam, and usually, it is. But it's an easy mistake to make even if you are aware of it, and you may not want to set any reg exps. to autodelete.
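
The doubled-divider mistake can be checked for mechanically. The sketch below is in Python and rests on one assumption: that Python's `re` module treats an empty alternative the same way MailWasher's engine does, which is what the behaviour described above suggests. The function name `matches_everything` is made up for the example. The idea is that an expression with an empty alternative matches the empty string, and anything that matches the empty string will be found in every message.

```python
import re


def matches_everything(expression: str) -> bool:
    """Return True if the expression matches the empty string.

    An expression that matches the empty string is found in every message
    body, which is the symptom of a doubled or trailing divider.
    """
    return re.search(expression, "") is not None


good = r"aqmp\.net|dock1\.com|swena\.net"
doubled = r"aqmp\.net||dock1\.com|swena\.net"   # two dividers in a row
trailing = r"aqmp\.net|dock1\.com|swena\.net|"  # line ends with a divider

for expression in (good, doubled, trailing):
    print(matches_everything(expression))
```

Running a check like this on each line before pasting it into an autodelete filter is cheap insurance, though it only catches this one kind of mistake.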

You are right that there are a lot of repeats. The spam is advertising the sites, and if the sites change names, the nitwits who answer the spam can't send money. So they fake the header, but never the URL, and at least for a few weeks, you get a lot of spam for the same URL.

They do put a lot of nonsense in front of the address. I only screen for the last domain name before the .com, .net, or whatever. The rest doesn't mean anything.
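
As a sketch of that "last domain name" rule, the Python below takes a link, keeps only the final `http://` target (so a redirect prefix is ignored), and returns the last name plus its ending, escaped for use in a filter expression. The function name `last_domain_expression` is invented, and keeping exactly two labels is an assumption that fits .com, .net and .biz but would be wrong for endings like .co.uk.

```python
import re
from urllib.parse import urlsplit


def last_domain_expression(href: str):
    """Return an escaped 'name.tld' expression for the final URL in href."""
    start = href.lower().rfind("http://")
    if start == -1:
        return None
    host = urlsplit(href[start:]).hostname
    if not host:
        return None
    labels = host.split(".")
    # Keep only the last name plus its ending, e.g. goosadsjsfjb.biz.
    return re.escape(".".join(labels[-2:]))


link = ("http://drs.yahoo.com/dogma/admiringbiggerslouchy/*http://"
        "www.morosophunrivalled.goosadsjsfjb.biz/cf329/")
print(last_domain_expression(link))
```

For the redirected link from the first post this yields the `goosadsjsfjb` name with its `.biz` ending and an escaped dot, and drops both the Yahoo prefix and the nonsense subdomains.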

I also have a text file where I keep a copy of my filtered expressions. Basically, I just keep copying and pasting into that file with dividers. Then I copy and paste it into a Mailwasher filter called "Link to Spam Recent" and make it my highest level filter after "fake from me." When I get to the point where WordPad has to start a new line even without text wrap being on, it's time to start a new line in the Mailwasher filter, too. I guess when I get to 10 lines of filters, I'll swap out the first line and see if any of those URL's are still showing up. Anytime a piece of spam is caught by a lower level filter, I open up "show complete header" and harvest the URL for another filtering expression and paste it into my WordPad file and then into Mailwasher.
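
The copy-and-paste bookkeeping could be scripted along these lines. This is a Python sketch under assumptions, not a description of any Mailwasher feature: the function name `build_filter_lines` and the `max_len` limit are hypothetical (the right line length is wherever your editor or the filter dialog stops being manageable), and the entries are plain domain names that get escaped automatically. Blank and repeated entries are skipped, so the output never contains two dividers in a row.

```python
import re


def build_filter_lines(entries, max_len=200):
    """Join escaped entries with '|' into lines no longer than max_len."""
    seen = set()
    lines = []
    current = ""
    for raw in entries:
        entry = raw.strip().lower()
        if not entry or entry in seen:  # skip blanks and repeats
            continue
        seen.add(entry)
        escaped = re.escape(entry)
        candidate = escaped if not current else current + "|" + escaped
        if current and len(candidate) > max_len:
            lines.append(current)  # line is full, start the next one
            current = escaped
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


harvested = ["gono.us", "", "GONO.us", "dock1.com", "swena.net"]
for line in build_filter_lines(harvested, max_len=20):
    print(line)
```

Each printed line can then be pasted into one line of the filter, and the oldest line retired when the list gets too long.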

The ones in base 64 are harder. I go to http://www.opinionatedgeek.com/dotnet/tools/Base64Decode/safedecode.aspx to decode it, and strip out letters four at a time and keep decoding until I get to the sequence I can filter for. I have a separate filter "base 64 href" that has the various permutations of letters that could code "href=http://" since I don't know of anyone who sends html code in base 64. Again, I don't autodelete, but it's been 100% specific so far. You can also strip out the code for the actual url's but since there are several ways to code each sequence of English letters, it's pretty tedious to do for every spammer. I just go for the ones that include an html link. I figure the odds that the same long sequence of letters will show up in a photo or something is pretty remote.
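
The "various permutations" come from the fact that base 64 turns every three bytes into four letters, so the same text encodes differently depending on whether it starts at the first, second or third byte of a group. The Python sketch below works out the three letter sequences for a given piece of text, keeping only the letters that do not depend on the neighbouring bytes. The function name `base64_fragments` is made up, and the sketch assumes the encoded text has had its line breaks removed, since a break in the middle of a sequence would hide it.

```python
import base64


def base64_fragments(needle: bytes):
    """Return the base64 letter sequences for needle at byte offsets 0, 1, 2."""
    fragments = []
    for offset in range(3):
        padded = b"\x00" * offset + needle
        encoded = base64.b64encode(padded).decode("ascii").rstrip("=")
        # Leading letters that mix in bits of the unknown preceding bytes.
        start = (0, 2, 3)[offset]
        # The last letter mixes in bits of the unknown following byte
        # unless the padded text ends exactly on a three-byte boundary.
        end = len(encoded) if len(padded) % 3 == 0 else len(encoded) - 1
        fragments.append(encoded[start:end])
    return fragments


fragments = base64_fragments(b"href=http://")
for fragment in fragments:
    print(fragment)
```

Joining the three results with dividers gives one expression that should find `href=http://` wherever it falls in the encoded body. Variations in the link itself, such as capitals or a quote after the equals sign, would each need their own set of three.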

Another sneaky thing is spammers that don't actually include a link in the message -- they require the recipient to cut and paste the address into an email or browser. Of course, that's not as effective a way of marketing their sites. I screen for body contains "html" AND body contains ">g<|>o<|>e<" (e because it's common, g and o because gono.com is the main offender and they can't break up that name in too many different ways.)

Ikeb

Special Response Team
Forums Admin

Joined: Apr 20, 2003
Posts: 16515

Forums Admin Moderators MVP Premium SRT Team CC Committee Team F@H

PostPosted: Mon Dec 29, 2003 6:12 pm    Post subject:

I've toyed with the SPAMvertized URL strategy myself and built up a list of sites to be detected. After only a week or less, I was beginning to see hits (i.e. repeated SPAMvertizements for the same site). I was using this strategy to "blacklist" any SPAM that "leaked" through my other filters.

However, since Denn988 posted his "Received: from" strategy and I added some of my own filters based on it, I haven't had any further "leaks". Therefore I haven't made use of the URL strategy since Christmas.

But back to your post Alpha, I didn't bother with Base64 encoded messages. I figured the pain wasn't worth it. (Actually I never had one of those "leak" through anyway.) Of course if FireTrust were to add a decoder, that would allow easy extension of SPAMvertized URL detection. But the best thing FireTrust could do to ease implementation of this strategy is to add a SPAMvertized URL blacklist feature.

stan_qaz

Premium Member


Joined: Mar 31, 2003
Posts: 10626

Premium

PostPosted: Mon Dec 29, 2003 6:32 pm    Post subject:

It looks like this one is making the most requested list. Hopefully we can get it added to an upcoming version in the new year.

rusticdog, what are the chances of seeing this and when should we look for it if it is possible?


_________________
Questions? Try the wiki
http://wiki.castlecops.com/MailWasher_Pro