CastleCops, Internet Crime Fighters

Filter nonexistent email addresses?

 
IP: 24.136.*.*

Guest






Posted: Thu Nov 27, 2003 5:18 pm    Post subject: Filter nonexistent email addresses?

How would I filter out nonexistent email addresses? For example, I might get an email from something like this:

?B8L?#LSJK<@>

MWP somehow knows that it's not a real email address, because it shows all that crap as the friendly name, and puts empty brackets for the email address. It doesn't let you blacklist it. Well, I'd like to filter out or blacklist empty email addresses. How do you do this? TIA

-Jeremy

IP: 68.51.*.*

Guest






Posted: Thu Nov 27, 2003 11:07 pm    Post subject:

You can try this filter:

Code:
[enabled],"[1] Bad ""From"" (F)","[1] Blank From: Address (F)",255,OR,Delete,From,doesn'tContainRE,"[\w.-]+@([\w-]+\.)+[A-Z]{2,4}"



It is copied from Gary P.'s filter page here:

http://www.w5hq.com/MailWasher/MailWasherFilters.txt


I do not use this filter, so I cannot report on how it behaves in real life; you'll need to test-run it to see whether it gives any false positives. The filter sets matches to be deleted because an incomplete From field contains no address that could be blacklisted.
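For anyone who wants to try the idea outside MailWasher first, here is a minimal Python sketch of what the filter appears to do: flag a From value when it contains nothing shaped like an address. Python's `re` module stands in for MailWasher's regex engine, case-insensitive matching is an assumption, and `lacks_address` and the second sample are made up for illustration.

```python
import re

# Address pattern copied from the filter above. Case-insensitive matching is an
# assumption, so that [A-Z] also covers lowercase domain names.
ADDRESS_RE = re.compile(r"[\w.-]+@([\w-]+\.)+[A-Z]{2,4}", re.IGNORECASE)


def lacks_address(from_header: str) -> bool:
    """Mimic doesn'tContainRE: True when no address-like text is found."""
    return ADDRESS_RE.search(from_header) is None


# The garbled From value quoted in the first post.
print(lacks_address("?B8L?#LSJK<@>"))                # True
# A hypothetical well-formed From value.
print(lacks_address("Jeremy <jeremy@example.com>"))  # False
```

A True result corresponds to the filter firing, so the message would be marked for deletion.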

Ikeb

Special Response Team
Forums Admin

Joined: Apr 20, 2003
Posts: 16506


Posted: Fri Nov 28, 2003 6:29 am    Post subject:

Anonymous wrote:
The filter sets catches to be deleted since in an incomplete from field there is no address that can be blacklisted.

Presumably one could set a wildcard entry in the blacklist for '?B8L?#LSJK@*' in this case. I don't imagine it would ever be a valid email address.

Do you get repeats of the same address and/or common character patterns, Jeremy?

Anonymous, I guess

Guest
IP: 67.2.*.*






Posted: Fri Nov 28, 2003 7:01 am    Post subject: I'm glad I found this post

Anonymous wrote:
You can try this filter:

Code:
[enabled],"[1] Bad ""From"" (F)","[1] Blank From: Address (F)",255,OR,Delete,From,doesn'tContainRE,"[\w.-]+@([\w-]+\.)+[A-Z]{2,4}"



It is copied from Gary P.'s filter page here:

http://www.w5hq.com/MailWasher/MailWasherFilters.txt


I do not use this filter, so I cannot report on how it behaves in real life; you'll need to test-run it to see whether it gives any false positives. The filter sets matches to be deleted because an incomplete From field contains no address that could be blacklisted.


I'm glad I found this post because I found that filter and I do not believe it does what is intended. I have two issues with it:

  • I looked on the ICANN website, and there appear to be legal TLDs of more than 4 letters (.museum).
  • That feature doesn't really work anyway: as soon as the pattern finds a presumed TLD of 2-4 characters it counts as a match, because the filter doesn't look for the end of the string, so it fails to exclude TLDs that are too long to be legal.

With simple modifications these problems are resolved and
Code:

[\w.-]+@([\w-]+\.)+[A-Z]{2,4}

becomes
Code:

[\w\.-]+@([\w-]+\.)+[A-Z]{2,6}>$

The ">$" prevents something like "no_one@nowhere.way_too_long_tld" from being accepted as valid. The 6 instead of 4 is there to accommodate ".museum".
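A quick way to see both points is to run the two patterns side by side. This is only a sketch: Python's `re` module stands in for MailWasher's engine, case-insensitive matching is assumed, and the From values are invented for the test.

```python
import re

FLAGS = re.IGNORECASE  # assumption: the filter engine ignores case
ORIGINAL = re.compile(r"[\w.-]+@([\w-]+\.)+[A-Z]{2,4}", FLAGS)
ANCHORED = re.compile(r"[\w\.-]+@([\w-]+\.)+[A-Z]{2,6}>$", FLAGS)

# Hypothetical From values, made up for this comparison.
too_long = "Nobody <no_one@nowhere.way_too_long_tld>"
museum = "Curator <curator@gallery.museum>"

# The unanchored pattern stops after "way" and calls the address valid.
print(bool(ORIGINAL.search(too_long)))  # True
# The anchored pattern needs 2-6 letters followed by ">" at the very end.
print(bool(ANCHORED.search(too_long)))  # False
# The original only accepts .museum by matching the prefix "muse".
print(bool(ORIGINAL.search(museum)))    # True
print(bool(ANCHORED.search(museum)))    # True
```

Since the filter fires when the pattern is not found, a False here means the message would be caught.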

IP: 68.51.*.*

Guest






Posted: Fri Nov 28, 2003 7:27 am    Post subject: Re: I'm glad I found this post

Anonymous, I guess wrote:

I'm glad I found this post because I found that filter and I do not believe it does what is intended. I have two issues with it:


Hello Anonymous, I guess,
As I mentioned in the post above, I do not use the filter (since I get almost no spam with an invalid From address), so I have no real-life experience with it. For everyone's benefit, how does your modification work in real life?
Btw, thanks for posting a further development of the filter.

Ikeb


Posted: Fri Nov 28, 2003 7:32 am    Post subject: Re: I'm glad I found this post

Anonymous, I guess wrote:

Code:

[\w\.-]+@([\w-]+\.)+[A-Z]{2,6}>$

The ">$" prevents something like "no_one@nowhere.way_too_long_tld" from being accepted as valid. The 6 instead of 4 is there to accommodate ".museum".

Good points ... except that the '>' isn't always present. So I'd say the regex should be
Code:
[\w\.-]+@([\w-]+\.)+[A-Z]{2,6}>??$
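To illustrate why the optional '>' matters, here is a small Python sketch comparing the two endings on a From value with and without angle brackets. Python's `re` is only a stand-in for the MailWasher engine, case-insensitive matching is assumed, and both From values are invented.

```python
import re

FLAGS = re.IGNORECASE  # assumption: the filter engine ignores case
REQUIRES_BRACKET = re.compile(r"[\w\.-]+@([\w-]+\.)+[A-Z]{2,6}>$", FLAGS)
OPTIONAL_BRACKET = re.compile(r"[\w\.-]+@([\w-]+\.)+[A-Z]{2,6}>??$", FLAGS)

# Hypothetical From values: one bare address, one in angle brackets.
bare = "someone@example.com"
bracketed = "Someone <someone@example.com>"

print(bool(REQUIRES_BRACKET.search(bare)))       # False: valid address rejected
print(bool(OPTIONAL_BRACKET.search(bare)))       # True
print(bool(REQUIRES_BRACKET.search(bracketed)))  # True
print(bool(OPTIONAL_BRACKET.search(bracketed)))  # True
```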

Anonymous, I guess

Guest
IP: 67.2.*.*






Posted: Fri Nov 28, 2003 9:04 am    Post subject: Re: I'm glad I found this post

Ikeb wrote:
Anonymous, I guess wrote:

Code:

[\w\.-]+@([\w-]+\.)+[A-Z]{2,6}>$

The ">$" prevents something like "no_one@nowhere.way_too_long_tld" from being accepted as valid. The 6 instead of 4 is there to accommodate ".museum".

Good points ... except that the '>' isn't always present. So I'd say the regex should be
Code:
[\w\.-]+@([\w-]+\.)+[A-Z]{2,6}>??$

I didn't think of that. I suppose there's no reason why ">" should be there if some spam mailing program or worm SMTP client doesn't choose to put it there. BTW, what's the difference between the "?" and "??" iterators?
Anonymous wrote:

Hello Anonymous, I guess,
As I mentioned in the post above, I do not use the filter (since I get almost no spam with an invalid From address), so I have no real-life experience with it. For everyone's benefit, how does your modification work in real life?
Btw, thanks for posting a further development of the filter.


Actually, I thought the same thing about it not being very practical, but I've seen it triggered several times in a couple of days. I started using MailWasher a couple of days ago because I'm being flooded by the Swen.a worm and I wanted to be able to delete those messages on the server without having to go to the webmail every time (in addition to POP, my ISP allows access through the web). I put together a set of filters that seem to work well for excluding Swen.a-infected messages. I was surprised to find that a few messages (not many, I grant you) were filtered by the non-existent address filter (most or all were using an empty e-mail address).

So yes, some real world e-mails come with a malformed email address. However, counting on this to catch Swen.a won't work as the wormy e-mails produced by Swen.a with a malformed e-mail address are in the minority.

I won't post those here because they'll mess with the formatting (being rather long) but they can be found at the following address:

http://webspace4me.net/~cosmicaug/msf.html#swenfil

I think the best working filter in there is the last one, which wasn't really something I put together (I just read about that approach elsewhere and tried to rewrite it so that MailWasher would understand it).
It has caught everything so far that something else (like a black hole or the non-existent mail address filter) hasn't caught first.

I also turned it into a batch file that appends the filters to "filters.txt" (adding an "echo" to the beginning of the lines and an append redirector --">>"-- to the end of the lines) to get around the problem of unwittingly adding unwanted linefeeds when cutting and pasting by hand.
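The same goal (getting each filter into "filters.txt" as exactly one line) can also be sketched in Python instead of a batch file. This is not the batch file described above, just an alternative illustration; `append_filters`, the UTF-8 encoding, and the choice to rejoin wrapped pieces without a space are all assumptions.

```python
from pathlib import Path


def append_filters(filters_path, filter_lines):
    """Append each filter as exactly one line, removing stray line breaks."""
    path = Path(filters_path)
    existing = path.read_text(encoding="utf-8") if path.exists() else ""
    with path.open("a", encoding="utf-8") as fh:
        # Make sure the new filters start on a fresh line.
        if existing and not existing.endswith("\n"):
            fh.write("\n")
        for line in filter_lines:
            # Rejoin pieces that a copy and paste may have wrapped
            # (assumption: the break fell inside the filter text, so no
            # space is inserted when joining).
            one_line = "".join(line.splitlines()).strip()
            if one_line:
                fh.write(one_line + "\n")
```

Back up the existing file before trying anything like this, since a malformed line may stop the filters from loading.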

Though the filters have produced zero false positives, I can't claim they won't do so as I haven't really received that much e-mail the last day (other than the wormy messages, of which I've received plenty).

BTW, I'm not really intending to be anonymous, I'm just too lazy to register.
--August Pamplona

Anonymous, I guess

Guest
IP: 67.2.*.*






Posted: Fri Nov 28, 2003 9:09 am    Post subject: Re: I'm glad I found this post

Anonymous, I guess wrote:

BTW, I'm not really intending to be anonymous, I'm just too lazy to register.
--August Pamplona


Which is too bad, because as an unregistered guest I can't go back and edit the many spelling mistakes one only manages to see after clicking the 'Submit' button. :(
--August Pamplona

IP: 68.51.*.*

Guest






Posted: Fri Nov 28, 2003 9:50 am    Post subject:

Welcome to the MailWasher forums August Pamplona. Looks like the forum has gained another great poster.

For those who use IE:

http://www.iespell.com/

Ikeb


Posted: Fri Nov 28, 2003 7:52 pm    Post subject: Re: I'm glad I found this post

Anonymous, I guess wrote:
Ikeb wrote:
Good points ... except that the '>' isn't always present. So I'd say the regex should be
Code:
[\w\.-]+@([\w-]+\.)+[A-Z]{2,6}>??$

I didn't think of that. I suppose there's no reason why ">" should be there if some spam mailing program or worm SMTP client doesn't choose to put it there. BTW, what's the difference between the "?" and "??" iterators?

From the TRegExpr help document:
Code:
?      zero or one ("greedy"), similar to {0,1}
??     zero or one ("non-greedy"), similar to {0,1}?

So if the search engine had a choice, the former would choose one while the latter would choose zero. To be honest, I don't know whether anything could make a difference in this case. Denn988 has convinced me to default to the non-greedy form unless there's a reason to do otherwise.
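A tiny experiment makes the "choice" visible. Python's `re` module is used here as a stand-in (the help text quoted above is from TRegExpr), on the assumption that the two engines treat "?" and "??" the same way in this respect.

```python
import re

# Python's re is a stand-in here; the thread's engine is TRegExpr.
greedy = re.match(r">?", ">").group()    # takes the ">" when it can
lazy = re.match(r">??", ">").group()     # prefers to take nothing
forced = re.match(r">??$", ">").group()  # "$" forces the lazy form to take it

print(repr(greedy), repr(lazy), repr(forced))  # '>' '' '>'
```

When something after the quantifier (here "$") forces the issue, both forms end up consuming the same text.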

Anonymous, I guess wrote:
I won't post those here because they'll mess with the formatting (being rather long) but they can be found at the following address:

http://webspace4me.net/~cosmicaug/msf.html#swenfil

Yeah, even if applying the 'Code' and '/Code' BBCode tags to bracket such filter expressions, the formatting is screwed up. :(
Thanks for putting up the web page and providing the reference.

Anonymous, I guess wrote:
BTW, I'm not really intending to be anonymous, I'm just too lazy to register.
--August Pamplona

But not too lazy to put up a web page of some neat Swen filters? What's up with that? :P

Anonymous, I guess

Guest
IP: 67.2.*.*






Posted: Fri Nov 28, 2003 8:46 pm    Post subject: In form spell checking

Anonymous wrote:
Welcome to the MailWasher forums August Pamplona. Looks like the forum has gained another great poster.

For those who use IE:

http://www.iespell.com/


Thanks, I didn't know about that. I'm actually using Firebird most of the time, which doesn't have a spellchecker extension yet, but I've just found out that there's one under development right now, which I've downloaded and may test if I feel brave enough. The Mozilla suite (the older project, which includes browser and mail/news client functionality, whereas the newer pre-release project separates these into Firebird and Thunderbird respectively) has had spellchecking built in since the 1.5b builds, and the Mozilla Thunderbird mail/news client also has it built in.

IP: 67.2.*.*

Guest






Posted: Fri Nov 28, 2003 9:10 pm    Post subject: Re: I'm glad I found this post

Ikeb wrote:
Anonymous, I guess wrote:

I didn't think of that. I suppose there's no reason why ">" should be there if some spam mailing program or worm SMTP client doesn't choose to put it there. BTW, what's the difference between the "?" and "??" iterators?

From the TRegExpr help document:
Code:
?      zero or one ("greedy"), similar to {0,1}
??     zero or one ("non-greedy"), similar to {0,1}?

So if the search engine had a choice, the former would choose one while the latter would choose zero. To be honest, I don't know whether anything could make a difference in this case. Denn988 has convinced me to default to the non-greedy form unless there's a reason to do otherwise.

Got it! Greedy vs. non-greedy was a little hard to grasp on the first couple of readings (for me in any case) but I think I've got it now.

However, giving it a second look I'm not sure why we should care what the regexp returns since the filters, in the context of MailWasher, operate in a boolean fashion (a match either exists or it doesn't --with actual contents of the match not being of great consequence).
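That boolean point can be checked directly: the greedy and non-greedy endings give the same yes/no answer on every sample tried below. This is a sketch with Python's `re` as a stand-in and case-insensitive matching assumed; apart from the garbled value from the first post, the samples are invented.

```python
import re

FLAGS = re.IGNORECASE  # assumption: the filter engine ignores case
GREEDY = re.compile(r"[\w\.-]+@([\w-]+\.)+[A-Z]{2,6}>?$", FLAGS)
LAZY = re.compile(r"[\w\.-]+@([\w-]+\.)+[A-Z]{2,6}>??$", FLAGS)

# The third sample is the garbled From value from the first post;
# the others are hypothetical.
samples = [
    "Someone <someone@example.com>",
    "someone@example.com",
    "?B8L?#LSJK<@>",
    "Nobody <no_one@nowhere.way_too_long_tld>",
]

verdicts = [(bool(GREEDY.search(s)), bool(LAZY.search(s))) for s in samples]
for sample, (greedy_hit, lazy_hit) in zip(samples, verdicts):
    print(f"{sample!r}: greedy={greedy_hit} lazy={lazy_hit}")
```

The two columns agree on each line, which fits the observation that only the existence of a match matters to the filter.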

By the way, I've already had a false positive (it was actually spam but the e-mail address was valid) because of the issue corrected by the addition of "?". Which shows the issue you raised does have "real world" application.

Anonymous, I guess wrote:
I won't post those here because they'll mess with the formatting (being rather long) but they can be found at the following address:

http://webspace4me.net/~cosmicaug/msf.html#swenfil

Yeah, even if applying the 'Code' and '/Code' BBCode tags to bracket such filter expressions, the formatting is screwed up. :(
Thanks for putting up the web page and providing the reference.

Ikeb wrote:
Anonymous, I guess wrote:
BTW, I'm not really intending to be anonymous, I'm just too lazy to register.
--August Pamplona

But not too lazy to put up a web page of some neat Swen filters? What's up with that? :P

Just weird, I guess.
--August Pamplona

Ikeb


Posted: Fri Nov 28, 2003 10:47 pm    Post subject: Re: I'm glad I found this post

Anonymous wrote:
Ikeb wrote:
Anonymous, I guess wrote:
BTW, I'm not really intending to be anonymous, I'm just too lazy to register.
--August Pamplona

But not too lazy to put up a web page of some neat Swen filters? What's up with that? :P

Just weird, I guess.
--August Pamplona

You and Denn988 should compare notes sometime. :roll: :P

IP: 24.136.*.*

Guest






Posted: Wed Dec 03, 2003 12:00 am    Post subject:

Ikeb wrote:

Presumably one could set a wildcard entry in the blacklist for '?B8L?#LSJK@*' in this case. I don't imagine it would ever be a valid email address.

Do you get repeats of the same address and/or common character patterns, Jeremy?


No, it's always something different.

-Jeremy

All times are GMT