digloo
Guest IP: 68.3.*.*
|
Posted: Fri Sep 26, 2003 5:49 am Post subject: problem with regular expression match |
|
|
I've been getting some emails that look like this:
something@mx###.something.(com|net)
The 'something's vary. The part after the '@' is always the letters 'mx' followed by three digits. So I defined these regexps:
.+@mx[0-9]{3}\..+\.(com|net)
\w+@mx[0-9]{3}\..+\.(com|net)
\w+@mx[0-9]{3}\.\w+\.(com|net)
and other variations
When I test it with the following email address:
driving4dollars.com@mx084.mailtransfer.net
I get an error dialog pop-up that usually says this:
"Wildcard expression not well-formed:
must have a . right of the @"
Sometimes it complains about nothing on the left of the @.
What's wrong here?
-David
TimeGhost
Captain

 Joined: Apr 11, 2003 Posts: 747 Location: USA
|
Posted: Fri Sep 26, 2003 9:17 pm Post subject: |
|
|
You have a greedy wildcard that eats the top level domain. Try putting a question mark after every plus sign.
Also note that regular expressions do not work for blacklist entries. Hopefully, you're constructing a filter rule. If not, use *@mx???.*.com and *@mx???.*.net instead.
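For anyone who wants to check the patterns outside MailWasher, here is a rough sketch using Python's `re` and `fnmatch` modules. Both are stand-ins (an assumption): MailWasher's own regex engine and wildcard matcher may behave differently, and the sketch assumes "contains"-style searching rather than a whole-string match. The variable names are made up for illustration.

```python
import re
from fnmatch import fnmatchcase

address = "driving4dollars.com@mx084.mailtransfer.net"

# The three patterns from the first post, plus a lazy (+?) variant of the first.
patterns = [
    r".+@mx[0-9]{3}\..+\.(com|net)",
    r"\w+@mx[0-9]{3}\..+\.(com|net)",
    r"\w+@mx[0-9]{3}\.\w+\.(com|net)",
    r".+?@mx[0-9]{3}\..+?\.(com|net)",
]

# re.search looks for the pattern anywhere in the string.
regex_hits = [re.search(p, address) is not None for p in patterns]

# Shell-style wildcards: * is any run of characters, ? is exactly one character.
wildcards = ["*@mx???.*.com", "*@mx???.*.net"]
wildcard_hits = [fnmatchcase(address, w) for w in wildcards]

print(regex_hits)
print(wildcard_hits)
```

In Python at least, the greedy and lazy versions all find the sample address, because a backtracking engine gives characters back when it needs to. That suggests the "not well-formed" dialog comes from typing a regex into a wildcard-only field. Note that `???` in the wildcard accepts any three characters, not only digits.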
HTH
digloo
Guest IP: 68.3.*.*
|
Posted: Sat Sep 27, 2003 9:05 am Post subject: |
|
|
Thanks for the help.
So, exactly which parts of this system use _real_ Regular Expressions, and which ones use the flimsy "wildcard matching"?
-David
Ikeb
Special Response Team Forums Admin
 Joined: Apr 20, 2003 Posts: 16515
|
Posted: Sat Sep 27, 2003 8:12 pm Post subject: |
|
|
digloo wrote: So, exactly which parts of this system use _real_ Regular Expressions,
The header and body filters.
Quote: and which ones use the flimsy "wildcard matching"?
The blacklist and friends list.
digloo
Guest IP: 68.3.*.*
|
Posted: Sun Sep 28, 2003 8:17 pm Post subject: |
|
|
Which one do the SUBJECT filters use? E.g., how do I filter out stuff like:
viagra
V1aGrA
V|agra
ViAGRA
etc.
stan_qaz
Premium Member
 Joined: Mar 31, 2003 Posts: 10629
|
Posted: Mon Sep 29, 2003 2:10 am Post subject: |
|
|
Try this:
If the Subject field contains "viagra" or "V1aGrA" or "V|agra " or "ViAGRA" then mark the message as mail to be deleted.
But that only works if they don't toss in any stupid HTML tricks or come up with more variations. If they do, you'd be better off getting help from Gary, one of the local filter experts. Search the forums on "Gary and filters" and see if any of the ones he has written will do for you.
_________________
Questions? Try the wiki
http://wiki.castlecops.com/MailWasher_Pro
digloo
Guest IP: 68.3.*.*
|
Posted: Mon Sep 29, 2003 7:13 am Post subject: |
|
|
Well, I've written a bunch of regular expression filters for the Subject lines, like for V1aGra etc., but they aren't working. There's no way to test them like there is for the wildcard matches on email addresses.
I'm pretty good with regular expressions, so it's a little confounding to me why some of them are working and some aren't.
-David
digloo
Guest IP: 68.3.*.*
|
Posted: Mon Sep 29, 2003 7:17 am Post subject: |
|
|
I forgot to mention that I'm not aware that HTML can be placed in SUBJECT header lines.
Also, I'm looking for generic patterns, not a way to enumerate every possible combination. :p That's why you use these critters in the first place!
The pattern "[Vv][Ii1|l][Aa][Gg][Rr][Aa]" should match just about any tricky spelling other than putting periods in between letters (and misspellings). I'm just unsure if there's a way of specifying "case-insensitive" matching, so that a shorter pattern like "v[i1l|]agra" would work.
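As a sanity check of the two patterns above, here is a small sketch in Python's `re` module. Python is only a stand-in here (an assumption); MailWasher's engine and its default case handling may differ. In Python, matching is case-sensitive unless you pass `re.IGNORECASE` or start the pattern with the inline `(?i)` flag. The variable names are made up for illustration.

```python
import re

subjects = ["viagra", "V1aGrA", "V|agra", "ViAGRA"]

# Every letter spelled out in both cases, as in the post above.
explicit = re.compile(r"[Vv][Ii1|l][Aa][Gg][Rr][Aa]")

# The shorter pattern, made case-insensitive with a flag argument...
flagged = re.compile(r"v[i1l|]agra", re.IGNORECASE)

# ...or with the inline (?i) modifier at the start of the pattern.
inline = re.compile(r"(?i)v[i1l|]agra")

results = {
    s: (bool(explicit.search(s)), bool(flagged.search(s)), bool(inline.search(s)))
    for s in subjects
}

for subject, hits in results.items():
    print(subject, hits)
```

All three forms agree on the four sample subjects. Inside a character class the `|` is a literal pipe, so `[i1l|]` needs no escaping. None of them catch letters separated by spaces or periods.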
TimeGhost
Captain

 Joined: Apr 11, 2003 Posts: 747 Location: USA
|
Posted: Mon Sep 29, 2003 3:10 pm Post subject: |
|
|
1. I think the RegEx FAQ that RusticDog posted has a link to a RegEx tester.
2. Searches are case-insensitive unless you use the (?i) modifier.
3. I highly recommend Gary's Filters. There's one that will handle v i a g r a, for example.
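To give an idea of how a spaced-out spelling like "v i a g r a" might be caught, here is a hypothetical sketch in Python's `re` module. It is not Gary's filter (that one isn't quoted in this thread), and `SEP`, `letters`, and `spaced` are made-up names. The idea is simply to allow any run of non-letter, non-digit characters between the letters.

```python
import re

# Zero or more separator characters: anything that is not a letter or digit.
SEP = r"[\W_]*"

# One sub-pattern per letter; the second slot also accepts common "i" lookalikes.
letters = ["v", "[i1l|]", "a", "g", "r", "a"]

# Joining with SEP yields v[\W_]*[i1l|][\W_]*a[\W_]*g[\W_]*r[\W_]*a
spaced = re.compile(SEP.join(letters), re.IGNORECASE)

for subject in ["v i a g r a", "V.1.A.G.R.A", "V|agra", "vintage rag"]:
    print(subject, bool(spaced.search(subject)))
```

The trade-off is false positives: because spaces count as separators, an innocent phrase such as "via grapevine" also matches, so a real filter would likely need tighter rules.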
stan_qaz
Premium Member
 Joined: Mar 31, 2003 Posts: 10629
|
TimeGhost
Captain

 Joined: Apr 11, 2003 Posts: 747 Location: USA
|
stan_qaz
Premium Member
 Joined: Mar 31, 2003 Posts: 10629
|
Posted: Mon Sep 29, 2003 9:25 pm Post subject: |
|
|
TimeGhost, Thanks, I missed it when I went looking and remembered seeing it somewhere.
Since you have been thinking about filters for a while, what do you think of filtering the message body for the website link? I put together a few with the built-in filter dialog, just a simple test:
If the Body contains "GainTrafficFast.com" or "trafficglue.com" or "patch-direct.com" or "net4net.net" or "getmoresavemore.biz" or "vano-soft.biz" or "shaira123.com" then mark the message as mail to be deleted.
and it seems to be working, catching stuff that blacklisting From addresses is missing. I have 8 of these simple filters now, and I can see it getting out of hand with the dialog box filter builder, where a hand-built filter might be both faster and easier to maintain.
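One way a hand-built version could look is a single alternation built from a list of domains. The sketch below uses Python's `re` module purely for illustration (an assumption; MailWasher's filter file syntax is not shown in this thread), and `domains`, `body_filter`, and `is_spam_body` are made-up names.

```python
import re

# The domains from the dialog-built filter above, kept in one editable list.
domains = [
    "GainTrafficFast.com",
    "trafficglue.com",
    "patch-direct.com",
    "net4net.net",
    "getmoresavemore.biz",
    "vano-soft.biz",
    "shaira123.com",
]

# re.escape makes each "." a literal dot instead of "any character".
body_filter = re.compile("|".join(re.escape(d) for d in domains), re.IGNORECASE)


def is_spam_body(body: str) -> bool:
    """Return True if the body mentions any of the listed domains."""
    return body_filter.search(body) is not None


print(is_spam_body("Visit http://www.TrafficGlue.com/now for a deal"))
print(is_spam_body("Lunch on Friday?"))
```

Adding a new domain then means adding one line to the list rather than building another rule in the dialog.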
Off to read the filter tutorials you pointed out.
TimeGhost
Captain

 Joined: Apr 11, 2003 Posts: 747 Location: USA
|
Posted: Mon Sep 29, 2003 9:44 pm Post subject: |
|
|
You're welcome. I'm still not sure about the link to the program that does RegEx testing. Maybe someone else will post it.
I have one filter like yours. In that case, there was no other way I could match it.
There are some other neat URL filters that look for encoded characters (%20, etc.) or an IP address in the link. Gary's "Questionable Links" filter is a good one to use for this. His "HTML Spam Tricks" is another.
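As a rough illustration of the idea (not Gary's actual filters, which aren't quoted here), the sketch below uses Python's `re` module to flag links containing percent-encoded characters or a bare IP address as the host. The names `encoded_url`, `ip_url`, and `questionable_link` are made up, and the patterns are deliberately simple.

```python
import re

# A link whose URL contains a %XX escape such as %20 or %76.
encoded_url = re.compile(r"https?://[^\s\"'<>]*%[0-9a-f]{2}", re.IGNORECASE)

# A link whose host is a dotted-quad IP address rather than a name.
# The lookahead makes sure the address is not just the start of a longer hostname.
ip_url = re.compile(
    r"https?://(?:\d{1,3}\.){3}\d{1,3}(?=[:/\s\"'<>]|$)", re.IGNORECASE
)


def questionable_link(body: str) -> bool:
    """Return True if the body has a percent-encoded or IP-address link."""
    return bool(encoded_url.search(body) or ip_url.search(body))


print(questionable_link('<a href="http://example.com/%76%69">click</a>'))
print(questionable_link("Go to http://192.168.10.5/offer now"))
print(questionable_link("See http://www.example.com/index.html"))
```

Plenty of legitimate links contain %20, so a check like this is probably better used to flag mail for review than to delete it outright.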
Good luck!
stan_qaz
Premium Member
 Joined: Mar 31, 2003 Posts: 10629
|
Posted: Mon Sep 29, 2003 10:29 pm Post subject: |
|
|
I'm going to drop a few of Gary's filters in my system as soon as I understand what they are doing.