| Author |
Message |
Steve
Guest IP: 68.42.*.*
|
Posted: Sun Jun 22, 2003 3:42 am Post subject: Filter doesn't find keywords in the body |
|
|
I'm having problems matching literal keywords in the body. I'm using no regexes. The filter looks for any of 10 keywords. There are more than 3 in the message body. But the filter never invokes.
I'm using MW PRO 3.0. I have no build number. I just purchased it.
Message in question follows. PLEASE TRY IT FOR YOURSELVES.
Steve Lindner
If the Body contains "penis" or "fuck" or "sex" or "sexual" or "arousal" or "hardcore" or "porn" or "pornography" or "xxx" or "nudity" then mark the message as mail to be deleted and mark the message as mail to be bounced.
|
|
|
 |
gary
Lieutenant
 Premium Member
 Joined: Dec 22, 2002 Posts: 260 Location: Dallas/Ft. Worth, USA
|
Posted: Sun Jun 22, 2003 5:42 am Post subject: |
|
|
Does the Raw Source View show that these words are all together, and not broken up with HTML comments, escaped, etc.? For example:
BY A SMA<!-- squeak -->LL PEN<!-- %RA=NDOM_WORD -->IS - HO<!-- cut --><!-- piece -->W DO YOU MEA=<!-- turn -->SURE UP</a></p> _________________ Gary
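To make the point concrete, here is a minimal Python sketch (not MailWasher's own code) of why a literal keyword test misses text that has been split up with HTML comments. The sample string is cut down from the fragment above, and `strip_comments` and `contains_keyword` are hypothetical helper names made up for this illustration.

```python
import re

# Assumed sample: a shortened piece of the raw source shown above.
raw = "BY A SMA<!-- squeak -->LL HO<!-- cut --><!-- piece -->W"

# Hypothetical helper: remove HTML comments before looking for keywords.
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def strip_comments(text: str) -> str:
    """Return text with every <!-- ... --> comment removed."""
    return COMMENT.sub("", text)


def contains_keyword(text: str, keyword: str) -> bool:
    """Case-insensitive literal substring test."""
    return keyword.lower() in text.lower()


# The literal test fails on the raw source because the word is broken up.
literal_hit = contains_keyword(raw, "small")
# After removing the comments the word is whole again.
cleaned_hit = contains_keyword(strip_comments(raw), "small")

print(literal_hit, cleaned_hit)
```

A filter that only sees the raw source behaves like the first test, which is why the rendered preview can show a word the filter never finds.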
|
|
|
 |
!Beady
Private

 Joined: Jun 15, 2003 Posts: 49 Location: Afghanistan
|
Posted: Sun Jun 22, 2003 1:23 pm Post subject: |
|
|
Really stupid question here, but are you sure you've got the "Any Rule" button clicked, rather than the "All Rules" button?
You say you're not using regexes, so I assume the one filter has a separate rule for each word?
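For anyone unsure why the button matters, here is a rough Python sketch of the difference. It only illustrates "any" versus "all" logic and is not how MailWasher is implemented; the body text and the `rule_hits` helper are invented for the example.

```python
from typing import List

# Three of the keywords from the filter in the first post.
keywords = ["hardcore", "xxx", "nudity"]

# Assumed message body containing only one of them.
body = "Free XXX pictures inside"


def rule_hits(text: str, words: List[str]) -> List[bool]:
    """One True/False result per keyword rule (case-insensitive literal test)."""
    lowered = text.lower()
    return [word.lower() in lowered for word in words]


hits = rule_hits(body, keywords)
any_match = any(hits)  # "Any Rule": one hit is enough
all_match = all(hits)  # "All Rules": every keyword must be present

print(any_match, all_match)
```

With "All Rules" selected, a ten-keyword filter would only fire on a message containing all ten words, which almost never happens.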
|
|
|
 |
gary
Lieutenant
 Premium Member
 Joined: Dec 22, 2002 Posts: 260 Location: Dallas/Ft. Worth, USA
|
Posted: Sun Jun 22, 2003 1:56 pm Post subject: |
|
|
That's not stupid at all! I screw that up all of the time. Thanks for bringing it up. _________________ Gary
|
|
|
 |
Steve
Guest IP: 68.42.*.*
|
Posted: Sun Jun 22, 2003 7:04 pm Post subject: |
|
|
I thought of HTML mixing. Nope, the keywords are isolated simple text with no apparent chance of such camouflage.
Also, this same problem has always happened. I've never seen ANY filter rule fire on something in the body.
|
|
|
 |
Steve
Guest IP: 68.42.*.*
|
Posted: Sun Jun 22, 2003 7:13 pm Post subject: |
|
|
I also thought of my own stupidity about clicking "ALL" instead of "ANY" clause. Nope, that's fine too. (You can see it yourself in the source I pasted for you.) And again, I have several filters looking for keywords in the body. None of them have ever fired.
I have also thought of the keyword being beyond the contents of what's previewed. But no, several keywords are within the preview frame plain and unmolested. Still no joy.
I also considered case sensitivity. But the default in the PERL documentation linked to in the FAQ claims no sensitivity by default. (I code in PERL occasionally.) Still no joy.
Can YOU (anyone) match on ANY keyword in ANY email body?
|
|
|
 |
Eggman5X
Captain

 Joined: Mar 13, 2003 Posts: 699 Location: HOU TX USA
|
Posted: Sun Jun 22, 2003 7:51 pm Post subject: |
|
|
Yes, body filters do "fire" ... when is another story.
If I recall correctly, the default for MWP 3.0 is to download only 20 lines of the message. If you right-click the message in question, and select "View entire header" up to the first 800 lines will be downloaded. When the status bar on this "view" window says "Done", simply close the window. The message is then rescanned through the filters. You will most likely see the status of the message change, and your filter will have worked.
A better solution is to download the latest version of MWP (3.1.0, 9 June 2003), which allows you to configure the number of lines of each message downloaded on the first pass. The setting is on the "Tools >> Options >> General" tab, is called "Spam Throttle", and can be set from 20 to 800 lines. Use the smallest number of lines that will still trigger your filters, so that you do not slow things down by downloading unnecessary lines.
The manual scan method described in the first paragraph can also be used in version 3.1, and can also be accessed by clicking the "View Raw Source" button on the Preview Pane.
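A small Python sketch of the effect, for illustration only. It assumes the filter can only see the lines that were downloaded; the `first_lines` helper and the made-up message are not part of MailWasher, and how the program actually counts lines (for example, whether headers are included) is not something this sketch knows.

```python
def first_lines(message: str, limit: int) -> str:
    """Keep only the first `limit` lines, imitating a partial download."""
    return "\n".join(message.splitlines()[:limit])


# Assumed message: 30 filler lines, with the keyword on line 31.
lines = [f"filler line {n}" for n in range(1, 31)] + ["click here to order"]
message = "\n".join(lines)

# With 20 lines downloaded the keyword is never seen.
seen_at_20 = "click here" in first_lines(message, 20)
# With 40 lines downloaded it is.
seen_at_40 = "click here" in first_lines(message, 40)

print(seen_at_20, seen_at_40)
```

This is the same reason that viewing the full message and letting it rescan can change the status of a message that was missed on the first pass.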
Good Luck.
{edited for lazy fingers on Sunday afternoon} _________________ Lightly scrambled, over-easy and stuffed with all sorts of goodies.
|
|
|
 |
Neep_heid
Sergeant

 Joined: Mar 31, 2003 Posts: 101 Location: Scotland
|
Posted: Sun Jun 22, 2003 8:02 pm Post subject: |
|
|
Could there be another filter, e.g. a "not to me" that has a higher priority than the one you set up to trap these words?
If this is the case, the "status" column will have the name of the filter displayed opposite these mails.
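Roughly what I mean, as a Python sketch. This assumes a "first matching filter wins" model, which is my guess at the behaviour and not documented fact; the filter names and the `first_matching_filter` helper are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# A filter here is just a name plus a test on the message text.
Filter = Tuple[str, Callable[[str], bool]]


def first_matching_filter(message: str, filters: List[Filter]) -> Optional[str]:
    """Return the name of the first filter that matches, in priority order."""
    for name, predicate in filters:
        if predicate(message):
            return name
    return None


# Stand-in for a "not to me" filter: pretend this mail is not addressed to us.
not_to_me: Filter = ("not to me", lambda message: True)
keyword_filter: Filter = ("keywords", lambda message: "click here" in message.lower())

message = "Click here to order"

# With "not to me" ranked first, the keyword filter never gets a look in.
status = first_matching_filter(message, [not_to_me, keyword_filter])
# Swapping the order lets the keyword filter claim the message.
status_swapped = first_matching_filter(message, [keyword_filter, not_to_me])

print(status, status_swapped)
```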
|
|
|
 |
dzeni
Guest IP: 202.89.*.*
|
Posted: Sun Jun 22, 2003 9:06 pm Post subject: |
|
|
I have the same problem. I have a filter which should search the body for the words "click here" or "Click Here" or "Click here" (I was not sure about the capitalization thing) and it has only worked once!
This is annoying as most of the spam that I receive actually has those two words in it. Am not sure about the preview thing because the "click here" does appear in the first 20 lines. What is going on here?
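On the capitalization question, here is how a single case-insensitive pattern would cover all three spellings in Python. This is only a sketch of the idea; whether MailWasher's own matching ignores case is not something it can tell you. The sample strings are made up.

```python
import re

# One pattern instead of three rules; \s+ also tolerates a line break
# or extra spaces between the two words.
CLICK_HERE = re.compile(r"click\s+here", re.IGNORECASE)

samples = [
    "click here",
    "Click Here",
    "Please CLICK\nHERE now",
    "nothing to see",
]

hits = [bool(CLICK_HERE.search(text)) for text in samples]
print(hits)
```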
|
|
|
 |
Steve
Guest IP: 68.42.*.*
|
Posted: Sun Jun 22, 2003 9:55 pm Post subject: Still no joy |
|
|
I'm beginning to think the keyword scan in the email body doesn't work in a large set of cases.
I've tried several additional possibilities. Nothing works.
BTW: I'm not new to pattern matching. I've written pattern systems before. So I'm wondering...does anyone KNOW that this feature was debugged and working on any platform?
|
|
|
 |
TCM
Guest IP: 202.37.*.*
|
Posted: Mon Jun 23, 2003 3:03 am Post subject: |
|
|
Look at the raw source of the email, not the MailWasher preview. The email may have been MIME encoded in base64, or any number of similar tricks. Just because a word is in the MailWasher preview of an email, it doesn't mean that word appears in the source of the email.
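Here is a short Python sketch of the base64 case, using the standard `email` and `base64` modules. The message is invented for the example; it just shows that a phrase can be readable after decoding while never appearing in the raw source a filter scans.

```python
import base64
from email import message_from_string

# Assumed example: a plain-text body sent with base64 transfer encoding.
encoded_body = base64.b64encode(b"Please click here to order").decode("ascii")

raw = (
    "From: sender@example.com\n"
    "Subject: hello\n"
    "MIME-Version: 1.0\n"
    'Content-Type: text/plain; charset="us-ascii"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    f"{encoded_body}\n"
)

# A literal search of the raw source finds nothing.
in_raw = "click here" in raw

# Decoding the payload recovers the readable text.
msg = message_from_string(raw)
decoded = msg.get_payload(decode=True).decode("us-ascii")
in_decoded = "click here" in decoded

print(in_raw, in_decoded)
```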
|
|
|
 |
dzeni
Guest IP: 202.89.*.*
|
Posted: Mon Jun 23, 2003 9:12 am Post subject: I got it ! |
|
|
Hi All,
Read through your comments and had a look at the original source of the message. The spam is getting through because the keywords were all interspersed with comment tags, so that a word like Viagra would look like this: Via<!-- some rubbish -->gra. I wrote a "comment" filter which said to bounce and delete any message which had <! in the body, and it worked a treat!
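In Python terms the filter amounts to the sketch below. The two sample bodies are made up, and `has_comment_marker` is just a name for the illustration. Note that the same test also fires on ordinary HTML mail that starts with a doctype or carries a harmless comment, so it may catch more than spam.

```python
def has_comment_marker(body: str) -> bool:
    """Crude test: does the body contain '<!' anywhere?"""
    return "<!" in body


# Assumed samples: one obfuscated word, one ordinary HTML newsletter.
spam = "Buy Via<!-- some rubbish -->gra today"
newsletter = "<!DOCTYPE html><html><body>Monthly newsletter</body></html>"

spam_flagged = has_comment_marker(spam)
newsletter_flagged = has_comment_marker(newsletter)

print(spam_flagged, newsletter_flagged)
```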
This is so cool! I believe the mystery is solved.
G-d Bless
Dzeni 
|
|
|
 |
gary
Lieutenant
 Premium Member
 Joined: Dec 22, 2002 Posts: 260 Location: Dallas/Ft. Worth, USA
|
Posted: Mon Jun 23, 2003 12:43 pm Post subject: |
|
|
If you find you are getting too many false positives with just looking for comments, you might try narrowing it with something like:
((<![\w\s,\.\-]+>)+([\w\s,\.\-]){1,20}){3}
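If you want to try the pattern outside MailWasher first, here is a quick check in Python, whose `re` module accepts this expression as written. The two sample strings are made up (the first borrows from the spam fragment earlier in the thread), and Python's regex engine is only a stand-in for whatever engine MailWasher uses.

```python
import re

# The pattern above: three runs of "comment(s) followed by 1-20 plain characters".
NARROW = re.compile(r"((<![\w\s,\.\-]+>)+([\w\s,\.\-]){1,20}){3}")

# Assumed spam sample with comments sprinkled through the words.
spam = "SMA<!-- squeak -->LL PEN<!-- cut -->IS - HO<!-- cut --><!-- piece -->W DO YOU"

# Assumed ordinary HTML with a doctype and a single comment.
ordinary = "<!DOCTYPE html>\n<html><body><!-- footer -->Hello</body></html>"

spam_hit = NARROW.search(spam) is not None
ordinary_hit = NARROW.search(ordinary) is not None

print(spam_hit, ordinary_hit)
```

The ordinary page is left alone because its two "<!" constructs are not followed closely by a third.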
Best of luck! _________________ Gary
|
|
|
 |
Steve
Guest IP: 68.42.*.*
|
|
|
 |
IP: 68.42.*.*
Guest
|
Posted: Mon Jun 23, 2003 6:48 pm Post subject: |
|
|
oops, I really meant:
([\w,\.\-])+(<![\w\s,\.\-]+>)+([\w,\.\-])+
| Steve wrote: | Hmmm...interesting...heartfelt thanks Gary!
Dinosaur brain here wants to do you one better as I think about this.
I think you really want comments *embedded* in words. So (rusty PERL here) that would be something like:
([\w\s,\.\-])+(<![\w\s,\.\-]+>)+([\w\s,\.\-])+
Is this right? Maybe the comment match should be non-greedy? After all, anybody that puts a comment INSIDE a word is probably trying to hide something? Yes? Does any honest HTML coder need to do this?
I'm really more of a LISP person (he said defensively). I also don't know html very well.
| gary wrote: | If you find you are getting too many false positives with just looking for comments, you might try narrowing it with something like:
((<![\w\s,\.\-]+>)+([\w\s,\.\-]){1,20}){3}
Best of luck! |
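For comparison, here is a Python sketch that runs both of the "embedded comment" patterns from this post against two made-up strings. Python's `re` module is used only as a stand-in for the engine MailWasher uses, and the variable names are invented for the example.

```python
import re

# First attempt (quoted above): \s is allowed on both sides of the comment.
LOOSE = re.compile(r"([\w\s,\.\-])+(<![\w\s,\.\-]+>)+([\w\s,\.\-])+")
# Corrected version: no whitespace allowed directly around the comment.
STRICT = re.compile(r"([\w,\.\-])+(<![\w\s,\.\-]+>)+([\w,\.\-])+")

# Assumed samples: a comment inside a word, and a comment between words.
embedded = "Via<!-- some rubbish -->gra"
between = "Hello <!-- note --> world"

loose_embedded = LOOSE.search(embedded) is not None
loose_between = LOOSE.search(between) is not None
strict_embedded = STRICT.search(embedded) is not None
strict_between = STRICT.search(between) is not None

print(loose_embedded, loose_between, strict_embedded, strict_between)
```

On the greedy question: the character class inside the comment part cannot match ">", so that part cannot run past the first ">" whether it is greedy or not.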
|
|
|
|
 |
|
|