| Author |
Message |
Cowboy
Guest IP: 213.112.*.*
|
Posted: Mon Nov 24, 2003 6:02 pm Post subject: |
|
|
denn988, thank you very much for the filter. That is exactly what I wanted!
I too have tested it for a day now and, like you, about the second thing I did was remove the [!/] to see if I would get more hits.
I did get more hits but unfortunately I also got one bad hit, so I put the [!/] back in. I have had no bad hits with that whatsoever.
I'm currently trying to figure out if there is any other way to catch the non-comment garbage.
I've also got a new problem: I'm getting a lot more non-sex spam lately, which doesn't contain much in the way of words you'd want to filter on. Need to think about what to do about those.
Plus, I'm getting spam where the letters of the visible text come through in the raw source as a string of numbers like #233 and so on, and only the whited-out text is normal in raw.
|
|
|
 |
Ikeb
Special Response Team Forums Admin
 Joined: Apr 20, 2003 Posts: 16543
|
|
|
 |
denn988
Guest IP: 66.44.*.*
|
Posted: Mon Nov 24, 2003 7:52 pm Post subject: |
|
|
| Cowboy wrote: | denn988, thank you very much for the filter. That is exactly what I wanted!
I too have tested it for a day now and, like you, about the second thing I did was remove the [!/] to see if I would get more hits.
I did get more hits but unfortunately I also got one bad hit, so I put the [!/] back in. I have had no bad hits with that whatsoever.
I'm currently trying to figure out if there is any other way to catch the non-comment garbage.
I've also got a new problem: I'm getting a lot more non-sex spam lately, which doesn't contain much in the way of words you'd want to filter on. Need to think about what to do about those.
Plus, I'm getting spam where the letters of the visible text come through in the raw source as a string of numbers like #233 and so on, and only the whited-out text is normal in raw. |
Cowboy,
You could try setting up two versions of the filter.
The first one (higher in priority) would have the !/ included, while the second one would be made broader by not including those characters.
I would not auto-delete based on either of them alone, but there may be additional rules that you could add to one or the other that would give you two different keys to trap on.
That might make the possibility of false positives rare enough to auto-delete.
Give it some thought. You have some good strategy ideas...and if you could learn regex you might be able to devise some good filters.
As far as the number strings (#233)...
Could you post a few examples??
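A rough sketch of the two-filter idea in Python, just to make the logic concrete. The strict pattern with [!/] is reconstructed from the discussion, the white-text check is a made-up second key, and the names (`STRICT`, `BROAD`, `HIDDEN_TEXT`, `classify`) are hypothetical. Python's regex dialect may also differ from MailWasher's, so treat this as an illustration, not a drop-in filter.

```python
import re

# Strict filter (assumed form): a tag starting with "!" or "/" between two letters.
STRICT = re.compile(r"[a-z]<[!/][^<]*?>[a-z]", re.IGNORECASE)
# Broad filter: any tag at all between two letters.
BROAD = re.compile(r"[a-z]<[^<]*?>[a-z]", re.IGNORECASE)
# Hypothetical second key: text hidden with a white font colour.
HIDDEN_TEXT = re.compile(r"<font\s+color=[\"']?(?:white|#ffffff)", re.IGNORECASE)


def classify(body):
    """Return (action, rule). Delete only when two independent keys agree."""
    hidden = bool(HIDDEN_TEXT.search(body))
    # Filters are tried in priority order: strict first, then broad.
    for name, pattern in (("strict", STRICT), ("broad", BROAD)):
        if pattern.search(body):
            return ("delete" if hidden else "flag", name)
    return ("keep", None)
```

A single match only flags the mail for review; deletion needs the second key as well, which is the point of having two different things to trap on.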
|
|
|
 |
Cowboy
Guest IP: 213.112.*.*
|
Posted: Mon Nov 24, 2003 9:25 pm Post subject: |
|
|
I already have the two versions set up like you say. (I put the second filter back when you suggested it.)
I'm running both of them to override the friends list, and they are the two top filters.
But I cannot see how the first filter does anything the second filter won't do. I'm rather sceptical of the second filter and would like something more precise, just like the first one.
If the first filter keeps working perfectly for an extended period of time I will switch it to auto. I won't do it right away.
I will think about combining the second filter with something else.
My hope of learning regex shrinks every time I try. The commands list in the help files seems to be incomplete, and I don't quite understand the examples either.
Here's an example of numbers for letters:
----------------------------------
<html?<br><font color=white>Viziertoaskhisson,whoownedthetruth,addingthat,dearlyALCIBIADES: You did.DICAEOPOLIS</font><font color=black><br>
Unleash The Power<font color=white>that have responded.</font><font color=black><br>
Of Your Digital Cable <font color=white>she always did, At night she would not come if it was dark, for she</font><font color=black><br>
-----------------------------------------
And so on...
|
|
|
 |
Cowboy
Guest IP: 213.112.*.*
|
Posted: Mon Nov 24, 2003 9:30 pm Post subject: |
|
|
Damn numbers come out as text whatever I do.
How do I post the code so you can see it?
|
|
|
 |
stan_qaz
Premium Member
 Joined: Mar 31, 2003 Posts: 10635
|
Posted: Mon Nov 24, 2003 11:48 pm Post subject: |
|
|
Pretty bogus, but this works: add a space after every & sign.
& #115;& #097;& #108;& #101;& #115;& #064;& #115;& #116;& #097;& #110;& #109;& #105;& #108;& #108;& #101;& #114;& #046;& #105;& #110;& #102;& #111;
same thing with no spaces:
sales@stanmiller.info
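For anyone who wants to check what such a string of numbers says, here is a small Python sketch that removes the added spaces and decodes the numeric character references with the standard library's `html.unescape`. The function name `decode_spaced_entities` is made up for this example.

```python
import html

# The spaced-out string from the post above.
SPACED = (
    "& #115;& #097;& #108;& #101;& #115;& #064;& #115;& #116;& #097;& #110;"
    "& #109;& #105;& #108;& #108;& #101;& #114;& #046;& #105;& #110;& #102;& #111;"
)


def decode_spaced_entities(text):
    """Close the gap after each '&' and decode references such as &#115; to 's'."""
    return html.unescape(text.replace("& #", "&#"))


print(decode_spaced_entities(SPACED))  # sales@stanmiller.info
```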
|
|
|
 |
Ikeb
Special Response Team Forums Admin
 Joined: Apr 20, 2003 Posts: 16543
|
|
|
 |
denn988
Guest IP: 66.44.*.*
|
Posted: Tue Nov 25, 2003 1:33 am Post subject: |
|
|
| Cowboy wrote: | Damn numbers come out as text whatever I do.
How do I post the code so you can see it? |
| stan_qaz wrote: | Posted: Mon Nov 24, 2003 6:48 pm Post subject:
--------------------------------------------------------------------------------
Pretty bogus, but this works: add a space after every & sign.
& #115;& #097;& #108;& #101;& #115;& #064;& #115;& #116;& #097;& #110;& #109;& #105;& #108;& #108;& #101;& #114;& #046;& #105;& #110;& #102;& #111;
same thing with no spaces:
sales@stanmiller.info
|
This would be something that you could filter with a Regexp as follows:
The body contains...
Regular Expression...
The above will look for 5 consecutive characters in the format specified.
You could also form the regexp to look for a certain number of occurrences in total:
The above will look for 15 occurrences in the entire body.
By the way, I have not tested these regexps in any way.
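A minimal Python sketch of the two checks described above: a run of 5 consecutive three-digit codes, and 15 codes anywhere in the body. These patterns are a guess at the general shape, not the expressions from this post, and the names (`RUN_OF_FIVE`, `entity_count`, `looks_obfuscated`) are hypothetical. MailWasher's regex dialect may need different syntax.

```python
import re

# One numeric character reference with three digits, e.g. &#115;
ENTITY = r"&#\d{3};"
# Five of them back to back, with nothing in between.
RUN_OF_FIVE = re.compile(r"(?:&#\d{3};){5}")


def has_entity_run(body):
    """True if the body has 5 consecutive three-digit references."""
    return RUN_OF_FIVE.search(body) is not None


def entity_count(body):
    """Total number of three-digit references anywhere in the body."""
    return len(re.findall(ENTITY, body))


def looks_obfuscated(body, min_total=15):
    """Either key is enough: one run of five, or min_total references overall."""
    return has_entity_run(body) or entity_count(body) >= min_total
```

The run check catches whole words written as codes, while the total count catches mails that sprinkle single codes between ordinary letters.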
|
|
|
 |
Ikeb
Special Response Team Forums Admin
 Joined: Apr 20, 2003 Posts: 16543
|
|
|
 |
Cowboy
Guest IP: 213.112.*.*
|
Posted: Tue Nov 25, 2003 5:29 pm Post subject: |
|
|
The filter should also look for the code with 2 digits instead of 3. I don't know if 1 digit is possible. Don't think so.
I'm trying these out now. No bad hits on startup.
The bad hit mail you asked for is unfortunately gone. I didn't think to save it and Hotmail deleted it for me.
I looked it through first, and it looked like normal html to me, although I could not find what caused the hit. Should have looked harder.
I think the message had a Yahoo Groups sponsor message at the bottom, but I'm not sure that did it. I haven't had any other hits from Yahoo Groups or that particular sender.
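On the digit question: HTML accepts numeric character references with any number of digits (for example, &#9; is a tab), but plain ASCII letters run from 65 to 122, so two or three digits covers them. A sketch in Python of a run pattern with an adjustable digit range follows. The helper name `entity_run_pattern` and its defaults are assumptions for illustration only.

```python
import re


def entity_run_pattern(run_length=5, min_digits=2, max_digits=3):
    """Build a pattern for run_length consecutive references like &#72; or &#101;."""
    return re.compile(
        r"(?:&#\d{%d,%d};){%d}" % (min_digits, max_digits, run_length)
    )


# "Hello" written as codes: the first reference has only two digits.
hello = "&#72;&#101;&#108;&#108;&#111;"
print(bool(entity_run_pattern().search(hello)))              # True
print(bool(entity_run_pattern(min_digits=3).search(hello)))  # False
```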
|
|
|
 |
Ikeb
Special Response Team Forums Admin
 Joined: Apr 20, 2003 Posts: 16543
|
Posted: Tue Nov 25, 2003 5:42 pm Post subject: |
|
|
| Cowboy wrote: | I looked it through first, and it looked like normal html to me, although I could not find what caused the hit. Should have looked harder. |
Next time you check out what's causing the hit, use TRegExpr. Just save the message into a file, copy the filter's regex into TRegExpr and run it against the file you saved.
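If TRegExpr is not at hand, the same check can be roughed out in Python: run the filter's regex over the saved message and print each match with a bit of surrounding text. This is only a sketch. The function names and the file name in the usage note are hypothetical, and Python's regex dialect is not guaranteed to behave exactly like the one MailWasher uses.

```python
import re
from pathlib import Path


def show_hits(pattern, text, context=20):
    """Return (matched text, surrounding snippet) for every match in text."""
    hits = []
    for m in re.finditer(pattern, text, re.IGNORECASE):
        start = max(m.start() - context, 0)
        end = min(m.end() + context, len(text))
        hits.append((m.group(0), text[start:end]))
    return hits


def hits_in_file(pattern, path, context=20):
    """Same as show_hits, but reads the saved message from a file."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return show_hits(pattern, text, context)
```

Something like `hits_in_file(r"[a-z]<[^<]*?>[a-z]", "saved_message.txt")` would then list exactly which pieces of the message set the filter off.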
|
|
|
 |
TimeGhost
Major

 Joined: Apr 11, 2003 Posts: 750 Location: USA
|
Posted: Tue Nov 25, 2003 5:56 pm Post subject: |
|
|
I've been reading this thread with delight. The only thing I want to add is that Gary's "HTML Spam Tricks" and "Questionable Links" filters deal with some of the issues that you're discussing.
The whole set of filters can be downloaded at:
www.w5hq.com/MailWasher/MailWasherFilters.txt
|
|
|
 |
Ikeb
Special Response Team Forums Admin
 Joined: Apr 20, 2003 Posts: 16543
|
|
|
 |
Cowboy
Guest IP: 213.112.*.*
|
Posted: Tue Nov 25, 2003 9:44 pm Post subject: |
|
|
I just received a mail (spam, actually) that triggered the [a-z]<[^<]*?>[a-z] filter through a line break < br> between two words that had no space between them.
Is there a way to prevent this? Seems like a possible source of bad hits.
|
|
|
 |
denn988
Guest IP: 66.44.*.*
|
Posted: Tue Nov 25, 2003 11:02 pm Post subject: |
|
|
| Cowboy wrote: | I just received a mail (spam, actually) that triggered the [a-z]<[^<]*?>[a-z] filter through a line break < br> between two words that had no space between them.
Is there a way to prevent this? Seems like a possible source of bad hits. |
You will need to put the ! back in there, at the least.
| Quote: | [a-z]<![^<]*?>[a-z] |
This whole thing could be solved if MWP would simply allow the choice of filtering on the RAW text...or the translated text (after it removes the HTML code).
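To see the difference between the two expressions, here is a quick Python comparison on a line break jammed between two words and on a comment splitting a word. The sample strings and the function name `which_match` are made up for this sketch, and Python's regex engine may not match MailWasher's in every detail.

```python
import re

# Any tag between two letters (the version that hit on the line break).
BROAD = re.compile(r"[a-z]<[^<]*?>[a-z]")
# Only tags starting with "!" (comment-style) between two letters.
COMMENT_ONLY = re.compile(r"[a-z]<![^<]*?>[a-z]")


def which_match(text):
    """Report which of the two expressions match the given raw text."""
    return {
        "broad": bool(BROAD.search(text)),
        "comment_only": bool(COMMENT_ONLY.search(text)),
    }


print(which_match("cable< br>offer"))   # {'broad': True, 'comment_only': False}
print(which_match("ca<!-- zq -->ble"))  # {'broad': True, 'comment_only': True}
```

With the ! in place, the line break no longer triggers the filter, while a comment stuffed into the middle of a word still does.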
|
|
|
 |
|
|