| Author |
Message |
Wizcrafts
Sergeant
 Premium Member
 Joined: Jun 05, 2003 Posts: 95 Location: Michigan
|
Posted: Mon Jan 14, 2008 7:13 pm Post subject: |
|
|
Dennis;
Thanks for the reply and the questions. I currently only flag the from addresses that are known matches, then manually delete them. I have not had any false positives on the ones I am currently filtering, but that could be dumb luck.
One thing I will reveal is that where the spammers are using this technique, the letters preceding and following the matched domain name are always the same, for that address type, of which there are currently three. In one case the domain extension is not a .com and never varies, making it easy to create a blacklist rule for it. I have posted this info on my blog already.
I figured this wouldn't be an easy rule to create, and it is probably not worth the effort and possible CPU load. But I have been known to fly into dangerous territory before.
If I do create such a rule I will post it in this thread. _________________ Submitted by Wiz
Guarding the Castle against spammers and scammers
|
|
|
 |
Ikeb
Special Response Team Forums Admin
 Joined: Apr 20, 2003 Posts: 16505
|
Posted: Tue Jan 15, 2008 4:47 pm Post subject: |
|
|
Indeed a backreference would be the way to go. A backreference allows a match pattern to be used in a subsequent match within the same regex search.
First a generalization of the given example:
| Code: |
<some characters><domain><some more characters>@<domain>.TLD
|
The problem with the task at hand (matching the <domain> fields) is that a backreference only works in the forward direction: a group has to be captured before it can be referenced. Thus, if the same email address were to be found in more than one place within the header, a match on the domain field following the @ in one instance could be used to check for a match in the next instance of the email address.
Hope that makes sense. I could illustrate with an example but it's not worth the effort if the headers don't come through that way. If they do, perhaps drop off an example and I could compose a regex to try out. _________________
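To illustrate the forward-only behaviour described above, here is a minimal sketch in Python. It uses Python's `re` module, not MailWasher's own regex engine, which may behave differently. The header fragment and the `example.com` addresses are made up for illustration.

```python
import re

# Group 1 captures the domain after one "@"; the \1 later in the pattern
# requires the same text after a following "@". The capture comes first.
FORWARD = re.compile(r"@(\w+)\.\w{2,4}.*@\1\.", re.DOTALL)

# Made-up header fragment in which the same address appears twice.
SAMPLE = "To: <user@example.com>\nFrom: <user@example.com>\n"


def same_domain_twice(header):
    """Return True if the domain after one "@" recurs after a later "@"."""
    return FORWARD.search(header) is not None


def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# A \1 placed before group 1 exists is rejected by Python's re module.
BACKWARD_OK = compiles(r"\1@(\w+)")
```

In this sketch `same_domain_twice(SAMPLE)` succeeds because the capture precedes the reference, while the reversed pattern is refused at compile time. Other engines may handle a premature reference differently, for example by simply never matching.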
|
|
|
 |
Wizcrafts
Sergeant
 Premium Member
 Joined: Jun 05, 2003 Posts: 95 Location: Michigan
|
Posted: Tue Jan 15, 2008 5:20 pm Post subject: |
|
|
Ikeb; I sent the headers to you via a PM. _________________ Submitted by Wiz
Guarding the Castle against spammers and scammers
|
|
|
 |
Denn_988
Guest IP: 74.132.*.*
|
Posted: Tue Jan 15, 2008 5:48 pm Post subject: |
|
|
Wizcrafts and Ikeb.
The following link to the Microsoft Developer Network pages on Regex might be useful.
It concerns the 'Lookahead' and 'LOOKBEHIND' grouping constructs in the Regular Expressions implemented within the .NET Framework, which is available to any Windows program (this should include MWP).
The LOOKBEHIND construct might have the potential, within MWP's implementation of Regex, to do what Wizcrafts desires.
It basically uses the format:
| Code: |
(?<= subexpression)
(Zero-width positive lookbehind assertion.) Continues match only if the subexpression matches at this position on the left. For example, (?<=19)99 matches instances of 99 that follow 19. This construct does not backtrack.
|
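As a quick check of the `(?<=19)99` example quoted above, here is a small Python sketch. It assumes Python's `re` module rather than the .NET engine; Python accepts this pattern because the lookbehind has a fixed width, whereas .NET also allows variable-width lookbehinds. The function name is made up for illustration.

```python
import re

# Zero-width positive lookbehind: match "99" only where "19" sits
# immediately to its left. The "19" itself is not part of the match.
LOOKBEHIND = re.compile(r"(?<=19)99")


def find_99_after_19(text):
    """Return the start offsets of every "99" that directly follows "19"."""
    return [m.start() for m in LOOKBEHIND.finditer(text)]
```

For the input `"1999 2099"`, only the `99` in `1999` qualifies, since the `99` in `2099` is preceded by `20`.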
You will want to investigate whether this construct will allow full use of regular expressions within the construct.
Here is the link:
http://msdn2.microsoft.com/en-us/library/bs2twtah.aspx
Ikeb...
I am not even going to try to respond to your last e-mail.
You are too used to breaking up a post and replying to individual snippets separately. In the case of your reply to my 'Text Vault' e-mail you apparently used the quick-reply function of MWP. Unfortunately you did nothing to highlight your individual responses, which makes it extremely difficult to sort through your mail.
In the future, at least try to leave some open space between the quoted reply and your response, and some more open space between your response and the next portion of the quoted reply.
Thanks,
Denn
|
|
|
 |
Wizcrafts
Sergeant
 Premium Member
 Joined: Jun 05, 2003 Posts: 95 Location: Michigan
|
Posted: Tue Jan 15, 2008 9:54 pm Post subject: |
|
|
Dennis;
I am going to work through the information on the MSDN site, which you provided the link to. Thanks for the pointers. Back references and look-aheads like these are new territory for me. I gotta get my feet wet anyway, so now's as good a time as any!
Also, after anchoring (^) all possible first characters that are on new lines, in my filters, processing times and CPU load have improved markedly. Moving Subject based filters above body based filters also improved the processing times. I am going through the filters that parse the body text to see what else can be done to streamline those rules, many of which are not anchor-able.
All of these (ongoing) alterations are reflected in my published MailWasher Pro filters. It has been a long time since I mentioned them, so for those new to this thread, here's the skinny.
I publish three sets of custom filters, for MailWasher Pro.
Set number 1 is a combination of the ancient "Gary's rules," plus my older rules, and all of my current rules, making it a very large file, but all-encompassing. These rules have been cleansed to ensure that they only flag for manual deletion. You still see the spam in the list.
Set number 2 has my current rules, with only flagging for manual deletion of spam. You see the spam in the list.
Set number 3 is my current rules, but with automatic deletion of undisputed spam, and manual deletion of only possible spam. This is the actual filter set I use at any given time. You will see very few spam messages with this filter set, with the visible ones rarely being legitimate mail.
As spam techniques vary over time some old rules become useless and I either "disable" them (in the Filters list), or remove them entirely, from my filters.txt file. I add new rules as needed to better deal with the current crop of spam, scams and embedded threats. I should mention here that I am not a current subscriber to the FirstAlert! system, so I am on my own for classifying what is spam, aside from the DNS blacklists I use (Spamcop, Spamhaus, etc). I do use the learning filter, but my rules already catch everything it does before the filter understands that a message is spam. I could stop using my rules entirely and just teach the learning filter to detect spam, but it would have to be set to only flag spam for deletion, to avoid false positives. With my custom filter rules, when they absolutely match known spam there is no monkeyfutchin around; they are deleted automatically as they come in. The only time I even see what was deleted is when I check the statistics, to see if any legitimate mail was deleted, so it can be restored. This happens about once every week or two.
I recommend all MailWasher Pro owners use the built-in recycle bin. It allows you to restore deleted messages, up to the number of lines you have set MWP to scan, in your general options. I recommend setting the number of lines scanned to at least 250-275, or higher, if the load on the CPU will tolerate it. MWP uses a lot of CPU as it scans email and parses blacklists and filter rules. With 250 or more lines saved a false positive can be mostly restored from the recycle bin, where a 200 line message might be useless, if it is composed in HTML format.
There is always a trade off between scanning times and restorability of accidental deletions. Of course, people who set the rules to only mark for deletion are less likely to be impacted by false positives, since they will see these messages flagged in the incoming mail list. The non-spam messages can be added to your friends list, to avoid further false positives. In your case a lower, faster scanning limit is feasible; like 200 lines. People like me who use automatic deletion have to set higher scanned lines points, just in case.
FYI: I publish my weekly MailWasher Pro spam classifications and percentages every Sunday, on my blog. I obtain the results from the incoming Statistics and the pie chart, which shows the percentage of spam caught by various filters, blacklists and the learning filter (and FirstAlert, if you use it).
Frequent readers of those articles will see that the percentage of "blacklisted" messages is increasing slightly every week. This is directly related to pattern matching and to the questions about forward looking regular expressions that I have posted this week. Dennis and Ikeb are providing very useful input, which I greatly appreciate. If the solution is obtainable without too much additional strain on the CPU, I will update the existing filter rule that currently detects only a couple of repetitive "From" address tricks.
NB: The Blacklist itself is processed before the Filters. Filter rules are processed from the top of the list down, stopping at the first match. Any Filter rules that match items in the Header, like the From address, should go near the top of the Filters list. Subject rules should go next, followed by the Body text rules. Yes, I know I have several rules to rearrange and I'm working on it! _________________ Submitted by Wiz
Guarding the Castle against spammers and scammers
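The processing order described in the NB above (top of the list down, stopping at the first match, with Header rules first, then Subject, then Body) can be sketched in Python as follows. This is only an illustration of first-match-wins ordering, not MailWasher Pro's actual implementation; the `Rule` type, the rule names, the patterns and the message layout are all hypothetical.

```python
import re
from typing import NamedTuple


class Rule(NamedTuple):
    name: str     # hypothetical rule name
    field: str    # "header", "subject" or "body"
    pattern: str  # regular expression applied to that field


# Hypothetical rules, ordered header -> subject -> body so that the
# cheaper header and subject checks run before any body scan.
RULES = [
    Rule("forged-from", "header", r"^From:.*@example\.invalid"),
    Rule("pills-subject", "subject", r"^cheap pills"),
    Rule("pills-body", "body", r"cheap pills"),
]


def first_match(message, rules=RULES):
    """Return the name of the first rule that matches, or None."""
    flags = re.IGNORECASE | re.MULTILINE
    for rule in rules:
        if re.search(rule.pattern, message.get(rule.field, ""), flags):
            return rule.name  # stop here; later rules are never evaluated
    return None
```

Because evaluation stops at the first hit, a message caught by a header rule never reaches the slower body rules, which is the reason for putting header rules near the top.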
|
|
|
 |
Ikeb
Special Response Team Forums Admin
 Joined: Apr 20, 2003 Posts: 16505
|
|
|
 |
Denn_988
Guest IP: 74.132.*.*
|
|
|
 |
stan_qaz
Premium Member
 Joined: Mar 31, 2003 Posts: 10579
|
Posted: Wed Jan 16, 2008 2:58 am Post subject: |
|
|
Since they have dumped the idea of Linux and Mac versions it sure doesn't make sense to not use a freely available Windows only library. _________________ Questions? Try the wiki
http://wiki.castlecops.com/MailWasher_Pro
|
|
|
 |
Wizcrafts
Sergeant
 Premium Member
 Joined: Jun 05, 2003 Posts: 95 Location: Michigan
|
Posted: Wed Jan 16, 2008 3:33 am Post subject: |
|
|
I have a regular expression that I am about to begin testing, to detect this type of spam (a particular type of forged "From" address). I will post my findings as soon as I can confirm that it works as required.
I already have a couple of simplified versions that can be applied to the MailWasher Pro Blacklist, as wildcard rules, which I am using successfully. Each unique "From" trick requires a separate email blacklist filter, which requires constant reading of incoming email senders, from the Statistics page (use the Recycle Bin button on MWP 6.0 and newer).
Stan;
If you are a Mac user and want to run MailWasher Pro, or any other Windows only application, take a look at Parallels for Mac OS X+ systems. It lets you run a Windows OS in a VM window on your Mac desktop, alongside your Mac programs, and even drag items between them, if need be. _________________ Submitted by Wiz
Guarding the Castle against spammers and scammers
|
|
|
 |
stan_qaz
Premium Member
 Joined: Mar 31, 2003 Posts: 10579
|
Posted: Wed Jan 16, 2008 4:08 am Post subject: |
|
|
Not Mac, I use Linux and while Wine would work I don't care to open that can of worms on a system I depend on. I'm sticking with the last Linux version as long as possible then looking at another solution. Maybe XP in a virtual machine. _________________ Questions? Try the wiki
http://wiki.castlecops.com/MailWasher_Pro
|
|
|
 |
Wizcrafts
Sergeant
 Premium Member
 Joined: Jun 05, 2003 Posts: 95 Location: Michigan
|
Posted: Wed Jan 16, 2008 4:32 am Post subject: |
|
|
So far 100% on the new From tricks filter. Stay tuned. _________________ Submitted by Wiz
Guarding the Castle against spammers and scammers
|
|
|
 |
Wizcrafts
Sergeant
 Premium Member
 Joined: Jun 05, 2003 Posts: 95 Location: Michigan
|
Posted: Fri Feb 08, 2008 7:50 pm Post subject: |
|
|
I have tested the custom filter that detects the same domain in the prefix and suffix of an email address for about one month, and had no false positives. This is an example of what this filter catches, when the From address matches these samples:
Here is the code, which you can add to your filters.txt file, if you want to:
| Code: |
[enabled],XdomainY@domain,BlackList,0,AND,Delete,EntireHeader,containsRE,"^Received: from.*@(([\w\d]*)\.\w{2,4}).*^From:.*<\w{2,}\2\w+?@\1"
|
The above code must be on only one line, in your MWP filters.txt. I also gave it a Status name of "BlackList" - for Statistics reports, but you can rename it if you choose to.
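For anyone who wants to try the regular expression outside MailWasher, here is a rough Python sketch. It rests on assumptions: that in an EntireHeader test `^` matches at the start of each header line and `.` can span line breaks (emulated here with `re.MULTILINE | re.DOTALL`), and that Python's `re` module is close enough to MWP's engine for this pattern. The two sample headers and the `example.com` / `example.net` addresses are made up for illustration.

```python
import re

# The regular expression from the filter line above, unchanged.
# Group 1 = domain plus TLD taken from the Received line, group 2 = the
# bare domain; the From address must then look like X<domain>Y@<domain.tld>.
FROM_TRICK = re.compile(
    r"^Received: from.*@(([\w\d]*)\.\w{2,4}).*^From:.*<\w{2,}\2\w+?@\1",
    re.MULTILINE | re.DOTALL,  # assumed equivalent of MWP's header matching
)

# Made-up headers: the first embeds the recipient's domain in the sender's
# local part, the second is an ordinary address at the same domain.
FORGED = (
    "Received: from mail.example.net by mx.local for <user@example.com>\n"
    'From: "Someone" <qwexamplezz@example.com>\n'
    "Subject: hello\n"
)
ORDINARY = (
    "Received: from mail.example.net by mx.local for <user@example.com>\n"
    'From: "Alice" <alice@example.com>\n'
    "Subject: hello\n"
)


def is_from_trick(header):
    """Return True if the header matches the XdomainY@domain pattern."""
    return FROM_TRICK.search(header) is not None
```

In this sketch `is_from_trick(FORGED)` matches and `is_from_trick(ORDINARY)` does not. That says nothing about false positives on real mail, which only testing inside MailWasher can show.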
I have this and all of my other filters published, for you to use, at: http://www.wizcrafts.net/mwp-filters.html. There are three sets of filters available on that page. I recommend filters2.txt for most people who use my filters. Copy and paste the contents into your existing filters.txt, or just save it as filters.txt, in your profile > Application Data > MailwasherPro folder. Note, that you must close MailWasher before editing filters.txt, or all your changes will be erased by the program.
My filters are updated quite frequently, including earlier today. Their processing speed has been improved over the last month by fine-tuning some of the more intense Regular Expressions, and getting more detections into Subjects and headers (but that only goes so far). In fact, I have removed a couple of filters that I could not rein in effectively.
If my filters slow down your email delivery you can disable the ones that match body text and rely upon the Learning Filter to flag possible spam.
Note: There is a new kid on the block, in Eastern Europe, that is trying to take business away from the Storm Botnet spammers. It is currently going by the name "Nugache Botnet" and due to some misconfiguration, it is sending German and Russian language spam to English speaking countries. I have a tentative filter for this, but it seems that my body text filters are already matching all of these messages (they are all for viagra or male enhancement). _________________ Submitted by Wiz
Guarding the Castle against spammers and scammers
|
|
|
 |
Wizcrafts
Sergeant
 Premium Member
 Joined: Jun 05, 2003 Posts: 95 Location: Michigan
|
Posted: Sat Feb 09, 2008 8:05 am Post subject: |
|
|
I may be wrong in naming Nugache as the Botnet that is sending out the wrong language spams. This was based on an article I read earlier, quoting a security company on the matter. I have just read a comment about that article from one of the foremost botnet researchers, who says he was misquoted, or that his statements were misinterpreted. The confusion may have come from an interview held in November 2007, after a security conference, where the Storm, Nugache, and a new variant of the Rizo Trojans were discussed. _________________ Submitted by Wiz
Guarding the Castle against spammers and scammers
|
|