  Evaluating Spam Costs and Filtering Techniques
 
 

This brings us to the newer generation of antispam techniques known as "heuristics." Because a spammer has so many tricks to use against any single filtering technique, a more complex system is needed. The leading open source package, from the Apache Software Foundation, is SpamAssassin. Its heuristics combine all of the techniques above, everything from blacklists to Bayesian filtering. Each test that matches adds points to the message's score, and the final tally estimates how likely the message is to be spam rather than legitimate mail (which some call "ham"). A system administrator can then use that tally to decide whether to discard a message or mark it as probable spam.

Here is an actual scoring table produced by SpamAssassin for a recent spam message in my own mailbox.
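To make the point tally concrete, here is a minimal sketch in Python of additive rule scoring. It is not SpamAssassin's code: the `Rule` class, the three rules, their point values, and the `spam.example` blacklist entry are all hypothetical, and it assumes a message counts as spam once its total reaches the required threshold.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical blacklist entry, used only for this illustration.
BLACKLIST = {"spam.example"}


@dataclass
class Rule:
    name: str
    points: float
    matches: Callable[[str], bool]


# Hypothetical rules and point values; a real filter ships hundreds of tests.
RULES = [
    Rule("HTML_MESSAGE", 0.1, lambda msg: "<html" in msg.lower()),
    Rule("LINK_PUSH_HERE", 1.0,
         lambda msg: re.search(r"(push|click) here", msg, re.I) is not None),
    Rule("URI_BLACKLISTED", 3.0,
         lambda msg: any(domain in msg.lower() for domain in BLACKLIST)),
]


def score_message(message: str, required: float = 4.0
                  ) -> Tuple[float, bool, List[Tuple[str, float]]]:
    """Return (total points, is_spam, list of (rule name, points) hits)."""
    hits = [(rule.name, rule.points) for rule in RULES if rule.matches(message)]
    total = sum(points for _, points in hits)
    # Assumption: the message is flagged once the tally reaches the threshold.
    return total, total >= required, hits


if __name__ == "__main__":
    sample = '<html><a href="http://spam.example/x">Push here</a></html>'
    total, is_spam, hits = score_message(sample)
    for name, points in hits:
        print(f"{points:4.1f} {name}")
    print(f"total={total:.1f} spam={is_spam}")
```

No single rule here is enough to flag the sample message; it is the combination of weak and strong signals that pushes the total past the threshold.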

Content analysis details:   (18.9 points, 4.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 1.3 OFFSHORE_SCAM          BODY: Off Shore Scams
 0.6 FB_URI_4U              BODY: m*http://.{3,20}4(?:u|me).{0,6}\.*i
 2.3 BAYES_70               BODY: Bayesian spam probability is 70 to 80%
                            [score: 0.7341]
 0.1 HTML_MESSAGE           BODY: HTML included in message
 3.0 RAZOR2_CF_RANGE_51_100 BODY: Razor2 gives confidence between 51 and 100
                            [cf: 100]
 0.3 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts
 1.0 HTML_LINK_PUSH_HERE    BODY: HTML link text says "push here" or similar
 3.0 SPAMCOP_URI_RBL        URI's domain appears in spamcop database at sc.surbl.org
                            [cj.greatideas4u.com is blacklisted in URI RBL at]
                            [multi.surbl.org]
 2.1 WS_URI_RBL             URI's domain appears in ws database at ws.surbl.org
                            [cj.greatideas4u.com is blacklisted in URI RBL at]
                            [multi.surbl.org]
 1.0 RAZOR2_CHECK           Listed in Razor2 (http://razor.sf.net/)
 2.0 DCC_CHECK              Listed in DCC (http://rhyolite.com/anti-spam/dcc/)
 1.2 HTML_MIME_NO_HTML_TAG  HTML-only message, but there is no HTML tag
 1.1 MIME_HTML_ONLY_MULTI   Multipart message only has text/html MIME parts
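A report like this is easy to process mechanically. The following Python sketch pulls the reported total, the required threshold, and the per-rule points out of the report text. The function name `parse_report` and both regular expressions are my own assumptions about the layout shown above, not part of SpamAssassin. Note that the per-rule values as displayed add up to 19.0 rather than the reported 18.9, presumably because each line is rounded for display.

```python
import re

# Report text taken from the table above (most continuation lines omitted).
REPORT = r"""Content analysis details:   (18.9 points, 4.0 required)
 1.3 OFFSHORE_SCAM          BODY: Off Shore Scams
 0.6 FB_URI_4U              BODY: m*http://.{3,20}4(?:u|me).{0,6}\.*i
 2.3 BAYES_70               BODY: Bayesian spam probability is 70 to 80%
                            [score: 0.7341]
 0.1 HTML_MESSAGE           BODY: HTML included in message
 3.0 RAZOR2_CF_RANGE_51_100 BODY: Razor2 gives confidence between 51 and 100
                            [cf: 100]
 0.3 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts
 1.0 HTML_LINK_PUSH_HERE    BODY: HTML link text says "push here" or similar
 3.0 SPAMCOP_URI_RBL        URI's domain appears in spamcop database at sc.surbl.org
 2.1 WS_URI_RBL             URI's domain appears in ws database at ws.surbl.org
 1.0 RAZOR2_CHECK           Listed in Razor2 (http://razor.sf.net/)
 2.0 DCC_CHECK              Listed in DCC (http://rhyolite.com/anti-spam/dcc/)
 1.2 HTML_MIME_NO_HTML_TAG  HTML-only message, but there is no HTML tag
 1.1 MIME_HTML_ONLY_MULTI   Multipart message only has text/html MIME parts
"""

# Assumed layout: "(<total> points, <required> required)" in the header,
# then one "<points> <RULE_NAME> <description>" line per rule hit.
HEADER_RE = re.compile(r"\(([\d.]+) points, ([\d.]+) required\)")
RULE_RE = re.compile(r"^\s*(-?\d+\.\d+)\s+([A-Z][A-Z0-9_]*)\s+(.*)$")


def parse_report(text):
    """Return (reported total, required threshold, [(rule name, points), ...])."""
    header = HEADER_RE.search(text)
    if header is None:
        raise ValueError("no '(N points, M required)' header found")
    total, required = float(header.group(1)), float(header.group(2))
    hits = []
    for line in text.splitlines():
        match = RULE_RE.match(line)
        if match:  # bracketed continuation lines do not match and are skipped
            hits.append((match.group(2), float(match.group(1))))
    return total, required, hits


if __name__ == "__main__":
    total, required, hits = parse_report(REPORT)
    print(f"reported {total} points against a threshold of {required}")
    for name, points in sorted(hits, key=lambda hit: hit[1], reverse=True)[:3]:
        print(f"{points:4.1f} {name}")
```

Sorting the hits by points makes it easy to see which tests carried the decision for a given message.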

The administrator can also adjust how many points each rule is worth, or add custom rule sets, to improve the accuracy and relevance of the scoring.
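The effect of such tuning can be modeled in a few lines of Python. SpamAssassin itself handles this through its configuration files; the sketch below only illustrates the idea. The three default scores are taken from the table above, while the override value, the `LOCAL_4U_DOMAIN` rule, its pattern, and its 2.5 points are hypothetical.

```python
import re

# Default points for three rules, as shown in the report above.
DEFAULT_SCORES = {"HTML_MESSAGE": 0.1, "MIME_HTML_ONLY": 0.3, "DCC_CHECK": 2.0}

# Hypothetical site policy: weight HTML-only mail more heavily.
ADMIN_OVERRIDES = {"MIME_HTML_ONLY": 1.5}

# Hypothetical locally written rule: (points, pattern matched against the body).
LOCAL_RULES = {
    "LOCAL_4U_DOMAIN": (2.5, re.compile(r"https?://\S*4u\.", re.I)),
}


def effective_scores(defaults, overrides):
    """Admin overrides win over the shipped defaults."""
    return {**defaults, **overrides}


def total_score(stock_hits, body, scores, local_rules):
    """Sum points for the stock rules that hit, plus any local rules that match."""
    total = sum(scores[name] for name in stock_hits)
    for points, pattern in local_rules.values():
        if pattern.search(body):
            total += points
    return total


if __name__ == "__main__":
    stock_hits = ["HTML_MESSAGE", "MIME_HTML_ONLY", "DCC_CHECK"]
    body = "visit http://cj.greatideas4u.com/ now"
    before = total_score(stock_hits, body, DEFAULT_SCORES, {})
    after = total_score(stock_hits, body,
                        effective_scores(DEFAULT_SCORES, ADMIN_OVERRIDES),
                        LOCAL_RULES)
    print(f"before tuning: {before:.1f}, after tuning: {after:.1f}")
```

With a threshold of 4.0, the same three stock hits fall short on the default scores but cross the line once the override and the local rule are applied.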

Table of Contents
Page 1: The Cost of Spam
Page 2: Simple Techniques
Page 3: Complex Techniques
Page 4: Integrated Techniques
Page 5: The Future
Page 6: Final Thoughts

 
      Posted by: , August 25, 2004, 6:00 pm  

 
2001 - 2004 Digital Silence
Digital Silence is not responsible for the information or the accuracy of the information above.
All trademarks and copyrights owned by their respective companies.
