SpamBayes is a Bayesian spam filter written in Python which uses techniques laid out by Paul Graham in his essay "A Plan for Spam". It has subsequently been improved by Gary Robinson and Tim Peters, among others. The most notable difference between a conventional Bayesian filter and the filter used by SpamBayes is that there are three classifications rather than two: spam, non-spam (called "ham" in SpamBayes), and unsure. The user trains the filter by marking messages as either ham or spam; when filtering a message, the filter generates one score for ham and another for spam. If the spam score is high and the ham score is low, the message is classified as spam. If the spam score is low and the ham score is high, the message is classified as ham. If the scores are both high or both low, the message is classified as unsure. This approach leads to a low number of false positives and false negatives, but it may leave a number of unsure messages that need a human decision.
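The three-way decision described above can be sketched in a few lines of Python. This is a minimal illustration, not SpamBayes's actual code: the function name `classify`, the assumption that both scores lie in the range 0 to 1, and the cutoffs `high` and `low` are all hypothetical choices made for the example. The sketch also treats any in-between combination of scores as unsure, which goes slightly beyond the "both high or both low" case stated in the text.

```python
def classify(spam_score: float, ham_score: float,
             high: float = 0.9, low: float = 0.1) -> str:
    """Return "spam", "ham", or "unsure" from two scores in [0, 1].

    `high` and `low` are hypothetical cutoffs chosen for illustration;
    they are not taken from SpamBayes.
    """
    # Strong spam evidence and weak ham evidence: spam.
    if spam_score >= high and ham_score <= low:
        return "spam"
    # Strong ham evidence and weak spam evidence: ham.
    if ham_score >= high and spam_score <= low:
        return "ham"
    # Both high, both low, or anything in between: leave it to the user.
    return "unsure"


if __name__ == "__main__":
    for scores in [(0.95, 0.02), (0.03, 0.97), (0.95, 0.95), (0.05, 0.05)]:
        print(scores, classify(*scores))
```

Widening the gap between the two cutoffs in this sketch trades fewer misclassifications for more messages landing in the unsure category, which mirrors the trade-off the paragraph describes.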


Web filtering

Some work has gone into applying SpamBayes to filter internet content via a proxy web server.


External links

* Official website: http://spambayes.sourceforge.net/
* Paul Graham's original idea
* Paper on SpamBayes for the Conference on E-mail and Anti-Spam
* Winning the War on spam: Comparison of Bayesian spam filters