A hash filter creates a hash sum from data, typically e-mail, and compares the sum against other previously defined sums. Depending on the purpose of the filter, the data can then be included in or excluded from a function based on whether it matches an existing sum.
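The include/exclude behaviour can be illustrated with a minimal Python sketch. The names `hash_sum` and `hash_filter`, the `mode` parameter, and the choice of SHA-256 are assumptions made for illustration only; the article does not prescribe a particular hash function or interface.

```python
import hashlib


def hash_sum(data: bytes) -> str:
    """Return a hex digest for the data (SHA-256 is an illustrative choice)."""
    return hashlib.sha256(data).hexdigest()


def hash_filter(items, known_sums, mode="exclude"):
    """Filter items by comparing each item's hash sum against known_sums.

    mode="exclude" drops items whose sum is already known;
    mode="include" keeps only items whose sum is already known.
    (Hypothetical interface, not taken from any specific product.)
    """
    if mode not in ("include", "exclude"):
        raise ValueError("mode must be 'include' or 'exclude'")
    keep_matches = mode == "include"
    return [item for item in items if (hash_sum(item) in known_sums) == keep_matches]


# Previously defined sums that the filter compares against.
known = {hash_sum(b"alpha"), hash_sum(b"beta")}
items = [b"alpha", b"gamma", b"beta"]

excluded_view = hash_filter(items, known, mode="exclude")  # [b"gamma"]
included_view = hash_filter(items, known, mode="include")  # [b"alpha", b"beta"]
```

Storing the known sums in a set keeps each lookup close to constant time, so the cost of filtering is dominated by hashing the data itself.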
For example, when a message is received by an e-mail server with a hash filter, the contents of the e-mail are converted into a hash sum. If this sum corresponds to the hash sum of another e-mail which has been categorized as spam, the received e-mail is prevented from being delivered. Spammers attempt to evade this by adding random strings to the text content and random pixel changes ("confetti") to image content (see image spam).
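A rough Python sketch of the e-mail case follows, including why appended random strings defeat an exact-match filter. The class `SpamHashFilter`, its method names, the sample messages, and the use of SHA-256 over the UTF-8 body are all hypothetical; real mail servers may hash different parts of a message or use other functions.

```python
import hashlib


def body_hash(body: str) -> str:
    """Hash sum of a message body (SHA-256 over UTF-8 bytes; an assumption)."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class SpamHashFilter:
    """Hypothetical exact-match hash filter for message bodies."""

    def __init__(self):
        self.spam_sums = set()

    def mark_as_spam(self, body: str) -> None:
        # Remember the sum of a message that was categorized as spam.
        self.spam_sums.add(body_hash(body))

    def should_deliver(self, body: str) -> bool:
        # Block delivery when the sum matches a known spam sum.
        return body_hash(body) not in self.spam_sums


spam_filter = SpamHashFilter()
spam_filter.mark_as_spam("Buy cheap watches now!")

exact_copy = spam_filter.should_deliver("Buy cheap watches now!")        # False
legitimate = spam_filter.should_deliver("Lunch at noon?")                # True
# Appending a random string changes the sum, so the copy slips through.
evasive = spam_filter.should_deliver("Buy cheap watches now! xq7f2k")    # True
```

Because a cryptographic hash changes completely when even one character changes, an exact-match filter of this kind only catches byte-identical copies, which is the weakness the random-string and "confetti" techniques exploit.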
See also
* Bloom filter
* Hash buster
* Locality-sensitive hashing
References
Hash Filter algorithm (SQL Anywhere Server - SQL Usage) at iAnywhere.com