In
computer science
Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...
, a family of
hash functions
A hash function is any function that can be used to map data of arbitrary size to fixed-size values, though there are some hash functions that support variable-length output. The values returned by a hash function are called ''hash values'', ...
is said to be ''k''-independent, ''k''-wise independent or ''k''-universal
if selecting a function at random from the family guarantees that the hash codes of any designated ''k'' keys are
independent random variables
Independence is a fundamental notion in probability theory, as in statistics and the theory of stochastic processes. Two events are independent, statistically independent, or stochastically independent if, informally speaking, the occurrence of ...
(see precise mathematical definitions below). Such families allow good average case performance in randomized algorithms or data structures, even if the input data is chosen by an adversary. The trade-offs between the degree of independence and the efficiency of evaluating the hash function are well studied, and many ''k''-independent families have been proposed.
Background
The goal of hashing is usually to map keys from some large domain (universe)
into a smaller range, such as
bins (labelled
). In the analysis of randomized algorithms and data structures, it is often desirable for the hash codes of various keys to "behave randomly". For instance, if the hash code of each key were an independent random choice in