MMH-Badger MAC
   HOME

TheInfoList



OR:

Badger is a message authentication code (MAC) based on the idea of universal hashing and was developed by Boesgaard, Scavenius, Pedersen, Christensen, and Zenner. It is constructed by strengthening the ∆-universal hash family MMH using an ϵ-almost strongly universal (ASU) hash function family after the application of ENH (see below), where the value of ϵ is 1/(2^-5). Since Badger is a MAC function based on the universal hash function approach, the conditions needed for the security of Badger are the same as those for other universal hash functions such as UMAC.


Introduction

The Badger MAC processes a message of length up to 2^-1 bits and returns an authentication tag of length u\cdot32 bits, where 1\le u \le 5 . According to the security needs, user can choose the value of u, that is the number of parallel hash trees in Badger. One can choose larger values of ''u'', but those values do not influence further the security of MAC. The
algorithm In mathematics and computer science, an algorithm () is a finite sequence of Rigour#Mathematics, mathematically rigorous instructions, typically used to solve a class of specific Computational problem, problems or to perform a computation. Algo ...
uses a 128-bit key and the limited message length to be processed under this key is 2^. The key setup has to be run only once per key in order to run the Badger
algorithm In mathematics and computer science, an algorithm () is a finite sequence of Rigour#Mathematics, mathematically rigorous instructions, typically used to solve a class of specific Computational problem, problems or to perform a computation. Algo ...
under a given key, since the resulting internal state of the MAC can be saved to be used with any other message that will be processed later.


ENH

Hash families can be combined in order to obtain new hash families. For the ϵ-AU, ϵ-A∆U, and ϵ-ASU families, the latter are contained in the former. For instance, an A∆U family is also an AU family, an ASU is also an A∆U family, and so forth. On the other hand, a stronger family can be reduced to a weaker one, as long as a performance gain can be reached. A method to reduce ∆-universal hash function to universal hash functions will be described in the following. Theorem 2 Let H^\triangle be an ϵ-AΔU hash family from a set ''A'' to a set ''B''. Consider a message (m, m_b) \in A \times B . Then the family ''H'' consisting of the functions h(m,m_b) = H^\triangle (m) + m_b is ϵ-AU. If m \ne m', then the
probability Probability is a branch of mathematics and statistics concerning events and numerical descriptions of how likely they are to occur. The probability of an event is a number between 0 and 1; the larger the probability, the more likely an e ...
that h(m,m_b) =h(m',m'_b) is at most ϵ, since H^\triangle is an ϵ-A∆U family. If m = m' but m_b\ne m_b', then the
probability Probability is a branch of mathematics and statistics concerning events and numerical descriptions of how likely they are to occur. The probability of an event is a number between 0 and 1; the larger the probability, the more likely an e ...
is trivially 0. The proof for Theorem 2 was described in The ENH-family is constructed based on the universal hash family NH (which is also used in UMAC): : NH_K (M)= \sum_^ \frac (k_ +_w m_)\times (k_ +_w m_ ) \mod 2^ Where '+_w' means 'addition modulo 2^w', and m_i,k_i \in \big\. It is a 2^-A∆U hash family. Lemma 1 The following version of NH is 2^-A∆U: : NH_K (M)=(k_1 +_w m_1 )\times(k_2 +_w m_2 ) \mod 2^ Choosing w=32 and applying Theorem 1, one can obtain the 2^-AU function family ENH, which will be the basic building block of the badger MAC: : ENH_ (m_1,m_2,m_3,m_4 )=(m_1 +_ k_1)(m_2 +_ k_2) +_ m_3 +_ 2^ m_4 where all arguments are 32-bits long and the output has 64-bits.


Construction

Badger is constructed using the strongly universality hash family and can be described as : \mathcal=H^* \times F, where an \epsilon_-AU universal function family ''H*'' is used to hash messages of any size onto a fixed size and an \epsilon_-ASU function family ''F'' is used to guarantee the strong universality of the overall construction. ''NH'' and ''ENH'' are used to construct ''H*''. The maximum input size of the function family ''H*'' is 2^-1 and the output size is 128 bits, split into 64 bits each for the message and the hash. The collision probability for the ''H*''-function ranges from 2^ to 2^. To construct the strongly universal function family ''F'', the ∆-universal hash family MMH* is transformed into a strongly universal hash family by adding another key.


Two steps on Badger

There are two steps that have to be executed for every message: processing phase and finalize phase.


Processing phase

In this phase, the data is hashed to a 64-bit string. A core function : \left\^\times \left\^ \to \left\^ is used in this processing phase, that hashes a 128-bit string m_2 \parallel m_1 to a 64-bit string h( k, m_2, m_1 ) as follows: : h(k, m_2, m_1 )= (L(m_1 ) +_ L(k) )\cdot(U(m_1) +_ U(k) ) +_ m_2 \, for any ''n'', +_n means addition modulo 2^n. Given a -bit string ''x'', means least significant ''n'' bits, and means most significant ''n'' bits. A message can be processed by using this function. Denote by k_j^i. Pseudo-code of the processing phase is as follow. ''L'' = , ''M'', if ''L'' = 0 M^1 = \cdots = M^u = 0 Go to finalization ''r'' = ''L'' mod 64 if ''r'' ≠ 0: M = 0^ \parallel M for ''i'' = 1 to ''u'': M^i = M v' = \max\ for ''j'' = 1 to ''v''′: divide into 64-bit blocks, if ''t'' is even: else


Finalize phase

In this phase, the 64-string resulting from the processing phase is transformed into the desired MAC tag. This finalization phase uses the
Rabbit Rabbits are small mammals in the family Leporidae (which also includes the hares), which is in the order Lagomorpha (which also includes pikas). They are familiar throughout the world as a small herbivore, a prey animal, a domesticated ...
stream cipher and uses both key setup and IV setup by taking the finalization key as k_j^i. Pseudo-code of the finalization phase RabbitKeySetup(K) RabbitIVSetup(N) for ''i'' = 1 to ''u'': Q^i =0^7 \parallel L \parallel M^i divide into 27-bit blocks, Q^i=q_5^i \parallel \cdots \parallel q_1^i S^i =(\sum_^5 (q_j^i K_j^i))+K_6^i \bmod p S = S^u \parallel \cdots \parallel S^1 ''S'' = ''S'' ⨁ RabbitNextbit(''u''∙32) return S


Notation

From the pseudocode above, ''k'' denotes the key in the Rabbit Key Setup(K) which initializes Rabbit with the 128-bit key ''k''. ''M'' denotes the message to be hashed and , ''M'', denotes the length of the message in bits. denotes a message ''M'' that is divided into ''i'' blocks. For the given -bit string ''x'' then and respectively denoted its least significant n bits and most significant ''n'' bits.


Performance

Boesgard, Christensen and Zenner report the performance of Badger measured on a 1.0 GHz Pentium III and on a 1.7 GHz
Pentium 4 Pentium 4 is a series of single-core central processing unit, CPUs for Desktop computer, desktops, laptops and entry-level Server (computing), servers manufactured by Intel. The processors were shipped from November 20, 2000 until August 8, 20 ...
processor. The speed-optimized versions were programmed in assembly language inlined in C and compiled using the Intel C++ 7.1 compiler. The following table presents Badger's properties for various restricted message lengths. "Memory req." denotes the amount of memory required to store the internal state including key material and the inner state of the
Rabbit Rabbits are small mammals in the family Leporidae (which also includes the hares), which is in the order Lagomorpha (which also includes pikas). They are familiar throughout the world as a small herbivore, a prey animal, a domesticated ...
stream cipher . "Setup" denotes the key setup, and "Fin." denotes finalization with IV-setup.


MMH (Multilinear Modular Hashing)

The name MMH stands for Multilinear-Modular-Hashing. Applications in
Multimedia Multimedia is a form of communication that uses a combination of different content forms, such as Text (literary theory), writing, Sound, audio, images, animations, or video, into a single presentation. T ...
are for example to verify the
integrity Integrity is the quality of being honest and having a consistent and uncompromising adherence to strong moral and ethical principles and values. In ethics, integrity is regarded as the honesty and Honesty, truthfulness or of one's actions. Integr ...
of an on-line multimedia title. The performance of MMH is based on the improved support of integer scalar products in modern microprocessors. MMH uses single precision scalar products as its most basic operation. It consists of a (modified)
inner product In mathematics, an inner product space (or, rarely, a Hausdorff pre-Hilbert space) is a real vector space or a complex vector space with an operation called an inner product. The inner product of two vectors in the space is a scalar, ofte ...
between the message and a key
modulo In computing and mathematics, the modulo operation returns the remainder or signed remainder of a division, after one number is divided by another, the latter being called the '' modulus'' of the operation. Given two positive numbers and , mo ...
a prime p. The construction of MMH works in the
finite field In mathematics, a finite field or Galois field (so-named in honor of Évariste Galois) is a field (mathematics), field that contains a finite number of Element (mathematics), elements. As with any field, a finite field is a Set (mathematics), s ...
F_p for some prime integer p.


MMH*

MMH* involves a construction of a family of hash functions consisting of multilinear functions on F_p^k for some positive integer . The family MMH* of functions from F_p^k to F_p is defined as follows. : \mathrm^* = \ where ''x, m'' are vectors, and the functions g_x are defined as follows. : g_x(m) = m x\bmod p = \sum_^n m_i\,x_i\bmod p In the case of MAC, is a message and is a key where m = (m_1,\ldots,m_k) and x = (x_1,\ldots,x_k), x_i, m_i \in \!F_p. MMH* should satisfy the security requirements of a MAC, enabling say Ana and Bob to communicate in an authenticated way. They have a secret key . Say Charles listens to the conversation between Ana and Bob and wants to change the message into his own message to Bob which should pass as a message from Ana. So, his message and Ana's message will differ in at least one bit (e.g. m_1 \ne m'_1 ). Assume that Charles knows that the function is of the form g_x (m) and he knows Ana's message but he does not know the key ''x'' then the probability that Charles can change the message or send his own message can be explained by the following theorem. Theorem 1:The family MMH* is ∆-universal. Proof: Take a\in F_p, and let m , m' be two different messages. Assume without loss of generality that m_1 \ne m'_1 . Then for any choice of x_2,x_3,\ldots,x_s , there is : \begin _ _x (m)-g_x (m')\equiv a \mod p&= _ m_1 x_1+m_2 x_2+ \cdots +m_k x_k )-(m'_1 x_1+m'_2 x_2+\cdots+m'_k x_k )\equiv a \mod p\ &= _ m_1-m'_1)x_1+(m_2-m'_2)x_2+ \cdots +(m_k-m'_k)x_kequiv a \mod p]\\ &= _ m_1-m'_1)x_1+\textstyle \sum_^s(m_k-m'_k)x_k\equiv a \mod p\ &= _ m_1-m'_1)x_1\equiv a - \textstyle \sum_^s(m_k-m'_k)x_k \mod p\ &=\frac \end To explain the theorem above, take F_p for p prime represent the field as F_p = \underbrace_p . If one takes an element in F_p , let say 0\in F_p then the probability that x_1=0 is : _(x_1=0)= \frac So, what one actually needs to compute is : _ (g_x(m)\equiv g_x(m')\mod p) But, : \begin _(g_x(m)\equiv g_x(m')\mod p) &= \sum_ _(,\ldots,)\cdot _(g_x(m)\equiv g_x(m')\mod p)\\ &= \sum_ \frac \cdot \frac \\ &=p^\cdot \frac \cdot \frac \\ &=\frac \end From the proof above, \frac is the collision
probability Probability is a branch of mathematics and statistics concerning events and numerical descriptions of how likely they are to occur. The probability of an event is a number between 0 and 1; the larger the probability, the more likely an e ...
of the attacker in 1 round, so on average verification queries will suffice to get one message accepted. To reduce the collision probability, it is necessary to choose large ''p'' or to concatenate such MACs using independent keys so that the collision probability] becomes \frac. In this case the number of keys are increased by a factor of and the output is also increased by .


MMH*32

Halevi and Krawczyk construct a variant called \mathrm^*_. The construction works with 32-bit
integers An integer is the number zero (0), a positive natural number (1, 2, 3, ...), or the negation of a positive natural number (−1, −2, −3, ...). The negations or additive inverses of the positive natural numbers are referred to as negative in ...
and with the prime integer p=2^+15. Actually the prime ''p'' can be chosen to be any prime which satisfies 2^. This idea is adopted from the suggestion by Carter and Wegman to use the primes 2^+1 or 2^-1. : \mathrm^*_ is defined as follows: : \mathrm^*_=\left\ \to F_p, where \left\^ means \left\ (i.e., binary representation) The functions g_x are defined as follows. : \begin g_x (m)&\ \overset\ m \cdot x \bmod (2^+15)\\ &=\textstyle \sum_^k m_i \cdot x_i \bmod (2^+15) \end where : x= (x_1,\ldots,x_k ),\ m=(m,\ldots,m_k ) By theorem 1, the collision
probability Probability is a branch of mathematics and statistics concerning events and numerical descriptions of how likely they are to occur. The probability of an event is a number between 0 and 1; the larger the probability, the more likely an e ...
is about \epsilon = 2^, and the family of \mathrm^*_ can be defined as ''ϵ''-almost ∆ Universal with \epsilon = 2^.


The value of ''k''

The value of ''k'' that describes the length of the message and key vectors has several effects: * Since the costly modular reduction over ''k'' is multiply and add operations increasing ''k'' should decrease the speed. * Since the key ''x'' consist of ''k'' 32-bit integers increasing ''k'' will results in a longer key. * The probability of breaking the system is 1/p and p\approx 2^k so increasing ''k'' makes the system harder to break.


Performance

Below are the timing results for various implementations of MMH in 1997, designed by Halevi and Krawczyk. * A 150 MHz
PowerPC PowerPC (with the backronym Performance Optimization With Enhanced RISC – Performance Computing, sometimes abbreviated as PPC) is a reduced instruction set computer (RISC) instruction set architecture (ISA) created by the 1991 Apple Inc., App ...
604 RISC machine running AIX * A 150 MHz Pentium-Pro machine running
Windows NT Windows NT is a Proprietary software, proprietary Graphical user interface, graphical operating system produced by Microsoft as part of its Windows product line, the first version of which, Windows NT 3.1, was released on July 27, 1993. Original ...
* A 200 MHz Pentium-Pro machine running
Linux Linux ( ) is a family of open source Unix-like operating systems based on the Linux kernel, an kernel (operating system), operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically package manager, pac ...


See also

* UMAC * VMAC * Poly1305-AES


References

{{Reflist Articles with example pseudocode Message authentication codes