In cryptography, the McEliece cryptosystem is an asymmetric encryption algorithm developed in 1978 by Robert McEliece. It was the first such scheme to use randomization in the encryption process. The algorithm has never gained much acceptance in the cryptographic community, but is a candidate for "post-quantum cryptography", as it is immune to attacks using Shor's algorithm and, more generally, measuring coset states using Fourier sampling.

The algorithm is based on the hardness of decoding a general linear code (which is known to be NP-hard). For a description of the private key, an error-correcting code is selected for which an efficient decoding algorithm is known, and that is able to correct $t$ errors. The original algorithm uses binary Goppa codes (subfield codes of algebraic geometry codes of a genus-0 curve over finite fields of characteristic 2); these codes can be efficiently decoded, thanks to an algorithm due to Patterson. The public key is derived from the private key by disguising the selected code as a general linear code. For this, the code's generator matrix $G$ is perturbed by two randomly selected invertible matrices $S$ and $P$ (see below).

Variants of this cryptosystem exist, using different types of codes. Most of them were proven less secure; they were broken by structural decoding.

McEliece with Goppa codes has resisted cryptanalysis so far. The most effective attacks known use information-set decoding algorithms. A 2008 paper describes both an attack and a fix. Another paper shows that for quantum computing, key sizes must be increased by a factor of four due to improvements in information set decoding.

The McEliece cryptosystem has some advantages over, for example, RSA. The encryption and decryption are faster. For a long time, it was thought that McEliece could not be used to produce signatures. However, a signature scheme can be constructed based on the Niederreiter scheme, the dual variant of the McEliece scheme. One of the main disadvantages of McEliece is that the private and public keys are large matrices. For a standard selection of parameters, the public key is 512 kilobits long.

Scheme definition

McEliece consists of three algorithms: a probabilistic key generation algorithm that produces a public and a private key, a probabilistic encryption algorithm, and a deterministic decryption algorithm. All users in a McEliece deployment share a set of common security parameters: $n, k, t$.

Key generation

The principle is that Alice chooses a linear code $C$ from some family of codes for which she knows an efficient decoding algorithm, and makes $C$ public knowledge while keeping the decoding algorithm secret. Such a decoding algorithm requires not just knowing $C$, in the sense of knowing an arbitrary generator matrix, but also knowing the parameters used when specifying $C$ in the chosen family of codes. For instance, for binary Goppa codes, this information would be the Goppa polynomial and the code locators. Therefore, Alice may publish a suitably obfuscated generator matrix of $C$.

More specifically, the steps are as follows:

# Alice selects a binary $(n, k)$-linear code $C$ capable of (efficiently) correcting $t$ errors from some large family of codes, e.g. binary Goppa codes. This choice should give rise to an efficient decoding algorithm $A$. Let also $G$ be any generator matrix for $C$. Any linear code has many generator matrices, but often there is a natural choice for this family of codes. Knowing this would reveal $A$, so it should be kept secret.
# Alice selects a random $k \times k$ binary non-singular matrix $S$.
# Alice selects a random $n \times n$ permutation matrix $P$.
# Alice computes the $k \times n$ matrix $\hat{G} = SGP$.
# Alice's public key is $(\hat{G}, t)$; her private key is $(S, P, A)$. Note that $A$ could be encoded and stored as the parameters used for selecting $C$.
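The key generation steps can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions, not a usable implementation: the secret code is the tiny $(7, 4)$ Hamming code with $t = 1$ standing in for a binary Goppa code, and the helper names (`mat_mul`, `rank_gf2`, `random_invertible`, `random_permutation_matrix`, `keygen`) are hypothetical.

```python
import secrets

# ASSUMPTION: toy stand-in for the secret code C. This is the (7, 4) Hamming
# code (corrects t = 1 error), not a binary Goppa code.
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def mat_mul(a, b):
    """Multiply two binary matrices over GF(2)."""
    return [[sum(x & y for x, y in zip(row, col)) % 2 for col in zip(*b)]
            for row in a]

def rank_gf2(m):
    """Rank of a binary matrix, by Gaussian elimination over GF(2)."""
    m = [row[:] for row in m]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                m[r] = [x ^ y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank

def random_invertible(k, rng):
    """Step 2: sample k x k binary matrices until one is non-singular."""
    while True:
        s = [[rng.randrange(2) for _ in range(k)] for _ in range(k)]
        if rank_gf2(s) == k:
            return s

def random_permutation_matrix(n, rng):
    """Step 3: a random n x n permutation matrix."""
    perm = list(range(n))
    rng.shuffle(perm)
    return [[int(perm[i] == j) for j in range(n)] for i in range(n)]

def keygen(rng):
    """Steps 2 to 5: return the public matrix G_hat = S*G*P and the secret (S, P)."""
    k, n = len(G), len(G[0])
    S = random_invertible(k, rng)
    P = random_permutation_matrix(n, rng)
    G_hat = mat_mul(mat_mul(S, G), P)
    return G_hat, (S, P)

rng = secrets.SystemRandom()  # OS-backed randomness rather than the default PRNG
G_hat, (S, P) = keygen(rng)
for row in G_hat:
    print(row)
```

The public matrix generates a code that is permutation-equivalent to $C$, so it still has rank $k$ and still corrects $t$ errors; only the structure that makes decoding easy is hidden.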

Message encryption

Suppose Bob wishes to send a message $m$ to Alice whose public key is $(\hat{G}, t)$:

# Bob encodes the message $m$ as a binary string of length $k$.
# Bob computes the vector $c' = m\hat{G}$.
# Bob generates a random $n$-bit vector $z$ containing exactly $t$ ones (a vector of length $n$ and weight $t$).
# Bob sends Alice the ciphertext computed as $c = c' + z$.
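The encryption steps translate almost directly into code. The sketch below is illustrative only: the function name `encrypt` is hypothetical, and the $4 \times 7$ matrix `G_hat` is a made-up toy public key (a column-permuted Hamming code with $t = 1$), far too small to offer any security.

```python
import secrets

def encrypt(m, G_hat, t, rng=None):
    """Return c = m*G_hat + z over GF(2), where z is a random weight-t vector."""
    rng = rng or secrets.SystemRandom()
    k, n = len(G_hat), len(G_hat[0])
    if len(m) != k:
        raise ValueError("message must be a bit list of length k")
    # Step 2: c' = m * G_hat, computed column by column modulo 2.
    c_prime = [sum(m[i] & G_hat[i][j] for i in range(k)) % 2 for j in range(n)]
    # Step 3: error vector with exactly t ones at random positions.
    z = [0] * n
    for pos in rng.sample(range(n), t):
        z[pos] = 1
    # Step 4: ciphertext is the bitwise sum (XOR) of c' and z.
    return [a ^ b for a, b in zip(c_prime, z)]

# ASSUMPTION: hypothetical toy public key, for illustration only.
G_hat = [
    [0, 1, 1, 0, 0, 0, 1],
    [1, 0, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 0, 0, 0],
]
m = [1, 0, 1, 1]
print(encrypt(m, G_hat, t=1))
print(encrypt(m, G_hat, t=1))  # usually differs: encryption is probabilistic
```

Because $z$ is drawn fresh for each call, encrypting the same message twice generally yields different ciphertexts.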

Message decryption

Upon receipt of $c$, Alice performs the following steps to decrypt the message:

# Alice computes the inverse of $P$ (i.e. $P^{-1}$).
# Alice computes $\hat{c} = cP^{-1}$.
# Alice uses the decoding algorithm $A$ to decode $\hat{c}$ to $\hat{m}$.
# Alice computes $m = \hat{m}S^{-1}$.

Proof of message decryption

Note that $\hat{c} = cP^{-1} = m\hat{G}P^{-1} + zP^{-1} = mSG + zP^{-1}$, and that $P$ is a permutation matrix, thus $zP^{-1}$ has weight $t$.

The Goppa code $C$ generated by $G$ can correct up to $t$ errors, and the word $mSG$ is at distance at most $t$ from $cP^{-1}$. Therefore, decoding recovers the correct code word $mSG$, and with it $\hat{m} = mS$.

Multiplying with the inverse of $S$ gives $m = \hat{m}S^{-1} = mSS^{-1}$, which is the plain text message.
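The argument can be checked mechanically on a toy instance. The following sketch assumes the $(7, 4)$ Hamming code (with $t = 1$) in place of a Goppa code, a syndrome decoder in the role of $A$, and fixed, hand-picked values for $S$ and $P$; all names are hypothetical.

```python
# ASSUMPTION: (7, 4) Hamming code in systematic form G = [I | A], with
# parity-check matrix H = [A^T | I]. A real scheme would use a Goppa code.
G = [[1, 0, 0, 0, 1, 1, 0], [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 0, 1, 1], [0, 0, 0, 1, 1, 1, 1]]
H = [[1, 1, 0, 1, 1, 0, 0], [1, 0, 1, 1, 0, 1, 0], [0, 1, 1, 1, 0, 0, 1]]

def vec_mat(v, m):
    """Row vector times matrix over GF(2)."""
    return [sum(x & row[j] for x, row in zip(v, m)) % 2
            for j in range(len(m[0]))]

def mat_mul(a, b):
    return [vec_mat(row, b) for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def inverse_gf2(m):
    """Invert a non-singular binary matrix by Gauss-Jordan elimination."""
    n = len(m)
    aug = [row[:] + [int(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col])
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col and aug[r][col]:
                aug[r] = [x ^ y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def decode(y):
    """Stand-in for A: fix at most one bit error, return the 4 message bits."""
    syndrome = [sum(h & b for h, b in zip(row, y)) % 2 for row in H]
    y = y[:]
    if any(syndrome):
        y[transpose(H).index(syndrome)] ^= 1  # syndrome equals the column of the flipped bit
    return y[:4]  # G is systematic, so the message is the first k bits

def decrypt(c, S, P):
    c_hat = vec_mat(c, transpose(P))       # c * P^-1 (inverse of a permutation matrix is its transpose)
    m_hat = decode(c_hat)                  # equals m * S
    return vec_mat(m_hat, inverse_gf2(S))  # m = m_hat * S^-1

# Hand-picked secret key (hypothetical values).
S = [[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1], [0, 0, 0, 1]]
PERM = [2, 0, 3, 6, 1, 5, 4]
P = [[int(PERM[i] == j) for j in range(7)] for i in range(7)]
G_hat = mat_mul(mat_mul(S, G), P)

m = [1, 0, 1, 1]
for pos in range(7):  # try every possible single-bit error z
    z = [int(j == pos) for j in range(7)]
    c = [a ^ b for a, b in zip(vec_mat(m, G_hat), z)]
    assert decrypt(c, S, P) == m
print("all 7 error positions decrypt correctly")
```

The loop exercises every weight-1 error vector, mirroring the claim that $zP^{-1}$ keeps weight $t$ and therefore stays within the decoder's correction radius.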

Key sizes

Because there is a free choice in the matrix $S$, it is common to express $\hat{G}$ in "systematic form" so that the last $k$ columns correspond to the identity matrix: $\hat{G} = (\tilde{G} \mid I)$. This reduces the key size to $(n-k) \times k$ bits.

McEliece originally suggested security parameter sizes of $n=1024, k=524, t=50$, resulting in a public key size of $524 \times (1024-524) = 262{,}000$ bits. Recent analysis suggests parameter sizes of $n=2048, k=1751, t=27$ for 80 bits of security when using standard algebraic decoding, or $n=1632, k=1269, t=34$ when using list decoding for the Goppa code, giving rise to public key sizes of $520{,}047$ and $460{,}647$ bits respectively. For resiliency against quantum computers, sizes of $n=6960, k=5413, t=119$ with Goppa code were proposed, giving a public key size of $8{,}373{,}911$ bits.

In its round 3 submission to the NIST post-quantum standardization, the highest level of security, level 5, is given for parameter sets 6688128, 6960119, and 8192128. The parameters are, respectively:

* $n=6688, k=5024, t=128$
* $n=6960, k=5413, t=119$
* $n=8192, k=6528, t=128$
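As a quick check of the arithmetic, the $(n-k) \times k$ formula for a systematic-form public key can be evaluated for the parameter sets listed in this section. The function name `public_key_bits` is hypothetical, and the byte figures are plain bit counts divided by 8; real implementations may pad rows and so differ slightly.

```python
def public_key_bits(n, k):
    """Bits needed for the non-identity block of a systematic k x n generator matrix."""
    return (n - k) * k

parameter_sets = [
    ("original (1978)", 1024, 524),
    ("80-bit, algebraic decoding", 2048, 1751),
    ("80-bit, list decoding", 1632, 1269),
    ("post-quantum proposal", 6960, 5413),
    ("NIST round 3, 6688128", 6688, 5024),
    ("NIST round 3, 8192128", 8192, 6528),
]

for name, n, k in parameter_sets:
    bits = public_key_bits(n, k)
    print(f"{name:28s} n={n:5d} k={k:5d} -> {bits:>10,d} bits (~{bits / 8 / 1024:.0f} KiB)")
```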

Attacks

An attack consists of an adversary, who knows the public key $(\hat{G}, t)$ but not the private key, deducing the plaintext from some intercepted ciphertext $y \in \mathbb{F}_2^n$. Such attempts should be infeasible.

There are two main branches of attacks for McEliece:

Brute-force / unstructured attacks

The attacker knows $\hat{G}$, the generator matrix of an $(n,k)$ code $\hat{C}$ that is combinatorially able to correct $t$ errors. The attacker may ignore the fact that $\hat{C}$ is really the obfuscation of a structured code chosen from a specific family, and instead just use an algorithm for decoding with any linear code. Several such algorithms exist, such as going through each codeword of the code, syndrome decoding, or information set decoding.

Decoding a general linear code, however, is known to be NP-hard, and all of the above-mentioned methods have exponential running time.

In 2008, Bernstein, Lange, and Peters described a practical attack on the original McEliece cryptosystem, using the information set decoding method by Stern. Using the parameters originally suggested by McEliece, the attack could be carried out in $2^{60.55}$ bit operations. Since the attack is embarrassingly parallel (no communication between nodes is necessary), it can be carried out in days on modest computer clusters.
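The simplest unstructured attack, going through each codeword, is easy to write down and makes the exponential cost visible. The sketch below is not the Stern or Bernstein-Lange-Peters algorithm; it is plain exhaustive search, and both the function name `brute_force_decode` and the toy matrix `G_hat` (a column-permuted $(7, 4)$ Hamming code with $t = 1$) are assumptions made for illustration.

```python
from itertools import product

def brute_force_decode(c, G_hat, t):
    """Try all 2^k messages; return the one whose codeword is within distance t of c."""
    k, n = len(G_hat), len(G_hat[0])
    for m in product((0, 1), repeat=k):
        codeword = [sum(m[i] & G_hat[i][j] for i in range(k)) % 2 for j in range(n)]
        if sum(a ^ b for a, b in zip(codeword, c)) <= t:
            return list(m)
    return None  # c is not within distance t of any codeword

# ASSUMPTION: hypothetical toy public key, for illustration only.
G_hat = [
    [0, 1, 1, 0, 0, 0, 1],
    [1, 0, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 0, 0, 0],
]

secret = [1, 0, 1, 1]
ciphertext = [sum(secret[i] & G_hat[i][j] for i in range(4)) % 2 for j in range(7)]
ciphertext[2] ^= 1  # one bit error, as added during encryption

print(brute_force_decode(ciphertext, G_hat, t=1))
```

With $k = 4$ the loop runs at most 16 times; with McEliece's original $k = 524$ it would need up to $2^{524}$ iterations, which is why practical attacks rely on information set decoding instead.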

Structural attacks

The attacker may instead attempt to recover the "structure" of $C$, thereby recovering the efficient decoding algorithm $A$ or another sufficiently strong, efficient decoding algorithm.

The family of codes from which $C$ is chosen completely determines whether this is possible for the attacker. Many code families have been proposed for McEliece, and most of them, such as Reed-Solomon codes, have been completely "broken" in the sense that attacks have been found that recover an efficient decoding algorithm.

The originally proposed binary Goppa codes remain one of the few suggested families of codes that have largely resisted attempts at devising structural attacks.

Post-quantum encryption candidate

A variant of this algorithm combined with NTS-KEM was entered into and selected during the third round of the NIST post-quantum encryption competition.
