In
cryptography
Cryptography, or cryptology (from "hidden, secret"; and ''graphein'', "to write", or ''-logy, -logia'', "study", respectively), is the practice and study of techniques for secure communication in the presence of Adversary (cryptography), ...
, the simple XOR cipher is a type of ''additive
cipher
In cryptography, a cipher (or cypher) is an algorithm for performing encryption or decryption—a series of well-defined steps that can be followed as a procedure. An alternative, less common term is ''encipherment''. To encipher or encode i ...
'', an
encryption algorithm
In cryptography, encryption (more specifically, encoding) is the process of transforming information in a way that, ideally, only authorized parties can decode. This process converts the original representation of the information, known as plain ...
that operates according to the principles:
:A
0 = A,
:A
A = 0,
:A
B = B
A,
:(A
B)
C = A
(B
C),
:(B
A)
A = B
0 = B
For example where
denotes the
exclusive disjunction
Exclusive or, exclusive disjunction, exclusive alternation, logical non-equivalence, or Logical_equality#Inequality, logical inequality is a Logical connective, logical operator whose negation is the logical biconditional. With two inputs, X ...
(XOR) operation. This operation is sometimes called modulus 2 addition (or subtraction, which is identical). With this logic, a string of text can be encrypted by applying the bitwise XOR operator to every character using a given key. To decrypt the output, merely reapplying the XOR function with the key will remove the cipher.
Example
The string "Wiki" ( in 8-bit
ASCII
ASCII ( ), an acronym for American Standard Code for Information Interchange, is a character encoding standard for representing a particular set of 95 (English language focused) printable character, printable and 33 control character, control c ...
) can be encrypted with the repeating key as follows:
:
And conversely, for decryption:
:
Use and security
The XOR operator is extremely common as a component in more complex ciphers. By itself, using a constant repeating key, a simple XOR cipher can trivially be broken using
frequency analysis
In cryptanalysis, frequency analysis (also known as counting letters) is the study of the frequency of letters or groups of letters in a ciphertext. The method is used as an aid to breaking classical ciphers.
Frequency analysis is based on th ...
. If the content of any message can be guessed or otherwise known then the key can be revealed. Its primary merit is that it is simple to implement, and that the XOR operation is computationally inexpensive. A simple repeating XOR (i.e. using the same key for xor operation on the whole data) cipher is therefore sometimes used for hiding information in cases where no particular security is required. The XOR cipher is often used in computer
malware
Malware (a portmanteau of ''malicious software'')Tahir, R. (2018)A study on malware and malware detection techniques . ''International Journal of Education and Management Engineering'', ''8''(2), 20. is any software intentionally designed to caus ...
to make reverse engineering more difficult.
If the key is random and is at least as long as the message, the XOR cipher is much more secure than when there is key repetition within a message.
When the keystream is generated by a
pseudo-random number generator
A pseudorandom number generator (PRNG), also known as a deterministic random bit generator (DRBG), is an algorithm for generating a sequence of numbers whose properties approximate the properties of sequences of random numbers. The PRNG-generate ...
, the result is a
stream cipher
stream cipher is a symmetric key cipher where plaintext digits are combined with a pseudorandom cipher digit stream ( keystream). In a stream cipher, each plaintext digit is encrypted one at a time with the corresponding digit of the keystrea ...
. With a key that is
truly random, the result is a
one-time pad
The one-time pad (OTP) is an encryption technique that cannot be Cryptanalysis, cracked in cryptography. It requires the use of a single-use pre-shared key that is larger than or equal to the size of the message being sent. In this technique, ...
, which is
unbreakable in theory.
The XOR operator in any of these ciphers is vulnerable to a
known-plaintext attack
The known-plaintext attack (KPA) is an attack model for cryptanalysis where the attacker has access to both the plaintext (called a crib) and its encrypted version (ciphertext). These can be used to reveal secret keys and code books. The term " ...
, since ''plaintext''
''ciphertext'' = ''key''.
It is also trivial to flip arbitrary bits in the decrypted plaintext by manipulating the ciphertext.
This is called
malleability
Ductility refers to the ability of a material to sustain significant plastic Deformation (engineering), deformation before fracture. Plastic deformation is the permanent distortion of a material under applied stress, as opposed to elastic def ...
.
Usefulness in cryptography
The primary reason XOR is so useful in cryptography is because it is "perfectly balanced"; for a given plaintext input 0 or 1, the ciphertext result is equally likely to be either 0 or 1 for a truly random key bit.
The table below shows all four possible pairs of plaintext and key bits. It is clear that if nothing is known about the key or plaintext, nothing can be determined from the ciphertext alone.
Other logical operations such and
AND
And or AND may refer to:
Logic, grammar and computing
* Conjunction, connecting two words, phrases, or clauses
* Logical conjunction in mathematical logic, notated as "∧", "⋅", "&", or simple juxtaposition
* Bitwise AND, a Boolean oper ...
or
OR do not have such a mapping (for example, AND would produce three 0's and one 1, so knowing that a given ciphertext bit is a 0 implies that there is a 2/3 chance that the original plaintext bit was a 0, as opposed to the ideal 1/2 chance in the case of XOR)
Example implementation
Example using the
Python
Python may refer to:
Snakes
* Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia
** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia
* Python (mythology), a mythical serpent
Computing
* Python (prog ...
programming language.
from os import urandom
def generate_key(length: int) -> bytes:
"""Generate encryption key."""
return urandom(length)
def xor_strings(s, t) -> bytes:
"""Concatenate xor two strings together."""
if isinstance(s, str):
# Text strings contain single characters
return "".join(chr(ord(a) ^ b) for a, b in zip(s, t)).encode("utf8")
else:
# Bytes objects contain integer values in the range 0-255
return bytes( ^ b for a, b in zip(s, t)
message = "This is a secret message"
print("Message:", message)
key = generate_key(len(message))
print("Key:", key)
cipherText = xor_strings(message.encode("utf8"), key)
print("cipherText:", cipherText)
print("decrypted:", xor_strings(cipherText, key).decode("utf8"))
# Verify
if xor_strings(cipherText, key).decode("utf8") message:
print("Unit test passed")
else:
print("Unit test failed")
A shorter example using the
R programming language, based on
puzzleposted on Instagram by
GCHQ
Government Communications Headquarters (GCHQ) is an intelligence and security organisation responsible for providing signals intelligence (SIGINT) and information assurance (IA) to the government and armed forces of the United Kingdom. Primar ...
.
secret_key <- c(0xc6, 0xb5, 0xca, 0x01) , > as.raw()
secret_message <- "I <3 Wikipedia" , >
charToRaw() , >
xor(secret_key) , >
base64enc::base64encode()
secret_message_bytes <- secret_message , >
base64enc::base64decode()
xor(secret_message_bytes, secret_key) , > rawToChar()
See also
*
Block cipher
In cryptography, a block cipher is a deterministic algorithm that operates on fixed-length groups of bits, called ''blocks''. Block ciphers are the elementary building blocks of many cryptographic protocols. They are ubiquitous in the storage a ...
*
Vernam cipher
Vernam is a surname. Notable people with the surname include:
* Charles Vernam (born 1996), English professional footballer
*Gilbert Vernam (1890–1960), invented an additive polyalphabetic stream cipher and later co-invented an automated one-tim ...
*
Vigenère cipher
The Vigenère cipher () is a method of encryption, encrypting alphabetic text where each letter of the plaintext is encoded with a different Caesar cipher, whose increment is determined by the corresponding letter of another text, the key (crypt ...
References
Notes
Citations
Sources
*
*
*
*
*
*
*
* Transcript of a lecture given by Prof. Tutte at the
University of Waterloo
The University of Waterloo (UWaterloo, UW, or Waterloo) is a Public university, public research university located in Waterloo, Ontario, Canada. The main campus is on of land adjacent to uptown Waterloo and Waterloo Park. The university also op ...
{{refend
Stream ciphers