
Xorshift random number generators, also called shift-register generators, are a class of
pseudorandom number generator
A pseudorandom number generator (PRNG), also known as a deterministic random bit generator (DRBG), is an algorithm for generating a sequence of numbers whose properties approximate the properties of sequences of random number generation, random n ...
s that were invented by
George Marsaglia.
They are a subset of
linear-feedback shift register
In computing, a linear-feedback shift register (LFSR) is a shift register whose input bit is a Linear#Boolean functions, linear function of its previous state.
The most commonly used linear function of single bits is exclusive-or (XOR). Thus, ...
s (LFSRs) which allow a particularly efficient implementation in software without the excessive use of
sparse polynomials.
They generate the next number in their sequence by repeatedly taking the
exclusive or
Exclusive or, exclusive disjunction, exclusive alternation, logical non-equivalence, or logical inequality is a logical operator whose negation is the logical biconditional. With two inputs, XOR is true if and only if the inputs differ (on ...
of a number with a
bit-shifted version of itself. This makes execution extremely efficient on modern computer architectures, but it does not benefit efficiency in a hardware implementation. Like all LFSRs, the parameters have to be chosen very carefully in order to achieve a long period.
For execution in software, xorshift generators are among the fastest PRNGs, requiring very small code and state. However, they do not pass every statistical test without further refinement. This weakness is amended by combining them with a non-linear function, as described in the original paper. Because plain xorshift generators (without a non-linear step) fail some statistical tests, they have been accused of being unreliable.
Example implementation
A
C version of three xorshift algorithms is given here. The first has one 32-bit word of state, and period 2
32−1. The second has one 64-bit word of state and period 2
64−1. The last one has four 32-bit words of state, and period 2
128−1. The 128-bit algorithm passes the
diehard tests. However, it fails the ''MatrixRank'' and ''LinearComp'' tests of the ''BigCrush'' test suite from the
TestU01 framework.
All use three shifts and three or four exclusive-or operations:
#include
struct xorshift32_state ;
/* The state must be initialized to non-zero */
uint32_t xorshift32(struct xorshift32_state *state)
struct xorshift64_state ;
/* The state must be initialized to non-zero */
uint64_t xorshift64(struct xorshift64_state *state)
/* struct xorshift128_state can alternatively be defined as a pair
of uint64_t or a uint128_t where supported */
struct xorshift128_state ;
/* The state must be initialized to non-zero */
uint32_t xorshift128(struct xorshift128_state *state)
In case of one 64-bit word of state, there exist parameters which hold period 2
64−1 with two pair of exclusive-or and shift.
#include
struct xorshift64_state ;
uint64_t xorshift64(struct xorshift64_state *state)
Non-linear variations
All xorshift generators fail some tests in the ''BigCrush'' test suite. This is true for all generators based on linear recurrences, such as the
Mersenne Twister
The Mersenne Twister is a general-purpose pseudorandom number generator (PRNG) developed in 1997 by and . Its name derives from the choice of a Mersenne prime as its period length.
The Mersenne Twister was created specifically to address most of ...
or
WELL
A well is an excavation or structure created on the earth by digging, driving, or drilling to access liquid resources, usually water. The oldest and most common kind of well is a water well, to access groundwater in underground aquifers. The ...
. However, it is easy to scramble the output of such generators to improve their quality.
The scramblers known as and still leave weakness in the low bits,
[ so they are intended for floating point use, where the lowest bits of floating-point numbers have a smaller impact on the interpreted value. For general purpose, the scrambler (pronounced ''starstar'') makes the LFSR generators pass in all bits.
]
xorwow
Marsaglia suggested scrambling the output by combining it with a simple additive counter modulo 232 (which he calls a " Weyl sequence" after Weyl's equidistribution theorem). This also increases the period by a factor of 232, to 2192−232:
#include
struct xorwow_state ;
/* The state array must be initialized to not be all zero in the first four words */
uint32_t xorwow(struct xorwow_state *state)
This performs well, but fails a few tests in BigCrush. This generator is the default in Nvidia's CUDA
In computing, CUDA (Compute Unified Device Architecture) is a proprietary parallel computing platform and application programming interface (API) that allows software to use certain types of graphics processing units (GPUs) for accelerated gene ...
toolkit.
xorshift*
An generator applies an invertible multiplication (modulo the word size) as a non-linear transformation to the output of an generator, as suggested by Marsaglia. All generators emit a sequence of values that is equidistributed In mathematics, a sequence (''s''1, ''s''2, ''s''3, ...) of real numbers is said to be equidistributed, or uniformly distributed, if the proportion of terms falling in a subinterval is proportional to the length of that subinterval. Such sequences ...
in the maximum possible dimension (except that they will never output zero for 16 calls, i.e. 128 bytes, in a row).
The following 64-bit generator has a maximal period of 264−1.
#include
/* xorshift64s, variant A_1(12,25,27) with multiplier M_32 from line 3 of table 5 */
uint64_t xorshift64star(void)
The generator fails only the ''MatrixRank'' test of BigCrush, however if the generator is modified to return only the high 32 bits, then it passes BigCrush with zero failures. In fact, a reduced version with only 40 bits of internal state passes the suite, suggesting a large safety margin. A similar generator suggested in ''Numerical Recipes
''Numerical Recipes'' is the generic title of a series of books on algorithm
In mathematics and computer science, an algorithm () is a finite sequence of Rigour#Mathematics, mathematically rigorous instructions, typically used to solve a cla ...
'' as RanQ1
also fails the ''BirthdaySpacings'' test.
Vigna suggests the following generator with 1024 bits of state and a maximal period of 21024−1; however, it does not always pass BigCrush. xoshiro256** is therefore a much better option.
#include
/* The state must be seeded so that there is at least one non-zero element in array */
struct xorshift1024s_state ;
uint64_t xorshift1024s(struct xorshift1024s_state *state)
xorshift+
An generator can achieve an order of magnitude fewer failures than Mersenne Twister
The Mersenne Twister is a general-purpose pseudorandom number generator (PRNG) developed in 1997 by and . Its name derives from the choice of a Mersenne prime as its period length.
The Mersenne Twister was created specifically to address most of ...
or WELL
A well is an excavation or structure created on the earth by digging, driving, or drilling to access liquid resources, usually water. The oldest and most common kind of well is a water well, to access groundwater in underground aquifers. The ...
. A native C implementation of an xorshift+ generator that passes all tests from the BigCrush suite can typically generate a random number in fewer than 10 clock cycle
In electronics and especially synchronous digital circuits, a clock signal (historically also known as ''logic beat'') is an electronic logic signal (voltage or current) which oscillates between a high and a low state at a constant frequency and ...
s on x86
x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel, based on the 8086 microprocessor and its 8-bit-external-bus variant, the 8088. Th ...
, thanks to instruction pipelining
In computer engineering, instruction pipelining is a technique for implementing instruction-level parallelism within a single processor. Pipelining attempts to keep every part of the processor busy with some instruction by dividing incoming in ...
.
Rather than using multiplication, it is possible to use addition as a faster non-linear transformation. The idea was first proposed by Saito and Matsumoto (also responsible for the Mersenne Twister) in the generator, which adds two consecutive outputs of an underlying generator based on 32-bit shifts. However, one disadvantage of adding consecutive outputs is that, while the underlying generator is 2-dimensionally equidistributed, the generator is only 1-dimensionally equidistributed.
has some weakness in the low-order bits of its output; it fails several BigCrush tests when the output words are bit-reversed. To correct this problem, Vigna introduced the family, based on 64-bit shifts. generators, even as large as , exhibit some detectable linearity in the low-order bits of their output; it passes BigCrush, but doesn't when the 32 lowest-order bits are used in reverse order from each 64-bit word. This generator is one of the fastest generators passing BigCrush.
The following generator uses 128 bits of state and has a maximal period of 2128−1.
#include
struct xorshift128p_state ;
/* The state must be seeded so that it is not all zero */
uint64_t xorshift128p(struct xorshift128p_state *state)
xorshiftr+
(r stands for reduced; reads "xorshifter plus") generator was mainly based on xorshift+ yet incorporates modifications making it significantly faster (especially on lightweight devices) and more successful in randomness tests (including TestU01 BigCrush suite) compared to its predecessors. It is one of the fastest generators passing all tests in TestU01's BigCrush suite. Like xorshift+, a native C implementation of an xorshiftr+ generator that passes all tests from the BigCrush suite can typically generate a random number in fewer than 10 clock cycle
In electronics and especially synchronous digital circuits, a clock signal (historically also known as ''logic beat'') is an electronic logic signal (voltage or current) which oscillates between a high and a low state at a constant frequency and ...
s on x86
x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel, based on the 8086 microprocessor and its 8-bit-external-bus variant, the 8088. Th ...
, thanks to instruction pipelining
In computer engineering, instruction pipelining is a technique for implementing instruction-level parallelism within a single processor. Pipelining attempts to keep every part of the processor busy with some instruction by dividing incoming in ...
.
Unlike , does not return the sum of two variables derived from the state using xorshift-style steps, rather it returns a single variable with the very last operation in its cycle; however, it features an addition just before returning a value, namely in the phase of adjusting the seed for the next cycle; hence the "+" in the name of the algorithm. The variable sizes, including the state, can be increased with no compromise to the randomness scores, but performance drops may be observed on lightweight devices.
The following generator uses 128 bits of state (with two variables) and has a maximal period of 2128−1.
#include
struct xorshiftr128plus_state ;
/* The state must be seeded so that it is not all zero */
uint64_t xorshiftr128plus(struct xorshiftr128plus_state *state)
xoshiro
xoshiro (short for "xor, shift, rotate") and xoroshiro (short for "xor, rotate, shift, rotate") use rotations in addition to shifts. According to Vigna, they are faster and produce better quality output than xorshift.
This class of generator has variants for 32-bit and 64-bit integer and floating point output; for floating point, one takes the upper 53 bits (for binary64) or the upper 23 bits (for binary32), since the upper bits are of better quality than the lower bits in the floating point generators. The algorithms also include a jump
function, which sets the state forward by some number of steps – usually a power of two that allows many threads of execution to start at distinct initial states.
For 32-bit output, xoshiro128** and xoshiro128+ are exactly equivalent to xoshiro256** and xoshiro256+, with in place of , and with different shift/rotate constants.
More recently, the generators have been made as an alternative to the generators. They are used in some implementations of Fortran compilers such as GNU Fortran, and in Java
Java is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea (a part of Pacific Ocean) to the north. With a population of 156.9 million people (including Madura) in mid 2024, proje ...
, and Julia.
xoshiro256++
xoshiro256++ is the family's general-purpose random 64-bit number generator.
/* Adapted from the code included on Sebastiano Vigna's website */
#include
uint64_t rol64(uint64_t x, int k)
struct xoshiro256pp_state ;
uint64_t xoshiro256pp(struct xoshiro256pp_state *state)
xoshiro256**
xoshiro256** uses multiplication rather than addition in its output function. It is worth noting, however, that the output function is invertible, allowing the underlying state to be trivially uncovered. It is used in GNU Fortran compiler
In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primaril ...
, Lua (as of Lua 5.4), and the .NET
The .NET platform (pronounced as "''dot net"'') is a free and open-source, managed code, managed computer software framework for Microsoft Windows, Windows, Linux, and macOS operating systems. The project is mainly developed by Microsoft emplo ...
framework (as of .NET 6.0).
/* Adapted from the code included on Sebastiano Vigna's website */
#include
uint64_t rol64(uint64_t x, int k)
struct xoshiro256ss_state ;
uint64_t xoshiro256ss(struct xoshiro256ss_state *state)
xoshiro256+
xoshiro256+ is approximately 15% faster than xoshiro256**, but the lowest three bits have low linear complexity; therefore, it should be used only for floating point results by extracting the upper 53 bits.
#include
uint64_t rol64(uint64_t x, int k)
struct xoshiro256p_state ;
uint64_t xoshiro256p(struct xoshiro256p_state *state)
xoroshiro
If space is at a premium, xoroshiro128** and xoroshiro128+ are equivalent to xoshiro256** and xoshiro256+. These have smaller state spaces, and thus are less useful for massively parallel programs. xoroshiro128+ also exhibits a mild dependency in the population count, generating a failure after of output. The authors do not believe that this can be detected in real world programs. Instead of perpetuating Marsaglia's tradition of xorshift as a basic operation, xoroshiro128+ uses a shift/rotate-based linear transformation designed by Sebastiano Vigna in collaboration with David Blackman. The result is a significant improvement in speed and statistical quality.
xoroshiro64** and xoroshiro64* are equivalent to xoroshiro128** and xoroshiro128+. Unlike the xoshiro generators, they are not straightforward ports of their higher-precision counterparts.
Statistical quality
The lowest bits of the output generated by xoroshiro128+ have low quality. The authors of xoroshiro128+ acknowledge that it does not pass all statistical tests, stating
This is xoroshiro128+ 1.0, our best and fastest small-state generator
for floating-point numbers. We suggest to use its upper bits for
floating-point generation, as it is slightly faster than
xoroshiro128**. It passes all tests we are aware of except for the four
lower bits, which might fail linearity tests (and just those), so if
low linear complexity is not considered an issue (as it is usually the
case) it can be used to generate 64-bit outputs, too; moreover, this
generator has a very mild Hamming-weight dependency making our test
(http://prng.di.unimi.it/hwd.php) fail after 5 TB of output; we believe
this slight bias cannot affect any application. If you are concerned,
use xoroshiro128** or xoshiro256+.
We suggest to use a sign test to extract a random Boolean value, and
right shifts to extract subsets of bits.
The state must be seeded so that it is not everywhere zero. If you have
a 64-bit seed, we suggest to seed a splitmix64 generator and use its
output to fill s.
NOTE: the parameters (a=24, b=16, c=37) of this version give slightly
better results in our test than the 2016 version (a=55, b=14, c=36).
These claims about not passing tests can be confirmed by running PractRand on the input, resulting in output like that shown below:
RNG_test using PractRand version 0.93
RNG = RNG_stdin64, seed = 0xfac83126
test set = normal, folding = standard (64 bit)
rng=RNG_stdin64, seed=0xfac83126
length= 128 megabytes (2^27 bytes), time= 2.1 seconds
Test Name Raw Processed Evaluation
ow1/64Rank(12):256(2) R= +3748 p~= 3e-1129 FAIL !!!!!!!!
ow1/64Rank(12):384(1) R= +5405 p~= 3e-1628 FAIL !!!!!!!!
...and 146 test result(s) without anomalies
Acknowledging the authors go on to say:
We suggest to use a sign test to extract a random Boolean value
Thus, programmers should prefer the highest bits (e.g., making a heads/tails by writing random_number < 0
rather than random_number & 1
). It must be noted, though, that the same test is failed by some instances of the Mersenne Twister
The Mersenne Twister is a general-purpose pseudorandom number generator (PRNG) developed in 1997 by and . Its name derives from the choice of a Mersenne prime as its period length.
The Mersenne Twister was created specifically to address most of ...
and WELL
A well is an excavation or structure created on the earth by digging, driving, or drilling to access liquid resources, usually water. The oldest and most common kind of well is a water well, to access groundwater in underground aquifers. The ...
.
The statistical problems extend far beyond the bottom few bits, because it fails the PractRand test even when truncated and fails multiple tests in BigCrush even when the bits are reversed.
Initialization
In the xoshiro paper, it is recommended to initialize the state of the generators using a generator which is radically different from the initialized generators, as well as one which will never give the "all-zero" state; for shift-register generators, this state is impossible to escape from. The authors specifically recommend using the SplitMix64 generator, from a 64-bit seed, as follows:
#include
struct splitmix64_state ;
uint64_t splitmix64(struct splitmix64_state *state)
struct xorshift128_state ;
// one could do the same for any of the other generators
void xorshift128_init(struct xorshift128_state *state, uint64_t seed)
See also
* List of random number generators
* Linear feedback shift register
In computing, a linear-feedback shift register (LFSR) is a shift register whose input bit is a linear function of its previous state.
The most commonly used linear function of single bits is exclusive-or (XOR). Thus, an LFSR is most often a sh ...
Notes
References
Further reading
* Lists generators of various sizes with four shifts (two per feedback word).
External links
*{{Cite web , url=http://vigna.di.unimi.it/xorshift/, title=xoshiro / xoroshiro generators and the PRNG shootout , year =2018 , first = Sebastiano , last = Vigna , access-date= 2018-05-04
Articles with example C code
Pseudorandom number generators