A vector clock is a
data structure used for determining the
partial ordering of events in a
distributed system and detecting
causality
violations. Just as in
Lamport timestamps, inter-process messages contain the state of the sending process's
logical clock. A vector clock of a system of ''N'' processes is an
array
(or vector) of ''N'' logical clocks, one clock per process; each process keeps a local copy of the global clock array that holds the largest values it has seen so far.
Denote <math>VC_i</math> as the vector clock maintained by process <math>i</math>. The clock updates proceed as follows:
* Initially all clocks are zero.
* Each time a process experiences an internal event, it increments its own logical clock in the vector by one. For instance, upon an event at process <math>i</math>, it updates <math>VC_i[i] \leftarrow VC_i[i] + 1</math>.
* Each time a process sends a message, it increments its own logical clock in the vector by one (as in the bullet above, but not twice for the same event) and then piggybacks a copy of its own vector on the message.
* Each time a process receives a message, it increments its own logical clock in the vector by one and updates each element in its vector by taking the maximum of the value in its own vector clock and the value in the vector in the received message. For example, if process <math>P_j</math> receives a message <math>m</math> from <math>P_i</math>, it updates by setting <math>VC_j[j] \leftarrow VC_j[j] + 1</math> and <math>VC_j[k] \leftarrow \max(VC_j[k], VC_i[k])</math> for every <math>k</math>, where <math>VC_i</math> is the copy of the sender's vector carried by <math>m</math>.
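The update rules above can be illustrated with a minimal Python sketch. The class name <code>VectorClockProcess</code> and its method names are hypothetical and chosen only for this illustration; processes are assumed to be numbered from 0 to ''N'' − 1, and message delivery is simulated by passing the copied vector directly.

```python
class VectorClockProcess:
    """One process in a system of n processes (illustrative helper, not a library API)."""

    def __init__(self, pid, n):
        self.pid = pid          # index of this process, assumed 0 <= pid < n
        self.clock = [0] * n    # initially all clocks are zero

    def internal_event(self):
        # Internal event: increment this process's own entry by one.
        self.clock[self.pid] += 1

    def send(self):
        # Sending counts as an event: increment the own entry once,
        # then piggyback a copy of the whole vector on the message.
        self.clock[self.pid] += 1
        return list(self.clock)

    def receive(self, message_clock):
        # Receiving: increment the own entry, then take the
        # element-wise maximum with the vector carried by the message.
        self.clock[self.pid] += 1
        self.clock = [max(a, b) for a, b in zip(self.clock, message_clock)]


# Example run with three processes.
p0 = VectorClockProcess(0, 3)
p1 = VectorClockProcess(1, 3)

p0.internal_event()      # p0.clock is now [1, 0, 0]
msg = p0.send()          # p0.clock is now [2, 0, 0]; msg is a copy of it
p1.receive(msg)          # p1.clock is now [2, 1, 0]
```

Because the sender's copy can never hold a larger value for the receiver's own entry than the receiver itself holds, incrementing before or after taking the maximum gives the same result in this sketch.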
History
Without using the specific name "vector clock", the concept of a vector clock was first mentioned in a 1986 paper by
Rivka Ladin and
Barbara Liskov
where they use the term "multipart timestamp". To quote from page 31 of the Liskov/Ladin paper:
"We solve this problem by using ''multipart timestamps'', where there is one part for each replica. Thus, if there are ''n'' replicas, a timestamp ''t'' is <math>t = \langle t_1, \ldots, t_n \rangle</math> where each part is a positive integer. Since there will typically be a small number of replicas (e.g., 3 to 7), using such a timestamp is practical."
The term "vector clock" was first used independently by Colin Fidge and
Friedemann Mattern in 1988.
Partial ordering property
Vector clocks allow for the partial causal ordering of events. Defining the following:
* <math>VC(x)</math> denotes the vector clock of event <math>x</math>, and <math>VC(x)_z</math> denotes the component of that clock for process <math>z</math>.
*