Causal consistency is one of the major memory
consistency model
In computer science, a consistency model specifies a contract between the programmer and a system, wherein the system guarantees that if the programmer follows the rules for operations on memory, memory will be consistent and the results of rea ...
s. In
concurrent programming
Concurrent means happening at the same time. Concurrency, concurrent, or concurrence may refer to:
Law
* Concurrence, in jurisprudence, the need to prove both ''actus reus'' and ''mens rea''
* Concurring opinion (also called a "concurrence"), a ...
, where concurrent processes are accessing a shared memory, a
consistency model
In computer science, a consistency model specifies a contract between the programmer and a system, wherein the system guarantees that if the programmer follows the rules for operations on memory, memory will be consistent and the results of rea ...
restricts which accesses are legal. This is useful for defining correct
data structure
In computer science, a data structure is a data organization, management, and storage format that is usually chosen for Efficiency, efficient Data access, access to data. More precisely, a data structure is a collection of data values, the rel ...
s in
distributed shared memory
In computer science, distributed shared memory (DSM) is a form of memory architecture where physically separated memories can be addressed as a single shared address space. The term "shared" does not mean that there is a single centralized memo ...
or
distributed transactions.
Causal Consistency is
“Available under Partition”, meaning that a process can read and write the memory (memory is Available) even while there is no functioning network connection (network is Partitioned) between processes; it is an asynchronous model. Contrast to strong consistency models, such as
sequential consistency
Sequential consistency is a consistency model used in the domain of concurrent computing
Concurrent computing is a form of computing in which several computations are executed '' concurrently''—during overlapping time periods—instead of ...
or
linearizability
In concurrent programming, an operation (or set of operations) is linearizable if it consists of an ordered list of invocation and response events ( event), that may be extended by adding response events such that:
# The extended list can be re ...
, which cannot be both
safe
A safe (also called a strongbox or coffer) is a secure Lock (security device), lockable box used for securing valuable objects against theft or fire. A safe is usually a hollow cuboid or cylinder, with one face being removable or hinged to form ...
and
live
Live may refer to:
Arts, entertainment, and media Films
* ''Live!'' (2007 film), 2007 American film
* ''Live'' (2014 film), a 2014 Japanese film
*'' ''Live'' (Apocalyptica DVD)
Music
* Live (band), American alternative rock band
* List of album ...
under partition, and are slow to respond because they require synchronisation.
Causal consistency was proposed in 1990s
[Ahamad, M., Neiger, G., Burns, J. E., Kohli, P., & Hutto, P. W. (1995). Causal memory: Definitions, implementation, and programming. Distributed Computing, 9(1), 37-49.] as a weaker consistency model for shared memory models. Causal consistency is closely related to the concept of Causal Broadcast in communication protocols.
[Kenneth P. Birman, Thomas A. Joseph (1987). Reliable Communication in the Presence of Failures. Trans. on Comp. Sys. (TOCS), 5(1), 47–76.] In these models, a distributed execution is represented as a partial order, based on Lamport's
happened-before In computer science, the happened-before relation (denoted: \to \;) is a relation between the result of two events, such that if one event should happen before another event, the result must reflect that, even if those events are in reality execute ...
concept of potential causality.
[Lamport, L. (1978). Time, clocks, and the ordering of events in a distributed system. Communications of the ACM, 21(7), 558-565.]
Causal consistency is a useful consistency model because it matches programmers' intuitions about time, is more available than strong consistency models, yet provides more useful guarantees than
eventual consistency
Eventual consistency is a consistency model used in distributed computing to achieve high availability that informally guarantees that, if no new updates are made to a given data item, eventually all accesses to that item will return the last up ...
. For instance, in distributed databases, causal consistency supports the ordering of operations, in contrast to
eventual consistency
Eventual consistency is a consistency model used in distributed computing to achieve high availability that informally guarantees that, if no new updates are made to a given data item, eventually all accesses to that item will return the last up ...
. Also, causal consistency helps with the development of
abstract data type
In computer science, an abstract data type (ADT) is a mathematical model for data types. An abstract data type is defined by its behavior (semantics) from the point of view of a '' user'', of the data, specifically in terms of possible values, po ...
s such as queues or counters.
Since time and ordering are so fundamental to our intuition, it is hard
to reason about a system that does not enforce causal consistency.
However, many distributed databases lack this guarantee, even ones that
provide serialisability.
Spanner
A wrench or spanner is a tool used to provide grip and mechanical advantage in applying torque to turn objects—usually rotary fasteners, such as nuts and bolts—or keep them from turning.
In the UK, Ireland, Australia, and New Zealan ...
does guarantee causal consistency, but it also forces strong consistency, thus eschewing
availability under partition.
More available databases that ensure causal consistency include
MongoDB
MongoDB is a source-available cross-platform document-oriented database program. Classified as a NoSQL database program, MongoDB uses JSON-like documents with optional schemas. MongoDB is developed by MongoDB Inc. and licensed under the Ser ...
an
AntidoteDB
Definition
Causal consistency captures the potential causal relationships between operations, and guarantees that all processes observe causally-related operations in a common order. In other words, all processes in the system agree on the order of the causally-related operations. They may disagree on the order of operations that are causally unrelated.
Let us define the following relation. If some process performs a write operation A, and some (the same or another) process that observed A then performs a write operation B, then it is possible that A is the cause of B; we say that A “potentially causes” or “causally precedes” B.
Causal Consistency guarantees that if A causally-precedes B, then every process in the system observes A before observing B. Conversely, two write operations C and D are said concurrent, or causally independent, if neither causally precedes the other. In this case, a process may observe either C before D, or D before C. The Causal Precedence relation in shared memory is related to the
happened-before In computer science, the happened-before relation (denoted: \to \;) is a relation between the result of two events, such that if one event should happen before another event, the result must reflect that, even if those events are in reality execute ...
relation in message-based communication.
Thus, a system provides causal consistency if this following condition holds: write operations that are related by potential causality are seen by each process of the system in their causal precedence order. Different processes may observe concurrent writes in different orders.
[Gogia, R., Chhabra, P., & Kumari, R. (2014). Consistency Models in Distributed Shared Memory Systems. International Journal of Computer Science and Mobile Computing,196-201]
The Causal Consistency model is weaker than
sequential consistency
Sequential consistency is a consistency model used in the domain of concurrent computing
Concurrent computing is a form of computing in which several computations are executed '' concurrently''—during overlapping time periods—instead of ...
, which ensures that all processes observe all write operations in common order, whether causally related or not. However, causal consistency is stronger than
PRAM consistency, which requires only the write operations that are done by a single process to be observed in common order by each other process. It follows that when a system is sequentially consistent, it is also causally consistent. Additionally, causal consistency implies PRAM consistency, but not vice versa.
Example
Here is an example of causal consistency.
Causal relations are respected in the following event sequence:
Process P2 observes, reads, the earlier write W(x)1 that is done by process P1. Therefore, the two writes W(x)1 and W(x)2 are causally related. Under causal consistency, every process observes W(x)1 first, before observing W(x)2. Notice that the two write operations W(x)2 and W(x)3, with no intervening read operations, are concurrent, and processes P3 and P4 observe (read) them in different orders.
Session Guarantees
The causal consistency model can be refined into four session guarantees.. They can be summarised as follows:
* Read Your Writes: If a process performs a write, the same process later observes the result of its write.
* Monotonic Reads: the set of writes observed (read) by a process is guaranteed to be monotonically non-decreasing.
*Writes Follow Reads: if some process performs a read followed by a write, and another process observes the result of the write, then it can also observe the read (unless it has been overwritten).
* Monotonic Writes: If some process performs a write, followed some time later by another write, other processes will observe them in the same order.
Transactional session guarantees for serialisability and snapshot isolation are presented by Daudjee and Salem.
Implementation
The system is abstracted as a set of communicating processes.
When a process writes into the shared memory, the implementation sends this event to the other processes (via shared memory or as a message).
Because of concurrency and failures, a process might receive events in any order.
The implementation ''delivers'' an event, i.e., makes it visible to the process, only if all the events that causally precede it have themselves been delivered.
This requires the implementation to maintain ''meta-data'' that represents the causal relationships between memory accesses.
In brief, the implementation includes the following steps:
(1) Maintain ''causal context'' meta-data at every process to summarise what updates causally precede the current state.
(2) When a process updates memory, tag the update event with the causal context of that process, to summarise what updates causally precede this update.
(3) A process that has ''received'' some update event may ''deliver'' it only if the event's tag causally precedes the causal context of the receiving process.
(As a side effect of delivery, add the new event to the causal context of the receiving process.)
Otherwise, the update was received too early, and must remain buffered until event matches the context.
In the meantime, the implementation either passively waits to receive the missing events, or actively fetches them from their source.
This approach enables
availability under partition.
There are two common representations for the causal context meta-data.
One is to maintain an explicit
dependency graph
In mathematics, computer science and digital electronics, a dependency graph is a directed graph representing dependencies of several objects towards each other. It is possible to derive an evaluation order or the absence of an evaluation order t ...
of the causal dependence relation.
Because such a graph can grow arbitrarily large, an event is often tagged with only its immediate predecessors; determining its transitive predecessors requires a distributed graph traversal.
The other is to maintain a
vector clock
A vector clock is a data structure used for determining the partial ordering of events in a distributed system and detecting causality violations. Just as in Lamport timestamps, inter-process messages contain the state of the sending process's lo ...
, with one entry per process (or group of processes), counting the number of events generated by the process or group.
This representation has a fixed size, and the ordering of events can be inferred by a
simple comparison of the vectors.
To precisely determine which events are dependent and which are concurrent in a fully peer-to-peer system, the size of the metadata is at least proportional to the number of active writers.
However, a precise determination of concurrency is generally overkill.
Causal consistency requires only that causally-dependent events be delivered in order; it does not matter if two concurrent events end up being ordered.
Therefore, the size can be decreased arbitrarily by using safe approximation techniques.
[Torres-Rojas, Francisco J. and Ahamad, Mustaque. Plausible clocks: constant size logical clocks for distributed systems. Distributed Computing, 12(4), 1999.]
In the limit, a single scalar (a Lamport clock
[) suffices, at the cost of removing any concurrency.
The size of metadata can also be decreased by restricting the communication topology; for instance, in a star, tree or linear topology, a single scalar suffices.
The search for efficient implementations of causal consistency is a very
active research area.
]
References
{{Reflist
Consistency models