Double compare-and-swap (DCAS or CAS2) is an
atomic primitive proposed to support certain
concurrent programming
techniques. DCAS takes two not necessarily contiguous memory locations and writes new values into them only if they match pre-supplied "expected" values; as such, it is an extension of the much more popular
compare-and-swap
(CAS) operation.
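The semantics described above can be sketched as follows. This is a minimal illustration, not a real DCAS: the `Cell` class and the `dcas` function are hypothetical names, and atomicity is emulated with a single global lock, whereas a hardware DCAS would be one instruction.

```python
import threading

class Cell:
    """Hypothetical stand-in for one word of shared memory."""
    def __init__(self, value):
        self.value = value

# Assumption: one global lock models atomicity. This is NOT lock-free;
# it only demonstrates what a DCAS is supposed to do.
_dcas_lock = threading.Lock()

def dcas(a, expected_a, new_a, b, expected_b, new_b):
    """Write both new values only if both cells hold their expected values."""
    with _dcas_lock:
        if a.value == expected_a and b.value == expected_b:
            a.value = new_a
            b.value = new_b
            return True
        return False

x, y = Cell(1), Cell(2)            # two unrelated (non-adjacent) locations
print(dcas(x, 1, 10, y, 2, 20))    # both match, so both cells are updated
print(dcas(x, 1, 11, y, 20, 21))   # x no longer holds 1, so nothing changes
print(x.value, y.value)
```

Note that a failed comparison on either location leaves both locations untouched, which is what distinguishes DCAS from two separate CAS operations.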
DCAS is sometimes confused with the double-width compare-and-swap (DWCAS) implemented by instructions such as x86 CMPXCHG16B. DCAS, as discussed here, handles two discontiguous memory locations, typically of pointer size, whereas DWCAS handles two adjacent pointer-sized memory locations.
In his doctoral thesis, Michael Greenwald recommended adding DCAS to modern hardware, showing it could be used to create easy-to-apply yet efficient
software transactional memory
(STM). Greenwald points out that an advantage of DCAS over CAS is that higher-order (multiple-item) CAS''n'' can be implemented in O(''n'') with DCAS, but algorithms for DCAS that use only unary single-word atomic operations are sensitive to the number of contending processes.
One of the advantages of DCAS is the ability to implement atomic
deques (i.e. doubly linked lists) with relative ease.
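To see why two independent locations matter for a doubly linked structure, consider appending a node: the old neighbour's `next` pointer and the sentinel's `prev` pointer must change together. The sketch below shows only the push operations, under stated assumptions: `Cell`, `dcas`, `Node`, and `PushOnlyDeque` are hypothetical names, DCAS is emulated with a global lock, and removal is deliberately omitted because a correct concurrent pop is considerably subtler than this.

```python
import threading

class Cell:
    """Hypothetical stand-in for one pointer-sized word of shared memory."""
    def __init__(self, value=None):
        self.value = value

_dcas_lock = threading.Lock()  # assumption: lock-based emulation of DCAS

def dcas(a, expected_a, new_a, b, expected_b, new_b):
    with _dcas_lock:
        if a.value is expected_a and b.value is expected_b:
            a.value, b.value = new_a, new_b
            return True
        return False

class Node:
    def __init__(self, item=None):
        self.item = item
        self.prev = Cell()
        self.next = Cell()

class PushOnlyDeque:
    """Doubly linked list between two sentinels; pushes only."""
    def __init__(self):
        self.head, self.tail = Node(), Node()
        self.head.next.value = self.tail
        self.tail.prev.value = self.head

    def push_right(self, item):
        node = Node(item)
        node.next.value = self.tail
        while True:
            last = self.tail.prev.value
            node.prev.value = last
            # Swing last.next and tail.prev to the new node in one step.
            if dcas(last.next, self.tail, node, self.tail.prev, last, node):
                return

    def push_left(self, item):
        node = Node(item)
        node.prev.value = self.head
        while True:
            first = self.head.next.value
            node.next.value = first
            # Swing head.next and first.prev to the new node in one step.
            if dcas(self.head.next, first, node, first.prev, self.head, node):
                return

    def items(self):
        out, node = [], self.head.next.value
        while node is not self.tail:
            out.append(node.item)
            node = node.next.value
        return out
```

With single-word CAS, the two pointer updates would be separate steps, and other threads could observe the list with only one of them applied.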
More recently, however, it has been shown that an STM can be implemented with comparable properties using only CAS.
[Keir Fraser (2004), "Practical lock-freedom", UCAM-CL-TR-579]
A lock-free deque using hazard pointers and requiring only DWCAS rather than full DCAS was proposed by Maged Michael in 2003.[Maged M. Michael. CAS-based lock-free algorithm for shared deques. In Harald Kosch, László Böszörményi, and Hermann Hellwagner, editors, Euro-Par, volume 2790 of Lecture Notes in Computer Science, pages 651–660. Springer, 2003]
In general, however, DCAS is not a silver bullet
: implementing lock-free and wait-free algorithms
using it can be just as complex and error-prone as for CAS.
Motorola at one point included DCAS in the instruction set for its 68k
series; however, the slowness of DCAS relative to other primitives (apparently due to cache handling issues) led to its avoidance in practical contexts. DCAS is not natively supported by any widespread CPUs in production.
The generalization of DCAS to more than two addresses is sometimes called MCAS (multi-word CAS); MCAS can be implemented by a nestable LL/SC, but such a primitive is not directly available in hardware. MCAS can be implemented in software in terms of DCAS, in various ways. In 2013, Trevor Brown, Faith Ellen
, and Eric Ruppert implemented in software a multi-address LL/SC extension (which they call LLX/SCX) that, while more restrictive than MCAS, enabled them, via some automated code generation, to implement one of the best-performing concurrent binary search trees
(actually a chromatic tree), slightly beating the JDK CAS-based skip list
implementation.
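As a rough illustration of what MCAS generalizes, the sketch below states its semantics for an arbitrary number of locations. It is only a specification-style emulation under a global lock: `Cell` and `mcas` are hypothetical names, and this is neither one of the DCAS-based software constructions mentioned above nor LLX/SCX.

```python
import threading

class Cell:
    """Hypothetical stand-in for one word of shared memory."""
    def __init__(self, value):
        self.value = value

_mcas_lock = threading.Lock()  # assumption: the lock models atomicity only

def mcas(updates):
    """updates is a sequence of (cell, expected, new) triples.

    All cells are written only if every cell holds its expected value.
    """
    updates = list(updates)
    with _mcas_lock:
        if all(cell.value == expected for cell, expected, _ in updates):
            for cell, _, new in updates:
                cell.value = new
            return True
        return False

cells = [Cell(0), Cell(0), Cell(0)]
print(mcas([(c, 0, 1) for c in cells]))   # all three match and are updated
print(mcas([(cells[0], 1, 2), (cells[1], 0, 2)]))  # second cell mismatches
print([c.value for c in cells])
```

DCAS corresponds to the special case of exactly two triples.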
In general, DCAS can be provided by a more expressive hardware transactional memory.[Dave Dice, Yossi Lev, Mark Moir, Dan Nussbaum, and Marek Olszewski. (2009) "Early experience with a commercial hardware transactional memory implementation." Sun Microsystems technical report (60 pp.) SMLI TR-2009-180. A short version appeared at ASPLOS '09, doi:10.1145/1508244.1508263. The full-length report discusses how to implement DCAS using HTM in section 5.] IBM POWER8
and Intel TSX provide working implementations of transactional memory. Sun's cancelled Rock processor would have supported it as well.
External links
US Patent 4584640 ''Method and apparatus for a compare and swap instruction''