HOME

TheInfoList



OR:

In computer science, a concurrent data structure is a particular way of storing and organizing data for access by multiple computing threads (or processes) on a computer. Historically, such data structures were used on
uniprocessor A uniprocessor system is defined as a computer system that has a single central processing unit that is used to execute computer tasks. As more and more modern software is able to make use of multiprocessing architectures, such as SMP and MPP, t ...
machines with
operating systems An operating system (OS) is system software that manages computer hardware, software resources, and provides common daemon (computing), services for computer programs. Time-sharing operating systems scheduler (computing), schedule tasks for ef ...
that supported multiple computing threads (or processes). The term
concurrency Concurrent means happening at the same time. Concurrency, concurrent, or concurrence may refer to: Law * Concurrence, in jurisprudence, the need to prove both ''actus reus'' and ''mens rea'' * Concurring opinion (also called a "concurrence"), a ...
captured the
multiplexing In telecommunications and computer networking, multiplexing (sometimes contracted to muxing) is a method by which multiple analog or digital signals are combined into one signal over a shared medium. The aim is to share a scarce resource - a ...
/interleaving of the threads' operations on the data by the operating system, even though the processors never issued two operations that accessed the data simultaneously. Today, as
multiprocessor Multiprocessing is the use of two or more central processing units (CPUs) within a single computer system. The term also refers to the ability of a system to support more than one processor or the ability to allocate tasks between them. There a ...
computer architectures that provide parallelism become the dominant computing platform (through the proliferation of
multi-core A multi-core processor is a microprocessor on a single integrated circuit with two or more separate processing units, called cores, each of which reads and executes program instructions. The instructions are ordinary CPU instructions (such a ...
processors), the term has come to stand mainly for data structures that can be accessed by multiple threads which may actually access the data simultaneously because they run on different processors that communicate with one another. The concurrent data structure (sometimes also called a ''shared data structure'') is usually considered to reside in an abstract storage environment called
shared memory In computer science, shared memory is memory that may be simultaneously accessed by multiple programs with an intent to provide communication among them or avoid redundant copies. Shared memory is an efficient means of passing data between prog ...
, though this memory may be physically implemented as either a "tightly coupled" or a distributed collection of storage modules.


Basic principles

Concurrent data structures, intended for use in parallel or distributed computing environments, differ from "sequential" data structures, intended for use on a uni-processor machine, in several ways. Most notably, in a sequential environment one specifies the data structure's properties and checks that they are implemented correctly, by providing safety properties. In a concurrent environment, the specification must also describe liveness properties which an implementation must provide. Safety properties usually state that something bad never happens, while liveness properties state that something good keeps happening. These properties can be expressed, for example, using
Linear Temporal Logic In logic, linear temporal logic or linear-time temporal logic (LTL) is a modal temporal logic with modalities referring to time. In LTL, one can encode formulae about the future of paths, e.g., a condition will eventually be true, a condition will ...
. The type of liveness requirements tend to define the data structure. The
method Method ( grc, μέθοδος, methodos) literally means a pursuit of knowledge, investigation, mode of prosecuting such inquiry, or system. In recent centuries it more often means a prescribed process for completing a task. It may refer to: *Scien ...
calls can be blocking or non-blocking. Data structures are not restricted to one type or the other, and can allow combinations where some method calls are blocking and others are non-blocking (examples can be found in the
Java concurrency The Java programming language and the Java virtual machine (JVM) have been designed to support concurrent programming, and all execution takes place in the context of threads. Objects and resources can be accessed by many separate threads; each t ...
software library). The safety properties of concurrent data structures must capture their behavior given the many possible interleavings of methods called by different threads. It is quite intuitive to specify how abstract data structures behave in a sequential setting in which there are no interleavings. Therefore, many mainstream approaches for arguing the safety properties of a concurrent data structure (such as
serializability In concurrency control of databases, Philip A. Bernstein, Vassos Hadzilacos, Nathan Goodman (1987)''Concurrency Control and Recovery in Database Systems''(free PDF download), Addison Wesley Publishing Company, Gerhard Weikum, Gottfried Vossen (200 ...
,
linearizability In concurrent programming, an operation (or set of operations) is linearizable if it consists of an ordered list of invocation and response events ( event), that may be extended by adding response events such that: # The extended list can be re- ...
,
sequential consistency Sequential consistency is a consistency model used in the domain of concurrent computing (e.g. in distributed shared memory, distributed transactions, etc.). It is the property that "... the result of any execution is the same as if the operation ...
, and quiescent consistency) specify the structures properties sequentially, and map its concurrent executions to a collection of sequential ones. To guarantee the safety and liveness properties, concurrent data structures must typically (though not always) allow threads to reach consensus as to the results of their simultaneous data access and modification requests. To support such agreement, concurrent data structures are implemented using special primitive synchronization operations (see synchronization primitives) available on modern multiprocessor machines that allow multiple threads to reach consensus. This consensus can be achieved in a blocking manner by using locks, or without locks, in which case it is non-blocking. There is a wide body of theory on the design of concurrent data structures (see bibliographical references).


Design and implementation

Concurrent data structures are significantly more difficult to design and to verify as being correct than their sequential counterparts. The primary source of this additional difficulty is concurrency, exacerbated by the fact that threads must be thought of as being completely asynchronous: they are subject to operating system preemption, page faults, interrupts, and so on. On today's machines, the layout of processors and memory, the layout of data in memory, the communication load on the various elements of the multiprocessor architecture all influence performance. Furthermore, there is a tension between correctness and performance: algorithmic enhancements that seek to improve performance often make it more difficult to design and verify a correct data structure implementation. A key measure for performance is scalability, captured by the
speedup In computer architecture, speedup is a number that measures the relative performance of two systems processing the same problem. More technically, it is the improvement in speed of execution of a task executed on two similar architectures with ...
of the implementation. Speedup is a measure of how effectively the application is using the machine it is running on. On a machine with P processors, the speedup is the ratio of the structures execution time on a single processor to its execution time on P processors. Ideally, we want linear speedup: we would like to achieve a speedup of P when using P processors. Data structures whose speedup grows with P are called scalable. The extent to which one can scale the performance of a concurrent data structure is captured by a formula known as
Amdahl's law In computer architecture, Amdahl's law (or Amdahl's argument) is a formula which gives the theoretical speedup in latency of the execution of a task at fixed workload that can be expected of a system whose resources are improved. It states that ...
and more refined versions of it such as
Gustafson's law In computer architecture, Gustafson's law (or Gustafson–Barsis's law) gives the speedup in the execution time of a task that theoretically gains from parallel computing, using a hypothetical run of ''the task'' on a single-core machine as the ba ...
. A key issue with the performance of concurrent data structures is the level of memory contention: the overhead in traffic to and from memory as a result of multiple threads concurrently attempting to access the same locations in memory. This issue is most acute with blocking implementations in which locks control access to memory. In order to acquire a lock, a thread must repeatedly attempt to modify that location. On a cache-coherent multiprocessor (one in which processors have local caches that are updated by hardware to keep them consistent with the latest values stored) this results in long waiting times for each attempt to modify the location, and is exacerbated by the additional memory traffic associated with unsuccessful attempts to acquire the lock.


See also

*
Java concurrency The Java programming language and the Java virtual machine (JVM) have been designed to support concurrent programming, and all execution takes place in the context of threads. Objects and resources can be accessed by many separate threads; each t ...
(JSR 166) *
Java ConcurrentMap The Java programming language's Java Collections Framework version 1.5 and later defines and implements the original regular single-threaded Maps, and also new thread-safe Maps implementing the interface among other concurrent interfaces. In Java ...


References


Further reading

*
Nancy Lynch Nancy Ann Lynch (born January 19, 1948) is a mathematician, a theorist, and a professor at the Massachusetts Institute of Technology. She is the NEC Professor of Software Science and Engineering in the EECS department and heads the "Theory of ...
"Distributed Computing" * Hagit Attiya and Jennifer Welch "Distributed Computing: Fundamentals, Simulations And Advanced Topics, 2nd Ed" *
Doug Lea Douglas S. Lea is a professor of computer science and current head of the computer science department at State University of New York at Oswego, where he specializes in concurrent programming and the design of concurrent data structures. He was ...
, "Concurrent Programming in Java: Design Principles and Patterns" *
Maurice Herlihy Maurice Peter Herlihy (born 4 January 1954) is a computer scientist active in the field of multiprocessor synchronization. Herlihy has contributed to areas including theoretical foundations of wait-free synchronization, linearizable data structure ...
and Nir Shavit, "The Art of Multiprocessor Programming" * Mattson, Sanders, and Massingil "Patterns for Parallel Programming"


External links


Multithreaded data structures for parallel computing, Part 1
(Designing concurrent data structures) by Arpan Sen

(Designing concurrent data structures without mutexes) by Arpan Sen
libcds
– C++ library of lock-free containers and safe memory reclamation schema
Synchrobench
– C/C++ and Java libraries and benchmarks of lock-free, lock-based, TM-based and RCU/COW-based data structures. {{DEFAULTSORT:Concurrent Data Structure Distributed data structures