In concurrent programming, an Array-Based Queuing Lock (ABQL) is a synchronization mechanism used to control access to shared resources and ensure fairness among competing threads. It is a variation of the ticket lock algorithm. Traditional locking mechanisms often involve threads contending for a single lock variable (a shared data element used to control access). In contrast, an ABQL uses an array to represent a queue of waiting threads. Each thread, upon attempting to acquire the lock, is assigned a unique position, or "slot", in this array. This approach significantly improves scalability and fairness because threads spin on their individual array slots rather than on a single shared location, reducing contention and ensuring a first-come, first-served acquisition order. This distributed spinning also minimizes cache coherency traffic (the communication required to keep data consistent across multiple processor cores' caches), which further enhances performance, especially in multi-core and multi-processor systems.
Overview
Synchronization is a major issue in the design and programming of shared memory multiprocessors. A common problem with lock implementations is high network contention caused by processors spinning on a shared synchronization flag or memory location; as the number of contending processors grows, the scalability of the lock degrades significantly.
The Array Based Queuing Lock is an extension of the ticket lock algorithm which ensures that, on a lock release, only one processor attempts to acquire the lock, decreasing the number of cache misses. This effect is achieved by having each processor spin on a unique memory location.
One analogy used to explain the lock mechanism is a relay race, where each athlete passes the baton to the next athlete in the queue, ensuring that only one athlete holds the baton at a time.
ABQL also guarantees fairness in lock acquisition by using a first in, first out (FIFO) queue-based mechanism. Additionally, the amount of invalidation traffic is significantly lower than in ticket-based lock implementations, since only one processor incurs a cache miss on each lock release.
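The contrast with a plain ticket lock can be sketched in C11 atomics: in a ticket lock every waiter polls the same now_serving counter, so each release invalidates every waiter's cached copy of that line, which is exactly the traffic the per-thread slots of ABQL avoid. The type and function names below are illustrative, not taken from any particular library.

```c
#include <stdatomic.h>

typedef struct {
    atomic_int next_ticket;  /* ticket dispenser */
    atomic_int now_serving;  /* single shared location ALL waiters poll */
} ticket_lock_t;

void ticket_acquire(ticket_lock_t *l) {
    int my = atomic_fetch_add(&l->next_ticket, 1);
    while (atomic_load(&l->now_serving) != my)
        ;  /* every waiter spins on the same variable */
}

void ticket_release(ticket_lock_t *l) {
    /* This single write invalidates the cached copy held by every waiter. */
    atomic_fetch_add(&l->now_serving, 1);
}
```

An ABQL replaces the shared now_serving counter with one flag per queue slot, so a release touches only the next waiter's location.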
Implementation
The foremost requirement of an array-based queuing lock implementation is that all threads spin on unique memory locations. This is achieved with an array whose length equals the number of threads contending for the lock. All elements of the array are initialized to 0, except the first element, which takes the value 1, ensuring successful lock acquisition by the first thread in the queue. On a lock release, the lock is passed to the next thread in the queue by setting the next element of the array to 1. Requests are granted to the threads in FIFO order.
A pseudocode example is listed below:
ABQL_init(int *next_ticket, int *can_serve)
{
    *next_ticket = 0;
    // only the first slot starts at 1, so the first ticket holder can proceed
    can_serve[0] = 1;
    for (int i = 1; i < MAX_THREADS; i++)
        can_serve[i] = 0;
}

ABQL_acquire(int *next_ticket, int *can_serve)
{
    *my_ticket = fetch_and_inc(next_ticket);
    // spin until this thread's slot is set by the previous lock holder
    while (can_serve[*my_ticket] != 1) ;
}

ABQL_release(int *can_serve)
{
    // pass the lock to the next thread in the queue
    can_serve[*my_ticket + 1] = 1;
    can_serve[*my_ticket] = 0;  // reset this slot for future use
}
The pseudocode above introduces three variables, namely can_serve, next_ticket and my_ticket. The roles of each are described below:
* ''can_serve'' is the array whose unique memory locations the threads waiting in the queue for lock acquisition spin on.
* ''next_ticket'' represents the next available ticket number to be assigned to a new thread.
* ''my_ticket'' represents each thread's unique ticket number, i.e. its slot in the queue.
In the initialization method (ABQL_init), the variable ''next_ticket'' is initialized to 0. All elements of the ''can_serve'' array except the first are initialized to 0; initializing the first element to 1 ensures successful lock acquisition by the first thread in the queue.
The acquire method uses an atomic fetch_and_inc operation to fetch the next available ticket number (the counter is then incremented by 1), which the new thread uses to decide which array element to spin on. Each thread in the queue spins on its own location until can_serve[my_ticket] is set to 1 by the previous lock holder. On acquiring the lock, the thread enters the critical section of the code.
On release of the lock, a thread passes control to the next thread by setting the next element of the array can_serve to 1. The next thread waiting to acquire the lock can now do so successfully.
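The init/acquire/release steps above can be sketched as a self-contained C11 implementation with <stdatomic.h>. The type and function names, the fixed MAX_THREADS bound, and the modulo wrap-around of tickets are illustrative assumptions, not a canonical implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_THREADS 4  /* assumed maximum number of contending threads */

typedef struct {
    atomic_int  next_ticket;             /* next ticket to hand out */
    atomic_bool can_serve[MAX_THREADS];  /* one spin flag per queue slot */
} abql_t;

void abql_init(abql_t *lock) {
    atomic_store(&lock->next_ticket, 0);
    for (int i = 1; i < MAX_THREADS; i++)
        atomic_store(&lock->can_serve[i], false);
    atomic_store(&lock->can_serve[0], true);  /* first ticket may proceed */
}

/* Returns the caller's slot, which must be passed back to abql_release. */
int abql_acquire(abql_t *lock) {
    int my_ticket = atomic_fetch_add(&lock->next_ticket, 1) % MAX_THREADS;
    while (!atomic_load(&lock->can_serve[my_ticket]))
        ;  /* spin on a private slot, not on a shared variable */
    return my_ticket;
}

void abql_release(abql_t *lock, int my_ticket) {
    /* Reset this slot first, then hand the lock to the next slot. */
    atomic_store(&lock->can_serve[my_ticket], false);
    atomic_store(&lock->can_serve[(my_ticket + 1) % MAX_THREADS], true);
}
```

Resetting the caller's own slot before setting the next one keeps a ticket that has wrapped around the array from observing a stale 1 in its slot.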
The working of ABQL is depicted in the table below, assuming 4 processors contending to enter the critical section and assuming that each thread enters the critical section only once.
{| class="wikitable"
!Execution Steps
! style="font-family:monospace" |next_ticket
! style="font-family:monospace" |can_serve
! style="font-family:monospace" |my_ticket (P1)
! style="font-family:monospace" |my_ticket (P2)
! style="font-family:monospace" |my_ticket (P3)
! style="font-family:monospace" |my_ticket (P4)
!Comments
|-
|initially
|0
|[1, 0, 0, 0]
|0
|0
|0
|0
|after ABQL_init, all variables are 0 except can_serve[0], which is 1
|-
| style="font-family:monospace;" |P1: fetch_and_inc
|1
|[1, 0, 0, 0]
|0
|0
|0
|0
|P1 attempts and successfully acquires the lock
|-
| style="font-family:monospace;" |P2: fetch_and_inc
|2
|[1, 0, 0, 0]
|0
|1
|0
|0
|P2 attempts to acquire the lock
|-
| style="font-family:monospace;" |P3: fetch_and_inc
|3
|[1, 0, 0, 0]
|0
|1
|2
|0
|P3 attempts to acquire the lock
|-
| style="font-family:monospace;" |P4: fetch_and_inc
|4
|[1, 0, 0, 0]
|0
|1
|2
|3
|P4 attempts to acquire the lock
|-
| style="font-family:monospace;" |P1: can_serve[1] = 1; can_serve[0] = 0
|4
|[0, 1, 0, 0]
|0
|1
|2
|3
|P1 releases the lock and P2 successfully acquires the lock
|-
| style="font-family:monospace;" |P2: can_serve[2] = 1; can_serve[1] = 0
|4
|[0, 0, 1, 0]
|0
|1
|2
|3
|P2 releases the lock and P3 successfully acquires the lock
|-
| style="font-family:monospace;" |P3: can_serve[3] = 1; can_serve[2] = 0
|4
|[0, 0, 0, 1]
|0
|1
|2
|3
|P3 releases the lock and P4 successfully acquires the lock
|-
| style="font-family:monospace;" |P4: can_serve[3] = 0
|4
|[0, 0, 0, 0]
|0
|1
|2
|3
|P4 releases the lock
|}
Performance metrics
The following performance criteria can be used to analyse the lock implementations:
* Uncontended lock-acquisition latency - the time taken by a thread to acquire the lock when there is no contention between threads. Because ABQL executes relatively more instructions than other lock implementations, its uncontended lock-acquisition latency is high.
* Traffic - the number of bus transactions generated, which depends on the number of threads contending for the lock. On a lock release, only one cache block is invalidated, resulting in a single cache miss and therefore much less bus traffic.
* Fairness - the guarantee that all processors waiting to acquire the lock do so in the order in which their requests are issued. Because the threads wait in a queue, each spinning on an individual memory location, fairness is ensured.
* Storage - the amount of memory required by the lock mechanism. The storage requirement scales linearly with the number of threads, because the array ''can_serve'' grows with the thread count.
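One storage-related subtlety the metric above glosses over: if the ''can_serve'' flags are packed into a plain byte array, neighbouring slots share a cache line, and writing one thread's flag still invalidates the line another thread is spinning on (false sharing). A common remedy is to pad each slot to a full cache line. A minimal sketch, assuming a 64-byte line (the constant and names are illustrative):

```c
#include <stdatomic.h>

#define CACHE_LINE  64   /* assumed cache-line size; not queried from hardware */
#define MAX_THREADS 4

/* Align each flag to its own cache line so that writing one thread's flag
 * does not invalidate the line another thread is spinning on. */
typedef struct {
    _Alignas(CACHE_LINE) atomic_bool flag;
} padded_flag_t;

static padded_flag_t can_serve_slots[MAX_THREADS];
```

This multiplies the linear storage cost by the line size, trading memory for the reduction in invalidation traffic described above.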
Advantages
* ABQL offers improved scalability, as each lock release and acquisition triggers only one cache miss, so only a single cache block has to be reloaded.
* Fairness of lock acquisition is ensured by the queue, which makes threads acquire the lock in the order in which their requests are issued.
Disadvantages
* ABQL should not be used with threads that can be suspended (by sleeping or context switching), as any thread not immediately ready to acquire the lock increases the latency of all threads behind it waiting for the lock.
* Each lock requires an array with a number of entries equal to the number of processors, resulting in linear space complexity. This is generally wasteful as it is rare that all processors would contend for the same lock simultaneously.
See also
* Ticket lock
* Lock (computer science)
* Fetch-and-increment
* Atomic operation
* Shared memory multiprocessor
* Synchronization (computer science)