Acid (debugger)
   HOME

TheInfoList



OR:

In
computer science Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...
, ACID ( atomicity,
consistency In deductive logic, a consistent theory is one that does not lead to a logical contradiction. A theory T is consistent if there is no formula \varphi such that both \varphi and its negation \lnot\varphi are elements of the set of consequences ...
, isolation,
durability Durability is the ability of a physical product to remain functional, without requiring excessive maintenance or repair, when faced with the challenges of normal operation over its design lifetime. There are several measures of durability in us ...
) is a set of properties of
database transaction A database transaction symbolizes a unit of work, performed within a database management system (or similar system) against a database, that is treated in a coherent and reliable way independent of other transactions. A transaction generally rep ...
s intended to guarantee data validity despite errors, power failures, and other mishaps. In the context of
database In computing, a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and a ...
s, a sequence of database operations that satisfies the ACID properties (which can be perceived as a single logical operation on the data) is called a ''transaction''. For example, a transfer of funds from one bank account to another, even involving multiple changes such as debiting one account and crediting another, is a single transaction. In 1983,
Andreas Reuter Andreas Reuter (born October 31, 1949) is a German computer science professor and research manager. His research focuses on databases, transaction systems, and parallel and distributed computer systems. Reuter has been scientific and executive dir ...
and Theo Härder coined the acronym ''ACID'', building on earlier work by Jim Gray who named atomicity, consistency, and durability, but not isolation, when characterizing the transaction concept. These four properties are the major guarantees of the transaction paradigm, which has influenced many aspects of development in
database systems In computing, a database is an organized collection of Data (computing), data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, Application software, applications, and ...
. According to Gray and Reuter, the
IBM Information Management System The IBM Information Management System (IMS) is a joint hierarchical database model, hierarchical database and information management system that supports transaction processing. Development began in 1966 to keep track of the bill of materials for ...
supported ACID transactions as early as 1973 (although the acronym was created later). BASE stands for basically available, soft state, and eventually consistent: the acronym highlights that BASE is opposite of ACID, like their chemical equivalents. ACID databases prioritize
consistency In deductive logic, a consistent theory is one that does not lead to a logical contradiction. A theory T is consistent if there is no formula \varphi such that both \varphi and its negation \lnot\varphi are elements of the set of consequences ...
over availability — the whole transaction fails if an error occurs in any step within the transaction; in contrast, BASE databases prioritize availability over consistency: instead of failing the transaction, users can access inconsistent data temporarily: data consistency is achieved, but not immediately.


Characteristics

The characteristics of these four properties as defined by Reuter and Härder are as follows:


Atomicity

Transactions are often composed of multiple
statements Statement or statements may refer to: Common uses *Statement (computer science), the smallest standalone element of an imperative programming language * Statement (logic and semantics), declarative sentence that is either true or false *Statement, ...
. Atomicity guarantees that each transaction is treated as a single "unit", which either succeeds completely or fails completely: if any of the statements constituting a transaction fails to complete, the entire transaction fails and the database is left unchanged. An atomic system must guarantee atomicity in each and every situation, including power failures, errors, and crashes. A guarantee of atomicity prevents updates to the database from occurring only partially, which can cause greater problems than rejecting the whole series outright. As a consequence, the transaction cannot be observed to be in progress by another database client. At one moment in time, it has not yet happened, and at the next, it has already occurred in whole (or nothing happened if the transaction was cancelled in progress).


Consistency

Consistency In deductive logic, a consistent theory is one that does not lead to a logical contradiction. A theory T is consistent if there is no formula \varphi such that both \varphi and its negation \lnot\varphi are elements of the set of consequences ...
ensures that a transaction can only bring the database from one consistent state to another, preserving database invariants: any data written to the database must be valid according to all defined rules, including
constraints Constraint may refer to: * Constraint (computer-aided design), a demarcation of geometrical characteristics between two or more entities or solid modeling bodies * Constraint (mathematics), a condition of an optimization problem that the solution m ...
, cascades, triggers, and any combination thereof. This prevents database corruption by an illegal transaction. An example of a database invariant is
referential integrity Referential integrity is a property of data stating that all its references are valid. In the context of relational databases, it requires that if a value of one attribute (column) of a relation (table) references a value of another attribute (e ...
, which guarantees the
primary key In the relational model of databases, a primary key is a designated attribute (column) that can reliably identify and distinguish between each individual record in a table. The database creator can choose an existing unique attribute or combinati ...
foreign key A foreign key is a set of attributes in a table that refers to the primary key of another table, linking these two tables. In the context of relational databases, a foreign key is subject to an inclusion dependency constraint that the tuples ...
relationship.C. J. Date, "SQL and Relational Theory: How to Write Accurate SQL Code 2nd edition", ''O'reilly Media, Inc.'', 2012, p. 180.


Isolation

Transactions are often executed concurrently (e.g., multiple transactions reading and writing to a table at the same time). Isolation ensures that concurrent execution of transactions leaves the database in the same state that would have been obtained if the transactions were executed sequentially. Isolation is the main goal of
concurrency control In information technology and computer science, especially in the fields of computer programming, operating systems, multiprocessors, and databases, concurrency control ensures that correct results for concurrent operations are generated, whil ...
; depending on the isolation level used, the effects of an incomplete transaction might not be visible to other transactions.


Durability

Durability Durability is the ability of a physical product to remain functional, without requiring excessive maintenance or repair, when faced with the challenges of normal operation over its design lifetime. There are several measures of durability in us ...
guarantees that once a transaction has been committed, it will remain committed even in the case of a system failure (e.g., power outage or crash). This usually means that completed transactions (or their effects) are recorded in
non-volatile memory Non-volatile memory (NVM) or non-volatile storage is a type of computer memory that can retain stored information even after power is removed. In contrast, volatile memory needs constant power in order to retain data. Non-volatile memory typ ...
.


Examples

The following examples further illustrate the ACID properties. In these examples, the database table has two columns, A and B. An
integrity constraint Data integrity is the maintenance of, and the assurance of, data accuracy and consistency over its entire life-cycle. It is a critical aspect to the design, implementation, and usage of any system that stores, processes, or retrieves data. The ter ...
requires that the value in A and the value in B must sum to 100. The following
SQL Structured Query Language (SQL) (pronounced ''S-Q-L''; or alternatively as "sequel") is a domain-specific language used to manage data, especially in a relational database management system (RDBMS). It is particularly useful in handling s ...
code creates a table as described above:CREATE TABLE acidtest (A INTEGER, B INTEGER, CHECK (A + B = 100));


Atomicity

Atomicity is the guarantee that series of database operations in an atomic transaction will either all occur (a successful operation), or none will occur (an unsuccessful operation). The series of operations cannot be separated with only some of them being executed, which makes the series of operations "indivisible". A guarantee of atomicity prevents updates to the database from occurring only partially, which can cause greater problems than rejecting the whole series outright. In other words, atomicity means indivisibility and irreducibility. Alternatively, we may say that a logical transaction may be composed of several physical transactions. Unless and until all component physical transactions are executed, the logical transaction will not have occurred. An example of an atomic transaction is a monetary transfer from bank account A to account B. It consists of two operations, withdrawing the money from account A and depositing it to account B. We would not want to see the amount removed from account A before we are sure it has also been transferred into account B. Performing these operations in an atomic transaction ensures that the database remains in a consistent state, that is, money is neither debited nor credited if either of those two operations fails.


Consistency failure

Consistency is a very general term, which demands that the data must meet all validation rules. In the previous example, the validation is a requirement that . All validation rules must be checked to ensure consistency. Assume that a transaction attempts to subtract 10 from without altering . Because consistency is checked after each transaction, it is known that before the transaction begins. If the transaction removes 10 from successfully, atomicity will be achieved. However, a validation check will show that , which is inconsistent with the rules of the database. The entire transaction must be canceled and the affected rows rolled back to their pre-transaction state. If there had been other constraints, triggers, or cascades, every single change operation would have been checked in the same way as above before the transaction was committed. Similar issues may arise with other constraints. We may have required the data types of both and to be integers. If we were then to enter, say, the value 13.5 for , the transaction will be canceled, or the system may give rise to an alert in the form of a trigger (if/when the trigger has been written to this effect). Another example would be integrity constraints, which would not allow us to delete a row in one table whose primary key is referred to by at least one
foreign key A foreign key is a set of attributes in a table that refers to the primary key of another table, linking these two tables. In the context of relational databases, a foreign key is subject to an inclusion dependency constraint that the tuples ...
in other tables.


Isolation failure

To demonstrate isolation, we assume two transactions execute at the same time, each attempting to modify the same data. One of the two must wait until the other completes in order to maintain isolation. Consider two transactions: * T1 transfers 10 from A to B. * T2 transfers 20 from B to A. Combined, there are four actions: # T1 subtracts 10 from A. # T1 adds 10 to B. # T2 subtracts 20 from B. # T2 adds 20 to A. If these operations are performed in order, isolation is maintained, although T2 must wait. Consider what happens if T1 fails halfway through. The database eliminates T1's effects, and T2 sees only valid data. By interleaving the transactions, the actual order of actions might be: # T1 subtracts 10 from A. # T2 subtracts 20 from B. # T2 adds 20 to A. # T1 adds 10 to B. Again, consider what happens if T1 fails while modifying B in Step 4. By the time T1 fails, T2 has already modified A; it cannot be restored to the value it had before T1 without leaving an invalid database. This is known as a write-write contention, because two transactions attempted to write to the same data field. In a typical system, the problem would be resolved by reverting to the last known good state, canceling the failed transaction T1, and restarting the interrupted transaction T2 from the good state.


Durability failure

Consider a transaction that transfers 10 from A to B. First, it removes 10 from A, then it adds 10 to B. At this point, the user is told the transaction was a success. However, the changes are still queued in the
disk buffer In computer storage, a disk buffer (often ambiguously called a disk cache or a cache buffer) is the embedded memory in a hard disk drive (HDD) or solid-state drive (SSD) acting as a buffer (computer science), buffer between the rest of the com ...
waiting to be committed to disk. Power fails and the changes are lost, but the user assumes (understandably) that the changes persist.


Implementation

Processing a transaction often requires a sequence of operations that is subject to failure for a number of reasons. For instance, the system may have no room left on its disk drives, or it may have used up its allocated CPU time. There are two popular families of techniques:
write-ahead logging In computer science, write-ahead logging (WAL) is a family of techniques for providing atomicity and durability (two of the ACID properties) in database systems. A write ahead log is an append-only auxiliary disk-resident structure used for cras ...
and
shadow paging In computer science, shadow paging is a technique for providing atomicity and durability (two of the ACID properties) in database systems. A ''page'' in this context refers to a unit of physical storage (probably on a hard disk), typically of ...
. In both cases,
lock Lock(s) or Locked may refer to: Common meanings *Lock and key, a mechanical device used to secure items of importance *Lock (water navigation), a device for boats to transit between different levels of water, as in a canal Arts and entertainme ...
s must be acquired on all information to be updated, and depending on the level of isolation, possibly on all data that may be read as well. In write ahead logging, durability is guaranteed by writing the prospective change to a persistent log before changing the database. That allows the database to return to a consistent state in the event of a crash. In shadowing, updates are applied to a partial copy of the database, and the new copy is activated when the transaction commits.


Locking vs. multiversioning

Many databases rely upon locking to provide ACID capabilities. Locking means that the transaction marks the data that it accesses so that the DBMS knows not to allow other transactions to modify it until the first transaction succeeds or fails. The lock must always be acquired before processing data, including data that is read but not modified. Non-trivial transactions typically require a large number of locks, resulting in substantial overhead as well as blocking other transactions. For example, if user A is running a transaction that has to read a row of data that user B wants to modify, user B must wait until user A's transaction completes.
Two-phase locking In databases and transaction processing, two-phase locking (2PL) is a pessimistic concurrency control method that guarantees conflict-serializability. Philip A. Bernstein, Vassos Hadzilacos, Nathan Goodman (1987) ''Concurrency Control and Recov ...
is often applied to guarantee full isolation. An alternative to locking is
multiversion concurrency control Multiversion concurrency control (MCC or MVCC), is a non-locking concurrency control method commonly used by database management systems to provide concurrent access to the database and in programming languages to implement transactional memory. ...
, in which the database provides each reading transaction the prior, unmodified version of data that is being modified by another active transaction. This allows readers to operate without acquiring locks, i.e., writing transactions do not block reading transactions, and readers do not block writers. Going back to the example, when user A's transaction requests data that user B is modifying, the database provides A with the version of that data that existed when user B started his transaction. User A gets a consistent view of the database even if other users are changing data. One implementation, namely
snapshot isolation In databases, and transaction processing (transaction management), snapshot isolation is a guarantee that all reads made in a transaction will see a consistent snapshot of the database (in practice it reads the last committed values that existed a ...
, relaxes the isolation property.


Distributed transactions

Guaranteeing ACID properties in a
distributed transaction A distributed transaction operates within a distributed environment, typically involving multiple nodes across a network depending on the location of the data. A key aspect of distributed transactions is atomicity, which ensures that the transact ...
across a
distributed database A distributed database is a database in which data is stored across different physical locations. It may be stored in multiple computers located in the same physical location (e.g. a data centre); or maybe dispersed over a computer network, netwo ...
, where no single node is responsible for all data affecting a transaction, presents additional complications. Network connections might fail, or one node might successfully complete its part of the transaction and then be required to roll back its changes because of a failure on another node. The
two-phase commit protocol In transaction processing, databases, and computer networking, the two-phase commit protocol (2PC, ''tupac'') is a type of Atomic commit, atomic commitment protocol (ACP). It is a distributed algorithm that coordinates all the processes that pa ...
(not to be confused with
two-phase locking In databases and transaction processing, two-phase locking (2PL) is a pessimistic concurrency control method that guarantees conflict-serializability. Philip A. Bernstein, Vassos Hadzilacos, Nathan Goodman (1987) ''Concurrency Control and Recov ...
) provides atomicity for
distributed transaction A distributed transaction operates within a distributed environment, typically involving multiple nodes across a network depending on the location of the data. A key aspect of distributed transactions is atomicity, which ensures that the transact ...
s to ensure that each participant in the transaction agrees on whether the transaction should be committed or not. Briefly, in the first phase, one node (the coordinator) interrogates the other nodes (the participants), and only when all reply that they are prepared does the coordinator, in the second phase, formalize the transaction.


See also

*
Eventual consistency Eventual consistency is a consistency model used in distributed computing to achieve high availability. Put simply: if no new updates are made to a given data item, ''eventually'' all accesses to that item will return the last updated value. Eve ...
(BASE: basically available) *
CAP theorem In database theory, the CAP theorem, also named Brewer's theorem after computer scientist Eric Brewer (scientist), Eric Brewer, states that any distributed data store can provide at most Inconsistent triad, two of the following three guarantees: ; ...
*
Concurrency control In information technology and computer science, especially in the fields of computer programming, operating systems, multiprocessors, and databases, concurrency control ensures that correct results for concurrent operations are generated, whil ...
*
Java Transaction API The Jakarta Transactions (JTA; formerly Java Transaction API), one of the Jakarta EE APIs, enables distributed transactions to be done across multiple X/Open XA resources in a Java environment. JTA was a specification developed under the Java Com ...
*
Open Systems Interconnection The Open Systems Interconnection (OSI) model is a reference model developed by the International Organization for Standardization (ISO) that "provides a common basis for the coordination of standards development for the purpose of systems inter ...
* Transactional NTFS *
Two-phase commit protocol In transaction processing, databases, and computer networking, the two-phase commit protocol (2PC, ''tupac'') is a type of Atomic commit, atomic commitment protocol (ACP). It is a distributed algorithm that coordinates all the processes that pa ...
*
CRUD In computer programming, create, read, update, and delete (CRUD) are the four basic operations (actions) of persistent storage. CRUD is also sometimes used to describe user interface conventions that facilitate viewing, searching, and changing info ...


References

{{DEFAULTSORT:Acid Database management systems Transaction processing Concurrency control