Master-slave Replication
   HOME

TheInfoList



OR:

Replication in
computing Computing is any goal-oriented activity requiring, benefiting from, or creating computer, computing machinery. It includes the study and experimentation of algorithmic processes, and the development of both computer hardware, hardware and softw ...
refers to maintaining multiple copies of data, processes, or resources to ensure consistency across redundant components. This fundamental technique spans
databases In computing, a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and ana ...
, file systems, and
distributed systems Distributed computing is a field of computer science that studies distributed systems, defined as computer systems whose inter-communicating components are located on different computer network, networked computers. The components of a distribu ...
, serving to improve
availability In reliability engineering, the term availability has the following meanings: * The degree to which a system, subsystem or equipment is in a specified operable and committable state at the start of a mission, when the mission is called for at ...
,
fault-tolerance Fault tolerance is the ability of a system to maintain proper operation despite failures or faults in one or more of its components. This capability is essential for high-availability, mission critical, mission-critical, or even life-critical sys ...
, accessibility, and performance. Through replication, systems can continue operating when components fail (
failover Failover is switching to a redundant or standby computer server, system, hardware component or network upon the failure or abnormal termination of the previously active application, server, system, hardware component, or network in a computer ...
), serve requests from geographically distributed locations, and balance load across multiple machines. The challenge lies in maintaining consistency between replicas while managing the fundamental tradeoffs between data consistency, system availability, and network partition tolerance – constraints known as the
CAP theorem In database theory, the CAP theorem, also named Brewer's theorem after computer scientist Eric Brewer (scientist), Eric Brewer, states that any distributed data store can provide at most Inconsistent triad, two of the following three guarantees: ; ...
.


Terminology

Replication in computing can refer to: * ''Data replication'', where the same data is stored on multiple storage devices * ''Computation replication'', where the same computing task is executed many times. Computational tasks may be: ** ''Replicated in space'', where tasks are executed on separate devices ** ''Replicated in time'', where tasks are executed repeatedly on a single device Replication in space or in time is often linked to scheduling algorithms. Access to a replicated entity is typically uniform with access to a single non-replicated entity. The replication itself should be transparent to an external user. In a failure scenario, a
failover Failover is switching to a redundant or standby computer server, system, hardware component or network upon the failure or abnormal termination of the previously active application, server, system, hardware component, or network in a computer ...
of replicas should be hidden as much as possible with respect to
quality of service Quality of service (QoS) is the description or measurement of the overall performance of a service, such as a telephony or computer network, or a cloud computing service, particularly the performance seen by the users of the network. To quantitat ...
. Computer scientists further describe replication as being either: * Active replication, which is performed by processing the same request at every replica * Passive replication, which involves processing every request on a single replica and transferring the result to the other replicas When one leader replica is designated via
leader election In distributed computing, leader election is the process of designating a single Process (computing), process as the organizer of some task distributed among several computers (nodes). Before the task has begun, all network nodes are either unawa ...
to process all the requests, the system is using a primary-backup or primary-replica scheme, which is predominant in
high-availability cluster In computing, high-availability clusters (HA clusters) or fail-over clusters are groups of computers that support server applications that can be reliably utilized with a minimum amount of down-time. They operate by using high availability sof ...
s. In comparison, if any replica can process a request and distribute a new state, the system is using a multi-primary or multi-master scheme. In the latter case, some form of distributed concurrency control must be used, such as a
distributed lock manager A distributed lock manager (DLM) runs in every machine in a cluster, with an identical copy of a cluster-wide lock database. Operating systems use lock managers to organise and serialise the access to resources. In this way a DLM provides software ...
. Load balancing differs from task replication, since it distributes a load of different computations across machines, and allows a single computation to be dropped in case of failure. Load balancing, however, sometimes uses data replication (especially
multi-master replication Multi-master replication is a method of database replication which allows data to be stored by a group of computers, and updated by any member of the group. All members are responsive to client data queries. The multi-master replication system i ...
) internally, to distribute its data among machines.
Backup In information technology, a backup, or data backup is a copy of computer data taken and stored elsewhere so that it may be used to restore the original after a data loss event. The verb form, referring to the process of doing so, is "wikt:back ...
differs from replication in that the saved copy of data remains unchanged for a long period of time. Replicas, on the other hand, undergo frequent updates and quickly lose any historical state. Replication is one of the oldest and most important topics in the overall area of
distributed systems Distributed computing is a field of computer science that studies distributed systems, defined as computer systems whose inter-communicating components are located on different computer network, networked computers. The components of a distribu ...
. Data replication and computation replication both require processes to handle incoming events. Processes for data replication are passive and operate only to maintain the stored data, reply to read requests and apply updates. Computation replication is usually performed to provide fault-tolerance, and take over an operation if one component fails. In both cases, the underlying needs are to ensure that the replicas see the same events in equivalent orders, so that they stay in consistent states and any replica can respond to queries.


Replication models in distributed systems

Three widely cited models exist for data replication, each having its own properties and performance: * Transactional replication: used for replicating
transactional data In data management, dynamic data or transactional data is information that is periodically updated, meaning it changes asynchronously over time as new information becomes available. The concept is important in data management, since the time sca ...
, such as a database. The
one-copy serializability In the fields of databases and transaction processing (transaction management), a schedule (or history) of a system is an abstract model to describe the order of executions in a set of transactions running in the system. Often it is a ''list'' o ...
model is employed, which defines valid outcomes of a transaction on replicated data in accordance with the overall
ACID An acid is a molecule or ion capable of either donating a proton (i.e. Hydron, hydrogen cation, H+), known as a Brønsted–Lowry acid–base theory, Brønsted–Lowry acid, or forming a covalent bond with an electron pair, known as a Lewis ...
(atomicity, consistency, isolation, durability) properties that transactional systems seek to guarantee. *
State machine replication In computer science, state machine replication (SMR) or state machine approach is a general method for implementing a fault-tolerant service by replicating servers and coordinating client interactions with server replicas. The approach also provid ...
: assumes that the replicated process is a
deterministic finite automaton In the theory of computation, a branch of theoretical computer science, a deterministic finite automaton (DFA)—also known as deterministic finite acceptor (DFA), deterministic finite-state machine (DFSM), or deterministic finite-state auto ...
and that
atomic broadcast In fault-tolerant distributed computing, an atomic broadcast or total order broadcast is a broadcast where all correct processes in a system of multiple processes receive the same set of messages in the same order; that is, the same sequence of me ...
of every event is possible. It is based on distributed consensus and has a great deal in common with the transactional replication model. This is sometimes mistakenly used as a synonym of active replication. State machine replication is usually implemented by a replicated log consisting of multiple subsequent rounds of the
Paxos algorithm Paxos is a family of protocols for solving Consensus (computer science), consensus in a network of unreliable or fallible processors. Consensus is the process of agreeing on one result among a group of participants. This problem becomes difficult ...
. This was popularized by Google's Chubby system, and is the core behind the open-source Keyspace data store. *
Virtual synchrony Reliable multicast is any computer networking protocol that provides a '' reliable'' sequence of packets to multiple recipients simultaneously, making it suitable for applications such as multi-receiver file transfer. Overview Multicast is a ne ...
: involves a group of processes which cooperate to replicate in-memory data or to coordinate actions. The model defines a distributed entity called a ''process group''. A process can join a group and is provided with a checkpoint containing the current state of the data replicated by group members. Processes can then send
multicast In computer networking, multicast is a type of group communication where data transmission is addressed to a group of destination computers simultaneously. Multicast can be one-to-many or many-to-many distribution. Multicast differs from ph ...
s to the group and will see incoming multicasts in the identical order. Membership changes are handled as a special multicast that delivers a new "membership view" to the processes in the group.


Database replication

Database In computing, a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and a ...
replication involves maintaining copies of the same data on multiple machines, typically implemented through three main approaches: single-leader, multi-leader, and leaderless replication. In single-leader (also called primary/replica) replication, one database instance is designated as the leader (primary), which handles all write operations. The leader logs these updates, which then propagate to replica nodes. Each replica acknowledges receipt of updates, enabling subsequent write operations. Replicas primarily serve read requests, though they may serve stale data due to replication lag – the delay in propagating changes from the leader. In
multi-master replication Multi-master replication is a method of database replication which allows data to be stored by a group of computers, and updated by any member of the group. All members are responsive to client data queries. The multi-master replication system i ...
(also called multi-leader), updates can be submitted to any database node, which then propagate to other servers. This approach is particularly beneficial in multi-data center deployments, where it enables local write processing while masking inter-data center network latency. However, it introduces substantially increased costs and complexity which may make it impractical in some situations. The most common challenge that exists in multi-master replication is transactional conflict prevention or resolution when concurrent modifications occur on different leader nodes. Most synchronous (or eager) replication solutions perform conflict prevention, while asynchronous (or lazy) solutions have to perform conflict resolution. For instance, if the same record is changed on two nodes simultaneously, an eager replication system would detect the conflict before confirming the commit and abort one of the transactions. A
lazy replication Lazy is the adjective for laziness, a lack of desire to expend effort. It may also refer to: Music Groups and musicians * Lazy (band), a Japanese rock band * Lazy Lester, American blues harmonica player Leslie Johnson (1933–2018) * Lazy Bill L ...
system would allow both transactions to commit and run a conflict resolution during re-synchronization. Conflict resolution methods can include techniques like last-write-wins, application-specific logic, or merging concurrent updates. However, replication transparency can not always be achieved. When data is replicated in a database, they will be constrained by
CAP theorem In database theory, the CAP theorem, also named Brewer's theorem after computer scientist Eric Brewer (scientist), Eric Brewer, states that any distributed data store can provide at most Inconsistent triad, two of the following three guarantees: ; ...
or PACELC theorem. In the NoSQL movement, data consistency is usually sacrificed in exchange for other more desired properties, such as availability (A), partition tolerance (P), etc. Various data consistency models have also been developed to serve as Service Level Agreement (SLA) between service providers and the users. There are several techniques for replicating data changes between nodes: * Statement-based replication: Write requests (such as SQL statements) are logged and transmitted to replicas for execution. This can be problematic with non-deterministic functions or statements having side effects. * Write-ahead log (WAL) shipping: The storage engine's low-level write-ahead log is replicated, ensuring identical data structures across nodes. * Logical (row-based) replication: Changes are described at the row level using a dedicated log format, providing greater flexibility and independence from storage engine internals.


Disk storage replication

Active (real-time) storage replication is usually implemented by distributing updates of a
block device In Unix-like operating systems, a device file, device node, or special file is an interface to a device driver that appears in a file system as if it were an ordinary file. There are also special files in DOS, OS/2, and Windows. These spec ...
to several physical
hard disk A hard disk drive (HDD), hard disk, hard drive, or fixed disk is an electro-mechanical data storage device that stores and retrieves digital data using magnetic storage with one or more rigid rapidly rotating hard disk drive platter, pla ...
s. This way, any file system supported by the
operating system An operating system (OS) is system software that manages computer hardware and software resources, and provides common daemon (computing), services for computer programs. Time-sharing operating systems scheduler (computing), schedule tasks for ...
can be replicated without modification, as the file system code works on a level above the block device driver layer. It is implemented either in hardware (in a
disk array controller A disk array controller is a device that manages the physical disk drives and presents them to the computer as logical units. It often implements hardware RAID, thus it is sometimes referred to as RAID controller. It also often provides additio ...
) or in software (in a
device driver In the context of an operating system, a device driver is a computer program that operates or controls a particular type of device that is attached to a computer or automaton. A driver provides a software interface to hardware devices, enabli ...
). The most basic method is
disk mirroring In Data storage device, data storage, disk mirroring is the Replication (computing), replication of logical disk volumes onto separate physical hard disks in Real-time computing, real time to ensure continuous availability. It is most commonly u ...
, which is typical for locally connected disks. The storage industry narrows the definitions, so ''mirroring'' is a local (short-distance) operation. A replication is extendable across a
computer network A computer network is a collection of communicating computers and other devices, such as printers and smart phones. In order to communicate, the computers and devices must be connected by wired media like copper cables, optical fibers, or b ...
, so that the disks can be located in physically distant locations, and the primary/replica database replication model is usually applied. The purpose of replication is to prevent damage from failures or
disaster A disaster is an event that causes serious harm to people, buildings, economies, or the environment, and the affected community cannot handle it alone. '' Natural disasters'' like avalanches, floods, earthquakes, and wildfires are caused by na ...
s that may occur in one location – or in case such events do occur, to improve the ability to recover data. For replication, latency is the key factor because it determines either how far apart the sites can be or the type of replication that can be employed. The main characteristic of such cross-site replication is how write operations are handled, through either asynchronous or synchronous replication; synchronous replication needs to wait for the destination server's response in any write operation whereas asynchronous replication does not.
Synchronous Synchronization is the coordination of events to operate a system in unison. For example, the conductor of an orchestra keeps the orchestra synchronized or ''in time''. Systems that operate with all parts in synchrony are said to be synchrono ...
replication guarantees "zero data loss" by the means of atomic write operations, where the write operation is not considered complete until acknowledged by both the local and remote storage. Most applications wait for a write transaction to complete before proceeding with further work, hence overall performance decreases considerably. Inherently, performance drops proportionally to distance, as minimum latency is dictated by the
speed of light The speed of light in vacuum, commonly denoted , is a universal physical constant exactly equal to ). It is exact because, by international agreement, a metre is defined as the length of the path travelled by light in vacuum during a time i ...
. For 10 km distance, the fastest possible roundtrip takes 67 μs, whereas an entire local cached write completes in about 10–20 μs. In
asynchronous Asynchrony is any dynamic far from synchronization. If and as parts of an asynchronous system become more synchronized, those parts or even the whole system can be said to be in sync. Asynchrony or asynchronous may refer to: Electronics and com ...
replication, the write operation is considered complete as soon as local storage acknowledges it. Remote storage is updated with a small lag. Performance is greatly increased, but in case of a local storage failure, the remote storage is not guaranteed to have the current copy of data (the most recent data may be lost). Semi-synchronous replication typically considers a write operation complete when acknowledged by local storage and received or logged by the remote server. The actual remote write is performed asynchronously, resulting in better performance but remote storage will lag behind the local storage, so that there is no guarantee of durability (i.e., seamless transparency) in the case of local storage failure. Point-in-time replication produces periodic snapshots which are replicated instead of primary storage. This is intended to replicate only the changed data instead of the entire volume. As less information is replicated using this method, replication can occur over less-expensive bandwidth links such as iSCSI or T1 instead of fiberoptic lines.


Implementations

Many
distributed filesystem A clustered file system (CFS) is a file system which is shared by being simultaneously Mount (computing), mounted on multiple Server (computing), servers. There are several approaches to computer cluster, clustering, most of which do not emplo ...
s use replication to ensure fault tolerance and avoid a single point of failure. Many commercial synchronous replication systems do not freeze when the remote replica fails or loses connection – behaviour which guarantees zero data loss – but proceed to operate locally, losing the desired zero recovery point objective. Techniques of wide-area network (WAN) optimization can be applied to address the limits imposed by latency.


File-based replication

File-based replication conducts data replication at the logical level (i.e., individual data files) rather than at the storage block level. There are many different ways of performing this, which almost exclusively rely on software.


Capture with a kernel driver

A
kernel driver In the context of an operating system, a device driver is a computer program that operates or controls a particular type of device that is attached to a computer or automaton. A driver provides a software interface to hardware devices, enabli ...
(specifically a filter driver) can be used to intercept calls to the filesystem functions, capturing any activity as it occurs. This uses the same type of technology that real-time active virus checkers employ. At this level, logical file operations are captured like file open, write, delete, etc. The kernel driver transmits these commands to another process, generally over a network to a different machine, which will mimic the operations of the source machine. Like block-level storage replication, the file-level replication allows both synchronous and asynchronous modes. In synchronous mode, write operations on the source machine are held and not allowed to occur until the destination machine has acknowledged the successful replication. Synchronous mode is less common with file replication products although a few solutions exist. File-level replication solutions allow for informed decisions about replication based on the location and type of the file. For example, temporary files or parts of a filesystem that hold no business value could be excluded. The data transmitted can also be more granular; if an application writes 100 bytes, only the 100 bytes are transmitted instead of a complete disk block (generally 4,096 bytes). This substantially reduces the amount of data sent from the source machine and the storage burden on the destination machine. Drawbacks of this software-only solution include the requirement for implementation and maintenance on the operating system level, and an increased burden on the machine's processing power.


File system journal replication

Similarly to database
transaction log In the field of databases in computer science, a transaction log (also transaction journal, database log, binary log or audit trail) is a history of actions executed by a database management system used to guarantee ACID properties over crashes ...
s, many file systems have the ability to
journal A journal, from the Old French ''journal'' (meaning "daily"), may refer to: *Bullet journal, a method of personal organization *Diary, a record of personal secretive thoughts and as open book to personal therapy or used to feel connected to onesel ...
their activity. The journal can be sent to another machine, either periodically or in real time by streaming. On the replica side, the journal can be used to play back file system modifications. One of the notable implementations is
Microsoft Microsoft Corporation is an American multinational corporation and technology company, technology conglomerate headquartered in Redmond, Washington. Founded in 1975, the company became influential in the History of personal computers#The ear ...
's System Center Data Protection Manager (DPM), released in 2005, which performs periodic updates but does not offer real-time replication.


Batch replication

This is the process of comparing the source and destination file systems and ensuring that the destination matches the source. The key benefit is that such solutions are generally free or inexpensive. The downside is that the process of synchronizing them is quite system-intensive, and consequently this process generally runs infrequently. One of the notable implementations is
rsync rsync (remote sync) is a utility for transferring and synchronizing files between a computer and a storage drive and across networked computers by comparing the modification times and sizes of files. It is commonly found on Unix-like opera ...
.


Replication within file

In a
paging In computer operating systems, memory paging is a memory management scheme that allows the physical Computer memory, memory used by a program to be non-contiguous. This also helps avoid the problem of memory fragmentation and requiring compact ...
operating system, pages in a paging file are sometimes replicated within a track to reduce rotational latency. In
IBM International Business Machines Corporation (using the trademark IBM), nicknamed Big Blue, is an American Multinational corporation, multinational technology company headquartered in Armonk, New York, and present in over 175 countries. It is ...
's VSAM, index data are sometimes replicated within a track to reduce rotational latency.


Distributed shared memory replication

Another example of using replication appears in
distributed shared memory In computer science, distributed shared memory (DSM) is a form of memory architecture where physically separated memories can be addressed as a single shared address space. The term "shared" does not mean that there is a single centralized memo ...
systems, where many nodes of the system share the same
page Page most commonly refers to: * Page (paper), one side of a leaf of paper, as in a book Page, PAGE, pages, or paging may also refer to: Roles * Page (assistance occupation), a professional occupation * Page (servant), traditionally a young m ...
of memory. This usually means that each node has a separate copy (replica) of this page.


Primary-backup and multi-primary replication

Many classical approaches to replication are based on a primary-backup model where one device or process has unilateral control over one or more other processes or devices. For example, the primary might perform some computation, streaming a log of updates to a backup (standby) process, which can then take over if the primary fails. This approach is common for replicating databases, despite the risk that if a portion of the log is lost during a failure, the backup might not be in a state identical to the primary, and transactions could then be lost. A weakness of primary-backup schemes is that only one is actually performing operations. Fault-tolerance is gained, but the identical backup system doubles the costs. For this reason, starting , the distributed systems research community began to explore alternative methods of replicating data. An outgrowth of this work was the emergence of schemes in which a group of replicas could cooperate, with each process acting as a backup while also handling a share of the workload. Computer scientist Jim Gray analyzed multi-primary replication schemes under the transactional model and published a widely cited paper skeptical of the approach "The Dangers of Replication and a Solution". He argued that unless the data splits in some natural way so that the database can be treated as ''n'' disjoint sub-databases, concurrency control conflicts will result in seriously degraded performance and the group of replicas will probably slow as a function of ''n''. Gray suggested that the most common approaches are likely to result in degradation that scales as ''O(n³)''. His solution, which is to partition the data, is only viable in situations where data actually has a natural partitioning key. In the 1985–1987, the
virtual synchrony Reliable multicast is any computer networking protocol that provides a '' reliable'' sequence of packets to multiple recipients simultaneously, making it suitable for applications such as multi-receiver file transfer. Overview Multicast is a ne ...
model was proposed and emerged as a widely adopted standard (it was used in the Isis Toolkit, Horus, Transis, Ensemble, Totem, Spread, C-Ensemble, Phoenix and Quicksilver systems, and is the basis for the
CORBA The Common Object Request Broker Architecture (CORBA) is a standard defined by the Object Management Group (OMG) designed to facilitate the communication of systems that are deployed on diverse platforms. CORBA enables collaboration between sy ...
fault-tolerant computing standard). Virtual synchrony permits a multi-primary approach in which a group of processes cooperates to parallelize some aspects of request processing. The scheme can only be used for some forms of in-memory data, but can provide linear speedups in the size of the group. A number of modern products support similar schemes. For example, the Spread Toolkit supports this same virtual synchrony model and can be used to implement a multi-primary replication scheme; it would also be possible to use C-Ensemble or Quicksilver in this manner.
WANdisco Cirata plc. (formerly known as WANdisco plc.) develops technology that moves large Internet of things, Internet of Things (IoT) datasets, edge data, and Hadoop. The company is dual-headquartered in Sheffield, England and San Ramon, California. H ...
permits active replication where every node on a network is an exact copy or replica and hence every node on the network is active at one time; this scheme is optimized for use in a
wide area network A wide area network (WAN) is a telecommunications network that extends over a large geographic area. Wide area networks are often established with leased telecommunication circuits. Businesses, as well as schools and government entities, use ...
(WAN). Modern multi-primary replication protocols optimize for the common failure-free operation. Chain replication is a  popular family of such protocols. State-of-the-art protocol variants of chain replication offer high throughput and strong consistency by arranging replicas in a chain for writes. This approach enables local reads on all replica nodes but has high latency for writes that must traverse multiple nodes sequentially. A more recent multi-primary protocol
Hermes
combines cache-coherent-inspired invalidations and logical timestamps to achieve strong consistency with local reads and high-performance writes from all replicas. During fault-free operation, its broadcast-based writes are non-conflicting and commit after just one multicast round-trip to replica nodes. This design results in high throughput and low latency for both reads and writes.


See also

*
Change data capture In databases, change data capture (CDC) is a set of software design patterns used to determine and track the data that has changed (the "deltas") so that action can be taken using the changed data. The result is a delta-driven dataset. CDC is an ...
*
Fault-tolerant computer system Fault tolerance is the ability of a system to maintain proper operation despite failures or faults in one or more of its components. This capability is essential for high-availability, mission critical, mission-critical, or even life-critical sys ...
*
Log shipping Log shipping is the process of automating the backup of transaction log files on a primary (production) database server, and then restoring them onto a standby server. This technique is supported by Microsoft SQL Server,Multi-master replication Multi-master replication is a method of database replication which allows data to be stored by a group of computers, and updated by any member of the group. All members are responsive to client data queries. The multi-master replication system i ...
* Optimistic replication *
Shard (data) A database shard, or simply a shard, is a horizontal partition of data in a database or search engine. Each shard may be held on a separate database server instance, to spread load. Some data in a database remains present in all shards, but som ...
*
State machine replication In computer science, state machine replication (SMR) or state machine approach is a general method for implementing a fault-tolerant service by replicating servers and coordinating client interactions with server replicas. The approach also provid ...
*
Virtual synchrony Reliable multicast is any computer networking protocol that provides a '' reliable'' sequence of packets to multiple recipients simultaneously, making it suitable for applications such as multi-receiver file transfer. Overview Multicast is a ne ...


References

{{Authority control Data synchronization Fault-tolerant computer systems Database management systems