Lockstep Memory
Lockstep systems are fault-tolerant computer systems that run the same set of operations at the same time in parallel. The redundancy (duplication) allows error detection and error correction: with at least two systems (dual modular redundancy, DMR), the outputs of the lockstep operations can be compared to determine whether a fault has occurred, and with at least three systems (triple modular redundancy, TMR), the error can be corrected automatically by majority vote. The term "lockstep" originates from army usage, where it refers to synchronized walking, in which marchers walk as closely together as physically practical. To run in lockstep, each system is set up to progress from one well-defined state to the next well-defined state. When a new set of inputs reaches the system, it processes them, generates new outputs and updates its state. This set of changes (new inputs, new outputs, new state) is considered to define that step, and must be treated as an atomic trans ...
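The majority vote described above can be illustrated with a minimal Python sketch. It is not taken from any real lockstep system: the `step` function, the fault-injection parameters and the replica count are all assumptions made for illustration. Three replicas advance through the same deterministic steps, and a voter picks the output that a strict majority agrees on.

```python
from collections import Counter


def step(state, value):
    """Hypothetical deterministic step: returns (new_state, output)."""
    new_state = state + value
    return new_state, new_state * 2


def vote(outputs):
    """Return the output held by a strict majority of replicas, else raise."""
    winner, count = Counter(outputs).most_common(1)[0]
    if 2 * count <= len(outputs):
        raise RuntimeError("no majority among replica outputs")
    return winner


def run_tmr(inputs, faulty_replica=None, faulty_step=None):
    """Run three replicas in lockstep and vote on the output of every step.

    faulty_replica and faulty_step are illustrative knobs that inject a
    single-bit fault into one replica's output at one step.
    """
    states = [0, 0, 0]
    voted = []
    for i, value in enumerate(inputs):
        outputs = []
        for r in range(3):
            states[r], out = step(states[r], value)
            if r == faulty_replica and i == faulty_step:
                out ^= 1  # transient fault in this replica's output only
            outputs.append(out)
        voted.append(vote(outputs))
    return voted
```

With only two replicas the same comparison could report a mismatch but could not tell which replica was wrong, which is why correction needs at least three.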

Fault-tolerant Computer System
Fault tolerance is the ability of a system to maintain proper operation despite failures or faults in one or more of its components. This capability is essential for high-availability, mission-critical, or even life-critical systems. Fault tolerance specifically refers to a system's capability to handle faults without any degradation or downtime. In the event of an error, end-users remain unaware of any issues. Conversely, a system that experiences errors with some interruption in service or graceful degradation of performance is termed 'resilient'. In resilience, the system adapts to the error, maintaining service but acknowledging a certain impact on performance. Typically, fault tolerance describes computer systems, ensuring the overall system remains functional despite hardware or software issues. Non-computing examples include structures that retain their integrity despite damage from fatigue, corrosion or impact. H ...

DIMM
A DIMM (Dual In-line Memory Module) is a popular type of memory module used in computers. It is a printed circuit board with one or both sides (front and back) holding DRAM chips and pins. The vast majority of DIMMs are manufactured in compliance with JEDEC memory standards, although there are proprietary DIMMs. DIMMs come in a variety of speeds and capacities, and are generally one of two lengths: a full-length module for PCs, and a laptop module (SO-DIMM) that is about half as long.
History
DIMMs were a 1990s upgrade for SIMMs (Single In-line Memory Modules) as Intel P5-based Pentium processors began to gain market share. The Pentium had a 64-bit bus width, which required SIMMs to be installed in matched pairs in order to populate the data bus. The processor would then access the two SIMMs in parallel. DIMMs were introduced to eliminate this disadvantage. The contacts on SIMMs on both sides are redundant, while DIMMs have separate electrical contacts o ...


VAXft
The VAXft was a family of fault-tolerant minicomputers developed and manufactured by Digital Equipment Corporation (DEC) using processors implementing the VAX instruction set architecture (ISA). "VAXft" stood for "Virtual Address Extension, fault tolerant". These systems ran the OpenVMS operating system, and were first supported by VMS 5.4. Two layered software products, VAXft System Services and VMS Volume Shadowing, were required to support the fault-tolerant features of the VAXft and for the redundancy of data stored on hard disk drives.
Architecture
All VAXft systems shared the same basic system architecture. A VAXft system consisted of two "zones" that operated in lock-step: "Zone A" and "Zone B". Each zone was a fully functional computer, capable of running an operating system, and was identical to the other in hardware configuration. Lock-step was achieved by hardware on the CPU module. The CPU module of each zone was connected to the other with a crosslink cable. The cr ...


Stratus VOS
Stratus VOS (Virtual Operating System) is a proprietary operating system running on Stratus Technologies fault-tolerant computer systems. VOS is available on Stratus's ftServer and Continuum platforms. VOS customers use it to support high-volume transaction processing applications which require continuous availability. VOS is notable for being one of the few operating systems which run on fully lockstepped hardware. During the 1980s, an IBM version of Stratus VOS existed and was called the System/88 Operating System.
History
VOS was designed from its inception as a high-security transaction-processing environment tailored to fault-tolerant hardware. It incorporates much of the design experience that came out of the MIT/Bell-Laboratories/General-Electric (later Honeywell) Multics project. In 1984, Stratus added a UNIX System V implementation called Unix System Facilities (USF) to VOS, integrating Unix and VOS at the kernel level. In recent years, Stratus has added POSIX-com ...

NonStop (server Computers)
NonStop is a series of server computers introduced to market in 1976 by Tandem Computers Inc., beginning with the NonStop product line. It was followed by the Tandem Integrity NonStop line of lock-step fault-tolerant computers, now defunct (not to be confused with the later and much different Hewlett-Packard Integrity product line extension). The original NonStop product line has been offered by Hewlett Packard Enterprise since Hewlett-Packard Company's split in 2015. Because NonStop systems are based on an integrated hardware/software stack, Tandem and later HPE also developed the NonStop OS operating system for them. NonStop systems are, to an extent, self-healing. To avoid single points of failure, almost all of their components are redundant. When a mainline component fails, the system automatically falls back to the backup. These systems can be used by banks, stock exchanges, payment applications, retail companies, energy and utility services, healt ...


Master-checker
Master-checker or master/checker is a hardware-supported fault tolerance architecture for multiprocessor systems, in which two processors, referred to as the master and the checker, calculate the same functions in parallel in order to increase the probability that the result is exact. The checker CPU is synchronised at clock level with the master CPU and processes the same programs as the master. Whenever the master CPU generates an output, the checker CPU compares this output to its own calculation and, in the event of a difference, raises a warning. The master-checker system generally gives more accurate answers by ensuring that the answer is correct before passing it on to the application requesting the algorithm being complet ...
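The comparison step described above can be sketched in a few lines of Python. This is a software analogy only, under the assumption that the two processors can be modelled as two callables; real master-checker pairs compare outputs in hardware at clock level. The names `master_checker`, `master` and `checker` are hypothetical.

```python
def master_checker(master, checker, x):
    """Run the same computation on both units and compare the results.

    Returns (master_result, agreed). A False flag stands in for the
    warning the checker raises when its own result differs.
    """
    master_result = master(x)
    checker_result = checker(x)
    return master_result, master_result == checker_result


def square(x):
    """Example computation run identically by both units."""
    return x * x


def faulty_square(x):
    """The same computation with a simulated single-bit output fault."""
    return (x * x) ^ 1
```

A mismatch only shows that the two units disagree. The pair cannot tell which of them is at fault, so the caller has to decide whether to retry, halt, or fail over.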




Common Mode Failure
Common and special causes are the two distinct origins of variation in a process, as defined in the statistical thinking and methods of Walter A. Shewhart and W. Edwards Deming. Briefly, "common causes", also called natural patterns, are the usual, historical, quantifiable variation in a system, while "special causes" are unusual, not previously observed, non-quantifiable variation. The distinction is fundamental in philosophy of statistics and philosophy of probability, with different treatment of these issues being a classic issue of probability interpretations, being recognised and discussed as early as 1703 by Gottfried Leibniz; various alternative names have been used over the years. The distinction has been particularly important in the thinking of economists Frank Knight, John Maynard Keynes and G. L. S. Shackle.
Origins and concepts
In 1703, Jacob Bernoulli wrote to Gottfried Leibniz to discuss their shared interest in applying mathematics and probability to games o ...

Hewlett-Packard
The Hewlett-Packard Company, commonly shortened to Hewlett-Packard or HP, was an American multinational information technology company. It was founded by Bill Hewlett and David Packard in 1939 in a one-car garage in Palo Alto, California, where the company would remain headquartered for the remainder of its lifetime; this HP Garage is now a designated landmark and marked with a plaque calling it the "Birthplace of 'Silicon Valley'". HP developed and provided a wide variety of hardware components, as well as software and related services, to consumers, small and medium-sized businesses (SMBs), and fairly large companies, including customers in government sectors, until the company officially split into Hewlett Packard Enterprise and HP Inc. in 2015. HP initially produced a line of electronic test and measurement equipment. It won its first big contract in 1938 to provide the HP 200B, a variation of its first product, the HP 200A low-distor ...

Intel
Intel Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California, and incorporated in Delaware. Intel designs, manufactures, and sells computer components such as central processing units (CPUs) and related products for business and consumer markets. It is one of the world's largest semiconductor chip manufacturers by revenue, and ranked in the Fortune 500 list of the largest United States corporations by revenue for nearly a decade, from the 2007 to 2016 fiscal years, until it was removed from the ranking in 2018. In 2020, it was reinstated and ranked 45th, being the 7th-largest technology company in the ranking. It was one of the first companies listed on Nasdaq. Intel supplies List of I ...

ECC Memory
Error correction code memory (ECC memory) is a type of computer data storage that uses an error correction code (ECC) to detect and correct n-bit data corruption which occurs in memory. Typically, ECC memory maintains a memory system immune to single-bit errors: the data that is read from each word is always the same as the data that had been written to it, even if one of the bits actually stored has been flipped to the wrong state. Most non-ECC memory cannot detect errors, although some non-ECC memory with parity support allows detection but not correction. ECC memory is used in most computers where data corruption cannot be tolerated, like industrial control applications, critical databases, and infrastructural memory caches.
Concept
Error correction codes protect against undetected data corruption and are used in computers where such corruption is unacceptable, examples being scientific and financial computing applications, or in database and file servers. ECC can a ...


Single Error Correction And Double Error Detection
In computer science and telecommunications, Hamming codes are a family of linear error-correcting codes. Hamming codes can detect one-bit and two-bit errors, or correct one-bit errors without detection of uncorrected errors. By contrast, the simple parity code cannot correct errors, and can detect only an odd number of bits in error. Hamming codes are perfect codes, that is, they achieve the highest possible rate for codes with their block length and minimum distance of three. Richard W. Hamming invented Hamming codes in 1950 as a way of automatically correcting errors introduced by punched card readers. In his original paper, Hamming elaborated his general idea, but specifically focused on the Hamming(7,4) code, which adds three parity bits to four bits of data. In mathematical terms, Hamming codes are a class of binary linear code. For each integer $r \geq 2$ there is a code with block length $n = 2^r - 1$ and message length $k = 2^r - r - 1$. Hence the rate of Hamming codes is $R = k/n = 1 - r/(2^r - 1)$, which is the highest po ...
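As an illustration of the Hamming(7,4) code mentioned above, and of the single-error-correction, double-error-detection (SECDED) variant named in this entry's title, here is a minimal Python sketch. The bit layout (parity bits at positions 1, 2 and 4, plus one extra overall parity bit appended at the end) is one common convention and is assumed here; the function names are illustrative, and real ECC memory uses wider codes implemented in hardware.

```python
def encode(d):
    """Encode 4 data bits as a 7-bit Hamming codeword (positions 1..7)."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def syndrome(c):
    """Return 0 if all parity checks pass, else the 1-based error position."""
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    return s1 + 2 * s2 + 4 * s4


def encode_secded(d):
    """Append an overall parity bit to the 7-bit codeword (8 bits total)."""
    c = encode(d)
    p0 = 0
    for bit in c:
        p0 ^= bit
    return c + [p0]


def decode_secded(w):
    """Return (data_bits, status) for an 8-bit SECDED word.

    status is "ok", "corrected" (one bit flipped and repaired) or
    "double-error" (two bits flipped: detected, not correctable).
    """
    c = list(w[:7])
    pos = syndrome(c)
    overall = w[7]
    for bit in c:
        overall ^= bit  # 0 if an even number of bits were flipped
    if pos == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        if pos:
            c[pos - 1] ^= 1  # repair the single flipped bit
        status = "corrected"  # pos == 0 means the overall parity bit itself flipped
    else:
        return None, "double-error"
    return [c[2], c[4], c[5], c[6]], status
```

The three-bit syndrome alone corrects any single flipped bit but would miscorrect a double error. The extra overall parity bit is what lets the decoder tell the two cases apart.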

Cache Line
A CPU cache is a hardware cache used by the central processing unit (CPU) of a computer to reduce the average cost (time or energy) to access data from the main memory. A cache is a smaller, faster memory, located closer to a processor core, which stores copies of the data from frequently used main memory locations. Most CPUs have a hierarchy of multiple cache levels (L1, L2, often L3, and rarely even L4), with different instruction-specific and data-specific caches at level 1. The cache memory is typically implemented with static random-access memory (SRAM), and in modern CPUs the caches make up by far the largest part of the chip area. SRAM is not always used for every level (of I- or D-cache), however: sometimes the later levels, or even all levels, are implemented with eDRAM. Other types of caches exist (that are not counted towards the "cache size" of the most important caches mentioned above), such as the translation lookaside buffer (TLB) which is part of the memory management unit (MMU) wh ...