Windows Hardware Error Architecture (WHEA) is an operating system hardware error handling mechanism introduced with Windows Vista SP1 and Windows Server 2008 as a successor to Machine Check Architecture (MCA) on previous versions of Windows. The architecture consists of several software components that interact with the hardware and firmware of a given platform to handle hardware error conditions and report them. Collectively, these components provide a generic means of discovering errors, a common error report format for those errors, a way of preserving error records, and an error event model based on Event Tracing for Windows (ETW).
WHEA "builds on the PCI Express Advanced Reporting to provide more detailed information about system errors and a common reporting structure."
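To illustrate what a common error report format makes possible, the following is a minimal sketch of decoding the leading fields of an error record header. It assumes the record follows the UEFI Common Platform Error Record (CPER) layout, which this article does not itself state; the function name `parse_record_prefix` and the sample bytes are hypothetical and serve only as an illustration.

```python
import struct

# Leading fields of a CPER-style record header (assumed layout), little-endian,
# no padding: signature (4 bytes), revision (u16), signature end (u32),
# section count (u16), error severity (u32), validation bits (u32),
# record length (u32).
_HEADER_PREFIX = struct.Struct("<4sHIHIII")

# Severity codes as assumed from the CPER definition.
SEVERITY_NAMES = {0: "recoverable", 1: "fatal", 2: "corrected", 3: "informational"}


def parse_record_prefix(buf: bytes) -> dict:
    """Decode the first 24 bytes of an error record header (hypothetical helper)."""
    if len(buf) < _HEADER_PREFIX.size:
        raise ValueError("buffer too short for a record header")
    sig, revision, sig_end, sections, severity, valid_bits, length = (
        _HEADER_PREFIX.unpack_from(buf)
    )
    if sig != b"CPER" or sig_end != 0xFFFFFFFF:
        raise ValueError("not a CPER-style record")
    return {
        "revision": revision,
        "section_count": sections,
        "severity": SEVERITY_NAMES.get(severity, "unknown"),
        "validation_bits": valid_bits,
        "length": length,
    }


# Synthetic sample bytes for illustration only, not a captured record.
sample = struct.pack("<4sHIHIII", b"CPER", 0x0100, 0xFFFFFFFF, 1, 2, 0, 128)
record = parse_record_prefix(sample)
```

Because every error source reports through the same header, a consumer can read the severity and section count before it knows which hardware component produced the record.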
WHEA allows third-party software to interact with the operating system and react to certain hardware events. For example, when a new CPU is added to a running system (a Windows Server feature known as Dynamic Hardware Partitioning), the hardware error component stack is notified that a new processor was installed.
Linux supports the ACPI Platform Error Interface (APEI), which was introduced in ACPI 4.0.
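As a rough way to see whether a Linux machine's firmware exposes APEI, one can look for the ACPI tables that APEI defines: HEST, BERT, ERST and EINJ. The sketch below assumes the kernel exposes ACPI tables under `/sys/firmware/acpi/tables`; the helper name `find_apei_tables` is hypothetical, and the presence of a table does not by itself mean the kernel's APEI support is enabled.

```python
from pathlib import Path

# ACPI tables defined by APEI: Hardware Error Source Table, Boot Error Record
# Table, Error Record Serialization Table, and Error Injection Table.
APEI_TABLES = ("HEST", "BERT", "ERST", "EINJ")


def find_apei_tables(tables_dir="/sys/firmware/acpi/tables"):
    """Return the APEI table names found in tables_dir (hypothetical helper)."""
    root = Path(tables_dir)
    if not root.is_dir():
        # Not a Linux system, or sysfs does not expose ACPI tables here.
        return []
    present = {entry.name for entry in root.iterdir()}
    return [name for name in APEI_TABLES if name in present]
```

Calling `find_apei_tables()` with no argument checks the assumed default location; passing another directory makes the function usable on a copied table dump.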
See also
* Machine-check exception (MCE)
* Reliability, availability and serviceability (RAS)
* RAMS
* High availability (HA)
* Blue screen of death