
Kernel page-table isolation (KPTI or PTI, previously called KAISER) is a Linux kernel feature that mitigates the Meltdown security vulnerability (affecting mainly Intel's x86 CPUs) and improves kernel hardening against attempts to bypass kernel address space layout randomization (KASLR). It works by better isolating user space and kernel space memory. KPTI was merged into Linux kernel version 4.15, and backported to Linux kernels 4.14.11, 4.9.75, and 4.4.110. Windows and macOS released similar updates. KPTI does not address the related Spectre vulnerability.
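The version numbers above can be turned into a simple check. The following is a minimal sketch in Python, assuming an upstream (kernel.org) release string; the function name `has_upstream_kpti` is hypothetical, and distribution kernels may carry their own backports under other version numbers, which this check does not detect.

```python
import re

# Stable series that received the backport, mapped to the first patch
# level that contains it (values taken from the versions listed above).
BACKPORTED = {(4, 14): 11, (4, 9): 75, (4, 4): 110}
MAINLINE = (4, 15)  # first mainline release with KPTI merged


def has_upstream_kpti(release: str) -> bool:
    """Return True if an upstream kernel release string includes KPTI.

    Only the leading "major.minor[.patch]" part is examined, so suffixes
    such as "-generic" are ignored. Hypothetical helper for illustration.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    patch = int(match.group(3) or 0)
    if (major, minor) >= MAINLINE:
        return True
    minimum = BACKPORTED.get((major, minor))
    return minimum is not None and patch >= minimum
```

On a running system the release string could be taken from `platform.release()`, though the result only indicates whether the code is present, not whether the mitigation is active.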
Background on KAISER
The KPTI patches were based on KAISER (short for ''Kernel Address Isolation to have Side-channels Efficiently Removed''), a technique conceived in 2016 and published in June 2017, before Meltdown was known. KAISER makes it harder to defeat KASLR, a 2014 mitigation for a much less severe issue.
In 2014, the Linux kernel adopted kernel address space layout randomization (KASLR), which makes it more difficult to exploit other kernel vulnerabilities. KASLR relies on kernel address mappings remaining hidden from user space. Although access to these kernel mappings is prohibited, several side-channel attacks on modern processors can leak the location of this memory, making it possible to work around KASLR.
KAISER addressed these problems in KASLR by eliminating some sources of address leakage.
Whereas KASLR merely prevents address mappings from leaking, KAISER also prevents the data from leaking, thereby covering the Meltdown case.
KPTI is based on KAISER. Without KPTI enabled, whenever executing user-space code (applications), Linux would also keep its entire kernel memory mapped in page tables, although protected from access. The advantage is that when the application makes a system call into the kernel or an interrupt is received, kernel page tables are always present, so most context switching-related overheads (TLB flush, page-table swapping, etc.) can be avoided.
Meltdown vulnerability and KPTI
In January 2018, the Meltdown vulnerability was published, known to affect Intel's x86 CPUs and ARM Cortex-A75. It was a far more severe vulnerability than the KASLR bypass that KAISER was originally intended to fix: it was found that the ''contents'' of kernel memory could also be leaked, not just the locations of memory mappings, as previously thought.
KPTI (conceptually based on KAISER) prevents Meltdown by preventing most protected locations from being mapped to user space.
AMD x86 processors are not currently known to be affected by Meltdown and do not need KPTI as a mitigation.
However, AMD processors are still susceptible to KASLR bypass when KPTI is disabled.
Implementation
KPTI fixes these leaks by separating user-space and kernel-space page tables entirely. One set of page tables includes both kernel-space and user-space addresses, as before, but it is only used when the system is running in kernel mode. The second set of page tables, used in user mode, contains a copy of the user-space mappings and a minimal set of kernel-space mappings that provides the information needed to enter or exit system calls, interrupts and exceptions.
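The two sets of page tables can be illustrated with a toy model. This is a sketch in Python, not kernel code: real page tables map virtual addresses to physical frames with permission bits, whereas here a "table" is just a set of region names, and all names (`PageTables`, `build_tables`, `entry_trampoline`, and so on) are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical region names used only for illustration.
USER_REGIONS = {"user_code", "user_heap", "user_stack"}
KERNEL_REGIONS = {"entry_trampoline", "kernel_text", "kernel_heap"}
ENTRY_STUBS = {"entry_trampoline"}  # minimal kernel part needed to enter/exit


@dataclass
class PageTables:
    """Toy page table: the set of regions that have a mapping."""
    mappings: set = field(default_factory=set)

    def can_translate(self, region: str) -> bool:
        return region in self.mappings


def build_tables(kpti: bool):
    """Return (kernel_view, user_view) for one process."""
    kernel_view = PageTables(USER_REGIONS | KERNEL_REGIONS)
    if kpti:
        # Separate user-mode set: user mappings plus the entry stubs only.
        user_view = PageTables(USER_REGIONS | ENTRY_STUBS)
    else:
        # Without KPTI a single set is used in both modes.
        user_view = kernel_view
    return kernel_view, user_view


def tables_for(mode: str, kernel_view: PageTables, user_view: PageTables):
    """Pick the set of page tables that is active in the given CPU mode."""
    if mode == "kernel":
        return kernel_view
    if mode == "user":
        return user_view
    raise ValueError(f"unknown mode: {mode!r}")
```

With `kpti=True`, the user-mode view has no mapping at all for `kernel_heap`, so there is nothing for a Meltdown-style read to reach; with `kpti=False`, the mapping exists in user mode and is guarded only by permission checks.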
On processors that support process-context identifiers (PCID), a translation lookaside buffer (TLB) flush can be avoided, but even then KPTI comes at a significant performance cost, particularly in syscall-heavy and interrupt-heavy workloads.
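Whether a given x86 machine advertises PCID can be checked from the CPU feature flags that Linux exposes in `/proc/cpuinfo`. The sketch below, in Python, parses text in that format; the helper names `cpu_flags` and `supports_pcid` are hypothetical, and the `flags` field is assumed to be present, which is the case on x86 but not on all architectures.

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Collect the feature flags of the first CPU in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no "flags" line found


def supports_pcid(cpuinfo_text: str) -> bool:
    """True if the 'pcid' flag is listed (distinct from 'invpcid')."""
    return "pcid" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo", encoding="utf-8") as handle:
            print("PCID advertised:", supports_pcid(handle.read()))
    except OSError:
        print("/proc/cpuinfo is not available on this system")
```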
The overhead was measured to be 0.28% according to KAISER's original authors; a Linux developer measured it to be roughly 5% for most workloads and up to 30% in some cases, even with the PCID optimization; for the database engine PostgreSQL, the impact on read-only tests on an Intel Skylake processor was 7–17% (or 16–23% without PCID), while a full benchmark lost 13–19% (Coffee Lake vs. Broadwell-E). Many benchmarks have been done by Phoronix. Redis slowed by 6–7%. Linux kernel compilation slowed down by 5% on Haswell.
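Because the cost is paid on every transition between user mode and kernel mode, a rough way to observe it is to time a cheap system call in a tight loop and compare boots with and without KPTI. The following Python sketch is only an illustration and is not the method behind any of the figures above; the function name `mean_syscall_seconds` is hypothetical, and the result includes interpreter overhead, so only the difference between two runs on the same machine is meaningful.

```python
import os
import time


def mean_syscall_seconds(calls: int = 100_000) -> float:
    """Average wall-clock time of one os.getppid() call, in seconds.

    os.getppid() is used as a cheap call that enters the kernel; the
    measured time also includes Python's own loop and call overhead.
    """
    if calls <= 0:
        raise ValueError("calls must be positive")
    start = time.perf_counter()
    for _ in range(calls):
        os.getppid()
    elapsed = time.perf_counter() - start
    return elapsed / calls


if __name__ == "__main__":
    print(f"{mean_syscall_seconds() * 1e9:.0f} ns per call (including overhead)")
```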
KPTI can partially be disabled with the "nopti" kernel boot option. Provisions were also made to disable KPTI if newer processors fix the information leaks.
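Whether the option was passed at boot can be read back from the kernel command line. The sketch below, in Python, checks a command-line string for "nopti"; it also accepts "pti=off", which the kernel's parameter documentation lists as equivalent but which is not mentioned above. The helper name `pti_disabled_on_cmdline` is hypothetical, and the sysfs path read at the end is assumed to exist only on kernels that report vulnerability status.

```python
def pti_disabled_on_cmdline(cmdline: str) -> bool:
    """True if the boot command line asks for KPTI to be turned off."""
    params = cmdline.split()
    return "nopti" in params or "pti=off" in params


def read_text(path: str):
    """Return the stripped contents of a file, or None if it cannot be read."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read().strip()
    except OSError:
        return None


if __name__ == "__main__":
    cmdline = read_text("/proc/cmdline")
    if cmdline is None:
        print("/proc/cmdline is not available on this system")
    else:
        print("KPTI disabled at boot:", pti_disabled_on_cmdline(cmdline))
    # Status line reported by the kernel itself, where available.
    status = read_text("/sys/devices/system/cpu/vulnerabilities/meltdown")
    print("Kernel-reported Meltdown status:", status or "unknown")
```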