In computing, a page cache, sometimes also called
disk cache, is a transparent
cache for the
pages originating from a
secondary storage
Computer data storage is a technology consisting of computer components and recording media that are used to retain digital data. It is a core function and fundamental component of computers.
The central processing unit (CPU) of a comput ...
device such as a
hard disk drive
A hard disk drive (HDD), hard disk, hard drive, or fixed disk is an electro-mechanical data storage device that stores and retrieves digital data using magnetic storage with one or more rigid rapidly rotating platters coated with magn ...
(HDD) or a
solid-state drive
A solid-state drive (SSD) is a solid-state storage device that uses integrated circuit assemblies to store data persistently, typically using flash memory, and functioning as secondary storage in the hierarchy of computer storage. It i ...
(SSD). The
operating system
An operating system (OS) is system software that manages computer hardware, software resources, and provides common daemon (computing), services for computer programs.
Time-sharing operating systems scheduler (computing), schedule tasks for ef ...
keeps a page cache in otherwise unused portions of the
main memory
Computer data storage is a technology consisting of computer components and recording media that are used to retain digital data. It is a core function and fundamental component of computers.
The central processing unit (CPU) of a comput ...
(RAM), resulting in quicker access to the contents of cached pages and overall performance improvements. A page cache is implemented in
kernel
Kernel may refer to:
Computing
* Kernel (operating system), the central component of most operating systems
* Kernel (image processing), a matrix used for image convolution
* Compute kernel, in GPGPU programming
* Kernel method, in machine lea ...
s with the
paging
In computer operating systems, memory paging is a memory management scheme by which a computer stores and retrieves data from secondary storage for use in main memory. In this scheme, the operating system retrieves data from secondary storag ...
memory management, and is mostly transparent to applications.
Usually, all physical memory not directly allocated to applications is used by the operating system for the page cache. Since the memory would otherwise be idle and is easily reclaimed when applications request it, there is generally no associated performance penalty and the operating system might even report such memory as "free" or "available".
When compared to main memory, hard disk drive read/writes are slow and
random access
Random access (more precisely and more generally called direct access) is the ability to access an arbitrary element of a sequence in equal time or any datum from a population of addressable elements roughly as easily and efficiently as any othe ...
es require expensive
disk seek Higher performance in hard disk drives comes from devices which have better performance characteristics. These performance characteristics can be grouped into two categories: access time and data transfer time (or rate).
Access time
The ''acces ...
s; as a result, larger amounts of main memory bring performance improvements as more data can be cached in memory. Separate disk caching is provided on the hardware side, by dedicated RAM or
NVRAM
Non-volatile random-access memory (NVRAM) is random-access memory that retains data without applied power. This is in contrast to dynamic random-access memory (DRAM) and static random-access memory (SRAM), which both maintain data only for as lon ...
chips located either in the
disk controller (in which case the cache is integrated into a hard disk drive and usually called
disk buffer
In computer storage, disk buffer (often ambiguously called disk cache or cache buffer) is the embedded memory in a hard disk drive (HDD) or solid state drive (SSD) acting as a buffer between the rest of the computer and the physical hard ...
), or in a
disk array controller. Such memory should not be confused with the page cache.
Memory conservation
Pages in the page cache modified after being brought in are called dirty pages. Since non-dirty pages in the page cache have identical copies in
secondary storage
Computer data storage is a technology consisting of computer components and recording media that are used to retain digital data. It is a core function and fundamental component of computers.
The central processing unit (CPU) of a comput ...
(e.g. hard disk drive or solid-state drive), discarding and reusing their space is much quicker than paging out application memory, and is often preferred over flushing the dirty pages into secondary storage and reusing their space. Executable
binaries, such as applications and libraries, are also typically accessed through page cache and mapped to individual
process
A process is a series or set of activities that interact to produce a result; it may occur once-only or be recurrent or periodic.
Things called a process include:
Business and management
*Business process, activities that produce a specific se ...
spaces using
virtual memory
In computing, virtual memory, or virtual storage is a memory management technique that provides an "idealized abstraction of the storage resources that are actually available on a given machine" which "creates the illusion to users of a very ...
(this is done through the
mmap
In computing, mmap(2) is a POSIX-compliant Unix system call that maps files or devices into memory. It is a method of memory-mapped file I/O. It implements demand paging because file contents are not immediately read from disk and initially use n ...
system call on Unix-like operating systems). This not only means that the binary files are shared between separate processes, but also that unused parts of binaries will be flushed out of main memory eventually, leading to memory conservation.
Since cached pages can be easily evicted and re-used, some operating systems, notably
Windows NT
Windows NT is a proprietary graphical operating system produced by Microsoft, the first version of which was released on July 27, 1993. It is a processor-independent, multiprocessing and multi-user operating system.
The first version of Wi ...
, even report the page cache usage as "available" memory, while the memory is actually allocated to disk pages. This has led to some confusion about the utilization of page cache in Windows.
Disk writes
The page cache also aids in writing to a disk. Pages in the main memory that have been modified during writing data to disk are marked as "dirty" and have to be flushed to disk before they can be freed. When a file write occurs, the cached page for the particular block is looked up. If it is already found in the page cache, the write is done to that page in the main memory. If it is not found in the page cache, then, when the write perfectly falls on
page size
A page, memory page, or virtual page is a fixed-length contiguous block of virtual memory, described by a single entry in the page table. It is the smallest unit of data for memory management in a virtual memory operating system. Similarly, a p ...
boundaries, the page is not even read from disk, but allocated and immediately marked dirty. Otherwise, the page(s) are fetched from disk and requested modifications are done. A file that is created or opened in the page cache, but not written to, might result in a
zero byte file at a later read.
However, not all cached pages can be written to as program code is often mapped as
read-only or
copy-on-write
Copy-on-write (COW), sometimes referred to as implicit sharing or shadowing, is a resource-management technique used in computer programming to efficiently implement a "duplicate" or "copy" operation on modifiable resources. If a resource is dupl ...
; in the latter case, modifications to code will only be visible to the process itself and will not be written to disk.
Side-channel attacks
In 2019, security researchers demonstrated
side-channel attacks against the page cache: it's possible to bypass
privilege separation
In computer programming and computer security, privilege separation is one software-based technique for implementing the principle of least privilege. With privilege separation, a program is divided into parts which are limited to the specific pr ...
and exfiltrate data about other processes by systematically monitoring whether some file pages (for example
executable
In computing, executable code, an executable file, or an executable program, sometimes simply referred to as an executable or binary, causes a computer "to perform indicated tasks according to encoded instructions", as opposed to a data fil ...
or
library
A library is a collection of materials, books or media that are accessible for use and not just for display purposes. A library provides physical (hard copies) or digital access (soft copies) materials, and may be a physical location or a vi ...
files) are present in the cache or not.
See also
*
Demand paging
In computer operating systems, demand paging (as opposed to anticipatory paging) is a method of virtual memory management. In a system that uses demand paging, the operating system copies a disk page into physical memory only if an attempt is m ...
*
Cache (computing)
*
Paging
In computer operating systems, memory paging is a memory management scheme by which a computer stores and retrieves data from secondary storage for use in main memory. In this scheme, the operating system retrieves data from secondary storag ...
*
Page replacement algorithm
In a computer operating system that uses paging for virtual memory management, page replacement algorithms decide which memory pages to page out, sometimes called swap out, or write to disk, when a page of memory needs to be allocated. Page re ...
*
Virtual memory
In computing, virtual memory, or virtual storage is a memory management technique that provides an "idealized abstraction of the storage resources that are actually available on a given machine" which "creates the illusion to users of a very ...
References
External links
Page Cache, the Affair Between Memory and Files
Computer memory
Memory management
Hard disk computer storage