Flashcache

Flashcache is a disk cache component for the Linux kernel, initially developed by Facebook starting in April 2010 and released as open source in 2011. In January 2013, sTec, Inc. released a fork of Flashcache named EnhanceIO. That fork became unmaintained in 2015; it was then forked again and has since been maintained by individuals. Flashcache uses flash memory, such as a USB flash drive, SD card, CompactFlash card, or any other kind of portable flash mass storage, as a persistent write-back cache. An internal SSD can also be used to increase performance.


Overview

Using flash memory (NAND memory devices) for caching allows the Linux kernel to service random disk I/O faster than it could without the cache. The caching applies to all disk content, not just the page file or system binaries. Flash-based devices are usually an order of magnitude faster than spinning hard disk drives (HDDs) for random I/O, but they offer less of an advantage, or are even slower, for sequential reads and writes. By default, Flashcache caches every I/O request that is a full block in size, but it can be configured to cache only random I/O and ignore sequential I/O. Microsoft Windows has offered a similar technology, ReadyBoost, since Windows Vista.
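
One way to picture the "cache random I/O, ignore sequential I/O" option is a detector that tracks how long the current contiguous run of requests has become. The sketch below is only an illustration under assumptions: the class name, the `threshold_bytes` parameter, and the single-stream tracking are hypothetical and are not taken from the Flashcache source.

```python
class SequentialDetector:
    """Flags a request as sequential once a contiguous run exceeds a threshold."""

    def __init__(self, threshold_bytes):
        self.threshold_bytes = threshold_bytes  # hypothetical tunable
        self.next_offset = None  # offset expected if the current run continues
        self.run_bytes = 0       # length of the current contiguous run

    def is_sequential(self, offset, length):
        if offset == self.next_offset:
            self.run_bytes += length  # request continues the previous one
        else:
            self.run_bytes = length   # a seek starts a new run
        self.next_offset = offset + length
        return self.run_bytes > self.threshold_bytes


def should_cache(detector, offset, length):
    """Cache a request only if it is not part of a long sequential run."""
    return not detector.is_sequential(offset, length)


if __name__ == "__main__":
    detector = SequentialDetector(threshold_bytes=8192)
    for offset in (0, 4096, 8192, 1_000_000):
        print(offset, should_cache(detector, offset, 4096))
```

A real implementation would have to track several concurrent streams; a single-stream detector like this one is reset by any interleaved request.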


Implementation

Flashcache is built on top of the Linux kernel's device mapper. The cache is organized as a set-associative hash table: it is divided into a number of fixed-size sets (buckets), and linear probing is used within a set to find blocks. The device mapper layer breaks up all I/O requests into block-size chunks before passing them to the cache layer. When a write request arrives, the corresponding cache block is marked dirty; dirty cache blocks are written lazily to disk in the background. A few parameters control the write-back policy: the dirty threshold, idleness, and contiguity with other dirty blocks that are about to be written back.
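
The structure described above can be sketched as a toy model: a block number hashes to one set, a linear probe searches only that set, writes mark the slot dirty, and a cleaning pass writes dirty blocks back once a set exceeds its dirty threshold. Everything named here (the modulo hash, `Slot`, `dirty_thresh_pct`, the victim choice) is an assumption for illustration, not the actual Flashcache code, and idleness-based cleaning is left out.

```python
from dataclasses import dataclass

INVALID, VALID, DIRTY = "invalid", "valid", "dirty"


@dataclass
class Slot:
    dbn: int = -1         # disk block number held in this slot
    state: str = INVALID


class SetAssociativeCache:
    def __init__(self, num_sets, set_size, dirty_thresh_pct):
        self.num_sets = num_sets
        self.set_size = set_size
        self.dirty_thresh_pct = dirty_thresh_pct
        self.slots = [Slot() for _ in range(num_sets * set_size)]

    def _set_range(self, dbn):
        # Hypothetical hash: block number modulo the number of sets.
        start = (dbn % self.num_sets) * self.set_size
        return range(start, start + self.set_size)

    def lookup(self, dbn):
        """Linear probe within the block's set; return the slot index or None."""
        for i in self._set_range(dbn):
            if self.slots[i].state != INVALID and self.slots[i].dbn == dbn:
                return i
        return None

    def _find_victim(self, dbn):
        # Prefer an unused slot, then a clean one; dirty slots are never evicted.
        for wanted in (INVALID, VALID):
            for i in self._set_range(dbn):
                if self.slots[i].state == wanted:
                    return i
        return None

    def write(self, dbn):
        """Write-back: place the block in the cache and mark it dirty."""
        i = self.lookup(dbn)
        if i is None:
            i = self._find_victim(dbn)
            if i is None:
                return None  # set holds only dirty blocks; clean it first
        self.slots[i] = Slot(dbn, DIRTY)
        return i

    def clean_set(self, dbn):
        """Write back dirty blocks until the set is within its dirty threshold."""
        dirty = [i for i in self._set_range(dbn) if self.slots[i].state == DIRTY]
        limit = self.set_size * self.dirty_thresh_pct // 100
        written = []
        # Sort by block number so neighbouring blocks are written together.
        for i in sorted(dirty, key=lambda idx: self.slots[idx].dbn):
            if len(dirty) - len(written) <= limit:
                break
            self.slots[i].state = VALID  # stands in for the write to disk
            written.append(self.slots[i].dbn)
        return written
```

Because a lookup probes only one set, its cost is bounded by the set size rather than by the size of the whole cache.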


Limitations

The implementation of Flashcache imposes a few limitations:
* Atomicity: Cache block writes are currently non-atomic.
* TRIM support: The ATA TRIM command, which is used to optimize flash memory, is not yet supported.
* Cache pollution protection: A process can be marked non-cacheable to prevent Flashcache from caching its requests; however, if a process that marked itself non-cacheable dies, Flashcache has no way of cleaning up.
* Alignment: Relying on the device mapper resulted in caching performance issues, and writes whose size is not a multiple of 4 KiB are not cached. This primarily affects the Xen hypervisor. For this reason, EnhanceIO moved away from device mapper integration, yielding higher performance for suboptimal use cases.
* Write-around read latency impact: In write-around mode, all writes bypass the cache for high consistency. The current implementation fetches reads through the SSD device and then delivers them to the actual reader, so previously uncached blocks always have to go to the SSD first, causing constant write I/O. This is not an issue on enterprise SSDs or the high-end PCIe devices that Facebook uses, but it degrades performance on lower-end SSDs.
* Write-around read cache warm-up phase: In write-around mode, Flashcache has no information for comparing the age of cached pages with that of the on-disk ones, for two reasons: (1) the device could have been mounted outside of Flashcache, and (2) no writes are tracked in this mode. As a result, the cache is empty after each volume activation (for example, after a reboot), and performance is degraded until all hot areas have been cached again.
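
The two write-around limitations can be illustrated with a small model. This is a sketch under stated assumptions, not Flashcache's code: the class and method names are hypothetical, dictionaries stand in for the two devices, and dropping the cached copy on a write is an assumed way of keeping the cache consistent.

```python
class WriteAroundCache:
    def __init__(self):
        self.disk = {}       # backing device: block number -> data
        self.ssd = {}        # cache device: block number -> data
        self.ssd_writes = 0  # number of writes issued to the SSD

    def write(self, block, data):
        # Writes bypass the cache; any stale cached copy is dropped (assumption).
        self.disk[block] = data
        self.ssd.pop(block, None)

    def read(self, block):
        if block not in self.ssd:
            # Miss: the block is written to the SSD first, then served from it.
            self.ssd[block] = self.disk[block]
            self.ssd_writes += 1
        return self.ssd[block]

    def reactivate(self):
        # Writes are not tracked, so cached data cannot be trusted after the
        # volume is activated again; the cache starts out empty.
        self.ssd.clear()


if __name__ == "__main__":
    cache = WriteAroundCache()
    cache.write(1, "a")
    cache.read(1)       # miss: costs one SSD write
    cache.read(1)       # hit: no SSD write
    cache.reactivate()  # e.g. a reboot
    cache.read(1)       # miss again while the cache warms up
    print(cache.ssd_writes)
```

Every first read of a block costs an SSD write, and every reactivation repeats that cost for the whole hot set, which is why lower-end SSDs suffer most in this mode.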


See also

* bcache
* dm-cache
* Cache Acceleration Software (Intel's product)


External links


* Performance Comparison among EnhanceIO, bcache and dm-cache (LKML)
* EnhanceIO, Bcache & DM-Cache Benchmarked
* Flashcache at Facebook: From 2010 to 2013 and beyond
* Archived copy (2013-12-20): https://web.archive.org/web/20131220200944/http://www.tomsitpro.com/articles/facebook-releases-flashcache-3-ssd-cache,1-1298.html