The NOVA (non-volatile memory accelerated) file system is an open-source, log-structured file system for Linux that targets byte-addressable persistent memory, for example non-volatile dual in-line memory modules (NVDIMMs) and 3D XPoint DIMMs. NOVA is designed specifically for byte-addressable persistent memories and aims to provide high performance, atomic file and metadata operations, and fault tolerance. To meet these goals, NOVA combines several techniques found in other file systems. It uses log structure, copy-on-write (COW), journaling, and log-structured metadata updates to provide strong atomicity guarantees, and it uses a combination of replication, metadata checksums, and RAID 4 parity to protect data and metadata from media errors and software bugs. It also supports checkpoints to facilitate backups.


Filesystem

NOVA was developed at the University of California, San Diego, in the Non-Volatile Systems Laboratory of the Computer Science and Engineering Department. Patches were initially made available for version 4.12 of the Linux kernel. It is limited to x86-64 Linux and is not ready for merging with the upstream kernel.


Log structure

NOVA is primarily a log-structured file system, but it differs from other log-structured file systems in several respects. First, rather than using a single log for the entire file system, each inode has its own dedicated log that records the updates to the inode. This allows for increased concurrency in file operations, since different threads can operate on different inodes in parallel. Second, the logs do not contain file data, but only metadata updates, resulting in smaller logs. Third, the logs are not stored in physically contiguous memory. Instead, NOVA stores the logs in a linked list of 4 KB memory pages. NOVA uses the logs to provide atomicity for operations that affect a single file (e.g., writing to a file or modifying its metadata). To do this, NOVA writes a log entry to empty space past the end of the log and then atomically updates the inode's pointer to the log tail.
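
The append-then-publish pattern described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not NOVA's kernel code: the names `LogPage`, `InodeLog`, `write_entry`, and `commit` are hypothetical, a page holds four entries instead of 4 KB of data, and a single Python assignment stands in for the atomic update of the tail pointer.

```python
from dataclasses import dataclass, field
from typing import Optional

PAGE_CAPACITY = 4  # hypothetical: entries per log page (real log pages are 4 KB)


@dataclass
class LogPage:
    """One page of a per-inode log; pages are chained into a linked list."""
    slots: list = field(default_factory=lambda: [None] * PAGE_CAPACITY)
    next: Optional["LogPage"] = None


class InodeLog:
    """Per-inode log whose committed extent is defined by a tail pointer."""

    def __init__(self):
        self.head = LogPage()
        # Tail = (page, index of the first free slot). Everything before it
        # is committed; anything at or after it is invisible to readers.
        self.tail = (self.head, 0)

    def write_entry(self, entry):
        """Step 1: write the entry into free space past the tail.

        Returns the tail value that would make the entry visible.
        """
        page, idx = self.tail
        if idx == PAGE_CAPACITY:  # current page is full: link a new page
            if page.next is None:
                page.next = LogPage()
            page, idx = page.next, 0
        page.slots[idx] = entry
        return (page, idx + 1)

    def commit(self, new_tail):
        """Step 2: publish the entry with one tail-pointer update."""
        self.tail = new_tail

    def append(self, entry):
        self.commit(self.write_entry(entry))

    def entries(self):
        """Return the committed entries by walking the page list to the tail."""
        out = []
        page = self.head
        tail_page, tail_idx = self.tail
        while page is not tail_page:
            out.extend(page.slots)
            page = page.next
        out.extend(page.slots[:tail_idx])
        return out
```

If a crash is simulated by calling `write_entry` without `commit`, `entries()` still returns only the previously committed entries, which is the property the tail pointer is meant to provide.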


Copy-on-write

NOVA uses copy-on-write (COW) to update file data. When a program writes data to a file, NOVA allocates some unused memory pages to hold the data and writes the data into them. Then, it appends a log entry to the inode's log that points to the new pages and describes their logical location in the file. Since appending the log entry is atomic, the write is also atomic.
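
A minimal sketch of this write path follows, assuming whole-page writes and a Python list in place of the inode's log. `CowFile`, `write_page`, `read_page`, and the `crash_before_commit` flag are hypothetical names introduced for illustration; they are not taken from the NOVA sources.

```python
PAGE_SIZE = 4096


class CowFile:
    """Toy copy-on-write file: data pages are never overwritten in place."""

    def __init__(self):
        self.pages = {}    # page id -> bytes; stands in for persistent memory pages
        self.log = []      # committed log entries: (file page index, page id)
        self._next_id = 0

    def _alloc(self, data):
        """Allocate an unused page and fill it with the new data."""
        page_id = self._next_id
        self._next_id += 1
        self.pages[page_id] = data
        return page_id

    def write_page(self, index, data, crash_before_commit=False):
        if len(data) != PAGE_SIZE:
            raise ValueError("this sketch only handles whole-page writes")
        # 1. Write the new data into a freshly allocated page.
        page_id = self._alloc(bytes(data))
        if crash_before_commit:
            return  # simulated crash: the new page is never referenced
        # 2. Append a log entry mapping the file offset to the new page.
        #    This single append is the commit point of the write.
        self.log.append((index, page_id))

    def read_page(self, index):
        """Resolve a page by finding the latest log entry that covers it."""
        for idx, page_id in reversed(self.log):
            if idx == index:
                return self.pages[page_id]
        return bytes(PAGE_SIZE)  # unwritten pages read as zeros
```

Because the old page stays intact until the log entry is appended, a reader sees either the complete old contents or the complete new contents, never a mixture.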


Journaling

Some file operations (e.g., moving a file from one directory to another) require modifying multiple inodes. To make these operations atomic, NOVA uses a simple journaling mechanism. First, it writes the new log entries to the ends of the logs of the inodes that the operation will affect. Then it uses the journal to record the necessary updates to the inodes' log tail pointers. Finally, it marks the journal as committed and applies the updates to the tail pointers.
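
The sequence above can be sketched as a toy model of moving a file between two directories. All names here (`Inode`, `Journal`, `rename`, `recover`, and the `crash_at` parameter) are hypothetical, and the recovery policy shown (apply the journal if it is committed, otherwise discard it) is an assumption of the sketch rather than a description of NOVA's actual recovery code.

```python
class Inode:
    def __init__(self, name):
        self.name = name
        self.log = []   # log entries, possibly including uncommitted ones
        self.tail = 0   # entries at index < tail are committed

    def committed(self):
        return self.log[:self.tail]


class Journal:
    def __init__(self):
        self.records = []       # pending (inode, new tail) updates
        self.committed = False


def write_past_tail(inode, entry):
    """Write an entry just past the committed tail; return the new tail value."""
    inode.log[inode.tail:] = [entry]  # overwrites any stale uncommitted entries
    return inode.tail + 1


def recover(journal):
    """Apply a committed journal, or discard an uncommitted one."""
    if journal.committed:
        for inode, new_tail in journal.records:
            inode.tail = new_tail
    journal.records = []
    journal.committed = False


def rename(journal, src_dir, dst_dir, filename, crash_at=None):
    # 1. Write the new log entries past the tails of both affected inodes.
    src_tail = write_past_tail(src_dir, ("remove", filename))
    dst_tail = write_past_tail(dst_dir, ("add", filename))
    # 2. Record the required tail-pointer updates in the journal.
    journal.records = [(src_dir, src_tail), (dst_dir, dst_tail)]
    if crash_at == "before_commit":
        return
    # 3. Mark the journal as committed; this is the atomic commit point.
    journal.committed = True
    if crash_at == "after_commit":
        return
    # 4. Apply the tail updates and clear the journal.
    recover(journal)
```

A crash before the commit mark leaves both directories unchanged after `recover`, while a crash after it results in both tail updates being applied, so the move is never observed half-done.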


Metadata protection

NOVA uses replication and checksums to provide protection against metadata corruption due to media errors and software bugs. Every metadata structure (e.g., inodes, superblocks, and log entries) contains a CRC32 checksum that allows NOVA to detect whether the structure's contents have changed without its knowledge. NOVA also stores two copies of each metadata structure (the "primary" and the "replica") and stores them far from one another in memory. Whenever NOVA accesses a metadata structure, it first recomputes the checksum on both the primary and the replica. If either check results in a mismatch, NOVA repairs the damage using the other copy. If neither checksum matches, then the structure is lost and NOVA returns an error.
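
The check-and-repair logic can be illustrated with Python's standard `zlib.crc32`. This is a sketch only: the record layout (payload followed by a 4-byte little-endian checksum) and the names `seal`, `is_valid`, and `ReplicatedMetadata` are assumptions made for the example, and NOVA's on-media layout may differ.

```python
import zlib

CRC_LEN = 4


def seal(payload):
    """Return the payload with its CRC32 appended (hypothetical layout)."""
    crc = zlib.crc32(payload).to_bytes(CRC_LEN, "little")
    return bytearray(payload + crc)


def is_valid(record):
    """Recompute the CRC32 over the payload and compare with the stored one."""
    payload = bytes(record[:-CRC_LEN])
    stored = bytes(record[-CRC_LEN:])
    return zlib.crc32(payload).to_bytes(CRC_LEN, "little") == stored


class ReplicatedMetadata:
    """A metadata structure kept as a primary copy and a replica."""

    def __init__(self, payload):
        self.primary = seal(payload)
        self.replica = seal(payload)

    def read(self):
        primary_ok = is_valid(self.primary)
        replica_ok = is_valid(self.replica)
        if not primary_ok and not replica_ok:
            raise IOError("metadata lost: both copies fail their checksum")
        if not primary_ok:
            self.primary[:] = self.replica   # repair primary from replica
        elif not replica_ok:
            self.replica[:] = self.primary   # repair replica from primary
        return bytes(self.primary[:-CRC_LEN])
```

The sketch does not handle the case in which both copies pass their checksums but differ from each other, which a real implementation would also need to consider.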


Data protection

NOVA uses RAID 4 parity to protect file data. It divides each 4 KB page into eight 512-byte strips and stores a ninth, parity strip in a dedicated region of persistent memory. It also computes (and stores a replica of) a CRC32 checksum for each of the eight data strips and for the parity strip. When NOVA reads a page, it verifies the checksum of each strip. If one of the strips is corrupt, NOVA tries to recover it from the remaining strips and the parity strip. If no other strip has been corrupted, recovery succeeds. Otherwise, recovery fails, the contents of the page are lost, and NOVA returns an error.
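
The recovery rule follows from the XOR arithmetic behind the parity: the parity strip is the byte-wise XOR of the eight data strips, so any single missing strip equals the XOR of the other eight. The following sketch shows this with `zlib.crc32` as the per-strip checksum. The function names `protect` and `read_page` are hypothetical, and the checksum replica mentioned above is left out for brevity.

```python
import zlib

PAGE = 4096
STRIP = 512


def split(page):
    """Split a 4 KB page into eight 512-byte strips."""
    return [page[i:i + STRIP] for i in range(0, PAGE, STRIP)]


def xor_strips(strips):
    """Byte-wise XOR of equally sized strips."""
    out = bytearray(STRIP)
    for strip in strips:
        for i, byte in enumerate(strip):
            out[i] ^= byte
    return bytes(out)


def protect(page):
    """Compute the parity strip and one CRC32 per strip (8 data + 1 parity)."""
    strips = split(page)
    parity = xor_strips(strips)
    checksums = [zlib.crc32(s) for s in strips + [parity]]
    return parity, checksums


def read_page(page, parity, checksums):
    """Verify every strip; rebuild at most one corrupt strip from the others."""
    strips = split(page) + [parity]
    bad = [i for i, s in enumerate(strips) if zlib.crc32(s) != checksums[i]]
    if not bad:
        return bytes(page)
    if len(bad) > 1:
        raise IOError("page lost: more than one strip is corrupt")
    i = bad[0]
    # The missing strip is the XOR of the eight remaining strips.
    strips[i] = xor_strips(strips[:i] + strips[i + 1:])
    if zlib.crc32(strips[i]) != checksums[i]:
        raise IOError("page lost: recovered strip fails its checksum")
    return b"".join(strips[:8])
```

With a single parity strip per page, only one corrupt strip can be rebuilt, which matches the failure condition described above.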



External links


NOVA: A Log-structured File System for Hybrid Volatile/Non-volatile Main Memories

Hardening the NOVA File System UCSD-CSE Techreport CS2017-1018

NOVA: The Fastest File System for NVDIMMs