Moose File System

Moose File System (MooseFS) is an open-source, POSIX-compliant distributed file system developed by Core Technology. MooseFS aims to be a fault-tolerant, highly available, high-performance, scalable, general-purpose network distributed file system for data centers. Initially proprietary software, it was released to the public as open source on May 30, 2008. Two editions of MooseFS are currently available:
* MooseFS, released under the GPLv2 license
* MooseFS Professional Edition (MooseFS Pro), released under a proprietary license in binary package form


Design

MooseFS follows design principles similar to those of Fossil, Google File System, Lustre and Ceph. The file system comprises four components:
* Metadata server (MDS): manages the location (layout) of files, file access and the namespace hierarchy. The current version of MooseFS supports multiple metadata servers and automatic failover. Clients talk to the MDS only to retrieve or update a file's layout and attributes; the data itself is transferred directly between clients and chunk servers. The metadata server is a user-space daemon; the metadata is kept in memory and lazily stored on local disk.
* Metalogger server: periodically pulls the metadata from the MDS to store it for backup. Since version 1.6.5, this is an optional feature.
* Chunk servers (CSS): store the data and optionally replicate it among themselves. There can be many of them, though the scalability limit has not been published. The biggest cluster reported so far consists of 160 servers. The chunk server is also a user-space daemon that relies on the underlying local file system to manage the actual storage.
* Clients: talk to both the MDS and the CSS. MooseFS clients mount the file system in user space via FUSE.
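The split between metadata traffic and data traffic can be illustrated with a small in-memory model. This is only a sketch of the idea, not MooseFS code or its wire protocol: the classes `MetadataServer` and `ChunkServer`, the function `client_read`, and the layout format are all hypothetical names invented for the illustration, and the 64 MiB chunk size assumes that the chunk limit is meant in binary megabytes.

```python
from dataclasses import dataclass, field

CHUNK_SIZE = 64 * 1024 * 1024  # assumption: "64 megabytes" read as 64 MiB


@dataclass
class ChunkServer:
    """Hypothetical chunk server: holds chunk payloads keyed by chunk id."""
    name: str
    chunks: dict = field(default_factory=dict)

    def read(self, chunk_id, offset, length):
        return self.chunks[chunk_id][offset:offset + length]


@dataclass
class MetadataServer:
    """Hypothetical MDS: knows where chunks live, never holds file data."""
    # path -> list of (chunk_id, [replica servers]), one entry per chunk
    layout: dict = field(default_factory=dict)

    def lookup(self, path, chunk_index):
        return self.layout[path][chunk_index]


def client_read(mds, path, offset, length):
    """Read bytes that lie inside a single chunk of the file."""
    chunk_index, chunk_offset = divmod(offset, CHUNK_SIZE)
    # Step 1: metadata round trip, the only contact with the MDS.
    chunk_id, replicas = mds.lookup(path, chunk_index)
    # Step 2: data round trip directly to a chunk server holding a replica.
    for server in replicas:
        if chunk_id in server.chunks:
            return server.read(chunk_id, chunk_offset, length)
    raise IOError(f"no replica of chunk {chunk_id} is available")


# Usage: cs1 has lost its copy, so the read is served by cs2.
cs1, cs2 = ChunkServer("cs1"), ChunkServer("cs2")
cs2.chunks[7] = b"hello, moose"
mds = MetadataServer({"/a.txt": [(7, [cs1, cs2])]})
print(client_read(mds, "/a.txt", 7, 5))  # b'moose'
```

The point of the model is that the metadata server answers only the lookup; the payload never passes through it, which is why it can keep its working set in memory.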


Features

To achieve high reliability and performance, MooseFS offers the following features:
* Fault tolerance: MooseFS uses replication. Data can be replicated across chunk servers, and the replication ratio (N) is set per file or directory. If (N−1) replicas fail, the data is still available. Apart from the erasure coding mentioned at the end of this item, MooseFS does not offer any other technique for fault tolerance. Fault tolerance for very big files therefore requires a vast amount of space: N × filesize instead of filesize + (N × stripesize), as would be the case for RAID 4, RAID 5 or RAID 6. Version 4.x PRO of MooseFS implements 8+n erasure coding.
* Striping: large files are divided into chunks (up to 64 megabytes) that may be stored on different chunk servers in order to achieve higher aggregate bandwidth.
* Load balancing: MooseFS attempts to use storage resources equally; the current algorithm seems to take into account only the consumed space.
* Security: apart from classical POSIX file permissions, since the 1.6 release MooseFS offers simple, NFS-like authentication/authorization.
* Coherent snapshots: quick, low-overhead snapshots.
* Transparent "trash bin": deleted files are retained for a configurable period of time.
* Data tiering / storage classes: servers can be labeled, label definitions called "Storage Classes" can be created, and these determine on which types of servers the data is stored.
* "Project" quotas support
* POSIX locks and flock locks support
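The space cost of replication and the effect of chunking can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, the chunk limit is assumed to be 64 MiB, and the erasure-coding figure is an idealised estimate (8 data parts plus n parity parts, ignoring padding and metadata) rather than a statement about how MooseFS Pro lays data out on disk.

```python
CHUNK_SIZE = 64 * 2**20  # assumption: "64 megabytes" read as 64 MiB


def chunk_count(file_size):
    """Number of chunks needed for a file of file_size bytes (ceiling division)."""
    return -(-file_size // CHUNK_SIZE)


def replicated_bytes(file_size, n):
    """Raw space used by N full copies: N x filesize."""
    return n * file_size


def erasure_coded_bytes(file_size, parity, data_parts=8):
    """Idealised raw space for an 8+n layout: filesize x (8 + n) / 8."""
    return file_size * (data_parts + parity) / data_parts


TIB = 2**40
print(chunk_count(TIB))                    # 16384 chunks for a 1 TiB file
print(replicated_bytes(TIB, 3) / TIB)      # 3.0 TiB of raw space with N = 3
print(erasure_coded_bytes(TIB, 2) / TIB)   # 1.25 TiB with an idealised 8+2 layout
```

Under these assumptions a 1 TiB file kept at N = 3 occupies 3 TiB of raw space, while an idealised 8+2 layout would need 1.25 TiB, which is the gap the fault-tolerance item describes.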


Hardware, software and networking

Similarly to other cluster-based file systems, MooseFS uses commodity hardware running a POSIX-compliant operating system. TCP/IP is used as the interconnect.


MooseFS in figures

Source: MooseFS Factsheet
* Storage size: up to 2^64 bytes = 16 EiB = 16 384 PiB
* Single file size: up to 2^57 bytes = 128 PiB
* Number of files: up to 2^31 ≈ 2.1 billion
* Number of active clients: no fixed limit; it depends on the number of file descriptors in the system
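Taking the limits as 2^64 bytes, 2^57 bytes and 2^31 files, the unit conversions can be checked with a few lines of integer arithmetic. The variable names are chosen for this illustration only.

```python
PIB = 2**50  # bytes in one pebibyte
EIB = 2**60  # bytes in one exbibyte

max_storage = 2**64  # bytes
max_file = 2**57     # bytes
max_files = 2**31    # file count

print(max_storage // EIB)  # 16     (EiB)
print(max_storage // PIB)  # 16384  (PiB)
print(max_file // PIB)     # 128    (PiB)
print(max_files)           # 2147483648, about 2.1 billion
```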


See also

* BeeGFS
* Ceph
* Distributed file system
* GlusterFS
* Google File System
* List of file systems § Distributed fault-tolerant file systems
* LizardFS, a fork of MooseFS v. 1.6.x
* Lustre

