Moose File System (MooseFS) is an open-source, POSIX-compliant distributed file system developed by Core Technology. MooseFS aims to be a fault-tolerant, highly available, high-performance, scalable, general-purpose network distributed file system for data centers. Initially proprietary software, it was released to the public as open source on May 30, 2008.
Currently two editions of MooseFS are available:
* MooseFS - released under the GPLv2 license,
* MooseFS Professional Edition (MooseFS Pro) - released under a proprietary license, in binary package form.
Design
MooseFS follows design principles similar to those of Fossil (file system), Google File System, Lustre, and Ceph. The file system comprises the following components:
* Metadata server (MDS) - manages the location (layout) of files, file access, and the namespace hierarchy. The current version of MooseFS supports multiple metadata servers and automatic failover. Clients talk to the MDS only to retrieve or update a file's layout and attributes; the data itself is transferred directly between clients and chunk servers. The metadata server is a user-space daemon; the metadata is kept in memory and lazily stored on local disk.
* Metalogger server - periodically pulls the metadata from the MDS and stores it as a backup. Since version 1.6.5, this is an optional feature.
* Chunk servers (CSS) - store the data and optionally replicate it among themselves. There can be many of them, though the scalability limit has not been published. The biggest cluster reported so far consists of 160 servers. The chunk server is also a user-space daemon that relies on the underlying local file system to manage the actual storage.
* Clients - talk to both the MDS and CSS. MooseFS clients mount the file system in user space via FUSE.
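
The division of labour between the components can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: the class names, the tiny chunk size, the round-robin placement, and the fact that the client records the layout itself are all hypothetical simplifications and do not reflect the MooseFS protocol or API. It shows only the idea described above: the client asks the metadata server for a file's layout, then fetches the data directly from chunk servers.

```python
from dataclasses import dataclass, field

CHUNK_SIZE = 4  # tiny demo value (hypothetical); real chunks are far larger


@dataclass
class ChunkServer:
    """Hypothetical stand-in for a chunk server: holds chunk payloads."""
    name: str
    chunks: dict = field(default_factory=dict)  # chunk_id -> bytes
    alive: bool = True

    def read(self, chunk_id):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return self.chunks[chunk_id]


@dataclass
class MetadataServer:
    """Hypothetical stand-in for the MDS: holds only the layout, no data."""
    layout: dict = field(default_factory=dict)  # path -> [(chunk_id, [server names])]

    def lookup(self, path):
        return self.layout[path]


class Client:
    def __init__(self, mds, servers):
        self.mds = mds
        self.servers = {s.name: s for s in servers}
        self._next_id = 0

    def write(self, path, data, copies=2):
        """Split data into chunks and store each chunk on `copies` servers."""
        names = list(self.servers)
        entries = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk_id = self._next_id
            self._next_id += 1
            # Round-robin placement, purely illustrative.
            start = chunk_id % len(names)
            holders = [names[(start + k) % len(names)] for k in range(copies)]
            for name in holders:
                self.servers[name].chunks[chunk_id] = data[offset:offset + CHUNK_SIZE]
            entries.append((chunk_id, holders))
        self.mds.layout[path] = entries

    def read(self, path):
        """Ask the MDS for the layout, then pull each chunk from a live holder."""
        out = bytearray()
        for chunk_id, holders in self.mds.lookup(path):
            for name in holders:
                try:
                    out += self.servers[name].read(chunk_id)
                    break
                except ConnectionError:
                    continue
            else:
                raise IOError(f"no live replica for chunk {chunk_id}")
        return bytes(out)


servers = [ChunkServer("cs1"), ChunkServer("cs2"), ChunkServer("cs3")]
mds = MetadataServer()
client = Client(mds, servers)
client.write("/demo.txt", b"hello world")
print(client.read("/demo.txt"))
```

Marking one server as down (`servers[0].alive = False`) and reading again still returns the full file in this model, because every chunk has a second copy on another server.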
Features
To achieve high reliability and performance MooseFS offers the following features:
* Fault-tolerance - MooseFS uses replication: data can be replicated across chunk servers, and the replication ratio (N) is set per file or directory. If N-1 replicas fail, the data is still available. At the moment MooseFS does not offer any other technique for fault tolerance, with one exception: version 4.x PRO implements 8+n erasure coding. Fault tolerance for very big files with replication thus requires a vast amount of space: N*filesize instead of filesize+(N*stripesize), as would be the case for RAID 4, RAID 5 or RAID 6.
* Striping - Large files are divided into chunks (up to 64 megabytes) that might be stored on different chunk servers in order to achieve higher aggregate bandwidth.
* Load balancing - MooseFS attempts to use storage resources equally; the current algorithm seems to take into account only the consumed space.
* Security - Apart from classical POSIX file permissions, since the 1.6 release MooseFS offers a simple, NFS-like authentication/authorization mechanism.
* Coherent snapshots - Quick, low-overhead snapshots.
* Transparent "trash bin" - Deleted files are retained for a configurable period of time.
* Data tiering / storage classes - Servers can be labeled, and label definitions called "Storage Classes" determine on which types of servers the data is stored.
* "Project" quotas support
* POSIX locks, flock locks support
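
The space cost of replication compared with erasure coding, the chunk count of a striped file, and space-based placement can be sketched with a little arithmetic. The snippet below is a rough illustration, not MooseFS code: it assumes that "64 megabytes" means 64 MiB, that "8+n" means 8 data parts plus n parity parts, and that placement simply prefers the servers with the least consumed space. All function names are hypothetical, and padding and metadata overhead are ignored.

```python
import math

CHUNK_BYTES = 64 * 2**20  # assumption: the 64 MB chunk limit is 64 MiB


def chunk_count(file_bytes):
    """Number of chunks a file is divided into."""
    return math.ceil(file_bytes / CHUNK_BYTES)


def replicated_bytes(file_bytes, copies):
    """Raw space for `copies` full copies (N * filesize)."""
    return file_bytes * copies


def erasure_coded_bytes(file_bytes, data_parts=8, parity_parts=1):
    """Raw space for a data_parts + parity_parts layout, rounded up."""
    total = file_bytes * (data_parts + parity_parts)
    return -(-total // data_parts)  # integer ceiling division


def least_used_servers(used_bytes, copies):
    """Pick `copies` distinct servers with the least consumed space."""
    return sorted(used_bytes, key=used_bytes.get)[:copies]


one_gib = 2**30
print("chunks:", chunk_count(one_gib))
print("3 copies:", replicated_bytes(one_gib, 3) / one_gib, "GiB")
print("8+2 coding:", erasure_coded_bytes(one_gib, 8, 2) / one_gib, "GiB")
print("placement:", least_used_servers({"cs1": 50, "cs2": 10, "cs3": 30}, 2))
```

Under these assumptions a 1 GiB file occupies 3 GiB of raw space with three copies but 1.25 GiB with an 8+2 layout, while both tolerate the loss of two parts.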
Hardware, software and networking
Similarly to other cluster-based file systems, MooseFS uses commodity hardware running a POSIX-compliant operating system. TCP/IP is used as the interconnect.
MooseFS in figures
The following figures are taken from the MooseFS Factsheet.
* Storage size is up to: 2^64 bytes = 16 EiB = 16 384 PiB
* Single file size is up to: 2^57 bytes = 128 PiB
* Number of files is up to: 2^31 ≈ 2.1 × 10^9
* Number of active clients is not limited by MooseFS; it depends on the number of file descriptors in the system
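
The unit conversions in these figures can be checked with integer arithmetic. The variable names below are illustrative only.

```python
EIB = 2**60  # bytes in one exbibyte
PIB = 2**50  # bytes in one pebibyte

max_storage = 2**64  # maximum storage size in bytes
max_file = 2**57     # maximum single file size in bytes
max_files = 2**31    # maximum number of files

print(max_storage // EIB, "EiB =", max_storage // PIB, "PiB")
print(max_file // PIB, "PiB")
print(f"{max_files} files, about {max_files:.1e}")
```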
See also
* BeeGFS
* Ceph
* Distributed file system
* GlusterFS
* Google File System
* List of file systems § Distributed fault-tolerant file systems
* LizardFS - a fork of MooseFS v. 1.6.x
* Lustre
External links
* MooseFS on SourceForge