Gluster Inc. (formerly known as Z RESEARCH) was a software company that provided an open source platform for scale-out public and private cloud storage. The company was privately funded and headquartered in Sunnyvale, California, with an engineering center in Bangalore, India. Gluster was funded by Nexus Venture Partners and Index Ventures. Gluster was acquired by Red Hat on October 7, 2011.


History

The name ''Gluster'' comes from the combination of the terms ''GNU'' and ''cluster''. Despite the similarity in names, Gluster is not related to the Lustre file system and does not incorporate any Lustre code. Gluster based its product on ''GlusterFS'', an open-source software-based network-attached filesystem that deploys on commodity hardware. The initial version of GlusterFS was written by Anand Babu Periasamy, Gluster's founder and CTO. In May 2010, Ben Golub became the president and chief executive officer.

Red Hat became the primary author and maintainer of the GlusterFS open-source project after acquiring the Gluster company in October 2011. The product was first marketed as Red Hat Storage Server, but in early 2015 it was renamed Red Hat Gluster Storage, since Red Hat had also acquired the Ceph file system technology. Red Hat Gluster Storage is in the retirement phase of its lifecycle, with an end-of-support-life date of December 31, 2024.


Architecture

The GlusterFS architecture aggregates compute, storage, and I/O resources into a global namespace. Each server plus attached commodity storage (configured as direct-attached storage, JBOD, or using a storage area network) is considered to be a node. Capacity is scaled by adding nodes or by adding storage to each node. Performance is increased by deploying storage among more nodes. High availability is achieved by replicating data n-way between nodes.
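
The trade-off between capacity and n-way replication can be illustrated with a little arithmetic. The following Python sketch is not taken from GlusterFS; the function name `usable_capacity_tb` and its parameters are hypothetical, and it assumes that every node contributes the same raw capacity and that the node count is a multiple of the replica count.

```python
def usable_capacity_tb(node_count, raw_tb_per_node, replica_count):
    """Estimate usable capacity when every file is stored replica_count times.

    Simplifying assumptions (not from the source): identical nodes, and a
    node count that divides evenly into replica groups.
    """
    if replica_count < 1:
        raise ValueError("replica_count must be at least 1")
    if node_count % replica_count != 0:
        raise ValueError("node_count must be a multiple of replica_count")
    # Each replica group holds one logical copy of its share of the data.
    return node_count * raw_tb_per_node / replica_count


# Six nodes with 10 TB each: more replicas mean less usable space.
two_way = usable_capacity_tb(6, 10, 2)
three_way = usable_capacity_tb(6, 10, 3)
```

Under these assumptions, adding nodes raises usable capacity linearly, while raising the replica count divides it.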


Public cloud deployment

For public cloud deployments, GlusterFS offers an Amazon Web Services (AWS) Amazon Machine Image (AMI), which is deployed on Elastic Compute Cloud (EC2) instances rather than physical servers; the underlying storage is Amazon's Elastic Block Store (EBS). In this environment, capacity is scaled by deploying more EBS storage units, performance is scaled by deploying more EC2 instances, and availability is scaled by n-way replication between AWS availability zones.


Private cloud deployment

A typical on-premises, or private cloud, deployment consists of GlusterFS installed as a virtual appliance on top of multiple commodity servers running hypervisors such as KVM, Xen, or VMware, or installed on bare metal.


GlusterFS

GlusterFS is a scale-out network-attached storage file system. It has found applications including cloud computing, streaming media services, and content delivery networks. GlusterFS was developed originally by Gluster, Inc. and then by Red Hat, Inc., as a result of Red Hat acquiring Gluster in 2011. In June 2012, Red Hat Storage Server was announced as a commercially supported integration of GlusterFS with Red Hat Enterprise Linux. In April 2014, Red Hat bought Inktank Storage, the company behind the Ceph distributed file system, and re-branded the GlusterFS-based Red Hat Storage Server as "Red Hat Gluster Storage".


Design

GlusterFS aggregates various storage servers over Ethernet or InfiniBand RDMA interconnect into one large parallel network file system. It is free software, with some parts licensed under the GNU General Public License (GPL) v3 while others are dual licensed under either GPL v2 or the Lesser General Public License (LGPL) v3. GlusterFS is based on a stackable user space design.

GlusterFS has a client and server component. Servers are typically deployed as ''storage bricks'', with each server running a daemon to export a local file system as a ''volume''. The client process, which connects to servers with a custom protocol over TCP/IP, InfiniBand or Sockets Direct Protocol, creates composite virtual volumes from multiple remote servers using stackable ''translators''. By default, files are stored whole, but striping of files across multiple remote volumes is also possible. The client may mount the composite volume using a GlusterFS native protocol via the FUSE mechanism, mount it using the NFS v3 protocol via a built-in server translator, or access the volume via the client library. The client may re-export a native-protocol mount, for example via the kernel NFSv4 server, Samba, or the object-based OpenStack Storage (Swift) protocol using the "UFO" (Unified File and Object) translator.

Most of the functionality of GlusterFS is implemented as translators, including file-based mirroring and replication, file-based striping, file-based load balancing, volume failover, scheduling and disk caching, storage quotas, and volume snapshots with user serviceability (since GlusterFS version 3.6).

The GlusterFS server is intentionally kept simple: it exports an existing directory as-is, leaving it up to client-side translators to structure the store. The clients themselves are stateless, do not communicate with each other, and are expected to have translator configurations consistent with each other. GlusterFS relies on an elastic hashing algorithm, rather than using either a centralized or distributed metadata model. The user can add, delete, or migrate volumes dynamically, which helps to avoid configuration coherency problems. This allows GlusterFS to scale up to several petabytes on commodity hardware by avoiding bottlenecks that normally affect more tightly coupled distributed file systems.

GlusterFS provides data reliability and availability through two kinds of replication: replicated volumes and geo-replication. Replicated volumes ensure that there exists at least one copy of each file across the bricks, so if one brick fails, the data is still stored and accessible. Geo-replication provides a leader-follower model of replication, where volumes are copied across geographically distinct locations. This happens asynchronously and is useful for availability in case of a whole data center failure.

GlusterFS has been used as the foundation for academic research and a survey article. Red Hat markets the software for three markets: "on-premises", "public cloud" and "private cloud".
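
The idea of locating files by hashing, instead of consulting a metadata server, can be sketched in a few lines of Python. This is only an illustration and not GlusterFS's actual elastic hashing algorithm: it swaps in a plain SHA-256 hash with modulo placement, and the function name `brick_for_path` and the brick names are hypothetical.

```python
import hashlib


def brick_for_path(path, bricks):
    """Pick a brick for a file path using only the path and the brick list.

    Illustrative modulo hashing (an assumption of this sketch), not the
    elastic hashing algorithm used by GlusterFS.
    """
    if not bricks:
        raise ValueError("at least one brick is required")
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the digest as an unsigned integer.
    slot = int.from_bytes(digest[:8], "big") % len(bricks)
    return bricks[slot]


# Hypothetical brick names; two independent clients share the same list.
bricks = ["server1:/export/brick", "server2:/export/brick", "server3:/export/brick"]
client_a = brick_for_path("/videos/intro.mp4", bricks)
client_b = brick_for_path("/videos/intro.mp4", bricks)
assert client_a == client_b  # same answer without any coordination
```

Because the result depends only on the path and the brick list, stateless clients with consistent configurations agree on where a file lives without talking to each other. A naive modulo scheme like this one remaps most paths whenever the brick list changes, so it should be read as a demonstration of metadata-free lookup rather than of dynamic rebalancing.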


See also

* BeeGFS
* Ceph (software)
* Distributed file system
* Distributed parallel fault-tolerant file systems
* Gfarm file system
* IBM Storage Scale (GPFS)
* LizardFS
* Lustre
* MapR FS
* Moose File System
* OrangeFS
* Parallel Virtual File System
* Quantcast File System
* RozoFS
* XtreemFS
* ZFS

